Gene Expression Bioinformatics Team
Canada's Michael Smith Genome Sciences Centre
BC Cancer Research Centre
BC Cancer Agency
Vancouver, BC, Canada
Copyright © 2003
Version | Date | Author | Comment |
---|---|---|---|
0.1 | 20-Aug-2003 | Scott Zuyderduyn (scottz@bcgsc.ca) | Initial draft. |
1.0 | 9-Sept-2003 | Chris Fjell (cfjell@bcgsc.ca) | Final Draft. Includes some text kindly contributed by Greg Vatcher. |
This document describes the functionality and usage of the SAGE Plugin. This is an independent add-on for the DISCOVERYspace software that provides specific features for serial analysis of gene expression (SAGE) data. The intended audience is the end-user.
(not yet compiled)
You may see these icons marking text within the documentation:
If you are already familiar with the DISCOVERYspace software, then you can jump right into this document. However, if you are new to DISCOVERYspace, it's a good idea to read the DISCOVERYspace Users' Manual before reading this document. This manual will assume you have a working knowledge of the DISCOVERYspace software.
The DISCOVERY Platform is a comprehensive set of software tools to store, visualize, and manipulate genomic data. The Platform consists of several components:
This document is organized around the feature set of the DISCOVERY Platform rather than around use for a specific purpose. For task-oriented instructions, the user is encouraged to read through Chapter 11 How-To.
6.1 The SAGE Plugin
Minimum
The SAGE Plugin requires a copy of DISCOVERYspace 3.0 to be installed on the user's computer. Hardware requirements for DISCOVERYspace can be found in the DISCOVERYspace User's Manual.
Recommended
More than 1 GB of RAM if the number of SAGE libraries to be analyzed concurrently is large (more than 10).
A common question is why DISCOVERYspace requires so much memory, especially for SAGE analyses. The Java language is pseudo-compiled: the source code is compiled into an intermediate form (bytecode, stored in .class files), but it is not fully translated into machine language (the .exe files you normally run on the Windows OS are executable machine language files). This is because Java is cross-platform, and in order to maintain compatibility across operating systems, the code cannot be compiled to the native language of any one OS. Instead, the pseudo-compiled code is interpreted by the Java Virtual Machine (JVM), which provides a layer between the pseudo-compiled code and the operating system. The JVM requires processing power to do this, and so Java programs typically run a little slower.
For this approach to work, each object in the system must carry a lot of associated information, which increases the minimum amount of memory an object occupies. Without getting into too much detail, the result is that a string of length 10 (a SAGE tag, for example), which would normally occupy 10 bytes, requires about 80 bytes of initial memory as a Java string, plus the size of the string itself, for a total of about 90 bytes. Thus, at a minimum, about 9 times more memory is required to store a SAGE library in Java. Additional memory is also required to optimize further operations on these objects within DISCOVERYspace (linking to other objects, etc.).
The trade-off for these increased hardware requirements is that Java development is typically much quicker, and the software can run on any OS (Windows, Mac, Linux, etc.) that has a JVM available.
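The arithmetic in this explanation can be worked through for a whole library. The following Python sketch uses only the figures quoted in this section (a 10-character tag and about 80 bytes of per-string overhead); the function name and the 50,000-tag library size are illustrative assumptions, and the real overhead depends on the JVM in use.

```python
def estimate_tag_memory(n_tags, tag_length=10, overhead_bytes=80):
    """Return (raw_bytes, java_bytes, ratio) for n_tags tag strings.

    overhead_bytes is the approximate per-string cost quoted in this
    manual; it is an estimate, not a measured value.
    """
    raw = n_tags * tag_length                      # plain 10-byte tags
    java = n_tags * (overhead_bytes + tag_length)  # about 90 bytes per tag
    return raw, java, java / raw


# Hypothetical library of 50,000 distinct tags.
raw, java, ratio = estimate_tag_memory(50_000)
print(f"raw: {raw} bytes, as Java strings: {java} bytes, ratio: {ratio:.0f}x")
```

Under these assumptions, the ratio stays at 9 regardless of library size, which is why memory demand grows quickly as more libraries are loaded concurrently.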
The following terms are used in this document to refer to DISCOVERY Platform features. These include application-specific terms to clarify software terms for the general user and administrator.
7.1 Dataset
A specific selection of data from a single datasource. This is typically the resulting items matching a search against a datasource.
7.2 Datasource
A set of related data from a single source. For example, LocusLink is a datasource, containing a set of data with relationships between the data fields determined by the administrators of LocusLink.
7.3 Right click (Left click) actions
User actions corresponding to using the left mouse button and the right mouse button. Other pointing devices may be configured differently: left-click is sometimes called select or primary select, while right-click is also called alternate select. The terms left-click and right-click are used in this document for brevity. The buttons used for left-click and right-click may also differ on certain operating systems.
8.1 SAGE Plugin Installation
The SAGE Plugin is installed by adding the distribution files to the plugins/ directory, which resides in the directory where DISCOVERYspace is installed. For example, if DISCOVERYspace has been installed to C:\Program Files\DISCOVERYspace3, then the SAGE Plugin distribution files would be placed in the C:\Program Files\DISCOVERYspace3\plugins directory.
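For users who prefer to script the copy step, the following is a minimal Python sketch of the installation described above. The function name and the source folder in the usage comment are hypothetical; the manual does not list the distribution file names, so the sketch simply copies every file it finds.

```python
import shutil
from pathlib import Path


def install_plugin(distribution_dir, discoveryspace_dir):
    """Copy every file in distribution_dir into <discoveryspace_dir>/plugins.

    Returns the names of the copied files. Both arguments are paths chosen
    by the user; nothing here is specific to DISCOVERYspace itself.
    """
    plugins_dir = Path(discoveryspace_dir) / "plugins"
    if not plugins_dir.is_dir():
        raise FileNotFoundError(f"No plugins directory at {plugins_dir}")
    copied = []
    for item in sorted(Path(distribution_dir).iterdir()):
        if item.is_file():
            shutil.copy2(item, plugins_dir / item.name)
            copied.append(item.name)
    return copied


# Example call (the first path is a hypothetical download location):
# install_plugin(r"C:\Downloads\sage-plugin", r"C:\Program Files\DISCOVERYspace3")
```

Copying the files by hand in a file manager achieves the same result.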
8.2 Confirming Installation
When DISCOVERYspace is started, the application looks for installed plugins. You can confirm that the SAGE Plugin is correctly installed and loaded by noting its appearance on the startup splash screen (Figure 8.2).
Figure 8.2 The startup splash screen will show which plugins have been installed.
This section describes the different components of the SAGE Plugin. If you are familiar with the SAGE Plugin and are primarily interested in learning how to do a specific task, you may want to jump to the next section.
9.1 SAGE Plugin Application
On start-up, the main application window will be displayed. When the SAGE Plugin is installed, several modifications to the interface can be noted. These are described in detail below:
9.1.1 Status Bar
The SAGE Control Panel toggle button is located at the bottom left of the main frame (on the right side of the status bar) (Figure 9.1.1). Clicking this button toggles the visibility of the SAGE Control Panel, which provides easy access to all saved or defined SAGE library definitions and expression profiles (see SAGE Control Panel).
Figure 9.1.1 The SAGE Control Panel toggle button (depicted with three blue cartoon SAGE tags).
9.1.2 Menu Bar
When the SAGE Plugin is installed, additional options become available on the application's Menu Bar.
9.1.2.1 Tools > SAGE Menu
The Tools > SAGE submenu contains the following menu items:
Menu Item | Description |
---|---|
Define a GEO Library | Opens the widget to define a new GEO library. |
View Defined Libraries and Comparisons | Toggles the visibility of the SAGE Control Panel. |
Define New Library | Opens the widget to define a new SAGE library. |
Define New Library Comparison | Opens the widget to define a new pair-wise expression profile. |
Multi-Library Set Manipulations | Opens the "Venn Table" widget for set manipulations of multiple libraries. |
Search For Tag In Libraries | Opens the "Search For Tag In Libraries" widget that looks for the expression of a given tag amongst available SAGE libraries. |
9.1.3 SAGE Toolbar
The SAGE toolbar provides shortcut access to common operations for SAGE analysis. This toolbar is only visible when the View > Toolbars > SAGE option is selected. By default, the SAGE toolbar is not visible.
Name | Description |
---|---|
Define New Library | Opens the widget to define a new SAGE library. |
View Comparison Details | View an overview of the mathematical properties of the currently loaded expression profile. |
2D Expression Profile | View the current expression profile using a 2D graph. |
3D Expression Profile | View the current expression profile using an interactive 3D graph. |
Multi-Library Set Manipulations | Opens the "Venn Table" widget for set manipulations of multiple libraries. |
Table 9.4 SAGE Toolbar Buttons
9.2 Defining and Loading Libraries
SAGE libraries are loaded using the dialogs accessible from the Tools > SAGE > Define New Library menu or the Tools > SAGE > GEO > Define a GEO Library menu. The following dialog appears when Tools > SAGE > Define New Library is selected (the dialog for GEO libraries is similar), listing all of the available libraries sorted by Identifier. As with all DISCOVERYspace windows, clicking on a column header sorts the table by that column (for example, clicking on the ‘NCBI Taxonomy’ header sorts the table by species).
Select libraries by clicking on them. To select multiple libraries, hold down the Ctrl key while clicking; to select a range of libraries, hold down the Shift key while selecting the top and bottom of the range. In this case a lung cancer library and a normal lung library are selected.
The Options available here are:
Mask Duplicate Ditags - removes tags contributed from duplicate ditags (a potential source of introduced PCR bias)
Mask Singletons - clicking this allows more options:
Minimum Quality - removes tags that fall below the associated quality value (if quality is reported in the library). The option "Ignore quality for sequence" includes lower-quality tags where some tags (2 in the example) of the same sequence meet the specified quality.
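To illustrate what the first two options do to the tag counts, here is a minimal Python sketch. It assumes that a ditag is an ordered pair of tags, that a duplicate ditag is an exact repeat of an earlier pair, and that a singleton is a tag observed only once; these definitions, the function name, and the example sequences are assumptions for illustration, not a description of the plugin's internal code. Quality filtering is not covered.

```python
from collections import Counter


def count_tags(ditags, mask_duplicate_ditags=False, mask_singletons=False):
    """Count tags from (tag, tag) ditag pairs, applying the two masks."""
    if mask_duplicate_ditags:
        # Keep only the first occurrence of each ditag (order preserved).
        ditags = list(dict.fromkeys(ditags))
    counts = Counter(tag for ditag in ditags for tag in ditag)
    if mask_singletons:
        # Assumed definition: a singleton is a tag with a count of 1.
        counts = Counter({tag: n for tag, n in counts.items() if n > 1})
    return counts


# Hypothetical ditags; the second entry repeats the first.
ditags = [
    ("AAAAAAAAAA", "CCCCCCCCCC"),
    ("AAAAAAAAAA", "CCCCCCCCCC"),
    ("AAAAAAAAAA", "GGGGGGGGGG"),
]
print(dict(count_tags(ditags)))
print(dict(count_tags(ditags, mask_duplicate_ditags=True)))
print(dict(count_tags(ditags, mask_duplicate_ditags=True, mask_singletons=True)))
```

In this example, masking the duplicate ditag lowers the count of the first tag from 3 to 2, and masking singletons then leaves only that tag.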
To view and continue working with the SAGE data, open the SAGE Control Panel by selecting the menu item Tools > SAGE > View Defined Libraries and Comparisons or by clicking the icon at the bottom left of the screen.
Right-clicking on one or more selected rows of the Libraries table shows the following options:
Save - saves to local disk
Load - loads data from database or local disk
Remove - removes items from list
View - opens display to view the data (enabled after data is loaded)
Right-clicking on one or more selected rows of the Comparisons table shows the following options:
Save - saves to local disk
Load - loads data from database or local disk and performs the comparison calculations
2D Expression Profile - View the current expression profile using a 2D graph.
3D Expression Profile - View the current expression profile using an interactive 3D graph.
Multi-Library Set Manipulations - Opens the "Venn Table" widget for set manipulations of multiple libraries.
Comparisons between libraries are performed by first defining the library comparison. Select Tools > SAGE > Define New Library Comparison from the main menu to open the ‘Define Expression Comparison’ window, which lists the previously defined libraries.
Figure 9.4.1 Expression Comparison Dialog
Figure 9.6.2 2D Expression Profile Display
The Select menu contains options for selecting tags: Select Upregulated Datapoints, Select Downregulated Datapoints, Select Insignificant Datapoints, Deselect All Datapoints, and Select Datapoints Based On Criteria.
For example, selecting upregulated datapoints results in the following display.
Figure 9.6.3 2D Expression Profile Display With Up-Regulated Tags Selected
The tool bar buttons at the top of the display perform the following actions.
Select data points
Zoom into selected region
Drag selected data points
Capture graph to image file
Export graph data to file
Print graph
Change to default graph view
Represent counts (number of tags represented by a single data point) by rings around data points:
Differentially expressed
Similarly expressed
Selected data points
Selected points can be identified by selecting Drag Mode (select the drag button on the tool bar), then right-clicking a point and dragging it to the background of the workspace. A Data Viewer display similar to the following figure will open. From the Data Viewer, the genes represented by the tags can be identified. (See the DISCOVERY Platform User's Manual for a description of the Data Viewer display.)
Figure 9.6.4 Data Viewer Listing Up-Regulated Tags
Figure 9.6.5 Data Viewer Listing Up-Regulated Tags and Genes
Selecting Multi-Library Set Manipulations from the SAGE Control Panel opens the following display. The left pane displays the list of libraries for this display. Tags may be dragged from Data Viewer tables to the Auxiliary Tags panel, accessible from the corresponding tab. The right pane displays the tags in each library and/or comparisons between tags in the libraries.
Figure 9.7.1 Multi-Library Set Manipulations Display
The tool bar buttons have the following meanings. On the libraries panel:
New Venn table
Select all
Toggle row selection
Tag count cutoff - tags with fewer counts than this are ignored in all calculations
Ignore all tags from this library
Include all tags from this library
Include all tags from this library that are in common
Exclude all tags from this library
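The per-library settings on the libraries panel behave like set operations. The Python sketch below shows one plausible reading: "include" libraries contribute all their tags, "common" libraries contribute only the tags they share with each other, and "exclude" libraries remove their tags from the result. The mode names, the function name, the example data, and this exact combination rule are assumptions; the manual does not specify how the Venn Table combines the settings.

```python
def venn_select(libraries, modes, cutoff=1):
    """Combine tag sets from several libraries.

    libraries: library name -> {tag: count}
    modes:     library name -> "ignore", "include", "common" or "exclude"
               (hypothetical names for the four panel settings)
    cutoff:    tags with fewer counts than this are dropped first
    """
    tag_sets = {
        name: {tag for tag, n in counts.items() if n >= cutoff}
        for name, counts in libraries.items()
    }
    included = [tag_sets[name] for name, mode in modes.items() if mode == "include"]
    common = [tag_sets[name] for name, mode in modes.items() if mode == "common"]
    excluded = [tag_sets[name] for name, mode in modes.items() if mode == "exclude"]

    result = set().union(*included)
    if common:
        result |= set.intersection(*common)
    for tags in excluded:
        result -= tags
    return result


# Hypothetical libraries and counts.
libraries = {
    "lung_cancer": {"AAAAAAAAAA": 5, "CCCCCCCCCC": 1, "GGGGGGGGGG": 3},
    "lung_normal": {"AAAAAAAAAA": 4, "TTTTTTTTTT": 2},
}
only_cancer = venn_select(
    libraries, {"lung_cancer": "include", "lung_normal": "exclude"}, cutoff=2
)
shared = venn_select(libraries, {"lung_cancer": "common", "lung_normal": "common"})
print(sorted(only_cancer), sorted(shared))
```

Applying the cutoff before the set operations matches the statement that low-count tags are ignored in all calculations.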
On the tags panel:
Save data to file
Copy data to clipboard
Select all
Toggle row selection
Display mode selection. Options are: Count (tag counts); Frequency (frequency of tags in library); Ratio (ratio of tags between libraries); Ratio, multi ref (ratio of tags between libraries, calculated differently (???)); P Value (p value for tag); P Value, multi ref (p value for tag, calculated differently (???))
Enable value cutoff. Tags with counts below this are ignored and removed from the table.
Cutoff for displayed values
Filter super-singleton tags
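The Count, Frequency and Ratio display modes can be illustrated with a short Python sketch. It assumes that frequency means a tag's count divided by the total tag count of its library, and that the ratio compares frequencies between a library and a reference library; the manual does not give the formulas, so these definitions, the function names, and the example counts are assumptions. The multi ref and P Value modes are not covered.

```python
def tag_frequencies(counts):
    """Return tag -> count / total count for one library."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}


def tag_ratios(counts, reference_counts):
    """Return tag -> frequency ratio (library / reference).

    Tags missing from the reference are skipped to avoid dividing by zero.
    """
    freq = tag_frequencies(counts)
    ref_freq = tag_frequencies(reference_counts)
    return {
        tag: freq[tag] / ref_freq[tag]
        for tag in freq
        if ref_freq.get(tag, 0) > 0
    }


# Hypothetical counts for two libraries.
library = {"AAAAAAAAAA": 6, "CCCCCCCCCC": 2}
reference = {"AAAAAAAAAA": 2, "CCCCCCCCCC": 4, "GGGGGGGGGG": 2}
print(tag_frequencies(library))
print(tag_ratios(library, reference))
```

Comparing frequencies rather than raw counts keeps the ratio meaningful when the two libraries contain different total numbers of tags.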
The Tools > SAGE > Search For Tag In Libraries menu item opens a new display containing a list of all available SAGE libraries in the left-hand pane (titled Available), and a table showing the abundance of the tags in each library in the right-hand pane (titled Results). Tags are added to the top of the Results table after being added to the display. Tags may be added one at a time using the dialog at the bottom of the left-hand pane and pressing the button.
Tags may also be added to the Search For Tag display by dragging rows from a Data Viewer display.
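Conceptually, the Results table is a lookup of each tag's count in every available library, with the most recently added tag shown first. The following Python sketch mirrors that behaviour; the function name, library names, and counts are hypothetical, and a tag that is absent from a library is assumed to be reported with a count of 0.

```python
def search_tag(tag, libraries):
    """Return library name -> count for one tag (0 when the tag is absent)."""
    return {name: counts.get(tag, 0) for name, counts in libraries.items()}


# Hypothetical libraries and counts.
libraries = {
    "lung_cancer": {"AAAAAAAAAA": 5, "GGGGGGGGGG": 3},
    "lung_normal": {"AAAAAAAAAA": 4, "TTTTTTTTTT": 2},
}

results = []  # newest search first, as in the Results table
for tag in ["AAAAAAAAAA", "TTTTTTTTTT"]:
    results.insert(0, (tag, search_tag(tag, libraries)))

for tag, abundance in results:
    print(tag, abundance)
```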
The menus available from the Search For Tag display are the following: