Our widgets for functional genomics use Orange, a data mining
and machine learning suite. Orange can be accessed through scripting in
Python or by visual programming in Orange Canvas.
For functional genomics, we have also designed GenePath, a web-based tool for mutant data analysis (also featured
in Science's NetWatch).
The microarray data file
"dicty_800_genes_from_table07.tab" is loaded into the
schema with the "File" widget. Recently opened files
are listed in the pull-down combo box.
Data Format
Among other formats, Orange can read tab-delimited files. The
snapshot below shows the format required for tab-delimited files
that are read into schemas. The first three
rows contain information about attributes (columns). The first row
gives the names of attributes (20hr, 22hr, ...), the second row states the
attribute types ("continuous" for microarray measurements,
"discrete" or "string" for additional information
such as annotation that can be displayed, for example, in the Heat Map
widget), and the third row states how the attributes should be
treated. Notice that cluster ID is a special attribute called class,
and will be used as such when displaying the information. Meta
attributes are those that will not be considered in visualization
directly, but only indirectly when displaying additional information
about genes. For details on other Orange formats and for a concise
definition of keywords and syntax, see Orange
File Formats and a section in Orange
For Beginners.
The remaining rows contain the data. Each row contains the expression
profile of a gene, plus any additional information about the gene. The
annotation is optional, since it can be added to microarray data with
the GO Term Finder widget. The only required column is the one with
gene names (in our case "DDB"), because this attribute is
also the key in the data files with Gene Ontology annotation and
genome mapping. It does not matter which name one assigns to such an
attribute, as widgets that handle the gene annotation or display the
genome map allow users to define which of the attributes holds the
information on gene names. In most cases, such widgets identify the name
attribute automatically.
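
The layout described above (three header rows followed by one data row per gene) can be illustrated with a minimal sketch that uses only the Python standard library. It is not Orange's own loader. The function name read_orange_tab, the sample gene identifiers and the expression values are made up for illustration; the column names and the "class" and "meta" keywords follow the description above.

```python
import csv
import io


def read_orange_tab(handle):
    """Parse a table with three header rows into (columns, rows).

    Hypothetical helper for illustration; it is not part of Orange.
    """
    reader = csv.reader(handle, delimiter="\t")
    names = next(reader)   # row 1: attribute names
    types = next(reader)   # row 2: attribute types
    flags = next(reader)   # row 3: how each attribute is treated

    # Pad short header rows so every column has a type and a flag.
    width = len(names)
    types += [""] * (width - len(types))
    flags += [""] * (width - len(flags))

    columns = [
        {"name": n, "type": t, "flag": f}
        for n, t, f in zip(names, types, flags)
    ]
    # Remaining rows: one gene per row, keyed by attribute name.
    rows = [dict(zip(names, row)) for row in reader if row]
    return columns, rows


# Made-up sample data in the described layout (not taken from the real file).
SAMPLE = (
    "DDB\t20hr\t22hr\tcluster ID\n"
    "string\tcontinuous\tcontinuous\tdiscrete\n"
    "meta\t\t\tclass\n"
    "DDB0000001\t0.12\t-0.40\t3\n"
    "DDB0000002\t1.05\t0.87\t1\n"
)

columns, rows = read_orange_tab(io.StringIO(SAMPLE))
class_column = [c["name"] for c in columns if c["flag"] == "class"]
meta_columns = [c["name"] for c in columns if c["flag"] == "meta"]
```

Here class_column picks out the attribute marked as the class (the cluster ID), and meta_columns lists attributes that carry additional information only, such as the gene name column that serves as the key for annotation data.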