This classification model is built on what is probably the best-known gene expression cancer dataset (Golub et al.), which contains gene expression measurements for samples of human acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). The original study was one of the first to demonstrate cancer classification based on gene expression monitoring with DNA microarrays.
Number of genes: 5147
Number of samples: 72
Note: Of the 6817 probe sets originally measured, we kept only the genes that were called present (P) in at least one sample.
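The filtering step described in the note can be sketched in a few lines of Python. This is only an illustration, not the code used to prepare the dataset: the function name `keep_present_genes`, the toy gene identifiers, and the use of "A" and "M" for non-present calls are assumptions made for the example. It also assumes the note means that a gene is kept when it has a present (P) call in at least one sample.

```python
def keep_present_genes(calls):
    """Return the genes that have a present ('P') call in at least one sample.

    `calls` maps a gene identifier to its list of detection calls, one per
    sample. The layout and the non-'P' call letters are assumptions made
    for this sketch.
    """
    return [gene for gene, gene_calls in calls.items()
            if any(call == "P" for call in gene_calls)]


# Toy data (hypothetical): four genes measured in three samples.
calls = {
    "gene_a": ["A", "A", "P"],  # present once, so it is kept
    "gene_b": ["A", "A", "A"],  # never present, so it is removed
    "gene_c": ["P", "P", "P"],  # always present, so it is kept
    "gene_d": ["M", "A", "A"],  # never called 'P', so it is removed
}

kept = keep_present_genes(calls)
print(kept)
```

Applied to the real data, a filter of this kind would reduce the 6817 measured probe sets to the 5147 genes listed above.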
Predictive accuracy with 10-fold cross-validation (classification using the best projection with eight attributes):
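For readers unfamiliar with the procedure, the way 10-fold cross-validation splits the 72 samples can be sketched as below. This is a generic illustration and not the evaluation code behind the reported accuracy: the helper names `k_fold_indices` and `cross_validate`, the fixed random seed, the unstratified shuffling, and the averaging of per-fold scores are all assumptions. The `train_and_score` callback is a hypothetical stand-in for building the projection-based classifier on the training samples and measuring its accuracy on the held-out fold.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, train_and_score, k=10, seed=0):
    """Average the score of a (hypothetical) train_and_score callback over k folds.

    Each fold is held out once as the test set while the remaining
    folds together form the training set.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return sum(scores) / k


folds = k_fold_indices(72)
fold_sizes = sorted(len(fold) for fold in folds)
print(fold_sizes)
```

With 72 samples and ten folds, two folds hold eight samples and the other eight hold seven, so each classifier is trained on 64 or 65 samples and tested on the rest.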
The following are the three best-ranked visualizations with eight, six, and four attributes according to the visualization score, that is, the visualizations in which examples from different diagnostic classes are best separated: