For this data set we induced models for two different classification problems. With the first model we are simply trying to distinguish between normal tissue samples from the head and neck region and hypopharyngeal cancer samples.
Platform: Affymetrix GeneChip Human Genome U95 Version [1 or 2] Set HG-U95A
- hypopharyngeal cancer (T): 34 examples (89.5%)
- normal tissue (N): 4 examples (10.5%)
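Because the two classes are so unbalanced, it helps to keep the majority-class share in mind when reading the accuracy figures. The following minimal sketch only recomputes the percentages above from the sample counts; the variable names are illustrative and not part of the original analysis.

```python
# Class counts as listed above: T = hypopharyngeal cancer, N = normal tissue.
counts = {"T": 34, "N": 4}
total = sum(counts.values())

# Percentage share of each class, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

# Accuracy of a trivial classifier that always predicts the majority class.
majority_baseline = max(counts.values()) / total

print(total, shares, round(majority_baseline, 3))
```

Any classifier evaluated on this data should therefore be compared against roughly 89.5% accuracy rather than 50%.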
Number of genes: 9021
Number of samples: 38
Note: Of the 12625 probe sets originally measured, we removed those that were not called present (P) in at least one sample.
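The filtering step in the note can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each probe set comes with one detection call per sample (P, M, or A for present, marginal, absent) and that a probe set is kept when at least one call is P. The probe names and calls in the toy data are hypothetical.

```python
def filter_present(calls):
    """Keep probe sets that are called present ('P') in at least one sample.

    calls: dict mapping a probe-set ID to its list of per-sample detection calls.
    """
    return {probe: c for probe, c in calls.items() if "P" in c}


# Hypothetical toy data: three probe sets measured on three samples.
calls = {
    "probe_a": ["P", "A", "P"],
    "probe_b": ["A", "A", "A"],  # never present, so it is dropped
    "probe_c": ["M", "P", "A"],
}

kept = filter_present(calls)
print(sorted(kept))
```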
Predictive accuracy with 10-fold cross-validation (classification using the best projection with eight attributes):
Following are the three best-ranked visualizations, with eight, six, and four attributes, according to the visualization score; that is, the visualizations in which examples from different diagnostic classes are best separated:
Score: 100.00%
Genes:
- 160042_s_at: homeo box B6, HOXB6
- 160041_at: protein tyrosine phosphatase, non-receptor type 18 (brain-derived), PTPN18
- 160024_at: cyclin-dependent kinase (CDC2-like) 10, CDK10
- 160020_at: matrix metalloproteinase 14 (membrane-inserted), MMP14
- 37762_at: epithelial membrane protein 1, EMP1
- 38051_at: mal, T-cell differentiation protein, MAL
- 37920_at: paired-like homeodomain transcription factor 1, PITX1
- 36569_at: C-type lectin domain family 3, member B, CLEC3B

Score: 100.00%
Genes:
- 39333_at: collagen, type IV, alpha 1, COL4A1
- 160044_g_at: aconitase 2, mitochondrial, ACO2
- 160026_at: protein kinase, X-linked, PRKX
- 37762_at: epithelial membrane protein 1, EMP1
- 36890_at: periplakin, PPL
- 36569_at: C-type lectin domain family 3, member B, CLEC3B

Score: 100.00%
Genes:
- 160029_at: protein kinase C, beta 1, PRKCB1
- 160027_s_at: insulin-like growth factor 2 receptor, IGF2R
- 36890_at: periplakin, PPL
- 36569_at: C-type lectin domain family 3, member B, CLEC3B
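The page refers to these projections as radviz visualizations. As a rough illustration of how such a projection places a sample, the sketch below uses the standard radviz construction: each attribute gets an anchor on the unit circle, attribute values are scaled to [0, 1], and the sample is drawn at the weighted mean of the anchors. The function names are illustrative, the sample values are made up, and the sketch does not compute the visualization score.

```python
import math


def min_max(column):
    """Scale a list of values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def radviz_point(values):
    """Project one sample whose attribute values are already scaled to [0, 1].

    Anchor j of n sits at angle 2*pi*j/n on the unit circle; the sample lands
    at the mean of the anchors weighted by its attribute values.
    """
    n = len(values)
    total = sum(values)
    if total == 0:
        return (0.0, 0.0)  # no attribute pulls the point; place it at the center
    x = sum(v * math.cos(2 * math.pi * j / n) for j, v in enumerate(values)) / total
    y = sum(v * math.sin(2 * math.pi * j / n) for j, v in enumerate(values)) / total
    return (x, y)


# A sample dominated by the first attribute is pulled toward the first anchor.
print(radviz_point([0.9, 0.1, 0.1, 0.1]))
```

A sample with equal values on all attributes lands at the center of the circle, which is why a good projection needs genes whose expression differs between the classes.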
Following is a histogram of genes showing how often each gene appears among the top 100 radviz visualizations with eight attributes.
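The counts behind such a histogram can be obtained by tallying, for each gene, the number of top-ranked projections that include it. The sketch below is an assumption about how this could be done, not the original tool's code. For illustration it uses only the three projections listed on this page (by gene symbol), not the top 100 eight-attribute projections that the histogram is based on.

```python
from collections import Counter


def gene_frequencies(projections):
    """Count in how many projections each gene appears.

    projections: iterable of gene lists, one list per projection.
    """
    counts = Counter()
    for genes in projections:
        counts.update(set(genes))  # count a gene at most once per projection
    return counts


# The three best-ranked projections listed above, by gene symbol.
projections = [
    ["HOXB6", "PTPN18", "CDK10", "MMP14", "EMP1", "MAL", "PITX1", "CLEC3B"],
    ["COL4A1", "ACO2", "PRKX", "EMP1", "PPL", "CLEC3B"],
    ["PRKCB1", "IGF2R", "PPL", "CLEC3B"],
]

freq = gene_frequencies(projections)
print(freq.most_common(3))
```

In this small example CLEC3B appears in all three projections, while EMP1 and PPL each appear in two.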