
Data set name: lung


Original data set (Bhattacharjee et al.)
Data set for Orange
Brief description:
This classification model was built to distinguish four types of lung tumor (adenocarcinomas, small-cell lung carcinomas, squamous cell carcinomas, and carcinoids) from one another and from normal lung tissue on the basis of gene expression signatures.

Platform: Affymetrix Human Genome U95Av2 Array

Diagnostic classes:
- adenocarcinoma (AD): 139 examples (68.5%)
- normal lung (NL): 17 examples (8.4%)
- small cell lung cancer (SMCL): 6 examples (3.0%)
- squamous cell carcinoma (SQ): 21 examples (10.3%)
- pulmonary carcinoid (COID): 20 examples (9.9%)
Number of genes: 12600
Number of samples: 203
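The class shares listed above follow directly from the counts. A minimal sketch that recomputes them is given here; the names class_counts and class_shares are illustrative and do not come from the original data files.

```python
# Class counts exactly as listed in the data set description.
class_counts = {"AD": 139, "NL": 17, "SMCL": 6, "SQ": 21, "COID": 20}


def class_shares(counts):
    """Return each class's share of all samples in percent, rounded to one decimal."""
    n = sum(counts.values())
    return {label: round(100.0 * c / n, 1) for label, c in counts.items()}


total = sum(class_counts.values())          # number of samples
shares = class_shares(class_counts)         # percentage per class
majority_label = max(class_counts, key=class_counts.get)

for label, share in shares.items():
    print(f"{label}: {class_counts[label]} examples ({share}%)")
print(f"total: {total}, majority class: {majority_label}")
```

Because adenocarcinoma is by far the largest class, a classifier that always predicts AD would already be correct on about 68.5% of the samples, which is a useful baseline when reading the accuracy figures.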
Predictive accuracy with 10-fold cross-validation (classification using the best projection with eight attributes):
Classification accuracy: 94.07%
Area under curve (AUC): 0.995
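To illustrate what a 10-fold split looks like on a data set this imbalanced, the sketch below assigns the 203 samples to ten folds. It assumes stratified folds, which the source does not state, and the function name stratified_folds is hypothetical. It only builds the folds; it does not reproduce the classifier or the reported scores.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign example indices to k folds, spreading each class as evenly as possible.

    Assumption: stratification is our choice for this sketch, not a detail
    taken from the original experiment.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0  # continue the round-robin across classes to keep fold sizes balanced
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        for j, idx in enumerate(members):
            folds[(offset + j) % k].append(idx)
        offset += len(members)
    return folds


# Labels rebuilt from the class counts given in the description.
labels = ["AD"] * 139 + ["NL"] * 17 + ["SMCL"] * 6 + ["SQ"] * 21 + ["COID"] * 20
folds = stratified_folds(labels, k=10)

# Each fold serves once as the test set; the other nine form the training set.
folds_without_smcl = sum(
    1 for fold in folds if not any(labels[i] == "SMCL" for i in fold)
)
print([len(fold) for fold in folds], folds_without_smcl)
```

With only six SMCL examples, at least four of the ten test folds cannot contain that class, so per-class estimates for SMCL rest on very few test cases.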
The following are the best-ranked visualizations with eight, six, and four attributes, respectively. Ranking is by visualization score, which is highest for visualizations in which examples from different diagnostic classes are best separated:

Score: 92.28%
Genes:
37741_at: pyrroline-5-carboxylate reductase 1, PYCR1
35276_at: claudin 4, CLDN4
613_at: "keratin 5 (epidermolysis bullosa simplex, Dowling-Meara/Kobner/Weber-Cockayne types)", KRT5
36133_at: desmoplakin, DSP
40619_at: ubiquitin-conjugating enzyme E2S, UBE2S
37398_at: platelet/endothelial cell adhesion molecule (CD31 antigen), PECAM1
36160_s_at: "protein tyrosine phosphatase, receptor type, N polypeptide 2", PTPRN2
40825_at: "microtubule-associated protein, RP/EB family, member 3", MAPRE3
Score: 92.80%
Genes:
37741_at: pyrroline-5-carboxylate reductase 1, PYCR1
36105_at: carcinoembryonic antigen-related cell adhesion molecule 6 (non-specific cross reacting antigen), CEACAM6
1814_at: "transforming growth factor, beta receptor II (70/80kDa)", TGFBR2
36160_s_at: "protein tyrosine phosphatase, receptor type, N polypeptide 2", PTPRN2
39990_at: "ISL1 transcription factor, LIM/homeodomain, (islet-1)", ISL1
613_at: "keratin 5 (epidermolysis bullosa simplex, Dowling-Meara/Kobner/Weber-Cockayne types)", KRT5
Score: 85.15%
Genes:
36105_at: carcinoembryonic antigen-related cell adhesion molecule 6 (non-specific cross reacting antigen), CEACAM6
32542_at: four and a half LIM domains 1, FHL1
39990_at: "ISL1 transcription factor, LIM/homeodomain, (islet-1)", ISL1
31791_at: tumor protein p73-like, TP73L
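The attribute-ranking section identifies these projections as RadViz visualizations. As a rough illustration of how a sample is placed in such a plot, the sketch below uses the standard RadViz formulation: each attribute gets an anchor on the unit circle, and a sample is drawn at the average of the anchor positions weighted by its scaled attribute values. The min-max scaling, the anchor order, and the sample values are assumptions for illustration; the source does not give these details.

```python
import math


def min_max_scale(column):
    """Scale a list of values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def radviz_anchors(m):
    """Place m attribute anchors evenly on the unit circle."""
    return [
        (math.cos(2 * math.pi * j / m), math.sin(2 * math.pi * j / m))
        for j in range(m)
    ]


def radviz_point(scaled_values, anchors):
    """Project one sample: weighted mean of anchors, weights = scaled values."""
    total = sum(scaled_values)
    if total == 0:
        return (0.0, 0.0)  # no attribute pulls the point; leave it at the center
    x = sum(v * ax for v, (ax, _) in zip(scaled_values, anchors)) / total
    y = sum(v * ay for v, (_, ay) in zip(scaled_values, anchors)) / total
    return (x, y)


anchors = radviz_anchors(8)  # eight attributes, as in the first projection
# Hypothetical scaled expression values for a single sample (not real data).
sample = [0.9, 0.1, 0.1, 0.0, 0.2, 0.0, 0.1, 0.1]
point = radviz_point(sample, anchors)
print(point)
```

A sample with one dominant gene is pulled toward that gene's anchor, while a sample with similar values on all genes lands near the center, which is why the choice of genes determines how well the classes separate.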

Attribute ranking

The following histogram shows how often each gene appears in the top 100 RadViz visualizations with eight attributes.

Genes:
613_at: "keratin 5 (epidermolysis bullosa simplex, Dowling-Meara/Kobner/Weber-Cockayne types)", KRT5
41231_f_at: high-mobility group nucleosomal binding domain 2, HMGN2
37741_at: pyrroline-5-carboxylate reductase 1, PYCR1
31791_at: tumor protein p73-like, TP73L
35868_at: advanced glycosylation end product-specific receptor, AGER
893_at: ubiquitin-conjugating enzyme E2S, UBE2S
35276_at: claudin 4, CLDN4
268_at: platelet/endothelial cell adhesion molecule (CD31 antigen), PECAM1
36105_at: carcinoembryonic antigen-related cell adhesion molecule 6 (non-specific cross reacting antigen), CEACAM6
894_g_at: ubiquitin-conjugating enzyme E2S, UBE2S
40237_at: "pleckstrin homology-like domain, family A, member 2", PHLDA2
1814_at: "transforming growth factor, beta receptor II (70/80kDa)", TGFBR2
36119_at: "caveolin 1, caveolae protein, 22kDa", CAV1
37398_at: platelet/endothelial cell adhesion molecule (CD31 antigen), PECAM1
32542_at: four and a half LIM domains 1, FHL1
36569_at: "C-type lectin domain family 3, member B", CLEC3B
39990_at: "ISL1 transcription factor, LIM/homeodomain, (islet-1)", ISL1
40304_at: dystonin, DST
38261_at: "ATP-binding cassette, sub-family C (CFTR/MRP), member 3", ABCC3
33693_at: desmoglein 3 (pemphigus vulgaris antigen), DSG3
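A ranking of this kind amounts to counting how many top projections each probe set appears in. The sketch below shows the counting step using only the three projections listed on this page as input; the actual histogram was computed over the top 100 eight-attribute projections, which are not listed here, so the counts below are illustrative only.

```python
from collections import Counter

# Probe sets of the three projections listed on this page (8, 6, and 4 attributes).
projections = [
    ["37741_at", "35276_at", "613_at", "36133_at",
     "40619_at", "37398_at", "36160_s_at", "40825_at"],
    ["37741_at", "36105_at", "1814_at", "36160_s_at", "39990_at", "613_at"],
    ["36105_at", "32542_at", "39990_at", "31791_at"],
]


def attribute_frequencies(projection_list):
    """Count in how many projections each probe set occurs."""
    counts = Counter()
    for probes in projection_list:
        counts.update(set(probes))  # count a probe at most once per projection
    return counts


counts = attribute_frequencies(projections)
for probe, n in counts.most_common():
    print(f"{probe}: {n}")
```

Sorting the counts in descending order gives the ordering used for a histogram of this type.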