
Data set name: GSE1987


Original data set (GSE1987)
Data set for Orange
Brief description:
The classification model for the lung_SQ_AD data set was built from the adenocarcinoma, squamous cell carcinoma, and normal lung tissue samples of the original data set (GSE1987). The adenosquamous sample was assigned to the adenocarcinoma class. The metastasis samples were excluded from our analysis because there were only three of them.

Platform: Affymetrix GeneChip Human Genome U95 Version [1 or 2] Set HG-U95A

Diagnostic classes:
- Squamous cell carcinoma (Squamous): 17 examples (50.0%)
- Adenocarcinoma (Adenocarcinoma): 8 examples (23.5%)
- Normal lung tissue (Normal): 9 examples (26.5%)
Number of genes: 10541
Number of samples: 34
Note: Of the 12625 probe sets originally measured, we kept only those that were called present (P) in at least one sample.
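
The filtering step in the note can be sketched as follows. This is a minimal illustration, not the script used to prepare the data set: it assumes the Affymetrix detection calls (P = present, M = marginal, A = absent) are available as a mapping from probe set ID to a list of per-sample calls, and it reads the note as "keep a probe set if it is called present in at least one sample". The function name `filter_present` and the toy probe IDs are hypothetical.

```python
def filter_present(calls_by_probe):
    """Keep probe sets that have at least one 'P' (present) call."""
    return {
        probe: calls
        for probe, calls in calls_by_probe.items()
        if any(call == "P" for call in calls)
    }


# Toy input with made-up probe IDs and calls (not taken from GSE1987).
toy_calls = {
    "probe_a_at": ["P", "P", "A"],
    "probe_b_at": ["A", "M", "P"],
    "probe_c_at": ["A", "A", "M"],  # never called present, so it is dropped
}

kept = filter_present(toy_calls)
print(sorted(kept))
```

Marginal (M) calls are treated here like absent calls; whether the original filtering did the same is not stated in the note.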
Predictive accuracy with 10-fold cross-validation (classification using the best projection with eight attributes):
Classification accuracy: 65.83%
Area under curve (AUC): 0.763
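
For readers unfamiliar with the evaluation scheme, the sketch below shows how a 10-fold cross-validated classification accuracy can be computed for a data set with the class sizes listed above. It is only an illustration under stated assumptions: the classifier used for the reported figures works on the projection and is not reproduced here, so a placeholder majority-class predictor stands in for it, the folds are stratified by class, and accuracy is pooled over all held-out samples. AUC is not computed. The function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign every sample index to one of k folds, class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    position = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for index in indices:
            folds[position % k].append(index)
            position += 1
    return folds


def cross_validated_accuracy(labels, k=10, seed=0):
    """Pooled accuracy of a majority-class predictor under k-fold CV."""
    correct = 0
    for test in stratified_folds(labels, k, seed):
        held_out = set(test)
        train = [labels[i] for i in range(len(labels)) if i not in held_out]
        # Placeholder model: always predict the most frequent training class.
        majority = Counter(train).most_common(1)[0][0]
        correct += sum(labels[i] == majority for i in test)
    return correct / len(labels)


# Class sizes as listed above: 17 Squamous, 8 Adenocarcinoma, 9 Normal.
labels = ["Squamous"] * 17 + ["Adenocarcinoma"] * 8 + ["Normal"] * 9
accuracy = cross_validated_accuracy(labels)
print(f"{accuracy:.2%}")
```

With these class sizes the placeholder predictor always chooses Squamous, so it is correct on exactly the 17 Squamous samples out of 34. That gives a simple baseline against which the reported accuracy can be read.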
Following are the three best-ranked visualizations, with eight, six, and four attributes, with respect to the visualization score; that is, the visualizations in which examples from different diagnostic classes are best separated:

Score: 99.14%
Genes:
35454_at: phospholipase C-like 4, PLCL4
36280_at: granzyme K (serine protease, granzyme 3; tryptase II), GZMK
34301_r_at: keratin 17, KRT17
36186_at: RNA binding protein S1, serine-rich domain, RNPS1
34800_at: leucine-rich repeats and immunoglobulin-like domains 1, LRIG1
34928_at: zinc finger protein 205, ZNF205
36939_at: glycoprotein M6A, GPM6A
37983_at: angiotensin II receptor, type 1, AGTR1
Score: 98.06%
Genes:
36280_at: granzyme K (serine protease, granzyme 3; tryptase II), GZMK
34301_r_at: keratin 17, KRT17
36244_at: zinc finger protein 239, ZNF239
38582_at: serine protease inhibitor, Kazal type 1, SPINK1
37196_at: cadherin 5, type 2, VE-cadherin (vascular epithelium), CDH5
37983_at: angiotensin II receptor, type 1, AGTR1
Score: 93.84%
Genes:
35454_at: phospholipase C-like 4, PLCL4
34301_r_at: keratin 17, KRT17
41453_at: discs, large homolog 3 (neuroendocrine-dlg, Drosophila), DLG3
1186_at: growth differentiation factor 10, GDF10
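
As background for the gene lists above, the sketch below shows how a single sample is placed in a RadViz projection, assuming the visualizations here are RadViz projections as the attribute ranking section indicates. Each selected gene gets an anchor on the unit circle, expression values are scaled to [0, 1] per gene, and the sample lands at the average of the anchor positions weighted by its scaled values. The visualization score itself is not reproduced; the function names and the even anchor spacing are assumptions made for illustration.

```python
import math


def min_max_scale(column):
    """Scale one attribute's values to [0, 1] across all samples."""
    low, high = min(column), max(column)
    if high == low:
        return [0.0 for _ in column]
    return [(value - low) / (high - low) for value in column]


def radviz_point(values):
    """Project one sample (values already scaled to [0, 1]) into RadViz.

    Anchor j sits at angle 2*pi*j/n on the unit circle; the sample is the
    weighted mean of the anchors, with the scaled values as weights.
    """
    n = len(values)
    total = sum(values)
    if total == 0:
        return (0.0, 0.0)  # no attribute pulls the point; place it at the center
    x = sum(v * math.cos(2 * math.pi * j / n) for j, v in enumerate(values)) / total
    y = sum(v * math.sin(2 * math.pi * j / n) for j, v in enumerate(values)) / total
    return (x, y)


# A sample expressed only in the first of four genes sits on that gene's anchor.
print(radviz_point([1.0, 0.0, 0.0, 0.0]))
```

A sample with equal scaled values for all genes is pulled equally toward every anchor and ends up at the center, which is why well-separating projections need genes whose expression differs between the diagnostic classes.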

Attribute ranking

The following histogram shows how often each gene appears in the top 100 RadViz visualizations with eight attributes.

Genes:
41453_at: discs, large homolog 3 (neuroendocrine-dlg, Drosophila), DLG3
36883_at: keratin 13, KRT13
37196_at: cadherin 5, type 2, VE-cadherin (vascular epithelium), CDH5
36021_at: lymphoid enhancer-binding factor 1, LEF1
38379_at: glycoprotein (transmembrane) nmb, GPNMB
37983_at: angiotensin II receptor, type 1, AGTR1
37582_at: keratin 15, KRT15
34301_r_at: keratin 17, KRT17
1186_at: growth differentiation factor 10, GDF10
40092_at: bromodomain adjacent to zinc finger domain, 2A, BAZ2A
34342_s_at: secreted phosphoprotein 1 (osteopontin, bone sialoprotein I, early T-lymphocyte activation 1), SPP1
34194_at: chloride intracellular channel 5, CLIC5
32971_at: chromosome 9 open reading frame 61, C9orf61
35630_at: lethal giant larvae homolog 2 (Drosophila), LLGL2
1343_s_at: serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 3, SERPINB3
2027_at: S100 calcium binding protein A2, S100A2
37160_at: small proline-rich protein 1B (cornifin), SPRR1B
37473_at: keratin 16 (focal non-epidermolytic palmoplantar keratoderma), KRT16
32633_at: islet cell autoantigen 1, 69kDa, ICA1
575_s_at: tumor-associated calcium signal transducer 1, TACSTD1
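
The counting behind such a histogram can be sketched as below. As input, the example uses only the three top-ranked projections listed earlier on this page, not the 100 eight-attribute projections behind the actual histogram, so the counts illustrate the method and do not reproduce the ranking. The function name `attribute_frequencies` is hypothetical.

```python
from collections import Counter


def attribute_frequencies(projections):
    """Count in how many projections each probe set appears."""
    counts = Counter()
    for probes in projections:
        counts.update(set(probes))  # count a probe set once per projection
    return counts.most_common()


# Probe sets of the three best-ranked projections listed on this page.
top_projections = [
    ["35454_at", "36280_at", "34301_r_at", "36186_at",
     "34800_at", "34928_at", "36939_at", "37983_at"],
    ["36280_at", "34301_r_at", "36244_at", "38582_at", "37196_at", "37983_at"],
    ["35454_at", "34301_r_at", "41453_at", "1186_at"],
]

ranked = attribute_frequencies(top_projections)
for probe, count in ranked[:4]:
    print(probe, count)
```

In this small input, 34301_r_at (KRT17) is the only probe set that appears in all three projections.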