Introduction

Papillary thyroid carcinoma (PTC) is the most common endocrine malignancy and encompasses a variety of morphological and architectural variants, all of which are characterized by a distinctive nuclear appearance. In recent years, PTC has become an important paradigm of solid tumour molecular pathogenesis, principally as a result of the intensive investigation prompted by the Chernobyl accident.

It is clear that this is a complex and contentious area, and that further work is needed to ascertain the underlying molecular biology of this particular variant. Recently, microarray studies have begun to elucidate the molecular pathways underpinning PTC. The overriding objective of these investigations was to identify clinically useful biomarkers. However, the majority of these studies have analysed PTC as though it were a single homogeneous entity, without detailed reference to its sub-variants and, in particular, to the most common variant, the follicular variant (FVPTC). The identification of specific biomarkers of FVPTC and a deeper understanding of its origins are clearly warranted.

The aim of this expression microarray study, which used a novel microarray platform, was twofold: first, to identify markers that distinguish PTC from benign thyroid tissue and lesions and, second, to identify potential markers of FVPTC and further explore its molecular pathology.
Materials and methods

Patients and tissue samples

Table 1 Sample cohort
Identifier; Diagnosis
N1; Normal thyroid tissue
N2; Normal thyroid tissue
N3; Lymphocytic thyroiditis
N4; Nodular hyperplasia
N5; Follicular adenoma
N6; Nodular hyperplasia with focal lymphocytic thyroiditis
N7; Nodular hyperplasia
N8; Follicular adenoma
N9; Nodular hyperplasia
N10; Follicular adenoma
N11; Graves' thyroiditis
T1; Solid/FVPTC
T2; FVPTC
T3; PTC classic morphology
T4; FVPTC-oxyphil
T5; FVPTC
T6; FVPTC
T7; PTC classic morphology
T8; FVPTC
T9; PTC classic morphology
T10; FVPTC
T11; PTC classic morphology
T12; FVPTC
T13; PTC classic morphology
T14; PTC classic morphology
List of the 11 benign and 14 malignant lesions that were used in this study.

RNA isolation and characterization

Samples were ground in liquid nitrogen and homogenised in RLT buffer (Qiagen, UK). RNA was then extracted using the RNeasy Mini Kit with optional on-column RNase-free DNase digestion (Qiagen) according to the manufacturer's instructions. RNA quantity was determined by UV spectrophotometry. RNA quality was assessed using the RNA 6000 Nano LabChip® Kit in conjunction with the Agilent 2100 Bioanalyzer (Agilent Technologies, Waldbronn, Germany).

Microarray analysis

Applied Biosystems Human Genome Survey Arrays were used to analyse the transcriptional profiles of the thyroid RNA samples in this study. Digoxigenin-UTP-labelled cRNA was generated and linearly amplified from 5 μg of total RNA using the Applied Biosystems Chemiluminescent RT-IVT Labelling Kit v 2.0 according to the manufacturer's protocol. For each sample, 10 μg of labelled cRNA was hybridized to a pre-hybridized microarray in a 1.5-ml volume at 55°C for 16 h. Array hybridization and chemiluminescence detection were performed using the Applied Biosystems Chemiluminescence Detection Kit following the manufacturer's protocol. Images were collected for each microarray using the 1700 analyser.
Images were auto-gridded and the chemiluminescent signals were quantified, corrected for background and spot, and spatially normalized.

TaqMan® PCR validation

Statistical analysis

Microarrays were analysed using Spotfire DecisionSite™ for Functional Genomics (Spotfire AB, Göteborg, Sweden) and R version 1.9.1, a free language and environment for statistical computing and graphics (R Development Core Team, 2004). Arrays were initially normalized, and genes were deemed undetectable and, therefore, excluded from the final gene lists if they had a signal-to-noise ratio of less than three (S/N < 3) in more than 18 of the 25 arrays.

http://www.pantherdb.org

Results

Table 2 Genes differentially expressed in malignant vs benign thyroid tissue
Gene Name; Gene Symbol; p; 1700 probe ID
Genes up-regulated in malignant vs benign
Active BCR-related gene; ABR; 0.014482; 154399
Adaptor-related protein complex 2, alpha 1 subunit; AP2A1; 0.0312; 115368
Apoptosis, caspase activation inhibitor; AVEN; 0.030998; 203738
BCL2-associated X protein; BAX; 0.009782; 146510
BH3 interacting domain death agonist; BID; 0.021424; 131216
Brain abundant, membrane attached signal protein 1; BASP1; 0.024214; 198318
Brain acyl-CoA hydrolase; BACH; 0.014566; 133876
Bromodomain adjacent to zinc finger domain, 1A; BAZ1A; 0.03538; 209809
Calcium/calmodulin-dependent protein kinase I; CAMK1; 0.041887; 157712
Cathepsin S; CTSS; 0.046544; 105790
CD44 antigen (homing function and Indian blood group system); CD44; 0.044181; 133604
Chemokine (C-X-C motif) ligand 16; CXCL16; 0.043234; 199059
Chromosome 1 open reading frame 38; C1orf38; 0.049072; 202924
CLIP-170-related protein; CLIPR-59; 0.023629; 102205
Docking protein 1, 62 kDa (downstream of tyrosine kinase 1); DOK1; 0.041898; 204989
Epidermodysplasia verruciformis 1; EVER1; 0.003348; 175569
FXYD domain containing ion transport regulator 5; FXYD5; 0.01444; 154607
FXYD domain containing ion transport regulator 5; FXYD5; 0.023629; 112771
Galactose-4-epimerase, UDP; GALE; 0.047363; 141143
Genethonin 1; GENX-3414; 0.016836; 124360
Hypothetical gene BC008967; BC008967; 0.015683; 108526
Hypothetical protein FLJ10849; FLJ10849; 0.013822; 224983
Hypothetical protein FLJ22531; FLJ22531; 0.024391; 145918
Hypothetical protein MGC4607; MGC4607; 0.006507; 211836
Intercellular adhesion molecule 1 (CD54), human rhinovirus receptor; ICAM1; 0.028746; 109070
Jun dimerization protein p21SNFT; SNFT; 0.043301; 144215
Lectin, galactoside-binding, soluble, 3 (galectin 3); LGALS3; 0.034491; 179836
Major vault protein; MVP; 0.0312; 212354
Matrix metalloproteinase 14 (membrane-inserted); MMP14; 0.038682; 152076
Milk fat globule-EGF factor 8 protein; MFGE8; 0.02392; 144588
Mst3 and SOK1-related kinase; MST4; 0.042028; 112198
Neuronal cell adhesion molecule; NRCAM; 0.011178; 106462
Phospholipase D3; PLD3; 0.034491; 143388
Promyelocytic leukemia; PML; 0.016018; 217558
Protein inhibitor of activated STAT protein PIASy; PIASY; 0.00536; 153434
Protein tyrosine phosphatase, receptor type, E; PTPRE; 0.048653; 221568
Rho GDP dissociation inhibitor (GDI) beta; ARHGDIB; 0.043853; 143589
S100 calcium binding protein A11 (calgizzarin); S100A11; 0.019933; 145550
Similar to rat tricarboxylate carrier-like protein; BA108L7.2; 0.025387; 179870
SP110 nuclear body protein; SP110; 0.0312; 113484
Stimulated by retinoic acid gene 6; FLJ12541; 0.043234; 193986
Syndecan 3 (N-syndecan); SDC3; 0.048804; 143980
Tax interaction protein 1; TIP-1; 0.006673; 119665
TBC1 domain family, member 2; TBC1D2; 0.029907; 205982
Tenascin C (hexabrachion); TNC; 0.032355; 143831
Thymosin, beta 4, Y chromosome; TMSB4Y; 0.040937; 193911
Tissue inhibitor of metalloproteinase 1 (erythroid potentiating activity, collagenase inhibitor); TIMP1; 0.032306; 134692
Topoisomerase (DNA) II alpha 170 kDa; TOP2A; 0.040937; 135302
Transforming growth factor, beta 1; TGFB1; 0.016836; 170749
Transgelin 2; TAGLN2; 0.031971; 172296
Tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, eta polypeptide; YWHAH; 0.039009; 188379
v-yes-1 Yamaguchi sarcoma viral related oncogene homolog; LYN; 0.016175; 194134
Genes down-regulated in malignant vs benign
Aldehyde oxidase 1; AOX1; 0.003227; 106573
Ankyrin 2, neuronal; ANK2; 0.035357; 155780
Aspartate beta-hydroxylase; ASPH; 0.002064; 221656
Aspartate beta-hydroxylase; ASPH; 0.019666; 114180
ATPase, Cu++ transporting, beta polypeptide; ATP7B; 0.044511; 198852
Brain-specific protein p25 alpha; p25; 0.023629; 120622
Casein kinase; LOC149420; 0.022382; 149347
Cellular retinoic acid binding protein 1; CRABP1; 0.008315; 100295
Centromere protein J; CENPJ; 0.0312; 164563
Ceroid-lipofuscinosis, neuronal 5; CLN5; 0.011021; 205999
Chloride channel Kb; CLCNKB; 0.040418; 176266
N; ChGn; 0.013148; 101140
Chromosome 11 open reading frame 8; C11orf8; 0.001977; 174025
Chromosome 11 open reading frame 8; C11orf8; 0.004148; 108279
Chromosome 21 open reading frame 4; C21orf4; 0.0042; 156895
Clusterin-like 1 (retinal); CLUL1; 0.019631; 186062
Component of oligomeric golgi complex 3; COG3; 0.003664; 129212
Coxsackie virus and adenovirus receptor; CXADR; 0.004648; 108284
Crystallin, alpha B; CRYAB; 0.030418; 190274
O; CSE-C; 0.040993; 213856
l; DCXR; 0.001977; 103350
DnaJ (Hsp40) homolog, subfamily B, member 4; DNAJB4; 0.043853; 103618
S. cerevisiae; ERO1LB; 0.013962; 207998
Extracellular link domain containing 1; XLKD1; 0.039738; 195865
Family with sequence similarity 13, member A1; FAM13A1; 0.019631; 116936
Fatty acid binding protein 4, adipocyte; FABP4; 0.014832; 150137
Fc fragment of IgG binding protein; FCGBP; 0.001965; 118361
Fibroblast growth factor receptor 2; FGFR2; 0.0042; 110548
FLJ35740 protein; FLJ35740; 0.020224; 101102
Friedreich ataxia region gene X123; X123; 0.032602; 133505
Glutamate-ammonia ligase (glutamine synthase); GLUL; 0.014649; 175147
l; GATM; 0.013962; 111904
Glycoprotein M6A; GPM6A; 0.011739; 215326
Growth hormone receptor; GHR; 0.017721; 190306
HLA complex group 4; HCG4; 0.025807; 191199
Hypothetical protein BC009561; LOC119710; 0.003227; 211319
Hypothetical protein BC019238; LOC120379; 0.013438; 201200
Hypothetical protein FLJ13204; FLJ13204; 0.003227; 145066
Hypothetical protein FLJ13842; FLJ13842; 0.016448; 208504
Hypothetical protein FLJ14054; FLJ14054; 0.049072; 202017
Hypothetical protein FLJ20154; FLJ20154; 0.014428; 143310
Hypothetical protein FLJ20513; FLJ20513; 0.019493; 154130
Hypothetical protein FLJ32110; FLJ32110; 0.015507; 229492
Hypothetical protein FLJ32343; FLJ32343; 0.012208; 116902
Hypothetical protein FLJ33516; FLJ33516; 0.03965; 224600
Hypothetical protein FLJ37549; FLJ37549; 0.001956; 218577
Hypothetical protein FLJ39378; FLJ39378; 0.005853; 163223
Hypothetical protein FLJ40021; FLJ40021; 0.023629; 174198
Hypothetical protein LOC134285; LOC134285; 0.018694; 163671
Hypothetical protein MGC10946; MGC10946; 0.022382; 195982
Hypothetical protein MGC14425; MGC14425; 0.015445; 161569
Hypothetical protein MGC17299; MGC17299; 0.026062; 168452
Hypothetical protein MGC17943; MGC17943; 0.0042; 147296
Hypothetical protein MGC23980; MGC23980; 0.018694; 224619
Hypothetical protein MGC24047; MGC24047; 0.001956; 138122
Hypothetical protein MGC33607; MGC33607; 0.033547; 100645
Ionized calcium binding adapter molecule 2; IBA2; 0.0042; 179489
KIAA0390 gene product; KIAA0390; 0.014832; 119936
KIAA0703 gene product; KIAA0703; 0.032602; 146652
Lectin, mannose-binding, 1; LMAN1; 0.031092; 179632
Leiomodin 1 (smooth muscle); LMOD1; 0.022352; 120404
Likely ortholog of rat SNF1/AMP-activated protein kinase; SNARK; 0.044605; 157942
LIM domain kinase 2; LIMK2; 0.002409; 151439
Low density lipoprotein-related protein 1B (deleted in tumors); LRP1B; 0.00536; 209464
Low density lipoprotein-related protein 2; LRP2; 0.040937; 114919
Matrilin 2; MATN2; 0.0042; 167316
Metallothionein 1A (functional); MT1A; 0.013822; 204773
Metallothionein 1A (functional)|metallothionein 1E (functional)|metallothionein 1K|metallothionein 2A; MT1A|MT2A|MT1E|MT1K; 0.027037; 146368
Metallothionein 1A (functional)|metallothionein 1E (functional)|metallothionein 2A|metallothionein 1K; MT1A|MT2A|MT1K|MT1E; 0.043841; 182305
Metallothionein 1A (functional)|metallothionein 2A|metallothionein 1K|metallothionein 1E (functional); MT1A|MT1K|MT1E|MT2A; 0.011739; 223856
Metallothionein 1B (functional); MT1B; 0.019631; 174119
Metallothionein 1F (functional); MT1F; 0.024726; 144569
Metallothionein 1G; MT1G; 0.0192; 164525
Metallothionein 1G; MT1G; 0.03965; 171539
Metallothionein 1J; MT1J; 0.008315; 227956
Metallothionein 1X; MT1X; 0.008335; 119685
Metallothionein 1X; MT1X; 0.010748; 173072
Metallothionein IV; MT4; 0.007447; 223241
Methionine adenosyltransferase II, alpha; MAT2A; 0.014428; 158350
Mitogen-activated protein kinase 4; MAPK4; 0.042306; 131252
Myc-induced nuclear antigen, 53 kDa; MINA53; 0.011959; 130284
NIMA (never in mitosis gene a)-related kinase 11; NEK11; 0.001965; 194628
Otospiralin; LOC150677; 0.018694; 182360
PDZ/coiled-coil domain binding partner for the rho-family GTPase TC10; PIST; 0.013822; 103651
Phospholipase A2 receptor 1, 180 kDa; PLA2R1; 0.004029; 134379
Phospholipase C-like 1; PLCL1; 0.022657; 206894
Phosphatidylinositol transfer protein, beta; PITPNB; 0.014428; 122698
Polycystic kidney and hepatic disease 1 (autosomal recessive)-like 1; PKHD1L1; 0.0042; 199896
Polymerase (DNA directed) iota; POLI; 0.003227; 167492
Potassium channel, subfamily K, member 9; KCNK9; 0.000849; 108648
Potassium channel-interacting protein 4; KCNIP4; 0.011447; 147058
Potassium inwardly-rectifying channel, subfamily J, member 13; KCNJ13; 0.008972; 124187
pp21 Homolog; LOC51186; 0.004326; 127636
Pre-B cell leukemia transcription factor 4; PBX4; 0.030311; 199118
Protein kinase, cAMP-dependent, catalytic, beta; PRKACB; 0.023863; 198878
Protein phosphatase 4, regulatory subunit 2|hypothetical protein LOC151987; PPP4R2|LOC151987; 0.011338; 200919
RAB23, member RAS oncogene family; RAB23; 0.000659; 122394
Ras association (RalGDS/AF-6) domain family 6; RASSF6; 0.048804; 119072
Sarcoglycan, delta (35 kDa dystrophin-associated glycoprotein); SGCD; 0.011178; 120415
Serum deprivation response (phosphatidylserine binding protein); SDPR; 0.011021; 156433
SH3 and multiple ankyrin repeat domains 2; SHANK2; 0.043996; 193906
Solute carrier family 26, member 7; SLC26A7; 0.0042; 225067
Solute carrier family 26, member 7; SLC26A7; 0.005853; 213530
Solute carrier family 5 (iodide transporter), member 8; SLC5A8; 0.031284; 231731
SPARC related modular calcium binding 2; SMOC2; 0.021505; 135930
Syndecan 2 (heparan sulfate proteoglycan 1, cell surface-associated, fibroglycan); SDC2; 0.001258; 209676
Syntaxin 12; STX12; 0.01117; 199949
T-box 22; TBX22; 0.024297; 177517
Thioredoxin-like, 32 kDa; TXNL; 0.001102; 192552
Thyroid stimulating hormone receptor; TSHR; 0.02176; 108606
Tissue inhibitor of metalloproteinase 4; TIMP4; 0.023629; 184795
Trefoil factor 3 (intestinal); TFF3; 0.004648; 114445
Trefoil factor 3 (intestinal); TFF3; 0.014428; 100949
N d N; GALNT9; 0.031284; 161042
S. pombe; WEE1; 0.024101; 123533
WW domain containing oxidoreductase; WWOX; 0.012208; 224298
WW domain containing oxidoreductase; WWOX; 0.024101; 135080
Zinc finger protein 258; ZNF258; 0.013962; 225961
Zinc finger protein 36, C3H type-like 2; ZFP36L2; 0.018837; 210469

Fig. 1 Hierarchical clustering of samples. This heat map shows the clustering of the 25 samples based on the 236 probes found to be differentially regulated in benign vs malignant thyroid tissue.
Clustering was performed using the unweighted pair group method with arithmetic mean (UPGMA), with Euclidean distance as the similarity measure. The average value was used as the ordering function.

ANOVA tests were used to determine which genes were differentially regulated in the FVPTC cohort only. Fifteen genes were identified: cluster of differentiation 14 (CD14), CD74, CTSC, CTSH, CTSS, DPP6, ETHE1, human leucocyte antigen A (HLA-A), HLA-DMA, HLA-DPB1, HLA-DQB1, HLA-DRA, osteoclast stimulating factor 1 (OSTF1), TDO2, and a previously uncharacterized gene (noname).

Table 3 Correlation between TaqMan® and microarray data
Gene; r; p
Genes differentially expressed in FVPTC vs classic PTC
CD14; 0.83; <0.0001
CD74 a; 0.87; <0.0001
; 0.74; 0.0002
CTSC; 0.53; 0.0170
CTSH; 0.71; 0.0005
CTSS; 0.62; 0.0037
DPP6; 0.76; 0.0001
ETHE1; 0.30; 0.2008
HLA-A; 0.77; <0.0001
HLA-DMA; 0.75; 0.0001
HLA-DPB1; 0.96; <0.0001
HLA-DQB1 a; 0.70; 0.0006
; 0.63; 0.0031
HLA-DRA; 0.82; <0.0001
NONAME; −0.02; 0.9323
OSTF1; 0.15; 0.5171
TDO2; 0.68; 0.0009
Genes differentially expressed in benign vs malignant
BAX; 0.20; 0.3997
CAMK1; 0.17; 0.4771
CD44; 0.57; 0.0094
CTSS; 0.62; 0.0037
CXADR; 0.36; 0.1152
FGFR2; 0.84; <0.0001
GALE; 0.54; 0.0138
ICAM1; 0.62; 0.0038
LYN; 0.45; 0.0483
MAPK4; 0.76; <0.0001
MMP14; 0.36; 0.1175
MT1F; 0.92; <0.0001
a; 0.68; 0.0011
; 0.69; 0.0007
; 0.70; 0.0006
a; 0.88; <0.0001
; 0.88; <0.0001
RAB23; 0.70; 0.0006
S100A11; 0.61; 0.0041
SDC2; 0.34; 0.1415
SDC3; 0.57; 0.0087
a; 0.97; <0.0001
; 0.96; <0.0001
TGFB1; 0.56; 0.0098
TIMP1; 0.73; 0.0003
TIMP4; 0.80; <0.0001
TOP2A; 0.07; 0.7769
TSHR; 0.18; 0.4440
Gene expression profiles for TaqMan® PCR and microarray results were compared using the Pearson coefficient.

Discussion

The primary aim of this study was to generate an overview of molecular markers of malignancy in PTC, with a view to identifying discriminators between the common sub-types (classic PTC and FVPTC), using genome-wide expression microarray technology validated by TaqMan® RT-PCR.
To this end, lesions that were well characterized histologically were selected.

LGALS3 [9, 16, 20], LYN [46], TFF3 [2, 9, 16], CRABP1 [9, 16], BAX [2], MAPK4 [28], CD44 [16], TIMP1 [20], FGFR2 [9], S100A11 [20]

Correlation of the highlighted features with the current state of knowledge of the molecular pathology of thyroid neoplasia goes some way towards providing an external validation of the data obtained from the AB1700 system. However, additional validation using TaqMan® RT-PCR was also performed.

A particular focus of this study was to compare the transcriptome profiles of PTCs with classic morphology and FVPTC, given the propensity for FVPTC lesions to prove problematic from a diagnostic perspective. Although the study confirms the close relationship between the two most common variants of PTC, a narrow portfolio of genes and, in particular, of gene functions was identified in the FVPTC cohort. The targets identified are readily amenable to analysis by more established techniques such as TaqMan® RT-PCR, and therefore have potential as additional markers for application in the fine-needle aspiration cytology (FNAC) setting. The potential biomarkers identified in this study will require prospective evaluation in real clinical diagnostic situations, both to consolidate their merit as adjunctive diagnostic tests and to validate their altered expression states in the pathobiology of PTC development.
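The TaqMan® validation reported in Table 3 rests on the Pearson correlation coefficient between the two expression profiles of each gene. As an illustration only, the sketch below shows how such a coefficient could be computed for one gene measured on both platforms. The function names and the expression values are hypothetical and are not taken from this study; the t statistic (with n − 2 degrees of freedom) is one common way of testing whether a correlation differs from zero, and is an assumption here rather than a description of the software actually used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two paired, equal-length profiles."""
    if len(xs) != len(ys):
        raise ValueError("both profiles must cover the same samples")
    n = len(xs)
    if n < 3:
        raise ValueError("at least three paired samples are needed")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for the null hypothesis r = 0, with n - 2 degrees of freedom."""
    if abs(r) >= 1:
        raise ValueError("t is unbounded when |r| = 1")
    return r * sqrt((n - 2) / (1 - r * r))


# Hypothetical log2 expression values for one gene across six samples
# (illustrative numbers only, not data from this study).
microarray = [7.1, 8.4, 6.9, 9.8, 10.2, 9.5]
taqman = [6.8, 8.9, 7.2, 9.1, 10.6, 9.9]

r = pearson_r(microarray, taqman)
print(f"r = {r:.2f}, t = {t_statistic(r, len(taqman)):.2f}")
```

A coefficient close to 1 indicates that the two platforms rank the samples in much the same way for that gene, whereas values near 0, as seen for a few genes in Table 3, indicate that the microarray result was not reproduced by TaqMan® PCR.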