Introduction

In fields where immune monitoring approaches are largely standardized, external proficiency panel programs serve two purposes: to provide regulatory and sponsoring agencies with confidence that reported data are generated following the standards and rigor necessary to support product licensure, and to provide an external validation tool for individual laboratories. In contrast, immune monitoring approaches in the cancer vaccine field are more heterogeneous, owing to the vast variety of vaccine designs, cancer types, and the availability of antigen-presenting cells. Standardization of the entire Elispot protocol across laboratories is therefore not feasible. We set out to devise a strategy to identify issues and deficiencies in current Elispot practices and to identify common sources of assay variability within and between laboratories, with the extended goal of standardizing the identified factors in an assay harmonization effort across laboratories.

In 2005, the Cancer Vaccine Consortium of the Sabin Vaccine Institute (CVC/SVI) initiated an Elispot proficiency panel program to achieve this goal. In addition to offering an external validation program, the CVC addressed the need for such a strategy by comparing assay performance across the field, identifying critical protocol choices, and gaining an overview of training and validation practices among participating laboratories. For this program, predefined PBMCs from four donors with different ranges of reactivity against two peptide pools were sent to participants for Elispot testing. Laboratories further had to provide cell recovery and viability data, as well as respond to surveys describing their protocol choices and their training and validation status. In response to the survey results, the CVC/SVI established requirements for laboratories to participate in future proficiency panels, including the existence of a Standard Operating Procedure (SOP) prior to joining the program. Further, individualized assay performance assessment was offered to all laboratories, together with suggestions for implementing protocol optimization steps.

Fig. 1 Initial guidelines for harmonization of the Elispot assay to optimize assay performance and reproducibility, derived from the findings and trends observed in two international proficiency panels

Materials and methods

Participants and organizational setup

All participants were members of the Cancer Vaccine Consortium or its affiliated institutions, the Ludwig Institute for Cancer Research (LICR) and the Association for Immunotherapy of Cancer (C-IMT). Laboratories were located in ten countries (Australia, Belgium, Canada, Germany, Italy, Japan, France, Switzerland, UK, and USA). Each laboratory received an individual lab ID number. Panel leadership was provided by a scientific leader experienced in Elispot, in collaboration with the CVC Executive Office. SeraCare BioServices (Gaithersburg, MD) served as the central laboratory, providing cells, pretesting, and shipping, as well as logistical services such as blinding of panelists; lab IDs were not revealed to the panel leader, the CVC, or the statistician during the panel. Thirty-six laboratories, including the central lab, participated in panel 1, and 29, including the central lab, in panel 2. Twenty-three laboratories participated in both panels; six new panelists were added in the second testing round, and thirteen dropped out after the first panel. The main reasons for dropping out were a shift in assay priorities and failure to meet the criteria for panel participation. Several groups stated that one-time participation fulfilled their need for external validation.
PBMCs and peptides

Each vial of PBMCs contained enough cells to ensure a recovery of 10 million cells or more under SeraCare's SOP. Peptide pools were resuspended in DMSO and further diluted with PBS to a final concentration of 20 μg/ml. Aliquots of 150 μl of peptide pool were prepared for final shipment to participants, and corresponding PBS/DMSO aliquots were prepared for medium controls. Participants were blinded to the content of these vials, which were labeled "Reagent 1, 2 or 3". All cells and reagents sent to participants in both panels were obtained from the same batches. Cell and reagent vials were shipped to all participants by overnight delivery on dry ice sufficient for 48 h; shipping was performed by SeraCare under its existing SOPs.

Elispot

Participants received a plate layout template and instructions for a direct IFNγ Elispot assay, which had to be performed in a single Elispot plate. Each donor was tested in six replicates against three reagents (medium, CEF peptide pool, and CMV peptide pool). A further 24 wells, receiving T cell medium only, were tested for the occurrence of false positive spots. PBMCs were tested at 200,000 cells/well against 1 μg/ml peptide pool or the equivalent amount of PBS/DMSO (the plate and dilution arithmetic is illustrated in the sketch following Table 1). All other protocol choices were left to the participants, including the Elispot plate, antibodies, spot development, use of DNAse, resting of cells, T cell serum, cell counting method, and spot counting method. All plates were reevaluated in blinded fashion at ZellNet Consulting (Fort Lee, NJ) with a KS Elispot system (Carl Zeiss, Thornwood, NY) running KS Elispot software 4.7 (panel 1) or 4.9 (panel 2). Each well in each plate was audited.

Statistical analysis

Results

Feasibility

In the first proficiency panel, shipping and Elispot testing among 36 laboratories from nine countries were conducted without delays. The success of this panel demonstrates the feasibility of such large international studies, the largest of this format to date, under the described organizational setup. The second panel, with 29 participating laboratories from six countries, followed the approach of panel 1. However, customs delays of dry ice shipments to some international sites required repeated shipments of cells and antigens to these destinations. Based on this experience, dry shippers with liquid nitrogen are being implemented for international destinations in the third CVC panel round in 2007.

Recovery and viability of PBMCs in panels 1 and 2

Table 1 Cell recovery and viability in both proficiency panels (P1 = panel 1, P2 = panel 2)

Recovery (×10^6 cells)
Donor   Mean P1/P2   Median P1/P2   Minimum P1/P2   Maximum P1/P2
1       12.5/11.5    12.0/12.0      6.8/1.0         26.4/18.8
2       13.3/12.3    13.2/12.6      5.5/0.3         28.4/22.4
3       13.1/12.8    13.6/12.8      6.7/1.7         25.1/24.3
4       12.8/12.4    11.9/12.9      5.6/6.2         34.4/22.7

Viability (%)
Donor   Mean P1/P2   Median P1/P2   Minimum P1/P2   Maximum P1/P2
1       88/85        89/89          48/58           100/98
2       87/88        89/90          57/67           100/98
3       91/87        93/90          54/69           100/100
4       86/89        90/92          43/75           100/98

Interestingly, only 4/10 laboratories in panel 1 and 4/7 in panel 2 with recoveries below 8 million cells were from international locations. Similarly, only 1/5 laboratories in panel 1 and 2/5 in panel 2 reporting viabilities below 70% belonged to international sites. Had dry ice transit compromised the cells, international sites would have been overrepresented among laboratories with low recovery or viability; these results therefore demonstrate that the shipping destination had no effect on overall cell recovery and viability.
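As referenced in the Elispot subsection above, the following minimal sketch makes the plate layout and peptide dilution arithmetic concrete. It is illustrative only: the script and its constant names are ours, not part of any panel SOP; the figures (4 donors, 3 reagents, 6 replicates, 24 background wells, 200,000 PBMC/well, 20 μg/ml stock, 1 μg/ml final concentration) are taken from the sections above.

# Illustrative plate/dilution arithmetic for the panel layout described above.
# Names and script are hypothetical; the numbers come from the text.
DONORS = 4
REAGENTS = 3              # medium control, CEF pool, CMV pool
REPLICATES = 6
BACKGROUND_WELLS = 24     # T cell medium only, to detect false positive spots
CELLS_PER_WELL = 200_000
STOCK_UG_PER_ML = 20.0    # peptide pool concentration as shipped
FINAL_UG_PER_ML = 1.0     # concentration tested in the well

test_wells = DONORS * REAGENTS * REPLICATES        # 72 antigen/medium wells
total_wells = test_wells + BACKGROUND_WELLS        # 96: exactly one plate
cells_per_donor = REAGENTS * REPLICATES * CELLS_PER_WELL   # 3.6 million

assert total_wells == 96, "layout must fit a single 96-well Elispot plate"
assert cells_per_donor <= 10_000_000  # each vial guarantees >= 10 million PBMC

dilution_factor = STOCK_UG_PER_ML / FINAL_UG_PER_ML  # 20-fold into the well
print(f"{test_wells} test wells + {BACKGROUND_WELLS} background wells = {total_wells}")
print(f"{cells_per_donor:,} PBMC per donor; peptide stock diluted {dilution_factor:.0f}-fold")

Note the comfortable margin between the 3.6 million cells required per donor and the guaranteed recovery of at least 10 million, which leaves room for losses during thawing and counting.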
We also investigated whether low (<8 million) or very high (>20 million) cell recovery influenced spot counts, on the assumption that such values could reflect erroneous cell counts leading to too low or too high cell dilutions, which in turn would produce too high (where the cell number was underestimated) or too low (where it was overestimated) spot counts. However, except for a few sporadic incidents, there was no correlation between cell counts and spot counts in either direction (data not shown). Only one laboratory with low recovery and low viability showed peptide pool-specific spot counts well below the panel median for all donors.

Elispot results in panel 1

All 36 laboratories completed testing of all four donors against medium, the CEF peptide pool, and the CMV peptide pool. Spot appearance and size, as well as the occurrence of artifacts, differed dramatically among laboratories (not shown). Four outlier laboratories were identified that detected fewer than half of the responses correctly. In all four cases, the detected responses were well below the panel median, and there was often high background reactivity (up to 270 spots/well) in the medium controls. No obvious protocol choices could be identified that might have been responsible for the suboptimal performance. One of the four laboratories had little experience at the time of the panel; another reported that a less experienced scientist performed the assay; a third repeated the assay and was able to perform adequately; no feedback was available from the fourth group.

Fig. 2 Box plots of panel 1 and panel 2 results
Fig. 3 Results from Labs X, Y, and Z

A similar scenario was found for response detection against the CEF peptide pool. Thirteen laboratories failed to detect the response, one because of high reactivity against medium and one because of inaccurate spot counting. An interesting observation was that 23 groups reported false positive spots in the range of 1–26 spots/well. Reevaluation revealed that the actual number of groups with false positive spots was lower (12) and that the number of false positive spots per well fell between 0 and 8. (An illustrative replicate-based response call is sketched after the protocol survey below.)

Survey results about protocol

During the first panel, participants had to provide information about their protocol choices: plates, potential prewetting of PVDF, antibodies, enzyme, substrate, use of DNAse during PBMC thawing, resting of cells, serum used, cell counting, and plate reader. There was a wide range of protocol choices across the panel participants. The most common choices for the parameters listed above were as follows: use of PVDF plates (64%) prewetted with ethanol (52%), coated with Mabtech antibodies (67%) at 0.5–0.75 μg/well (33%); spot development with HRP (53%) from Vector Laboratories (33%) using AEC (44%) from Sigma (42%); no use of DNAse when thawing cells (83%) and no resting period for cells prior to the assay (53%); use of human serum (64%); cell counting by trypan blue exclusion with a hemocytometer (78%); and plate evaluation with a Zeiss reader (36%). There were clear trends for international sites, with preferred use of the AP/BCIP/NBT development system (83 vs. 29% in the US), nitrocellulose plates (75 vs. 21% in the US), and Mabtech antibodies (83 vs. 58% in the US). On the other hand, 7/8 laboratories using an automated cell counter were located in the US.

Fig. 4 Box plots (panels a and b)
Fig. 5 Donors 1–4 against medium (panels a and b)
Fig. 6
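As referenced above, the following sketch illustrates the kind of replicate-based response call an individual laboratory might apply when deciding whether antigen wells differ from medium controls. It is purely illustrative: the panel's actual response definition is not given in this section, and the threshold values, function name, and example counts below are our assumptions.

# Illustrative response call: antigen replicates vs. medium controls.
# The criteria below are assumptions, not the panel's actual definition.
from statistics import mean
from scipy.stats import mannwhitneyu

def response_detected(antigen_wells, medium_wells, alpha=0.05, min_ratio=2.0):
    """Flag a response when antigen spot counts are significantly higher
    than medium controls and at least min_ratio-fold above the medium mean
    (the fold criterion guards against calls driven by high background)."""
    _, p = mannwhitneyu(antigen_wells, medium_wells, alternative="greater")
    ratio = mean(antigen_wells) / max(mean(medium_wells), 1.0)
    return p < alpha and ratio >= min_ratio

# Six replicate wells per condition, matching the panel layout; counts invented.
cmv_wells = [48, 52, 39, 61, 44, 50]
medium_wells = [3, 1, 4, 2, 0, 2]
print(response_detected(cmv_wells, medium_wells))  # True

A rule of this shape also shows why high background defeats detection: with medium controls anywhere near the 270 spots/well seen in one outlier laboratory, the fold criterion cannot be met even when true responses are present.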
Survey results about validation and training practices

During the lively discussion of the panel 1 results and its protocol survey at the annual CVC meeting in Alexandria in November 2005, it was suggested that the level of experience, standardization, and validation of the participating laboratories might account for the variability and performance differences observed. In response, we conducted a survey among panelists, in which 30 laboratories participated. As expected, experience and Elispot usage varied considerably. Some laboratories had established the Elispot assay less than one year before panel testing, whereas others had used the assay for more than 10 years. The experience of the scientist actually performing the panel assay also varied widely. Interestingly, even though two-thirds of participants reported having specific training guidelines in place for new Elispot performers, more than 50% never or rarely re-checked a scientist's performance after the initial training. Almost all laboratories indicated that they used an SOP that had been at least partially qualified and/or validated. Validation tools and strategies varied widely: only 12 groups monitored variability, whereas 23 reported the use of external controls of some kind (e.g., T cell lines, predefined PBMCs, parallel tetramer testing). Thirteen groups had implemented some kind of assay acceptance criteria; among the 20 different criteria reported, not one was described by more than one lab. Mirroring these survey results, 20 laboratories believed that they needed to implement more validation steps. All but one group expressed strong interest in published guidelines for validation and training strategies for Elispot.

Elispot results in panel 2

Of the 23 laboratories repeating the panel, eight had changed their protocol before panel 2, three of them in direct response to results from the first panel. One outlier laboratory from panel 1 participated in panel 2 and improved its performance, detecting all responses correctly according to the central reevaluation counts; only its own lab-specific evaluation failed to detect the low CMV responder. Overall, 4/23 panel-repeating laboratories (17%) did not detect the low CMV responder, three of which had also missed this response in panel 1. Ten of the 23 groups had missed the low CMV responder in panel 1, and seven of these laboratories detected it in panel 2; only one repeater detected the low CMV response in panel 1 but not in panel 2. This is a clear performance improvement for the repeating group (47% missed this response in panel 1) and highlights the usefulness of repeated participation in panel testing as an external training program.

We ran an in-depth analysis of the results and, where available, the previous survey responses of participants who missed the weak responder, as well as of laboratories with marginal response detection, supplemented by personal communication, and were able to narrow down the possible sources of these performances. Two laboratories missed responses because of inaccurate evaluation: as central reevaluation revealed, they either included artifacts in their spot counts or simply did not count the majority of true spots. One laboratory did not follow the assay guidelines. The majority of laboratories, however, followed common protocol choices but had either very low response detection across all donors and antigens, or very high background reactivity in some or all donors.
This pattern pointed to serum as a possible cause of suppressed reactivity or non-specific stimulation. Three of these laboratories shared with the CVC that retesting their serum choice indicated they had worked with a suboptimal serum during the panel, and that they have since successfully introduced a different serum or medium into their protocol, with improved spot counts. Serum choices across the panel included human AB serum, FCS, FBS, and various serum-free media; no difference in assay performance was detectable between these groups.

Discussion

Initial Elispot Harmonization Guidelines for Optimizing Assay Performance

Acceptance criteria for assay performance were used by only a limited number of laboratories, and each criterion was unique to the laboratory that used it. These observations should be a wake-up call for the immune monitoring community, which includes not only the cancer vaccine field but also the infectious disease and autoimmunity fields, among others. General assay practices for the detection of antigen-specific T cells are comparable across all of these fields. The CVC, as part of the Sabin Vaccine Institute, intends to develop and strengthen collaborations with groups from other research and vaccine development areas. Published documents with specific criteria for Elispot assay validation, assay acceptance criteria, and training guidelines will be of great value to the immune monitoring field and are now being established as CVC guidelines as a result of the studies described here. Continuous external validation programs need to be part of these efforts in order to monitor the success of inter-laboratory harmonization, including assay optimization, standardization, and validation, as well as the laboratory-specific implementation of guidelines and protocol recommendations. These efforts are essential to establish the Elispot assay and other immune assays as standard monitoring tools for clinical trials.