Background 1 2 3 4 5 6 7 8 9 10 11 12 13 Methodology Functional profiling of Array-CGH experiments under this new perspective would require of three steps: 1) detection of regions with copy number variations (the origin of the disease), 2) detection of regional alterations in gene expression (the causes of the disease) and 3) analysis of enrichment in functional terms in the detected regions (the consequences of the alteration or the functional basis of the disease). While copy number alterations can be detected by means of different methods, alterations in the levels of gene expression are not always easy to be detected using the typical methods (t-test or similar) due several factors such as small sample sizes. For this reason here we will only use plots to visualize the effect of one variable (copy number) into the other one (expression level). The third step, the functional profiling, becomes then the most important aspect of the analysis given that it will provide a functional explanation of the molecular basis of the disease caused by copy number alterations. Detection of copy number alterations 14 The isowindow method tries to identify boundaries between regions with a significant change in the values of intensity of hybridisation of the probes by some consecutive steps. Firstly a t-test is used to determine differences between regions around all possible boundary points. Once all the candidate boundaries have been selected (a liberal p-value is used at this stage) there are sorted from small to high minimum p-values. In a second step the boundary candidates in the list with overlapping neighbourhoods are filtered to obtain a refined list of optimal non-overlapping boundary candidates. All the p-values are recalculated for the redefined neighbourhoods and a more stringent threshold is applied here. Finally, regions at both sides of each boundary candidate are again compared with a t-test. If they are not significantly different in their average hybridisation values, then they are merged as a unique region. Otherwise they define two regions with different copy number value. This is a simple and quick procedure that allows for easily changing from fine to coarse resolution by modifying the thresholds for the p-values. 15 14 4 An approximation to the relative performances of the methods used was obtained by means of simulated data sets. Such datasets were generated by means of a piecewise constant function plus random alterations normally distributed with mean value and three different levels for the standard deviation (corresponding to noise levels 0.2, 0.5 and 1). A mean value of 0 would correspond to a normal region, without copy number alterations, while mean values lower and higher would correspond to deletions or amplifications at different degrees, respectively. Amplified and deleted regions of different sizes are randomly situated within the simulated normal chromosome and the methods have to locate them at different noise levels. The method proposed here performs at least as well as the GLAD and CBS (Table 1) while being more efficient in terms of runtimes. Isowindow shows a better performance in finding small amplicons. Functional profiling of regions with copy number alterations 11 17 Discussion In Silico 18 13 19 20 21 Using the segmentation method as implemented in the ISACGH we could detect a putative amplicon in the chromosome 18 (which remained undetected with both GLAD and CBS, because of the low density of the array, although the effect would have been the same in a high density arrays with a small amplicon) The figure shows the region (left) and the slight, although appreciable, differences in gene expression levels within the amplicon (right). 16 17 8 10 22 19 21 23 Conclusion 4 8 10 The methods for functional profiling have proven in many scenarios its usefulness. An obvious challenge is to increase our knowledge in different aspects of function and cooperation between genes in order to be able of applying this methods in a way that allows us to unravel new unknown functional aspects of the biology of the cell and their connections to pathologies.