To investigate the molecular changes underlying the proliferative effects of PACAP we performed a gene expression analysis using a 16 k cDNA microarray. The experimental design depicted in Figure 1 was utilised. NSCs were isolated from a pool of lateral ventricular wall tissue from 15 mice and grown in culture as neurospheres, after which they were dissociated and cultured to form secondary neurospheres. Secondary neurospheres were dissociated and split into eight parallel cultures. The cells were used to study the culture variance and three different treatments, each replicated in two individual cultures (indicated by letters a and b in Figure 1.) We used amplification and culture replicates to control for technical and biological variation respectively, and controls for differentiation and proliferation activation to verify our results.
Figure 1
Neurosphere culturing and experimental design. Neural stem/progenitor cells were isolated from the lateral ventricle wall region of brains from a pool of mice and grown as neurospheres. RNA was isolated from different treatments as indicated and used for subsequent microarray hybridisations. a and b indicate biological replicates. Each arrow represents the number of hybridisations, arrowhead represents labelling with Cy5 and arrow tail represents labelling with Cy3. Groups 1, 2 and 3 indicate hybridisations grouped in the data analysis, to optimise the variance estimates for each gene. NS = neurosphere control, PACAP = pituitary adenylate cyclase-activating polypeptide treated samples, Prol Cont = proliferation control (transmembrane receptor agonist treated samples), Diff Cont = differentiation control (fetal calf serum treated, and solid support plated samples).
In all comparisons undifferentiated neurospheres (NS), maintained in culture medium supplemented with epidermal growth factor (EGF), was used as a reference control sample. In the first of three treatment regimes neurospheres were induced to proliferate in response to PACAP by replacing the EGF supplemented culture medium with medium supplemented with PACAP. In the proliferation control, the cells were induced to proliferate in response to a transmembrane receptor agonist (TMR agonist) by replacing EGF with the agonist in the neurosphere culture medium. In the third treatment, the differentiation control, neurospheres were induced to differentiate into neurons, astrocytes and oligodendrocytes by replacing the EGF supplemented culture medium with medium supplemented with fetal calf serum, and plating the cells onto poly-D-lysine plates.
mRNA was isolated from all cell cultures and used in a series of cDNA microarray hybridisations. To avoid extensive passaging of the neurospheres, a limited amount of cells were obtained, and the generated RNA was not sufficient for labelling and subsequent microarray analysis. The RNA was therefore amplified using a previously described protocol [22, 23], that has recently also been evaluated for neurosphere analysis [21]. The principle relies on incorporating a biotin moiety into the cDNA during the first-strand cDNA synthesis, by using a biotinylated oligo(dT) primer. The population of cDNAs is fragmented and the biotinylated 3'ends captured onto a streptavidin-coated solid support. The isolated cDNA tags are released from the support, amplified by PCR and labelled for subsequent microarray hybridisation.
Amplified and labelled cDNA from each treated sample was hybridised against the neurosphere control sample (NS). In order to measure the technical variation self-to-self hybridisations were made with material from NS sample a. In addition, to measure the variation between two identical cultures, the two NS replicates (a and b) were hybridised against each other. For each comparison two replicate and two dye-swap hybridisations were performed.
We have previously shown that neurosphere culture passaging or prolonged culturing per se, is sufficient to induce differential expression and that this should be taken into account in the design of the experiment [21]. To address these issues, i.e. to get a variance measure from as many slides as possible, the results from all hybridisations were divided into three groups prior to data analysis, as indicated in Figure 1. Group 1 contains eight hybridisations comparing the technical amplification replicates (NS a-NS a) and the biological culturing replicates (NS a-NS b). Groups 2 and 3 consist of sixteen hybridisations each and include the NS a-NS b hybridisations as well as one replicate of each treatment comparison (NS vs. PACAP treated, NS vs. differentiation control and NS vs. proliferation control). This scheme allows for estimation of technical and biological noise (Group 1). Also, by using the NS a and NS b samples as reference samples in each group, the "contrasts" (i.e. the calculated differences between two treatments) can be calculated and compared between the groups (data not shown). The data in all three groups was filtered (for details see Methods) and print-tip lowess normalised using identical criteria. For each group individual gene-wise variances were calculated, and taken into account in the identification of differentially expressed genes using the empirical Bayes moderated t-test [24–26]. For each comparison the log-odds ratio (B-value) was used to rank the genes in order of evidence for differential expression. Higher B-value indicates higher probability of differential expression.
In order to investigate the magnitude of differential expression in each comparison the M-value (log2(sample X intensity/sample Y intensity)) for each gene was compared to the corresponding B-value (Figure 2). Genes with a high M-value usually receive a high B-value, which gives the plots the characteristic volcano shape. Figure 2 shows that the number of differentially expressed (DE) genes and the magnitude of differential expression is much lower for the technical and biological replicates than for the treated samples. This indicates that the RNA amplification and the biological fluctuations during culturing do not contribute substantially to the observed differential expression for PACAP and control treatments. Noteworthy, the distributions of B- and M-values are very similar for all treated samples, implicating that the magnitude of gene expression changes are similar for all treatments. The correlation between the replicated comparisons in Groups 2 and 3 was investigated by visualising the M-values from a comparison in Group 2 against the corresponding M-values in Group 3 (Figure 3). This shows that there is a high correlation between the replicates, with correlation coefficients ranging from 0.85 to 0.88. For the A-values (intensity values) the correlations are even higher ranging from 0.996 to 0.997 (data not shown). The M-value correlations between the contrasts were analysed in a similar fashion, but yielded much lower correlation coefficients (0.10 for PACAP treatment vs. proliferation control, 0.33 for differentiation control vs. proliferation control and 0.56 for differentiation control vs. PACAP treatment, data not shown). These low correlations are a consequence of the small differences between the contrasts as shown in the M-value distributions for the different comparisons (figure 4)
Figure 2
B-value distributions for each comparison. The x-axis shows the M-value (log2(Cy5/Cy3)) for each gene and the y-axis the corresponding B-value (calculated by an empirical Bayes moderated t-test).The B-value scores the genes according to their probability of differential expression. Higher B-value means higher probability of differential expression. Dotted lines are drawn at M-values 0.6 and -0.6, i.e. at a 1.5-fold difference in signal intensity between the compared samples, and at B = 9.3, corresponding to a Holm adjusted p-value of 0.0001. These values correspond to the thresholds set for differential expression in this study.
Figure 3
Graphs displaying the correlation between replicated samples. The x-axis and y-axis display M-values (log2(Cy5/Cy3)) for replicated samples. The values of the Pearson correlation coefficient (r) and the coefficient of determination (R2) are also included.
Figure 4
Box plots displaying the M-value (log2(Cy5/Cy3)) distribution for each comparison.
To further explore the overlap between DE genes in the replicated treatments Venn diagrams shown in Figure 5 were created. The comparisons include genes with a Holm adjusted p-value < 0.0001 and an M-value > +/- 0.6 (corresponding to a fold change > 1.5). Also included in the comparison are 29 DE genes (corresponding to 40 redundant probes on array) identified in the NS a-NS b comparison that could be considered as technical noise. Figure 5 shows that the overlap between the biological replicates is high. For all three treatments the majority of the genes (60–70%) are differentially expressed in both replicates; 814 genes (1109 probes) for the NS vs. PACAP treatment, 741 genes (986 probes) for the NS vs. proliferation control and 604 genes (797 probes) for the NS vs. differentiation control. Also, a large proportion of the non-overlapping genes showed a similar M-value in the two groups, indicating that the true overlap is even higher. The genes shared by both replicates for a certain treatment were further compared between the different treatments to visualise the effects of the different treatments on gene expression levels. Surprisingly, the great majority of the genes (435, 579 probes) fall within the overlap of all treatments, further suggesting that the different stimuli results in very similar effects on gene expression level.
Figure 5
Correlation between biological replicates and between treatments. The number of differentially expressed genes in each comparison are presented and compared. Genes with an M-value > | 0.6 | and a p-value < 0.0001, calculated by empirical Bayes moderated t-test and a Holm's adjustment for multiple testing are included. Figures without parentheses show the number of probes included. Figures within parentheses show the number of corresponding genes. NS = neurosphere control, PACAP = pituitary adenylate cyclase-activating polypeptide treated samples, Prol Cont = proliferation control (transmembrane receptor agonist treated samples), Diff Cont = differentiation control (fetal calf serum treated, and solid support plated samples).
A likely explanation is that the removal of growth factor (EGF) from the neurosphere culture medium, coinciding with the treatment initiation, masks the effect on gene expression changes caused by the different stimuli. To further investigate whether the remaining unique DE genes were true differences or related to the EGF effect we performed additional analysis of the 213 (265 probes), 135 (168 probes) and 79 (92 probes), non-overlapping genes. A comparison was made by taking the lists of the unique genes for one treatment and visualising their corresponding M-values in the other treatments (Figure 6). The results depicted in Figure 6 demonstrate that the majority of genes are clustered around the threshold values of the criteria for differential expression, either with an M-value just above or below 0.6 or with a p-value greater than 0.0001. Thus, the majority of unique (non-overlapping) genes are borderline cases, nearly included in the category of overlapping genes. Genes that would have been truly unique to the treatment in question would have had M-values in both replicates that were centred round zero. No such genes can be found when comparing the results from NS vs. PACAP treatment and NS vs. proliferation control. A few genes can be found when comparing NS vs. differentiation control to either one of the other two treatments, indicating that the serum treatment gives a somewhat more different gene expression profile, as expected. To facilitate further analysis of the results, annotated gene lists corresponding to the genes that were considered differentially expressed only in the NS vs. Diff Cont comparison, or shared between NS vs. PACAP and NS vs. Prol Cont comparisons are provided [see additional file 1 and additional file 2, respectively]. The complete results for all comparisons are available in ArrayExpress using experiment accession number E-MEXP-322.
Figure 6
M-value analysis of non-overlaping genes from the treatment comparisons. M-values (log2(Cy5/Cy3)) for genes identified as DE in either only neurosphere vs. PACAP treatment (A), neurosphere vs. proliferation control (B) or neurosphere vs. differentation control (C) are shown using data from the other two treatment comparisons. Colouring indicates if a gene reaches the statistical significance required for differential expression (black, both replicates have p-values < 0.0001; red, p-value in replicate a > 0.0001: green, p-value in replicate b > 0.0001). Dotted lines depict the cut-off values for scoring differential expression (M-value > | 0.6 |).
These findings indicate that the differentially expressed genes in the different treatments are due to the withdrawal of EGF rather than to the treatment itself. A list of the 435 genes identified as the EGF treatment/withdrawal genes (see Figure 5) is provided [see additional file 3]. A short version of the list, with the top 40 genes, is shown in Table 1. The genes were further grouped and ranked according to their Gene Ontology annotation. We focused on the 'Biological Processes' branch of the GO theme structure and analysed the functional categories represented in the data. In total, 241 genes received a functional annotation. The results of the analysis, which was carried out at the detail level 3 (intermediate level that gives a general overview of the data), are provided [see additional file 4]. The themes with the highest representation were 'cell growth and/or maintenance' (118 genes), 'nucleobase, nucleoside, nucleotide and nucleic acid metabolism' (65), 'protein metabolism' (46), 'signal transduction' (29), 'catabolism' (20) and 'organogenesis' (20). In general, the list of expressed functional categories is enriched for various metabolism-related themes, but further down also contains themes such as 'cell adhesion', 'cell death', 'response to external stimulus' and 'cell-cell signalling' Next we analysed the overrepresentation of functional categories by using the genes represented on the array as background for the significance calculations. The top GO terms of the biological process class overrepresented in the data are shown in Table 2. A large proportion of these GO themes are related to the cell cycle and/or DNA replication, as expected for EGF-related effects. Many of the transcripts found and classified within "mitotic cell cycle" and "DNA replication and chromosome cycle" are down-regulated in the treated samples lacking EGF. In contrast many genes within "neurogenesis", "organogenesis" and "development" are up-regulated in these samples, probably reflecting the removal of inhibitory regulation on differentiation as EGF is withdrawn from the culture medium.
Table 1 Top differentially expressed genes in the overlap between all treatments.
Table 2 Gene ontology analysis.