Association of a Murine Leukaemia Stem Cell Gene Signature Based on Nucleostemin Promoter Activity with Prognosis of Acute Myeloid Leukaemia in Patients

Acute myeloid leukaemia (AML) is a heterogeneous neoplastic disorder in which a subset of cells function as leukaemia-initiating cells (LICs). In this study, we prospectively evaluated the leukaemia-initiating capacity of AML cells fractionated according to the expression of a nucleolar GTP binding protein, nucleostemin (NS). To monitor NS expression in living AML cells, we generated a mouse AML model in which green fluorescent protein (GFP) is expressed under the control of a region of the NS promoter (NS-GFP). In AML cells, NS-GFP levels were correlated with endogenous NS mRNA. AML cells with the highest expression of NS-GFP were very immature blast-like cells, efficiently formed leukaemia colonies in vitro, and exhibited the highest leukaemia-initiating capacity in vivo. Gene expression profiling analysis revealed that cell cycle regulators and nucleotide metabolism-related genes were highly enriched in a gene set associated with leukaemia-initiating capacity that we termed the 'leukaemia stem cell gene signature'. This gene signature stratified human AML patients into distinct clusters that reflected prognosis, demonstrating that the mouse leukaemia stem cell gene signature is significantly associated with the malignant properties of human AML. Further analyses of gene regulation in leukaemia stem cells could provide novel insights into diagnostic and therapeutic approaches to AML.


Introduction
Acute myeloid leukaemia (AML) is a neoplastic disorder characterized by substantial cellular heterogeneity. Only a rare subset of cells is assumed to have the ability to self-renew and to initiate and sustain the disease [1]. Understanding the functional regulatory machinery of these leukaemia-initiating cells (LICs) is critically important for designing an efficient therapeutic approach, but these cells have not yet been characterized in detail.
Nucleostemin (NS, also known as GNL3) encodes a GTP-binding protein that is mainly located in rRNA-free sites of the nucleolus [2]. NS contributes to ribosomal biogenesis by interacting with nucleolar proteins involved in pre-rRNA processing [3]. Moreover, NS is essential for the maintenance of nucleolar architecture and the integrity of nucleolar RNA-protein complexes [4]. NS also interacts with telomeric repeat-binding factor 1 (TRF1) to facilitate its degradation, which enhances the maintenance of telomere length [5]. NS and GNL3L form a complex with the telomerase catalytic subunit, human telomerase reverse transcriptase (hTERT) [6]. NS was originally discovered during research on genes that are highly expressed in neural stem cells, embryonic stem (ES) cells, and developing organs during embryogenesis [7].
NS functions as a reprogramming factor for induced pluripotent stem (iPS) cells [8], indicating that NS is essential for the maintenance of the undifferentiated status of ES/iPS cells. In neural stem cells, NS contributes to genomic stability [9], and NS deficiency in neural stem cells causes replication-dependent DNA damage that is associated with defective self-renewal. In haematopoietic stem cells, NS knockdown impairs the long-term reconstitution capacity and induces apoptosis as a result of accumulated DNA damage [10].
We previously generated transgenic mice in which green fluorescent protein (GFP) is expressed under the control of a particular region of the NS promoter (NS-GFP). Using this system, we successfully identified the stem cell population among neonatal testicular cells, liver cells, and foetal brain tissues [11,12,13] as well as tumour-initiating cells, conceptually termed 'cancer stem cells,' in brain tumours and germ cell tumours [13,14]. Consistent with our studies, NS-enriched mammary tumour cells are highly tumourigenic [15]. Although the mechanisms are still unclear, there are particular programs that control NS expression and functions that are also commonly involved in maintaining stem cell properties in normal and malignant tissues. Recently, a study of newly diagnosed AML patients reported that high expression of NS is closely correlated with the percentage of immature blast cells and with haematopoietic stem/progenitor cell surface markers, and that high NS transcript levels are found in AML patients with poor prognosis [16]; this suggests that NS expression may be an indicator of immature properties of AML cells. In this study, we prospectively evaluated the leukaemia-initiating capacity of subpopulations of AML cells with varying NS expression by using NS-GFP transgenic mice. We found that a leukaemia stem cell gene signature containing genes differentially expressed in cells with high and low NS promoter activity could also be used to stratify human AML patients into distinct clusters that reflected prognosis.

Generation of an AML model
NS-GFP transgenic mice were generated as described previously [11]. All procedures were performed in accordance with the animal care guidelines of Kanazawa University, Kanazawa, Japan. A murine AML model was generated as previously reported [17].

Leukaemia-initiating capacity in vitro and in vivo
The leukaemia-initiating capacity of fractionated NS-GFP cells was evaluated with a colony formation assay (MethoCult GF M3434, Stem Cell Technologies Inc., Vancouver, Canada) and by transplantation into lethally irradiated recipient mice, as previously described [18].

Preprocessing of microarray data
Gene expression profiling of mouse AML cells was performed as described in the Supplementary Methods. Briefly, total RNA was isolated from (a) c-Kit- NS-GFP low, (b) c-Kit- NS-GFP middle, (c) c-Kit+ NS-GFP middle, and (d) c-Kit+ NS-GFP high cells, followed by cDNA synthesis. cDNA microarray analysis was performed with a GeneChip Mouse 430 2.0 (Affymetrix Inc., High Wycombe, UK). The complete microarray data set is available at the Gene Expression Omnibus (www.ncbi.nlm.nih.gov/geo/, accession number GSE58032). Unsupervised hierarchical clustering was performed on the normalized and log-transformed data of 17,419 probes. Similarity was measured with the Euclidean distance metric, and average linkage was used to define the linking distance between clusters. To select probes whose expression changed progressively among the 4 subpopulations, we extracted 3,494 probes that showed gradual increases or decreases in expression as AML cell differentiation progressed. Among the 3,494 probes, the 500 probes that exhibited the largest ratios between c-Kit- NS-GFP low and c-Kit+ NS-GFP high cells were extracted. The 500 probes corresponded to 382 mouse gene symbols, forming the 'leukaemia stem cell signature' shown in Supplementary
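The clustering step described in this section (Euclidean distance with average linkage) can be illustrated with a minimal sketch. The function names and the toy expression values below are hypothetical and are not taken from GSE58032; the code uses only the Python standard library and is not the pipeline used in this study. In practice, an established implementation such as `scipy.cluster.hierarchy.linkage` with `method="average"` and `metric="euclidean"` would normally be preferred.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles):
    """Agglomerative clustering with average linkage.

    Returns a list of merges, each as (members_a, members_b, distance),
    where members are sorted indices into `profiles`.
    """
    clusters = {i: [i] for i in range(len(profiles))}
    merges = []
    next_id = len(profiles)
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            # Average linkage: mean of all pairwise distances between members.
            pair_sum = sum(
                euclidean(profiles[i], profiles[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            d = pair_sum / (len(clusters[a]) * len(clusters[b]))
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


# Hypothetical log-transformed values for three probes in four samples.
toy_profiles = [
    [1.0, 8.0, 5.0],  # sample 0
    [1.5, 7.5, 5.0],  # sample 1
    [4.0, 4.0, 5.0],  # sample 2
    [4.5, 3.0, 5.0],  # sample 3
]
for left, right, distance in average_linkage(toy_profiles):
    print(left, right, round(distance, 3))
```

Each printed row is one merge of the dendrogram, from the closest pair of samples up to the final join of the two remaining groups.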

Statistical analyses
For analysis of mouse AML samples, statistical differences among multiple groups were evaluated by a one-way ANOVA followed by Tukey's post-hoc test. Statistical differences in Kaplan-Meier survival curves were determined using the log-rank test. For analysis of human AML data, patient age, initial white blood cell (WBC) count in peripheral blood, and percentage of blasts among mononuclear bone marrow (BM) cells were compared among the clusters with Student's t-test or a one-way ANOVA followed by Tukey's test. Patients' sex, French-American-British (FAB) classification, and molecular risk classification were compared with Fisher's exact test, with Bonferroni correction as a post-hoc test. Overall survival was plotted using the Kaplan-Meier estimator and compared with the log-rank test. The statistical analyses were performed using R software, and two-sided P values below 0.05 were considered significant.
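The survival analysis named above can be sketched as follows. This is an illustrative standard-library implementation of the Kaplan-Meier estimator and the two-group log-rank test, not the R code used in this study; the function names and the example follow-up times are hypothetical. Events are coded 1 for death and 0 for censoring, and the P value uses the fact that a chi-square variable with one degree of freedom has survival function erfc(sqrt(x/2)).

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with >= 1 death."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        leaving = sum(1 for x in times if x == t)  # deaths plus censored at t
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, two-sided P)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    death_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in death_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the number of deaths in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("log-rank variance is zero; groups cannot be compared")
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical follow-up times for two groups, all events observed.
chi2, p = logrank([1, 2, 3, 4, 5], [1] * 5, [10, 11, 12, 13, 14], [1] * 5)
print(round(chi2, 2), p < 0.05)
```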

Fractionation of living AML cells based on NS expression level
We first investigated whether our NS-GFP reporter system could be used to monitor NS mRNA in AML cells in vivo. To do so, we infected c-Kit+ Sca-1+ Lineage- BM mononuclear cells from NS-GFP transgenic mice with a retrovirus expressing both the Hoxa9 and Meis1a genes. This procedure generates a well-characterized AML once the transduced cells are transplanted into lethally irradiated syngeneic hosts [20]. As expected, all recipients of NS-GFP transgenic cells transduced with Hoxa9 and Meis1a developed leukaemia. Morphological analysis showed that the recipient BM was filled with leukaemia cells, including many blast-like cells and some more mature myeloid-like cells such as neutrophils and monocytes (data not shown), which is categorized as AML with maturation under the WHO classification (FAB Classification M2) [20]. We fractionated the approximately 95% of BM cells that were positive for NS-GFP into 3 subpopulations depending on GFP fluorescence intensity: NS-GFP+++ (approximately top 5%), NS-GFP++ (next 45%), and NS-GFP+ (next lower 45%); the remaining 5% were NS-GFP- (Fig. 1A). GFP fluorescence intensity in all 4 fractions closely paralleled the endogenous NS mRNA expression as measured by RT-PCR (Fig. 1B).

Correlation of maturation stages of leukaemia cells with NS-GFP level
To characterize the fractionated AML cells, we analysed several surface markers for haematopoietic differentiation (Fig. 1C). As expected, all 4 fractions expressed the myeloid lineage marker Mac-1 but not markers for the T, B, or erythroid lineages. Sca-1, which is expressed in haematopoietic stem cells but not myeloid progenitor cells, was not expressed in any of the fractionated cells. Most NS-GFP+++ leukaemia cells expressed c-Kit, consistent with the positive correlation between NS and c-KIT [16]. The expression of c-Kit decreased with decreasing GFP intensity, so that the NS-GFP++ and NS-GFP+ cell populations consisted of both c-Kit+ and c-Kit- cells (Fig. 2A), and NS-GFP- cells did not express c-Kit. Therefore, we designated these populations as

Characterization of undifferentiated leukaemia cells
We evaluated the cell cycle status of the fractionated NS-GFP leukaemia cells in vivo by measuring BrdU incorporation and DNA content (Fig. 3A). The proportion of BrdU-positive cells was significantly higher in c-Kit+ NS-GFP high cells than in the other subpopulations (Fig. 3B), indicating that these immature blast-like cells were actively cycling. To characterize the gene signature of undifferentiated leukaemia cells, we selected 3,494 probes whose expression gradually increased or decreased along with the progressive changes among the 4 subpopulations. Among these 3,494 probes, we further selected 500 probes (382 genes) that exhibited the largest ratios between differentiated (c-Kit- NS-GFP low) and undifferentiated (c-Kit+ NS-GFP high) cells (Supplementary Table 1 Table 2). These data are consistent with the dramatic differences in cell cycle status between the cell fractions (Fig. 3B).
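The two-step probe selection described above (a progressive-trend filter followed by ranking on the ratio between the two extreme fractions) can be sketched as below. Several details are assumptions made for illustration only: 'gradual' change is modelled as strict monotonicity across the four fractions, expression values are taken to be log2-transformed so that a ratio becomes a difference, and the probe identifiers and values are hypothetical. The exact criteria used in this study are given in the Supplementary Methods.

```python
def is_monotonic(values):
    """True if values strictly increase or strictly decrease along the series."""
    steps = list(zip(values, values[1:]))
    return all(a < b for a, b in steps) or all(a > b for a, b in steps)


def select_signature(expression, top_n):
    """Select probes with a progressive trend and the largest end-to-end change.

    expression: dict mapping probe id -> list of log2 expression values, ordered
                from the most differentiated fraction (c-Kit- NS-GFP low) to the
                least differentiated fraction (c-Kit+ NS-GFP high).
    top_n:      number of probes to keep (500 in the text above).
    """
    # Step 1: keep probes that change progressively across the fractions.
    graded = {p: v for p, v in expression.items() if is_monotonic(v)}
    # Step 2: rank by absolute log2 ratio between the two extreme fractions.
    ranked = sorted(
        graded,
        key=lambda p: abs(graded[p][-1] - graded[p][0]),
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical probes: two graded, one non-monotonic, one nearly flat.
toy_expression = {
    "probe_up": [1.0, 2.0, 3.0, 6.0],
    "probe_down": [5.0, 4.0, 3.0, 2.5],
    "probe_zigzag": [1.0, 3.0, 2.0, 4.0],
    "probe_flat": [2.0, 2.1, 2.2, 2.3],
}
print(select_signature(toy_expression, top_n=2))
```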

Enrichment of LICs in cells expressing high levels of NS-GFP in vitro and in vivo
To investigate properties of the fractionated cells, we performed in vitro colony-forming assays with the 4 types of fractionated AML cells shown in Fig. 1A (NS-GFP+++, NS-GFP++, NS-GFP+, and NS-GFP-). We found that NS-GFP+++ cells exhibited the highest colony-forming ability among the subpopulations. Both NS-GFP+++ and NS-GFP++ leukaemia cells readily formed colonies, whereas NS-GFP- cells did not (Fig. 3C), indicating that NS expression is correlated with colony-forming ability.
Next, we evaluated leukaemia-initiating capacity in vivo with a transplantation assay. When we transplanted 100 cells of each of the 4 fractionated leukaemia cell populations into separate groups of mice, recipient mice in all groups developed leukaemia (data not shown), indicating that the frequency of LICs was very high. Therefore, to distinguish between the cell populations, we transplanted 10 cells of each. Under this condition, we found that only the NS-GFP+++ cells generated a secondary leukaemia (Fig. 3D). Thus, LICs are enriched in cells with the highest level of NS-GFP.
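The rationale for lowering the cell dose can be illustrated with the single-hit Poisson model commonly used for limiting dilution assays, in which the probability that an inoculum of n cells contains at least one LIC is 1 - exp(-f n) for an LIC frequency f. This model was not fitted in this study, and the frequencies in the sketch below are hypothetical values chosen only to show why a dose of 100 cells may fail to discriminate between fractions that a dose of 10 cells separates.

```python
import math


def engraftment_probability(lic_frequency, cells_injected):
    """Single-hit Poisson model: P(inoculum contains at least one LIC)."""
    return 1.0 - math.exp(-lic_frequency * cells_injected)


# Hypothetical LIC frequencies (1 in 5 cells versus 1 in 50 cells).
for label, frequency in [("enriched fraction", 1 / 5), ("depleted fraction", 1 / 50)]:
    for dose in (100, 10):
        probability = engraftment_probability(frequency, dose)
        print(f"{label}: dose {dose:>3} cells -> P = {probability:.2f}")
```

Under these assumed frequencies both fractions are likely to engraft at the higher dose, whereas the predicted probabilities diverge clearly at the lower dose.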

Association of a murine leukaemia stem cell gene signature with prognosis of human AML patients
Finally, we investigated whether NS is associated with the properties of human AML cells. To do so, we first evaluated NS gene expression in human AML patients by using the public database developed by The Cancer Genome Atlas (TCGA) Research Network [19]. Unexpectedly, we did not find a correlation between human NS expression and FAB classification, molecular risk, or prognosis (Supplementary Fig. 1). We therefore evaluated the relationship between the leukaemia stem cell gene signature shown above and characteristics of AML patients. We selected 318 human genes corresponding to the 500 mouse probes described above (Supplementary Table 1). Using this human gene set, we applied Partitioning Around Medoids (PAM), a k-medoids clustering algorithm, to 179 AML patient samples. Because the average silhouette width was higher for k = 2 or 3 than for k = 4 or 5, we first stratified the AML patients into two clusters (Cluster 1, n = 34; Cluster 2, n = 145) (Fig. 4A). Patients in the two clusters showed significant differences in age, number of WBCs, FAB classification, and molecular risks (Supplementary Table 3A). Cluster 1 consisted of younger patients who had lower WBC numbers than patients in Cluster 2. All patients with M3 FAB classification were included in Cluster 1, whereas all patients whose leukaemia was classified as M0, M6, or M7, which are relatively poor prognostic types, were included in Cluster 2 (Fig. 4B) Table 3A). Consistent with these findings, the overall survival rate was significantly higher for Cluster 1 patients than for Cluster 2 patients (Fig. 4D). When we stratified AML patients into three clusters, we found that one cluster mainly consisted of patients whose leukaemia was classified as M3 and that there were significant differences in prognosis between the clusters (Supplementary Fig. 2
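The choice of the number of clusters by average silhouette width can be sketched as follows. This is a simplified standard-library illustration of PAM (a greedy BUILD phase followed by a SWAP phase) on Euclidean distances, with hypothetical function names and toy two-dimensional points; it is not the R implementation applied to the TCGA data in this study, and the distance measure used there is not assumed here.

```python
import math


def dist(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k):
    """Simplified PAM; returns (medoid indices, cluster label per point)."""
    n = len(points)
    medoids = []
    # BUILD: greedily add the point that gives the lowest total cost.
    while len(medoids) < k:
        candidates = [i for i in range(n) if i not in medoids]
        medoids.append(min(candidates, key=lambda i: total_cost(points, medoids + [i])))
    # SWAP: exchange a medoid with a non-medoid while the cost decreases.
    improved = True
    while improved:
        improved = False
        best_cost = total_cost(points, medoids)
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(points, trial)
                if cost < best_cost - 1e-12:
                    medoids, best_cost, improved = trial, cost, True
    labels = [min(range(k), key=lambda c: dist(p, points[medoids[c]])) for p in points]
    return medoids, labels


def average_silhouette(points, labels):
    """Mean silhouette width; points in singleton clusters contribute 0."""
    n = len(points)
    total = 0.0
    for i in range(n):
        by_cluster = {}
        for j in range(n):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(dist(points[i], points[j]))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other cluster
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n


# Hypothetical samples forming two well-separated groups.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
for k in (2, 3):
    _, toy_labels = pam(toy_points, k)
    print(k, round(average_silhouette(toy_points, toy_labels), 3))
```

In this toy example the average silhouette width is higher for k = 2 than for k = 3, which is the kind of comparison used above to choose the number of patient clusters.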

Discussion
We successfully identified LICs by monitoring NS expression in a mouse AML model. In this system, endogenous NS expression correlated well with the undifferentiated status of leukaemia cells in vivo. In normal haematopoiesis, NS is more highly expressed in myeloid progenitors and haematopoietic stem cells than in differentiated haematopoietic cells [10]. Because a recent study indicated that AML stem cells are immunophenotypically similar to progenitors, including lymphoid-primed multipotent progenitors and granulocyte-macrophage progenitors [21], the higher expression of NS in myeloid progenitor cells may be conserved in leukaemia stem cells. Our data in the AML mouse model suggest that NS plays a critical role in maintaining undifferentiated status in AML cells, as has been shown in ES/iPS cells [8].
There are critical differences in the characteristics of AML between mouse and human. We found that NS gene expression itself was not correlated with the prognostic characteristics of AML in patients in the TCGA database, which was inconsistent with a previously reported result [16]. Although mouse NS was included in the leukaemia stem cell gene signature, NS expression levels were not significantly different among AML patient clusters in our study (data not shown), presumably due to heterogeneity among patients in the regulation of human NS expression. In addition, LICs from patient samples and the experimental mice also exhibited critical differences. For example, although human LICs are rare and cycle slowly [22], c-Kit+ NS-GFP high leukaemia cells were actively cycling in our mouse model. Therefore, the gene signature identified in this experiment may not be directly associated with the endogenous pattern of NS expression. Even though NS itself may not function as a potent driver of malignant status in AML patients, our data suggest that a network of genes associated with NS promoter activity affects the malignant status of human and mouse AML cells.
In clinical investigations, high NS expression is associated with poor prognosis in breast cancer [23], oesophageal cancer [24], and gastric and liver cancer [25]. Therefore, the core network of the leukaemia stem cell signature may be conserved in stem cells in solid tumours. In this study, we identified a gene set associated with AML that included cell cycle and nucleotide metabolism-related genes. These genes may coordinate stem cell properties with response to DNA damage and telomere stability to control self-renewal activity in normal and malignant cells, including solid tumours. Further analysis of NS-related genes may contribute to the understanding of the factors that regulate a variety of cancers.

Conflict of interest
The authors declare no conflict of interest.