Integrative single-cell and bulk transcriptome analyses identify a distinct pro-tumor macrophage signature that has a major prognostic impact on glioblastomas

GBM single-cell atlas

Following the application of rigorous quality control criteria, we isolated 93,811 cells. For PCA, we selected 3,000 genes with high variability, and then proceeded to perform UMAP analysis leveraging the top 30 principal components (PCs) in the GSE182109 dataset. This analysis resulted in the categorization of all cells into 43 distinct clusters (Fig. 1B). Utilizing marker gene expression for reference, we annotated cell types and merged analogous cell subpopulations, ultimately identifying eight unique cell types. These comprised 17 clusters of glioma cells, 3 clusters of oligodendrocytes, 14 clusters of myeloid cells, 4 clusters of T cells, and single clusters of endothelial cells, B cells, pericytes, and miscellaneous cells (Fig. 1C). Figure 1D displays the cell distribution across different patients, highlighting the ubiquity of immune cells and the unique distribution of tumor cells per patient. Marker gene expression for each cell type is presented in Fig. 1E, with Fig. 1F underscoring the elevated expression of signature genes PTPRC, PDGFRA, MBP, PDGFRB, PECAMQ, CD68, CD3E, CD79A, MKI67 in immune cells, myeloid cells, oligodendrocytes, endothelial cells, macrophages, T cells, B cells, and proliferative cells. Across various patients, glioma and myeloid cells were identified as predominant cell types (Fig. 1G, H), whereas the distribution of other cell types remained consistent across individuals, suggesting an absence of batch effects in this study.

Tumor cell heterogeneity in glioma

From an initial dataset of 40,864 glioma cells, reclustering was conducted to segregate these cells into 12 subgroups (Fig. 2A), labeled GM_1 to GM_12. These subgroups demonstrated patient-specific patterns, with 7 clusters representing over 90% of samples, 3 clusters representing over 60%, and 2 clusters representing over 40% (Fig. 2B). To further explore the biological traits of these glioma cell clusters, we utilized FindAllMarkers to pinpoint differential genes for each cluster, which were then visualized with Manhattan plots (Fig. 2C). This analysis of differential genes uncovered significant functional enrichments across the cell clusters, encompassing tumor proliferation, invasion, extracellular matrix remodeling, response to oxidative stress, cell cycle regulation, and immune modulation. Specifically, Fig. 2D illustrates GM_2 enriched in pathways associated with the tumor microenvironment and oxidative stress, such as the response to toxic substances; GM_3 showed predominant enrichment in pathways governing proliferation and apoptosis, notably the insulin-like growth factor receptor signaling pathway; and GM_5 to GM_12 displayed enrichment in pathways related to signal transduction activation, immune regulation, and cell cycle governance. The CytoTRACE score highlighted a notably higher stemness score for the GM_6 group (Fig. 2E). Furthermore, Hallmarker function enrichment analysis (Fig. 2F) showed consistent patterns, with GM_6 notably enriched in pathways including angiogenesis, the Notch pathway, the TGF-beta pathway, and the Wnt/beta-catenin pathway.

Fig. 2figure 2

Glioblastoma Cell Subpopulations. A. UMAP analysis delineates GBM cells into 12 distinct subpopulations, identified as GM_1 to GM_12, with each exhibiting unique patient-derived characteristics. B. The bar graphs detail the proportional representation of various glioblastoma cell subpopulations across different patient samples. C. Manhattan plots elucidate the variances in gene expression profiles among the glioblastoma tumor cell subpopulations. D. Heatmaps highlight the differential gene expression across glioblastoma cell clusters and their associated functional enrichment within the GO (Gene Ontology), KEGG (Kyoto Encyclopedia of Genes and Genomes), and WikiPathways databases. E. Violin plots depict cytoTRACE scores to evaluate the stemness attributes of the subpopulations. F. Heatmaps illustrate the enrichment scores of hallmark gene sets across GBM cell clusters 1–12, providing insights into their molecular signatures

S100A9 macrophages promote tumor development through angiogenesis via interactions with endothelial cells

From an analysis of 38,784 macrophages, we performed further dimensionality reduction and clustering to delineate 7 distinct cell clusters (Fig. 3A), identifying 4 clusters as microglia and 3 as macrophages based on unique gene expression profiles (Fig. 3B) (Supplementary Table S2). The expression patterns of key marker genes such as CD163, P2RY12, MIF, VEGFA, CCL3, HLA-DRB1, and MKI67 were meticulously charted (Figs. 3C–D). Functional enrichment analysis (Fig. 3E) and volcano plots (Fig. 3F) pinpointed the Ma_1 cluster’s enrichment in pathways pertinent to hypoxia and epithelial-mesenchymal transition, categorizing this subgroup as S100A9 Ma, linked to tumor advancement. The Ma_2 cluster was found to be enriched in immune-related pathways, like interleukin γ and the complement system, leading to its classification as TXNIP Ma, associated with tumor resistance. The functional profile of Ma_3, accentuating cell proliferation pathways, defined this cluster as TOP2A Ma, related to cellular proliferation. GSEA corroborated our delineation of Ma_2 (Fig. 3G). Through pseudotime analysis, differentiation trajectories among subgroups were constructed, unearthing functions aligned with differentiation trends. Predominantly, TOP2A Ma was situated at the state and the onset of the trajectory, whereas S100A9 Ma and TXNIP Ma were located at the trajectory’s conclusion (Fig. 3H). Pathway communication analysis by CellChat revealed S100A9 Ma’s interactions with other cell types, engaging in the EGF pathway with glioma cells and the VEGF pathway with endothelial cells (Fig. 3I). In S100A9 Ma, the upstream transcription factors FOSL2 and SOX2 exhibited notable regulatory activity, modulating downstream matrix metalloproteinases and cell adhesion molecules, indicators of tumorigenic activity (Fig. 3J). Further validation of S100A9 Ma’ pro-tumor traits in bulk data, using single-sample gene set enrichment analysis (ssGSEA) in TCGA-GBM to calculate S100A9 Ma scores, showed that patients with high infiltration scores had significantly poorer prognoses than those with low scores (Fig. 3K).

Fig. 3figure 3

Characterization and Functional Assessment of Macrophage Clusters. A. UMAP delineates seven distinct myeloid cell clusters. B. UMAP characterizes four microglia clusters (Mi_1-4) alongside three macrophage clusters (Ma_1-3). C. Bubble charts detail the signature genes of macrophages, with the diameter indicating the number of cells expressing said gene and color reflecting mean expression levels. D. UMAP showcases the expression profiles of hallmark genes within myeloid populations. E. Bar graphs delineate the functional enrichment analyses for each cell cluster, employing varying colors to differentiate clusters and bar heights to signify generative processes. F. Volcano plots highlight the genes differentially expressed in S100A9 Ma, TXNP Ma, and TOP2A Ma, with color coding representing the logarithmic fold change (logFC). G. Gene Set Enrichment Analysis (GSEA) identifies pathways preferentially enriched in S100A9 Ma. H. Monocle trajectory plots for macrophages are sequentially colored by pseudotime, cellular state, and cell type. I. Heatmaps depict the interactive landscape of S100A9 macrophages with diverse cell types as analyzed through CellChat. J. Dot plots rank the top five transcription factors pivotal within the S100A9 macrophage subsets. K. Kaplan–Meier analysis underscores the prognostic disparities among patients distinguished by elevated versus diminished S100A9 macrophage infiltration levels

Construction and validation of a pro-tumor macrophage-associated signature (PTM)

To construct the model, differentially expressed genes from were identified the GSE108474 cohort, which comprises both normal and tumor samples. A univariate Cox regression model was utilized to sift through genes linked with prognosis (Fig. 4A). The genes of prognostic significance were then cross-referenced with those associated with S100A9 Ma, leading to the development of a prognostic risk model through LASSO-COX regression analysis. This model ultimately featured eight genes: NSUN5, ANKH, TSPAN4, GOS2, NAGLU, CCT6A, NCOA4, and WDR46. The formula for calculating the PTM score was: NSUN5 * 0.389 + ANKH * 0.310 + TSPAN4 * 0.235 + GOS2 * 0.192 + NAGLU * 0.179 − CCT6A * 0.159 − NCOA4 * 0.280 − WDR46 * 0.320 (Fig. 4B). Leveraging the TCGA dataset as a training cohort, we stratified patients into high- and low-risk groups. Kaplan–Meier analysis showed that the high-risk group exhibited significantly reduced overall survival compared to the lower-risk group (Fig. 4C). Time ROC curve analysis for 1, 3, and 5-year survival predictions in the training set yielded the area under the curves (AUCs) of 0.766, 0.826, and 0.886, respectively. This trend was consistent in the validation cohorts, where higher-scoring patients faced poorer prognoses (GSE108474 p = 0.0031, GSE13041 p = 0.00025, GSE16011 p = 0.016) (Fig. 4E–G). Univariate Cox analysis affirmed that the PTM score serves as an independent risk factor, with elevated risk scores associated with increased mortality risk and reduced survival durations (Fig. 4H).

Fig. 4figure 4

Development and Validation of a pro-tumor macrophage (PTM) prognostic model. A. A flowchart on the left delineates the development of a prognostic model associated with tumor-promoting macrophages in GBM patients, the center details the gene selection methodology utilized during the model’s formulation, and the right illustrates the lasso technique for gene filtration. B. A lollipop chart presents the eight pivotal genes incorporated into the model alongside their respective coefficients. C. Kaplan–Meier plots compare the outcomes of patients classified by high versus low-risk scores within the TCGA cohort. D. Temporal receiver operating characteristic (ROC) curves evaluate the model’s prognostic precision at 1, 3, and 5-year intervals in the TCGA training cohort. E–G. Kaplan–Meier plots validate the prognostic relevance of the PTM score across external cohorts: GSE108474 (E), GSE13041 (F), and GSE16011 (G). H. A forest plot synthesizes the univariate Cox regression outcomes for the PTM score, affirming its prognostic significance across both the foundational training set and subsequent validation cohorts

PTM signature correlation with tumor progression

In the TCGA dataset, differential analysis via limma was applied to separate low- and high-risk GBM patient groups (Fig. 5A). GSEA highlighted that the high-risk group was characterized by enrichment in functions related to tumor progression, including epithelial-mesenchymal transition, hypoxia, the TNFA signaling pathway, inflammatory responses, collagen synthesis, and the IL6-JAK-STAT3 signaling pathway (Fig. 5B). ssGSEA yielded similar results, with the high-score group showing enrichment in pathways linked to tumor progression, immune infiltration, and metabolic processes (Fig. 5C). By leveraging the apear package to amalgamate functions based on pathway similarities (Fig. 5D), it was found that upregulated pathways in the high-score group predominantly centered on immune regulation, antigen presentation, and the extracellular matrix, while downregulated functions concentrated on ribosomal components, organelles, and chromosomal remodeling. Consequently, we used MCPcounter to evaluate changes in immune cell infiltration across high- and low-risk groups (Fig. 5E). Box plots indicated enhanced infiltration of T cells, CD8 + T cells, and monocytes in the high-score group. Despite the observed increase in T cells, factors like T cell exhaustion and a dense matrix composition within the tumor microenvironment negatively impacted patient outcomes. The Estimate algorithm (Fig. 5F) showed that patients in the high-score group had elevated Estimate, immune, and stromal scores, providing a nuanced perspective on the prognosis for patients with elevated PTM scores.

Fig. 5figure 5

Biological significance of the PTM prognosis scoring system. A. A volcano plot highlights genes differentially expressed between low-risk and high-risk GBM patient cohorts within the TCGA database. B. A bar graph depicts hallmark gene set enrichment analysis outcomes, differentiating between high-risk and low-risk categories. C. A heatmap reveals the single-sample gene set enrichment analysis (ssGSEA) results, indicating functional enrichment among TCGA-GBM patient groups. D. A network graph illustrates the associations uncovered through functional enrichment analysis. E. Box plots detail the MCPcounter analysis outcomes, which quantify the extent of immune cell penetration in patient groups categorized by high and low risk. F. Box plots delineate the findings from the ESTIMATE algorithm, comparing immune score, stroma score, and tumor purity between high-risk and low-risk patient groups

Clinical value of the PTM signature

GBM encompasses a comprehensive spectrum of genomic features that have been leveraged clinically to decode molecular signatures, unravel disease mechanisms, and forecast patient prognosis [19]. In this light, we delved into the potential implications of the PTM score from a genomic standpoint. Figure 6A displays the gene mutations and copy number variations (CNVs) distinguishing the high and low-score groups. Predominantly, the high-score group manifested a higher prevalence of gene mutations (including TP53, PTEN, EGFR, MUC16, NF1, SPTA1, RB1, RYR2) and CNVs, such as amplifications in 1p36.21, 1q32.1, 1q44, and deletions in 1p32.3, 2q22.1, 2q37.1, more frequently than the low-score group. These genomic changes correlate with malignant tumor traits like increased proliferation, resistance to treatment, and evasion of apoptosis [20, 21]. Although Fig. 6B revealed no significant disparity in the mean mutation count between the groups, a closer examination uncovers a consistent genomic profile in the high-score group. Importantly, a higher mutation frequency in the TERT promoter region and reduced methylation levels in the MGMT promoter area were observed in the high-score group, indicating these genomic modifications could underpin the adverse prognosis associated with higher PTM scores (Fig. 6C). Intriguingly, alterations in chromosomes 19/20 were less common in the high-score group, and these genomic features might partly explain the favorable prognosis in the low-risk score group [22]. The PTM score aligns with prior typing, with the MES subtype making up 56% of the high-score patients, while 53% of the low-score group fell into the CL category (Fig. 6D). Additionally, the PTM score was shown to have prognostic significance in determining treatment outcomes. Utilizing data from the prism and CTRP databases, the low-score group exhibited lower AUC values, signifying a heightened sensitivity to temozolomide therapy.

Fig. 6figure 6

Clinical features and omics features correlated with the PTM score. A. Sequentially presented are the tumor mutational burden (TMB), the contributory extent of five specific mutational signatures, the leading 20 mutated genes, and copy number variations (CNAs). The distribution of high- and low-scoring cohorts relative to each genomic alteration is depicted on the right via bar charts. B. An evaluative comparison of mutation frequencies distinguishes between high- and low-scoring groups. C. Bar graphs depicting percentages delineate differences in TERT promoter mutations, TERT expression, chr19/20 amplifications, and MGMT promoter methylation across groups stratified by high and low PTM scores. D. Sankey diagram demonstrates the correlation between PTM scores and transcriptomic subtypes, accompanied by a depiction of the composition variance in transcriptomic subtypes between high and low-scoring groups on the right. E. Box plots reveal the area under the curve (AUC) for temozolomide efficacy within groups categorized by high versus low PTM scores. Significance indicated by *p < 0.05, **p < 0.01

Comments (0)

No login
gif