Data from 41 SCZ cases and 48 controls were available from the parent study, Shared Roots of Neuropsychiatric Disorders and Cardiovascular Disease Project (SR; HREC no. N13/08/115). The overarching aim of the parent study was to interrogate signatures (i.e., genomic, neural, cellular, and environmental) common to neuropsychiatric disorders and cardiovascular disease risk that could contribute to co-morbidity, symptom severity, and treatment outcomes.
Control participants were recruited through advertisements (i.e. print, radio and web), active recruitment within communities by a registered nurse, and word-of-mouth. Cases consisted of first-episode SCZ (FES total n = 24; recently diagnosed n = 8; within 5 years of diagnosis and treatment n = 16) and chronic SCZ cases (long-term and continuous presence of SCZ symptoms and/or treatment, n = 17). The FES cases were recruited from general and psychiatric hospitals and community health centres within the study catchment area following their first psychotic episode. The chronic schizophrenia cases were contacted and invited to participate in SR. Cases comprised individuals with a diagnosis of SCZ, schizophreniform, or schizoaffective disorder based on the Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders, fourth edition (SCID-IV) [34], without intellectual disability. Symptoms and severity of SCZ were assessed using the Structured Clinical Interview for DSM-IV (SCID) and Positive and Negative Syndrome Scale (PANSS) [35].
The Mini International Neuropsychiatric Interview (MINI) version 6 [36] was used to exclude participants with other major psychiatric disorders or substance abuse. Additional exclusion criteria included antibiotic use four weeks prior to stool sampling, intellectual disability, severe physical illness, any neurological disorder, and FES cases treated with a long-acting depot antipsychotic medication. Childhood trauma was assessed using the Childhood Trauma Questionnaire (CTQ) [37] and total scores were calculated by adding scores from physical abuse, sexual abuse, emotional abuse, emotional neglect, and physical neglect. Participants with a total score of > 41 (representing the lowest level score indicative of neglect or abuse for each subscale included) were classified as having childhood trauma, as previously used by our research group [38].
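The childhood trauma classification described above can be illustrated with a short sketch. This is not the study's scoring code: the function names and dictionary keys are hypothetical, and the five subscale scores are assumed to have been computed already according to the CTQ.

```python
# Hypothetical sketch of the CTQ total-score rule described in the text.
CTQ_SUBSCALES = (
    "physical_abuse",
    "sexual_abuse",
    "emotional_abuse",
    "emotional_neglect",
    "physical_neglect",
)
TRAUMA_CUTOFF = 41  # totals strictly greater than 41 are classified as childhood trauma


def ctq_total(subscale_scores):
    """Sum the five CTQ subscale scores; raise if any subscale is missing."""
    missing = [name for name in CTQ_SUBSCALES if name not in subscale_scores]
    if missing:
        raise ValueError(f"missing CTQ subscales: {missing}")
    return sum(subscale_scores[name] for name in CTQ_SUBSCALES)


def has_childhood_trauma(subscale_scores, cutoff=TRAUMA_CUTOFF):
    """Apply the total score > cutoff rule."""
    return ctq_total(subscale_scores) > cutoff
```

Under this rule a participant with a total of exactly 41 is not classified as having childhood trauma, because the comparison is strict.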
The WHO STEPwise approach to Surveillance (STEPS) instrument [39] was used to examine body mass index (BMI). The medical history questionnaire determined smoking and alcohol habits. Participants were considered current smokers or current alcohol consumers if they smoked or used alcohol in the past 6 months. The harmonised Joint Interim Statement (JIS) criteria were used to assess metabolic syndrome (MetS) in participants [40].
Sample collection and microbial DNA extraction
The collection of stool samples and subsequent DNA extraction were performed as previously described [41]. Briefly, stool samples were self-collected in pre-analytical sample processing (PSP) collection tubes (Stratec Molecular, Birkenfeld, Germany) and stored at −20 °C in the Neuropsychiatric Genetics Laboratory (Department of Psychiatry, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town) prior to microbial DNA extraction.
Microbial DNA extraction was done using the PSP Spin Stool DNA Plus kit (Stratec Molecular, Birkenfeld, Germany). For each batch of microbial DNA extraction performed, a negative control (non-template buffer to control for large-scale cross-contamination) and a positive control (ZymoBIOMICS microbial mock community standards of known composition to assess the accuracy of results, Zymo Research, Cat # D6300) were added to the microbial DNA extraction run. Quantity and quality of the microbial DNA were determined using an ultraviolet–visible (UV–Vis) NanoDrop 2000 spectrophotometer (Thermo Scientific, USA) and the Qubit 4 Fluorometer (Invitrogen, ThermoFisher Scientific, Massachusetts, USA). The fourth hypervariable region (V4) of the 16S rRNA gene was amplified using the following primer pair [42]:
515 F (5’ TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGTGYCAGCMGCCGCGGTAA)
806 R (5’ GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGGACTACNVGGGTWTCTAAT).
The library preparation and sequencing were done at the Centre for Proteomic and Genomic Research (CPGR, Cape Town, South Africa). Illumina sequencing adapters and the dual-index barcodes were attached using the Nextera XT v2 Index Kit (Illumina Inc., San Diego, CA, USA). Sequencing libraries were pooled and diluted to 5 pM. The libraries were then sequenced with 250 bp paired-end reads on the Illumina MiSeq sequencing instrument using a MiSeq Reagent v2 Kit. The expected fragment size of the V4 amplicon is approximately 250–300 bp. Samples (depth = 60 000 reads) were converted to FASTQ files (forward, reverse and index) after sequencing using the BCL-to-FASTQ file converter bcl2fastq (ver. 2.17.1.14, Illumina, Inc.).
Data preparation and taxa classification
The data preparation, quality control and taxa classification were done as previously described [41] in R Studio [43] using the DADA2 pipeline (https://github.com/benjjneb/dada2) [44]. Quality control of the sequences consisted of assessing read quality profiles, filtering and trimming while maintaining overlap between the forward and reverse sequences. Error rates were estimated with a parametric error model, after which the sample inference algorithm was applied to the data to reduce redundancy and determine the number of unique sequences. Reads were merged with at least 12 bases in the overlap region [45]. After generating the amplicon sequence variant (ASV) table and removing chimeras, a naïve Bayesian classifier method [44] was implemented to taxonomically classify the ASVs, using the Ribosomal Database Project (RDP) as a reference database [46]. The ASV table consisted of 3,628,011 reads, 12,204 taxa, and sparsity of 94.3%. Taxa observed in fewer than 15% of samples were eliminated from the ASV table as per a previous study [41].
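The sparsity figure and the 15% prevalence filter can be illustrated with a minimal sketch. It is written in Python for illustration only (the study's pipeline ran in R), the function names are hypothetical, and it assumes that "observed" means a count greater than zero and that the table is a dense mapping of sample to taxon counts.

```python
# Illustrative only: the study's filtering was done in R; names here are hypothetical.
def sparsity(table):
    """Fraction of zero cells in a dense {sample: {taxon: count}} table."""
    cells = [count for row in table.values() for count in row.values()]
    return sum(1 for count in cells if count == 0) / len(cells)


def filter_by_prevalence(table, min_prevalence=0.15):
    """Drop taxa observed (count > 0) in fewer than min_prevalence of samples."""
    n_samples = len(table)
    taxa = {taxon for row in table.values() for taxon in row}
    keep = set()
    for taxon in taxa:
        present = sum(1 for row in table.values() if row.get(taxon, 0) > 0)
        if present / n_samples >= min_prevalence:
            keep.add(taxon)
    # Rebuild every sample row with only the retained taxa.
    return {
        sample: {taxon: count for taxon, count in row.items() if taxon in keep}
        for sample, row in table.items()
    }
```

With ten samples, for example, a taxon present in one sample (10%) would be removed, whereas a taxon present in two samples (20%) would be retained.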
Statistical analysis of metadata
Statistical differences in metadata variables between cases and controls were assessed prior to diversity analysis. Categorical metadata included sex, MetS, childhood trauma, current alcohol consumption, and current smoker status. To assess the difference between cases and controls for categorical metadata variables, the Chi-square (χ²) test was used. The results were expressed in the form of numbers (N) and percentages (%). According to the Shapiro–Wilk normality test, age and BMI were not normally distributed. Therefore, the Mann–Whitney U test [47] was used to assess differences between cases and controls, with the results expressed as medians and interquartile ranges (IQR). Significance was defined as p < 0.05.
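For a binary variable compared between cases and controls, the χ² test reduces to a 2 × 2 contingency table. The sketch below is illustrative and makes an assumption the text does not confirm: it computes the Pearson statistic without a continuity correction. The function name is hypothetical. With one degree of freedom, the p-value can be obtained from the complementary error function.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows could be cases and controls, columns the two levels of a binary variable.
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

Statistical packages often apply a continuity correction to 2 × 2 tables by default, so values from this sketch may differ slightly from packaged output.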
Diversity data analysis
Alpha- and beta-diversity analyses were conducted using the QIIME2 q2-diversity plugin [48], and differential taxa abundance testing was done using Analysis of Compositions of Microbiomes with Bias Correction (ANCOM-BC) [49] in QIIME2, using the q2-composition plugin [48]. P-values were adjusted for multiple testing according to Benjamini-Hochberg’s procedure [50] and represented as q-values. The statistical significance level was set at α = 0.05 for all tests.
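The Benjamini-Hochberg adjustment can be sketched in a few lines. This is a generic illustration of the procedure, not the QIIME2 implementation, and the function name is hypothetical. Each p-value is scaled by m / rank (m tests, ranks in ascending order of p), and monotonicity is enforced from the largest p-value downwards.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum
    # of its own scaled value and all scaled values at higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values
```

A test is then called significant when its q-value falls below the chosen threshold, here 0.05.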
Alpha-diversity analysis
Alpha-diversity measures were calculated to assess the microbial diversity within individual samples in groups. The Shannon and Simpson diversity estimators [51] were used to estimate species richness and evenness, with Shannon being more sensitive to species richness whereas Simpson is more sensitive to species evenness [52].
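Both estimators are functions of the relative abundances within a single sample. The sketch below is illustrative rather than the q2-diversity code: the function names are hypothetical, the natural logarithm is an assumed default (some tools report Shannon in base 2), and Simpson is taken here in its 1 − Σp² form, which is one of several conventions.

```python
import math


def shannon_index(counts, base=math.e):
    """Shannon diversity H = -sum(p * log(p)) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum(
        (count / total) * math.log(count / total, base)
        for count in counts
        if count > 0
    )


def simpson_index(counts):
    """Simpson diversity in the 1 - sum(p^2) form (an assumed convention)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return 1.0 - sum((count / total) ** 2 for count in counts)
```

For a sample with four equally abundant taxa, Shannon equals ln 4 and Simpson equals 0.75; for a sample dominated by a single taxon, both approach zero.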
Covariate selection for beta-diversity
The Multivariate Association with Linear Models 2 (MaAsLin2) package [53] in R Studio [43] was used to determine the association between metadata variables and microbial community abundance. All the potential covariates were assessed through the multivariate MaAsLin2 function, with “Control” as the reference level for case–control status. Metadata variables were included as covariates in subsequent analyses of beta-diversity if a statistically significant result for MaAsLin2 was observed (q < 0.05).
Principal Coordinates Analysis (PCoA)
A principal coordinates analysis (PCoA) was used to assess ordination and visualize the variance and dissimilarity of taxa composition between samples based on Euclidean distances [54]. The first three principal coordinates were utilized, as they represent the largest eigenvalues for a three-dimensional PCoA plot [54]. The ordination plot was used to examine any potential clusters based on case–control status and covariates.
Permutational multivariate analysis of variance (PERMANOVA) tests
The permutational multivariate analysis of variance (PERMANOVA) adonis test [55] was performed to determine if there were differences between microbial community samples (permutations = 999; α = 0.05). Pairwise comparisons between cases and controls were conducted to determine specific associations.
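The logic of a one-way PERMANOVA can be sketched as follows. This is a simplified illustration, not the adonis implementation: it handles a single grouping factor without covariates, assumes Euclidean distances (the distance used for the PERMANOVA is not stated and is taken here from the PCoA description), and uses hypothetical function names and a fixed random seed. The pseudo-F statistic is computed from the distance matrix, and group labels are then shuffled to build its null distribution.

```python
import math
import random


def euclidean_distances(samples):
    """Pairwise Euclidean distance matrix for a list of abundance vectors."""
    return [[math.dist(x, y) for y in samples] for x in samples]


def pseudo_f(dist, labels):
    """One-way PERMANOVA pseudo-F from a distance matrix and group labels."""
    n = len(labels)
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    a = len(groups)
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, each group scaled by its own size.
    ss_within = 0.0
    for members in groups.values():
        ss_within += sum(
            dist[i][j] ** 2 for k, i in enumerate(members) for j in members[k + 1:]
        ) / len(members)
    ss_between = ss_total - ss_within
    return (ss_between / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, permutations=999, seed=0):
    """Return (pseudo-F, permutation p-value) for a one-way design."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    # The observed labelling counts as one permutation.
    return observed, (hits + 1) / (permutations + 1)
```

With 999 permutations, the smallest attainable p-value is 0.001, because the observed labelling is counted among the permutations.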
Differential abundance analysis
Taxa associated with case–control status were assessed with ANCOM-BC [49], using control status as the reference. Microbial taxa were considered significantly differentially abundant at q < 0.05. The associated log-fold change was calculated during the ANCOM-BC analysis [49].