Study protocol: the Endoscope CRC cohort, a prospective biobank study on the development and evaluation of diagnostic and prognostic biomarker profiles for colorectal cancer and premalignant lesions

Introduction and rationale

Colorectal neoplasia

Colorectal cancer (CRC) is the third most common cancer in the world.1 Due to the ageing population and global increase in westernised lifestyle, which is associated with higher CRC prevalence, the worldwide burden of CRC is expected to increase.2 The majority of CRC cases occur sporadically while 12%–35% of CRC cases are hereditary.3 Sporadic CRC develops slowly through several stages from precursor lesions through the adenoma-carcinoma and serrated pathways.3 Early detection and resection of these lesions reduces the incidence of CRC and consequently leads to a significant reduction of CRC-related mortality and morbidity.4

CRC screening

Since 2014, Dutch inhabitants between the ages of 55 and 75 years have been screened biennially for CRC using the faecal immunochemical test (FIT), with subsequent colonoscopy if tested positive. Colonoscopy is the gold standard diagnostic procedure for colorectal neoplasia, but it is invasive, burdensome and expensive. In 2021, approximately 65 500 colonoscopies were performed within the Dutch national CRC screening programme.5 CRC was identified in 4.5% of these colonoscopies while in 27%, 33% and 7%, advanced adenomas (AA), non-advanced adenomas (NAA) and sessile serrated lesions (SSL) were found, respectively. In 28% of colonoscopies, no relevant abnormalities (ie, no colorectal neoplasia) were found. CRC incidence and mortality rates have decreased since the implementation of the programme.5 While these numbers clearly provide evidence for the benefit of a CRC screening programme, they also point to its limitations and show that a significant number of patients undergo colonoscopy while no colorectal neoplasia is present. Furthermore, after the identification and treatment of colorectal neoplasia, most patients are advised to undergo surveillance endoscopies according to current guidelines, further increasing the number of colonoscopies in the future, with even lower detection rates of colorectal neoplasia compared with the index CRC screening colonoscopy.6–8 On the other hand, progress still has to be made to reduce CRC-related mortality even further. There are several reasons for only a partial reduction in CRC-related mortality after the implementation of the screening programme, including non-compliance or non-response to the FIT, postcolonoscopy CRC and suboptimal diagnostic accuracy of the FIT.9 10 The pooled sensitivity (SN) and specificity (SP) of FIT for CRC are around 71% and 94%, respectively, resulting in false-negative results and an even higher number of false-positive results.
Furthermore, at an SP of 95%, the SN of FIT is only 25% and 29% for NAA and AA, respectively.11 The high number of false positives increases the number of colonoscopies, which puts growing pressure on gastroenterologists and endoscopic facilities and increases patient burden and healthcare costs. In turn, false negatives result in missed CRC and, in particular, missed relevant precursor lesions, that is, (advanced) adenomas. Furthermore, after (curative) treatment of precursor lesions and CRC in all stages, current guidelines indicate surveillance using endoscopic, radiological and biochemical investigations. After treatment of CRC with curative intent, the percentage of relapse is 28% during follow-up.12 Based on current knowledge, no prognostic markers are available that can identify, with high accuracy and at the moment of diagnosis, patients at high risk of recurrence. Such a prognostic marker could alter primary treatment decisions and could guide the intensity of follow-up in subgroups of patients. This is also true for precursor lesions, for which surveillance colonoscopies are necessary according to current clinical practice. Other screening modalities such as the multitarget stool DNA test, flexible sigmoidoscopy and CT colonography come with similar or other disadvantages, so new non-invasive diagnostic tools are still needed.13

Volatile organic compounds

The analysis of volatile organic compounds (VOCs) in, for instance, exhaled breath has the potential to provide non-invasive biomarkers for colorectal neoplasia that will overcome the limitations in current screening procedures and possibly provide a technical platform for a novel non-invasive, risk-free point-of-care device for colorectal neoplasia diagnosis and monitoring in the future.14 VOCs are gas phase chemicals with a high vapour pressure at room temperature and low molecular weight (<1 kDa) produced by various metabolic processes. VOCs excreted from humans consist of endogenously formed metabolites and exogenous compounds originating from microbial metabolism and environmental effects.15 They are measurable in different matrices including exhaled breath, faeces, blood, urine and saliva.16 Variation in VOC profiles results from environmental factors as well as host-related factors such as disease state, genetics and the gut microbiome and external sources such as diet, lifestyle and medication usage.17 Alterations in host and gut microbiota related to colorectal neoplasia have been shown to be reflected in VOC profiles in several matrices.18 Measuring the overall VOCs excreted, the volatolome, combined with sophisticated bioinformatics tools, will enable the exploration of potential colon-specific VOC fingerprints that will provide a diagnostic window on health and disease. Ultimately, the simple and non-invasive character of faeces/breath sampling has enormous translational potential to establish future applications in large (routine) screening programmes.

However, current studies analysing VOCs for the detection of colorectal neoplasia are heterogeneous in design: they do not consider profiles specific for (advanced) adenomas, SSL or the various CRC stages, do not take into account whether samples were taken before or after bowel cleansing, and rarely consider confounding factors such as lifestyle, medication or diet.18 This is a major limitation in many studies involving VOC analysis. Several other key challenges exist, including the standardisation of breath collection procedures. Study designs should include representative study populations and consider relevant confounding factors before VOCs can be implemented as a clinical tool. Subsequently, VOCs should be thoroughly validated in new, independently sampled target populations.

In 2016, we conducted a multicentre (one university hospital and one tertiary hospital) prospective proof of principle study in an FIT-positive CRC screening population. Here, we identified significantly altered concentrations in several VOCs in exhaled breath in patients with CRC and AA when compared with controls (ie, those without colorectal neoplasia), which resulted in an SN and SP of 77% and 70%, respectively.19 However, there were a number of limitations. First, the number of patients with CRC was low and biomarker selection specifically for CRC was not possible. Second, Tedlar bags were used for breath sampling and more sophisticated sampling and analytical technologies were implemented since then that have improved detection rates and stability of VOCs.20 Third, important lifestyle factors such as alcohol usage and smoking were missing and diet was not taken into consideration. Finally, the results were not validated and no follow-up data were available.

To overcome the limitations of previous studies and identify and validate a robust and highly accurate VOC profile for CRC but also its precursor lesions, the Endoscope CRC cohort (‘Development and evaluation of diagnostic and prognostic biomarker profiles for colorectal cancer and premalignant lesions’) has been set up. The study received approval from the medical ethics committee of Maastricht University Medical Centre (NL74844.068.20). Inclusion started in January 2022 and currently 349 patients have been enrolled. This study protocol describes the study population, different objectives and hypotheses, and methods.

Methods and analysis

Study design

The Endoscope CRC cohort is a prospective observational cohort study with a biobank and 5-year follow-up.

Study objectives and hypothesis

The cohort study will be set up as an umbrella study with biobank for subresearch questions, which are in line with the umbrella hypothesis and cohort study objectives. A flow chart providing a schematic overview of the main study and substudies is presented in figure 1.

Figure 1

Timeline of study. CRC, colorectal cancer.

The primary objective will be answered by a cross-sectional design, the secondary objectives by longitudinal and/or subgroup analyses.

Umbrella hypothesis

We hypothesise that within subjects with positive FIT undergoing a colonoscopy as part of the Dutch national CRC screening programme, non-invasive biomarkers or biomarker profiles for colorectal neoplasia (ie, CRC, adenomas and SSL) can be identified, with colonoscopy plus histopathological evaluation as the gold standard. Biomarkers with high predictive accuracy could enhance screening, aid in surveillance, provide better insights into the prognosis of colorectal neoplasia and guide decisions regarding the timing of future surveillance colonoscopies.

Primary umbrella study objective

To identify and validate a non-invasive biomarker profile based on (volatile) metabolites and gut microbiota composition and activity for colorectal neoplasia with maximum true-negative rate and low false-positive rate, with colonoscopy plus histopathological evaluation as the gold standard reference.

Secondary umbrella study objectives

To explore potential prognostic biomarkers for colorectal neoplasia, which predict the reoccurrence of new precursor lesions within 5 years of follow-up (ie, for prognostic purposes), in subgroups of patients that undergo a surveillance colonoscopy within the 5 years of follow-up, with the surveillance colonoscopy plus histopathological evaluation as the gold standard reference.

To explore changes in measured biomarkers and biomarker profiles between baseline and the moment of surveillance colonoscopy to study the metabolic and gut microbial changes during follow-up.

In addition, the Endoscope CRC biobank will serve as a biobank of well-characterised patients for future translational studies on the diagnosis, pathophysiology and disease characteristics of colorectal neoplasia.

Main study: exhaled and faecal VOCs for the detection of colorectal neoplasia

Main study primary objective

To identify a profile of VOCs with high SP and acceptable SN for colorectal neoplasia in a selected population of subjects with positive FIT undergoing a colonoscopy as part of the Dutch national CRC screening programme (ie, cross-sectional analysis), with the endoscopic plus histopathological investigation as gold-standard reference.

Main study secondary objective

To explore the accuracy of the VOCs profiles to predict or identify newly developed colorectal neoplasia during the 5 years of follow-up, with surveillance colonoscopies plus histopathology as gold standard, which are planned as part of regular care surveillance after the initial index colonoscopy (ie, longitudinal analysis).

Substudies: detecting changes in exhaled VOCs after bowel preparation and postpolypectomy

Objective of bowel preparation substudy

To compare exhaled VOC profiles of patients undergoing a colonoscopy as part of the Dutch CRC screening programme before and after bowel preparation. We hypothesise that changes in exhaled VOC profiles may be seen in the short term and expect normalisation (ie, a return to profiles resembling baseline) of exhaled VOC profiles in the long term.

Objective of postpolypectomy substudy

To compare exhaled VOC profiles prepolypectomy and postpolypectomy. We hypothesise that VOC profiles may be normalised (ie, resemble profiles seen in subjects without colorectal neoplasia) after polypectomy.

Setting of the main study

Dutch inhabitants between 55 and 75 years old are biennially invited for the Dutch national CRC screening programme to complete an FIT, followed by colonoscopy if tested positive.21 Patients are first seen at a precolonoscopy outpatient screening intake. During the screening intake at Maastricht University Medical Centre+ (Maastricht, the Netherlands) and Catharina Hospital (Eindhoven, the Netherlands), the treating physicians or specialised nurses will inform eligible patients about the current study. If interested, patients receive detailed oral and written study information and an informed consent form. After informed consent, patients undergo baseline tests, including questionnaires and collection of blood, faeces and exhaled breath. Exhaled breath and faeces are sampled prior to bowel cleansing and colonoscopy. Blood is sampled postbowel cleansing but before colonoscopy. Questionnaires are filled in digitally or on paper. Furthermore, patients also provide access to their medical records including medical history, comorbidity, medication usage, demographics, treatment, histopathology and endoscopy data. All patients with NAA, AA or CRC undergoing polypectomy or curative treatment will have surveillance colonoscopies per national guidelines. For patients with CRC, this is 1 year and 3 years after index colonoscopy. For patients with a high-risk profile, this is 3 years after index colonoscopy, and for patients with a low-risk profile, this is 5 years after index colonoscopy. All patients, including those that do not undergo surveillance colonoscopy, complete follow-up questionnaires at years 1 and 5. Additionally, before each surveillance colonoscopy, participants will repeat questionnaires and provide faeces, blood and exhaled breath and their medical data will be updated. A timeline of the study is shown in figure 1.

Setting of the bowel cleansing and postpolypectomy substudies

In addition to the breath sample provided during the initial prescreening visit in the E-CRC cohort, patients are asked to provide exhaled breath samples shortly before the colonoscopy (ie, after bowel preparation), directly after the colonoscopy and approximately 2 weeks after the colonoscopy.

Study inclusion criteria of main and substudies

Adult subjects participating in the Dutch national CRC screening programme with positive FIT undergoing colonoscopy.

Study exclusion criteria of main and substudies

A history of CRC.

History of any malignant disease within the past 5 years.

Major surgery within 6 months prior to colonoscopy.

Blood transfusion during 4 weeks prior to blood collection.

Serious non-healing wound, ulcer or bone fracture.

Methods

Study groups of the main study

Study subjects will be categorised into four subgroups based on the endoscopic plus pathological findings of the index colonoscopy, which can include CRC, AA (size ≥1 cm, (tubulo)villous histology and/or high-grade dysplasia), NAA, SSL with or without dysplasia and those with no relevant endoscopic findings (ie, controls) including hyperplastic polyps, diverticular disease, haemorrhoids and angiodysplasia. The study groups will be based on the most advanced lesion found in case of multiple neoplastic lesions in the same subject during colonoscopy.

Sample size of the main study

At least 2000 subjects with positive FIT as part of the national Dutch CRC screening programme will be asked to participate in the study. They will be recruited from the prescreening intake of Maastricht University Medical Centre+ and Catharina Hospital Eindhoven.

Based on the national Dutch CRC screening programme data of 2017, the following distribution of findings during the index colonoscopy can be anticipated, after positive FIT5:

CRC—7.2%—144 cases.

AA—39.3%—786 cases.

NAA—24.5%—490 cases.

Serrated lesions—5.4%—108 cases.

No relevant findings (ie, controls)—23.6%—472 cases.
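The anticipated case counts listed above follow from applying the 2017 programme percentages to the recruitment target of 2000 subjects. The short Python sketch below only illustrates that arithmetic; the dictionary and variable names are ours and are not part of the protocol.

```python
# Illustrative arithmetic only: expected findings at index colonoscopy,
# using the 2017 percentages quoted above and the target of 2000 subjects.
TARGET_SUBJECTS = 2000

expected_percentages = {
    "CRC": 7.2,
    "AA": 39.3,
    "NAA": 24.5,
    "Serrated lesions": 5.4,
    "Controls": 23.6,
}

# Round to whole subjects; the percentages sum to 100, so the counts sum to 2000.
expected_counts = {
    group: round(TARGET_SUBJECTS * pct / 100)
    for group, pct in expected_percentages.items()
}

for group, count in expected_counts.items():
    print(f"{group}: {count} cases")
```

The counts reproduce the figures given in the list (144, 786, 490, 108 and 472).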

Sample size calculation of the main study

Data from our pilot study of VOC analysis in AA and controls showed an SN of 75% and an SP of 70% (via bootstrapping) for patients with AA.

The current study aims at SN and SP ≥85%. The number of samples required is calculated by the following formula:

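$$n_1 = \frac{z_{\alpha/2}^{2}\,\mathrm{SN}\,(1-\mathrm{SN})}{W^{2}\,P}, \qquad n_2 = \frac{z_{\alpha/2}^{2}\,\mathrm{SP}\,(1-\mathrm{SP})}{W^{2}\,(1-P)}$$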

Here, n1 and n2 are the sample sizes, P is the disease prevalence, W is the width of the 95% CI for SN and SP (10%) and zα/2=1.96.

Taking all this into account, at least 130 individuals with CRC will be recruited to meet the necessary SN and SP. Rules of thumb specify that the number of subjects per variable should be around 4–10, or that a minimum of 50 subjects is needed, to yield a clear, recognisable factor pattern. This suggests that we would need at least 50 subjects per group (disease and controls).

This study is designed as a cohort with a biobank for future subresearch questions. The current target number of participants to be included (ie, 2000) is set as a minimum to ensure a relevant representation of all subgroups mentioned above, with approximately 7% (equivalent to 140 individuals) of index colonoscopies expected to identify CRC. The 10 individuals above the required 130 are included because, since 2017 (the reference year on which we base the power calculation), the percentage of CRC identified in the Dutch CRC screening programme has slowly decreased.

Sample size calculation for bowel preparation and postpolypectomy substudies

The nature of the statistical method used in the metabolomics approach (using gas chromatography-mass spectrometry (GC-MS)) makes an exact sample size calculation difficult. However, in a recent review by Drago et al, changes in intestinal microbiota after bowel preparation were seen in small study populations with a maximum of 23 healthy donors.22 Furthermore, in the studies by van Keulen et al and Leja et al, the SN and SP to distinguish between prebowel and postbowel preparation were roughly 75%.23 24 Therefore, and because our current study uses more sensitive platforms that also strongly reduce background effects and allows correction for variation between subjects, we expect to reach a higher SN and SP of 87%, although 70% would be acceptable. Based on these numbers, we used equation 1 below to determine the number of subjects needed. Here, n1 is the total number of samples needed, zα/2 is a value from a standard normal table (1.96), SN is sensitivity (87%), W is the acceptable width of the 95% CI (17%), and the result is multiplied by 2 to account for the prevalence of either one of the groups (equal group sizes). Therefore, for our first substudy on bowel cleansing, we aim to include 30 subjects. Patients without any endoscopic abnormalities are considered controls.

$$n_1 = 2 \times \frac{z_{\alpha/2}^{2}\,\mathrm{SN}\,(1-\mathrm{SN})}{W^{2}} \qquad (1)$$

For the second substudy on postpolypectomy, data on the optimal number of patients to include are scarce. The factors involved are similar to those mentioned previously, with polypectomy as an additional factor. Therefore, we expect a bigger effect size. One of the questions is whether the VOC profile that differentiates AA from negative controls will have disappeared about 2 weeks after colonoscopy, when compared with the profile before bowel cleansing. Based on an expected SN and SP of 85% with an acceptable CI width of 15% and using similar calculations, this means including 44 subjects with AA. Based on a population of 120 subjects, this target is reached, as in such a population we expect to include 30–33 controls, 28–36 patients with NAA, 44 patients with AA and 7 patients with CRC using the most recent national CRC screening data.25
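The two substudy targets can be checked numerically. The sketch below assumes that equation 1 takes the common form n = 2 × z² × SN × (1 − SN) / W²; this form is our assumption, chosen because it matches the symbol definitions given above and reproduces the quoted targets. The function name is hypothetical.

```python
def substudy_sample_size(sn, width, z=1.96):
    """Total sample size for two equally sized groups.

    Assumed form of equation 1: n = 2 * z^2 * SN * (1 - SN) / W^2,
    where the factor 2 reflects equal group sizes (prevalence of 0.5).
    """
    per_group = (z ** 2) * sn * (1 - sn) / (width ** 2)
    return 2 * per_group


# Bowel preparation substudy: SN = 87%, W = 17%.
bowel_prep = substudy_sample_size(sn=0.87, width=0.17)

# Postpolypectomy substudy: SN = 85%, W = 15%.
polypectomy = substudy_sample_size(sn=0.85, width=0.15)

print(round(bowel_prep), round(polypectomy))  # 30 44
```

Under this assumed form, the unrounded values are approximately 30.1 and 43.5, which agree with the 30 and 44 subjects stated in the text after rounding to the nearest whole subject.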

Sampling procedures for main and substudies

The following medical information and biomaterial will be collected (see table 1).

Table 1

Overview of collected data and biomaterial in the endoscope CRC (E-CRC) cohort

Blood sample collection

Before the colonoscopy and in fasting conditions, a blood sample will be collected from an antecubital vein in the forearm. A volume of 8 mL will be collected in an SST tube to obtain serum, and one 9 mL BD Vacutainer tube containing K2EDTA will be used to collect plasma and whole blood. After collection, the 8 mL SST tubes will be kept at 20°C to allow coagulation and will be centrifuged (10 min, 4000 rpm, 4°C). Serum and plasma will be collected in 0.5 mL aliquots and stored at −80°C until analysis. Blood will be stored in the biobank for future research questions in line with the objectives of this study.

Exhaled breath sample collection

Exhaled breath will be collected at the gastroenterology outpatient clinic after the intake. Breath is sampled with subjects in resting conditions. In each centre, one study room is designated for breath sampling. Patients will be asked to exhale into a ReCIVA breath sampler (Owlstone Medical, Cambridge, UK), which transports and traps VOCs of the alveolar breath fraction onto desorption tubes (stainless steel two-bed sorption tubes filled with TENAX/Carbograph 5TD; Markes International, Llantrisant, Wales, UK). In addition, ambient air and site equipment are checked weekly. The sampled tubes will be stored at 4°C and dry purged within 2 weeks, before subsequent analysis. After dry purging of the tubes (ie, to diminish the effects of water), GC-MS will be used for determination and identification of VOCs in breath samples as described previously.26 VOCs will be measured in both a targeted and an untargeted manner, and subsequently machine learning tools will be used to identify specific VOC profiles for colorectal neoplasia.

Faecal sample collection

Patients will be provided with a faecal specimen collection bucket. On the day prior to colonoscopy, and prior to bowel cleansing, they will collect a single faecal sample at home, store it in the freezer at −20°C for no longer than 24 hours and deliver it in a cool box at 4°C when arriving at the hospital. Faeces will be stored at −80°C until subsequent analysis. Part of it will be aliquoted and used to statically collect faecal VOCs using HiSorb probes, which will subsequently be measured using GC-MS.27

Measurements in biopsy material

In case of future research questions in line with the objectives of this study, residual histological material may be used if the patient has provided permission. In case histological material has been obtained during colonoscopy or after surgery and stored at the department of pathology, permission of the study participants is asked to use residual histological material for research purposes. Potential applications include tissue microbiota and VOCs analyses.

Questionnaires

Patients are asked to fill out the following questionnaires at baseline and follow-up:

Demographic characteristics, medical history, self-reported comorbidities, medication use and lifestyle factors, including smoking, alcohol use and diet.

EuroQol-5 Dimension (EQ-5D) to assess quality of life.

Data from medical files

Patients will be asked permission to collect data from their medical files at baseline, year 1 and year 5, and if applicable, during each surveillance colonoscopy. The data collected will be related to:

Medical history and medication use in line with data collected as part of the National Dutch CRC screening programme.

Colonoscopy results.

Histological findings related to colorectal lesions or in case of malignancy to metastasis.

Results of radiological examinations performed for staging of colorectal lesions.

With regard to follow-up measurements during surveillance colonoscopy: data related to recurrence or new premalignant or malignant colorectal lesions or metastasis.

Analysis

Baseline statistical analysis

Clinical, anthropometric, demographic, endoscopic and pathology data will be collected in a standardised manner as part of the national Dutch CRC screening programme using hospital records and questionnaires. Clinical data such as age, body mass index (BMI), endoscopic findings, medication usage and medical history will be presented as mean with corresponding SD, median with corresponding IQR or absolute number with percentage. Differences in baseline characteristics will be tested using the χ2 test (dichotomous data), independent samples t-test (normally distributed continuous data) or Mann-Whitney U test (non-normally distributed continuous data). One-way analysis of variance (ANOVA) will be performed to compare means between more than two groups. A p value <0.05 will be considered statistically significant.

Breath data preprocessing

GC-MS data will be preprocessed prior to subsequent statistical analysis. This consists of noise removal via wavelets and baseline correction via P-splines. Next, chromatograms are aligned using Correlation Optimised Warping. Subsequently, probabilistic quotient normalisation and peak picking based on retention times and mass spectra will be performed to normalise the data and create the final data matrix, respectively. To reduce the number of uninformative variables, only compounds that are statistically confirmed as originating from breath and that are detected in at least 20% of the samples in at least one of the categorised disease classes (ie, controls, NAA, AA, SSL and CRC) will be kept for further analysis. Lastly, the data will be explored and corrected for the presence of instrumental variation over time (ie, batch effects) using ComBat and Surrogate Variable Analysis.
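Two of the preprocessing steps described above, probabilistic quotient normalisation and the 20% detection filter, can be illustrated with a minimal Python sketch. This is a simplified illustration under stated assumptions and not the study pipeline: the data are toy values, the function names are hypothetical, the reference profile is taken as the feature-wise median across all samples, a compound counts as detected when its intensity is above zero, and the total-intensity prescaling often applied before quotient normalisation is omitted.

```python
from statistics import median


def pqn_normalise(samples):
    """Simplified probabilistic quotient normalisation.

    samples: list of rows, one row of peak intensities per breath sample.
    Each row is divided by the median of its quotients against a
    reference profile (here: the feature-wise median over all samples).
    """
    n_features = len(samples[0])
    reference = [median(row[j] for row in samples) for j in range(n_features)]
    normalised = []
    for row in samples:
        # Skip features that are absent in the sample or in the reference.
        quotients = [x / r for x, r in zip(row, reference) if x > 0 and r > 0]
        factor = median(quotients) if quotients else 1.0
        normalised.append([x / factor for x in row])
    return normalised


def keep_detected(samples, labels, min_fraction=0.20):
    """Indices of features detected in >= min_fraction of at least one class."""
    kept = []
    for j in range(len(samples[0])):
        for cls in sorted(set(labels)):
            rows = [row for row, lab in zip(samples, labels) if lab == cls]
            detected = sum(1 for row in rows if row[j] > 0)
            if detected / len(rows) >= min_fraction:
                kept.append(j)
                break
    return kept


# Toy data: three samples that differ only by a dilution factor.
toy = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
normalised_toy = pqn_normalise(toy)

# Toy data: feature 0 is detected in 1 of 5 controls (20%), feature 1 never.
sparse = [[5.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
kept_features = keep_detected(sparse, ["control"] * 5)
```

In the first toy example, the three samples become identical after normalisation, which is the intended effect of removing sample-wise dilution differences. In the second, only the first feature passes the detection filter.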

Breath data analysis

After data preprocessing, the exhaled breath contents of the FIT-positive population will be used to find discriminatory markers and to create prediction models for the final outcomes of the colonoscopy procedures. FIT-positive individuals will be assigned as either control, NAA, AA, SSL or CRC. To this end, multiple models will be built to discriminate controls versus NAA versus AA versus CRC. Several modelling strategies exist, including partial least squares discriminant analysis, support vector machines, tree-based techniques (eg, random forest (RF)) and neural networks. Given the research question and the data characteristics, which include sparsity, missing values, heterogeneity and a need to interpret the classification algorithm, tree-based techniques are suitable. They do not suffer from centring and scaling issues, are robust to outliers and can handle missing values. Using biplots, they can be interpreted similarly to principal component analysis (PCA) biplots and can be extended to unsupervised tasks (ie, unsupervised random forest (URF)). Therefore, in the following scenarios we describe the data analysis making use of RF algorithms.

Example model 1: discriminating controls and AA

To discriminate between controls and AA in model 1, the dataset is first split into a training set and an independent test set using the isolation-forest procedure. Variable selection is based on the variable importance as assessed by an internal iterative validation procedure of RF (1000 iterations of 1000 trees each). Using only the set of most discriminatory marker variables, the final model will be built and tested on the independent test set. A stepwise overview of the procedure is presented in figure 2.

Figure 2

Representation of the data analytics that were applied in the current study. An elaborate overview of the different preprocessing steps is available here.21 In this study, COMBAT was applied as a batch-effect correction technique and isolation forests were used to select the representative subset for the independent test set. GC-MS, gas-chromatography-mass spectrometry; RF, random forest.

Model performance

Model performance will be assessed using receiver operating characteristic (ROC) curves as well as precision-recall curves.
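As a minimal illustration of these two performance measures, the Python sketch below computes the area under the ROC curve through its rank interpretation (the probability that a randomly chosen case scores higher than a randomly chosen control) and lists precision and recall at each observed score threshold. The labels and scores are toy values and the function names are hypothetical; the study itself may rely on established statistical software for these calculations.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    labels: 1 for cases (eg, AA), 0 for controls; scores: model outputs.
    Ties between a case and a control count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def precision_recall_points(labels, scores):
    """(threshold, precision, recall) for each observed score, high to low."""
    points = []
    for threshold in sorted(set(scores), reverse=True):
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p)
        fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p)
        fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and not p)
        points.append((threshold, tp / (tp + fp), tp / (tp + fn)))
    return points


# Toy example: two controls (0) and two cases (1).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(toy_labels, toy_scores)  # 0.75
pr_points = precision_recall_points(toy_labels, toy_scores)
```

In this toy example, three of the four case-control pairs are ranked correctly, giving an area under the curve of 0.75. Precision-recall curves are a useful complement when classes are imbalanced, as is expected for CRC in this cohort.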

Compound identification

When the set of discriminating VOCs is confirmed in an independent test set, the compounds will be putatively identified using the National Institute of Standards and Technology (NIST) library in combination with expert interpretation. Next, their identities will be confirmed using standards.

Example substudy analyses

Both substudy analyses are examples of (nested) study designs. Several methods exist for the analysis of designed studies, ranging from univariate methods (eg, ANOVA) to multivariate methodologies (eg, ANOVA-simultaneous component analysis (ASCA)).28 Within the univariate framework, linear mixed models (LMM) are popular and capable of distinguishing between fixed effects (eg, polypectomy or bowel cleansing) and random effects (eg, repeated measurements of patients) while considering covariates (eg, age, BMI). Typical multivariate techniques are regularised-MANOVA and ASCA.28 29 However, it has recently been acknowledged that regular versions of these multivariate algorithms break down when working with noisy data, unbalanced data or missing values, or when dealing with random effects (eg, repeated measures). To resolve these issues, a plethora of extensions have been created. ASCA+ is an extension of ASCA that uses the LMM framework instead of ANOVA, making it effective for handling imbalanced data.30 (V)ASCA+ builds on this framework by adding feature selection.31 Other implementations include LMM-PCA, repeated-measures-ASCA+ (RM-ASCA+) and longitudinal RM-ASCA+.32–34 These implementations have shown success in dealing with random effects. Moreover, the matrix decompositions can be performed or combined with other non-linear algorithms such as URF. Altogether, the exact data-analytical approach will depend on missing values, missing measurements, signal-to-noise ratio, non-linearities and practical constraints with respect to the design (eg, the time dimension).

Faecal analysis

A part of the aliquoted faeces will be used for faecal VOCs analysis and a part will be stored in the biobank for future microbiota analysis.

Blood analysis

Whole blood, serum and plasma samples will be stored in the biobank for future research questions in line with the objectives of this study.

Patient and public involvement

Patients and the public were not involved in the design, conduct, reporting or dissemination plans of our research.

Ethics and dissemination

The study will be conducted according to the principles of the Declaration of Helsinki (64th WMA General Assembly, Fortaleza, Brazil, October 2013) and in accordance with the Medical Research Involving Human Subjects Act. The study was approved by the medical ethics committee at the Maastricht University Medical Center+ (NL74844.068.20) in November 2021. All participants will provide informed consent prior to their inclusion in the study, ensuring that they fully understand the purpose, procedures, risks and benefits of the study. Informed consent will be asked for the storage of biomaterial in the biobank for future studies. Data and biobank samples are handled confidentially according to the EU General Data Protection Regulation. All samples and data will be coded in such a way that no personal information about the participants will be available. Only authorised study personnel will have access to the identifying information. Data sharing will follow the FAIR principles (Findable, Accessible, Interoperable and Reusable) to promote transparency and encourage further research. Before deposition, data will undergo thorough quality checks and will be anonymised to protect participant confidentiality. To ensure long-term accessibility and usability, data curation will be performed by the study’s data management team using Castor, an electronic data management platform. This process will include maintaining version control, ensuring proper storage formats and updating metadata as required. The data repository selected will have long-term data preservation guarantees to ensure that the data remain accessible for a minimum of 10 years. Any sharing or reuse of the study data will be subject to a data-sharing agreement that ensures compliance with ethical guidelines and regulatory standards.

To date, no analyses have been conducted and no publications have been based on study data. We aim to publish at least one main publication of study results in a peer-reviewed scientific journal. Additional publications as a result of the substudies are expected. Members of the research group who have made a substantial contribution will be authors of the publications.
