1. Data summary
What is the purpose of the data collection/generation and its relation to the objectives of the project?
What types and formats of data will the project generate/collect?
Will you re-use any existing data and how?
What is the origin of the data?
What is the expected size of the data?
To whom might it be useful ('data utility')?
Guidance – The type[s] of data that will be used in the project is[are] [insert the types of data that will be used such as experimental, observational, images, text]. The estimated size of the data is [insert data size]. The project will [collect/re-use existing/collect and re-use existing] data. The origins of the data will be [insert where data will be collected from and/or the origins of the re-used dataset].
Sample:
Origin of data:
- Image files will be recorded from a confocal microscope.
- RNA sequencing data will be generated from normal and tumor tissues from patients.
- Patient data will be acquired from the XXX Register.
- Survey responses will be acquired using the REDCap survey software.
- Measurements of markers of liver and renal function will be collected in the SMART‐TRIAL system.
- Respondent data will be acquired in clinical interviews.
- Existing bioinformatics data will be used for new analyses.
Data format:
- Biomarker data will be saved in .csv format.
- PCR data will be saved in .csv format.
- Questionnaire data will be saved in SAS format.
- Data on prescribing practices before and after the pilot trial will be managed in SAS (file format: .sas7bdat) and analyzed in Stata (file format: .dta).
- Interview responses will be saved in NVivo .nvp format.
- Survey responses will be exported from REDCap to .csv format.
- Register data will be received in spreadsheet format and will be converted to .tsv format before analysis.
- Sequencing data will be in .fastq format.
- Flow cytometry data will be saved in .fcs format.
- Confocal images will be saved in .jpeg format.
- Proteome raw data will be saved as .raw files.
- Raw methylation data will be in .idat format.
- Raw genetic variation data will be in .vcf format.
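One item in the sample above, converting register data to .tsv before analysis, can be illustrated with a minimal sketch. It assumes the spreadsheet has already been exported to .csv; the function name `csv_to_tsv` and the sample values are hypothetical and are not part of the template.

```python
import csv
import io


def csv_to_tsv(src, dst):
    """Copy every row from a CSV text stream to a tab-separated stream.

    Returns the number of rows written (header included).
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    count = 0
    for row in reader:
        writer.writerow(row)
        count += 1
    return count


# Illustrative in-memory example; real use would open files instead,
# e.g. open(path, newline="", encoding="utf-8").
sample = io.StringIO('patient_id,diagnosis\n001,"carcinoma, ductal"\n')
out = io.StringIO()
rows = csv_to_tsv(sample, out)
```

Using the csv module rather than a plain text replacement of commas keeps quoted fields that contain commas intact, which matters for free-text register fields.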