Tallinn University of Technology


1. Data summary - questions, guidance, sample

 

What is the purpose of the data collection/generation and its relation to the objectives of the project?

...

  • Biomarker data will be saved in .csv format.
  • PCR data will be saved in .csv format.
  • Questionnaire data will be saved in SAS format.
  • Data on prescribing practices before and after the pilot trial will be managed in SAS (file format: .sas7bdat) and analyzed in Stata (file format: .dta).
  • Interview responses will be saved in NVivo .nvp format.
  • Survey responses will be exported from REDCap to .csv format.
  • Register data will be received in spreadsheet format and will be converted to .tsv format before analysis.
  • Sequencing data will be in .fastq format.
  • Flow cytometry data will be saved in .fcs format.
  • Confocal images will be saved in .jpeg format.
  • Raw proteome data will be saved in .raw format.
  • Raw methylation data will be in .idat format.
  • Raw genetic variation data will be in .vcf format.
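
The conversion of register data to .tsv mentioned above could be sketched as follows. This is a minimal illustration only, assuming the spreadsheet has already been exported as comma-separated text; the function name csv_to_tsv and the sample rows are hypothetical and not part of any TalTech procedure.

```python
import csv
import io


def csv_to_tsv(src, dst):
    """Copy every row from a comma-separated stream to a tab-separated stream.

    Returns the number of rows written (header included).
    """
    reader = csv.reader(src)
    # Tabs as delimiter; "\n" line endings keep the output identical across platforms.
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    rows = 0
    for row in reader:
        writer.writerow(row)
        rows += 1
    return rows


# Hypothetical register extract; a quoted field containing a comma
# must survive the conversion as a single column.
sample = io.StringIO('id,diagnosis,year\n1,"asthma, mild",2019\n')
out = io.StringIO()
rows_written = csv_to_tsv(sample, out)
print(out.getvalue())
```

With real files, the same function can be called on handles opened with open(path, newline="", encoding="utf-8"). Using the csv module rather than a plain text replacement keeps quoted fields that contain commas intact.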

 FAIR data

 



2. Making data findable, including provisions for metadata - questions, guidance, sample


Questions:


Will data be identified by a persistent identifier?

...

Reusability – This sub-section should provide information on the expected documentation (e.g., explaining methodology, codebooks, variables),


Sample:

...

2. Making data findable, including provisions for metadata


  • Data will be described by rich metadata using standard or specified terminologies:

...

  • Metadata will be deposited at TalTechData and be freely searchable. There will be links to the underlying data.




3. Making data openly accessible - questions, sample


Repository:

Will the data be deposited in a trusted repository?

...

Will documentation or a reference for any software needed to access or read the data be included? Will it be possible to include the relevant software (e.g. as open source code)?



Sample:

...

3. Making data accessible 


Data and metadata will be retrievable by their unique and persistent identifier assigned by the TalTechData repository.

...

Analysis scripts and other developed code will be uploaded to TalTechData.



4. Making data interoperable - questions, sample


Are the data produced in the project interoperable, that is, do they allow data exchange and re-use between researchers, institutions, organisations, countries, etc. (i.e. do they adhere to standards for formats, comply as far as possible with available (open) software applications, and in particular facilitate re-combination with different datasets from different origins)?

...

Will your data include qualified references to other data (e.g. other data from your project, or datasets from previous research)?


Sample:

...

4. Making data interoperable 


We plan to make our datasets interoperable by using controlled vocabularies, keywords or ontologies where possible and by using file formats that are as open and widely used as possible.


5. Increase data re-use (through clarifying licences) - questions, sample



How will you provide documentation needed to validate data analysis and facilitate data re-use (e.g. readme files with information on methodology, codebooks, data cleaning, analyses, variable definitions, units of measurement, etc.)?

...

Describe all relevant data quality assurance processes.



Sample:

...

5. Increase data re‐use


We plan to make our datasets reusable by assuring high data quality, by providing all documentation needed to support data interpretation and reuse, and by clearly licensing the data via the repository so that others know what kinds of reuse are permitted.

...

  • Data will be quality‐checked at collection/generation by validation against controls or publicly available databases.
  • RNA-seq data will be quality controlled in terms of sequence quality, sequencing depth, read duplication rates (clonal reads), alignment quality, nucleotide composition bias, PCR bias, GC bias, rRNA and mitochondrial contamination, and coverage uniformity. Only high‐quality data will be included in the subsequent analysis.
  • The register holder assures data quality in terms of completeness and correctness of registration.
  • The transcribed interview material will be coded independently by two researchers.
  • Images will be inspected for artifacts and the results will be recorded in a spreadsheet file.
  • Mass spectrometry results will be quality‐checked for contamination and mass accuracy.
  • Register data will be quality controlled according to a procedure established in our group (REF).
  • Data will be checked at the point of entry in REDCap or SMART‐TRIAL for double entries, completeness, missing data and unreasonable values.  
  • To assure data quality, the study will be conducted according to the COREQ guidelines for qualitative research.
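
The entry checks listed above (double entries, missing data, unreasonable values) could be expressed along the following lines for an exported table. This is a sketch under stated assumptions, not a description of what REDCap or SMART‐TRIAL do internally: the field names, the plausible ranges and the function check_records are all hypothetical.

```python
def check_records(records, id_field, required, ranges):
    """Return a list of (row_index, field, problem) tuples for a list of dict records."""
    problems = []
    seen_ids = set()
    for index, record in enumerate(records):
        # Double entries: the same identifier appears more than once.
        record_id = record.get(id_field)
        if record_id in seen_ids:
            problems.append((index, id_field, "duplicate entry"))
        seen_ids.add(record_id)

        # Completeness: required fields must not be empty.
        for field in required:
            if record.get(field) in (None, ""):
                problems.append((index, field, "missing value"))

        # Unreasonable values: numeric fields must fall inside a plausible range.
        for field, (low, high) in ranges.items():
            value = record.get(field)
            if value in (None, ""):
                continue  # already reported as missing if the field is required
            try:
                number = float(value)
            except (TypeError, ValueError):
                problems.append((index, field, "not a number"))
                continue
            if not low <= number <= high:
                problems.append((index, field, "out of range"))
    return problems


# Hypothetical survey export; all values are illustrative.
records = [
    {"participant_id": "P001", "age": "34", "weight_kg": "71.5"},
    {"participant_id": "P002", "age": "", "weight_kg": "68"},
    {"participant_id": "P001", "age": "340", "weight_kg": "70"},
]
issues = check_records(
    records,
    id_field="participant_id",
    required=["age", "weight_kg"],
    ranges={"age": (0, 120), "weight_kg": (2, 300)},
)
for issue in issues:
    print(issue)
```

Recording the resulting list alongside the dataset is one way to document which checks were run and what they found.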

Allocation of resources



6. Allocation of resources - questions, guidance, sample


What are the costs for making data FAIR in your project?

...

Guidance - This section should include a discussion on the resources such as costs associated with compliance to the FAIR principles or who will be responsible for data management.


Sample:


6. Allocation of resources 


  • Data management is performed by the PI / a research assistant / a postdoc / a dedicated data manager.
  • Salary of X EUR for a data manager in the group is required.
  • Access to the departmental server is required. It is expected to cost X EUR.

...

Guidance - The management of other research outputs that are generated/re-used in the project (e.g., software, models, new materials) should be discussed and, when relevant, their compliance to the FAIR principles should be detailed.




7. Data security - questions, guidance, sample

 

What provisions are in place for data security (including data recovery as well as secure storage and transfer of sensitive data)?

...

  • Data saved on XXX servers is backed up.
  • Access to data saved on XXX servers requires user authentication with a password.
  • Access to servers is permitted only when on TalTech premises or by VPN.
  • In OneDrive, it is possible to recover changed/deleted datasets.
  • We only work with pseudonymized data, with the key stored in a safety cabinet located at XXX (please specify location) to which only XXX have access (please specify the people who have access to it).
  • It has been judged that controlled access is not required for these data since the data do not contain personal information.

 

Ethical aspects



8. Ethical aspects - questions, guidance, sample

 

Are there any ethical or legal issues that can have an impact on data sharing?

...

  • Sensitive personal data will be handled according to GDPR.
  • IP rights will be managed in accordance with the contract drawn up with our industrial partner organization (specify).
  • Survey and clinical data will be anonymized, i.e. every possibility of tracing the data back to the study participant will be removed. The data are anonymized once the code key is destroyed and it is no longer possible to connect a person to the data.
  • Data will be pseudonymized and a key will be kept separately from the data.
  • Patient data is pseudonymized by the clinical collaborator and the code key is not accessible to researchers in our research group. The material will arrive at our research group coded, and the original code key will be kept by the collaborators.
  • Ethical approvals/amendments and informed consent forms for the project are registered in the diary.
  • Consent has been acquired from human participants to process/share data.
  • Data Transfer/Processing agreements will be signed prior to any data sharing.
  • Results will only be presented at an aggregated level without any possibility of backward identification.
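
Pseudonymization with a separately kept key, as described in the samples above, could look roughly like the sketch below. It is an illustration under stated assumptions rather than a prescribed method: the function pseudonymize, the "S-" code prefix and the sample records are hypothetical, and how the key is stored is left to the project.

```python
import secrets


def pseudonymize(records, id_field):
    """Replace identifiers with random codes.

    Returns (coded_records, key), where key maps each original identifier
    to its code. The key is meant to be stored separately from the data.
    """
    key = {}
    coded_records = []
    for record in records:
        original = record[id_field]
        if original not in key:
            # Random code with no relation to the original identifier.
            code = "S-" + secrets.token_hex(4)
            while code in key.values():  # regenerate on the unlikely collision
                code = "S-" + secrets.token_hex(4)
            key[original] = code
        coded = dict(record)  # copy so the input records are left unchanged
        coded[id_field] = key[original]
        coded_records.append(coded)
    return coded_records, key


# Hypothetical input; identifiers and values are illustrative only.
records = [
    {"person_id": "EE-1001", "hb_g_l": 131},
    {"person_id": "EE-1002", "hb_g_l": 118},
    {"person_id": "EE-1001", "hb_g_l": 127},
]
coded_records, key = pseudonymize(records, "person_id")
print(coded_records)
```

Only coded_records would be shared with analysts. The key would be kept apart from the data, and destroying it corresponds to the anonymization step described in the bullet on survey and clinical data.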



9. Other - questions, guidance, sample

Do you, or will you, make use of other national/funder/sectorial/departmental procedures for data management? If yes, which ones (please list and briefly describe them)?

...