What documentation and metadata will accompany the data?

Questions to consider:

  • What information is needed for the data to be read and interpreted in the future?
  • How will you capture/create this documentation and metadata?
  • What metadata standards will you use and why?
  • What metadata will be provided to help others identify and discover the data?

Guidance:

Describe the types of documentation that will accompany the data to help secondary users to understand and reuse it. This should at least include basic details that will help people to find the data, including who created or contributed to the data, its title, date of creation and under what conditions it can be accessed. Documentation may also include details on the methodology used, analytical and procedural information, definitions of variables, vocabularies, units of measurement, any assumptions made, and the format and file type of the data. Consider how you will capture this information and where it will be recorded. Wherever possible you should identify and use existing community standards.
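
As an illustration of the basic discovery details described above, the following is a minimal sketch of a metadata record and a completeness check. The field names (`title`, `creators`, `date_created`, `access_conditions`) and the helper `missing_fields` are hypothetical and do not come from any particular metadata standard; the values are placeholders.

```python
import json

# Hypothetical minimal set of discovery fields, based on the guidance text
# (who created the data, its title, date of creation, access conditions).
REQUIRED_FIELDS = ("title", "creators", "date_created", "access_conditions")


def missing_fields(record):
    """Return the required fields that are absent or empty in a record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Placeholder values only; this is not a real dataset.
record = {
    "title": "Example dataset title",
    "creators": ["A. Researcher"],
    "date_created": "2024-01-31",
    "access_conditions": "Available on request",
    # Optional documentation, e.g. variable definitions and units.
    "variables": {"temp_c": "Air temperature in degrees Celsius"},
}

print(json.dumps(record, indent=2))
print("Missing fields:", missing_fields(record))
```

In practice, a community standard relevant to your discipline would define the actual field names and required elements.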

When selecting the "Metadata standards" option:

SAMPLE 1:

The clinical data collected in this project will be documented using the CDASH v1.1 standard, which is available on the CDISC website.

SAMPLE 2:

We will use an electronic lab notebook, which generates metadata for each notebook and posting. The metadata will include Sections, Categories and Keys, which collaborators will assign and reuse to keep terminology consistent. We will also use the Properties Ontology (ChemAxiomProp) to describe chemical and materials properties.

SAMPLE 3:

We will use core elements from the TEI metadata standard to describe our data. We will also add customised metadata elements to provide more detail on rights management.

SAMPLE 4:

We have developed our own software platform for data documentation and metadata storage, which was recently published in PLOS Computational Biology 16(12): e1008475. Every part of the platform is fully documented. The documentation covers the methodology used, analytical and procedural information, definitions of variables, units of measurement, any assumptions made, and the format and file type of the data. Whenever a new analysis method is developed, we will immediately add the corresponding documentation for future use.



When selecting the "No metadata standards will be used." option:

SAMPLE 1:

I will not be using any metadata or international standard for the data collected and generated in this project. However, I will ensure that each document I create using Microsoft Word, Microsoft Excel and Microsoft PowerPoint has sufficient basic information, such as Author’s name, Title, Subject and Keywords, in the document properties. In addition, a separate readme file will be prepared to describe each dataset in detail. I will apply the recommendations provided by Cornell University when creating the readme file(s). Key elements could include introductory, methodological, data-specific and sharing/access-related information.
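
As a rough illustration of such a readme file, the sketch below generates a plain-text outline with one empty section per key element. The section headings and the function `readme_skeleton` are assumptions made for this example, loosely based on the key elements listed above; they are not an official template.

```python
# Hypothetical section headings, loosely based on the key elements above.
SECTIONS = (
    "Introductory information",
    "Methodological information",
    "Data-specific information",
    "Sharing/access information",
)


def readme_skeleton(dataset_title):
    """Build a plain-text README outline with one empty section per heading."""
    lines = [f"README for: {dataset_title}", ""]
    for heading in SECTIONS:
        # Upper-case heading, an underline of the same length, a placeholder.
        lines += [heading.upper(), "-" * len(heading), "(to be completed)", ""]
    return "\n".join(lines)


print(readme_skeleton("Example dataset"))
```

The generated outline would then be filled in by hand and stored alongside the data files.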

SAMPLE 2:

Metadata about timing and exposure of individual images will be automatically generated by the camera. GPS locations will subsequently be added by post-processing GPS track data based on shared time stamps. Metadata for the image dataset as a whole will be generated by the image management software (iMatch) and will include time ranges, locations, and a taxon list. Those metadata will be translated into Ecological Metadata Language (EML), created using the Morpho software tool, and will include location and taxonomic summaries.
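
The GPS post-processing step described above could be sketched as follows. This is a minimal example that assigns each image the track point closest in time; it assumes the camera and GPS clocks are synchronised and use the same time zone. The function `nearest_fix`, the `max_gap` threshold and the coordinates are hypothetical and do not represent the method used by any particular tool.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def nearest_fix(track, when, max_gap=timedelta(seconds=60)):
    """Return (lat, lon) of the track point closest in time to `when`.

    `track` is a list of (datetime, lat, lon) tuples sorted by time.
    Returns None if the closest point is further away than `max_gap`.
    """
    times = [t for t, _, _ in track]
    i = bisect_left(times, when)
    # Candidates are the points just before and just after the image time.
    candidates = track[max(i - 1, 0):i + 1]
    if not candidates:
        return None
    t, lat, lon = min(candidates, key=lambda point: abs(point[0] - when))
    return (lat, lon) if abs(t - when) <= max_gap else None


# Made-up track points and image timestamp, for illustration only.
track = [
    (datetime(2024, 1, 1, 10, 0, 0), 1.3000, 103.8000),
    (datetime(2024, 1, 1, 10, 0, 30), 1.3005, 103.8004),
    (datetime(2024, 1, 1, 10, 1, 0), 1.3010, 103.8008),
]
image_time = datetime(2024, 1, 1, 10, 0, 40)
print(nearest_fix(track, image_time))
```

Images taken too long before or after any track point are left without a location rather than given a misleading one.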

The dataset will be accompanied by a README file that describes the directory hierarchy and file-naming convention.

Each directory will contain an INFO.txt file describing the experimental protocol used in that experiment. It will also record any deviations from the protocol and other useful contextual information. The microscope captures and stores a range of metadata (field size, magnification, lens phase, zoom, gain, pinhole diameter, etc.) with each image. This should allow the data to be understood by other members of our research group and add contextual value to the dataset should it be reused in the future.
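
A convention like this can be checked automatically. The sketch below lists every subdirectory that lacks an INFO.txt file; the function name `directories_missing_info` and the example path are hypothetical.

```python
from pathlib import Path


def directories_missing_info(root, info_name="INFO.txt"):
    """Return all subdirectories under `root` that lack an info file."""
    root = Path(root)
    return sorted(
        d for d in root.rglob("*")
        if d.is_dir() and not (d / info_name).is_file()
    )


# Example usage with a hypothetical dataset folder:
# for directory in directories_missing_info("my_dataset"):
#     print("No INFO.txt in", directory)
```

Running such a check before depositing the dataset helps confirm that every experiment directory is documented.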