A readme file provides information about a data file and helps ensure that the data can be interpreted correctly, whether by you at a later date or by others when you share or publish the data. Standards-based metadata is generally preferable, but where no appropriate standard exists, writing “readme” style metadata is an appropriate strategy. A template for a README file is available on ORDO.
- Create one readme file for each data file, whenever possible. It is also appropriate to describe a "dataset" that has multiple, related, identically formatted files, or files that are logically grouped together for use (e.g. a collection of Matlab scripts). When appropriate, also describe the file structure that holds the related data files
- Name the readme so that it is easily associated with the data file(s) it describes
- Write your readme document as a plain text file, avoiding proprietary formats such as MS Word whenever possible. Format the readme document so it is easy to understand (e.g. separate important pieces of information with blank lines, rather than having all the information in one long paragraph)
- Format multiple readme files identically. Present the information in the same order, using the same terminology
- Follow the scientific conventions for your discipline for taxonomic, geospatial and geologic names and keywords. Whenever possible, use terms from standardized taxonomies and vocabularies
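As a minimal sketch of the naming advice above, the following Python helper derives a readme name from the data file it describes. The `<stem>_README.txt` pattern and the function name `readme_path_for` are assumptions made for illustration, not a convention prescribed by these guidelines. Any scheme that keeps the two files visibly paired would serve.

```python
from pathlib import Path


def readme_path_for(data_file: str) -> Path:
    """Return a readme path next to the data file, sharing its stem."""
    data_path = Path(data_file)
    # Hypothetical convention: <data file stem>_README.txt.
    # The .txt extension keeps the readme in plain text rather than
    # a proprietary format.
    return data_path.with_name(f"{data_path.stem}_README.txt")
```

For a file such as `survey/results_2020.csv` (an invented example), this yields a readme in the same folder whose name begins with `results_2020`, so the two sort together in a directory listing.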
Recommended content
To enable data sharing, include every element labelled "recommended minimum content".
Introductory information
- For each filename, a short description of what data it contains (recommended minimum content)
- Format of the file if not obvious from the file name
- If the data set includes multiple files that relate to one another, the relationship between the files or a description of the file structure that holds them
- Name/institution/address/email information for:
  - Principal investigator (or person responsible for collecting the data) (recommended minimum content)
  - Associate or co-investigators
  - Contact person for questions
- Date of data collection (can be a single date, or a range) (recommended minimum content)
- Information about geographic location of data collection (recommended minimum content)
- Date that the file was created (recommended minimum content)
- Date(s) on which the file(s) were updated and the nature of the update(s), if applicable
- Keywords used to describe the data topic
- Language information
Methodological information
- Method description, links or references to publications or other documentation containing experimental design or protocols used in data collection (recommended minimum content)
- Any instrument-specific information needed to understand or interpret the data
- Standards and calibration information, if appropriate
- Description of any quality-assurance procedures performed on the data
- Definitions of codes or symbols used to flag low-quality data, questionable values, or outliers that users should be aware of
- People involved with sample collection, processing, analysis and/or submission
Data-specific information
- Full names and definitions (spell out abbreviated words) of column headings for tabular data (recommended minimum content)
- Units of measurement (recommended minimum content)
- Definitions for codes or symbols used to record missing data (recommended minimum content)
- Specialized formats or abbreviations used (recommended minimum content)
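For tabular data, the items above can be drafted semi-automatically. The sketch below reads the header row of a CSV file and emits a plain-text skeleton with one entry per column, leaving the definitions and units to be filled in by hand. The function name `column_section`, the layout of the output, and the default missing-data code `NA` are all assumptions chosen for illustration.

```python
import csv
import io


def column_section(csv_text: str, missing_code: str = "NA") -> str:
    """Build a plain-text skeleton describing each column of a CSV file."""
    # Only the first row (the header) is read; data rows are ignored.
    header = next(csv.reader(io.StringIO(csv_text)))
    lines = ["DATA-SPECIFIC INFORMATION", ""]
    for name in header:
        lines.append(f"Column: {name}")
        lines.append("  Full name and definition: TODO")
        lines.append("  Units of measurement: TODO")
        lines.append("")  # blank line between entries for readability
    # Hypothetical default; replace with the code actually used in the data.
    lines.append(f"Missing data code: {missing_code}")
    return "\n".join(lines)
```

Calling `column_section("site_id,temp_c\n1,20.5\n")` (invented column names) produces one entry for `site_id` and one for `temp_c`. The `TODO` markers are placeholders that should be replaced before the readme is shared.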
Sharing/Access information
- Licences or restrictions placed on the data
- Links to publications that cite or use the data
- Links to publicly accessible locations of the data
- Recommended citation for the data
- Information about funding sources that supported the collection of the data
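One way to confirm that a readme covers the ten elements marked "recommended minimum content" is a small check like the one below. It is only a sketch: it assumes the readme records each element on its own `Label: value` line, and the labels in `RECOMMENDED_MINIMUM` are hypothetical short names for the elements listed above, not wording required by these guidelines.

```python
# Hypothetical short labels, one per "recommended minimum content" element.
RECOMMENDED_MINIMUM = [
    "File description",
    "Principal investigator",
    "Date of data collection",
    "Geographic location",
    "File creation date",
    "Methods",
    "Column definitions",
    "Units of measurement",
    "Missing data codes",
    "Formats and abbreviations",
]


def missing_minimum_fields(readme_text: str) -> list:
    """Return the labels that have no non-empty 'Label: value' line."""
    present = set()
    for line in readme_text.splitlines():
        label, sep, value = line.partition(":")
        # Count a field only if a colon is present and a value follows it.
        if sep and value.strip():
            present.add(label.strip().lower())
    return [f for f in RECOMMENDED_MINIMUM if f.lower() not in present]
```

An empty list means every assumed label was found with a value. A label that appears with nothing after the colon is reported as missing, which catches template fields left blank.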
Acknowledgements
These guidelines have been adapted from Cornell University’s Guide to writing “readme” style metadata.