The ICGC DCC is tasked with managing the data for the consortium. This data must adhere to specific formats and restrictions that ensure a standard of quality and correctness. This applies both to data submitted by partner sites and to derived data produced by the DCC. These rules are captured in a document called a Data Dictionary ("Dictionary" for short).
File validation is also driven by the rules set forth in these dictionaries. Thus, they are the canonical reference for most of the validation pipeline and other downstream tools.
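To illustrate what "validation driven by dictionary rules" can look like, here is a minimal sketch in Python. It is not the DCC's implementation or the real dictionary schema: the `Field` and `FileSchema` classes, the `validate_row` function, and the field names `example_id` and `example_code` are all hypothetical, invented for this example. It assumes a dictionary lists, per file type, the fields and the restrictions (required, regular expression) each value must satisfy.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Field:
    """Hypothetical dictionary entry for one column of a submission file."""
    name: str
    required: bool = False
    pattern: Optional[str] = None  # regex the whole value must match, if set


@dataclass
class FileSchema:
    """Hypothetical dictionary entry for one file type."""
    name: str
    fields: List[Field]


def validate_row(schema: FileSchema, row: Dict[str, str]) -> List[str]:
    """Check one row against the schema and return a list of error messages."""
    errors = []
    for f in schema.fields:
        value = row.get(f.name, "")
        if value == "":
            # Empty or absent values only fail when the field is required.
            if f.required:
                errors.append(f"{f.name}: missing required value")
            continue
        if f.pattern is not None and re.fullmatch(f.pattern, value) is None:
            errors.append(f"{f.name}: value {value!r} does not match {f.pattern!r}")
    return errors


# Made-up schema: the rules live in data, not in the validator code.
schema = FileSchema(
    name="example_file",
    fields=[
        Field("example_id", required=True),
        Field("example_code", pattern=r"[12]"),
    ],
)

print(validate_row(schema, {"example_id": "X1", "example_code": "2"}))
print(validate_row(schema, {"example_id": "", "example_code": "9"}))
```

The point of the design is that the validator contains no file-specific logic. Changing a rule means publishing a new dictionary version rather than changing code, which is why the dictionary can serve as the reference for downstream tools as well.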
The basic conceptual model of the dictionary is presented below:
To view the dictionary and compare existing versions, please see the Dictionary Viewer.
To view high-level notes on dictionary changes, please see Releases.