METHODS: Common data elements (CDEs) were created using the National Cancer Institute’s Cancer Data Standards Registry and Repository. Each CDE consists of two parts: the Data Element Concept, which describes the form question, and the Value Domain, which describes how the answer should be reported. Together these hold the metadata required for a form question, and questions requiring the same metadata are represented by a common CDE. In addition, the metadata in FN was moved from the question level to a cross-form data dictionary and linked to CDEs to further standardize the required data and their format.
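The two-part CDE structure and the reuse of one CDE across questions can be illustrated with a minimal sketch. It is not the registry's actual schema: the class names, fields, question identifiers, and the `assign_cdes` helper are all hypothetical, chosen only to mirror the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementConcept:
    """Describes the form question (hypothetical, simplified)."""
    definition: str


@dataclass(frozen=True)
class ValueDomain:
    """Describes how the answer should be reported (hypothetical, simplified)."""
    datatype: str
    permissible_values: tuple = ()


@dataclass(frozen=True)
class CommonDataElement:
    """A CDE pairs one Data Element Concept with one Value Domain."""
    concept: DataElementConcept
    value_domain: ValueDomain


def assign_cdes(questions):
    """Map each question id to a CDE, reusing one CDE for identical metadata."""
    registry = {}    # CDE -> canonical CDE instance
    assignment = {}  # question id -> canonical CDE instance
    for question_id, concept, domain in questions:
        cde = CommonDataElement(concept, domain)
        assignment[question_id] = registry.setdefault(cde, cde)
    return assignment, list(registry)


# Illustrative questions only; the ids and wording are invented for this sketch.
sex_concept = DataElementConcept("Recipient sex")
sex_domain = ValueDomain("text", ("Male", "Female"))
questions = [
    ("baseline_q1", sex_concept, sex_domain),
    ("followup_q1", sex_concept, sex_domain),
    ("baseline_q2", DataElementConcept("Date of transplant"), ValueDomain("date")),
]
assignment, cdes = assign_cdes(questions)
```

In this sketch the baseline and follow-up questions about recipient sex resolve to the same CDE object, so three questions are covered by two CDEs.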
RESULTS: A review of the current baseline and follow-up forms indicates that harmonizing the data dictionary entries (DDCs) with the CDEs has led to an approximately 25% reduction (583 DDCs versus 437 CDEs) in the number of data points being defined multiple times. The result is more consistent forms and data collection.
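The reported percentage can be checked with a short calculation, assuming the reduction is measured as the difference between the DDC and CDE counts relative to the DDC count (an interpretation of the figures, not a method stated in the abstract). The variable names are illustrative.

```python
ddc_count = 583  # data dictionary entries, from the abstract
cde_count = 437  # common data elements, from the abstract

# Assumed definition: relative drop from the DDC count to the CDE count.
reduction = (ddc_count - cde_count) / ddc_count
print(f"{reduction:.1%}")  # prints 25.0%
```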
CONCLUSION: The CIBMTR's data collection forms include questions that are asked multiple times within and across forms. To facilitate data entry and analysis, form inconsistencies needed to be addressed. To help alleviate these issues, each data dictionary entry and its metadata are tied to a CDE. In addition, a metadata review is now undertaken at each step in the form revision and development process to ensure that questions are harmonized, terminology is used consistently, question formats are standardized, and option values are semantically similar. Exceptions are allowed only when clinical differences or regulatory compliance dictate. The benefits of using well-defined metadata and data standards include unambiguous interpretation of data points, improved data exchange, easier data analysis, improved cross-form consistency, and the creation of a pool of data elements to be used for new form development.