
NCPI FHIR Implementation Guide
0.2.0 - ci-build

NCPI FHIR Implementation Guide - Local Development build (v0.2.0). See the Directory of published versions

Example Study Metadata

The Data Dictionary
Harmonizations Used
Summary Data

Study Metadata describes the study itself rather than the data it contains. It helps users interpret that data and helps researchers discover data suited to their interests. Metadata originates from multiple sources, including the data dictionary, summary results, and details about transformations.

The Data Dictionary

The Study Metadata can be represented using a mix of profiles and terminology resources. For this example, we have a very simple dataset consisting of a single table called “demographics” which has 5 variables: subjectid, gender, age_at_enrollment, bmi and status. The variables gender and status each represent “categorical” type variables and each have their own list of possible values.

The first step is to define the language necessary to discuss these resources. To do this we create two CodeSystems:

The Dataset CodeSystem defines a local code representing each of the tables that make up the dataset. In this case, it is a CodeSystem with only one code defined.
The Data Table CodeSystem defines a code for each of the variables contained within a specific table. These codes are used not only in the metadata FHIR resources but also in other resources that carry the original source values, for example as additional codes in an Observation or Condition. For this example, our table CodeSystem has 5 distinct codes, one for each of the variables mentioned above.
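As a rough illustration of the two CodeSystems described above, the following Python sketch builds them as plain dictionaries in FHIR R4 JSON shape. The base URL, resource ids, titles, and display strings are assumptions made for this sketch and are not taken from the IG's published examples.

```python
import json

# Hypothetical base URL; the real canonical URLs are defined by the IG, not by this sketch.
BASE = "https://example.org/fhir/CodeSystem"


def make_code_system(cs_id, title, codes):
    """Build a minimal FHIR R4 CodeSystem as a plain dict."""
    return {
        "resourceType": "CodeSystem",
        "id": cs_id,
        "url": f"{BASE}/{cs_id}",
        "title": title,
        "status": "draft",
        "content": "complete",
        "count": len(codes),
        "concept": [{"code": code, "display": display} for code, display in codes],
    }


# Dataset CodeSystem: one code per table; this dataset has a single table.
dataset_cs = make_code_system(
    "example-dataset",  # hypothetical id
    "Example Dataset",
    [("demographics", "Demographics table")],
)

# Data Table CodeSystem: one code per variable in the demographics table.
table_cs = make_code_system(
    "example-demographics",  # hypothetical id
    "Demographics Table Variables",
    [
        ("subjectid", "Subject ID"),
        ("gender", "Gender"),
        ("age_at_enrollment", "Age at Enrollment"),
        ("bmi", "BMI"),
        ("status", "Status"),
    ],
)

print(json.dumps(table_cs, indent=2))
```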

Now that we have a “language” to use, we build out the descriptive metadata that tells users about the actual contents of this dataset.

The Subject ID Variable Definition indicates that this variable permits only “string” data and that its code maps to the code subjectid from the Data Table CodeSystem.
The Gender Variable Definition describes the variable as permitting only CodeableConcept data, which must be one of the values defined in this ValueSet.
The Age at Enrollment Variable Definition specifies that the variable’s type is integer and that its value should fall between 10957 and 14610 days (roughly 30 to 40 years).
The BMI Variable Definition describes this variable as a “Quantity” whose units are “kg/m2”.
The Status Variable Definition defines a variable whose value must be one of a locally defined ValueSet.
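The variable definitions above might be sketched as follows, assuming they are expressed with the FHIR R4 ObservationDefinition elements permittedDataType, quantitativeDetails, qualifiedInterval, and validCodedValueSet. The helper name, the CodeSystem URL, and the ValueSet reference are hypothetical, and the IG's actual profile may constrain these elements differently.

```python
# Hypothetical URL for the Data Table CodeSystem; not taken from the IG.
TABLE_CS_URL = "https://example.org/fhir/CodeSystem/example-demographics"


def variable_definition(code, data_type, *, unit=None, low=None, high=None, value_set=None):
    """Build a minimal ObservationDefinition-shaped dict for one variable."""
    resource = {
        "resourceType": "ObservationDefinition",
        "code": {"coding": [{"system": TABLE_CS_URL, "code": code}]},
        "permittedDataType": [data_type],
    }
    if unit is not None:
        # Units expressed as a UCUM code.
        resource["quantitativeDetails"] = {
            "unit": {"coding": [{"system": "http://unitsofmeasure.org", "code": unit}]}
        }
    if low is not None or high is not None:
        bounds = {}
        if low is not None:
            bounds["low"] = {"value": low, "unit": "days"}
        if high is not None:
            bounds["high"] = {"value": high, "unit": "days"}
        resource["qualifiedInterval"] = [{"range": bounds}]
    if value_set is not None:
        resource["validCodedValueSet"] = {"reference": value_set}
    return resource


subject_id = variable_definition("subjectid", "string")
age = variable_definition("age_at_enrollment", "integer", low=10957, high=14610)
bmi = variable_definition("bmi", "Quantity", unit="kg/m2")
# "ValueSet/example-status" is a made-up reference for illustration.
status = variable_definition("status", "CodeableConcept", value_set="ValueSet/example-status")
```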

Finally, we pull all of these variables together using The Dataset Study Table, whose observationResultRequirement property contains a reference to each of those variables.
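A minimal sketch of such a table resource follows. It assumes the table is represented as an ActivityDefinition, the FHIR R4 resource that carries observationResultRequirement; the resource ids and the Dataset CodeSystem URL are invented for illustration.

```python
# Hypothetical URL for the Dataset CodeSystem; not taken from the IG.
DATASET_CS_URL = "https://example.org/fhir/CodeSystem/example-dataset"

# Hypothetical ids for the five variable definitions in the demographics table.
VARIABLE_IDS = [
    "example-subjectid",
    "example-gender",
    "example-age-at-enrollment",
    "example-bmi",
    "example-status",
]


def study_table(table_code, variable_ids):
    """Build an ActivityDefinition-shaped dict that references each variable definition."""
    return {
        "resourceType": "ActivityDefinition",
        "status": "draft",
        "code": {"coding": [{"system": DATASET_CS_URL, "code": table_code}]},
        "observationResultRequirement": [
            {"reference": f"ObservationDefinition/{variable_id}"}
            for variable_id in variable_ids
        ],
    }


demographics_table = study_table("demographics", VARIABLE_IDS)
```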

Harmonizations Used

Data harmonization is necessary for interoperability and is a cornerstone of the FHIR ingestion process. To give FHIR users information about the transformations employed during the ETL process, NCPI recommends accompanying the data with a StudyDataDictionaryHarmony resource. The following brief example of such a resource translates the study’s variable names to UMLS terms, maps the bmi measurement to the appropriate LOINC code, and maps status to a few different terms from HPO and MONDO.

Clients can translate terms using the ConceptMap’s $translate operation. For instance, the following request would return a list of mappings for the term bmi, given the fictitious FHIR URI:

https://someserver.us/fhir/R4/ConceptMap/example-study-data-dictionary-conceptmap-1/$translate?code=bmi

This would return a Parameters resource whose matches include the UMLS term, C1305855, as well as the LOINC term, LP35925-4.
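The request and the handling of its result can be sketched in Python. The sketch assumes the server replies with the FHIR R4 $translate output structure, a Parameters resource with "match" entries that each hold a "concept" Coding. SAMPLE_RESPONSE is hand-written for illustration, not captured from a real server, and the UMLS system URI and the equivalence values in it are assumptions.

```python
from urllib.parse import urlencode

BASE_URL = "https://someserver.us/fhir/R4"  # fictitious server from the text
CONCEPT_MAP_ID = "example-study-data-dictionary-conceptmap-1"


def translate_url(code, system=None):
    """Build the instance-level $translate URL for a source code."""
    params = {"code": code}
    if system is not None:
        params["system"] = system
    return f"{BASE_URL}/ConceptMap/{CONCEPT_MAP_ID}/$translate?{urlencode(params)}"


def extract_matches(parameters):
    """Collect (system, code) pairs from the 'match' entries of a Parameters dict."""
    matches = []
    for param in parameters.get("parameter", []):
        if param.get("name") != "match":
            continue
        for part in param.get("part", []):
            if part.get("name") == "concept":
                coding = part["valueCoding"]
                matches.append((coding.get("system"), coding.get("code")))
    return matches


# Illustrative response only; system URIs and equivalence values are assumptions.
SAMPLE_RESPONSE = {
    "resourceType": "Parameters",
    "parameter": [
        {"name": "result", "valueBoolean": True},
        {"name": "match", "part": [
            {"name": "equivalence", "valueCode": "equivalent"},
            {"name": "concept", "valueCoding": {
                "system": "http://www.nlm.nih.gov/research/umls", "code": "C1305855"}},
        ]},
        {"name": "match", "part": [
            {"name": "equivalence", "valueCode": "equivalent"},
            {"name": "concept", "valueCoding": {
                "system": "http://loinc.org", "code": "LP35925-4"}},
        ]},
    ],
}

print(translate_url("bmi"))
print(extract_matches(SAMPLE_RESPONSE))
```

A real client would send an HTTP GET to the URL and pass the decoded JSON body to extract_matches; that step is left out here because the server is fictitious.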

Summary Data

Summary information can be used by researchers to inform them of the suitability of the study’s data for their own needs.

Study Summary

This example of a Study Summary lists a few counts: the number of Cohorts, the number of samples and participants, and the size of the dataset itself.

Variable Summary

In this example of a variable summary we can see how many males and females participated in the study. In addition to simple demographic counts, a study’s summaries could also include counts for individual conditions, labs and measurements such as weight and bmi (Observations), the study’s racial and ethnic makeup, etc. Together, this information can give researchers a clear understanding of the data within the study so they can decide whether or not it suits their needs.
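The counts behind a categorical variable summary could be derived with a simple tally, sketched below. The rows are made-up values for illustration only, and the function name is hypothetical; the sketch says nothing about how the IG encodes the resulting counts in a FHIR resource.

```python
from collections import Counter

# Made-up rows standing in for the demographics table; not real study data.
rows = [
    {"subjectid": "S1", "gender": "female"},
    {"subjectid": "S2", "gender": "male"},
    {"subjectid": "S3", "gender": "female"},
    {"subjectid": "S4", "gender": None},  # missing values are skipped
]


def summarize(rows, variable):
    """Count how often each non-missing value of a categorical variable occurs."""
    return dict(Counter(row[variable] for row in rows if row.get(variable) is not None))


gender_counts = summarize(rows, "gender")
print(gender_counts)
```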

IG © 2021+ NCPI FHIR Working Group. Package NCPI-FHIR-Implementation-Guide#0.2.0 based on FHIR 4.0.1. Generated 2022-12-09