Any researcher who has tried to combine multiple datasets, or to validate findings in another dataset, knows how heterogeneity across datasets can make the process difficult or even impossible. At NSRR, we are working to address these challenges by retrospectively standardizing and harmonizing important sleep measures and non-sleep covariates. Standardization aims to reach uniformity in metadata across datasets, be it channel labels, annotations, variable definitions, or sleep terminology. Retrospective harmonization, on the other hand, aims to improve the comparability of the same latent construct when it is represented differently in different datasets. This usually requires subject matter knowledge to assess the heterogeneity and to make the decisions needed to generate inferentially equivalent (harmonized) content across datasets. As discussed in Dr. Mazzotti’s blog post, the Sleep Research Society (SRS) has recognized the importance of supporting harmonization of sleep and circadian data. The NSRR has developed various strategies for metadata standardization and data harmonization to address the unique challenges in each domain (e.g., non-sleep covariates, sleep questionnaires, polysomnography summary and signal data, and actigraphy data). Here, we showcase one of the approaches we developed.
As part of the data and metadata harmonization effort at NSRR, we have standardized selected survey instruments and developed a tagging system that can be used to search and browse either aggregate scores or specific questions from a given instrument, within and across datasets. Our initial priorities were selected standardized sleep questionnaires.
Here we highlight the harmonization of information collected using the Epworth Sleepiness Scale (ESS). Previously, users needed to try different keywords to find ESS questions, such as “epworth”, “ess”, or “dozing”, because no single keyword was used consistently in variable labels across datasets. Now that the folder structure has been standardized for all datasets, ESS variables are always located in “Sleep Questionnaires/Hypersomnia”. Alternatively, users can use standardized keywords (i.e., tags) to search for aggregate scores or individual questions.
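To illustrate why tags make searching more reliable than free-text keywords, here is a minimal Python sketch. It is not how the NSRR website is implemented; the dataset names, variable names, labels, and tag names below are all invented placeholders, chosen only to show that a keyword can miss variables that a shared tag would catch.

```python
# Hypothetical variable metadata. Every dataset name, variable name,
# label, and tag here is an illustrative placeholder, not real NSRR content.
VARIABLES = [
    {"dataset": "dataset_a", "name": "ess_total",
     "label": "Epworth total score", "tags": {"ess", "ess_total"}},
    {"dataset": "dataset_a", "name": "ess01",
     "label": "Chance of dozing: sitting and reading", "tags": {"ess", "ess_item"}},
    {"dataset": "dataset_b", "name": "sleepy_sum",
     "label": "Daytime sleepiness summary", "tags": {"ess", "ess_total"}},
    {"dataset": "dataset_b", "name": "q12a",
     "label": "Falls asleep watching TV", "tags": {"ess", "ess_item"}},
]


def search_by_keyword(variables, keyword):
    """Return variables whose name or label contains the keyword (case-insensitive)."""
    kw = keyword.lower()
    return [v for v in variables
            if kw in v["name"].lower() or kw in v["label"].lower()]


def search_by_tag(variables, tag, dataset=None):
    """Return variables carrying the tag, optionally restricted to one dataset."""
    return [v for v in variables
            if tag in v["tags"] and (dataset is None or v["dataset"] == dataset)]


# Keyword search depends on how each dataset happened to label its variables.
epworth_hits = search_by_keyword(VARIABLES, "epworth")
dozing_hits = search_by_keyword(VARIABLES, "dozing")

# Tag search works the same way within one dataset or across all of them.
all_ess = search_by_tag(VARIABLES, "ess")
ess_totals = search_by_tag(VARIABLES, "ess_total")
ess_in_a = search_by_tag(VARIABLES, "ess", dataset="dataset_a")
```

In this toy example, “epworth” and “dozing” each match only a single variable, while the placeholder tag retrieves every ESS variable regardless of how the original study named or labeled it.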
To search for ESS variables within a dataset
To search for ESS variables across datasets
Users can use the following tags to explore other standardized survey instruments:
In addition to validated sleep survey instruments, we’ve also started standardizing and harmonizing a wide range of demographic, lifestyle, clinical, and polysomnography data (to be introduced in future blog posts).