Data harmonisation across databases

“Standardisation and harmonisation describe a corpus of practices intended to allow interoperability of data and sample collections along a continuum from absolutely uniform collection to unfettered local variation in collection. Standardisation includes practices (standards) for prospectively implementing uniform processes for collection, storage and transformation of samples and data. Harmonisation includes practices which enable the pooling of data from multiple cohorts/biobanks at a level of precision that is scientifically adequate, yet accommodates the existing heterogeneity of those collections. Harmonisation also includes practices whereby prospective agreement is made to collect data in such a way as to directly enable pooled analysis” (BioSHaRE Consensus position on the distinction between standardisation and harmonisation, 2012).

BioSHaRE has developed a variety of tools and methods for retrospective and prospective harmonisation, so that the scientific community can make full use of the database contents.

| Tool | Description | Keywords |
| --- | --- | --- |
| BiobankConnect | Ontologies for variables classification index | Biobanks, data mapping, data harmonisation, data integration, data search |
| DataSchema | Template for the retrospective harmonisation process by defining the common format measures to be derived using study data | Data harmonisation, variable template, common format |
| EnviroSHaPER | Noise modelling tool | Noise exposure, geographic information systems (GIS), CNOSSOS-EU, LAeq, road |
| Opal | Management of study data enabling data harmonisation and data integration across biobanks/cohort studies | Data storage, data management, data harmonisation, DataSHIELD |
| SORTA | System for Ontology-based Re-coding and Technical Annotation of biomedical phenotype | Data harmonisation, data annotation, data recoding, ontology |
| Vortext/Spá | System for literature based discovery | Text mining, PDFs, literature based discovery, machine learning |
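
To make the idea of retrospective harmonisation more concrete, the sketch below shows one way a common-format template of the kind described for DataSchema could be applied to pool records from two studies. It is only an illustration under stated assumptions: the study names, source variable names, codings and the helper functions `harmonise` and `pool` are all hypothetical, and it does not use or reproduce any BioSHaRE tool.

```python
from typing import Callable, Dict, List

# Hypothetical common-format ("target") variables. These are invented for
# illustration and are not taken from any real DataSchema.
TARGET_VARIABLES = ["height_cm", "current_smoker"]

Rule = Callable[[dict], object]

# Per-study processing rules: how each target variable is derived from the
# study's own variables. Study names, variable names and codings are assumptions.
RULES: Dict[str, Dict[str, Rule]] = {
    "study_a": {
        # study_a records height in metres and smoking as a text label
        "height_cm": lambda r: round(r["height_m"] * 100, 1),
        "current_smoker": lambda r: r["smoking_status"] == "current",
    },
    "study_b": {
        # study_b already records height in centimetres; smoke: 1 = smokes now
        "height_cm": lambda r: r["hgt_cm"],
        "current_smoker": lambda r: r["smoke"] == 1,
    },
}


def harmonise(study: str, record: dict) -> dict:
    """Derive every target variable for one record of the given study.

    A target variable is set to None when the study has no rule for it or
    when the source variable is missing from the record.
    """
    out = {"study": study}
    for target in TARGET_VARIABLES:
        rule = RULES[study].get(target)
        try:
            out[target] = rule(record) if rule else None
        except KeyError:
            out[target] = None
    return out


def pool(datasets: Dict[str, List[dict]]) -> List[dict]:
    """Harmonise all records of all studies into one list in the common format."""
    return [
        harmonise(study, record)
        for study, records in datasets.items()
        for record in records
    ]


if __name__ == "__main__":
    pooled = pool({
        "study_a": [{"height_m": 1.5, "smoking_status": "current"}],
        "study_b": [{"hgt_cm": 180, "smoke": 0}],
    })
    for row in pooled:
        print(row)
```

Keeping the target variables and the per-study rules in separate structures mirrors the distinction drawn in the quotation above: the common format is fixed once, while each study keeps its own heterogeneous variables and only contributes a derivation rule.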