New international standards to aid data sharing
30 January 2012
Led by researchers at the University of Oxford (UK) and the Harvard Stem Cell Institute (HSCI) at Harvard University (USA), more than 50 collaborators at over 30 scientific organizations around the globe have agreed on a common standard for integrating biological data sets. This will make it possible to consistently describe the enormous and radically different databases that are compiled in the biosciences, in fields ranging from genetics to stem cell science to environmental studies.
This collaborative effort provides a way for scientists in widely disparate life science fields to co-ordinate their findings by allowing behind-the-scenes combination of the mountains of data produced by modern, technology-driven science. This will allow researchers to put data to work more effectively and to find relationships between different research projects.
"We are now working together to provide the means to manage enormous quantities of otherwise incompatible data, ranging from the biomedical to the environmental," says Dr. Susanna-Assunta Sansone, the BBSRC-funded Team Leader of the project, based at the University of Oxford's e-Research Centre.
A commentary, published on Friday (27 January) in the journal Nature Genetics, describes an ecosystem of standard-compliant data curation and sharing solutions and the establishment of its on-line presence, the ISA Commons. The commentary is signed by all the collaborators.
This emerging commons depends on its participants' use of the 'Investigation', 'Study', and 'Assay' (ISA) metadata tracking framework. "The ISA system is the ideal solution for managing experimental metadata from diverse groups and is now a core solution at the NERC Environmental Bioinformatics Centre. We look forward in the future to being able to exchange data with other ISA-compliant projects," says Dawn Field, director of NERC NEBC, and visiting Professor at the Oxford e-Research Centre, noting that "this is the type of data sharing that should underpin ELIXIR."
"What we like about the ISA framework is its unifying nature across different bioscience fields and institutions", notes Dr. Christoph Steinbeck of the European Molecular Biology Laboratory, The European Bioinformatics Institute (EBI), who uses the ISA framework to power MetaboLights, the BBSRC-funded public repository for metabolomics experiments at EBI, developed in collaboration with Jules Griffin of the University of Cambridge.
"An example of how this works at the Harvard Stem Cell Institute is that we can now find a relationship between experiments involving normal blood stem cells in fish and cancers in children", says Winston Hide, director of HSCI's new Center for Stem Cell Bioinformatics, and an associate Professor of Bioinformatics at the Harvard School of Public Health.
"Understanding biology requires data from multiple fields, laboratories and experiments to work together. That not only requires common standards, but tools that enable scientists to work with those standards without additional overheads. ISA is a great example of these principles in action", says Lee Harland, CTO at ConnectedDiscovery, London, UK.
"One of the things that I find most empowering about this effort is that now small research groups can begin to store laboratory data using this framework, complying with community standards, without their own dedicated bioinformatics support. It is a bit like Facebook allowing everyone to create their own website pages - suddenly you don't need to be an expert in computing to get your data out to the rest of the world", says Jules Griffin, of the University of Cambridge.
"It also has the potential to work for large centres too", says Scott Edmunds, editor of the journal published by open-access publisher BioMed Central and BGI Shenzhen (previously known as the Beijing Genomics Institute), the world's largest genomics institute. "We are working with this framework to help harmonize and present as many large-data types as possible in a common, standardized and usable form, publishing it in the associated GigaScience journal."
It was necessary to establish common data standards, say the commentary's authors, because of the tsunami of data and technologies washing over the sciences. "There are hundreds of new technologies coming along but also many ways to describe the information produced", said Sansone, noting that "we can take a jigsaw puzzle of different sciences and now fit the many pieces together to form a complete picture".
Notes to editors
Source article: Sansone, S-A. et al. Toward interoperable bioscience data. Nature Genetics 44, 2 (2012). ISA Commons.
Sansone's work is supported by the UK's Biotechnology and Biological Sciences Research Council (BBSRC) and Natural Environment Research Council (NERC) (BB/I000771/1, BB/I000917/1, BB/H024921/1, BB/I000860/1).
About the Oxford e-Research Centre
The Oxford e-Research Centre, www.oerc.ox.ac.uk, works across the University of Oxford, and at national and international level, to accelerate research through development of innovative computational and information technologies in multidisciplinary collaborations.
About the Harvard Stem Cell Institute
The Harvard Stem Cell Institute, www.hsci.harvard.edu, is a collaboration of more than 100 Harvard and Harvard-affiliated scientists dedicated to using the power of stem cell biology to advance basic understanding of human development in order to develop treatments and cures for a host of degenerative conditions and diseases.