PHA4GE’s SARS-CoV-2 contextual data specification package has been published in the journal GigaScience. The paper was authored by members of the Data Structures Working Group. Importantly, the contextual data standard is open source and available for use by public health microbial bioinformatics groups.
The Working Group pointed out that SARS-CoV-2 sequencing is taking place in several public health laboratories globally. While this is important, “public health sequence data are of limited value without accompanying contextual metadata”.
From defining what contextual data is to laying out the framework of the specification, the paper is an easy read for people of different academic backgrounds. The full package is thoroughly detailed and includes supplementary materials such as standard operating procedures, tools, a reference guide, and repository submission protocols, which make the standard easier to put into practice.
Uptake of the PHA4GE metadata specification has already spread globally. Groups implementing it include CanCOGeN (Canada), SPHERES (USA), AusTrakka (Australia and New Zealand), the Global Emerging Pathogens Treatment Consortium (Africa), the African Centre of Excellence for Genomics of Infectious Diseases (Nigeria), Baobab LIMS at the South African National Bioinformatics Institute (SANBI), and the Latin American Genomics Network.
The hope is that the “specification will improve the consistency of collected data, making information reusable by agencies as they continue working towards an increased understanding of SARS-CoV-2 epidemiological and biological characteristics, and harmonizing them such that community-based data-sharing efforts are not excessively burdened.” Building on this initiative, the Working Group anticipates that future public health pathogen genomic surveillance will be characterized by the rapid development and deployment of pathogen-specific standards.
Link to paper: https://academic.oup.com/gigascience/article/doi/10.1093/gigascience/giac003/6529104?login=false