The Data Structures Working Group (DSWG) congratulates its members Amos Raphenya (McMaster University) and Finlay Maguire (Dalhousie University), as well as James Robertson (Public Health Agency of Canada), Casper Jamin (Maastricht University), Leonardo de Oliveira Martins (Quadram Institute), Andrew McArthur (McMaster University) and John Hays (Erasmus University), on their publication earlier this month in the Nature series journal Scientific Data, entitled “Datasets for benchmarking antimicrobial resistance genes in bacterial metagenomic and whole genome sequencing”. The manuscript is based on work carried out during the AMR Hackathon organized by the DSWG, CLIMB-Big Data and the Joint Programming Initiative on Antimicrobial Resistance (JPIAMR) in October 2021. The datasets and accompanying metadata described in the manuscript are freely available for benchmarking studies of bacteria and their antimicrobial resistance genes, and will help improve the development of tools for identifying AMR genes in complex samples. This achievement is an example of the important work that can be done when community-driven hackathons bring people together to apply their creativity and expertise to shared problems.
On that note, the 8th Microbial Bioinformatics Hackathon will be held on September 11-13 in Bath, UK, in advance of the 13th International Meeting on Microbial Epidemiological Markers (IMMEM XIII). The event is organized by PHA4GE members Andrew Page, Nabil-Fareed Alikhan, and Lee Katz, as well as others, and will focus on addressing technical issues pertaining to antimicrobial resistance in the food chain, benchmarking datasets for bioinformatics tool validation, and scaling biological informatics methods. To register, please complete and submit the hackathon application form.
In the second quarter of 2022, the DSWG also participated in a number of workshops and knowledge-sharing sessions with organizations around the world that are developing bioinformatics resources and best practices. In March, the DSWG presented its SARS-CoV-2 Contextual Data Specification package at the COVID-19 Interoperability Workshop held by the Global Alliance for Genomics and Health (GA4GH). In May, GA4GH attended a WG meeting and shared its Variant Representation Specification for standardizing genetic variation data. In March, the European Society of Clinical Microbiology and Infectious Diseases (ESCMID) presented to the DSWG its work on developing a standardized reporting template for NGS microbial typing. And in April, we were happy to host ELIXIR-CONVERGE, a project funded by the European Commission to help standardize life science data management across Europe. These meetings provided an opportunity to discuss technical details, and also served to create links between our organizations so that we can better understand each other’s activities and missions. We hope to continue this community engagement in the second half of 2022.
The DSWG also embarked on a number of new projects. These include the INSDC Pathogen DOM (a set of recommendations for organizing information submitted to INSDC repositories), the SARS-CoV-2 Primer Scheme Standardization Project, and the development of standardized contextual data tags for sequences with known quality control issues that are submitted to repositories. A new Quality Control Project also aims to harmonize QC metrics and criteria across a wide range of pathogens, both to provide guidance for labs seeking advice on setting up surveillance programs and to provide resources that can help support accreditation. Stay tuned for more about these exciting projects in upcoming newsletters.
To learn more about our activities, and/or how to join, check out our website.