PHA4GE meetings aim to set standards for metadata

Over two weeks, some 30 delegates met in South Africa for four stakeholder meetings to refine metadata standards for cholera, malaria, meningitis, and neonatal sepsis.

The tangled issue of sharing data about disease-causing pathogens in ways that can inform public health responses to outbreaks took centre stage at two exhaustive, and exhausting, workshops that took place at UWC’s South African National Bioinformatics Institute (SANBI) at the end of April and beginning of May 2026.

Some 30 delegates from across Africa, as well as from the UK and the US, attended the meetings, co-hosted by the Public Health Alliance for Genomic Epidemiology (PHA4GE), an international network whose secretariat is based within SANBI. The meetings focused on developing metadata standards for four priority diseases in Africa, namely cholera, malaria, meningitis, and neonatal sepsis.

As it has in other biomedical disciplines, metadata has become a pillar of pathogen genomics, which is the sequencing of the entire genome of pathogens. Described as the data about the genomic data, metadata gives context about each sequence. 

Metadata runs the gamut of details. It could tell you where a biomedical sample – be it a swab or sputum, blood or, in the case of cholera, a stool sample – was first taken. It could list the preparation methods and protocols that technicians followed in the laboratory. Or it could record the date that a new outbreak, or individual case, was confirmed, among many other attributes.

Used in conjunction with other types of data, metadata helps ensure that raw genetic data complies with the FAIR guiding principles of scientific data – Findability, Accessibility, Interoperability, and Reusability. 

“We are increasingly integrating genomic data into public health decision-making, such as during COVID-19 when it was used to identify new strains,” explains Dr Tracey Calvert-Joshua, scientific lead within PHA4GE. “But we also know that genomic data is not actionable without having context to it, which is why metadata matters.”

The in-person meetings follow up on multiple online discussions that have taken place over the past year, Dr Calvert-Joshua adds. Based on these discussions, draft standards were developed by PHA4GE’s Dr Dominique Anderson and Dr Eddie Lulamba. These draft standards were then probed and reworked at the UWC meetings. Supporting documents, such as a data dictionary (for terms used in the standards), standard operating procedures (SOPs), and training material, were also fine-tuned.

One of the challenges delegates faced was paring down the standards to the absolutely essential information that can be sourced under real-world constraints. Consider a nurse working in a rural clinic being asked to complete lengthy forms while also caring for lines of patients.

“We have to be considerate of the users,” says Stephen Kanyerezi, a bioinformatics developer with Uganda’s Central Public Health Laboratories (CPHL). “What we are developing is not ours, and we have to think of who is going to use [it], the final users – what data they need, balance[d] with what data someone can realistically collect.”

The ‘final’ metadata standards will eventually be piloted in health facilities and laboratories.