Harmonising dietary datasets for global surveillance: methods and findings from the Global Dietary Database
Malnutrition in all its forms is a leading modifiable risk factor for mortality and morbidity globally(1–7). It is, therefore, essential that scientists, policy makers and other stakeholders are able to characterise the whole diet (food, beverage and nutrient intakes), estimate diet-related health, economic and environmental burdens, and inform and implement evidence-based priorities(2,3,8–12). Reliable, comprehensive and regularly collected dietary data in all nations are critical for such work. Given that national patterns hide significant inequalities within countries and populations, it is essential that dietary data are further disaggregated by key subgroups, such as age, sex, ethnicity, socio-economic status and location (e.g. urban or rural)(13). Ideally, dietary data should be nationally representative and collected at the individual level, using validated diet assessment tools, such as 24-h dietary recalls, that collect detailed information on every food item consumed(14). A plethora of national, subnational and community-level nutrition surveys at the individual level have been conducted worldwide(15). However, these surveys are rarely harmonised and hence are not comparable between and within countries and populations or over time, mainly because of differences in representativeness, the diet assessment tools used, and data analysis and reporting.
Over the past 10 years, dietary surveys from around the world have been collated and harmonised at the food group level (e.g. total fruit intake) as part of the Global Dietary Database (GDD). However, such efforts have not previously accounted for the often large variation in the classification and description of individual food items, which arises from differing food definitions, preparation methods, local food names and more. For example, a food may be described by a simple food name (e.g. chicken) that does not capture multiple additional levels of detail (e.g. baked or fried; light or dark meat; with or without skin; and with or without added oil or salt). Local food names can also denote different foods in different countries or even within the same country (e.g. biscuit in the USA v. the UK). Such heterogeneity can lead to inconsistent food names, definitions, groupings and food matching (the process by which a food is assigned its corresponding nutrient content using a food composition table or database), and ultimately to discrepancies in analysis and reporting. Furthermore, inaccurate or inconsistent food aggregation (or classification), in which individual food items with similar characteristics are combined into larger food groups/categories in a hierarchical manner, can compound these errors. This could include, for example, heterogeneity in classifying a food (e.g. avocado or tomato) as a fruit v. a vegetable, or in assigning it to different levels of food groups (e.g. total vegetables and starchy vegetables). Accurate food classification is especially crucial for mixed dishes and packaged foods, which are an increasing part of the global food supply and must be disaggregated into their ingredients in a standardised fashion to capture intake from all sources. Moreover, even if standardised, such dietary data are rarely publicly available, limiting their use by, and impact for, national and global nutrition communities(16).
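To make the steps described above concrete, the following minimal Python sketch illustrates how reported foods might be standardised, how a mixed dish might be disaggregated into its ingredients, and how the resulting items might be aggregated into food groups. It is an illustration only and does not represent the GDD, EFSA or FAO procedures: the lookup tables, food names, recipe fractions and function names are all hypothetical, and the assignment of tomato to vegetables is simply one possible convention.

```python
from collections import defaultdict

# Hypothetical lookup tables for illustration only (not GDD, EFSA or FAO data).

# Reported (local) food description -> standardised food item.
STANDARD_NAME = {
    "chicken, fried, with skin": "chicken",
    "tomato": "tomato",
    "avocado": "avocado",
}

# Standardised food item -> food group. Classifying tomato as a vegetable
# is an assumed convention; another scheme could classify it as a fruit.
FOOD_GROUP = {
    "chicken": "poultry",
    "tomato": "vegetables",
    "avocado": "fruits",
}

# Mixed dish -> ingredient weight fractions (assumed recipe).
RECIPES = {
    "chicken and tomato stew": {
        "chicken, fried, with skin": 0.5,
        "tomato": 0.5,
    },
}


def normalise(description):
    """Lower-case and trim a reported food description."""
    return description.strip().lower()


def disaggregate(description, grams):
    """Split a mixed dish into (ingredient, grams); pass other foods through."""
    key = normalise(description)
    recipe = RECIPES.get(key)
    if recipe is None:
        return [(key, grams)]
    return [(ingredient, grams * fraction) for ingredient, fraction in recipe.items()]


def aggregate_by_group(records):
    """Sum reported grams per food group and list descriptions with no match."""
    totals = defaultdict(float)
    unmatched = []
    for description, grams in records:
        for ingredient, ingredient_grams in disaggregate(description, grams):
            item = STANDARD_NAME.get(ingredient)
            if item is None:
                # Flag for manual review instead of guessing a match.
                unmatched.append(ingredient)
                continue
            totals[FOOD_GROUP[item]] += ingredient_grams
    return dict(totals), unmatched


if __name__ == "__main__":
    reported = [
        ("Tomato", 100.0),
        ("chicken and tomato stew", 200.0),
        ("biscuit", 30.0),  # ambiguous local name, deliberately left unmatched
    ]
    print(aggregate_by_group(reported))
```

In this sketch, an ambiguous local name such as "biscuit" is returned as unmatched rather than being forced into a food group, which reflects the point that the same name can denote different foods in different settings.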
To address these critical gaps and advance the quality and quantification of dietary intakes worldwide, the GDD, in collaboration with the European Food Safety Authority (EFSA) and the Food and Agriculture Organization of the United Nations (FAO), developed and implemented comprehensive and standardised methods to streamline the collation, harmonisation and public dissemination of primary individual-level dietary datasets worldwide.