The database covers 50,000 processed foods

A recent study used machine learning techniques to analyze more than 50,000 products from the websites of major U.S. grocery stores, developing the GroceryDB database to aid consumer decision-making and inform public health initiatives.

Quantifying the degree of food processing in grocery stores

Studies have shown the adverse health effects of relying on ultra-processed foods (UPF), which provide up to 60% of total caloric intake in developed countries. Much of UPF reaches consumers through grocery stores, motivating questions about quantifying the extent of food processing in the food supply, methods used, and alternatives to reduce UPF intake.

Measuring the degree of processing of food is not simple because food labels often contain mixed and unclear messages, leaving room for ambiguity and differences in interpretation. Therefore, scientists advocate a more objective definition of the degree of food processing based on biological mechanisms.

Moreover, because the data involved are large and complex, artificial intelligence (AI) methodologies are increasingly being used to improve food security.

About the study

Publicly available food product data were collected from the websites of three leading U.S. grocery stores: Walmart, Target, and Whole Foods. Each website's navigation was used to identify specific food products, and consistency was ensured by standardizing the classification systems used by each store.

Food labels were used to standardize nutrient concentrations, and FoodProX was used to assess the degree of processing of each food product. FoodProX is a random forest classifier that translates the combinatorial changes in nutrient amounts caused by food processing into a food processing score (FPro).

The stability of FPro was extensively tested and validated. The score depends on how likely a product's overall pattern of nutrient concentrations is to be observed in unprocessed foods compared with UPF. Variation in price per calorie across levels of food processing was estimated using robust linear models with Huber’s t norm.
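To make the idea of a likelihood-based score concrete, here is a minimal Python sketch of how two class probabilities from a classifier could be folded into a single value between 0 and 1. The formula FPro = (1 - p_unprocessed + p_ultra_processed) / 2 is an assumption based on how the FoodProX score has been described elsewhere; it is not stated in this article. The function name and the example probabilities are hypothetical.

```python
def fpro_score(p_unprocessed: float, p_ultra_processed: float) -> float:
    """Fold two class probabilities into one score in [0, 1].

    Assumed form (not given in the article):
        FPro = (1 - p_unprocessed + p_ultra_processed) / 2
    """
    for p in (p_unprocessed, p_ultra_processed):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    if p_unprocessed + p_ultra_processed > 1.0 + 1e-9:
        raise ValueError("the two probabilities cannot sum to more than 1")
    return (1.0 - p_unprocessed + p_ultra_processed) / 2.0


# Hypothetical classifier outputs: (P(unprocessed), P(ultra-processed)).
examples = {
    "raw onion": (0.90, 0.02),
    "sliced bread": (0.10, 0.60),
    "cheese puffs": (0.01, 0.97),
}

for name, (p_un, p_ultra) in examples.items():
    print(f"{name}: FPro = {fpro_score(p_un, p_ultra):.2f}")
```

Under this assumed form, a product that the classifier is certain is unprocessed scores 0, one it is certain is ultra-processed scores 1, and everything else falls in between.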

Research results

Using the FoodProX machine learning classifier, the researchers assigned an FPro score to every food product in GroceryDB. The FPro distribution was similar across all three supermarkets, and the results suggest that low-FPro (minimally processed) foods make up a relatively small portion of grocery stores’ inventories. Most items fell into the high-FPro or UPF categories. Low-FPro items nonetheless make up a proportionately larger share of actual purchases, which points to a discrepancy between sales data and the food options available.

Some differences were observed between stores. For example, Whole Foods carries fewer ultra-processed products, while Target carries a high percentage of high-FPro foods. Little variation in FPro was seen in categories such as jerky, popcorn, cookies, macaroni and cheese, french fries, and bread, highlighting the limited consumer choice in these segments. This was not the case in other categories such as cereals, pasta, milk and milk substitutes, and snack bars, where consumers had more choice. Moreover, the distribution of FPro in GroceryDB was similar to that in the latest USDA Food and Nutrient Database for Dietary Studies (FNDDS).

On a price-per-calorie basis, a 10% increase in FPro was associated with an 8.7% drop in product price per calorie across all categories in GroceryDB. The relationship between FPro and price per calorie depended on food category, although in most categories more processed foods tended to be cheaper per calorie than minimally processed alternatives. The milk and milk substitute category was an exception: there, price per calorie showed an increasing trend with FPro.
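The kind of fit behind such a figure can be sketched in a few lines of plain Python. The example below is illustrative only: the data are synthetic, the helper names are hypothetical, and regressing the logarithm of price per calorie on FPro is an assumption about the model form rather than something this article states. It uses iteratively reweighted least squares with Huber weights (tuning constant 1.345, a common default) as a stand-in for what a statistics library would provide, and compares the result with an ordinary least-squares fit when one product is an extreme price outlier.

```python
import math
from statistics import median


def weighted_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def huber_line(x, y, t=1.345, iterations=50):
    """Robust line fit: iteratively reweighted least squares, Huber weights."""
    a, b = weighted_line(x, y, [1.0] * len(x))  # start from ordinary least squares
    for _ in range(iterations):
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust scale estimate: median absolute residual, normal-consistent.
        scale = median(abs(r) for r in resid) / 0.6745
        if scale == 0.0:
            break
        cutoff = t * scale
        # Residuals beyond the cutoff get proportionally smaller weights.
        w = [1.0 if abs(r) <= cutoff else cutoff / abs(r) for r in resid]
        a, b = weighted_line(x, y, w)
    return a, b


# Synthetic products (not GroceryDB data): FPro, calories, and shelf price.
fpro = [i / 10 for i in range(1, 10)]
kcal = [120.0, 250.0, 90.0, 300.0, 150.0, 400.0, 220.0, 500.0, 180.0]
noise = [0.05 if i % 2 == 0 else -0.05 for i in range(9)]
true_log_ppc = [-5.0 - f + n for f, n in zip(fpro, noise)]
true_log_ppc[-1] += 3.0  # one very expensive item acts as an outlier
price = [k * math.exp(v) for k, v in zip(kcal, true_log_ppc)]

# Price per calorie, then its logarithm as the regression target.
log_ppc = [math.log(p / k) for p, k in zip(price, kcal)]

_, ols_b = weighted_line(fpro, log_ppc, [1.0] * len(fpro))
_, robust_b = huber_line(fpro, log_ppc)

# Percent change in price per calorie for a 0.1 step in FPro
# (reading "10%" as 0.1 on the 0-to-1 scale is an assumption).
pct_change = (math.exp(0.1 * robust_b) - 1.0) * 100.0
print(f"OLS slope: {ols_b:.2f}, Huber slope: {robust_b:.2f}")
print(f"Robust estimate: {pct_change:.1f}% per 0.1 FPro")
```

In this toy data set the single outlier is enough to flip the sign of the ordinary least-squares slope, while the Huber fit stays close to the downward trend followed by the other products. That sensitivity to a few extreme prices is the usual reason for choosing a robust estimator.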

Looking at differences between stores within the same food category, the analysis found that cereals sold at Whole Foods tend to contain fewer artificial and natural flavors, less sugar, and fewer added vitamins than those sold at Walmart and Target. The brands each store carries may also explain this heterogeneity, as Whole Foods relies on different suppliers than Target and Walmart.

Some food categories, such as pizza, popcorn, and macaroni and cheese, are highly processed in all stores. According to GroceryDB, Whole Foods offers cookies and cakes spanning a wider range of FPro scores, while Target and Walmart have identical but narrower ranges.

An ingredient-level FPro (IgFPro), ranging from 0 (unprocessed) to 1 (ultra-processed), was calculated to rank ingredients by their contribution to the degree of processing of the final product. Analysis of different foods showed that not all ingredients contribute equally to the degree of processing, and that foods with a more complex ingredient list tend to be more processed.

Conclusions

In summary, this work used machine learning techniques to model the chemical complexity of grocery items offered by three leading U.S. supermarkets. GroceryDB and FPro offer consumers a data-driven way to identify similar but less processed alternatives across categories.
