
311 points joshdickson | 1 comments

Hi HN!

Today I’m excited to launch OpenNutrition: a free, ODbL-licenced nutrition database of everyday generic, branded, and restaurant foods, a search engine that can browse the web to import new foods, and a companion app that bundles the database and search as a free macro tracking app.

Consistently logging the foods you eat has been shown to support long-term health outcomes (1)(2), but doing so easily depends on having a large, accurate, and up-to-date nutrition database. Free, public databases are often out-of-date, hard to navigate, and missing critical coverage (like branded restaurant foods). User-generated databases can be unreliable or closed-source. Commercial databases come with ongoing, often per-seat licensing costs, and usage restrictions that limit innovation.

As an amateur powerlifter and long-term weight loss maintainer, helping others pursue their health goals is something I care about deeply. After exiting my previous startup last year, I wanted to investigate the possibility of using LLMs to create the database and infrastructure required to make a great food logging app that was cost engineered for free and accessible distribution, as I believe that the availability of these tools is a public good. That led to creating the dataset I’m releasing today; nutritional data is public record, and its organization and dissemination should be, too.

What’s in the database?

- 5,287 common everyday foods, 3,836 prepared and generic restaurant foods, and 4,182 distinct menu items from ~50 popular US restaurant chains; foods have standardized naming, consistent numeric serving sizes, estimated micronutrient profiles, descriptions, and citations/groundings to USDA, AUSNUT, FRIDA, CNF, etc, when possible.

- 313,442 of the most popular US branded grocery products with standardized naming, parsed serving sizes, and additive/allergen data, grounded in branded USDA data; the most popular 1% have estimated micronutrient data, with the goal of full coverage.

Even the largest commercial databases can be frustrating to work with when searching for foods or customizations without existing coverage. To solve this, I created a real-time version of the same approach used to build the core database that can browse the web to learn about new foods or food customizations if needed (e.g., a highly customized Starbucks order). There is a limited demo on the web, and in-app you can log foods with text search, via barcode scan, or by image, all of which can search the web to import foods for you if needed. Foods discovered via these searches are fed back into the database, and I plan to publish updated versions as coverage expands.

- Search & Explore: https://www.opennutrition.app/search

- Methodology/About: https://www.opennutrition.app/about

- Get the iOS App: https://apps.apple.com/us/app/opennutrition-macro-tracker/id...

- Download the dataset: https://www.opennutrition.app/download

OpenNutrition’s iOS app offers free essential logging and a limited number of agentic searches, plus expenditure tracking and ongoing diet recommendations like best-in-class paid apps. A paid tier ($49/year) unlocks additional searches and features (data backup, prioritized micronutrient coverage for logged foods), and helps fund further development and broader library coverage.

I’d love to hear your feedback, questions, and suggestions—whether it’s about the database itself, a really great/bad search result, or the app.

1. Burke et al., 2011, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3268700/

2. Patel et al., 2019, https://mhealth.jmir.org/2019/2/e12209/

octotep ◴[] No.43569843[source]
Overall, very cool and seriously much needed! How does the micronutrient estimation work? Or is that part of the secret sauce?

I was looking at this page: https://www.opennutrition.app/search/original-shells-cheese-... and saw the amino acid, vitamin, and mineral sections; there are many things listed which aren't covered by the official nutritional data. These entries also have very precise numbers but I'm not sure where and how they're derived and if I could put any serious weight in them. I'd love to hear more if you're willing to share!

replies(2): >>43569955 #>>43570163 #
joshdickson ◴[] No.43569955[source]
TL;DR: They are estimates. I gave an LLM (generally o3 mini high due to cost, some o1 preview) a large corpus of grounding data to reason over and asked it to use its general world knowledge to return estimates it was confident in. When I escalated to better LLMs like o1-pro and to manual verification, the estimates proved good enough that I thought they warranted release.

You can read about the background on how I did them in more detail in the about/methodology section: https://www.opennutrition.app/about (see "Technical Approach")

replies(1): >>43570036 #
Xiol32 ◴[] No.43570036[source]
You need to add a disclaimer for this data. People could rely on them being accurate, and you simply can't prove they are.
replies(1): >>43570142 #
joshdickson ◴[] No.43570142{3}[source]
Every page on the website where that kind of in-depth data is available carries a large disclaimer that states, among other things, "We strive to ensure accuracy and quality using authoritative sources and AI-based validation; however, we make no guarantees regarding completeness, accuracy, or timeliness. Always confirm nutritional data independently when accuracy is critical."
replies(2): >>43570187 #>>43570286 #
adamas ◴[] No.43570286{4}[source]
At that point, if you are not sure a data point is accurate, should you really display it? You have no proof apart from "The LLM said it was ok", which is kind of poor.
replies(1): >>43571919 #
1. sswatson ◴[] No.43571919{5}[source]
I disagree with the idea that data must be accompanied by a guarantee of accuracy to be used or published. That standard would rule out almost all datasets for which the underlying data is not programmatically generated.

My guess is that this dataset is probably more accurate on the whole than many datasets used by the kinds of calorie-tracking apps that outsource their collection of nutrition information to users. But an analysis would be required.

Regardless, the only workable approach is to describe the provenance of your data and explain what steps have been taken to ensure accuracy. Then anyone who wants to use the data can account for that information.
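
As a rough illustration of what "describing provenance" could look like in practice, here is a minimal sketch in which each value carries a tag saying how it was obtained, so a consumer can filter on it. The `NutrientValue` type, the `method` labels, and the numbers are all hypothetical; nothing here reflects the actual OpenNutrition schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NutrientValue:
    """One nutrient amount plus a note on where it came from (hypothetical schema)."""
    nutrient: str
    amount: float
    unit: str
    method: str                   # assumed labels: "label", "measured", "llm_estimate"
    source: Optional[str] = None  # free-text citation, if any


def trusted(values, allowed_methods=("label", "measured")):
    """Keep only the values whose provenance the caller is willing to rely on."""
    return [v for v in values if v.method in allowed_methods]


# Made-up numbers, purely for illustration.
values = [
    NutrientValue("protein", 9.0, "g", "label", "package label"),
    NutrientValue("leucine", 0.8, "g", "llm_estimate"),
]

for v in trusted(values):
    print(v.nutrient, v.amount, v.unit, v.method)
```

A user who needs label-grade numbers can filter on `method`, while someone who is fine with estimates can keep everything; either way the decision is theirs to make.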