Evaluation of LLMs in retrieving food and nutritional context for RAG systems
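AI Summary
Evaluates the ability of LLMs to retrieve data in a food and nutrition RAG system and finds that they face challenges with complex queries.
Main Contributions
- Evaluates the use of LLMs for retrieving food and nutrition data
- Analyzes the limitations of LLMs when handling complex queries
- Proposes an LLM-driven metadata filtering method
Methodology
Using a food composition database, the study evaluates the ability of LLMs to translate natural language queries into structured metadata filters, with retrieval performed through a Chroma vector database.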
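The query-to-filter step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the LLM call is replaced by a hard-coded JSON string, the field names (`food_group`, `protein_g`, `energy_kcal`) and record values are invented, and the small `matches` function only mimics the explicit-operator form of Chroma's `where` filters in memory so the example runs without a database.

```python
import json

# Hypothetical metadata schema: field name -> accepted Python type(s).
ALLOWED_FIELDS = {
    "food_group": str,
    "protein_g": (int, float),
    "energy_kcal": (int, float),
}

# Comparison operators in the style of Chroma's `where` filters.
COMPARATORS = {
    "$eq": lambda a, b: a == b,
    "$ne": lambda a, b: a != b,
    "$gt": lambda a, b: a > b,
    "$gte": lambda a, b: a >= b,
    "$lt": lambda a, b: a < b,
    "$lte": lambda a, b: a <= b,
}


def validate(where):
    """Raise ValueError if the filter uses an unknown field, operator, or type."""
    if not isinstance(where, dict) or not where:
        raise ValueError("filter must be a non-empty object")
    for key, value in where.items():
        if key in ("$and", "$or"):
            if not isinstance(value, list) or not value:
                raise ValueError(f"{key} needs a non-empty list of clauses")
            for clause in value:
                validate(clause)
        elif key in ALLOWED_FIELDS:
            if not isinstance(value, dict):
                raise ValueError(f"{key} needs an operator object")
            for op, operand in value.items():
                if op not in COMPARATORS:
                    raise ValueError(f"unknown operator: {op}")
                if not isinstance(operand, ALLOWED_FIELDS[key]):
                    raise ValueError(f"wrong operand type for {key}")
        else:
            raise ValueError(f"unknown field: {key}")


def matches(metadata, where):
    """Return True if one metadata record satisfies the filter."""
    for key, value in where.items():
        if key == "$and":
            if not all(matches(metadata, clause) for clause in value):
                return False
        elif key == "$or":
            if not any(matches(metadata, clause) for clause in value):
                return False
        else:
            if key not in metadata:
                return False
            for op, operand in value.items():
                if not COMPARATORS[op](metadata[key], operand):
                    return False
    return True


# Stand-in for an LLM response to "legumes with at least 20 g of protein".
llm_output = '{"$and": [{"food_group": {"$eq": "legumes"}}, {"protein_g": {"$gte": 20}}]}'

# Invented records for illustration; values are not from any real database.
records = [
    {"name": "lentils", "food_group": "legumes", "protein_g": 25, "energy_kcal": 350},
    {"name": "chickpeas", "food_group": "legumes", "protein_g": 19, "energy_kcal": 360},
    {"name": "chicken breast", "food_group": "poultry", "protein_g": 31, "energy_kcal": 165},
]

where = json.loads(llm_output)
validate(where)  # reject hallucinated fields or operators before querying
hits = [r["name"] for r in records if matches(r, where)]
print(hits)
```

With a real Chroma collection, the validated dictionary would typically be passed as the `where` argument of `collection.query(...)` alongside the query text, so the vector search only considers records whose metadata satisfy the filter. Validating before querying is a design choice of this sketch: it turns a malformed LLM output into an explicit error rather than an empty or misleading result set.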
Original Abstract
In this article, we evaluate four Large Language Models (LLMs) and their effectiveness at retrieving data within a specialized Retrieval-Augmented Generation (RAG) system, using a comprehensive food composition database. Our method is focused on the LLMs' ability to translate natural language queries into structured metadata filters, enabling efficient retrieval via a Chroma vector database. By achieving high accuracy in this critical retrieval step, we demonstrate that LLMs can serve as an accessible, high-performance tool, drastically reducing the manual effort and technical expertise previously required for domain experts, such as food compilers and nutritionists, to leverage complex food and nutrition data. However, despite the high performance on easy and moderately complex queries, our analysis of difficult questions reveals that reliable retrieval remains challenging when queries involve non-expressible constraints. These findings demonstrate that LLM-driven metadata filtering excels when constraints can be explicitly expressed, but struggles when queries exceed the representational scope of the metadata format.
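One way to picture the gap between expressible and non-expressible constraints is shown below. The example is an assumption chosen for illustration, not a case taken from the paper: a filter format that compares a field against a literal value can encode "at least 10 g of protein", but a query such as "more protein than fat" compares two fields of the same record and has to be handled outside the filter. Field names and values are invented.

```python
# Invented records for illustration only.
records = [
    {"name": "food A", "protein_g": 20.0, "fat_g": 5.0},
    {"name": "food B", "protein_g": 8.0, "fat_g": 30.0},
    {"name": "food C", "protein_g": 15.0, "fat_g": 15.0},
]

# Expressible: a field compared against a constant,
# equivalent to the filter {"protein_g": {"$gte": 10}}.
expressible = [r["name"] for r in records if r["protein_g"] >= 10]


def more_protein_than_fat(record):
    """Cross-field comparison; no field-versus-literal filter encodes this."""
    return record["protein_g"] > record["fat_g"]


# Not expressible as a field-versus-literal filter: applied as a post-filter.
cross_field = [r["name"] for r in records if more_protein_than_fat(r)]

print(expressible)
print(cross_field)
```

In this sketch the cross-field condition is applied in application code after retrieval, which is one possible workaround; precomputing a derived metadata field (for example a protein-to-fat ratio) at indexing time would be another.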