Chunk Utilization

Understand Galileo's Chunk Utilization Metric

The metric is intended for RAG workflows.

Definition: For each chunk retrieved in a RAG pipeline, Chunk Utilization measures the fraction of the text in that chunk that had an impact on the model's response.

Chunk Utilization ranges from 0 to 1. A value of 1 means that the entire chunk affected the response, while a lower value like 0.5 means that the chunk contained some "extraneous" text which did not affect the response.

Chunk Utilization is closely related to Chunk Attribution: Attribution measures whether or not a chunk affected the response, and Utilization measures how much of the chunk text was involved in the effect. Only chunks that were Attributed can have Utilization scores greater than zero.

Calculation: Chunk Utilization is computed by sending an additional request to your LLM, using a carefully engineered prompt that asks the model to trace information in the response back to individual chunks and sentences within those chunks.

The same prompt is used for both Chunk Attribution and Chunk Utilization, and a single LLM request is used to compute both metrics at once.

Usefulness: low Chunk Utilization scores indicate that your chunks are probably longer than they need to be. In this case, we recommend tuning your retriever to return shorter chunks, which will improve the efficiency of the system (lower cost and latency).

Deep dive: to read more about the research behind this metric, see RAG Quality Metrics.

Note: This metric is computed by prompting an LLM, and thus requires additional LLM calls to compute.

Last updated