LLMs and LLM applications can behave unpredictably. Mission-critical generative AI applications in production require meticulous observability to ensure performance, security, and a positive user experience.

Galileo Observe helps you monitor your generative AI applications in production. With Observe you can understand how users interact with your application and identify where things are going wrong. Keep tabs on your production system, receive alerts the moment something breaks, and perform deep root cause analysis through the Observe dashboard.

Core Features

Real-time Monitoring

Keep a close watch on your Large Language Model (LLM) applications in production, monitoring their performance, behavior, and health in real time.

Guardrail Metrics

Galileo has built a number of Guardrail Metrics to monitor the quality and safety of your LLM applications in production. The same set of metrics you used during Evaluation and Experimentation in pre-production can be used to keep tabs on your productionized system, as the sketch after this list illustrates:

  • Context Adherence
  • Completeness
  • Correctness
  • Instruction Adherence
  • Prompt Injections
  • PII
  • And more.
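
For illustration only, here is a minimal sketch of what enabling these metrics on a project might look like. The `ObserveClient` class, `enable_metrics` method, and metric identifiers are hypothetical placeholders, not the actual Galileo SDK surface; consult the SDK documentation for the real API.

```python
# Hypothetical sketch -- client, method, and metric identifiers are
# illustrative placeholders, not the actual Galileo SDK surface.
from my_observe_sdk import ObserveClient  # hypothetical package

client = ObserveClient(project="support-chatbot")

# Turn on the same Guardrail Metrics used during pre-production
# Evaluation so they score live production traffic as well.
client.enable_metrics([
    "context_adherence",
    "completeness",
    "prompt_injection",
    "pii",
])
```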

Custom Metrics

Every use case is different, and out-of-the-box metrics won't cover every need. Galileo allows you to customize our Guardrail Metrics or to register your own, as in the sketch below.
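
As a rough sketch, a custom metric is typically just a scoring function you register against a project. The client and `register_custom_metric` method below are hypothetical placeholders, not the actual Galileo API:

```python
# Hypothetical sketch -- the client and registration method are
# illustrative placeholders, not the actual Galileo SDK.
from my_observe_sdk import ObserveClient  # hypothetical package

client = ObserveClient(project="support-chatbot")

def within_length_budget(response: str) -> float:
    """Return 1.0 when the response stays under a 2000-character budget."""
    return 1.0 if len(response) <= 2000 else 0.0

# Register the scorer so it runs on every logged production response.
client.register_custom_metric(name="within_length_budget",
                              scorer=within_length_budget)
```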

Insights and Alerts

Always on, Galileo Observe sends you an alert when things go south. Trace errors down to the LLM call, Agent plan, or Vector Store lookup. Stay informed about potential issues, anomalies, or improvements that require your attention.
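
To make that concrete, here is a hypothetical sketch of defining an alert rule; the `create_alert` call and its parameters are illustrative placeholders, not the actual Galileo API:

```python
# Hypothetical sketch -- alert-rule names and parameters are illustrative.
from my_observe_sdk import ObserveClient  # hypothetical package

client = ObserveClient(project="support-chatbot")

# Fire when the rolling average of Context Adherence dips, which often
# signals retrieval problems worth a root cause analysis.
client.create_alert(
    name="context-adherence-drop",
    metric="context_adherence",
    condition="avg < 0.7",
    window_minutes=30,
    notify=["oncall@example.com"],
)
```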

The Workflow

1. Log your production traffic
   Integrate Observe into your production system, as in the sketch below.
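
As a sketch of step 1, assuming a LangChain application and the galileo_observe Python package; the class and parameter names reflect one integration path and may differ in your SDK version:

```python
# Sketch assuming a LangChain app and the galileo_observe package;
# names reflect one integration path and may differ by SDK version.
from galileo_observe import GalileoObserveCallback
from langchain_openai import ChatOpenAI

# The callback logs every LLM call made through this model to Observe.
observe_handler = GalileoObserveCallback(project_name="my-production-app")

llm = ChatOpenAI(model="gpt-4o-mini", callbacks=[observe_handler])
print(llm.invoke("Hello!").content)
```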
2. Set up your metrics and alerts
   Define what you want to measure and set your expectations (see the metric and alert sketches above). Get alerted when anything goes wrong.

3. Debug, re-test
   Debug and perform root cause analysis. Form hypotheses and test them using Evaluate, or use Protect to block these scenarios from occurring again. A sketch of re-testing follows.
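
For illustration, a hypothetical sketch of step 3: re-running suspect prompts through Evaluate with the same Guardrail Metrics monitored in production. The `EvaluateClient` and `run_experiment` names are placeholders, not the actual Galileo API:

```python
# Hypothetical sketch -- the Evaluate client and method names are
# illustrative placeholders, not the actual Galileo SDK.
from my_evaluate_sdk import EvaluateClient  # hypothetical package

evaluate = EvaluateClient(project="support-chatbot")

# Prompts recovered from failing production traces.
suspect_prompts = [
    "What is your refund policy for digital goods?",
    "Cancel my subscription and delete my account.",
]

# Score the re-run with the same metrics monitored in production to
# confirm (or reject) the root cause hypothesis.
results = evaluate.run_experiment(
    name="rca-refund-policy",
    prompts=suspect_prompts,
    metrics=["context_adherence", "completeness"],
)
print(results.summary())
```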

Getting Started