Creating a Prompt Run
Video Walkthrough of how to get started with Galileo Prompt
Create a Prompt Run to evaluate the model's response to a template and a number of inputs. You can create runs through Galileo's Python library, promptquality, or through the Prompt Inspector UI. Below is a walkthrough of how to create a run through the UI:
Choosing your Template, Model, and Tune Settings
First, choose a Model and Tune Settings. The Prompt Inspector UI lets you query popular LLM APIs. For custom or self-hosted LLMs, use the Python client instead.
Then, select a template. You can create a new template or load an existing one. All of your templates and their versions are tracked and can be updated from here. To add a variable slot to your template, wrap its name in curly brackets (e.g. {topic}). You can upload a CSV file with a list of values for your slots, or manually enter values through the DATA section.
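To make the slot mechanism concrete, here is a minimal sketch of how a template with curly-bracket slots could be filled from CSV rows. It uses only the Python standard library and is an illustration of the idea, not Galileo's implementation; the helper name render_prompts and the assumption that CSV column headers match slot names are hypothetical.

```python
import csv
import io


def render_prompts(template: str, csv_text: str) -> list[str]:
    """Fill each {slot} in the template from the CSV column of the same name.

    Hypothetical helper for illustration; returns one prompt per CSV row.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [template.format(**row) for row in reader]


# Assumed CSV layout: a header row naming the slot, then one value per line.
csv_text = "topic\nQuantum Physics\nPolitics\n"
prompts = render_prompts("Explain {topic} to me like I'm a 5 year old", csv_text)
for prompt in prompts:
    print(prompt)
```

Each CSV row produces one prompt, so a file with N rows would yield N model queries in a single run.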

Galileo has built a menu of Guardrail Metrics for you to choose from. These metrics are tailored to your use case and are designed to help you evaluate your prompts and models. They combine industry-standard metrics (e.g. BLEU, ROUGE-1, Perplexity) with metrics developed by Galileo's in-house ML Research Team (e.g. Uncertainty, Factuality, Groundedness).

Below is the list of metrics we support:
- Uncertainty
- Groundedness
- Factuality
- Context Relevance
- QA Relevance
- Tone
- PII
- BLEU
- ROUGE-1
- More coming very soon.
The same set of Guardrail Metrics can be used to monitor your LLM App once it's in production. See LLM Monitor for more details.
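For readers unfamiliar with the industry-standard metrics in the list above, the following is a rough sketch of ROUGE-1, which measures unigram overlap between a model response and a reference text. It assumes simple lowercase whitespace tokenization; Galileo's own computation may tokenize, stem, or aggregate differently, and the function name rouge1 is hypothetical.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict[str, float]:
    """Unigram-overlap precision, recall, and F1 (illustrative sketch only)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both texts.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1("the cat sat", "the cat sat on the mat")
print(scores)
```

In this example every candidate word appears in the reference, so precision is 1.0, while only three of the six reference words are covered, so recall is 0.5.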
Alternatively, you can use promptquality to create runs from your Python notebook. After running pip install promptquality, you can create prompt runs by doing the following:

import promptquality as pq

# Replace the placeholder with the URL of your Galileo console.
pq.login("YOUR_GALILEO_URL")

# Slots in curly brackets are filled from the dataset column of the same name.
template = "Explain {topic} to me like I'm a 5 year old"
data = {"topic": ["Quantum Physics", "Politics", "Large Language Models"]}

pq.run(
    project_name="my_first_project",
    template=template,
    dataset=data,
    settings=pq.Settings(
        model_alias="ChatGPT (16K context)",
        temperature=0.8,
        max_tokens=400,
    ),
)
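Before submitting a run, it can help to confirm locally that every slot in the template has a matching column in the dataset. The sketch below is one possible pre-flight check using only the standard library; check_dataset is a hypothetical helper and not part of promptquality, and it assumes the dataset is a dictionary of equal-length lists as in the example above.

```python
from string import Formatter


def check_dataset(template: str, dataset: dict[str, list[str]]) -> int:
    """Return the number of rows, raising ValueError on a slot/column mismatch.

    Hypothetical helper for illustration; not part of promptquality.
    """
    # Formatter().parse yields (literal, field_name, format_spec, conversion).
    slots = {name for _, name, _, _ in Formatter().parse(template) if name}
    missing = slots - set(dataset)
    if missing:
        raise ValueError(f"no values for slots: {sorted(missing)}")
    lengths = {len(dataset[slot]) for slot in slots}
    if len(lengths) > 1:
        raise ValueError("slot columns have different lengths")
    return lengths.pop() if lengths else 0


template = "Explain {topic} to me like I'm a 5 year old"
data = {"topic": ["Quantum Physics", "Politics", "Large Language Models"]}
row_count = check_dataset(template, data)
print(f"{row_count} prompts would be generated")
```

Running a check like this first avoids spending model calls on a run that would fail or silently skip a slot.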