
Creating Prompt Runs

Prompt Engineering from the Galileo UI

Creating a Prompt Run

A Prompt Run evaluates a model's responses to a template across a set of inputs. You can create runs through Galileo's Python library, promptquality, or through the Galileo Evaluate UI.
Below is a walkthrough of how to create a run through the UI:

Choosing a Template, Model, and Tune Settings

First, choose a Model and Tune Settings. The Galileo Evaluate UI lets you query popular LLM APIs. For custom or self-hosted LLMs, you need to use the Python client.
Next, select a template. You can create a new template or load an existing one. All of your templates and their versions are tracked and can be updated from here. To add a variable slot to your template, wrap its name in curly brackets (e.g. "{topic}"). You can upload a CSV file with a list of values for your slots, or enter values manually in the DATA section. To reference a column of your CSV as a variable, use the column header as the variable name. For example, if you have a column named evidence_text, use the variable {evidence_text} to reference that column.
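To make the slot-to-column mapping concrete, here is a minimal sketch that fills a template once per CSV row using only the Python standard library. It illustrates the substitution idea, not how Galileo does it internally; the CSV contents, the evidence sentences, and the render_prompts helper are hypothetical.

```python
import csv
import io

# Hypothetical CSV contents; in practice this would be the file you upload.
CSV_TEXT = """topic,evidence_text
Quantum Physics,Particles can behave like waves.
Politics,Elections decide who governs.
"""

# Each {slot} name matches a column header in the CSV.
TEMPLATE = "Using this evidence: {evidence_text}\nExplain {topic} to me like I'm a 5 year old"


def render_prompts(template: str, csv_text: str) -> list[str]:
    """Fill the template once per CSV row, matching slots to column headers."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [template.format_map(row) for row in reader]


prompts = render_prompts(TEMPLATE, CSV_TEXT)
for prompt in prompts:
    print(prompt)
    print("---")
```

Each row of the CSV yields one filled-in prompt, which is why the column headers and slot names have to match exactly.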

Choosing Guardrail Metrics

Galileo provides a menu of Guardrail Metrics to choose from, so you can pick the ones that fit your use case when evaluating your prompts and models. The menu combines industry-standard metrics (e.g. BLEU, ROUGE-1, Perplexity) with metrics developed by Galileo's in-house ML Research Team (e.g. Uncertainty, Factuality, Groundedness).
See the List of Metrics for everything we support. The same set of Guardrail Metrics can be used to monitor your LLM App once it's in production. See LLM Monitor for more details.
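To give a feel for what one of the industry-standard metrics named above measures, here is a rough sketch of ROUGE-1 as unigram overlap between a model response and a reference text. It assumes simple lowercase whitespace tokenization and is not necessarily how Galileo computes the metric; the rouge1 function name is hypothetical.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict[str, float]:
    """Unigram-overlap precision, recall and F1 with whitespace tokenization."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per word (clipped matches).
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1("the cat sat on the mat", "the cat is on the mat")
print(scores)
```

Higher values mean the response shares more words with the reference; the score says nothing about word order or meaning.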
Video Walkthrough of how to get started with Galileo Evaluate

Prompt Run Creation via Python

You can use promptquality to create runs from a Python notebook. After running pip install promptquality, you can create prompt runs as follows:

    import promptquality as pq

    # Replace the placeholder with the URL of your Galileo console.
    pq.login("YOUR_GALILEO_URL")

    # Slots in curly brackets are filled from the dataset keys of the same name.
    template = "Explain {topic} to me like I'm a 5 year old"
    data = {"topic": ["Quantum Physics", "Politics", "Large Language Models"]}

    pq.run(
        project_name="my_first_project",
        template=template,
        dataset=data,
        settings=pq.Settings(
            model_alias="ChatGPT (16K context)",
            temperature=0.8,
            max_tokens=400,
        ),
    )
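Before starting a run, it can help to confirm that every slot in the template has a matching key in the dataset. The sketch below does this with the standard library only and does not call promptquality. The helper names are hypothetical, and the rule that all columns must have the same length is an assumption, not documented Galileo behavior.

```python
from string import Formatter


def template_slots(template: str) -> set[str]:
    """Return the names of all {slot} placeholders in the template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def check_dataset(template: str, dataset: dict[str, list[str]]) -> int:
    """Raise ValueError on a mismatch; otherwise return the number of rows."""
    missing = template_slots(template) - set(dataset)
    if missing:
        raise ValueError(f"dataset has no values for slots: {sorted(missing)}")
    # Assumption: every column supplies one value per row.
    lengths = {len(values) for values in dataset.values()}
    if len(lengths) > 1:
        raise ValueError("all dataset columns must have the same number of values")
    return lengths.pop() if lengths else 0


template = "Explain {topic} to me like I'm a 5 year old"
data = {"topic": ["Quantum Physics", "Politics", "Large Language Models"]}
n_rows = check_dataset(template, data)
print(f"{n_rows} prompts will be generated")
```

A check like this catches a misspelled slot or column name locally, before any model calls are made.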

Looking to build more complex systems?

If you're building more complex systems, e.g. an application that leverages RAG, Agents or other multi-step workflows, check out how to use Galileo with RAG or Galileo with Agents.