Quickstart
Galileo Fine-Tune is designed to automatically surface insights and errors in your data that drag down your fine-tuned LLM's performance.
There are two ways to use Fine-Tune:
- You are training your fine-tuned LLM: hook Galileo into your model training loop.
- You don't have a model yet, and simply have training and evaluation data.
Use Galileo with your Fine-tuned LLM
If you already have a model, we recommend hooking Galileo into it during training. This will allow Galileo to tailor its insights to your own model.
To integrate with your model training, use our dataquality library. We have built easy-to-use watch functions for the most popular model frameworks. To learn how watch works, have a look at our documentation or follow the notebook below.
Once you train a model with Galileo (either manually or with dq.auto), your data will appear in Galileo's Fine-Tune Console.
Use Galileo without a Fine-tuned LLM (No Model, Just Data)
If you need insights on your data, you can use Galileo Auto.
Auto takes your dataset as a parameter; all you need to do is run the following:
from dataquality.integrations.seq2seq.auto import auto
from dataquality.integrations.seq2seq.schema import Seq2SeqDatasetConfig
dataset_config = Seq2SeqDatasetConfig(train_path="train.jsonl", eval_path="eval.jsonl")
auto(
    project_name="s2s_auto",
    run_name="completion_dataset",
    dataset_config=dataset_config,
)
To surface data insights and errors, Galileo runs a lightweight model behind the scenes. To display the data in the console as fast as possible and avoid fine-tuning entirely, create training_config = Seq2SeqTrainingConfig(epochs=0) and pass it to auto.
See dq.auto configuration for more details.
Data Upload via the UI
Uploading the data directly into the Galileo UI will be coming soon.
Get started with a notebook
- PyTorch/HuggingFace Notebook (FlanT5 encoder-decoder model)