After you complete a run, Galileo surfaces a summary of issues it has found in your dataset in the Alerts section. Each Alert represents a problematic pocket of data that Galileo has identified.

Clicking on an alert filters the dataset to this problematic subset of data so that you can inspect and fix it.

Alerts also explain why this subset of your data might be causing issues and tell you how you can fix it. You can think of Alerts as a partner Data Scientist working with you to find and fix problems in your data.

Alerts that we support today

We support a growing list of alerts, and are open to feature requests! Some of the highlights include:

Hard for the model: Exposes the samples we believe are hard for your model to learn. These are the samples with high Data Error Potential scores.
Hard for the model cluster: Exposes clusters of data that have a high Data Error Potential.
High Uncertainty Outputs: Surfaces samples that have High Uncertainty on the generated output (only available if generations were created for this split).
High Perplexity Samples: Identifies samples whose predictions have high Perplexity.
Empty Samples: Identifies samples that have empty Input, empty Target or empty Generations.
Low Performing Cluster: Exposes clusters that have poor BLEU or ROUGE scores (only available if generations were created for this split).
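To make two of these alerts more concrete, here is a minimal Python sketch of the kind of check that the Empty Samples and High Perplexity Samples alerts describe. It is an illustration only and not Galileo's implementation: the sample field names (`input`, `target`, `generation`, `token_logprobs`), the helper names, and the threshold of 20.0 are all assumptions made for this example. Perplexity is computed with its standard definition, the exponential of the mean negative log-probability per token.

```python
import math

# Hypothetical field names, chosen for this example only.
TEXT_FIELDS = ("input", "target", "generation")


def perplexity(token_logprobs):
    """Standard perplexity: exp of the mean negative log-probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def is_empty(sample):
    """True if any text field present on the sample is missing content or blank."""
    return any(
        not (sample[field] or "").strip()
        for field in TEXT_FIELDS
        if field in sample
    )


def find_alerts(samples, perplexity_threshold=20.0):
    """Return sample indices grouped by the (illustrative) alert they trigger.

    The threshold is an arbitrary assumption, not a value Galileo uses.
    """
    alerts = {"empty_samples": [], "high_perplexity": []}
    for index, sample in enumerate(samples):
        if is_empty(sample):
            alerts["empty_samples"].append(index)
        logprobs = sample.get("token_logprobs")
        if logprobs and perplexity(logprobs) > perplexity_threshold:
            alerts["high_perplexity"].append(index)
    return alerts


samples = [
    {"input": "Translate: hello", "target": "bonjour", "token_logprobs": [-0.1, -0.2]},
    {"input": "   ", "target": "merci", "token_logprobs": [-0.5]},
    {"input": "Translate: cat", "target": "chat", "token_logprobs": [-4.0, -5.0]},
]
alerts = find_alerts(samples)
print(alerts)
```

In this toy data, the second sample has a blank input and the third has token log-probabilities low enough to exceed the assumed threshold, so each would land in a different alert subset. In the product, the equivalent step is clicking the alert to filter the dataset to that subset.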

How to request a new alert?

Have a great idea for a new alert? We’d love to hear about it! Contact us.