Results on Public Datasets

Use the Galileo Sandbox environment to explore the enterprise-grade data quality platform
Galileo helps you discover insights and errors in your training dataset within minutes, not days. You can confidently ditch Excel sheets and ad hoc Python scripts, along with the cumbersome detective work of exploratory dataset analysis.
We fine-tuned a pretrained DistilBERT model (until convergence) on four popular public datasets across two tasks (described in the table below). We then used Galileo to inspect, discover, and fix dataset errors using insights surfaced in the UI.
Go ahead and test drive the Galileo Sandbox environment with these datasets. Feel free to follow along with our provided insights (see the table below).
Dataset / Task
Movie Reviews / Sentiment Classification
SST2 (Stanford Sentiment Treebank) dataset for sentiment classification of IMDb movie reviews into four sentiment classes: positive, very positive, negative, or very negative. The training set has 25K samples and the test set has 25K samples, with no movie overlap between these partitions. Each sample is assigned to one of the 4 classes based on the review rating.
Conversational AI / Intent Classification
Multi-class intent classification dataset for conversational AI. The training set has ~13K samples and the test set has ~700 samples. Each query belongs to one of the following 7 classes: AddToPlaylist, PlayMusic, SearchScreeningEvent, BookRestaurant, RateBook, GetWeather, SearchCreativeWork.
Product Reviews / Sentiment Classification
Dataset for binary sentiment classification of Amazon reviews into a positive or negative sentiment class. The training set has 30K samples and the test set has 3K samples.
Banking Call Center / Intent Classification
Dataset for multi-class intent classification of call-center banking queries. The training set has ~10K samples and the test set has ~3K samples. Each query belongs to one of 77 classes.