YOLOv5 and YOLOv8
Ultralytics YOLO is a powerful suite for computer vision. With the ultralytics Python package, you can train and predict directly from the CLI. For Object Detection, Galileo hooks into ultralytics through its CLI, providing data insights without requiring a single additional line of code.
The Ultralytics repo supports both Object Detection and Semantic Segmentation tasks.


Install both frameworks in your environment using the usual command
pip install dataquality ultralytics
Note that you can run this command in a notebook (local environment, Colab, etc.) by prefixing it with an exclamation point
!pip install dataquality ultralytics
Installing dataquality and ultralytics in Colab

How does it work?

Our integration with ultralytics provides a wrapper around their CLI, allowing you to train a model, predict on your entire dataset, and log everything to Galileo with a single line of code!
Interact with the ultralytics CLI in the usual way, as you would without Galileo; the only change is replacing yolo with dqyolo. For example, train your model for 3 epochs using the command
dqyolo detect train model=yolov8x data=./data.yaml epochs=3
Below we show more examples of how to train, fine-tune, and predict with Ultralytics; refer to their documentation for more details.

YAML configuration

The Ultralytics CLI requires a yaml file which indicates the list of classes (labels) and the location of the data and annotations.
In order for Galileo to successfully display the images in the console, they have to be stored remotely in the cloud (S3 and GS are supported). In addition, they need to follow the folder structure required by YOLO, and your environment needs access to them. Specify their location in the yaml file as follows:
# lines to add in the YAML configuration for the Galileo integration
bucket: s3://coco-images # or gs://coco-images
bucket_train: images/training_folder
bucket_val: images/val_folder # Optional
bucket_test: images/test_folder # Optional
  • bucket is the URI of the bucket containing the images (S3 and GS supported)
  • bucket_split is the relative path in the bucket to the folder containing the images (where split can be train / val / test)
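Putting it together, a complete data.yaml might look like the sketch below. The class names, local paths, and bucket URI are illustrative placeholders; only the bucket* keys are specific to the Galileo integration, the rest are standard YOLO fields.

```yaml
# Hypothetical data.yaml — standard YOLO fields plus the Galileo keys.
# All names, paths, and URIs below are placeholders.
path: /content/dataset            # local dataset root
train: images/training_folder     # training images, relative to path
val: images/val_folder            # validation images, relative to path

names:
  0: person
  1: bicycle

# Galileo integration keys
bucket: s3://coco-images          # or gs://coco-images
bucket_train: images/training_folder
bucket_val: images/val_folder     # optional
```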

Galileo login

While executing dqyolo, you will be prompted for the console URL, your API key, and your project and run names. To avoid the prompts, you can set the following environment variables:
  • GALILEO_CONSOLE_URL - the URL of your cluster
  • GALILEO_USERNAME - your username
  • GALILEO_PASSWORD - your password
  • GALILEO_PROJECT_NAME - the name of the project
  • GALILEO_RUN_NAME - the name of the run
There are various ways to set these; in a notebook you can use the %env magic command
%env GALILEO_CONSOLE_URL = your_cluster_url
%env GALILEO_USERNAME = your_username
%env GALILEO_PASSWORD = your_password
%env GALILEO_PROJECT_NAME = awesome_project
%env GALILEO_RUN_NAME = bestest_run
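Outside a notebook, the same variables can be exported in your shell before invoking dqyolo (the values below are placeholders to replace with your own):

```shell
# Placeholder values — substitute your own cluster URL and credentials
export GALILEO_CONSOLE_URL="https://console.your-cluster.com"
export GALILEO_USERNAME="your_username"
export GALILEO_PASSWORD="your_password"
export GALILEO_PROJECT_NAME="awesome_project"
export GALILEO_RUN_NAME="bestest_run"
```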


After setting up your data in the cloud and downloading it locally to your environment, you can use one of the commands below to get started.

Fine-tune a model for a few epochs

dqyolo detect train model=yolov8x data=/content/dataset/data.yaml epochs=3

Predict without training

dqyolo detect predict model=yolov8x data=/content/dataset/data.yaml

Predict on a fine-tuned model

The above detect train command automatically stores the best model (best.pt under the run's weights folder), which can later be reused as follows:
dqyolo detect predict model=/content/runs/detect/train/weights/best.pt data=/content/dataset/data.yaml

Example notebooks
