Ultralytics YOLOv5 and YOLOv8
Ultralytics YOLO is a powerful suite for computer vision. With the cutting-edge Python module `ultralytics`, you can train and predict directly from the CLI. For Object Detection, Galileo hooks into `ultralytics` through its CLI, providing data insights without a single additional line of code required.

Install both frameworks in your environment using the usual command:

pip install dataquality ultralytics

Note that you can run this command in a notebook (local environment, Colab, etc.) by prefixing it with an exclamation point.

Installing dataquality and ultralytics in Colab
Our integration with `ultralytics` provides a wrapper around their CLI, allowing you to train a model, predict on your entire dataset, and log everything to Galileo with a single line of code!

Interact with the `ultralytics` CLI as you usually would without Galileo; the only change is replacing `yolo` with `dqyolo`. For example, train your model for 3 epochs using the command:

dqyolo detect train model=yolov8x data=./data.yaml epochs=3
Below we show more examples of how to train, fine-tune, predict, etc., using Ultralytics; refer to their documentation for more details.
The Ultralytics CLI requires a yaml file which indicates the list of classes (labels) and the location of the data and annotations.
In order for Galileo to successfully display the images in the console, they must be stored remotely in the cloud (S3 and GS supported). In addition, they must follow the same folder structure required by YOLO, and your environment needs access to them. Specify their location in the yaml file as follows:
# lines to add in the YAML configuration for the Galileo integration
bucket: s3://coco-images # or gs://coco-images
bucket_train: images/training_folder
bucket_val: images/val_folder # Optional
bucket_test: images/test_folder # Optional
where:
- `bucket` is the URI of the bucket containing the images (S3 and GS supported)
- `bucket_split` is the relative path in the bucket to the folder containing the images (where `split` can be train / val / test)
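Putting it together, a complete data.yaml might look like the sketch below. The class names, local paths, and bucket name are placeholders for your own dataset; only the `bucket*` keys are specific to the Galileo integration, the rest are standard YOLO dataset fields.

```yaml
# Standard YOLO dataset fields (placeholder paths and classes)
path: /content/dataset        # root of the locally downloaded data
train: images/training_folder
val: images/val_folder
names:
  0: person
  1: bicycle

# Galileo integration fields: where the same images live in the cloud
bucket: s3://coco-images            # or gs://coco-images
bucket_train: images/training_folder
bucket_val: images/val_folder       # Optional
bucket_test: images/test_folder     # Optional
```

Note that `bucket_train` / `bucket_val` / `bucket_test` point at the same folder layout inside the bucket as the local `train` / `val` folders, so YOLO and Galileo see matching data.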
While executing `dqyolo`, you will be prompted for the console URL, your API key, and your project and run names. To avoid being prompted, you can set the following environment variables:

- GALILEO_CONSOLE_URL - the URL of your cluster, e.g. "https://console.cloud.rungalileo.io"
- GALILEO_USERNAME - your username
- GALILEO_PASSWORD - your password
- GALILEO_PROJECT_NAME - the name of the project
- GALILEO_RUN_NAME - the name of the run
There are various ways to set these; in a notebook you can use the `%env` magic command:

%env GALILEO_CONSOLE_URL=your_cluster_url
%env GALILEO_USERNAME=your_username
%env GALILEO_PASSWORD=your_password
%env GALILEO_PROJECT_NAME=awesome_project
%env GALILEO_RUN_NAME=bestest_run
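Outside a notebook, one option is to export the same variables in your shell before running `dqyolo`. A minimal sketch, with placeholder values you would replace with your own cluster URL, credentials, and names:

```shell
# Placeholder values — substitute your own cluster URL, credentials, and names
export GALILEO_CONSOLE_URL="https://console.cloud.rungalileo.io"
export GALILEO_USERNAME="your_username"
export GALILEO_PASSWORD="your_password"
export GALILEO_PROJECT_NAME="awesome_project"
export GALILEO_RUN_NAME="bestest_run"

# Exported variables are inherited by child processes such as dqyolo
echo "$GALILEO_PROJECT_NAME/$GALILEO_RUN_NAME"
```

Exporting (rather than plain assignment) matters here: `dqyolo` runs as a child process and only sees variables that were exported into the environment.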
After setting up your data in the cloud and downloading it locally in your environment, you can use one of the commands below to get started:
dqyolo detect train model=yolov8x data=/content/dataset/data.yaml epochs=3
dqyolo detect predict model=yolov8x data=/content/dataset/data.yaml
The `detect train` command automatically stores the best model, which can later be reused as follows:

dqyolo detect predict model=/content/runs/detect/train/weights/best.pt data=/content/dataset/data.yaml