YOLO v5 and v8
Ultralytics YOLO is a powerful suite for computer vision. With the ultralytics Python module, you can train and predict via the CLI. For Object Detection, Galileo hooks into ultralytics through its CLI, providing data insights without a single additional line of code.
Install both frameworks in your environment using the usual command
pip install dataquality ultralytics
Note that you can run this command in a notebook (local environment, Colab, etc.) by prefixing it with an exclamation point (!).
Installing dataquality and ultralytics in Colab
Our integration with ultralytics provides a wrapper around its CLI, allowing you to train a model, predict on your entire dataset, and log everything to Galileo using a single line of code!
Interact with the ultralytics CLI as you would without Galileo; the only change is replacing yolo with dqyolo. For example, train your model for 3 epochs using the command:
dqyolo detect train model=yolov8x data=./data.yaml epochs=3
The Ultralytics CLI requires a yaml file which indicates the list of classes (labels) and the location of the data and annotations.
In order for Galileo to successfully display the images in the console, they have to be stored remotely in the cloud (S3 and GS supported). In addition, they need to be under the same folder structure required by YOLO and your environment needs to have access to them. Specifying their location is done in the yaml file as follows:
# lines to add in the YAML configuration for the Galileo integration
bucket: s3://coco-images # or gs://coco-images
bucket_val: images/val_folder # Optional
bucket_test: images/test_folder # Optional
bucket is the URI of the bucket containing the images (S3 and GS supported)
bucket_split is the relative path in the bucket to the folder containing the images (where split can be train, val, or test)
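As a quick sanity check before training, the two rules above can be encoded in a few lines of Python. This is a minimal sketch and not part of dataquality or ultralytics: the function name check_bucket_keys is hypothetical, the config is passed as a plain dict (for example, the result of loading data.yaml with a YAML parser), and treating bucket as required is an assumption based on the description above.

```python
# Hypothetical helper, not part of dataquality or ultralytics.
SUPPORTED_SCHEMES = ("s3://", "gs://")
SPLITS = ("train", "val", "test")


def check_bucket_keys(cfg):
    """Return a list of human-readable problems found in the Galileo bucket keys."""
    problems = []

    # Assumption: `bucket` must be present and point at S3 or GS.
    bucket = cfg.get("bucket")
    if not isinstance(bucket, str) or not bucket.startswith(SUPPORTED_SCHEMES):
        problems.append("bucket must be an s3:// or gs:// URI")

    # Each bucket_<split> entry, when present, is a path relative to the bucket.
    for split in SPLITS:
        key = f"bucket_{split}"
        path = cfg.get(key)
        if path is None:
            continue  # this split is not configured
        if not isinstance(path, str) or "://" in path or path.startswith("/"):
            problems.append(f"{key} must be a path relative to the bucket")

    return problems


config = {
    "bucket": "s3://coco-images",
    "bucket_val": "images/val_folder",
    "bucket_test": "images/test_folder",
}
print(check_bucket_keys(config))  # prints []
```

An empty list means the keys look consistent with the description; it does not verify that the bucket exists or that your environment can read from it.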
While executing dqyolo, you will be prompted to enter the console URL, your API key, and then your project and run names. To avoid being prompted, you can set environment variables, in particular:
GALILEO_CONSOLE_URL - URL of your cluster, e.g. "https://console.cloud.rungalileo.io"
GALILEO_USERNAME - your username
GALILEO_PASSWORD - your password
GALILEO_PROJECT_NAME - name of the project
GALILEO_RUN_NAME - name of the run
There are various ways to set these; in a notebook, you can do it via the %env magic:
%env GALILEO_CONSOLE_URL = your_cluster_url
%env GALILEO_USERNAME = your_username
%env GALILEO_PASSWORD = your_password
%env GALILEO_PROJECT_NAME = awesome_project
%env GALILEO_RUN_NAME = bestest_run
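Outside a notebook, the same variables can be set from Python before launching dqyolo, since child processes inherit os.environ. The following is a minimal sketch: the variable names come from the list above, while the helper name set_galileo_env and the values shown are illustrative placeholders, not part of dataquality.

```python
import os

# Variable names taken from the list above.
GALILEO_VARS = (
    "GALILEO_CONSOLE_URL",
    "GALILEO_USERNAME",
    "GALILEO_PASSWORD",
    "GALILEO_PROJECT_NAME",
    "GALILEO_RUN_NAME",
)


def set_galileo_env(**values):
    """Set Galileo variables in os.environ and return the names still unset."""
    for name, value in values.items():
        if name not in GALILEO_VARS:
            raise KeyError(f"unexpected variable: {name}")
        os.environ[name] = str(value)
    return [name for name in GALILEO_VARS if not os.environ.get(name)]


# Placeholder values; credentials are deliberately left out of the example.
missing = set_galileo_env(
    GALILEO_CONSOLE_URL="https://console.cloud.rungalileo.io",
    GALILEO_PROJECT_NAME="awesome_project",
    GALILEO_RUN_NAME="bestest_run",
)
print("still unset:", missing)
```

Any names reported as still unset are the ones you should expect to be prompted for, or can supply through your shell or a secret manager instead of source code.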
After setting up your data in the cloud and downloading it locally to your environment, you can use one of the commands below to get started.
dqyolo detect train model=yolov8x data=/content/dataset/data.yaml epochs=3
dqyolo detect predict model=yolov8x data=/content/dataset/data.yaml
The detect train command automatically stores the best model, which can later be reused as follows:
dqyolo detect predict model=/content/runs/detect/train/weights/best.pt data=/content/dataset/data.yaml
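When scripting a train-then-predict workflow, the commands above can be assembled programmatically. The sketch below only builds and prints the argument lists; build_dqyolo_command is a hypothetical helper, and the weights path assumes the training run wrote to /content/runs/detect/train as in the example above.

```python
import shlex


def build_dqyolo_command(task, mode, **args):
    """Build a dqyolo argument list, e.g. ['dqyolo', 'detect', 'train', 'epochs=3']."""
    return ["dqyolo", task, mode] + [f"{key}={value}" for key, value in args.items()]


data = "/content/dataset/data.yaml"

train_cmd = build_dqyolo_command(
    "detect", "train", model="yolov8x", data=data, epochs=3
)
predict_cmd = build_dqyolo_command(
    "detect", "predict",
    model="/content/runs/detect/train/weights/best.pt",  # assumed output location
    data=data,
)

print(shlex.join(train_cmd))
print(shlex.join(predict_cmd))

# To actually run the commands (requires dataquality and ultralytics installed):
#   import subprocess
#   subprocess.run(train_cmd, check=True)
#   subprocess.run(predict_cmd, check=True)
```

Ultralytics typically numbers later runs train2, train3, and so on, so check the output directory reported at the end of training before pointing predict at best.pt.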