Galileo + Delta Lake (Databricks)

This page shows how to export data directly into Delta Lake from the Galileo UI, then read the same data using Galileo's Python SDK and execute a Galileo run.

Setting Up a Databricks Connection

First, go to the Integrations Page and set up your Databricks connection.

Using Galileo to Read from Delta Lake and Execute a Run

The following code snippet writes labeled data to Delta Lake and reads it back into pandas dataframes, ready to be passed to a Galileo training run.

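import pandas as pd
from deltalake import DeltaTable, write_deltalake

# Each dataframe has 2 columns: text and label.
# The rows below are placeholders; substitute your own labeled data.
df_train = pd.DataFrame({"text": ["example train text"], "label": ["example_label"]})
df_test = pd.DataFrame({"text": ["example test text"], "label": ["example_label"]})

# Write each split to its own table path; writing both to one path fails
# because write_deltalake raises by default when the table already exists.
write_deltalake("tmp/delta_lake_path/train", df_train)
write_deltalake("tmp/delta_lake_path/test", df_test)

# Read each split back into a pandas dataframe.
df_train_from_deltalake = DeltaTable("tmp/delta_lake_path/train").to_pandas()
df_test_from_deltalake = DeltaTable("tmp/delta_lake_path/test").to_pandas()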
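
Before handing the dataframes to a run, it can help to confirm that the data read back from Delta Lake still has the expected text and label columns. The following is a minimal sketch of such a check, not part of Galileo's SDK or of deltalake; the function name check_labeled_columns and the specific rules (non-empty string text, non-missing label) are assumptions made for illustration.

```python
from typing import List, Mapping, Sequence

# The two columns named in the snippet above.
REQUIRED_COLUMNS = ("text", "label")


def check_labeled_columns(columns: Mapping[str, Sequence]) -> List[str]:
    """Hypothetical helper: return a list of problems; empty means the data looks usable.

    `columns` maps a column name to its values, e.g. the result of
    df.to_dict("list") on a pandas dataframe.
    """
    missing = [name for name in REQUIRED_COLUMNS if name not in columns]
    if missing:
        return [f"missing column: {name}" for name in missing]

    texts, labels = columns["text"], columns["label"]
    problems = []
    if len(texts) != len(labels):
        problems.append("text and label columns differ in length")
    if len(texts) == 0:
        problems.append("no rows")

    for i, (text, label) in enumerate(zip(texts, labels)):
        if not isinstance(text, str) or not text.strip():
            problems.append(f"row {i}: empty or non-string text")
        # `label != label` is true only for NaN, which pandas uses for missing values.
        if label is None or label != label:
            problems.append(f"row {i}: missing label")
    return problems
```

With the dataframes from the snippet above, this could be called as check_labeled_columns(df_train_from_deltalake.to_dict("list")); a non-empty result lists what to fix before starting a run.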

Exporting Data from Galileo UI into Delta Lake

