Running and Logging Chain Experiments

LangChain Chains

Galileo supports logging chains from LangChain. To log these chains, use the callback from our Python client, promptquality.

To log your data, first log in:

import promptquality as pq
pq.login({YOUR_GALILEO_URL})

After that, you can set up the GalileoPromptCallback:

galileo_handler = pq.GalileoPromptCallback(
    project_name=<project-name>, scorers=[<list-of-scorers>]
)

All of the Galileo-provided, custom, and registered scorers that are supported for prompt runs are also supported for chain runs; enable them when initializing the callback.

When you execute your chain (with run, invoke, or batch), include the callback instance created earlier in the callbacks list:

If using .run():

chain.run(<inputs>, callbacks=[galileo_handler])

If using .invoke():

chain.invoke(inputs, config=dict(callbacks=[galileo_handler]))

If using .batch():

chain.batch(..., config=dict(callbacks=[galileo_handler]))

Once you've finished executing over your dataset, tell Galileo the run is complete:

galileo_handler.finish()

The finish step uploads the run to Galileo and starts the execution of the scorers server-side. This step will also display the link you can use to interact with the run on the Galileo console.

A full example can be found here.

Note 1: Make sure to set the callback at execution time, not at definition time, so that the callback is invoked for all nodes of the chain.

Note 2: We recommend using .invoke instead of .batch, because LangChain reports latencies for the entire batch rather than for each individual chain execution.

Custom Chains

If you're not using an orchestration library, or you're using one other than LangChain, we also provide a similar interface for uploading chain runs that doesn't rely on a callback mechanism. To log your chain runs with Galileo, start with the same login flow:

import promptquality as pq
pq.login({YOUR_GALILEO_URL})

Then, for each node of your chain, construct a chain row:

from promptquality import NodeType, NodeRow

rows = [
    NodeRow(node_id=..., chain_root_id=..., node_type=<ChainNodeType>)
]

For example, you can log your retriever and LLM nodes with the snippet below.

from promptquality import NodeType, NodeRow
import uuid

rows = []

rows.append(
    NodeRow(
        node_id=uuid.uuid4(),  # Randomly generated UUID
        chain_root_id=CHAIN_ROOT_ID,  # UUID of the 'parent' node
        step=...,  # An integer indicating which step this node is
        node_input=...,  # Input into your retriever
        node_output=...,  # Serialized output of the retriever, e.g.
        # json.dumps([{"page_content": "doc_1", "metadata": {"key": "val"}},
        #             {"page_content": "doc_2", "metadata": {"key": "val"}}])
        node_type=NodeType.retriever,
    )
)

rows.append(
    NodeRow(
        node_id=uuid.uuid4(),  # Randomly generated UUID
        chain_root_id=CHAIN_ROOT_ID,  # UUID of the 'parent' node
        step=...,  # An integer indicating which step this node is
        node_input=...,  # Input into your LLM (i.e. user query + relevant contexts passed in as a string)
        prompt=...,  # Same value as node_input
        node_output=...,  # Output of the LLM passed in as a string
        response=...,  # Same value as node_output
        node_type=NodeType.llm,
    )
)
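The serialized node_output expected for a retriever node can be produced with json.dumps. A minimal sketch (the document contents and metadata keys are placeholders):

```python
import json

# Placeholder retrieved documents; each entry carries "page_content" and "metadata".
documents = [
    {"page_content": "doc_1", "metadata": {"key": "val"}},
    {"page_content": "doc_2", "metadata": {"key": "val"}},
]

# Serialize the documents into the string form expected for node_output.
node_output = json.dumps(documents)
```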

We recommend randomly generating node_id and chain_root_id (e.g. with uuid.uuid4()). Use the ID of a 'parent' node as the chain_root_id of its children.
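To illustrate the ID scheme, the sketch below uses plain dicts as a stand-in for NodeRow, showing only the linking fields; it assumes the root chain node uses its own node_id as its chain_root_id:

```python
import uuid

# Randomly generated ID for the root 'chain' node.
chain_id = uuid.uuid4()

# Plain dicts standing in for NodeRow: only the ID and step fields are shown.
chain_node = {"node_id": chain_id, "chain_root_id": chain_id, "step": 0}

# Children of the chain node reuse its node_id as their chain_root_id.
retriever_node = {"node_id": uuid.uuid4(), "chain_root_id": chain_id, "step": 1}
llm_node = {"node_id": uuid.uuid4(), "chain_root_id": chain_id, "step": 2}

rows = [chain_node, retriever_node, llm_node]
```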

When your execution completes, log that data to Galileo:

pq.chain_run(rows, project_name=<project-name>, scorers=[<list-of-scorers>])

Once complete, this step displays the link to access the run from your Galileo console.

Logging metadata from chains

If you are logging chains from LangChain, metadata values (such as chunk-level metadata for the retriever) will be automatically included.

For custom chains, metadata values can be logged by serializing the metadata along with page_content, as demonstrated below.

from promptquality import NodeType, NodeRow
import json
import uuid

retriever_output = [
    {"page_content": "chunk 1 content", "metadata": {"key": "value"}},
    {"page_content": "chunk 2 content", "metadata": {"key": "value"}},
]

rows = []

rows.append(
    NodeRow(
        node_id=uuid.uuid4(),  # Randomly generated UUID
        chain_root_id=...,  # UUID of the 'parent' node
        step=...,  # An integer indicating which step this node is
        node_type=NodeType.retriever,
        node_input="the query to the retriever",
        node_output=json.dumps(retriever_output),
    )
)
