Log, query, and update LLM traces programmatically. Upload bulk traces or update evaluations and annotations after the fact.
Key Capabilities
- List and filter spans for a project
- Bulk upload traces from offline processing
- Update evaluations asynchronously (LLM-as-judge patterns)
- Add human feedback and annotations
- Attach custom metadata for filtering and analysis
- Export spans for offline analysis
List Spans
spans.list is currently in ALPHA. A one-time warning is emitted on first use. For downloading large volumes of spans, use export_to_df instead.
List spans for a project within an optional time window. Spans are returned in descending start-time order (most recent first). If start_time and end_time are not provided, the last seven days are queried.
from datetime import datetime
resp = client.spans.list(
project="your-project-name-or-id",
start_time=datetime(2024, 1, 1), # optional
end_time=datetime(2024, 2, 1), # optional
limit=100,
)
for span in resp.spans:
print(span.span_id, span.name)
Filter Spans
Use the filter parameter to narrow results by status, evaluation labels, annotation labels, or latency:
# Filter by status
resp = client.spans.list(
project="your-project-name-or-id",
filter="status_code = 'ERROR'",
)
# Filter by evaluation label
resp = client.spans.list(
project="your-project-name-or-id",
filter="eval.Correctness.label = 'correct'",
)
# Filter by annotation label
resp = client.spans.list(
project="your-project-name-or-id",
filter="annotation.Quality.label = 'good'",
)
# Filter by latency
resp = client.spans.list(
project="your-project-name-or-id",
filter="latency_ms > 1000",
)
# Combine filters with AND / OR
resp = client.spans.list(
project="your-project-name-or-id",
filter="status_code = 'ERROR' AND eval.Correctness.label = 'correct'",
)
For details on pagination, field introspection, and data conversion (to dict/JSON/DataFrame), see Response Objects.
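If you only need a quick tabular view, you can also build a DataFrame by hand from the fields shown above. This is a minimal sketch using just the span_id and name attributes demonstrated earlier; the converters documented in Response Objects are the supported path for full conversion:

import pandas as pd

# Collect the two fields used in the examples above into a DataFrame.
rows = [{"span_id": span.span_id, "name": span.name} for span in resp.spans]
spans_overview = pd.DataFrame(rows)
print(spans_overview.head())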
Log Spans
Upload traces in bulk from offline processing or batch evaluation.
import pandas as pd
# Prepare spans DataFrame
spans_df = pd.DataFrame([
{
"context.span_id": "span-1",
"context.trace_id": "trace-1",
"name": "llm_call",
"span_kind": "LLM",
"start_time": "2024-01-15T10:00:00Z",
"end_time": "2024-01-15T10:00:02Z",
"attributes.llm.model_name": "gpt-4",
"attributes.llm.input_messages": [...],
"attributes.llm.output_messages": [...],
},
])
# Optional: include evaluations
evals_df = pd.DataFrame([
{
"context.span_id": "span-1",
"name": "Correctness",
"label": "correct",
"score": 1.0,
},
])
# Log spans
response = client.spans.log(
space_id="your-space-id",
project_name="my-llm-app",
dataframe=spans_df,
evals_dataframe=evals_df, # Optional
)
print(f"Logged spans successfully: {response.status_code}")
Log Spans Only
client.spans.log(
space_id="your-space-id",
project_name="my-llm-app",
dataframe=spans_df,
)
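When producing spans offline you need unique values for the context.span_id and context.trace_id columns. One common convention is OpenTelemetry-style random hex IDs; the make_hex_id helper below is a hypothetical illustration, not part of the API:

import os

import pandas as pd

def make_hex_id(n_bytes: int) -> str:
    # Hypothetical helper: random hex string. The OpenTelemetry convention
    # is 8 bytes for span IDs and 16 bytes for trace IDs.
    return os.urandom(n_bytes).hex()

trace_id = make_hex_id(16)
spans_df = pd.DataFrame([
    {
        "context.span_id": make_hex_id(8),
        "context.trace_id": trace_id,  # spans in one trace share a trace ID
        "name": "llm_call",
        "span_kind": "LLM",
        "start_time": "2024-01-15T10:00:00Z",
        "end_time": "2024-01-15T10:00:02Z",
    },
])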
Update Evaluations
Add or update evaluations for existing spans (useful for LLM-as-judge patterns).
import pandas as pd
evals_df = pd.DataFrame([
{
"context.span_id": "span-1",
"name": "Relevance",
"label": "relevant",
"score": 0.95,
"explanation": "The response directly answers the question.",
},
{
"context.span_id": "span-2",
"name": "Relevance",
"label": "not_relevant",
"score": 0.2,
},
])
response = client.spans.update_evaluations(
space_id="your-space-id",
project_name="my-llm-app",
dataframe=evals_df,
)
print("Updated evaluations successfully")
Batch Evaluation Pattern
# Run async LLM evaluations on existing traces
import asyncio

import pandas as pd

async def evaluate_traces():
    # Fetch the traces to evaluate (fetch_recent_traces is a placeholder
    # for your own retrieval logic, e.g. a spans.list call)
    traces = fetch_recent_traces()
    # Run LLM-as-judge evaluations (llm_judge is a placeholder
    # for your judge client)
    eval_results = []
    for trace in traces:
        score = await llm_judge.evaluate(trace)
        eval_results.append({
            "context.span_id": trace.span_id,
            "name": "Quality",
            "score": score,
        })
    # Upload the evaluations in one call
    evals_df = pd.DataFrame(eval_results)
    client.spans.update_evaluations(
        space_id="your-space-id",
        project_name="my-llm-app",
        dataframe=evals_df,
    )

asyncio.run(evaluate_traces())
Update Annotations
Add human feedback and annotations to spans.
import pandas as pd
annotations_df = pd.DataFrame([
{
"context.span_id": "span-1",
"annotation.Quality.label": "correct",
"annotation.Quality.score": 1.0,
"annotation.Quality.text": "Verified by human reviewer",
},
])
response = client.spans.update_annotations(
space_id="your-space-id",
project_name="my-llm-app",
dataframe=annotations_df,
)
print("Updated annotations successfully")
Update Metadata
Attach or patch custom metadata on existing spans for filtering and analysis. The update_metadata method uses JSON Merge Patch semantics and supports three input approaches.
Method 1: Direct Field Columns
Set individual metadata fields using attributes.metadata.<field> column names. This is the simplest approach.
import pandas as pd
metadata_df = pd.DataFrame([
{
"context.span_id": "span-1",
"attributes.metadata.customer_id": "cust-456",
"attributes.metadata.experiment_version": "v2",
"attributes.metadata.region": "us-west",
},
{
"context.span_id": "span-2",
"attributes.metadata.customer_id": "cust-789",
"attributes.metadata.region": "eu-central",
},
])
response = client.spans.update_metadata(
space_id="your-space-id",
project_name="my-llm-app",
dataframe=metadata_df,
)
print(f"Updated: {response['spans_updated']}, Failed: {response['spans_failed']}")
Method 2: Patch Document Column
Provide a JSON patch document per span for more control. The patch is applied after any field columns. The default column name is "patch_document".
metadata_df = pd.DataFrame([
{
"context.span_id": "span-1",
"patch_document": {"tag": "important", "priority": "high"},
},
{
"context.span_id": "span-2",
"patch_document": {"tag": "standard"},
},
])
response = client.spans.update_metadata(
space_id="your-space-id",
project_name="my-llm-app",
dataframe=metadata_df,
)
Use a custom column name with the patch_document_column_name parameter. The DataFrame must then carry its patches in a column of that name:
metadata_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "my_patch_col": {"tag": "important", "priority": "high"},
    },
])
response = client.spans.update_metadata(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=metadata_df,
    patch_document_column_name="my_patch_col",
)
Method 3: Combined Approach
Use both field columns and a patch document. The patch document is applied last and overrides any conflicting field column values.
metadata_df = pd.DataFrame([
{
"context.span_id": "span-1",
"attributes.metadata.tag": "important",
"patch_document": {"priority": "high"}, # Applied after field columns
},
])
response = client.spans.update_metadata(
space_id="your-space-id",
project_name="my-llm-app",
dataframe=metadata_df,
)
Type Handling
| Python type | Stored as |
|---|---|
| str | string |
| int / float | number |
| bool | string ("True" / "False") |
| None | JSON null (field is set to null, not removed) |
| dict / list | JSON string |
Setting a field to None stores JSON null; it does not remove the field. This differs from standard JSON Merge Patch (RFC 7386), where a null value removes the key.
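As a plain-Python illustration of these semantics (no API call; this just restates the behavior described in the table and note above):

# Existing metadata on the span:
existing = {"region": "us-west", "tag": "important", "priority": "high"}

# Patch sent via update_metadata:
patch = {"tag": "standard", "priority": None}

# Resulting metadata: keys merge one by one, and None is stored as
# JSON null rather than deleting the key (unlike RFC 7386).
result = {"region": "us-west", "tag": "standard", "priority": None}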
Response Structure
update_metadata returns a dictionary with the following keys:
| Key | Description |
|---|---|
| spans_processed | Total spans in the input DataFrame |
| spans_updated | Spans successfully updated |
| spans_failed | Spans that failed to update |
| errors | List of {"span_id": ..., "error_message": ...} for each failure |
response = client.spans.update_metadata(...)
print(f"Processed: {response['spans_processed']}")
print(f"Updated: {response['spans_updated']}")
print(f"Failed: {response['spans_failed']}")
for err in response.get("errors", []):
print(f" span {err['span_id']}: {err['error_message']}")
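Because failures are reported per span, you can retry just the failed rows. A sketch, assuming the metadata_df used in the original call is still in scope:

# Collect the span IDs that failed and resubmit only those rows.
failed_ids = {err["span_id"] for err in response.get("errors", [])}
if failed_ids:
    retry_df = metadata_df[metadata_df["context.span_id"].isin(failed_ids)]
    retry_response = client.spans.update_metadata(
        space_id="your-space-id",
        project_name="my-llm-app",
        dataframe=retry_df,
    )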
Export Spans
Export spans for offline analysis, custom processing, or archival.
from datetime import datetime
start_time = datetime(2024, 1, 1)
end_time = datetime(2026, 1, 1)
# Export to DataFrame
df = client.spans.export_to_df(
space_id="your-space-id",
project_name="my-llm-app",
start_time=start_time,
end_time=end_time,
)
print(f"Exported {len(df)} spans")
Export to Parquet
client.spans.export_to_parquet(
space_id="your-space-id",
project_name="my-llm-app",
start_time=start_time,
end_time=end_time,
path="./spans_export.parquet",
)
Export capabilities:
- Time-range filtering
- DataFrame or Parquet output
- Efficient Arrow Flight transport for large exports
- Progress bars for long-running exports
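As a quick example of offline analysis, you can compute per-span latency from the exported timestamps. This sketch assumes the exported DataFrame carries the start_time and end_time columns shown in the Log Spans schema above:

import pandas as pd

df = client.spans.export_to_df(
    space_id="your-space-id",
    project_name="my-llm-app",
    start_time=start_time,
    end_time=end_time,
)
# Assumed columns: start_time / end_time, as in the Log Spans schema.
latency = pd.to_datetime(df["end_time"]) - pd.to_datetime(df["start_time"])
print(latency.dt.total_seconds().describe())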