This section covers migrating dataset management methods from v7’s ArizeDatasetsClient to v8’s ArizeClient.datasets.
from arize.experimental.datasets import ArizeDatasetsClient

# v7 api_key parameter took developer key values
client = ArizeDatasetsClient(
    api_key="your-developer-key"  # Developer key (deprecated)
)
list_datasets()
The list_datasets() method migrates from client.list_datasets() to client.datasets.list().
Parameter Reference
| Parameter | v7 | v8 | Changes |
|---|---|---|---|
| space_id | Required | Optional | Now optional; if not provided, lists datasets across all spaces |
| limit | N/A | ✅ Optional | Maximum number of datasets to return (default 100) |
| cursor | N/A | ✅ Optional | Pagination cursor for retrieving next page |
Side-by-Side Comparison
from arize.experimental.datasets import ArizeDatasetsClient
# Client initialization
client = ArizeDatasetsClient(api_key="your-developer-key")
# List datasets
datasets_df = client.list_datasets(space_id="your-space-id")
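The v8 call can be sketched as follows. This is a minimal sketch, assuming `client` is an already-constructed v8 ArizeClient and that `client.datasets.list()` accepts `space_id`, `limit`, and `cursor` as keyword arguments, as the parameter table suggests. The wrapper name `list_datasets_v8` is hypothetical and not part of the SDK. The shape of the return value is not specified in this section, so it is passed through unchanged.

```python
from typing import Any, Dict, Optional


def list_datasets_v8(
    client: Any,
    space_id: Optional[str] = None,
    limit: int = 100,
    cursor: Optional[str] = None,
) -> Any:
    """Hypothetical wrapper around the v8 client.datasets.list() call."""
    kwargs: Dict[str, Any] = {"limit": limit}
    # space_id is optional in v8; omit it to list datasets across all spaces.
    if space_id is not None:
        kwargs["space_id"] = space_id
    # cursor is only sent when requesting a later page.
    if cursor is not None:
        kwargs["cursor"] = cursor
    return client.datasets.list(**kwargs)
```

With a real client this would be called as `list_datasets_v8(client, space_id="your-space-id")`, or with no `space_id` to list across all spaces.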
create_dataset()
The create_dataset() method migrates from client.create_dataset() to client.datasets.create().
Parameter Reference
| Parameter | v7 | v8 | Changes |
|---|---|---|---|
| space_id | Required | Required | — |
| dataset_name | Required | Required | Renamed to name |
| name | N/A | ✅ Required | Renamed from dataset_name |
| dataset_type | Required | ❌ Removed | No longer required |
| data | Required | Required | Renamed to examples |
| examples | N/A | ✅ Required | Renamed from data; accepts DataFrame or list of dicts |
| convert_dict_to_json | Optional | ❌ Removed | Automatic conversion in v8 |
| max_chunk_size | Optional | ❌ Removed | Now configured at client level |
| force_http | N/A | ✅ Optional | Force HTTP upload instead of gRPC (default False) |
Side-by-Side Comparison
from arize.experimental.datasets import ArizeDatasetsClient
from arize.pandas.proto import flight_pb2
import pandas as pd

# Client initialization
client = ArizeDatasetsClient(api_key="your-developer-key")

# Placeholder rows for illustration; substitute your own DataFrame
dataset_df = pd.DataFrame({"input": ["example question"], "output": ["example answer"]})

# Create dataset
dataset_id = client.create_dataset(
    space_id="your-space-id",
    dataset_name="my-dataset",
    dataset_type=flight_pb2.DatasetType.GENERATIVE,
    data=dataset_df,
    convert_dict_to_json=True,
    max_chunk_size=1000,
)
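The renames and removals in the parameter table can be applied mechanically to existing call sites. The helper below is a minimal sketch of that translation. `to_v8_create_kwargs`, `RENAMED`, and `REMOVED` are hypothetical names introduced for illustration and are not part of the SDK. The helper only rewrites keyword names and does not call the API.

```python
from typing import Any, Dict

# Renames and removals taken from the create_dataset() parameter table.
RENAMED = {"dataset_name": "name", "data": "examples"}
REMOVED = {"dataset_type", "convert_dict_to_json", "max_chunk_size"}


def to_v8_create_kwargs(v7_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Translate v7 create_dataset() keyword arguments to their v8 names."""
    v8_kwargs: Dict[str, Any] = {}
    for key, value in v7_kwargs.items():
        if key in REMOVED:
            continue  # no v8 equivalent; drop it
        # Use the new name if the parameter was renamed, else keep it as-is.
        v8_kwargs[RENAMED.get(key, key)] = value
    return v8_kwargs
```

Assuming `client` is a v8 ArizeClient, the result would then be passed along as `client.datasets.create(**to_v8_create_kwargs(old_kwargs))`.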
get_dataset()
The get_dataset() method behaves differently in v8. In v7, client.get_dataset() returned the dataset examples (the underlying data). In v8, client.datasets.get() returns only the dataset metadata and versions, while client.datasets.list_examples() retrieves the actual examples.
Parameter Reference
For dataset metadata (v8’s datasets.get()):
| Parameter | v7 | v8 | Changes |
|---|---|---|---|
| space_id | Required | ❌ Removed | Not needed in v8 |
| dataset_id | Optional | Required | Now required; no longer accepts dataset_name |
| dataset_name | Optional | ❌ Removed | Use dataset_id instead |
| dataset_version | Optional | ❌ Removed | All versions are returned in metadata |
| convert_json_str_to_dict | Optional | N/A | Only applies to examples, not metadata |
For dataset examples (v8’s datasets.list_examples()):
| Parameter | v7 | v8 | Changes |
|---|---|---|---|
| dataset_id | Optional | Required | — |
| dataset_version | Optional | Optional | Renamed to dataset_version_id |
| dataset_version_id | N/A | ✅ Optional | If empty, returns latest version |
| limit | N/A | ✅ Optional | Maximum number of examples per page (default 100); ignored if all=True |
| all | N/A | ✅ Optional | When True, retrieves all examples via Flight (bypasses pagination). When False (default), uses REST with pagination |
| convert_json_str_to_dict | Optional | ❌ Removed | Automatic conversion in v8 |
Side-by-Side Comparison
from arize.experimental.datasets import ArizeDatasetsClient

# Client initialization
client = ArizeDatasetsClient(api_key="your-developer-key")

# Get dataset examples (underlying data) by ID
dataset_df = client.get_dataset(
    space_id="your-space-id",
    dataset_id="dataset-123",
    dataset_version="v1",
    convert_json_str_to_dict=True
)
# Returns: pandas DataFrame with the dataset examples

# Or get by name
dataset_df = client.get_dataset(
    space_id="your-space-id",
    dataset_name="my-dataset"
)
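Because v8 splits metadata and examples across two calls, code that relied on v7's get_dataset() needs both. The helper below is a minimal sketch, assuming `client` is an already-constructed v8 ArizeClient and that `datasets.get()` and `datasets.list_examples()` accept the keyword arguments named in the tables above. `get_dataset_v8` and `fetch_all` are hypothetical names. Return types are not specified in this section, so both results are passed through unchanged. The v7 lookup by dataset_name has no direct equivalent here, since v8 requires dataset_id.

```python
from typing import Any, Dict, Optional, Tuple


def get_dataset_v8(
    client: Any,
    dataset_id: str,
    dataset_version_id: Optional[str] = None,
    fetch_all: bool = False,
) -> Tuple[Any, Any]:
    """Hypothetical helper returning (metadata, examples) for one dataset."""
    # get() returns metadata and versions only, not the examples.
    metadata = client.datasets.get(dataset_id=dataset_id)

    example_kwargs: Dict[str, Any] = {"dataset_id": dataset_id, "all": fetch_all}
    # Leave dataset_version_id unset to request the latest version.
    if dataset_version_id is not None:
        example_kwargs["dataset_version_id"] = dataset_version_id
    examples = client.datasets.list_examples(**example_kwargs)
    return metadata, examples
```

Setting `fetch_all=True` maps to `all=True`, which per the table retrieves every example and ignores `limit`.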
delete_dataset()
The delete_dataset() method migrates from client.delete_dataset() to client.datasets.delete().
Parameter Reference
| Parameter | v7 | v8 | Changes |
|---|---|---|---|
| space_id | Required | ❌ Removed | Not needed in v8 |
| dataset_id | Optional | Required | Now required; no longer accepts dataset_name |
| dataset_name | Optional | ❌ Removed | Use dataset_id instead |
Side-by-Side Comparison
from arize.experimental.datasets import ArizeDatasetsClient

# Client initialization
client = ArizeDatasetsClient(api_key="your-developer-key")

# Delete dataset by ID
success = client.delete_dataset(
    space_id="your-space-id",
    dataset_id="dataset-123"
)

# Or delete by name
success = client.delete_dataset(
    space_id="your-space-id",
    dataset_name="my-dataset"
)
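In v8 the delete call takes only a dataset ID. The wrapper below is a minimal sketch, assuming `client` is an already-constructed v8 ArizeClient and that `client.datasets.delete()` accepts `dataset_id` as a keyword argument. `delete_dataset_v8` is a hypothetical name, and the return value of the v8 call is not specified in this section, so it is passed through unchanged. The `dataset_name` parameter is kept in the signature only to surface leftover v7 call sites with a clear error.

```python
from typing import Any, Optional


def delete_dataset_v8(
    client: Any,
    dataset_id: Optional[str] = None,
    dataset_name: Optional[str] = None,
) -> Any:
    """Hypothetical wrapper around the v8 client.datasets.delete() call."""
    if dataset_name is not None:
        # v8 no longer accepts dataset_name; fail early instead of guessing.
        raise ValueError("v8 deletes by dataset_id only; look up the ID first")
    if dataset_id is None:
        raise ValueError("dataset_id is required in v8")
    return client.datasets.delete(dataset_id=dataset_id)
```

Call sites that previously deleted by name need to resolve the dataset ID first, for example from the output of `client.datasets.list()`.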