# Binary classification metrics

> Measure F1, Accuracy, Precision, and Recall across experiments by mapping ground truth and predicted columns to a positive class

When your dataset contains labeled ground truth and your experiments produce categorical predictions, you can measure experiment quality with standard binary classification metrics directly in the Arize AX UI. Select a positive class, and Arize AX computes the metrics by comparing each predicted value against the ground truth. No evaluator code is required.

# Prerequisites

Before configuring classification metrics, make sure:

* Your **dataset** has a column with categorical ground truth labels (e.g., `expected_category`, `true_label`).
* Your **experiments** produce an output column with predicted labels (e.g., `output`, `predicted_label`).
* Column values are clean categorical strings. Metrics are computed by exact match against the selected positive class.
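
Because matching is exact, stray whitespace or inconsistent casing may split one class into several distinct values. The following is a minimal sketch of cleaning labels before you upload a dataset or log experiment outputs. It assumes labels are plain Python strings; the helper name `normalize_label` and the lowercase convention are illustrative choices, not part of Arize AX.

```python
from typing import Optional


def normalize_label(value: Optional[str]) -> Optional[str]:
    """Return a trimmed, lowercased label, or None for missing or blank values.

    Hypothetical helper for illustration; not an Arize AX API.
    """
    if value is None:
        return None
    cleaned = value.strip().lower()
    # Treat blank strings as missing rather than as their own category.
    return cleaned or None


raw_labels = ["Spam", " spam", "HAM ", None, ""]
clean_labels = [normalize_label(v) for v in raw_labels]
print(clean_labels)  # ['spam', 'spam', 'ham', None, None]
```

Apply the same normalization to both the ground truth and the predicted column so that equal classes compare as equal strings.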

# Configure metrics settings

<Steps>
  <Step title="Open Metrics Settings">
    Navigate to the experiments page for your dataset and click **Metrics Settings**.
  </Step>

  <Step title="Select Ground Truth Column">
    Choose the column from your **dataset** that contains the true labels. The dropdown shows all columns available in the dataset version.
  </Step>

  <Step title="Select Predicted Column">
    Choose the column from your **experiment** outputs that contains the predicted labels. The dropdown shows columns available across the selected experiments.
  </Step>

  <Step title="Select Positive Class">
    Pick the value that represents the **positive** class for binary metric computation. The dropdown is populated with distinct values from the ground truth column you selected.

    <Info>
      Changing the ground truth column resets the positive class selection.
    </Info>
  </Step>

  <Step title="Click Done">
    Click **Done** to save the settings. Metrics are then computed automatically for every experiment on the dataset.
  </Step>
</Steps>

# Metrics computed

Arize AX computes the following binary classification metrics using the selected positive class. Rows where either the ground truth or predicted value is null are excluded.

| Metric        | Formula                         | Description                                                           |
| ------------- | ------------------------------- | --------------------------------------------------------------------- |
| **Accuracy**  | (TP + TN) / (TP + TN + FP + FN) | Fraction of rows classified correctly (TP or TN)                      |
| **Precision** | TP / (TP + FP)                  | Of all rows predicted positive, the fraction that are truly positive  |
| **Recall**    | TP / (TP + FN)                  | Of all truly positive rows, the fraction that were predicted positive |
| **F1**        | 2 · TP / (2 · TP + FP + FN)     | Harmonic mean of Precision and Recall                                 |

Where:

* **TP** (True Positive) — predicted and ground truth both match the positive class
* **TN** (True Negative) — neither predicted nor ground truth match the positive class
* **FP** (False Positive) — predicted matches the positive class, ground truth does not
* **FN** (False Negative) — ground truth matches the positive class, predicted does not
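
To make the definitions concrete, here is a minimal sketch that reproduces the same counts and formulas offline. It is not Arize AX's implementation: the function name `binary_metrics` and the sample labels are hypothetical, and returning `None` when a denominator is zero is an assumption, since this page does not describe how that case is displayed.

```python
from typing import Dict, Optional, Sequence


def binary_metrics(
    truth: Sequence[Optional[str]],
    predicted: Sequence[Optional[str]],
    positive_class: str,
) -> Dict[str, Optional[float]]:
    """Compute Accuracy, Precision, Recall, and F1 for one positive class."""
    tp = tn = fp = fn = 0
    for actual, pred in zip(truth, predicted):
        # Rows with a null ground truth or predicted value are excluded.
        if actual is None or pred is None:
            continue
        actual_pos = actual == positive_class  # exact string match
        pred_pos = pred == positive_class
        if actual_pos and pred_pos:
            tp += 1
        elif pred_pos:
            fp += 1
        elif actual_pos:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> Optional[float]:
        # Assumption: report None when the denominator is zero.
        return num / den if den else None

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical sample data: six rows, one with a null ground truth.
truth = ["spam", "ham", "spam", "ham", "spam", None]
predicted = ["spam", "spam", "ham", "ham", "spam", "spam"]
metrics = binary_metrics(truth, predicted, positive_class="spam")
print(metrics)
```

In this sample the last row is dropped because its ground truth is null, leaving TP = 2, TN = 1, FP = 1, and FN = 1. Accuracy is therefore 3/5, and Precision, Recall, and F1 are each 2/3.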

# Viewing results

Once configured, classification metrics appear per experiment in the experiments view. Use the charting view to visualize how metrics compare across experiments, or inspect values in the table headers for a detailed breakdown.

<Tip>
  Settings persist per dataset version, so you only need to configure them once for each version. You can compare the same metrics across multiple experiments simultaneously.
</Tip>
