# Compare experiments

The **Compare Experiments** feature helps you identify meaningful improvements in performance across experiments so you can decide which experiment to move forward with. This enables:

* **Faster iteration:** Quickly spot where performance diverges without manual guesswork.
* **Evidence-based decisions:** Confirm whether improvements are real and significant, or just noise.
* **Understand trade-offs:** See whether gains come with costs so you can balance accuracy, speed, and token usage.

# 1. Select Experiments to Compare

Select the experiments you want to analyze and click **Compare Experiments** to view runs side by side.

* See outputs, evaluator results, and metadata together for direct comparison.
* Choose only the columns that matter most and hide the rest.
* Use **Table View** for detailed text results or **Charting View** to visualize evaluator outputs across runs.
* For classification tasks, you can also enable [binary classification metrics](/ax/develop/datasets-and-experiments/experiment-classification-metrics) to compute F1, Accuracy, Precision, and Recall across experiments.
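To make the classification metrics above concrete, here is a minimal Python sketch that computes Accuracy, Precision, Recall, and F1 for two experiments against the same expected labels. It is independent of Arize and is not the product's implementation; the `BinaryMetrics` class, the `binary_metrics` function, the `"positive"` label, and the sample data are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(expected, predicted, positive="positive"):
    """Compute binary classification metrics for one experiment's labels.

    Hypothetical helper for illustration; not part of any Arize SDK.
    """
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")

    pairs = list(zip(expected, predicted))
    tp = sum(1 for e, p in pairs if e == positive and p == positive)
    fp = sum(1 for e, p in pairs if e != positive and p == positive)
    fn = sum(1 for e, p in pairs if e == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against a zero denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return BinaryMetrics(accuracy, precision, recall, f1)


# Made-up labels: the same four examples run through two experiments.
expected = ["positive", "negative", "positive", "negative"]
experiment_a = ["positive", "positive", "negative", "negative"]
experiment_b = ["positive", "negative", "positive", "positive"]

metrics_a = binary_metrics(expected, experiment_a)
metrics_b = binary_metrics(expected, experiment_b)
print("experiment A:", metrics_a)
print("experiment B:", metrics_b)
```

Computing all four metrics from the same confusion counts keeps the experiments comparable: a run can raise Recall while lowering Precision, which a single accuracy number would hide.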

<Frame>
  <video
    src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/arize-compare-experiments-1.mp4"
    width="100%"
    height="100%"
    style={{ 
  display: 'block',
  objectFit: 'fill',
  backgroundColor: 'transparent'
}}
    controls
    autoPlay
    muted
    loop
  />
</Frame>

# 2. Enable Diff Mode

Turn on **Diff Mode** to quickly see where runs improved or regressed against a baseline.

* Select a baseline experiment to measure other runs against.
* Differences in evaluator results are highlighted, making improvements and failures easy to spot.
* Aggregated metrics at the top show how each evaluation measure changed relative to the baseline.

This makes it faster to identify meaningful changes without scanning through every result.
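The baseline comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Arize computes its aggregates: the functions `mean_scores`, `diff_against_baseline`, and `classify_rows`, the evaluator names, and the scores are hypothetical, and the sketch assumes numeric evaluator scores where higher is better.

```python
from statistics import mean


def mean_scores(experiment):
    """Average each evaluator's per-example scores."""
    return {name: mean(scores) for name, scores in experiment.items()}


def diff_against_baseline(baseline, candidate):
    """Change in each evaluator's mean score relative to the baseline."""
    base = mean_scores(baseline)
    cand = mean_scores(candidate)
    # Only compare evaluators that both experiments ran.
    return {name: cand[name] - base[name] for name in base if name in cand}


def classify_rows(baseline_scores, candidate_scores, tolerance=1e-9):
    """Label each example as improved, regressed, or unchanged."""
    labels = []
    for b, c in zip(baseline_scores, candidate_scores):
        if c > b + tolerance:
            labels.append("improved")
        elif c < b - tolerance:
            labels.append("regressed")
        else:
            labels.append("unchanged")
    return labels


# Made-up evaluator scores for four dataset examples.
baseline = {
    "correctness": [1.0, 0.0, 1.0, 0.0],
    "relevance": [0.5, 0.5, 1.0, 1.0],
}
candidate = {
    "correctness": [1.0, 1.0, 1.0, 0.0],
    "relevance": [0.5, 0.25, 1.0, 1.0],
}

deltas = diff_against_baseline(baseline, candidate)
row_labels = classify_rows(baseline["correctness"], candidate["correctness"])
print(deltas)
print(row_labels)
```

In this sample data the candidate gains on `correctness` but loses a little on `relevance`, the kind of mixed result that aggregate deltas surface and that per-row labels help you trace to specific examples.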

<Frame>
  <video
    src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/arize-compare-experiments-2.mp4"
    width="100%"
    height="100%"
    style={{ 
  display: 'block',
  objectFit: 'fill',
  backgroundColor: 'transparent'
}}
    controls
    autoPlay
    muted
    loop
  />
</Frame>

# 3. Enable Diff Output Mode

Enable **Diff Output Mode** to visually compare the textual outputs of experiments.

* Highlights insertions, deletions, and changes compared to a baseline experiment.
* Makes it easy to see how model responses evolve across runs.
* Useful for spotting subtle but important output differences that metrics alone may miss.

This helps you understand not just whether results changed, but exactly **how** they changed.
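For intuition about what a text diff against a baseline reports, here is a small sketch using Python's standard-library `difflib`. It is not the algorithm behind Diff Output Mode, which this page does not specify; the `word_diff` function and the sample outputs are hypothetical, and the sketch compares text word by word.

```python
import difflib


def word_diff(baseline, candidate):
    """Return word-level changes needed to turn baseline into candidate.

    Each entry is (tag, baseline_text, candidate_text), where tag is
    "insert", "delete", or "replace". Unchanged spans are skipped.
    """
    a = baseline.split()
    b = candidate.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)

    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


# Made-up outputs from a baseline run and a later run.
baseline_output = "The capital of France is Paris."
candidate_output = "The capital city of France is Paris, France."

changes = word_diff(baseline_output, candidate_output)
for tag, old, new in changes:
    print(f"{tag}: {old!r} -> {new!r}")
```

Both outputs here would likely receive the same correctness score, yet the diff shows an inserted word and a rewritten ending, which is the kind of difference that metrics alone may miss.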

<Frame>
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/arize-compare-experiments-3.png" alt="Diff Output Mode highlighting differences in experiment outputs against a baseline experiment" />
</Frame>
