[SHOWDOWN]

Real people. Real conversations. Real rankings.

Showdown ranks AI models based on how they perform in real-world use -- not synthetic tests or lab settings. Votes are blind, optional, and organic, so rankings reflect authentic preferences.

Prompts compared

Real conversation prompts compared across models through pairwise votes.

Active users

From 80+ countries and 70+ languages, spanning all backgrounds and professions.

* This model's API does not consistently return Markdown-formatted responses. Since raw outputs are used in head-to-head comparisons, this may affect its ranking.

Showdown Leaderboard - LLMs

Performance Comparison Across Language Models

Charts: Win Rate vs. Each Model, Battle Count vs. Each Model, Confidence Intervals, Average Win Rate, Prompt Distribution
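As a rough illustration of the quantities these charts name, the sketch below computes per-opponent win rates, battle counts, and an average win rate from a list of pairwise votes. It is a minimal sketch under stated assumptions, not Showdown's actual method: the page does not say how votes are aggregated, so the `(winner, loser)` vote format, the exclusion of ties, the unweighted averaging across opponents, and the function names `pairwise_stats` and `average_win_rate` are all hypothetical.

```python
from collections import defaultdict


def pairwise_stats(votes):
    """Compute win rates and battle counts for each ordered model pair.

    `votes` is an iterable of (winner, loser) model-name tuples.
    Assumption: every vote has a clear winner (ties are not modeled).
    """
    wins = defaultdict(int)     # (a, b) -> number of times a beat b
    battles = defaultdict(int)  # (a, b) -> number of battles between a and b
    for winner, loser in votes:
        wins[(winner, loser)] += 1
        battles[(winner, loser)] += 1
        battles[(loser, winner)] += 1
    # Win rate of a against b = wins of a over b / battles between a and b.
    rates = {pair: wins.get(pair, 0) / n for pair, n in battles.items()}
    return rates, dict(battles)


def average_win_rate(rates):
    """Average each model's win rate over the opponents it has faced.

    Assumption: opponents are weighted equally, regardless of battle count.
    """
    per_model = defaultdict(list)
    for (model, _opponent), rate in rates.items():
        per_model[model].append(rate)
    return {model: sum(r) / len(r) for model, r in per_model.items()}


if __name__ == "__main__":
    # Hypothetical votes between placeholder model names.
    sample_votes = [
        ("model-a", "model-b"),
        ("model-a", "model-b"),
        ("model-b", "model-a"),
        ("model-a", "model-c"),
    ]
    rates, battles = pairwise_stats(sample_votes)
    print(rates)
    print(battles)
    print(average_win_rate(rates))
```

Averaging per-opponent rates, rather than pooling all votes, keeps a model from looking strong simply because it was matched most often against a weak opponent; a production leaderboard might instead fit a rating model, which this sketch does not attempt.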

gemini-3-pro-preview

gemini-3-flash

qwen3-omni

gpt-4o-audio-preview-2025-06-03

voxtral-small-24b-2507

gemma3n

gpt-realtime

phi-4-multimodal-instruct

Voice Model Performance Comparison

Charts: Win Rate vs. Each Model, Battle Count vs. Each Model, Confidence Intervals, Average Win Rate, Prompt Distribution
