LLM Index

Overview and comparison of available Inference API providers and supported LLMs.

Promptmetheus currently supports 12 providers and 118 models.
ProviderHQModelStatusDistributionReleaseDeprecationParametersContext lengthMax tokensPrice in/out ($/1MT)Links
AI21
Tel Aviv
Jamba 1.5 Large
Aug 22, 2024† May 06, 2025256,0004,096$2 / $8
Jamba 1.5 Mini
Aug 22, 2024† May 06, 2025256,0004,096$0.2 / $0.4
Jamba 1.6 Large
Mar 06, 2025256,0004,096$2 / $8
Jamba 1.6 Mini
Mar 06, 2025256,0004,096$0.2 / $0.4
Jurassic 2 Light
Mar 09, 20238,1928,192$0.1 / $0.5
Jurassic 2 Mid
Mar 09, 20238,1928,192$0.25 / $1.25
Jurassic 2 Ultra
Mar 09, 20238,1928,192$2 / $10
Aleph Alpha
Heidelberg
Luminous Base
Apr 14, 20232,0482,048$30
Luminous Base Control
Apr 14, 20232,0482,048$37.5
Luminous Extended
Apr 14, 20232,0482,048$45
Luminous Extended Control
Apr 14, 20232,0482,048$56.25
Luminous Supreme
Apr 14, 20232,0482,048$175
Luminous Supreme Control
Apr 14, 20232,0482,048$218.75
Anthropic
San Francisco
Claude 2
Jul 11, 2023100,0004,096$8 / $24
Claude 2.1
Nov 21, 2023200,0004,096$8 / $24
Claude 3 Haiku
Mar 15, 2024200,0004,096$0.25 / $1.25
Claude 3 Opus
Mar 04, 2024200,0004,096$15 / $75
Claude 3 Sonnet
Mar 04, 2024200,0004,096$3 / $15
Claude 3.5 Haiku
2024/10/22
Oct 22, 2024200,0008,192$1 / $4
Claude 3.5 Sonnet
2024/10/22
Oct 22, 2024200,0008,192$3 / $15
Claude 3.5 Sonnet
2024/06/20
Jun 20, 2024200,0008,192$3 / $15
Claude 3.7 Sonnet
latest
Feb 24, 2025200,0008,192$3 / $15
Claude Instant 1
Mar 14, 2023100,0002,048$1.63 / $5.51
Claude Instant 1.2
Sep 21, 2023100,0002,048$1.63 / $5.51
Cohere
Toronto
Command
4,0964,096$15
Command A
03/2025
Mar 13, 2025256,0008,000$2.5 / $10
Command Light
4,0964,096$15
Command Nightly
4,0964,096$15
Command R
Mar 11, 2024128,0004,096$0.5 / $1.5
Command R+
Apr 04, 2024128,0004,096$3 / $15
DeepInfra
Palo Alto
Code Llama 34B
Instruct HF
Aug 24, 2023
34B
16,3844,096$0.6
DeepSeek R1
Dec 26, 2024
671B
128,0008,192$0.55 / $2.19
DeepSeek R1 Turbo
Mar 26, 2025
671B
128,0008,192$1 / $3
DeepSeek V3
Dec 26, 2024
671B
128,0004,096$0.85 / $0.9
Gemma 2 27B
27B
8,1928,192$0.27
Gemma 2 9B
9B
8,1928,192$0.06
Gemma 3 12B
Mar 10, 2025
12B
131,0728,192$0.05 / $0.1
Gemma 3 27B
Mar 10, 2025
27B
131,0728,192$0.1 / $0.2
Gemma 3 4B
Mar 10, 2025
4B
131,0728,192$0.02 / $0.04
Llama 2 13B
Chat HF
Jul 18, 2023
13B
4,0964,096$0.35
Llama 2 70B
Chat HF
Jul 18, 2023
70B
4,0964,096$1.88
Llama 2 7B
Chat HF
Jul 18, 2023
7B
4,0964,096$0.2
Llama 3.3 70B
Instruct
Dec 06, 2024
70B
128,0008,192$0.23 / $0.4
Llama 3.3 70B Turbo
Instruct
Dec 06, 2024
70B
128,0008,192$0.12 / $0.3
Llama 4 Maverick 17B 128e
Instruct FP8
Apr 05, 2025
17B
131,0728,192$0.2 / $0.6
Llama 4 Scout 17B 16e
Instruct
Apr 05, 2025
17B
131,0728,192$0.1 / $0.3
Meta Llama 3.1 405B
Instruct
405B
128,0008,192$1.79
Meta Llama 3.1 70B
Instruct
70B
128,0008,192$0.35 / $0.4
Meta Llama 3.1 8B
Instruct
8B
128,0008,192$0.06
Meta Llama 3.2 1B
Instruct
1B
128,0008,192$0.01 / $0.02
Meta Llama 3.2 3B
Instruct
3B
128,0008,192$0.03 / $0.05
Mistral 7B
Instruct v0.1
Sep 27, 2023
7B
32,7688,192$0.2
Mistral Nemo
Instruct 24/07
12B
131,0728,192$0.13
Mixtral 8x22B
Instruct v0.1
† Sep 19, 2024
22B
65,5368,192$0.65
Mixtral 8x7B
Instruct v0.1
56B
32,7688,192$0.24
Phi 4
Dec 12, 2024
14B
16,3848,192$0.07 / $0.14
Qwen 2 72B
Instruct
72B
128,0008,192$0.35 / $0.4
Qwen 2.5 72B
Instruct
Sep 19, 2024
72B
131,0728,192$0.23 / $0.4
WizardLM 2 7B
7B
32,7688,192$0.06
WizardLM 2 8x22B
176B
65,5368,192$0.5
DeepSeek
Hangzhou
Chat V3
Dec 26, 2024
671B
128,0004,096$0.14 / $0.28
Reasoner (R1)
Jan 20, 2025
671B
64,0008,192$0.55 / $2.19
FetchAI
Cambridge
ASI-1 mini
Feb 25, 20251,000,0004,096$0
Gemini
Mountain View
Chat Bison
May 10, 20238,1961,024$0.5
Code Bison
May 10, 20238,1961,024$0.5
Gemini 1.0 Pro
Dec 13, 202332,7688,192$0.5 / $1.5
Gemini 1.5 Flash
exp 08/27
Aug 27, 20241,048,5768,192$0.07 / $0.3
Gemini 1.5 Flash
May 14, 20241,048,5768,192$0.07 / $0.3
Gemini 1.5 Flash 8B
Oct 03, 2024
8B
1,048,5768,192$0.07 / $0.3
Gemini 1.5 Flash 8B
exp 08/27
Aug 27, 2024
8B
1,048,5768,192$0.04 / $0.15
Gemini 1.5 Pro
exp 08/27
Aug 27, 20242,097,1528,192$1.25 / $5
Gemini 1.5 Pro
exp 08/01
Aug 27, 20242,097,1528,192$1.25 / $5
Gemini 1.5 Pro
Feb 15, 20242,097,1528,192$1.25 / $5
Gemini 1.5 Pro
Feb 15, 20242,097,1528,192$1.25 / $5
Gemini 2.0 Flash
exp
Dec 11, 20241,048,5768,192$0
Gemini 2.0 Flash
Dec 11, 20241,048,5768,192$0.1 / $0.4
Gemini 2.0 Flash Lite
Feb 05, 20251,048,5768,192$0.07 / $0.3
Gemini 2.0 Flash Thinking
exp 12/19
Dec 19, 20241,048,5768,192$0
Gemini 2.0 Pro
exp 02/05
Feb 05, 20251,048,5768,192$0
Gemini 2.5 Pro
exp 03/25
Mar 25, 20251,048,57665,536$0
Text Bison
May 10, 20238,1921,024$0.5
Groq
Mountain View
DeepSeek R1 Distill Llama 70B
70B
128,0001,024$0.75 / $0.99
DeepSeek R1 Distill Qwen 32B
32B
128,00016,384$0.69
Gemma 2 9B
Jun 27, 2024
9B
8,1928,192$0.2
Gemma 7B
Jan 15, 2024
7B
8,1928,192$0
Llama 2 70B
Jan 15, 2024
70B
4,0964,096$0
Llama 3 70B
Apr 18, 2024
70B
128,0002,048$0.59 / $0.79
Llama 3 8B
Apr 18, 2024
8B
128,0002,048$0.05 / $0.08
Llama 3.1 405B
preview
Jul 23, 2024
405B
128,0002,048$0.59 / $0.79
Llama 3.1 70B
preview
Jul 23, 2024
70B
128,0002,048$0.59 / $0.79
Llama 3.1 8B
preview
Jul 23, 2024
8B
128,0002,048$0.05 / $0.08
Llama 3.2 1B
preview
Sep 25, 2024
1B
128,0002,048$0.04
Llama 3.2 3B
preview
Sep 25, 2024
3B
128,0002,048$0.06
Llama 3.3 70B
Versatile
Dec 06, 2024
70B
128,0002,048$0.59 / $0.79
Llama 4 Maverick 17B 128e
Apr 05, 2025
17B
131,0728,192$0.5 / $0.77
Llama 4 Scout 17B 16e
Apr 05, 2025
17B
131,0728,192$0.11 / $0.34
Mistral Saba 24B
Feb 17, 2025
24B
32,7688,192$0.79
Mixtral 8x7B
Jan 15, 2024
7B
32,7688,192$0.24
Qwen 2.5 32B
32B
128,0008,000$0.79
Qwen QwQ 32B
32B
131,0728,000$0.29 / $0.39
Mistral
Paris
Large
Feb 26, 202432,0008,192$4 / $12
Medium
23/12
Dec 11, 2023† Mar 30, 202532,0008,192$2.7 / $8.1
Medium
latest
Dec 11, 2023† Mar 30, 202532,0008,192$2.7 / $8.1
Ministral 3B
latest
Oct 16, 2024
3B
131,0008,192$0.04
Ministral 8B
latest
Oct 16, 2024
8B
131,0008,192$0.1
Nemo
Jul 18, 2024
12B
131,0008,192$0.3
Open Mixtral 8x22B
Apr 17, 2024† Mar 30, 2025
176B
64,0008,192$2 / $6
Saba
Feb 17, 2025
24B
32,0008,192$0.2 / $0.6
Small
Mixtral 8x7B
Dec 11, 2023
56B
32,0008,192$0.7
Small
Dec 11, 202332,0008,192$1 / $3
Tiny
Mistral 7B
Dec 11, 2023† Mar 30, 2025
7B
32,0008,192$0.25
NLP Cloud
New York
Chat Dolphin
65,5368,192$0.5
Dolphin
65,5368,192$0.5
OpenAI
San Francisco
Babbage 002
16,38416,384$1.6
DaVinci 002
16,38416,384$12
GPT-3.5 Turbo
06/13
16,3854,096$1.5 / $2
GPT-3.5 Turbo
11/06
Nov 06, 202316,3854,096$1 / $2
GPT-3.5 Turbo
01/25
Jan 25, 202416,3854,096$0.5 / $1.5
GPT-3.5 Turbo
03/01
† Jun 13, 202416,3854,096$1.5 / $2
GPT-3.5 Turbo
Nov 30, 202216,3854,096$1.5 / $2
GPT-3.5 Turbo
Instruct
16,3854,096$1.5 / $2
GPT-3.5 Turbo 16k
Nov 30, 202216,3854,096$3 / $4
GPT-3.5 Turbo 16k
03/01
† Jun 13, 202416,3854,096$3 / $4
GPT-3.5 Turbo 16k
06/13
† Jun 13, 202416,3854,096$3 / $4
GPT-4
06/13
8,1924,096$30 / $60
GPT-4
03/14
8,1924,096$30 / $60
GPT-4
Mar 14, 20238,1924,096$30 / $60
GPT-4 0125 Turbo
Jan 25, 2024128,0004,096$10 / $30
GPT-4 32k
Mar 14, 202332,7684,096$60 / $120
GPT-4 32k
03/14
32,7684,096$60 / $120
GPT-4 32k
06/13
32,7684,096$60 / $120
GPT-4 Turbo
Jan 25, 2024128,0004,096$10 / $30
GPT-4 Turbo
2024/04/09
Apr 09, 2024128,0004,096$10 / $30
GPT-4 Turbo
11/06
Nov 06, 2023128,0004,096$10 / $30
GPT-4.5
preview
Feb 27, 2025128,00016,384$75 / $150
GPT-4o
2024/08/06
Aug 06, 2024128,0004,096$5 / $15
GPT-4o
2024/11/20
Nov 20, 2024128,0004,096$5 / $15
GPT-4o
2024/05/13
May 13, 2024128,0004,096$5 / $15
GPT-4o
May 13, 2024128,0004,096$5 / $15
GPT-4o Search
preview
128,00016,384$2.5 / $10
GPT-4o mini
2024/07/18
Jul 18, 2024128,0004,096$0.15 / $0.6
GPT-4o mini
Jul 18, 2024128,0004,096$0.15 / $0.6
GPT-4o mini Search
preview
128,00016,384$0.15 / $0.6
o1
2024/12/17
Sep 12, 2024200,000100,000$15 / $60
o1
Sep 12, 2024200,000100,000$15 / $60
o1 mini
2024/09/12
Sep 12, 2024128,00065,536$3 / $12
o1 mini
Sep 12, 2024128,00065,536$3 / $12
o1 preview
2024/09/12
Sep 12, 2024128,00032,768$15 / $60
o1 preview
Sep 12, 2024128,00032,768$15 / $60
o1-pro
Mar 19, 2025200,000100,000$150 / $600
o3 mini
2025/01/31
Jan 31, 2025200,000100,000$1.1 / $4.4
o3 mini
Jan 31, 2025200,000100,000$1.1 / $4.4
Perplexity
San Francisco
Code Llama 34B
Instruct
Oct 04, 2023
34B
16,3844,096$0.35 / $1.4
Code Llama 70B
Instruct
Oct 04, 2023
70B
16,3844,096$0.7 / $2.8
Llama 2 70B
Oct 04, 2023
70B
4,0964,096$0.7 / $2.8
Llama 3 70B
Instruct
May 14, 2024† Aug 12, 2024
70B
8,1928,192$0 / $1
Llama 3 8B
Instruct
May 14, 2024† Aug 12, 2024
8B
8,1928,192$0 / $0.2
Llama 3 Sonar large 32k
Online
May 14, 2024† Aug 12, 2024127,0724,096$0 / $1
Llama 3 Sonar large 32k
Chat
May 14, 2024† Aug 12, 2024127,0724,096$0 / $1
Llama 3 Sonar small 32k
Online
May 14, 2024† Aug 12, 2024127,0724,096$0 / $0.2
Llama 3 Sonar small 32k
Chat
May 14, 2024† Aug 12, 2024127,0724,096$0 / $0.2
Llama 3.1 Sonar huge 128k
Online
Aug 14, 2024† Feb 22, 2025127,0724,096$0 / $5
Llama 3.1 Sonar large 128k
Chat
Jul 31, 2024† Feb 22, 2025127,0724,096$0 / $1
Llama 3.1 Sonar large 128k
Online
Jul 31, 2024† Feb 22, 2025127,0724,096$0 / $1
Llama 3.1 Sonar small 128k
Chat
Jul 31, 2024† Feb 22, 2025127,0724,096$0 / $0.2
Llama 3.1 Sonar small 128k
Online
Jul 31, 2024† Feb 22, 2025127,0724,096$0 / $0.2
Mistral 7B
Oct 04, 2023† Aug 12, 2024
7B
32,7688,192$0.07 / $0.28
Mixtral 8x7B
Instruct
Oct 04, 2023† Aug 12, 2024
7B
32,7688,192$0 / $0.6
R1 1776
Feb 18, 2025
671B
128,0008,192$2 / $8
Sonar
Jan 25, 2025128,0008,196$1
Sonar Deep Research
Feb 14, 2025200,0008,196$2 / $8
Sonar Pro
Jan 25, 2025200,0008,196$3 / $15
Sonar Reasoning
Jan 30, 2025128,0008,196$1 / $5
Sonar Reasoning Pro
Jan 30, 2025200,0008,196$2 / $8
Sonar medium
Chat
Feb 24, 2024127,0724,096$0.6 / $1.8
Sonar medium
Online
Feb 24, 2024127,0724,096$0 / $1.8
Sonar small
Chat
Feb 24, 2024127,0724,096$0.07 / $0.28
Sonar small
Online
Feb 24, 2024127,0724,096$0 / $0.28
pplx 70B
Chat
Oct 27, 2023
70B
16,3844,096$0.7 / $2.8
pplx 70B
Online
Nov 29, 2023
70B
16,3844,096$0 / $2.8
pplx 7B
Online
Nov 29, 2023
7B
16,3844,096$0 / $0.28
pplx 7B
Chat
Oct 27, 2023
7B
16,3844,096$0.07 / $0.28
Replicate
San Francisco
Llama 2 13B
13B
4,0964,096$0.1 / $0.5
Llama 2 13B chat
13B
4,0964,096$0.1 / $0.5
Meta Llama 3 8B
8B
4,0964,096$0.05 / $0.25
xAI
Bay Area
Grok 1
Nov 06, 2020
314B
8,1928,192$0
Grok 1.5
Mar 28, 2024128,0008,192$0
Grok 2
Aug 13, 2024
314B
131,0728,192$2 / $10
Grok 2 mini
Aug 13, 2024
35B
131,0728,192$0
Grok Beta
Nov 04, 2024131,0728,192$5 / $15

Legend

Status:  
Supported   Deprecated   Not available

Distribution:  
Proprietary   Open weights   Open source

Release:
Dates marked with a dagger (†) indicate that the model has already been deprecated or will be retired at the given date.

Links:  
Info   Announcement   Benchmark

Disclaimer

Even though we try to keep this list accurate and up-to-date, please do not rely on the presented information for critical use cases and always double check official sources.

Model Parameters

Promptmetheus currently relies on the LiteLLM open-source library to efficiently connect to the different LLM APIs. You can find all relevant details for model parameter support in their documentation.

Shout-out to Krrish and Ishaan who make adding new models easy as pie.

LLM Support

We aim to evenutally support all major LLM providers that have a public API and always add new foundation models as soon as they get released. It is also planned to support local inference and hosted products from Amazon, Google, Microsoft, etc. in the future.

If there is a provider or model missing that you would like to use in your Promptmetheus project, please don't hesitate to request it.

LLM Benchmarks

We currently don't do any benchmarking ourselves. Please consult the listed links for benchmarks of each specific model.

Promptmetheus © 2023-present. All Rights Reserved.