LLM Index

Overview and comparison of available inference API providers and supported LLMs.

Promptmetheus currently supports 11 providers and 85 models.
Provider
Model
Status
Distribution
Release
Max. tokens (in/out)
Token price in/out ($/1M)
Links
AI21
Jurassic 2 Light
Mar 9, 2023 8,192$0.1 / $0.5
Jurassic 2 Mid
Mar 9, 2023 8,192$0.25 / $1.25
Jurassic 2 Ultra
Mar 9, 2023 8,192$2 / $10
Aleph Alpha
Luminous Base
Apr 14, 2023 2,048$30
Luminous Base Control
Apr 14, 2023 2,048$37.5
Luminous Extended
Apr 14, 2023 2,048$45
Luminous Extended Control
Apr 14, 2023 2,048$56.25
Luminous Supreme
Apr 14, 2023 2,048$175
Luminous Supreme Control
Apr 14, 2023 2,048$218.75
Anthropic
Claude 2
Jul 11, 2023 100,000$8 / $24
Claude 2.1
Nov 21, 2023 200,000$8 / $24
Claude 3 Haiku
Mar 15, 2024 200,000$0.25 / $1.25
Claude 3 Opus
Mar 4, 2024 200,000$15 / $75
Claude 3 Sonnet
Mar 4, 2024 200,000$3 / $15
Claude 3.5 Sonnet 20240620
Jun 20, 2024 200,000 / 4,096$3 / $15
Claude Instant 1
Mar 14, 2023 100,000$1.63 / $5.51
Claude Instant 1.2
Sep 21, 2023 100,000$1.63 / $5.51
Cohere
Command
4,096$15
Command Light
4,096$15
Command Nightly
4,096$15
Command R
Mar 11, 2024 128,000$0.5 / $1.5
Command R+
Apr 4, 2024 128,000$3 / $15
Deep Infra
Code Llama 34B Instruct HF
Aug 24, 2023 4,096$0.6
Llama 2 13B Chat HF
Jul 18, 2023 4,096$0.35
Llama 2 70B Chat HF
Jul 18, 2023 6,144$1.88
Llama 2 7B Chat HF
Jul 18, 2023 4,096$0.2
Mistral 7B Instruct v0.1
Sep 27, 2023 4,096$0.2
Google AI
Chat Bison
May 10, 2023 4,096$0.5
Code Bison
May 10, 2023 6,144$0.5
Gemini 1.0 Pro
Dec 13, 2023 30,720$0.5 / $1.5
Gemini 1.5 Flash (latest)
May 14, 2024 1,048,576$0.35 / $0.53
Gemini 1.5 Pro
Feb 15, 2024 1,048,576$7 / $21
Gemini 1.5 Pro (latest)
Feb 15, 2024 1,048,576$3.5 / $10.5
Text Bison
May 10, 2023 8,192$0.5
Groq
Gemma 2 9B
Jun 27, 2024 8,192$0.07
Gemma 7B
Jan 15, 2024 8,192$0.07
Llama 2 70B
Jan 15, 2024 4,096$0.59 / $0.79
Llama 3 70B
Apr 18, 2024 8,192 / 8,192$0.59 / $0.79
Llama 3 8B
Apr 18, 2024 8,192 / 8,192$0.05 / $0.08
Llama 3.1 405B (preview)
Jul 23, 2024 131,072$0
Llama 3.1 70B (preview)
Jul 23, 2024 131,072$0
Llama 3.1 8B (preview)
Jul 23, 2024 131,072$0
Mixtral 8x7B
Jan 15, 2024 32,768$0.24
Mistral
Large (latest)
Feb 26, 2024 32,000 / 8,192$4 / $12
Medium (2312)
Dec 11, 2023 32,000 / 8,192$2.7 / $8.1
Medium (latest)
Dec 11, 2023 32,000 / 8,192$2.7 / $8.1
Open Mixtral 8x22B
Apr 17, 2024 64,000 / 8,192$2 / $6
Small (Mixtral 8x7B)
Dec 11, 2023 32,000 / 8,192$0.7
Small (latest)
Dec 11, 2023 32,000 / 8,192$1 / $3
Tiny (Mistral 7B)
Dec 11, 2023 32,000 / 8,192$0.25
NLP Cloud
Chat Dolphin
16,384$0.5
Dolphin
16,384$0.5
OpenAI
Babbage 002
16,384$1.6
DaVinci 002
16,384$12
GPT-3.5 Turbo
Nov 30, 2022 4,096$1.5 / $2
GPT-3.5 Turbo 0125
Jan 25, 2024 16,385$0.5 / $1.5
GPT-3.5 Turbo 0301
(โ€ Jun 13, 2024)4,096$1.5 / $2
GPT-3.5 Turbo 0613
4,096$1.5 / $2
GPT-3.5 Turbo 1106
Nov 6, 2023 16,385$1 / $2
GPT-3.5 Turbo 16k
Nov 30, 2022 16,384$3 / $4
GPT-3.5 Turbo 16k 0301
(โ€ Jun 13, 2024)16,384$3 / $4
GPT-3.5 Turbo 16k 0613
(โ€ Jun 13, 2024)16,384$3 / $4
GPT-3.5 Turbo Instruct
4,096$1.5 / $2
GPT-4
Mar 14, 2023 8,192$30 / $60
GPT-4 0125 Turbo
Jan 25, 2024 128,000$10 / $30
GPT-4 0314
8,192$30 / $60
GPT-4 0613
8,192$30 / $60
GPT-4 1106 Turbo
Nov 6, 2023 128,000$10 / $30
GPT-4 32k
Mar 14, 2023 32,768$60 / $120
GPT-4 32k 0314
32,768$60 / $120
GPT-4 32k 0613
32,768$60 / $120
GPT-4 Turbo
Jan 25, 2024 128,000$10 / $30
GPT-4 Turbo 2024-04-09
Apr 9, 2024 128,000$10 / $30
GPT-4o
May 13, 2024 128,000 / 4,096$5 / $15
GPT-4o (2024-05-13)
May 13, 2024 128,000 / 4,096$5 / $15
GPT-4o mini
Jul 18, 2024 128,000 / 4,096$0.15 / $0.6
GPT-4o mini (2024-07-18)
Jul 18, 2024 128,000 / 4,096$0.15 / $0.6
Perplexity
Code Llama 34B Instruct
Oct 4, 2023 16,384$0.35 / $1.4
Code Llama 70B Instruct
Oct 4, 2023 16,384$0.7 / $2.8
Llama 2 70B
Oct 4, 2023 4,096$0.7 / $2.8
Llama 3 70B instruct
May 14, 2024 8,192$0 / $1
Llama 3 8B instruct
May 14, 2024 8,192$0 / $0.2
Llama 3 Sonar large 32k (chat)
May 14, 2024 32,768$0 / $1
Llama 3 Sonar large 32k (online)
May 14, 2024 28,000$0 / $1
Llama 3 Sonar small 32k (chat)
May 14, 2024 32,768$0 / $0.2
Llama 3 Sonar small 32k (online)
May 14, 2024 28,000$0 / $0.2
Mistral 7B
Oct 4, 2023 4,096$0.07 / $0.28
Mixtral 8x7B Instruct
Oct 4, 2023 16,384$0 / $0.6
Sonar medium (chat)
Feb 24, 2024 16,384$0.6 / $1.8
Sonar medium (online)
Feb 24, 2024 12,000$0 / $1.8
Sonar small (chat)
Feb 24, 2024 16,384$0.07 / $0.28
Sonar small (online)
Feb 24, 2024 12,000$0 / $0.28
pplx 70B chat
Oct 27, 2023 4,096$0.7 / $2.8
pplx 70B online
Nov 29, 2023 4,096$0 / $2.8
pplx 7B chat
Oct 27, 2023 8,192$0.07 / $0.28
pplx 7B online
Nov 29, 2023 4,096$0 / $0.28
Replicate
Llama 2 13B
4,096 / 4,096$0.1 / $0.5
Llama 2 13B chat
4,096 / 4,096$0.1 / $0.5
Meta Llama 3 8B
4,096 / 4,096$0.05 / $0.25
xAI
Grok 1
Nov 6, 2020 25,000$0
Grok 1.5
Mar 28, 2024 128,000$0

Legend

Status:ย ย 
Supportedย ย  Not availableย ย  Deprecated

Distribution:ย ย 
Proprietaryย ย  Open weightsย ย  Open source

Release:
Dates marked with a dagger (โ€ ) indicate that the model has already been deprecated or will be retired at the given date.

Links:ย ย 
Infoย ย  Announcementย ย  Benchmark

Disclaimer

Even though we try to keep this list accurate and up-to-date, please do not rely on the presented information for critical use cases and always double check official sources.

Model Parameters

Promptmetheus currently relies on the LiteLLM open-source library to efficiently connect to the different LLM APIs. You can find all relevant details for model parameter support in their documentation.

Shout-out to Krrish and Ishaan who make adding new models easy as pie.

LLM Support

We aim to evenutally support all major LLM providers that have a public API and always add new foundation models as soon as they get released. It is also planned to support local inference and hosted products from Amazon, Google, Microsoft, etc. in the future.

If there is a provider or model missing that you would like to use in your Promptmetheus project, please don't hesitate to request it.

LLM Benchmarks

We currently don't do any benchmarking ourselves. Please consult the listed links for benchmarks of each specific model.

PROMPTMETHEUS ยฉ 2024