How to Choose the Right LLM for your App, Agent, Workflow, ...

A systematic approach to figuring out which model yields the best results for a specific use case

Toni Engelhardt
Mar 31, 2024 · 1 min read

"Which LLM is the best for this workflow?" is the question you probably have asked yourself many times if you are working with Large Language Models (LLMs) — and unfortunately, there is no universal answer to this question.

The short answer is: "It depends on your use case, your budget, and in some cases also on your personal preferences."

This doesn't help you much, though...

Let me give you a step-by-step approach to figure out which LLM to pick:

If you are new to the game, start with the industry standard: OpenAI GPT-5 mini for simple tasks and GPT-5 / GPT-5 pro for more complex ones. Depending on when you read this, there might be a new king of the hill, so check the LLM Benchmarks. Use Promptmetheus to compose and optimize your prompts until you get satisfying and reproducible completions. If you do not get anywhere with OpenAI models, try other providers such as Anthropic, Google DeepMind, or xAI.

Once you have a working prompt, optimize it for performance, speed, reliability, and/or cost (depending on your requirements) by comparing different LLMs and configurations of model parameters (temperature, frequency penalty, etc.). You can find the full list of the 150+ models we support in our LLM Index.
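The comparison step can be sketched as a small grid search over models and parameters. Here `complete` and `score` are placeholders you supply yourself (they are not Promptmetheus or provider APIs), and the model names are made up:

```python
from itertools import product

def grid_search(prompt, models, temperatures, complete, score):
    """Try every (model, temperature) pair and rank the results.

    `complete(prompt, model=..., temperature=...)` is a user-supplied
    function that returns a completion string; `score(completion)`
    returns a float where higher is better.
    """
    results = []
    for model, temp in product(models, temperatures):
        completion = complete(prompt, model=model, temperature=temp)
        results.append({"model": model, "temperature": temp,
                        "score": score(completion)})
    # Best configuration first
    return sorted(results, key=lambda r: r["score"], reverse=True)

# Example with a stub in place of a real API call:
def fake_complete(prompt, model, temperature):
    return f"{model} answer at t={temperature}"

def fake_score(completion):
    # Placeholder metric; in practice, rate completions against your rubric
    return 1.0 if "0.7" in completion else 0.0

ranked = grid_search("Summarize this ticket:", ["model-a", "model-b"],
                     [0.0, 0.7], fake_complete, fake_score)
best = ranked[0]
```

In a real setup, `score` is the interesting part: it can be an exact-match check against reference outputs, a rubric you fill in manually, or an LLM-as-judge call.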

Goal

The goal is to find the model that is 1) the cheapest, 2) fast enough, and 3) able to do the job reliably.
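That selection rule can be written down directly. A minimal sketch, assuming you have already measured cost, median latency, and a reliability score for each candidate from your own test runs (all names and numbers below are made up):

```python
def pick_model(candidates, max_latency_s, min_reliability):
    """Return the cheapest candidate that is fast and reliable enough."""
    viable = [c for c in candidates
              if c["latency_s"] <= max_latency_s
              and c["reliability"] >= min_reliability]
    if not viable:
        return None  # relax constraints or keep improving the prompt
    return min(viable, key=lambda c: c["cost_per_1k_tokens"])

# Hypothetical measurements:
candidates = [
    {"name": "small-model", "cost_per_1k_tokens": 0.15,
     "latency_s": 0.8, "reliability": 0.92},
    {"name": "large-model", "cost_per_1k_tokens": 2.50,
     "latency_s": 2.1, "reliability": 0.99},
]
choice = pick_model(candidates, max_latency_s=1.5, min_reliability=0.9)
```

Note the ordering: speed and reliability act as hard constraints, and cost only breaks ties among the models that pass them.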

If you already have some experience with Prompt Engineering, start with a model that previously worked well for use cases similar to the one at hand (keep in mind that the technology evolves fast and your initial selection will likely change every few months). Before you start fine-tuning, take an early version of the prompt and execute it with a few different LLMs to see if any of them yields superior results. If yes, switch to that one.

Now, test the prompt with different inputs and make adjustments (one section at a time) until you achieve great results. Run it again with alternative models and settings to see if any of them performs better. Rinse and repeat until performance reaches a plateau.
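One way to decide when the rinse-and-repeat loop has reached a plateau is to track your evaluation score across iterations and stop once the best score stops improving by more than a small threshold. The scores below are purely illustrative:

```python
def has_plateaued(scores, window=3, epsilon=0.01):
    """True if the best score has not improved by more than `epsilon`
    over the last `window` iterations."""
    if len(scores) <= window:
        return False  # not enough history yet
    recent_best = max(scores[-window:])
    earlier_best = max(scores[:-window])
    return recent_best - earlier_best <= epsilon

# Evaluation score after each prompt revision (illustrative values):
history = [0.62, 0.71, 0.80, 0.805, 0.806, 0.807]
```

With this history, the last three revisions gained only 0.007 over the earlier best of 0.80, so the loop would stop here.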

That's it, your prompt is ready for action.
