Nova Micro
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Amazon
-
Max Context Tokens
128000
-
Max Generated Tokens
5000
-
Data Freshness
October 2023
$0.14 Per Million Generated Tokens
Nova Micro
Now Serving
-
Version
nova_micro
-
Description
Amazon Nova Micro is a text-only model that delivers the lowest-latency responses at very low cost. It is highly performant at language understanding, translation, reasoning, code completion, brainstorming, and mathematical problem-solving. With its generation speed of over 200 tokens per second, Amazon Nova Micro is ideal for applications that require fast responses. Amazon Nova is a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance, available exclusively on Amazon Bedrock.
$0.035 Per Million Context Tokens
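The per-million-token rates listed for each model translate directly into per-request costs. A minimal sketch of that arithmetic, using Nova Micro's listed rates ($0.035 context, $0.14 generated); the `estimate_cost` helper and the example token counts are illustrative, not part of any provider API:

```python
def estimate_cost(context_tokens: int, generated_tokens: int,
                  context_rate: float, generated_rate: float) -> float:
    """Estimate request cost in USD from per-million-token rates."""
    return (context_tokens * context_rate +
            generated_tokens * generated_rate) / 1_000_000

# Nova Micro rates from the listing: $0.035 / $0.14 per million tokens.
cost = estimate_cost(context_tokens=10_000, generated_tokens=1_000,
                     context_rate=0.035, generated_rate=0.14)
print(f"${cost:.5f}")  # prints $0.00049
```

The same helper works for any model below by substituting its two listed rates.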
Nova Lite
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Amazon
-
Max Context Tokens
300000
-
Max Generated Tokens
5000
-
Data Freshness
October 2023
$0.24 Per Million Generated Tokens
Nova Lite
Now Serving
-
Version
nova_lite
-
Description
Amazon Nova Lite is a very low-cost multimodal model that is lightning fast for processing image, video, and text inputs. Amazon Nova Lite’s accuracy across a breadth of tasks, coupled with its lightning-fast speed, makes it suitable for a wide range of interactive and high-volume applications where cost is a key consideration. Amazon Nova is a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance, available exclusively on Amazon Bedrock.
$0.06 Per Million Context Tokens
Nova Pro
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Amazon
-
Max Context Tokens
300000
-
Max Generated Tokens
5000
-
Data Freshness
October 2023
$4.0 Per Million Generated Tokens
Nova Pro
Now Serving
-
Version
nova_pro
-
Description
Amazon Nova Pro is a highly capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Pro’s capabilities, coupled with its industry-leading speed and cost efficiency, make it a compelling model for almost any task, including video summarization, Q&A, mathematical reasoning, software development, and AI agents that can execute multi-step workflows. In addition to state-of-the-art accuracy on text and visual intelligence benchmarks, Amazon Nova Pro excels at instruction following and agentic workflows as measured by the Comprehensive RAG Benchmark (CRAG), the Berkeley Function Calling Leaderboard, and Mind2Web. Amazon Nova is a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance, available exclusively on Amazon Bedrock.
$1.0 Per Million Context Tokens
Nova Premier
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Amazon
-
Max Context Tokens
1000000
-
Max Generated Tokens
5000
-
Data Freshness
October 2023
$12.5 Per Million Generated Tokens
Nova Premier
Now Serving
-
Version
nova_premier
-
Description
Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models. Amazon Nova is a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance, available exclusively on Amazon Bedrock.
$2.5 Per Million Context Tokens
Reasoner-R1
Now Serving
-
Provider
DeepSeek
-
Originator
DeepSeek
-
Max Context Tokens
64000
-
Max Generated Tokens
64000
-
Data Freshness
Unpublished
$2.19 Per Million Generated Tokens
Reasoner-R1
Now Serving
-
Version
deepseek_reasoner
-
Description
We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models. DeepSeek-V3 achieves a significant breakthrough in inference speed over previous models. It tops the leaderboard among open-source models and rivals the most advanced closed-source models globally.
$0.55 Per Million Context Tokens
Chat-V3
Now Serving
-
Provider
DeepSeek
-
Originator
DeepSeek
-
Max Context Tokens
64000
-
Max Generated Tokens
8192
-
Data Freshness
Unpublished
$1.1 Per Million Generated Tokens
Chat-V3
Now Serving
-
Version
deepseek_chat
-
Description
We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens, followed by Supervised Fine-Tuning and Reinforcement Learning stages to fully harness its capabilities. Comprehensive evaluations reveal that DeepSeek-V3 outperforms other open-source models and achieves performance comparable to leading closed-source models. Despite its excellent performance, DeepSeek-V3 requires only 2.788M H800 GPU hours for its full training. In addition, its training process is remarkably stable. Throughout the entire training process, we did not experience any irrecoverable loss spikes or perform any rollbacks. DeepSeek-V3 achieves a significant breakthrough in inference speed over previous models. It tops the leaderboard among open-source models and rivals the most advanced closed-source models globally.
$0.27 Per Million Context Tokens
GPT-3.5 Turbo
Deprecated
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
16385
-
Max Generated Tokens
4096
-
Data Freshness
Sep 2021
$1.5 Per Million Generated Tokens
GPT-3.5 Turbo
Deprecated
-
Version
gpt_3_5_turbo
-
Description
The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats and a fix for a bug which caused a text encoding issue for non-English language function calls. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$0.5 Per Million Context Tokens
GPT-4
Deprecated
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
8192
-
Max Generated Tokens
8192
-
Data Freshness
Sep 2021
$60.0 Per Million Generated Tokens
GPT-4
Deprecated
-
Version
gpt_4
-
Description
Snapshot of gpt-4 from June 13th, 2023, with improved tool-calling support. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$30.0 Per Million Context Tokens
GPT-4.1
Now Serving
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
1047576
-
Max Generated Tokens
32768
-
Data Freshness
May 2024
$8.0 Per Million Generated Tokens
GPT-4.1
Now Serving
-
Version
gpt_4_1
-
Description
GPT-4.1 is our flagship model for complex tasks. It is well suited for problem solving across domains. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$2.0 Per Million Context Tokens
GPT-4o
Now Serving
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
128000
-
Max Generated Tokens
16384
-
Data Freshness
Sep 2023
$10.0 Per Million Generated Tokens
GPT-4o
Now Serving
-
Version
gpt_4o
-
Description
GPT-4o ('o' for 'omni') is our versatile, high-intelligence flagship model. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is the best model for most tasks, and is our most capable model outside of our o-series models. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$2.5 Per Million Context Tokens
GPT-4o Mini
Now Serving
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
128000
-
Max Generated Tokens
16384
-
Data Freshness
Sep 2023
$0.6 Per Million Generated Tokens
GPT-4o Mini
Now Serving
-
Version
gpt_4o_mini
-
Description
GPT-4o mini ('o' for 'omni') is a fast, affordable small model for focused tasks. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is ideal for fine-tuning, and model outputs from a larger model like GPT-4o can be distilled to GPT-4o-mini to produce similar results at lower cost and latency. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$0.15 Per Million Context Tokens
o1 Mini
Deprecated
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
128000
-
Max Generated Tokens
65536
-
Data Freshness
Sep 2023
$12.0 Per Million Generated Tokens
o1 Mini
Deprecated
-
Version
gpt_o1_mini
-
Description
The o1 reasoning model is designed to solve hard problems across domains. o1-mini is a faster and more affordable reasoning model, but we recommend using the newer o3-mini model that features higher intelligence at the same latency and price as o1-mini. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$3.0 Per Million Context Tokens
o1
Now Serving
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
200000
-
Max Generated Tokens
100000
-
Data Freshness
Sep 2023
$60.0 Per Million Generated Tokens
o1
Now Serving
-
Version
o1
-
Description
The o1 series of models are trained with reinforcement learning to perform complex reasoning. o1 models think before they answer, producing a long internal chain of thought before responding to the user. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$15.0 Per Million Context Tokens
o1 Pro
Now Serving
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
200000
-
Max Generated Tokens
100000
-
Data Freshness
Sep 2023
$600.0 Per Million Generated Tokens
o1 Pro
Now Serving
-
Version
o1_pro
-
Description
The o1 series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o1-pro model uses more compute to think harder and provide consistently better answers. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$150.0 Per Million Context Tokens
o3 Mini
Now Serving
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
200000
-
Max Generated Tokens
100000
-
Data Freshness
Sep 2023
$4.4 Per Million Generated Tokens
o3 Mini
Now Serving
-
Version
o3_mini
-
Description
o3-mini is our newest small reasoning model, providing high intelligence at the same cost and latency targets as o1-mini. o3-mini supports key developer features, like Structured Outputs, function calling, and the Batch API. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$1.1 Per Million Context Tokens
o3
Preview Only
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
200000
-
Max Generated Tokens
100000
-
Data Freshness
May 2024
$8.0 Per Million Generated Tokens
o3
Preview Only
-
Version
o3
-
Description
o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. It also excels at technical writing and instruction-following. Use it to think through multi-step problems that involve analysis across text, code, and images. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$2.0 Per Million Context Tokens
o3-Pro
Preview Only
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
200000
-
Max Generated Tokens
100000
-
Data Freshness
May 2024
$80.0 Per Million Generated Tokens
o3-Pro
Preview Only
-
Version
o3_pro
-
Description
The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently better answers. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$20.0 Per Million Context Tokens
o4-Mini
Now Serving
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
200000
-
Max Generated Tokens
100000
-
Data Freshness
May 2024
$2.2 Per Million Generated Tokens
o4-Mini
Now Serving
-
Version
o4_mini
-
Description
o4-mini is our latest small o-series model. It's optimized for fast, effective reasoning with exceptionally efficient performance in coding and visual tasks. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$0.55 Per Million Context Tokens
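With several models' rates in hand, picking the cheapest option for an output-heavy workload is a one-liner. The price table below copies the per-million generated-token figures from the o-series listings above; the variable names are illustrative:

```python
# Per-million generated-token prices copied from the listings above (USD).
generated_price = {
    "o1": 60.0,
    "o3_mini": 4.4,
    "o3": 8.0,
    "o4_mini": 2.2,
}

# Cheapest model by generated-token price.
cheapest = min(generated_price, key=generated_price.get)
print(cheapest)  # prints o4_mini
```

For context-heavy workloads, the same comparison should be run on the context-token rates instead, since the rankings can differ.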
Mistral Small
Now Serving
-
Provider
MistralAI
-
Originator
MistralAI
-
Max Context Tokens
128000
-
Max Generated Tokens
8192
-
Data Freshness
Unpublished
$0.3 Per Million Generated Tokens
Mistral Small
Now Serving
-
Version
mistral_small
-
Description
This new model comes with improved text performance, multimodal understanding, and an expanded context window of up to 128k tokens. The model outperforms comparable models like Gemma 3 and GPT-4o Mini, while delivering inference speeds of 150 tokens per second. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customizable solutions, and an extreme focus on shipping the most advanced technology in limited time.
$0.1 Per Million Context Tokens
Mixtral 8x22B
Deprecated
-
Provider
MistralAI
-
Originator
MistralAI
-
Max Context Tokens
64000
-
Max Generated Tokens
8192
-
Data Freshness
Unpublished
$6.0 Per Million Generated Tokens
Mixtral 8x22B
Deprecated
-
Version
mixtral_8_22b
-
Description
A bigger sparse mixture-of-experts model. As such, it leverages up to 141B parameters but only uses about 39B during inference, leading to better inference throughput at the cost of more vRAM. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customizable solutions, and an extreme focus on shipping the most advanced technology in limited time.
$2.0 Per Million Context Tokens
Mistral Medium
Now Serving
-
Provider
MistralAI
-
Originator
MistralAI
-
Max Context Tokens
128000
-
Max Generated Tokens
8192
-
Data Freshness
Unpublished
$2.0 Per Million Generated Tokens
Mistral Medium
Now Serving
-
Version
mistral_medium
-
Description
Mistral Medium 3 delivers frontier performance while being an order of magnitude less expensive. For instance, the model performs at or above 90% of Claude Sonnet 3.7 on benchmarks across the board at a significantly lower cost. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customizable solutions, and an extreme focus on shipping the most advanced technology in limited time.
$0.4 Per Million Context Tokens
Mistral Large
Now Serving
-
Provider
MistralAI
-
Originator
MistralAI
-
Max Context Tokens
128000
-
Max Generated Tokens
8192
-
Data Freshness
Unpublished
$6.0 Per Million Generated Tokens
Mistral Large
Now Serving
-
Version
mistral_large
-
Description
Pixtral Large is a 124B open-weights multimodal model built on top of Mistral Large 2. Pixtral Large is the second model in our multimodal family and demonstrates frontier-level image understanding. In particular, the model is able to understand documents, charts, and natural images, while maintaining the leading text-only understanding of Mistral Large 2. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customizable solutions, and an extreme focus on shipping the most advanced technology in limited time.
$2.0 Per Million Context Tokens
Mistral Nemo
Deprecated
-
Provider
MistralAI
-
Originator
MistralAI
-
Max Context Tokens
128000
-
Max Generated Tokens
8192
-
Data Freshness
Unpublished
$0.15 Per Million Generated Tokens
Mistral Nemo
Deprecated
-
Version
mistral_nemo
-
Description
A 12B model built in partnership with Nvidia. It is easy to use and a drop-in replacement for any system using Mistral 7B, which it supersedes. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customizable solutions, and an extreme focus on shipping the most advanced technology in limited time.
$0.15 Per Million Context Tokens
Magistral Small
Now Serving
-
Provider
MistralAI
-
Originator
MistralAI
-
Max Context Tokens
40000
-
Max Generated Tokens
8192
-
Data Freshness
Unpublished
$1.5 Per Million Generated Tokens
Magistral Small
Now Serving
-
Version
magistral_small
-
Description
Magistral is our first reasoning model. Released in both open and enterprise versions, Magistral is designed to think things through, in ways familiar to us, while bringing expertise across professional domains, transparent reasoning that you can follow and verify, and deep multilingual flexibility. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customizable solutions, and an extreme focus on shipping the most advanced technology in limited time.
$0.5 Per Million Context Tokens
Magistral Medium
Now Serving
-
Provider
MistralAI
-
Originator
MistralAI
-
Max Context Tokens
40000
-
Max Generated Tokens
8192
-
Data Freshness
Unpublished
$5.0 Per Million Generated Tokens
Magistral Medium
Now Serving
-
Version
magistral_medium
-
Description
Magistral is our first reasoning model. Released in both open and enterprise versions, Magistral is designed to think things through, in ways familiar to us, while bringing expertise across professional domains, transparent reasoning that you can follow and verify, and deep multilingual flexibility. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customizable solutions, and an extreme focus on shipping the most advanced technology in limited time.
$2.0 Per Million Context Tokens
Gemini 1.0 Pro
Deprecated
-
Provider
Google VertexAI
-
Originator
Google
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
Feb 2024
$1.5 Per Million Generated Tokens
Gemini 1.0 Pro
Deprecated
-
Version
gemini_1_0_pro
-
Description
Gemini 1.0 Pro is an NLP model that handles tasks like multi-turn text and code chat, and code generation. Gemini is an interface to a multimodal LLM (handling text, audio, images and more). Gemini is based on Google’s cutting-edge research in LLMs. Google initially launched Gemini (then called Bard) as an experiment in March 2023 in accordance with their AI Principles. Since then, users have turned to Gemini to write compelling emails, debug tricky coding problems, brainstorm ideas for upcoming events, get help learning difficult concepts, and so much more. Today, Gemini is a versatile AI tool that can help you in many ways.
$0.5 Per Million Context Tokens
Gemini 1.5 Pro
Now Serving
-
Provider
Google VertexAI
-
Originator
Google
-
Max Context Tokens
2097152
-
Max Generated Tokens
8192
-
Data Freshness
May 2024
$10.0 Per Million Generated Tokens
Gemini 1.5 Pro
Now Serving
-
Version
gemini_1_5_pro
-
Description
Gemini 1.5 Pro is a mid-size multimodal model that is optimized for a wide range of reasoning tasks. 1.5 Pro can process large amounts of data at once, including 2 hours of video, 19 hours of audio, codebases with 60,000 lines of code, or 2,000 pages of text. Gemini is an interface to a multimodal LLM (handling text, audio, images and more). Gemini is based on Google’s cutting-edge research in LLMs. Google initially launched Gemini (then called Bard) as an experiment in March 2023 in accordance with their AI Principles. Since then, users have turned to Gemini to write compelling emails, debug tricky coding problems, brainstorm ideas for upcoming events, get help learning difficult concepts, and so much more. Today, Gemini is a versatile AI tool that can help you in many ways.
$2.5 Per Million Context Tokens
Gemini 2.5 Flash
Now Serving
-
Provider
Google VertexAI
-
Originator
Google
-
Max Context Tokens
1048576
-
Max Generated Tokens
65536
-
Data Freshness
January 2025
$2.5 Per Million Generated Tokens
Gemini 2.5 Flash
Now Serving
-
Version
gemini_2_5_flash
-
Description
Google's best model in terms of price-performance, offering well-rounded capabilities. 2.5 Flash is best for large-scale processing, low-latency, high-volume tasks that require thinking, and agentic use cases. Gemini is an interface to a multimodal LLM (handling text, audio, images and more). Gemini is based on Google’s cutting-edge research in LLMs. Google initially launched Gemini (then called Bard) as an experiment in March 2023 in accordance with their AI Principles. Since then, users have turned to Gemini to write compelling emails, debug tricky coding problems, brainstorm ideas for upcoming events, get help learning difficult concepts, and so much more. Today, Gemini is a versatile AI tool that can help you in many ways.
$1.0 Per Million Context Tokens
Gemini 2.5 Flash Lite
Preview Only
-
Provider
Google VertexAI
-
Originator
Google
-
Max Context Tokens
1000000
-
Max Generated Tokens
64000
-
Data Freshness
January 2025
$0.4 Per Million Generated Tokens
Gemini 2.5 Flash Lite
Preview Only
-
Version
gemini_2_5_flash_lite
-
Description
A Gemini 2.5 Flash model optimized for cost efficiency and low latency. Gemini is an interface to a multimodal LLM (handling text, audio, images and more). Gemini is based on Google’s cutting-edge research in LLMs. Google initially launched Gemini (then called Bard) as an experiment in March 2023 in accordance with their AI Principles. Since then, users have turned to Gemini to write compelling emails, debug tricky coding problems, brainstorm ideas for upcoming events, get help learning difficult concepts, and so much more. Today, Gemini is a versatile AI tool that can help you in many ways.
$0.5 Per Million Context Tokens
Gemini 2.5 Pro
Now Serving
-
Provider
Google VertexAI
-
Originator
Google
-
Max Context Tokens
1048576
-
Max Generated Tokens
65536
-
Data Freshness
January 2025
$15.0 Per Million Generated Tokens
Gemini 2.5 Pro
Now Serving
-
Version
gemini_2_5_pro
-
Description
Gemini 2.5 Pro is Google's state-of-the-art thinking model, capable of reasoning over complex problems in code, math, and STEM, as well as analyzing large datasets, codebases, and documents using long context. Gemini is an interface to a multimodal LLM (handling text, audio, images and more). Gemini is based on Google’s cutting-edge research in LLMs. Google initially launched Gemini (then called Bard) as an experiment in March 2023 in accordance with their AI Principles. Since then, users have turned to Gemini to write compelling emails, debug tricky coding problems, brainstorm ideas for upcoming events, get help learning difficult concepts, and so much more. Today, Gemini is a versatile AI tool that can help you in many ways.
$2.5 Per Million Context Tokens
Claude 3 Haiku
Deprecated
-
Provider
Anthropic
-
Originator
Anthropic
-
Max Context Tokens
200000
-
Max Generated Tokens
4096
-
Data Freshness
August 2023
$1.25 Per Million Generated Tokens
Claude 3 Haiku
Deprecated
-
Version
claude_v3_haiku
-
Description
Fastest and most compact model for near-instant responsiveness, with quick and accurate targeted performance. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. Anthropic believes AI will have a vast impact on the world and is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.
$0.25 Per Million Context Tokens
Claude 3 Sonnet
Deprecated
-
Provider
Anthropic
-
Originator
Anthropic
-
Max Context Tokens
200000
-
Max Generated Tokens
4096
-
Data Freshness
August 2023
$15.0 Per Million Generated Tokens
Claude 3 Sonnet
Deprecated
-
Version
claude_v3_sonnet
-
Description
Balance of intelligence and speed, with strong utility, balanced for scaled deployments. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. Anthropic believes AI will have a vast impact on the world and is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.
$3.0 Per Million Context Tokens
Claude 3 Opus
Now Serving
-
Provider
Anthropic
-
Originator
Anthropic
-
Max Context Tokens
200000
-
Max Generated Tokens
4096
-
Data Freshness
August 2023
$75.0 Per Million Generated Tokens
Claude 3 Opus
Now Serving
-
Version
claude_v3_opus
-
Description
Powerful model for highly complex tasks, with top-level performance, intelligence, fluency, and understanding. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. Anthropic believes AI will have a vast impact on the world and is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.
$15.0 Per Million Context Tokens
Claude 3.5 Haiku
Now Serving
-
Provider
Anthropic
-
Originator
Anthropic
-
Max Context Tokens
200000
-
Max Generated Tokens
8192
-
Data Freshness
July 2024
$4.0 Per Million Generated Tokens
Claude 3.5 Haiku
Now Serving
-
Version
claude_v3_5_haiku
-
Description
Anthropic's fastest model, delivering intelligence at blazing speeds. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. Anthropic believes AI will have a vast impact on the world and is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.
$0.8 Per Million Context Tokens
Claude 3.5 Sonnet
Now Serving
-
Provider
Anthropic
-
Originator
Anthropic
-
Max Context Tokens
200000
-
Max Generated Tokens
8192
-
Data Freshness
April 2024
$15.0 Per Million Generated Tokens
Claude 3.5 Sonnet
Now Serving
-
Version
claude_v3_5_sonnet
-
Description
Anthropic's most intelligent model, offering the highest level of intelligence and capability. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. Anthropic believes AI will have a vast impact on the world and is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.
$3.0 Per Million Context Tokens
Claude 4 Opus
Now Serving
-
Provider
Anthropic
-
Originator
Anthropic
-
Max Context Tokens
200000
-
Max Generated Tokens
32000
-
Data Freshness
March 2025
$75.0 Per Million Generated Tokens
Claude 4 Opus
Now Serving
-
Version
claude_v4_opus
-
Description
Anthropic's most capable model, with the highest level of intelligence and capability. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. Anthropic believes AI will have a vast impact on the world and is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.
$15.0 Per Million Context Tokens
Claude 4 Sonnet
Now Serving
-
Provider
Anthropic
-
Originator
Anthropic
-
Max Context Tokens
200000
-
Max Generated Tokens
64000
-
Data Freshness
March 2025
$15.0 Per Million Generated Tokens
Claude 4 Sonnet
Now Serving
-
Version
claude_v4_sonnet
-
Description
Anthropic's high-performance model, with high intelligence and balanced performance. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. Anthropic believes AI will have a vast impact on the world and is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.
$3.0 Per Million Context Tokens
Context token cost by provider
Llama 3 8B
Deprecated
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
8192
-
Max Generated Tokens
8192
-
Data Freshness
March 2023
$0.6 Per Million Generated Tokens
Generated token cost by provider
Llama 3 8B
Deprecated
-
Version
llama3_8b_instruct
-
Description
Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. Llama 3 uses a tokenizer with a vocabulary of 128K tokens and was trained on sequences of 8,192 tokens. Grouped-Query Attention (GQA) is used for all models to improve inference efficiency. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open source AI model you can fine-tune, distill and deploy anywhere.
$0.3 Per Million Context Tokens
Context token cost by provider
Llama 3 70B
Deprecated
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
8192
-
Max Generated Tokens
8192
-
Data Freshness
December 2023
$3.5 Per Million Generated Tokens
Generated token cost by provider
Llama 3 70B
Deprecated
-
Version
llama3_70b_instruct
-
Description
Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. Llama 3 uses a tokenizer with a vocabulary of 128K tokens and was trained on sequences of 8,192 tokens. Grouped-Query Attention (GQA) is used for all models to improve inference efficiency. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open source AI model you can fine-tune, distill and deploy anywhere.
$2.65 Per Million Context Tokens
Context token cost by provider
Llama 3.1 8B
Deprecated
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
December 2023
$0.22 Per Million Generated Tokens
Generated token cost by provider
Llama 3.1 8B
Deprecated
-
Version
llama3_1_8b_instruct
-
Description
Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open source AI model you can fine-tune, distill and deploy anywhere. Our latest instruction-tuned model is available in 8B, 70B and 405B versions.
$0.22 Per Million Context Tokens
Context token cost by provider
Llama 3.1 70B
Deprecated
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
December 2023
$0.9 Per Million Generated Tokens
Generated token cost by provider
Llama 3.1 70B
Deprecated
-
Version
llama3_1_70b_instruct
-
Description
Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open source AI model you can fine-tune, distill and deploy anywhere. Our latest instruction-tuned model is available in 8B, 70B and 405B versions.
$0.9 Per Million Context Tokens
Context token cost by provider
Llama 3.1 405B
Coming Soon
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
December 2023
$3.0 Per Million Generated Tokens
Generated token cost by provider
Llama 3.1 405B
Coming Soon
-
Version
llama3_1_405b_instruct
-
Description
Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open source AI model you can fine-tune, distill and deploy anywhere. Our latest instruction-tuned model is available in 8B, 70B and 405B versions.
$3.0 Per Million Context Tokens
Context token cost by provider
Llama 3.2 11B
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
December 2023
$0.16 Per Million Generated Tokens
Generated token cost by provider
Llama 3.2 11B
Now Serving
-
Version
llama3_2_11b_instruct
-
Description
The Llama 3.2-Vision collection of multimodal large language models (LLMs) comprises pretrained and instruction-tuned image reasoning generative models in 11B and 90B sizes (text + images in / text out). The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The models outperform many of the available open source and closed multimodal models on common industry benchmarks. The open source AI model you can fine-tune, distill and deploy anywhere.
$0.16 Per Million Context Tokens
Context token cost by provider
Llama 3.2 90B
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
December 2023
$0.72 Per Million Generated Tokens
Generated token cost by provider
Llama 3.2 90B
Now Serving
-
Version
llama3_2_90b_instruct
-
Description
The Llama 3.2-Vision collection of multimodal large language models (LLMs) comprises pretrained and instruction-tuned image reasoning generative models in 11B and 90B sizes (text + images in / text out). The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The models outperform many of the available open source and closed multimodal models on common industry benchmarks. The open source AI model you can fine-tune, distill and deploy anywhere.
$0.72 Per Million Context Tokens
Context token cost by provider
Llama 3.3 70B
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
December 2023
$0.72 Per Million Generated Tokens
Generated token cost by provider
Llama 3.3 70B
Now Serving
-
Version
llama3_3_70b_instruct
-
Description
Text-only 70B instruction-tuned model that provides enhanced performance relative to Llama 3.1 70B, and to Llama 3.2 90B when used for text-only applications. Llama 3.3 70B delivers similar performance to Llama 3.1 405B while requiring only a fraction of the computational resources. The open source AI model you can fine-tune, distill and deploy anywhere.
$0.72 Per Million Context Tokens
Context token cost by provider
Llama 4 Maverick
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
1000000
-
Max Generated Tokens
4096
-
Data Freshness
August 2024
$0.97 Per Million Generated Tokens
Generated token cost by provider
Llama 4 Maverick
Now Serving
-
Version
llama4_maverick
-
Description
A general-purpose model featuring 128 experts and 400 billion total parameters. It excels in text understanding across 12 languages and in English image understanding, making it suitable for versatile assistant and chat applications. The open source AI model you can fine-tune, distill and deploy anywhere.
$0.24 Per Million Context Tokens
Context token cost by provider
Llama 4 Scout
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
3500000
-
Max Generated Tokens
4096
-
Data Freshness
August 2024
$0.66 Per Million Generated Tokens
Generated token cost by provider
Llama 4 Scout
Now Serving
-
Version
llama4_scout
-
Description
A general-purpose multimodal model with 16 experts, 17 billion active parameters, and 109 billion total parameters. Its multimillion-token context window enables comprehensive multi-document analysis, establishing it as a uniquely powerful and efficient model in its class. The open source AI model you can fine-tune, distill and deploy anywhere.
$0.17 Per Million Context Tokens
Context token cost by provider
Command R
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Cohere
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
Unpublished
$1.5 Per Million Generated Tokens
Generated token cost by provider
Command R
Now Serving
-
Version
command_r
-
Description
Command R is a large language model optimized for conversational interaction and long-context tasks. It targets the "scalable" category of models that balance high performance with strong accuracy, enabling companies to move beyond proof of concept and into production. Cohere empowers every developer and enterprise to build amazing products and capture true business value with language AI. Driven by cutting-edge research, Cohere is pioneering the future of language AI for business.
$0.5 Per Million Context Tokens
Context token cost by provider
Command R+
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Cohere
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
Unpublished
$15.0 Per Million Generated Tokens
Generated token cost by provider
Command R+
Now Serving
-
Version
command_r_plus
-
Description
Command R+ is a state-of-the-art RAG-optimized model designed to tackle enterprise-grade workloads. It is Cohere's most powerful, scalable large language model (LLM), purpose-built to excel at real-world enterprise use cases. Command R+ joins Cohere's R-series of LLMs focused on balancing high efficiency with strong accuracy, enabling businesses to move beyond proof of concept and into production with AI. Cohere empowers every developer and enterprise to build amazing products and capture true business value with language AI. Driven by cutting-edge research, Cohere is pioneering the future of language AI for business.
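The per-million-token prices listed throughout this catalog combine into a per-request cost with simple arithmetic. A minimal sketch, assuming the Claude 4 Sonnet rates quoted above ($3.0 per million context tokens, $15.0 per million generated tokens); the helper function is illustrative, not part of any provider API:

```python
def request_cost(context_tokens, generated_tokens,
                 context_price_per_m, generated_price_per_m):
    """Dollar cost of one request, given per-million-token prices."""
    return (context_tokens / 1_000_000 * context_price_per_m
            + generated_tokens / 1_000_000 * generated_price_per_m)

# Claude 4 Sonnet rates from this page: $3.0 context / $15.0 generated per million.
cost = request_cost(10_000, 2_000, 3.0, 15.0)
print(f"${cost:.2f}")  # 10k context + 2k generated -> $0.06
```

Swapping in any other card's two rates gives that model's cost for the same traffic, which makes the price columns directly comparable.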