Model Catalog

Browse the catalog of large language models below, including their capabilities, knowledge freshness, pricing, and other technical details.

Nova Micro

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Amazon

  • Max Context Tokens

    128000

  • Max Generated Tokens

    5000

  • Data Freshness

    October 2023

$0.14 Per Million Generated Tokens

Generated token cost by provider

Nova Micro

Now Serving

  • Version

    nova_micro

  • Description

    Amazon Nova Micro is a text-only model that delivers the lowest-latency responses at very low cost. It is highly performant at language understanding, translation, reasoning, code completion, brainstorming, and mathematical problem-solving. With its generation speed of over 200 tokens per second, Amazon Nova Micro is ideal for applications that require fast responses. Amazon Nova is a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance, available exclusively on Amazon Bedrock.

$0.035 Per Million Context Tokens

Context token cost by provider
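As a quick check on how the pricing above works, the sketch below estimates the cost of a single request from a model's two per-million-token rates. The rates are the Nova Micro figures listed above; the function and constant names are illustrative only and not part of any provider SDK.

    # Minimal cost estimate from per-million-token rates (Nova Micro figures above).
    CONTEXT_RATE_PER_MILLION = 0.035    # USD per million context (input) tokens
    GENERATED_RATE_PER_MILLION = 0.14   # USD per million generated (output) tokens

    def estimate_request_cost(context_tokens: int, generated_tokens: int,
                              context_rate: float = CONTEXT_RATE_PER_MILLION,
                              generated_rate: float = GENERATED_RATE_PER_MILLION) -> float:
        """Return the estimated USD cost of one request."""
        return (context_tokens / 1_000_000) * context_rate \
            + (generated_tokens / 1_000_000) * generated_rate

    # Example: a 10,000-token prompt with a 1,000-token completion on Nova Micro
    # costs roughly $0.00049.
    print(f"${estimate_request_cost(10_000, 1_000):.6f}")

The same formula applies to every entry in the catalog; only the two rates differ per model.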

Nova Lite

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Amazon

  • Max Context Tokens

    300000

  • Max Generated Tokens

    5000

  • Data Freshness

    October 2023

$0.24 Per Million Generated Tokens

Generated token cost by provider

Nova Lite

Now Serving

  • Version

    nova_lite

  • Description

    Amazon Nova Lite is a very low-cost multimodal model that is lightning fast for processing image, video, and text inputs. Amazon Nova Lite’s accuracy across a breadth of tasks, coupled with its lightning-fast speed, makes it suitable for a wide range of interactive and high-volume applications where cost is a key consideration. Amazon Nova is a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance, available exclusively on Amazon Bedrock.

$0.06 Per Million Context Tokens

Context token cost by provider

Nova Pro

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Amazon

  • Max Context Tokens

    300000

  • Max Generated Tokens

    5000

  • Data Freshness

    October 2023

$4.0 Per Million Generated Tokens

Generated token cost by provider

Nova Pro

Now Serving

  • Version

    nova_pro

  • Description

    Amazon Nova Pro is a highly capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Pro’s capabilities, coupled with its industry-leading speed and cost efficiency, make it a compelling model for almost any task, including video summarization, Q&A, mathematical reasoning, software development, and AI agents that can execute multi-step workflows. In addition to state-of-the-art accuracy on text and visual intelligence benchmarks, Amazon Nova Pro excels at instruction following and agentic workflows as measured by Comprehensive RAG Benchmark (CRAG), the Berkeley Function Calling Leaderboard, and Mind2Web. Amazon Nova is a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance, available exclusively on Amazon Bedrock.

$1.0 Per Million Context Tokens

Context token cost by provider

Nova Premier

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Amazon

  • Max Context Tokens

    1000000

  • Max Generated Tokens

    5000

  • Data Freshness

    October 2023

$12.5 Per Million Generated Tokens

Generated token cost by provider

Nova Premier

Now Serving

  • Version

    nova_premier

  • Description

    Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models. Amazon Nova is a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance, available exclusively on Amazon Bedrock.

$2.5 Per Million Context Tokens

Context token cost by provider

Reasoner-R1

Now Serving

  • Provider

    DeepSeek

  • Originator

    DeepSeek

  • Max Context Tokens

    64000

  • Max Generated Tokens

    64000

  • Data Freshness

    Unpublished

$2.19 Per Million Generated Tokens

Generated token cost by provider

Reasoner-R1

Now Serving

  • Version

    deepseek_reasoner

  • Description

    We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable reasoning performance. Through RL, DeepSeek-R1-Zero naturally developed numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models. DeepSeek-V3 achieves a significant breakthrough in inference speed over previous models. It tops the leaderboard among open-source models and rivals the most advanced closed-source models globally.

$0.55 Per Million Context Tokens

Context token cost by provider

Chat-V3

Now Serving

  • Provider

    DeepSeek

  • Originator

    DeepSeek

  • Max Context Tokens

    64000

  • Max Generated Tokens

    8192

  • Data Freshness

    Unpublished

$1.1 Per Million Generated Tokens

Generated token cost by provider

Chat-V3

Now Serving

  • Version

    deepseek_chat

  • Description

    We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens, followed by Supervised Fine-Tuning and Reinforcement Learning stages to fully harness its capabilities. Comprehensive evaluations reveal that DeepSeek-V3 outperforms other open-source models and achieves performance comparable to leading closed-source models. Despite its excellent performance, DeepSeek-V3 requires only 2.788M H800 GPU hours for its full training. In addition, its training process is remarkably stable. Throughout the entire training process, we did not experience any irrecoverable loss spikes or perform any rollbacks. DeepSeek-V3 achieves a significant breakthrough in inference speed over previous models. It tops the leaderboard among open-source models and rivals the most advanced closed-source models globally.

$0.27 Per Million Context Tokens

Context token cost by provider

GPT-3.5 Turbo

Deprecated

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    16385

  • Max Generated Tokens

    4096

  • Data Freshness

    September 2021

$1.5 Per Million Generated Tokens

Generated token cost by provider

GPT-3.5 Turbo

Deprecated

  • Version

    gpt_3_5_turbo

  • Description

    The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats and a fix for a bug which caused a text encoding issue for non-English language function calls. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$0.5 Per Million Context Tokens

Context token cost by provider

GPT-4

Deprecated

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    8192

  • Max Generated Tokens

    8192

  • Data Freshness

    September 2021

$60.0 Per Million Generated Tokens

Generated token cost by provider

GPT-4

Deprecated

  • Version

    gpt_4

  • Description

    Snapshot of gpt-4 from June 13th 2023 with improved tool calling support. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$30.0 Per Million Context Tokens

Context token cost by provider

GPT-4.1

Now Serving

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    1047576

  • Max Generated Tokens

    32768

  • Data Freshness

    May 2024

$8.0 Per Million Generated Tokens

Generated token cost by provider

GPT-4.1

Now Serving

  • Version

    gpt_4_1

  • Description

    GPT-4.1 is our flagship model for complex tasks. It is well suited for problem solving across domains. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$2.0 Per Million Context Tokens

Context token cost by provider

GPT-4o

Now Serving

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    128000

  • Max Generated Tokens

    16384

  • Data Freshness

    September 2023

$10.0 Per Million Generated Tokens

Generated token cost by provider

GPT-4o

Now Serving

  • Version

    gpt_4o

  • Description

    GPT-4o ('o' for 'omni') is our versatile, high-intelligence flagship model. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is the best model for most tasks, and is our most capable model outside of our o-series models. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$2.5 Per Million Context Tokens

Context token cost by provider

4o Mini

Now Serving

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    128000

  • Max Generated Tokens

    16384

  • Data Freshness

    September 2023

$0.6 Per Million Generated Tokens

Generated token cost by provider

4o Mini

Now Serving

  • Version

    gpt_4o_mini

  • Description

    GPT-4o mini ('o' for 'omni') is a fast, affordable small model for focused tasks. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is ideal for fine-tuning, and model outputs from a larger model like GPT-4o can be distilled to GPT-4o-mini to produce similar results at lower cost and latency. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$0.15 Per Million Context Tokens

Context token cost by provider

o1 Mini

Deprecated

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    128000

  • Max Generated Tokens

    65536

  • Data Freshness

    September 2023

$12.0 Per Million Generated Tokens

Generated token cost by provider

o1 Mini

Deprecated

  • Version

    gpt_o1_mini

  • Description

    The o1 reasoning model is designed to solve hard problems across domains. o1-mini is a faster and more affordable reasoning model, but we recommend using the newer o3-mini model that features higher intelligence at the same latency and price as o1-mini. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$3.0 Per Million Context Tokens

Context token cost by provider

o1

Now Serving

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    200000

  • Max Generated Tokens

    100000

  • Data Freshness

    September 2023

$60.0 Per Million Generated Tokens

Generated token cost by provider

o1

Now Serving

  • Version

    o1

  • Description

    The o1 series of models are trained with reinforcement learning to perform complex reasoning. o1 models think before they answer, producing a long internal chain of thought before responding to the user. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$15.0 Per Million Context Tokens

Context token cost by provider

o1 Pro

Now Serving

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    200000

  • Max Generated Tokens

    100000

  • Data Freshness

    September 2023

$600.0 Per Million Generated Tokens

Generated token cost by provider

o1 Pro

Now Serving

  • Version

    o1_pro

  • Description

    The o1 series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o1-pro model uses more compute to think harder and provide consistently better answers. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$150.0 Per Million Context Tokens

Context token cost by provider

o3 Mini

Now Serving

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    200000

  • Max Generated Tokens

    100000

  • Data Freshness

    September 2023

$4.4 Per Million Generated Tokens

Generated token cost by provider

o3 Mini

Now Serving

  • Version

    o3_mini

  • Description

    o3-mini is our newest small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini supports key developer features, like Structured Outputs, function calling, and Batch API. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$1.1 Per Million Context Tokens

Context token cost by provider

o3

Preview Only

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    200000

  • Max Generated Tokens

    100000

  • Data Freshness

    May 2024

$8.0 Per Million Generated Tokens

Generated token cost by provider

o3

Preview Only

  • Version

    o3

  • Description

    o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. It also excels at technical writing and instruction-following. Use it to think through multi-step problems that involve analysis across text, code, and images. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$2.0 Per Million Context Tokens

Context token cost by provider

o3-Pro

Preview Only

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    200000

  • Max Generated Tokens

    100000

  • Data Freshness

    May 2024

$80.0 Per Million Generated Tokens

Generated token cost by provider

o3-Pro

Preview Only

  • Version

    o3_pro

  • Description

    The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently better answers. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$20.0 Per Million Context Tokens

Context token cost by provider

o4-Mini

Now Serving

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    200000

  • Max Generated Tokens

    100000

  • Data Freshness

    May 2024

$2.2 Per Million Generated Tokens

Generated token cost by provider

o4-Mini

Now Serving

  • Version

    o4_mini

  • Description

    o4-mini is our latest small o-series model. It's optimized for fast, effective reasoning with exceptionally efficient performance in coding and visual tasks. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$0.55 Per Million Context Tokens

Context token cost by provider

Mistral Small

Now Serving

  • Provider

    MistralAI

  • Originator

    MistralAI

  • Max Context Tokens

    128000

  • Max Generated Tokens

    8192

  • Data Freshness

    Unpublished

$0.3 Per Million Generated Tokens

Generated token cost by provider

Mistral Small

Now Serving

  • Version

    mistral_small

  • Description

    This new model comes with improved text performance, multimodal understanding, and an expanded context window of up to 128k tokens. The model outperforms comparable models like Gemma 3 and GPT-4o Mini, while delivering inference speeds of 150 tokens per second. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customizable solutions, and an extreme focus on shipping the most advanced technology in limited time.

$0.1 Per Million Context Tokens

Context token cost by provider

Mixtral 8x22B

Deprecated

  • Provider

    MistralAI

  • Originator

    MistralAI

  • Max Context Tokens

    64000

  • Max Generated Tokens

    8192

  • Data Freshness

    Unpublished

$6.0 Per Million Generated Tokens

Generated token cost by provider

Mixtral 8x22B

Deprecated

  • Version

    mixtral_8_22b

  • Description

    A bigger sparse mixture-of-experts model. As such, it leverages up to 141B parameters but only uses about 39B during inference, leading to better inference throughput at the cost of more vRAM. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customizable solutions, and an extreme focus on shipping the most advanced technology in limited time.

$2.0 Per Million Context Tokens

Context token cost by provider

Mistral Medium

Now Serving

  • Provider

    MistralAI

  • Originator

    MistralAI

  • Max Context Tokens

    128000

  • Max Generated Tokens

    8192

  • Data Freshness

    Unpublished

$2.0 Per Million Generated Tokens

Generated token cost by provider

Mistral Medium

Now Serving

  • Version

    mistral_medium

  • Description

    Mistral Medium 3 delivers frontier performance while being an order of magnitude less expensive. For instance, the model performs at or above 90% of Claude Sonnet 3.7 on benchmarks across the board at a significantly lower cost. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customizable solutions, and an extreme focus on shipping the most advanced technology in limited time.

$0.4 Per Million Context Tokens

Context token cost by provider

Mistral Large

Now Serving

  • Provider

    MistralAI

  • Originator

    MistralAI

  • Max Context Tokens

    128000

  • Max Generated Tokens

    8192

  • Data Freshness

    Unpublished

$6.0 Per Million Generated Tokens

Generated token cost by provider

Mistral Large

Now Serving

  • Version

    mistral_large

  • Description

    Pixtral Large is a 124B open-weights multimodal model built on top of Mistral Large 2. It is the second model in our multimodal family and demonstrates frontier-level image understanding. In particular, the model is able to understand documents, charts and natural images, while maintaining the leading text-only understanding of Mistral Large 2. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customizable solutions, and an extreme focus on shipping the most advanced technology in limited time.

$2.0 Per Million Context Tokens

Context token cost by provider

Mistral Nemo

Deprecated

  • Provider

    MistralAI

  • Originator

    MistralAI

  • Max Context Tokens

    128000

  • Max Generated Tokens

    8192

  • Data Freshness

    Unpublished

$0.15 Per Million Generated Tokens

Generated token cost by provider

Mistral Nemo

Deprecated

  • Version

    mistral_nemo

  • Description

    A 12B model built in partnership with Nvidia. It is easy to use and a drop-in replacement in any system using Mistral 7B, which it supersedes. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customizable solutions, and an extreme focus on shipping the most advanced technology in limited time.

$0.15 Per Million Context Tokens

Context token cost by provider

Magistral Small

Now Serving

  • Provider

    MistralAI

  • Originator

    MistralAI

  • Max Context Tokens

    40000

  • Max Generated Tokens

    8192

  • Data Freshness

    Unpublished

$1.5 Per Million Generated Tokens

Generated token cost by provider

Magistral Small

Now Serving

  • Version

    magistral_small

  • Description

    Magistral is Mistral AI's first reasoning model. Released in both open and enterprise versions, it is designed to think things through, in ways familiar to us, while bringing expertise across professional domains, transparent reasoning that you can follow and verify, and deep multilingual flexibility. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customizable solutions, and an extreme focus on shipping the most advanced technology in limited time.

$0.5 Per Million Context Tokens

Context token cost by provider

Magistral Medium

Now Serving

  • Provider

    MistralAI

  • Originator

    MistralAI

  • Max Context Tokens

    40000

  • Max Generated Tokens

    8192

  • Data Freshness

    Unpublished

$5.0 Per Million Generated Tokens

Generated token cost by provider

Magistral Medium

Now Serving

  • Version

    magistral_medium

  • Description

    Magistral is Mistral AI's first reasoning model. Released in both open and enterprise versions, it is designed to think things through, in ways familiar to us, while bringing expertise across professional domains, transparent reasoning that you can follow and verify, and deep multilingual flexibility. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customizable solutions, and an extreme focus on shipping the most advanced technology in limited time.

$2.0 Per Million Context Tokens

Context token cost by provider

Gemini1.0 Pro

Deprecated

  • Provider

    Google VertexAI

  • Originator

    Google

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    February 2024

$1.5 Per Million Generated Tokens

Generated token cost by provider

Gemini1.0 Pro

Deprecated

  • Version

    gemini_1_0_pro

  • Description

    Gemini 1.0 Pro is an NLP model that handles tasks like multi-turn text and code chat, and code generation. Gemini is an interface to a multimodal LLM (handling text, audio, images and more). Gemini is based on Google’s cutting-edge research in LLMs. Google initially launched Gemini (then called Bard) as an experiment in March 2023 in accordance with their AI Principles. Since then, users have turned to Gemini to write compelling emails, debug tricky coding problems, brainstorm ideas for upcoming events, get help learning difficult concepts, and so much more. Today, Gemini is a versatile AI tool that can help you in many ways.

$0.5 Per Million Context Tokens

Context token cost by provider

Gemini1.5 Pro

Now Serving

  • Provider

    Google VertexAI

  • Originator

    Google

  • Max Context Tokens

    2097152

  • Max Generated Tokens

    8192

  • Data Freshness

    May 2024

$10.0 Per Million Generated Tokens

Generated token cost by provider

Gemini1.5 Pro

Now Serving

  • Version

    gemini_1_5_pro

  • Description

    Gemini 1.5 Pro is a mid-size multimodal model that is optimized for a wide-range of reasoning tasks. 1.5 Pro can process large amounts of data at once, including 2 hours of video, 19 hours of audio, codebases with 60,000 lines of code, or 2,000 pages of text. Gemini is an interface to a multimodal LLM (handling text, audio, images and more). Gemini is based on Google’s cutting-edge research in LLMs. Google initially launched Gemini (then called Bard) as an experiment in March 2023 in accordance with their AI Principles. Since then, users have turned to Gemini to write compelling emails, debug tricky coding problems, brainstorm ideas for upcoming events, get help learning difficult concepts, and so much more. Today, Gemini is a versatile AI tool that can help you in many ways.

$2.5 Per Million Context Tokens

Context token cost by provider

Gemini2.5 Flash

Now Serving

  • Provider

    Google VertexAI

  • Originator

    Google

  • Max Context Tokens

    1048576

  • Max Generated Tokens

    65536

  • Data Freshness

    January 2025

$2.5 Per Million Generated Tokens

Generated token cost by provider

Gemini2.5 Flash

Now Serving

  • Version

    gemini_2_5_flash

  • Description

    Google's best model in terms of price-performance, offering well-rounded capabilities. 2.5 Flash is best for large-scale processing, low-latency, high-volume tasks that require thinking, and agentic use cases. Gemini is an interface to a multimodal LLM (handling text, audio, images and more). Gemini is based on Google’s cutting-edge research in LLMs. Google initially launched Gemini (then called Bard) as an experiment in March 2023 in accordance with their AI Principles. Since then, users have turned to Gemini to write compelling emails, debug tricky coding problems, brainstorm ideas for upcoming events, get help learning difficult concepts, and so much more. Today, Gemini is a versatile AI tool that can help you in many ways.

$1.0 Per Million Context Tokens

Context token cost by provider

Gemini2.5 Flash Lite

Preview Only

  • Provider

    Google VertexAI

  • Originator

    Google

  • Max Context Tokens

    1000000

  • Max Generated Tokens

    64000

  • Data Freshness

    January 2025

$0.4 Per Million Generated Tokens

Generated token cost by provider

Gemini2.5 Flash Lite

Preview Only

  • Version

    gemini_2_5_flash_lite

  • Description

    A Gemini 2.5 Flash model optimized for cost efficiency and low latency. Gemini is an interface to a multimodal LLM (handling text, audio, images and more). Gemini is based on Google’s cutting-edge research in LLMs. Google initially launched Gemini (then called Bard) as an experiment in March 2023 in accordance with their AI Principles. Since then, users have turned to Gemini to write compelling emails, debug tricky coding problems, brainstorm ideas for upcoming events, get help learning difficult concepts, and so much more. Today, Gemini is a versatile AI tool that can help you in many ways.

$0.5 Per Million Context Tokens

Context token cost by provider

Gemini2.5 Pro

Now Serving

  • Provider

    Google VertexAI

  • Originator

    Google

  • Max Context Tokens

    1048576

  • Max Generated Tokens

    65536

  • Data Freshness

    January 2025

$15.0 Per Million Generated Tokens

Generated token cost by provider

Gemini2.5 Pro

Now Serving

  • Version

    gemini_2_5_pro

  • Description

    Gemini 2.5 Pro is Google's state-of-the-art thinking model, capable of reasoning over complex problems in code, math, and STEM, as well as analyzing large datasets, codebases, and documents using long context. Gemini is an interface to a multimodal LLM (handling text, audio, images and more). Gemini is based on Google’s cutting-edge research in LLMs. Google initially launched Gemini (then called Bard) as an experiment in March 2023 in accordance with their AI Principles. Since then, users have turned to Gemini to write compelling emails, debug tricky coding problems, brainstorm ideas for upcoming events, get help learning difficult concepts, and so much more. Today, Gemini is a versatile AI tool that can help you in many ways.

$2.5 Per Million Context Tokens

Context token cost by provider

Claude3 Haiku

Deprecated

  • Provider

    Anthropic

  • Originator

    Anthropic

  • Max Context Tokens

    200000

  • Max Generated Tokens

    4096

  • Data Freshness

    August 2023

$1.25 Per Million Generated Tokens

Generated token cost by provider

Claude3 Haiku

Deprecated

  • Version

    claude_v3_haiku

  • Description

    Fastest and most compact model for near-instant responsiveness. Quick and accurate targeted performance. Claude is a next generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.

$0.25 Per Million Context Tokens

Context token cost by provider

Claude3 Sonnet

Deprecated

  • Provider

    Anthropic

  • Originator

    Anthropic

  • Max Context Tokens

    200000

  • Max Generated Tokens

    4096

  • Data Freshness

    August 2023

$15.0 Per Million Generated Tokens

Generated token cost by provider

Claude3 Sonnet

Deprecated

  • Version

    claude_v3_sonnet

  • Description

    Balance of intelligence and speed. Strong utility, balanced for scaled deployments. Claude is a next generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.

$3.0 Per Million Context Tokens

Context token cost by provider

Claude3 Opus

Now Serving

  • Provider

    Anthropic

  • Originator

    Anthropic

  • Max Context Tokens

    200000

  • Max Generated Tokens

    4096

  • Data Freshness

    August 2023

$75.0 Per Million Generated Tokens

Generated token cost by provider

Claude3 Opus

Now Serving

  • Version

    claude_v3_opus

  • Description

    Powerful model for highly complex tasks. Top-level performance, intelligence, fluency, and understanding. Claude is a next generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.

$15.0 Per Million Context Tokens

Context token cost by provider

Claude3.5 Haiku

Now Serving

  • Provider

    Anthropic

  • Originator

    Anthropic

  • Max Context Tokens

    200000

  • Max Generated Tokens

    8192

  • Data Freshness

    July 2024

$4.0 Per Million Generated Tokens

Generated token cost by provider

Claude3.5 Haiku

Now Serving

  • Version

    claude_v3_5_haiku

  • Description

    Anthropic's fastest model, delivering intelligence at blazing speeds. Claude is a next generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.

$0.8 Per Million Context Tokens

Context token cost by provider

Claude3.5 Sonnet

Now Serving

  • Provider

    Anthropic

  • Originator

    Anthropic

  • Max Context Tokens

    200000

  • Max Generated Tokens

    8192

  • Data Freshness

    April 2024

$15.0 Per Million Generated Tokens

Generated token cost by provider

Claude3.5 Sonnet

Now Serving

  • Version

    claude_v3_5_sonnet

  • Description

    Anthropic's most intelligent model, offering the highest level of intelligence and capability. Claude is a next generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.

$3.0 Per Million Context Tokens

Context token cost by provider

Claude4 Opus

Now Serving

  • Provider

    Anthropic

  • Originator

    Anthropic

  • Max Context Tokens

    200000

  • Max Generated Tokens

    32000

  • Data Freshness

    March 2025

$75.0 Per Million Generated Tokens

Generated token cost by provider

Claude4 Opus

Now Serving

  • Version

    claude_v4_opus

  • Description

    Anthropic's most capable model, with the highest level of intelligence and capability. Claude is a next generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.

$15.0 Per Million Context Tokens

Context token cost by provider

Claude4 Sonnet

Now Serving

  • Provider

    Anthropic

  • Originator

    Anthropic

  • Max Context Tokens

    200000

  • Max Generated Tokens

    64000

  • Data Freshness

    March 2025

$15.0 Per Million Generated Tokens

Generated token cost by provider

Claude4 Sonnet

Now Serving

  • Version

    claude_v4_sonnet

  • Description

    Anthropic's high-performance model with high intelligence and balanced performance. Claude is a next generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.

$3.0 Per Million Context Tokens

Context token cost by provider

Llama3 8B

Deprecated

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    8192

  • Max Generated Tokens

    8192

  • Data Freshness

    March 2023

$0.6 Per Million Generated Tokens

Generated token cost by provider

Llama3 8B

Deprecated

  • Version

    llama3_8b_instruct

  • Description

    Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. Llama 3 uses a tokenizer with a vocabulary of 128K tokens, and was trained on sequences of 8,192 tokens. Grouped-Query Attention (GQA) is used for all models to improve inference efficiency. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open source AI model you can fine-tune, distill and deploy anywhere. Our latest instruction-tuned model is available in 8B, 70B and 405B versions.

$0.3 Per Million Context Tokens

Context token cost by provider

Llama3 70B

Deprecated

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    8192

  • Max Generated Tokens

    8192

  • Data Freshness

    December 2023

$3.5 Per Million Generated Tokens

Generated token cost by provider

Llama3 70B

Deprecated

  • Version

    llama3_70b_instruct

  • Description

    Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. Llama 3 uses a tokenizer with a vocabulary of 128K tokens, and was trained on sequences of 8,192 tokens. Grouped-Query Attention (GQA) is used for all models to improve inference efficiency. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open source AI model you can fine-tune, distill and deploy anywhere. Our latest instruction-tuned model is available in 8B, 70B and 405B versions.

$2.65 Per Million Context Tokens

Context token cost by provider

Llama3.1 8B

Deprecated

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    December 2023

$0.22 Per Million Generated Tokens

Generated token cost by provider

Llama3.1 8B

Deprecated

  • Version

    llama3_1_8b_instruct

  • Description

    Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open source AI model you can fine-tune, distill and deploy anywhere. Our latest instruction-tuned model is available in 8B, 70B and 405B versions.

$0.22 Per Million Context Tokens

Context token cost by provider

Llama3.1 70B

Deprecated

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    December 2023

$0.9 Per Million Generated Tokens

Generated token cost by provider

Llama3.1 70B

Deprecated

  • Version

    llama3_1_70b_instruct

  • Description

    Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open source AI model you can fine-tune, distill and deploy anywhere. Our latest instruction-tuned model is available in 8B, 70B and 405B versions.

$0.9 Per Million Context Tokens

Context token cost by provider

Llama3.1 405B

Coming Soon

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    December 2023

$3.0 Per Million Generated Tokens

Generated token cost by provider

Llama3.1 405B

Coming Soon

  • Version

    llama3_1_405b_instruct

  • Description

    Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open source AI model you can fine-tune, distill and deploy anywhere. Our latest instruction-tuned model is available in 8B, 70B and 405B versions.

$3.0 Per Million Context Tokens

Context token cost by provider

Llama3.2 11B

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    December 2023

$0.16 Per Million Generated Tokens

Generated token cost by provider

Llama3.2 11B

Now Serving

  • Version

    llama3_2_11b_instruct

  • Description

    The Llama 3.2-Vision collection of multimodal large language models (LLMs) is a collection of pretrained and instruction-tuned image reasoning generative models in 11B and 90B sizes (text + images in / text out). The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The models outperform many of the available open source and closed multimodal models on common industry benchmarks. The open source AI model you can fine-tune, distill and deploy anywhere. Our latest instruction-tuned model is available in 8B, 70B and 405B versions.

$0.16 Per Million Context Tokens

Context token cost by provider

Llama3.2 90B

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    December 2023

$0.72 Per Million Generated Tokens

Generated token cost by provider

Llama3.2 90B

Now Serving

  • Version

    llama3_2_90b_instruct

  • Description

    The Llama 3.2-Vision collection of multimodal large language models (LLMs) is a collection of pretrained and instruction-tuned image reasoning generative models in 11B and 90B sizes (text + images in / text out). The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The models outperform many of the available open source and closed multimodal models on common industry benchmarks. The open source AI model you can fine-tune, distill and deploy anywhere. Our latest instruction-tuned model is available in 8B, 70B and 405B versions.

$0.72 Per Million Context Tokens

Context token cost by provider

Llama3.3 70B

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    December 2023

$0.72 Per Million Generated Tokens

Generated token cost by provider

Llama3.3 70B

Now Serving

  • Version

    llama3_3_70b_instruct

  • Description

    A text-only 70B instruction-tuned model that provides enhanced performance relative to Llama 3.1 70B, and to Llama 3.2 90B when used for text-only applications. Llama 3.3 70B delivers similar performance to Llama 3.1 405B while requiring only a fraction of the computational resources. The open source AI model you can fine-tune, distill and deploy anywhere. Our latest instruction-tuned model is available in 8B, 70B and 405B versions.

$0.72 Per Million Context Tokens

Context token cost by provider

Llama4 Maverick

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    1000000

  • Max Generated Tokens

    4096

  • Data Freshness

    August 2024

$0.97 Per Million Generated Tokens

Generated token cost by provider

Llama4 Maverick

Now Serving

  • Version

    llama4_maverick

  • Description

    A general purpose model featuring 128 experts and 400 billion total parameters. It excels in text understanding across 12 languages and English image understanding, making it suitable for versatile assistant and chat applications. The open source AI model you can fine-tune, distill and deploy anywhere. Our latest instruction-tuned model is available in 8B, 70B and 405B versions.

$0.24 Per Million Context Tokens

Context token cost by provider

Llama4 Scout

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    3500000

  • Max Generated Tokens

    4096

  • Data Freshness

    August 2024

$0.66 Per Million Generated Tokens

Generated token cost by provider

Llama4 Scout

Now Serving

  • Version

    llama4_scout

  • Description

    A general purpose multimodal model with 16 experts, 17 billion active parameters, and 109 billion total parameters. Its multimillion context window enables comprehensive multi-document analysis, establishing it as a uniquely powerful and efficient model in its class. The open source AI model you can fine-tune, distill and deploy anywhere. Our latest instruction-tuned model is available in 8B, 70B and 405B versions.

$0.17 Per Million Context Tokens

Context token cost by provider

Command R

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Cohere

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    Unpublished

$1.5 Per Million Generated Tokens

Generated token cost by provider

Command R

Now Serving

  • Version

    command_r

  • Description

    Command R is a large language model optimized for conversational interaction and long context tasks. It targets the “scalable” category of models that balance high performance with strong accuracy, enabling companies to move beyond proof of concept and into production. Cohere empowers every developer and enterprise to build amazing products and capture true business value with language AI. They're building the future of language AI driven by cutting-edge research, pioneering the future of language AI for business.

$0.5 Per Million Context Tokens

Context token cost by provider

Command R+

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Cohere

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    Unpublished

$15.0 Per Million Generated Tokens

Generated token cost by provider

Command R+

Now Serving

  • Version

    command_r_plus

  • Description

    Command R+ is a state-of-the-art RAG-optimized model designed to tackle enterprise-grade workloads. It is Cohere's most powerful, scalable large language model (LLM), purpose-built to excel at real-world enterprise use cases. Command R+ joins our R-series of LLMs focused on balancing high efficiency with strong accuracy, enabling businesses to move beyond proof of concept and into production with AI. Cohere empowers every developer and enterprise to build amazing products and capture true business value with language AI. They're building the future of language AI driven by cutting-edge research, pioneering the future of language AI for business.

$3.0 Per Million Context Tokens

Context token cost by provider
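For programmatic comparisons across the catalog, the minimal sketch below shows one way to model an entry and filter it. The two sample rows mirror fields shown on this page; the CatalogEntry structure itself is an assumption for illustration, not an export format of the catalog.

    from dataclasses import dataclass

    @dataclass
    class CatalogEntry:
        """One catalog entry, using the fields shown on this page (assumed structure)."""
        name: str
        provider: str
        status: str
        max_context_tokens: int
        max_generated_tokens: int
        context_price_per_million: float    # USD per million context tokens
        generated_price_per_million: float  # USD per million generated tokens

    # Two sample rows copied from the entries above.
    ENTRIES = [
        CatalogEntry("Nova Micro", "Amazon Bedrock", "Now Serving",
                     128_000, 5_000, 0.035, 0.14),
        CatalogEntry("GPT-4.1", "OpenAI", "Now Serving",
                     1_047_576, 32_768, 2.0, 8.0),
    ]

    # Example: list served models with at least a 1M-token context window.
    large_context = [e.name for e in ENTRIES
                     if e.status == "Now Serving" and e.max_context_tokens >= 1_000_000]
    print(large_context)  # ['GPT-4.1']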

Model Details