Which Foundation Models to Choose in AWS Bedrock?

AWS, Bedrock, Cloud Computing

By Adarsh Rai
May 23, 2024

Amazon Bedrock uses foundation models (FMs) to simplify building generative AI applications. It offers a selection of FMs from leading AI providers such as Anthropic, Cohere, Meta, AI21 Labs, and Stability AI, alongside Amazon’s own models, and supports Retrieval Augmented Generation (RAG) and fine-tuning for more relevant responses.

This article is dedicated to providing a detailed comparative analysis of the various foundation models available on Amazon Bedrock. We highlight the distinct capabilities and ideal use cases of each model, alongside a comprehensive breakdown of their pricing structures. Our objective is to equip you with the essential insights needed to determine the most suitable FM for your specific generative AI requirements.

Understanding Foundation Models in AWS Bedrock

Amazon Bedrock hosts a variety of foundation models, each with unique strengths and application domains. These include AI21 Labs’ Jurassic, Anthropic’s Claude, Cohere’s Command and Embed, Meta’s Llama 2, Stability AI’s Stable Diffusion, and Amazon’s own Titan models. Users can choose the most appropriate FM based on their specific use case and application requirements.

How do Foundation Models work in AWS Bedrock?

At the heart of Amazon Bedrock is its ability to streamline experimentation with, and evaluation of, leading FMs for specific use cases. Users can customize these models with their own data, using techniques like fine-tuning and Retrieval Augmented Generation (RAG), to create solutions that fit their business requirements.

The serverless nature of Amazon Bedrock ensures hassle-free management of infrastructure, allowing seamless integration of these AI capabilities into applications using familiar AWS services.

Amazon Bedrock’s single API is a notable feature, offering the flexibility to switch between different FMs and upgrade to the latest versions with minimal code adjustments. This unified approach simplifies the integration of models from providers, keeping applications up-to-date with the latest AI innovations.
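As a rough illustration of what switching models looks like in practice, the sketch below sends the same prompt through one runtime call and changes only the model ID and the request body. The model IDs and body field names are assumptions based on each provider’s request format at the time of writing, so confirm them in the Bedrock console before relying on them. The helper names build_body and invoke are hypothetical.

```python
import json

# Assumed model IDs; check the Bedrock console for the IDs enabled in your account.
CLAUDE = "anthropic.claude-v2:1"
TITAN = "amazon.titan-text-express-v1"
LLAMA = "meta.llama2-13b-chat-v1"


def build_body(model_id: str, prompt: str, max_tokens: int = 256) -> str:
    """Return the JSON request body for the given provider's text model."""
    provider = model_id.split(".", 1)[0]
    if provider == "anthropic":
        payload = {
            "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
            "max_tokens_to_sample": max_tokens,
        }
    elif provider == "amazon":
        payload = {
            "inputText": prompt,
            "textGenerationConfig": {"maxTokenCount": max_tokens},
        }
    elif provider == "meta":
        payload = {"prompt": prompt, "max_gen_len": max_tokens}
    else:
        raise ValueError(f"No request format known for provider: {provider}")
    return json.dumps(payload)


def invoke(model_id: str, prompt: str, region: str = "us-east-1") -> dict:
    """Call the model through the Bedrock runtime; requires AWS credentials."""
    import boto3  # imported lazily so build_body works without the SDK

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId=model_id,
        body=build_body(model_id, prompt),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(response["body"].read())
```

Under these assumptions, moving from Claude to Titan means calling invoke(TITAN, prompt) instead of invoke(CLAUDE, prompt). The response shape also differs by provider, so parsing code needs a similar per-provider branch.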

Retrieval Augmented Generation (RAG) in Bedrock

RAG is a powerful technique used to enrich FM responses with contextual and relevant company data. Amazon Bedrock’s fully managed Knowledge Bases automate the RAG workflow, streamlining the process of integrating data sources and managing queries for enriched, accurate model responses.
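To make the idea concrete, here is a toy sketch of the RAG pattern itself: retrieve the passages most relevant to a question, then place them in the prompt sent to the model. This is not how Knowledge Bases are implemented. It swaps in simple word overlap for the vector search a managed service would use, and all function names are hypothetical.

```python
def score(query: str, passage: str) -> int:
    """Count how many distinct query words appear in the passage."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    return len(query_words & passage_words)


def retrieve(query: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k passages ranked by word overlap with the query."""
    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Assemble an augmented prompt: retrieved context first, then the question."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages))
    return f"Use only this context to answer.\n{context}\n\nQuestion: {query}"
```

The string returned by build_prompt is what would be sent to the foundation model, so the answer is grounded in company data rather than only in the model’s training set.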

Model Playgrounds

The versatility of Amazon Bedrock extends to various tasks across different modalities, including text, chat, and image. Interactive playgrounds within the platform enable users to experiment with various models, providing a practical understanding of each model’s suitability for specific tasks.

Model Customization

Customization is a cornerstone of Amazon Bedrock. The platform allows users to adapt generic models into specialized ones, tuned to specific business contexts. Techniques like fine-tuning and continued pre-training enable users to refine models using labeled or unlabeled data from Amazon S3, creating private, tailored versions of the base models.

Fine Tuning – Model customization in Amazon Bedrock is designed to deliver personalized user experiences. Supported models like Cohere Command, Meta Llama 2, and various Amazon Titan models can be fine-tuned using labeled datasets.

Continued Pre-training – Continued pre-training allows for domain-specific adaptations of Amazon Titan Text models. Both techniques result in a private, customized copy of the base FM, ensuring that user data remains secure and distinct from the original model training datasets.
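Labeled data for fine-tuning is commonly prepared as a JSON Lines file, one training example per line, before it is uploaded to Amazon S3. The sketch below shows that preparation step only. The field names prompt and completion are an assumption, so check the schema required by the model you are customizing, and the helper name to_jsonl is hypothetical.

```python
import json


def to_jsonl(examples: list[tuple[str, str]]) -> str:
    """Serialize (prompt, completion) pairs as one JSON object per line."""
    lines = [
        json.dumps({"prompt": prompt, "completion": completion})
        for prompt, completion in examples
    ]
    return "\n".join(lines) + "\n"


# Hypothetical labeled examples for a support-ticket classifier.
training_data = to_jsonl([
    ("Classify: 'My invoice is wrong'", "billing"),
    ("Classify: 'The parcel never arrived'", "shipping"),
])
```

The resulting text can be written to a file and uploaded to the S3 location that the customization job reads from.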

Amazon Bedrock Foundation Models List: Versions & Pricing

Here is a list of the foundation models available on Amazon Bedrock, including their capabilities, model versions, and pricing.

Foundation Model	Model Version	Max Capacity	Distinct Features & Languages	Supported Use Cases & Applications	Pricing
AI21 Labs Jurassic	Jurassic-2 Ultra	8,192 tokens	Advanced text generation in English, Spanish, French, German, Portuguese, Italian, Dutch	Intricate QA, summarization, draft generation for finance, legal, and research sectors	$0.0188 per 1,000 tokens
	Jurassic-2 Mid	8,192 tokens	Text generation in multiple languages for broad applications	Ideal for QA, content creation, info extraction across various industries	$0.0125 per 1,000 tokens
Anthropic Claude	Claude 2.1	200K tokens	High-capacity text generation, multiple languages	Comprehensive analysis, trend forecasting, document comparison	$0.00800 input, $0.02400 output per 1,000 tokens
	Claude 2.0	100K tokens	Creative content generation, coding support, multiple languages	Versatile for creative dialogue, tech development, and educational content	$0.00800 input, $0.02400 output per 1,000 tokens
	Claude 1.3	100K tokens	Writing assistance, advisory capabilities, multiple languages	Effective for document editing, coding, general advisory in diverse sectors	$0.00800 input, $0.02400 output per 1,000 tokens
	Claude Instant	100K tokens	Rapid response generation, multiple languages	Fast dialogue, summary, and text analysis, ideal for customer support and quick content creation	$0.00163 input, $0.00551 output per 1,000 tokens
Cohere Command & Embed	Command	4K tokens	Advanced chat and text generation, English	Dynamic user experiences in customer support, content creation for marketing and media	$0.0015 input, $0.0020 output per 1,000 tokens
	Command Light	4K tokens	Efficient text generation, English	Cost-effective for smaller-scale chat and content tasks, adaptable for business communications	$0.0003 input, $0.0006 output per 1,000 tokens
	Embed – English	1024 dimensions	Semantic search, English	Ideal for precise text retrieval, classification in knowledge management and information systems	$0.0001 per 1,000 tokens
	Embed – Multilingual	1024 dimensions	Global reach with support for 100+ languages	Multilingual applications in semantic search and data clustering for international business and research	$0.0001 per 1,000 tokens
Meta Llama 2	Llama-2-13b-chat	4K tokens	Optimized for dialogue, English	Small-scale tasks like language translation, text classification, ideal for multilingual communication platforms	$0.00075 input, $0.00100 output per 1,000 tokens
	Llama-2-70b-chat	4K tokens	Enhanced for large-scale language modeling, English	Suitable for detailed text generation, dialogue systems in customer service, and creative industries	$0.00195 input, $0.00256 output per 1,000 tokens
Stable Diffusion	SDXL 1.0	77-token limit for prompts	Native 1024×1024 image generation, English	High-quality image creation for advertising, gaming, and media, excels in photorealism	Standard: $0.04, Premium: $0.08 per image (1024×1024)
	SDXL 0.8	77-token limit for prompts	Text-to-image model, English	Creative asset development in marketing, media, suitable for diverse artistic styles	Standard: $0.018, Premium: $0.036 per image (512×512)
Amazon Titan	Titan Text Express	8K tokens	High-performance text model, 100+ languages	Diverse text-related tasks in content creation, classification, and open-ended Q&A, applicable in education and content marketing	$0.0008 input, $0.0016 output per 1,000 tokens
	Titan Text Lite	4K tokens	Cost-effective text generation, English	Efficient for summarization, copywriting in marketing and corporate communications	$0.0003 input, $0.0004 output per 1,000 tokens
	Titan Text Embeddings	8K tokens	Text translation to numerical representations, 25+ languages	Semantic similarity, clustering for data analysis, knowledge management	$0.0001 per 1,000 tokens
	Titan Multimodal Embeddings	128 tokens, 25 MB images	Multimodal (text and image) search, English	Accurate and contextually relevant search, recommendation experiences in e-commerce, and digital media	$0.0008 per 1,000 tokens; $0.00006 per image
	Titan Image Generator	77 tokens, 25 MB images	High-quality image generation using text prompts, English	Image creation and editing for advertising, e-commerce, and entertainment with natural language prompts	512×512: $0.008, 1024×1024: $0.01 per image
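
Because text models are billed separately for input and output tokens, the per-request cost depends on the shape of your workload. The sketch below applies the prices from the table above to a hypothetical request of 2,000 input tokens and 500 output tokens. The workload figures and the function name request_cost are assumptions for illustration.

```python
# Per-1,000-token prices (input, output) in USD, copied from the table above.
PRICES = {
    "Claude 2.1": (0.00800, 0.02400),
    "Claude Instant": (0.00163, 0.00551),
    "Llama-2-13b-chat": (0.00075, 0.00100),
    "Titan Text Express": (0.0008, 0.0016),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the on-demand cost in USD of a single request."""
    input_price, output_price = PRICES[model]
    return (input_tokens / 1000) * input_price + (output_tokens / 1000) * output_price


for name in PRICES:
    # Hypothetical workload: 2,000 input tokens and 500 output tokens per request.
    print(f"{name}: ${request_cost(name, 2000, 500):.5f} per request")
```

For this workload, Claude 2.1 comes to $0.028 per request and Claude Instant to about $0.006, a gap that compounds quickly at high request volumes.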

Foundation Model Comparison and Evaluation on Amazon Bedrock

While having a list of all the available models and their features is immensely helpful, nothing compares to a model evaluation using your actual datasets. Bedrock’s Model Evaluation feature helps you compare different models and versions with your own custom dataset.

Figure: Foundation model comparison and evaluation on Amazon Bedrock (Source: AWS)

After evaluation, users receive an evaluation report that compares accuracy, fluency, coherence, and other characteristics between different models and versions.

Which Foundation Model to use on AWS Bedrock? Use Case and Examples

The selection of the right foundation model on AWS Bedrock determines the effectiveness and relevance of the AI capabilities integrated into your applications. Whether it’s generating human-like text, creating stunning visual content, or understanding and processing natural language, each foundation model on Bedrock has its own unique strengths and specialties.

In this section, we will examine each model available on Bedrock in detail. We’ll uncover their unique features, delve into their ideal application scenarios, and showcase real-world examples where these models have been successfully deployed.

Anthropic Claude in AWS Bedrock

Anthropic’s Claude is a state-of-the-art large language model (LLM) featured on Amazon Bedrock, renowned for its safety-focused design. Developed using advanced techniques like Constitutional AI and harmlessness training, Claude is engineered to be helpful, honest, and harmless, thus reducing brand risk significantly. This model excels in thoughtful dialogue, content creation, complex reasoning, creativity, coding, and more, showcasing versatility in a broad range of applications.

Claude on Bedrock Applications and Use Cases

Primarily serving as a knowledge assistant with complex, creative capabilities, Claude’s model versions on AWS Bedrock suit the following applications.

Sales and Customer Service: Claude acts as a virtual sales representative, enhancing customer interaction and satisfaction.
Business Analysis: Efficient in extracting and summarizing key information from business documents and emails.
Legal Document Processing: Assists in parsing legal documents for quick information retrieval, aiding legal professionals.
Coding and Technical Tasks: Continuously improving in coding, mathematical, and logical reasoning, making it a valuable asset for technical challenges.

Claude Model Versions and Pricing on AWS Bedrock

This table offers a concise comparison of the different Claude model versions available on AWS Bedrock, highlighting their maximum token limits, supported use cases, and pricing for on-demand and batch usage in the US East (N. Virginia) and US West (Oregon) regions.

Claude Model Versions	Max Tokens	Supported Use Cases	On-Demand and Batch Pricing (US East & US West)
Claude 2.1	200K	Summarization, Q&A, Trend Forecasting, Document Analysis	Input: $0.00800/1,000 tokens Output: $0.02400/1,000 tokens
Claude 2.0	100K	Dialogue, Content Creation, Reasoning, Coding	Input: $0.00800/1,000 tokens Output: $0.02400/1,000 tokens
Claude 1.3	100K	Writing, Editing, Advice, Coding	Input: $0.00800/1,000 tokens Output: $0.02400/1,000 tokens
Claude Instant	100K	Dialogue, Text Analysis, Summarization	Input: $0.00163/1,000 tokens Output: $0.00551/1,000 tokens

The pricing details provided are specific to the regions US East (N. Virginia) and US West (Oregon). For other regions like Asia Pacific (Tokyo) and Europe (Frankfurt), there are different pricing rates.
Claude 2.1, as the latest model, stands out with its extensive token limit and wide range of use cases, making it suitable for comprehensive tasks.
Claude Instant offers a more cost-effective solution for faster, less complex tasks.

When to use Claude on AWS Bedrock?

There are several instances where using Claude as your Foundation Model on AWS is ideal. Here are the most popular use cases:

You require a model that emphasizes safety and reliability in generative AI solutions.
Tasks involve multi-faceted dialogue, creative content generation, or complex problem-solving.
Speed and cost-effectiveness are key considerations, as seen in Claude Instant.

Claude in AWS Bedrock stands out for its ethical AI framework and versatility, making it an ideal choice for a variety of applications where safety, accuracy, and creativity are priorities.

Cohere Command & Embed in AWS Bedrock

Cohere’s Command and Embed models offer versatile generative large language model (LLM) capabilities on Amazon Bedrock, making them ideal for business applications. These models are designed to respect privacy, with customers having complete control over customization and inputs/outputs. Trained from reliable data sources, these models undergo thorough adversarial testing and bias mitigation.

Cohere on Bedrock Applications and Use Cases

The versatility and privacy-focused design of Cohere’s models make them an excellent choice for businesses looking to integrate generative AI capabilities into their workflows.

Command Model

Perfect for knowledge assistants, customer support chatbots, content creation, summarization, and search applications. It excels in maintaining conversational context, generating articles, summarizing long-form texts, and retrieving relevant information for RAG use cases.

Embed Model

This text model is ideal for semantic search, classification, and clustering tasks. Available in both English and multilingual versions, it supports over 100 languages, making it versatile for global applications.
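
Semantic search with an embedding model usually comes down to comparing vectors: the query and each document are embedded, and the document whose vector points in the most similar direction wins. The sketch below shows that comparison with cosine similarity. The three-dimensional vectors are hand-made stand-ins for real Embed output, which has 1024 dimensions, and the function names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query_vec: list[float], doc_vecs: dict[str, list[float]]) -> str:
    """Return the name of the document whose vector is closest to the query."""
    return max(doc_vecs, key=lambda name: cosine_similarity(query_vec, doc_vecs[name]))


# Toy vectors standing in for embeddings returned by the model.
documents = {
    "billing": [1.0, 0.0, 0.0],
    "shipping": [0.0, 1.0, 0.0],
}
best_match = most_similar([0.9, 0.1, 0.0], documents)
```

The same comparison underlies classification and clustering: items whose embeddings are close together are treated as semantically related.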

Cohere Model Versions and Pricing on AWS Bedrock

Model Version	Max Tokens	Supported Use Cases	On-Demand and Batch Pricing (USD)	Customization Pricing (USD)
Command	4K	Chat, text generation, text summarization	Input: $0.0015/1,000 tokens; Output: $0.0020/1,000 tokens	Train: $0.004/1,000 tokens; Store: $1.95/month; Infer: $49.50/hour
Command Light	4K	Chat, text generation, text summarization	Input: $0.0003/1,000 tokens; Output: $0.0006/1,000 tokens	Train: $0.001/1,000 tokens; Store: $1.95/month; Infer: $8.56/hour
Embed – English	1024	Semantic search, RAG, classification, clustering	Input: $0.0001/1,000 tokens; N/A for output	N/A
Embed – Multilingual	1024	Semantic search, RAG, classification, clustering	Input: $0.0001/1,000 tokens; N/A for output	N/A

When to Use Cohere Models on AWS Bedrock?

Choose Cohere’s Command and Embed models when your content is confidential, contains identifiable information, or includes sensitive business data, and your application needs:

Advanced chatbot functionalities and content generation.
Multilingual text processing and semantic searches.
Privacy-centric applications with a need for model customization.
Efficient summarization, classification, and clustering tasks in diverse business scenarios.

AI21 Labs’ Jurassic on AWS Bedrock

Jurassic represents a cutting-edge suite of large language models (LLMs) on Amazon Bedrock, designed to meet the evolving demands of generative AI applications. This LLM stands out for its deep comprehension abilities and expansive scope of functionalities, encompassing various aspects of language processing, and offering a selection of model sizes tailored for speed and cost efficiency.

Jurassic on Bedrock Applications and Use Cases

Its integration with AWS Bedrock opens up a plethora of diverse use cases. It can:

Condense complex financial reports into succinct summaries.
Generate tailored financial and legal statements.
Craft unique marketing content in various styles and lengths.
Provide natural language responses to customer inquiries.
Enable employees to access and interpret organizational data effortlessly.

Jurassic Model Versions and Pricing on AWS Bedrock

Jurassic’s versatility is further expanded with different model versions, each catering to specific needs. The following table outlines these versions, their capabilities, supported use cases, and pricing details:

Model Version	Max Tokens	Languages	Supported Use Cases	Pricing per 1,000 Tokens
Jurassic-2 Ultra	8,192	English, Spanish, French, German, Portuguese, Italian, Dutch	Advanced QA, summarization, draft generation, complex reasoning	$0.0188
Jurassic-2 Mid	8,192	English, Spanish, French, German, Portuguese, Italian, Dutch	QA, summarization, draft generation, info extraction	$0.0125

When to Use Jurassic on AWS Bedrock?

Whether for intricate reasoning, content generation, or natural language processing, Jurassic provides the necessary tools for innovative, AI-driven solutions. Jurassic’s models on AWS Bedrock are ideal when:

You need high-quality, nuanced text generation for intricate tasks.
Your application demands quick, natural language processing for real-time responses.
You’re seeking a balance between exceptional quality and cost-efficiency.
Your tasks require deep reasoning, logic, and language comprehension.

In essence, Jurassic’s models on AWS Bedrock serve as robust, adaptable tools for a wide range of generative AI applications, offering unique solutions tailored to the specific needs of various industries.

Meta Llama 2 on AWS Bedrock

Meta Llama 2 represents a revolutionary step in the realm of generative AI with its large language models (LLMs) ranging from 7 billion to 70 billion parameters. These models, particularly the fine-tuned Llama Chat variants, are meticulously crafted with a focus on safety, leveraging over a million human annotations and extensive red-teaming efforts.

This ensures that Llama 2 not only delivers exceptional performance but also maintains a high standard of safety in its interactions. These safety properties are crucial when building generative AI support assistants or chatbots that will serve on the websites of official entities.

Llama 2 on Bedrock Applications and Use Cases

Llama 2’s integration with AWS Bedrock brings to the forefront an array of possibilities for AI-powered applications:

Llama-2-13b-chat: This version is adept at smaller-scale tasks like text classification, sentiment analysis, and language translation. Its 13B parameter size makes it suitable for applications that require precise, yet compact model capabilities.
Llama-2-70b-chat: Tailored for more extensive tasks like language modeling, text generation, and dialogue systems, the 70B parameter model excels in handling complex and large-scale AI challenges.

Llama 2 Model Versions and Pricing on AWS Bedrock

Meta’s Llama 2 models on AWS Bedrock come in two primary versions, each designed for specific application scopes. The following table provides a detailed overview of these versions:

Llama 2 Model Versions	Max Tokens	Languages	Supported Use Cases	On-Demand and Batch Pricing (USD)
Llama-2-13b-chat	4K	English	Assistant-like chat	Input: $0.00075/1,000 tokens; Output: $0.00100/1,000 tokens
Llama-2-70b-chat	4K	English	Assistant-like chat	Input: $0.00195/1,000 tokens; Output: $0.00256/1,000 tokens

When to Use Llama 2 on AWS Bedrock?

Llama 2’s models are a prime choice in scenarios where:

Dialogue and interaction-focused AI applications are a priority.
Tasks require a balance between comprehensive language understanding and concise model deployment.
There’s a need for a wide-ranging scale of tasks, from basic text classification to elaborate language modeling and text generation.

Llama 2 in AWS Bedrock stands out for its adaptability and safety-focused design, making it a robust option for businesses seeking to integrate dialogue-based AI capabilities into their applications.

Stable Diffusion XL on AWS Bedrock

Stability AI’s Stable Diffusion XL presents a groundbreaking advancement in text-to-image generation on Amazon Bedrock. This model, with its 3.5 billion parameter base and 6.6 billion parameter ensemble pipeline, is renowned for its state-of-the-art open architecture.

It is specifically designed to generate high-quality images with cinematic photorealism and intricate detail, capable of creating complex compositions from basic natural language prompts.

Stable Diffusion XL on Bedrock Applications and Use Cases

Stable Diffusion XL’s integration into AWS Bedrock unlocks a new realm of creative possibilities, ideally suited for:

Personalized Advertising and Marketing Campaigns: Generating tailor-made ad visuals and marketing materials to enhance brand appeal.
Creative Asset Development: Ideating and crafting unlimited creative assets, including characters, scenes, and worlds, particularly useful in media, entertainment, gaming, and metaverse projects.

Stable Diffusion XL Model Versions and Pricing on AWS Bedrock

Stable Diffusion XL comes in two primary versions, each optimized for specific creative tasks. Below is a table that outlines these versions, their prompt token limits, supported use cases, and pricing details for image generation:

Stable Diffusion XL Model Versions	Max Prompt Tokens	Supported Use Cases	Image Resolution	Pricing per Image (Standard Quality <=50 steps)	Pricing per Image (Premium Quality >50 steps)
SDXL 1.0	77-token limit	Advertising, Media, Gaming	Up to 1024×1024	$0.04	$0.08
SDXL 0.8	77-token limit	Advertising, Media, Gaming	512×512 or smaller; larger than 512×512	512×512 or smaller: $0.018; Larger than 512×512: $0.036	512×512 or smaller: $0.036; Larger than 512×512: $0.072

Provisioned Throughput Pricing is also available, with SDXL 1.0 priced at $49.86 per hour per model unit for a 1-month commitment and $46.18 for a 6-month commitment.
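
Whether provisioned throughput is worthwhile depends on volume. The sketch below turns the hourly rates above into a monthly figure and a break-even point against on-demand standard pricing for SDXL 1.0. The 730-hour month is an assumption, the function names are hypothetical, and the calculation does not say whether a single model unit can actually sustain the break-even rate.

```python
HOURLY_1_MONTH = 49.86     # USD per model unit per hour, 1-month commitment
HOURLY_6_MONTH = 46.18     # USD per model unit per hour, 6-month commitment
ON_DEMAND_STANDARD = 0.04  # USD per SDXL 1.0 image, standard quality

HOURS_PER_MONTH = 730      # assumption: average month length in hours


def monthly_cost(hourly_rate: float, hours: int = HOURS_PER_MONTH) -> float:
    """Return the cost in USD of keeping one model unit running for a month."""
    return hourly_rate * hours


def break_even_images_per_hour(hourly_rate: float, per_image: float) -> float:
    """Return the images per hour at which provisioned and on-demand cost the same."""
    return hourly_rate / per_image


print(f"1-month commitment: ${monthly_cost(HOURLY_1_MONTH):,.2f} per month")
print(f"Break-even: {break_even_images_per_hour(HOURLY_1_MONTH, ON_DEMAND_STANDARD):.1f} images/hour")
```

At the 1-month rate, one model unit costs about $36,398 for a 730-hour month, and it only undercuts on-demand standard pricing above roughly 1,246 images per hour.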

When to Use Stable Diffusion XL on AWS Bedrock?

Stable Diffusion XL is the ideal choice for applications that demand:

High-fidelity, photorealistic image generation from textual descriptions.
Creative brainstorming and asset creation for advertising, media, and entertainment domains.
Innovative character and world-building in gaming and metaverse environments.

With its advanced architecture and fine-tuning capabilities, Stable Diffusion XL on AWS Bedrock serves as a powerful tool for anyone looking to harness the power of generative AI for visually driven projects.

Amazon Titan on AWS Bedrock

Amazon Titan represents a comprehensive suite of foundation models (FMs) exclusive to Amazon Bedrock, embodying Amazon’s extensive experience in AI and machine learning. This family of models includes high-performing image, multimodal, and text models, each designed to empower a broad spectrum of generative AI applications.

These models stand out for their built-in support for responsible AI, including content filtering and input rejection mechanisms, which prioritize safety over every other command or input.

Titan on Bedrock Applications and Use Cases

Amazon Titan’s models integrate seamlessly into AWS Bedrock, offering versatile applications across various sectors:

Text Models: Enhance productivity in tasks like blog post creation, article classification, open-ended Q&A, and conversational chat.
Multimodal Embeddings: Improve the accuracy of multimodal searches, recommendations, and personalization experiences.
Image Generator: Support content creators in generating high-quality images quickly and efficiently, ideal for industries like advertising, e-commerce, and media.

Titan Model Versions and Pricing on AWS Bedrock

Amazon Titan models come in various versions, each tailored to specific application needs. Below is a detailed overview of these versions, including their capabilities, supported use cases, and pricing:

Titan Model Versions	Max Tokens/Images	Languages	Supported Use Cases	On-Demand and Batch Pricing
Titan Text Express	8K tokens	English (GA), 100+ languages (Preview)	Retrieval augmented generation, text generation, brainstorming, Q&A, chat	Input: $0.0008/1,000 tokens; Output: $0.0016/1,000 tokens
Titan Text Lite	4K tokens	English	Summarization, copywriting	Input: $0.0003/1,000 tokens; Output: $0.0004/1,000 tokens
Titan Text Embeddings	8K tokens	25+ languages	Text retrieval, semantic similarity, clustering	Input: $0.0001/1,000 tokens; N/A for output
Titan Multimodal Embeddings	128 tokens, 25 MB images	English	Search, recommendation, personalization	Input: $0.0008/1,000 tokens; Input image: $0.00006
Titan Image Generator	77 tokens, 25 MB images	English	Text to image generation, image editing, variations	512×512: $0.008 (Standard), $0.01 (Premium); 1024×1024: $0.01 (Standard), $0.012 (Premium)

When to Use Amazon Titan on AWS Bedrock?

Opt for Amazon Titan’s models when your application requires:

Diverse and high-performing AI solutions for text, image, and multimodal tasks.
A responsible AI approach with built-in mechanisms for content safety.
Customization options to tailor models to specific organizational needs and domains.
Generating high-quality, realistic images or enhancing search and recommendation systems.

Conclusion

Each foundation model, with its unique offerings and tailored features, caters to specific needs within AI-driven applications. For cloud engineers and decision-makers, understanding the distinct characteristics of these models is crucial to harnessing their full potential. Users can leverage Bedrock’s model evaluation feature to understand each model’s performance and suitability for their specific use cases.

However, the decision-making process should not stop at capabilities alone.

Cost considerations, applicability to current and future projects, model updating frequency, and code reusability are critical factors that must be weighed. It’s essential to consider the total cost of ownership, including the ongoing costs associated with model training, maintenance, and scaling.
