Which Foundation Models to Choose in AWS Bedrock?
December 18, 2023
by Adarsh Rai
8 min read
Table of Contents
- Understanding Foundation Models in AWS Bedrock
- Amazon Bedrock Foundation Models List: Versions & Pricing
- Foundation Model Comparison and Evaluation on Amazon Bedrock
- Which Foundation Model to use on AWS Bedrock? Use Cases and Examples
- Anthropic Claude in AWS Bedrock
- Cohere Command & Embed in AWS Bedrock
- AI21 Labs’ Jurassic on AWS Bedrock
- Meta Llama 2 on AWS Bedrock
- Stable Diffusion XL on AWS Bedrock
- Amazon Titan on AWS Bedrock
- Conclusion
Amazon Bedrock leverages foundation models (FMs) to revolutionize the construction of generative AI applications. It offers a rich selection of FMs from prominent AI pioneers like Anthropic, Meta, AI21 Labs, Cohere, Stability AI, and Amazon itself, along with Retrieval Augmented Generation (RAG) for relevant, context-aware responses.
This article is dedicated to providing a detailed comparative analysis of the various foundation models available on Amazon Bedrock. We highlight the distinct capabilities and ideal use cases of each model, alongside a comprehensive breakdown of their pricing structures. Our objective is to equip you with the essential insights needed to determine the most suitable FM for your specific generative AI requirements.
Understanding Foundation Models in AWS Bedrock
Amazon Bedrock hosts a variety of foundation models, each with unique strengths and application domains. These include AI21 Labs’ Jurassic, Anthropic’s Claude, Cohere’s Command and Embed, Meta’s Llama 2, Stability AI’s Stable Diffusion, and Amazon’s own Titan models. Users can choose the most appropriate FM based on their specific use case and application requirements.
How do Foundation Models work in AWS Bedrock?
At the heart of Amazon Bedrock is its ability to streamline the experimentation and evaluation of leading FMs for specific use cases. Users can customize these models with their own data, employing techniques like fine-tuning and Retrieval Augmented Generation (RAG) to create solutions that resonate with their unique business requirements.
The serverless nature of Amazon Bedrock ensures hassle-free management of infrastructure, allowing seamless integration of these AI capabilities into applications using familiar AWS services.
Amazon Bedrock’s single API is a notable feature, offering the flexibility to switch between different FMs and upgrade to the latest versions with minimal code adjustments. This unified approach simplifies the integration of models from providers, keeping applications up-to-date with the latest AI innovations.
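To illustrate the single-API design, here is a minimal sketch (not an official AWS sample) of switching between two models through the same `InvokeModel` call. Only the request body changes per provider; the model IDs shown are the current Bedrock identifiers, and the helper names are our own.

```python
import json

# Request-body builders per model: the JSON shape differs by provider,
# but the InvokeModel call itself is identical for every FM.
BODY_BUILDERS = {
    "anthropic.claude-v2:1": lambda p: {
        "prompt": f"\n\nHuman: {p}\n\nAssistant:",
        "max_tokens_to_sample": 300,
    },
    "amazon.titan-text-express-v1": lambda p: {
        "inputText": p,
        "textGenerationConfig": {"maxTokenCount": 300},
    },
}

def invoke(model_id, prompt):
    """Call any Bedrock text model through the single InvokeModel API."""
    import boto3  # requires AWS credentials with Bedrock model access
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.invoke_model(
        modelId=model_id,
        body=json.dumps(BODY_BUILDERS[model_id](prompt)),
    )
    return json.loads(response["body"].read())
```

Swapping `model_id` is the only change needed to move an application from Claude to Titan, which is the code-reusability benefit the unified API provides.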
Retrieval Augmented Generation (RAG) in Bedrock
RAG is a powerful technique used to enrich FM responses with contextual and relevant company data. Amazon Bedrock’s fully managed Knowledge Bases automate the RAG workflow, streamlining the process of integrating data sources and managing queries for enriched, accurate model responses.
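As a sketch of how a managed Knowledge Base is queried, the `RetrieveAndGenerate` API ties retrieval and generation into one call. The knowledge base ID and model ARN below are placeholders for your own resources.

```python
import json

def build_rag_request(question, kb_id, model_arn):
    # Request shape for the RetrieveAndGenerate API; kb_id and model_arn
    # are placeholders for your own knowledge base and chosen FM.
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }

def ask_knowledge_base(question, kb_id, model_arn):
    import boto3  # requires AWS credentials and an existing knowledge base
    client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")
    response = client.retrieve_and_generate(
        **build_rag_request(question, kb_id, model_arn)
    )
    return response["output"]["text"]
```

Bedrock handles chunking, embedding, and retrieval behind this call, so the application never manages a vector store directly.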
Model Playgrounds
The versatility of Amazon Bedrock extends to various tasks across different modalities, including text, chat, and image. Interactive playgrounds within the platform enable users to experiment with various models, providing a practical understanding of each model’s suitability for specific tasks.
Model Customization
Customization is a cornerstone of Amazon Bedrock. The platform allows users to adapt generic models into specialized ones, tuned to specific business contexts. Techniques like fine-tuning and continued pre-training enable users to refine models using labeled or unlabeled data from Amazon S3, creating private, tailored versions of the base models.
- Fine Tuning – Model customization in Amazon Bedrock is designed to deliver personalized user experiences. Supported models like Cohere Command, Meta Llama 2, and various Amazon Titan models can be fine-tuned using labeled datasets.
- Continued Pre-training – Continued pre-training allows for domain-specific adaptations of Amazon Titan Text models. Both techniques result in a private, customized copy of the base FM, ensuring that user data remains secure and distinct from the original model training datasets.
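Both customization paths go through the `CreateModelCustomizationJob` API. The following is a minimal sketch; the S3 URIs, IAM role, and hyperparameter values are placeholders for your own resources, and supported hyperparameters vary by base model.

```python
def build_fine_tune_job(job_name, base_model_id, role_arn, train_s3, output_s3):
    # Parameters for CreateModelCustomizationJob; the S3 URIs and IAM role
    # are placeholders for resources in your own account.
    return {
        "jobName": job_name,
        "customModelName": f"{job_name}-model",
        "roleArn": role_arn,
        "baseModelIdentifier": base_model_id,
        "customizationType": "FINE_TUNING",  # or "CONTINUED_PRE_TRAINING"
        "trainingDataConfig": {"s3Uri": train_s3},
        "outputDataConfig": {"s3Uri": output_s3},
        "hyperParameters": {"epochCount": "2", "learningRate": "0.00001"},
    }

def start_fine_tune(**kwargs):
    import boto3  # requires AWS credentials and Bedrock customization access
    client = boto3.client("bedrock", region_name="us-east-1")
    return client.create_model_customization_job(**build_fine_tune_job(**kwargs))
```

The job produces a private, customized copy of the base model that only your account can invoke.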
Amazon Bedrock Foundation Models List: Versions & Pricing
Here is a comprehensive list of the foundation models available on Amazon Bedrock, including their capabilities, model versions, and pricing.
Foundation Model | Model Version | Max Capacity | Distinct Features & Languages | Supported Use Cases & Applications | Pricing |
---|---|---|---|---|---|
AI21 Labs Jurassic | Jurassic-2 Ultra | 8,192 tokens | Advanced text generation in English, Spanish, French, German, Portuguese, Italian, Dutch | Intricate QA, summarization, draft generation for finance, legal, and research sectors | $0.0188 per 1,000 tokens |
Jurassic-2 Mid | 8,192 tokens | Text generation in multiple languages for broad applications | Ideal for QA, content creation, info extraction across various industries | $0.0125 per 1,000 tokens | |
Anthropic Claude | Claude 2.1 | 200K tokens | High-capacity text generation, multiple languages | Comprehensive analysis, trend forecasting, document comparison | $0.00800 input, $0.02400 output per 1,000 tokens |
Claude 2.0 | 100K tokens | Creative content generation, coding support, multiple languages | Versatile for creative dialogue, tech development, and educational content | $0.00800 input, $0.02400 output per 1,000 tokens | |
Claude 1.3 | 100K tokens | Writing assistance, advisory capabilities, multiple languages | Effective for document editing, coding, general advisory in diverse sectors | $0.00800 input, $0.02400 output per 1,000 tokens | |
Claude Instant | 100K tokens | Rapid response generation, multiple languages | Fast dialogue, summary, and text analysis, ideal for customer support and quick content creation | $0.00163 input, $0.00551 output per 1,000 tokens | |
Cohere Command & Embed | Command | 4K tokens | Advanced chat and text generation, English | Dynamic user experiences in customer support, content creation for marketing and media | $0.0015 input, $0.0020 output per 1,000 tokens |
Command Light | 4K tokens | Efficient text generation, English | Cost-effective for smaller-scale chat and content tasks, adaptable for business communications | $0.0003 input, $0.0006 output per 1,000 tokens | |
Embed – English | 1024 dimensions | Semantic search, English | Ideal for precise text retrieval, classification in knowledge management and information systems | $0.0001 per 1,000 tokens | |
Embed – Multilingual | 1024 dimensions | Global reach with support for 100+ languages | Multilingual applications in semantic search and data clustering for international business and research | $0.0001 per 1,000 tokens | |
Meta Llama 2 | Llama-2-13b-chat | 4K tokens | Optimized for dialogue, English | Small-scale tasks like language translation, text classification, ideal for multilingual communication platforms | $0.00075 input, $0.00100 output per 1,000 tokens |
Llama-2-70b-chat | 4K tokens | Enhanced for large-scale language modeling, English | Suitable for detailed text generation, dialogue systems in customer service, and creative industries | $0.00195 input, $0.00256 output per 1,000 tokens | |
Stable Diffusion | SDXL 1.0 | 77-token limit for prompts | Native 1024×1024 image generation, English | High-quality image creation for advertising, gaming, and media, excels in photorealism | Standard: $0.04, Premium: $0.08 per image (1024×1024) |
SDXL 0.8 | 77-token limit for prompts | Text-to-image model, English | Creative asset development in marketing, media, suitable for diverse artistic styles | Standard: $0.018, Premium: $0.036 per image (512×512) | |
Amazon Titan | Titan Text Express | 8K tokens | High-performance text model, 100+ languages | Diverse text-related tasks in content creation, classification, and open-ended Q&A, applicable in education and content marketing | $0.0008 input, $0.0016 output per 1,000 tokens |
Titan Text Lite | 4K tokens | Cost-effective text generation, English | Efficient for summarization, copywriting in marketing and corporate communications | $0.0003 input, $0.0004 output per 1,000 tokens | |
Titan Text Embeddings | 8K tokens | Text translation to numerical representations, 25+ languages | Semantic similarity, clustering for data analysis, knowledge management | $0.0001 per 1,000 tokens | |
Titan Multimodal Embeddings | 128 tokens, 25 MB images | Multimodal (text and image) search, English | Accurate and contextually relevant search, recommendation experiences in e-commerce, and digital media | $0.0008 per 1,000 tokens; $0.00006 per image | |
Titan Image Generator | 77 tokens, 25 MB images | High-quality image generation using text prompts, English | Image creation and editing for advertising, e-commerce, and entertainment with natural language prompts | 512×512: $0.008, 1024×1024: $0.01 per image |
Foundation Model Comparison and Evaluation on Amazon Bedrock
While having a list of all the available models and their features is immensely helpful, nothing compares to a model evaluation using your actual datasets. Bedrock’s Model Evaluation feature helps you compare different models and versions with your own custom dataset.
After evaluation, users receive an evaluation report that compares accuracy, fluency, coherence, and other characteristics between different models and versions.
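Alongside the managed Model Evaluation feature, a quick do-it-yourself comparison can be scripted by running the same prompts through several candidate models. This sketch assumes an `invoke_fn` wrapper that handles each provider's request and response shape.

```python
def compare_models(prompts, model_ids, invoke_fn):
    """Run the same prompts through several models and collect outputs
    side by side; invoke_fn(model_id, prompt) -> text abstracts away the
    provider-specific request/response formats."""
    results = {model_id: [] for model_id in model_ids}
    for prompt in prompts:
        for model_id in model_ids:
            results[model_id].append(invoke_fn(model_id, prompt))
    return results
```

Inspecting the collected outputs side by side with your own dataset gives an informal preview of the accuracy and fluency comparison that the managed evaluation report formalizes.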
Which Foundation Model to use on AWS Bedrock? Use Cases and Examples
The selection of the right foundation model on AWS Bedrock determines the effectiveness and relevance of the AI capabilities integrated into your applications. Whether it’s generating human-like text, creating stunning visual content, or understanding and processing natural language, each foundation model on Bedrock has its own unique strengths and specialties.
In this section, we will examine each model available on Bedrock in detail. We’ll uncover their unique features, delve into their ideal application scenarios, and showcase real-world examples where these models have been successfully deployed.
Anthropic Claude in AWS Bedrock
Anthropic’s Claude is a state-of-the-art large language model (LLM) featured on Amazon Bedrock, renowned for its safety-focused design. Developed using advanced techniques like Constitutional AI and harmlessness training, Claude is engineered to be helpful, honest, and harmless, thus reducing brand risk significantly. This model excels in thoughtful dialogue, content creation, complex reasoning, creativity, coding, and more, showcasing versatility in a broad range of applications.
Claude on Bedrock Applications and Use Cases
Primarily serving as a knowledge assistant with complex, creative capabilities, Claude's model versions on AWS Bedrock support the following applications:
- Sales and Customer Service: Claude acts as a virtual sales representative, enhancing customer interaction and satisfaction.
- Business Analysis: Efficient in extracting and summarizing key information from business documents and emails.
- Legal Document Processing: Assists in parsing legal documents for quick information retrieval, aiding legal professionals.
- Coding and Technical Tasks: Continuously improving in coding, mathematical, and logical reasoning, making it a valuable asset for technical challenges.
Claude Model Versions and Pricing on AWS Bedrock
This table offers a concise comparison of the different Claude model versions available on AWS Bedrock, highlighting their maximum token limits, supported use cases, and pricing for on-demand and batch usage in the US East (N. Virginia) and US West (Oregon) regions.
Claude Model Versions | Max Tokens | Supported Use Cases | On-Demand and Batch Pricing (US East & US West) |
---|---|---|---|
Claude 2.1 | 200K | Summarization, Q&A, Trend Forecasting, Document Analysis | Input: $0.00800/1,000 tokens Output: $0.02400/1,000 tokens |
Claude 2.0 | 100K | Dialogue, Content Creation, Reasoning, Coding | Input: $0.00800/1,000 tokens Output: $0.02400/1,000 tokens |
Claude 1.3 | 100K | Writing, Editing, Advice, Coding | Input: $0.00800/1,000 tokens Output: $0.02400/1,000 tokens |
Claude Instant | 100K | Dialogue, Text Analysis, Summarization | Input: $0.00163/1,000 tokens Output: $0.00551/1,000 tokens |
- The pricing details provided are specific to the regions US East (N. Virginia) and US West (Oregon). For other regions like Asia Pacific (Tokyo) and Europe (Frankfurt), there are different pricing rates.
- Claude 2.1, as the latest model, stands out with its extensive token limit and wide range of use cases, making it suitable for comprehensive tasks.
- Claude Instant offers a more cost-effective solution for faster, less complex tasks.
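As a concrete sketch of invoking Claude 2.x on Bedrock: these models use Anthropic's text-completions format, where the prompt must be wrapped in `Human:`/`Assistant:` turns. The helper names and parameter values here are illustrative.

```python
import json

def build_claude_request(prompt, max_tokens=500):
    # Anthropic text-completions format used by Claude 2.x on Bedrock:
    # the prompt must be wrapped in Human/Assistant turns.
    return {
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
        "temperature": 0.5,
    }

def summarize_with_claude(document):
    import boto3  # requires AWS credentials with Claude access enabled
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    body = build_claude_request(f"Summarize the key points of:\n{document}")
    response = client.invoke_model(
        modelId="anthropic.claude-v2:1", body=json.dumps(body)
    )
    return json.loads(response["body"].read())["completion"]
```

Switching to Claude Instant for cheaper, faster responses only requires changing the `modelId` to `anthropic.claude-instant-v1`.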
When to use Claude on AWS Bedrock?
There are several instances where using Claude as your Foundation Model on AWS is ideal. Here are the most popular use cases:
- You require a model that emphasizes safety and reliability in generative AI solutions.
- Tasks involve multi-faceted dialogue, creative content generation, or complex problem-solving.
- Speed and cost-effectiveness are key considerations, as seen in Claude Instant.
Claude in AWS Bedrock stands out for its ethical AI framework and versatility, making it an ideal choice for a variety of applications where safety, accuracy, and creativity are stressed.
Cohere Command & Embed in AWS Bedrock
Cohere’s Command and Embed models offer versatile generative large language model (LLM) capabilities on Amazon Bedrock, making them ideal for business applications. These models are designed to respect privacy, with customers having complete control over customization and inputs/outputs. Trained on reliable data sources, these models undergo thorough adversarial testing and bias mitigation.
Cohere on Bedrock Applications and Use Cases
The versatility and privacy-focused design of Cohere’s models make them an excellent choice for businesses looking to integrate generative AI capabilities into their workflows.
Command Model
Perfect for knowledge assistants, customer support chatbots, content creation, summarization, and search applications. It excels in maintaining conversational context, generating articles, summarizing long-form texts, and retrieving relevant information for RAG use cases.
Embed Model
This text model is ideal for semantic search, classification, and clustering tasks. Available in both English and multilingual versions, it supports over 100 languages, making it versatile for global applications.
Cohere Model Versions and Pricing on AWS Bedrock
Model Version | Max Tokens | Supported Use Cases | On-Demand and Batch Pricing (USD) | Customization Pricing (USD) |
---|---|---|---|---|
Command | 4K | Chat, text generation, text summarization | Input: $0.0015/1,000 tokens; Output: $0.0020/1,000 tokens | Train: $0.004/1,000 tokens; Store: $1.95/month; Infer: $49.50/hour |
Command Light | 4K | Chat, text generation, text summarization | Input: $0.0003/1,000 tokens; Output: $0.0006/1,000 tokens | Train: $0.001/1,000 tokens; Store: $1.95/month; Infer: $8.56/hour |
Embed – English | 1024 | Semantic search, RAG, classification, clustering | Input: $0.0001/1,000 tokens; N/A for output | N/A |
Embed – Multilingual | 1024 | Semantic search, RAG, classification, clustering | Input: $0.0001/1,000 tokens; N/A for output | N/A |
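For semantic search with Embed, a sketch of the request looks like the following. Cohere's Bedrock body distinguishes documents being indexed from incoming queries via `input_type`; the helper names are our own.

```python
import json

def build_embed_request(texts, input_type="search_document"):
    # Cohere Embed request shape on Bedrock; use "search_query" as the
    # input_type when embedding an incoming search query instead.
    return {"texts": texts, "input_type": input_type}

def embed(texts, model_id="cohere.embed-english-v3"):
    import boto3  # requires AWS credentials with Cohere model access
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.invoke_model(
        modelId=model_id, body=json.dumps(build_embed_request(texts))
    )
    return json.loads(response["body"].read())["embeddings"]
```

Each returned embedding is a 1024-dimensional vector, matching the capacity listed in the table above.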
When to Use Cohere Models on AWS Bedrock?
Choose Cohere’s Command & Embed models for workloads involving confidential content, personally identifiable information, or sensitive business data:
- Advanced chatbot functionalities and content generation.
- Multilingual text processing and semantic searches.
- Privacy-centric applications with a need for model customization.
- Efficient summarization, classification, and clustering tasks in diverse business scenarios.
AI21 Labs’ Jurassic on AWS Bedrock
Jurassic represents a cutting-edge suite of large language models (LLMs) on Amazon Bedrock, designed to meet the evolving demands of generative AI applications. This LLM stands out for its deep comprehension abilities and expansive scope of functionalities, encompassing various aspects of language processing, and offering a selection of model sizes tailored for speed and cost efficiency.
Jurassic on Bedrock Applications and Use Cases
Its integration with AWS Bedrock opens up a plethora of diverse use cases. It can:
- Condense complex financial reports into succinct summaries.
- Generate tailored financial and legal statements.
- Craft unique marketing content in various styles and lengths.
- Provide natural language responses to customer inquiries.
- Enable employees to access and interpret organizational data effortlessly.
Jurassic Model Versions and Pricing on AWS Bedrock
Jurassic’s versatility is further expanded with different model versions, each catering to specific needs. The following table outlines these versions, their capabilities, supported use cases, and pricing details:
Model Version | Max Tokens | Languages | Supported Use Cases | Pricing per 1,000 Tokens |
---|---|---|---|---|
Jurassic-2 Ultra | 8,192 | English, Spanish, French, German, Portuguese, Italian, Dutch | Advanced QA, summarization, draft generation, complex reasoning | $0.0188 |
Jurassic-2 Mid | 8,192 | English, Spanish, French, German, Portuguese, Italian, Dutch | QA, summarization, draft generation, info extraction | $0.0125 |
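A sketch of invoking Jurassic-2 on Bedrock follows. AI21's request body uses camelCase keys, unlike the Anthropic and Meta formats; helper names and parameter values are illustrative.

```python
import json

def build_jurassic_request(prompt, max_tokens=300):
    # AI21 request shape on Bedrock (camelCase keys)
    return {"prompt": prompt, "maxTokens": max_tokens, "temperature": 0.7}

def draft_with_jurassic(prompt, model_id="ai21.j2-mid-v1"):
    import boto3  # requires AWS credentials with AI21 model access
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.invoke_model(
        modelId=model_id, body=json.dumps(build_jurassic_request(prompt))
    )
    return json.loads(response["body"].read())["completions"][0]["data"]["text"]
```

Upgrading from Jurassic-2 Mid to Ultra for more nuanced output is again just a `model_id` change (`ai21.j2-ultra-v1`).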
When to Use Jurassic on AWS Bedrock?
Whether for intricate reasoning, content generation, or natural language processing, Jurassic provides the necessary tools for innovative, AI-driven solutions. Jurassic’s models on AWS Bedrock are ideal when:
- You need high-quality, nuanced text generation for intricate tasks.
- Your application demands quick, natural language processing for real-time responses.
- You’re seeking a balance between exceptional quality and cost-efficiency.
- Your tasks require deep reasoning, logic, and language comprehension.
In essence, Jurassic’s models on AWS Bedrock serve as robust, adaptable tools for a wide range of generative AI applications, offering unique solutions tailored to the specific needs of various industries.
Meta Llama 2 on AWS Bedrock
Meta Llama 2 represents a revolutionary step in the realm of generative AI with its large language models (LLMs) ranging from 7 billion to 70 billion parameters. These models, particularly the fine-tuned Llama Chat variants, are meticulously crafted with a focus on safety, leveraging over a million human annotations and extensive red-teaming efforts.
This ensures that Llama 2 not only delivers exceptional performance but also maintains a high standard of safety in its interactions. These safety measures are crucial when creating generative AI support assistants or chatbots that will serve on the websites of official entities.
Llama 2 on Bedrock Applications and Use Cases
Llama 2’s integration with AWS Bedrock brings to the forefront an array of possibilities for AI-powered applications:
- Llama-2-13b-chat: This version is adept at smaller-scale tasks like text classification, sentiment analysis, and language translation. Its 13B parameter size makes it suitable for applications that require precise, yet compact model capabilities.
- Llama-2-70b-chat: Tailored for more extensive tasks like language modeling, text generation, and dialogue systems, the 70B parameter model excels in handling complex and large-scale AI challenges.
Llama 2 Model Versions and Pricing on AWS Bedrock
Meta’s Llama 2 models on AWS Bedrock come in two primary versions, each designed for specific application scopes. The following table provides a detailed overview of these versions:
Llama 2 Model Versions | Max Tokens | Languages | Supported Use Cases | On-Demand and Batch Pricing (USD) |
---|---|---|---|---|
Llama-2-13b-chat | 4K | English | Assistant-like chat | Input: $0.00075/1,000 tokens; Output: $0.00100/1,000 tokens |
Llama-2-70b-chat | 4K | English | Assistant-like chat | Input: $0.00195/1,000 tokens; Output: $0.00256/1,000 tokens |
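Invoking Llama 2 on Bedrock can be sketched as follows; Meta's request body uses `prompt` and `max_gen_len`, and the sampling values shown are illustrative defaults.

```python
import json

def build_llama_request(prompt, max_gen_len=256):
    # Meta Llama 2 request shape on Bedrock
    return {
        "prompt": prompt,
        "max_gen_len": max_gen_len,
        "temperature": 0.6,
        "top_p": 0.9,
    }

def chat_with_llama(prompt, model_id="meta.llama2-13b-chat-v1"):
    import boto3  # requires AWS credentials with Llama 2 model access
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.invoke_model(
        modelId=model_id, body=json.dumps(build_llama_request(prompt))
    )
    return json.loads(response["body"].read())["generation"]
```

Moving to the 70B variant for heavier language-modeling tasks means swapping the `model_id` to `meta.llama2-70b-chat-v1` at the higher per-token rate in the table above.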
When to Use Llama 2 on AWS Bedrock?
Llama 2’s models are a prime choice in scenarios where:
- Dialogue and interaction-focused AI applications are a priority.
- Tasks require a balance between comprehensive language understanding and concise model deployment.
- There’s a need for a wide-ranging scale of tasks, from basic text classification to elaborate language modeling and text generation.
Llama 2 in AWS Bedrock stands out for its adaptability and safety-focused design, making it a robust option for businesses seeking to integrate dialogue-based AI capabilities into their applications.
Stable Diffusion XL on AWS Bedrock
Stability AI’s Stable Diffusion XL presents a groundbreaking advancement in text-to-image generation on Amazon Bedrock. This model, with its 3.5 billion parameter base and 6.6 billion parameter ensemble pipeline, is renowned for its state-of-the-art open architecture.
It is specifically designed to generate high-quality images with cinematic photorealism and intricate detail, capable of creating complex compositions from basic natural language prompts.
Stable Diffusion XL on Bedrock Applications and Use Cases
Stable Diffusion XL’s integration into AWS Bedrock unlocks a new realm of creative possibilities, ideally suited for:
- Personalized Advertising and Marketing Campaigns: Generating tailor-made ad visuals and marketing materials to enhance brand appeal.
- Creative Asset Development: Ideating and crafting unlimited creative assets, including characters, scenes, and worlds, particularly useful in media, entertainment, gaming, and metaverse projects.
Stable Diffusion XL Model Versions and Pricing on AWS Bedrock
Stable Diffusion XL comes in two primary versions, each optimized for specific creative tasks. Below is a table that outlines these versions, their prompt token limits, supported use cases, and pricing details for image generation:
Stable Diffusion XL Model Versions | Max Prompt Tokens | Supported Use Cases | Image Resolution | Pricing per Image (Standard Quality <=50 steps) | Pricing per Image (Premium Quality >50 steps) |
---|---|---|---|---|---|
SDXL 1.0 | 77-token limit | Advertising, Media, Gaming | Up to 1024×1024 | $0.04 | $0.08 |
SDXL 0.8 | 77-token limit | Advertising, Media, Gaming | 512×512 and larger | 512×512 or smaller: $0.018; larger than 512×512: $0.036 | 512×512 or smaller: $0.036; larger than 512×512: $0.072 |
Provisioned Throughput Pricing is also available, with SDXL 1.0 priced at $49.86 per hour per model unit for a 1-month commitment and $46.18 for a 6-month commitment.
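A sketch of generating an image with SDXL on Bedrock: the response contains base64-encoded image artifacts, and keeping `steps` at 50 or below falls under the standard-quality pricing tier above. Parameter values are illustrative.

```python
import base64
import json

def build_sdxl_request(prompt, steps=40):
    # Stability AI request shape; steps <= 50 is billed at standard quality
    return {
        "text_prompts": [{"text": prompt}],
        "cfg_scale": 7,
        "steps": steps,
        "seed": 0,
    }

def generate_image(prompt, out_path="image.png"):
    import boto3  # requires AWS credentials with Stability AI model access
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.invoke_model(
        modelId="stability.stable-diffusion-xl-v1",
        body=json.dumps(build_sdxl_request(prompt)),
    )
    artifact = json.loads(response["body"].read())["artifacts"][0]
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(artifact["base64"]))
```

Raising `steps` above 50 moves the request into the premium-quality pricing tier, so the parameter directly affects per-image cost.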
When to Use Stable Diffusion XL on AWS Bedrock?
Stable Diffusion XL is the ideal choice for applications that demand:
- High-fidelity, photorealistic image generation from textual descriptions.
- Creative brainstorming and asset creation for advertising, media, and entertainment domains.
- Innovative character and world-building in gaming and metaverse environments.
With its advanced architecture and fine-tuning capabilities, Stable Diffusion XL on AWS Bedrock serves as a powerful tool for anyone looking to harness the power of generative AI for visually driven projects.
Amazon Titan on AWS Bedrock
Amazon Titan represents a comprehensive suite of foundation models (FMs) exclusive to Amazon Bedrock, embodying Amazon’s extensive experience in AI and machine learning. This family of models includes high-performing image, multimodal, and text models, each designed to empower a broad spectrum of generative AI applications.
These models stand out for their built-in support for responsible AI, including content filtering and the rejection of inappropriate inputs, prioritizing safety in every interaction.
Titan on Bedrock Applications and Use Cases
Amazon Titan’s models integrate seamlessly into AWS Bedrock, offering versatile applications across various sectors:
- Text Models: Enhance productivity in tasks like blog post creation, article classification, open-ended Q&A, and conversational chat.
- Multimodal Embeddings: Improve the accuracy of multimodal searches, recommendations, and personalization experiences.
- Image Generator: Support content creators in generating high-quality images quickly and efficiently, ideal for industries like advertising, e-commerce, and media.
Titan Model Versions and Pricing on AWS Bedrock
Amazon Titan models come in various versions, each tailored to specific application needs. Below is a detailed overview of these versions, including their capabilities, supported use cases, and pricing:
Titan Model Versions | Max Tokens/Images | Languages | Supported Use Cases | On-Demand and Batch Pricing |
---|---|---|---|---|
Titan Text Express | 8K tokens | English (GA), 100+ languages (Preview) | Retrieval generation, text generation, brainstorming, Q&A, chat | Input: $0.0008/1,000 tokens; Output: $0.0016/1,000 tokens |
Titan Text Lite | 4K tokens | English | Summarization, copywriting | Input: $0.0003/1,000 tokens; Output: $0.0004/1,000 tokens |
Titan Text Embeddings | 8K tokens | 25+ languages | Text retrieval, semantic similarity, clustering | Input: $0.0001/1,000 tokens; N/A for output |
Titan Multimodal Embeddings | 128 tokens, 25 MB images | English | Search, recommendation, personalization | Input: $0.0008/1,000 tokens; Input image: $0.00006 |
Titan Image Generator | 77 tokens, 25 MB images | English | Text to image generation, image editing, variations | 512×512: $0.008 (Standard), $0.01 (Premium); 1024×1024: $0.01 (Standard), $0.012 (Premium) |
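Invoking a Titan text model can be sketched as follows; Amazon's request body wraps the prompt in `inputText` with a `textGenerationConfig`, and the parameter values shown are illustrative.

```python
import json

def build_titan_request(prompt, max_tokens=512):
    # Amazon Titan text request shape (inputText + textGenerationConfig)
    return {
        "inputText": prompt,
        "textGenerationConfig": {
            "maxTokenCount": max_tokens,
            "temperature": 0.5,
            "topP": 0.9,
        },
    }

def generate_with_titan(prompt, model_id="amazon.titan-text-express-v1"):
    import boto3  # requires AWS credentials with Titan model access
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.invoke_model(
        modelId=model_id, body=json.dumps(build_titan_request(prompt))
    )
    return json.loads(response["body"].read())["results"][0]["outputText"]
```

Titan Text Lite (`amazon.titan-text-lite-v1`) accepts the same request shape at the lower rates listed above, making it a drop-in swap for cost-sensitive summarization and copywriting tasks.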
When to Use Amazon Titan on AWS Bedrock?
Opt for Amazon Titan’s models when your application requires:
- Diverse and high-performing AI solutions for text, image, and multimodal tasks.
- A responsible AI approach with built-in mechanisms for content safety.
- Customization options to tailor models to specific organizational needs and domains.
- Generating high-quality, realistic images or enhancing search and recommendation systems.
Conclusion
Each foundation model, with its unique offerings and tailored features, caters to specific needs within AI-driven applications. For cloud engineers and decision-makers, understanding the distinct characteristics of these models is crucial in harnessing their full potential. Users can leverage Bedrock’s model evaluation feature to understand each model’s performance and suitability for their specific use cases.
However, the decision-making process should not stop at capabilities alone.
Cost considerations, applicability to current and future projects, model updating frequency, and code reusability are critical factors that must be weighed. It’s essential to consider the total cost of ownership, including the ongoing costs associated with model training, maintenance, and scaling.
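To make the on-demand cost comparison concrete, token prices from the tables above can be turned into a simple monthly estimate. The request volumes are hypothetical inputs you would replace with your own traffic figures.

```python
# On-demand prices per 1,000 tokens from the tables above (US East/West)
PRICES = {
    "anthropic.claude-v2:1": {"input": 0.00800, "output": 0.02400},
    "anthropic.claude-instant-v1": {"input": 0.00163, "output": 0.00551},
    "meta.llama2-13b-chat-v1": {"input": 0.00075, "output": 0.00100},
    "amazon.titan-text-express-v1": {"input": 0.0008, "output": 0.0016},
}

def monthly_cost(model_id, requests, in_tokens, out_tokens):
    """Estimated monthly on-demand cost in USD for a given request volume,
    with in_tokens/out_tokens the average tokens per request."""
    p = PRICES[model_id]
    per_request = (in_tokens / 1000) * p["input"] + (out_tokens / 1000) * p["output"]
    return requests * per_request
```

For example, 1,000 requests per month at 1,000 input and 1,000 output tokens each comes to $1.75 on Llama-2-13b-chat versus $32.00 on Claude 2.1, which is why pairing a cheaper model with your less demanding workloads can cut the bill substantially.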
Struggling with AI Costs?
Organizations today use a multitude of cloud, AI, and SaaS services to keep their business operations running smoothly. However, managing these resources can become overwhelming, and organizations may find themselves overpaying for services or accumulating expenditures from underutilized resources.
Economize offers an end-to-end FinOps solution that enables organizations to develop a cost optimization strategy to reduce their cloud, AI, and SaaS costs. With Economize, organizations can achieve their business objectives without breaking the bank on cloud and AI services.
Sign up for a free demo to start saving on your cloud services today.