Monday, July 31, 2023

AWS Week in Review – Agents for Amazon Bedrock, Amazon SageMaker Canvas New Capabilities, and More – July 31, 2023

This July, AWS communities in ASEAN made history. First, the AWS User Group Malaysia held the first AWS Community Day in Malaysia.

The AWS User Group Philippines achieved another significant milestone, celebrating its tenth anniversary with a two-day AWS Community Day Philippines. Here are a few photos from the event, including Jeff Barr sharing his experiences attending an AWS User Group meetup in Manila, Philippines, 10 years ago.

Big congratulations to AWS Community Heroes, AWS Community Builders, AWS User Group leaders, and all volunteers who organized and delivered AWS Community Days! Also, thank you to everyone who attended and helped support our AWS communities.

Last Week’s Launches
We had interesting launches last week, including from the AWS Summit in New York. Here are some of my personal highlights:

(Preview) Agents for Amazon Bedrock – You can now create managed agents for Amazon Bedrock to handle tasks using API calls to company systems, understand user requests, break down complex tasks into steps, hold conversations to gather more information, and take actions to fulfill requests.

(Coming Soon) New LLM Capabilities in Amazon QuickSight Q – We are expanding the innovation in QuickSight Q by introducing new LLM capabilities through Amazon Bedrock. These Generative BI capabilities will allow organizations to easily explore data, uncover insights, and facilitate sharing of insights.

AWS Glue Studio support for Amazon CodeWhisperer – You can now write specific tasks in natural language (English) as comments in the Glue Studio notebook, and Amazon CodeWhisperer provides code recommendations for you.

(Preview) Vector Engine for Amazon OpenSearch Serverless – This capability empowers you to create modern ML-augmented search experiences and generative AI applications without the need to handle the complexities of managing the underlying vector database infrastructure.

Last week, Amazon SageMaker Canvas also released a set of new capabilities:

AWS Open-Source Updates
As always, my colleague Ricardo has curated the latest updates for open-source news at AWS. Here are some of the highlights.

cdk-aws-observability-accelerator is a set of opinionated modules to help you set up observability for your AWS environments with AWS native services and AWS-managed observability services such as Amazon Managed Service for Prometheus, Amazon Managed Grafana, AWS Distro for OpenTelemetry (ADOT) and Amazon CloudWatch.

iac-devtools-cli-for-cdk is a command line interface tool that automates many of the tedious tasks of building, adding to, documenting, and extending AWS CDK applications.

Upcoming AWS Events
There are upcoming events that you can join to continue learning. Let’s start with AWS events:

And let’s learn from our fellow builders and join AWS Community Days:

Open for Registration for AWS re:Invent
We want to be sure you know that AWS re:Invent registration is now open!


This learning conference hosted by AWS for the global cloud computing community will be held from November 27 to December 1, 2023, in Las Vegas.

Pro-tip: You can use the information on the Justify Your Trip page to prove the value of your trip to AWS re:Invent.

Give Us Your Feedback
We’re focused on improving our content to provide a better customer experience, and we need your feedback to do so. Please take this quick survey to share insights on your experience with the AWS Blog. Note that this survey is hosted by an external company, so the link does not lead to our website. AWS handles your information as described in the AWS Privacy Notice.

That’s all for this week. Check back next Monday for another Week in Review.

Happy building!

Donnie

This post is part of our Week in Review series. Check back each week for a quick round-up of interesting news and announcements from AWS!





from AWS News Blog https://ift.tt/hOrF13m
via IFTTT

Friday, July 28, 2023

New – AWS Public IPv4 Address Charge + Public IP Insights

We are introducing a new charge for public IPv4 addresses. Effective February 1, 2024, there will be a charge of $0.005 per IP per hour for all public IPv4 addresses, whether attached to a service or not (there is already a charge for public IPv4 addresses you allocate in your account but don’t attach to an EC2 instance).

Public IPv4 Charge
As you may know, IPv4 addresses are an increasingly scarce resource and the cost to acquire a single public IPv4 address has risen more than 300% over the past 5 years. This change reflects our own costs and is also intended to encourage you to be a bit more frugal with your use of public IPv4 addresses and to think about accelerating your adoption of IPv6 as a modernization and conservation measure.

This change applies to all AWS services including Amazon Elastic Compute Cloud (Amazon EC2), Amazon Relational Database Service (RDS) database instances, Amazon Elastic Kubernetes Service (EKS) nodes, and other AWS services that can have a public IPv4 address allocated and attached, in all AWS regions (commercial, AWS China, and GovCloud). Here’s a summary in tabular form:

Public IP Address Type | Current Price/Hour (USD) | New Price/Hour (USD), Effective February 1, 2024
In-use public IPv4 address (including Amazon-provided public IPv4 and Elastic IP) assigned to resources in your VPC, Amazon Global Accelerator, and AWS Site-to-Site VPN tunnel | No charge | $0.005
Additional (secondary) Elastic IP address on a running EC2 instance | $0.005 | $0.005
Idle Elastic IP address in account | $0.005 | $0.005

The AWS Free Tier for EC2 will include 750 hours of public IPv4 address usage per month for the first 12 months, effective February 1, 2024. You will not be charged for IP addresses that you own and bring to AWS using Amazon BYOIP.
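As a rough back-of-the-envelope estimate (my own arithmetic, not an official AWS figure), here is what the new charge works out to for an always-on address:

```python
# Back-of-the-envelope cost of the new public IPv4 charge (my own
# arithmetic; actual bills depend on exact hours and the EC2 free tier).
RATE_PER_HOUR = 0.005   # USD per public IPv4 address per hour
HOURS_PER_MONTH = 730   # AWS's standard 730-hour billing month

monthly = RATE_PER_HOUR * HOURS_PER_MONTH
yearly = monthly * 12

print(f"Per address: ${monthly:.2f}/month, ${yearly:.2f}/year")
# A fleet of 100 always-on public IPv4 addresses:
print(f"100 addresses: ${100 * monthly:,.2f}/month")
```

Small per-address amounts add up quickly across a large fleet, which is exactly the behavior this change is meant to surface.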

Starting today, your AWS Cost and Usage Reports automatically include public IPv4 address usage. When this price change goes into effect next year, you will also be able to use AWS Cost Explorer to see and better understand your usage.

As I noted earlier in this post, I would like to encourage you to consider accelerating your adoption of IPv6. A new blog post shows you how to use Elastic Load Balancers and NAT Gateways for ingress and egress traffic, while avoiding the use of a public IPv4 address for each instance that you launch. Here are some resources to show you how you can use IPv6 with widely used services such as EC2, Amazon Virtual Private Cloud (Amazon VPC), Amazon Elastic Kubernetes Service (EKS), Elastic Load Balancing, and Amazon Relational Database Service (RDS):

Earlier this year we enhanced EC2 Instance Connect and gave it the ability to connect to your instances using private IPv4 addresses. As a result, you no longer need to use public IPv4 addresses for administrative purposes (generally using SSH or RDP).

Public IP Insights
In order to make it easier for you to monitor, analyze, and audit your use of public IPv4 addresses, today we are launching Public IP Insights, a new feature of Amazon VPC IP Address Manager that is available to you at no cost. In addition to helping you to make efficient use of public IPv4 addresses, Public IP Insights will give you a better understanding of your security profile. You can see the breakdown of public IP types and EIP usage, with multiple filtering options:

You can also see, sort, filter, and learn more about each of the public IPv4 addresses that you are using:
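If you want to build your own quick inventory alongside Public IP Insights, the EC2 DescribeNetworkInterfaces API reports the public IP associated with each network interface. Here is a minimal sketch; the parsing helper is separated out so it can be exercised without AWS credentials:

```python
# Sketch: inventory the public IPv4 addresses in use in a region.
# The helper parses the response shape of EC2's DescribeNetworkInterfaces
# API; the actual boto3 call is shown in the comment at the bottom.

def public_ipv4_addresses(network_interfaces):
    """Extract (interface id, public IP) pairs from a
    DescribeNetworkInterfaces-style response list."""
    results = []
    for eni in network_interfaces:
        association = eni.get("Association", {})
        if "PublicIp" in association:
            results.append((eni["NetworkInterfaceId"], association["PublicIp"]))
    return results

# With real credentials you would fetch the interfaces like this:
#   import boto3
#   enis = boto3.client("ec2").describe_network_interfaces()["NetworkInterfaces"]
#   print(public_ipv4_addresses(enis))
```

(For a complete picture you would also paginate the API and check Elastic IPs via DescribeAddresses; Public IP Insights does all of this for you.)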

Using IPv4 Addresses Efficiently
By using the new IP Insights tool and following the guidance that I shared above, you should be ready to update your application to minimize the effect of the new charge. You may also want to consider using AWS Direct Connect to set up a dedicated network connection to AWS.

Finally, be sure to read our new blog post, Identify and Optimize Public IPv4 Address Usage on AWS, for more information on how to make the best use of public IPv4 addresses.

Jeff;



from AWS News Blog https://ift.tt/dscxJmv
via IFTTT

Thursday, July 27, 2023

New Amazon EC2 Instances (C7gd, M7gd, and R7gd) Powered by AWS Graviton3 Processor with Local NVMe-based SSD Storage

We launched Amazon EC2 C7g instances in May 2022 and M7g and R7g instances in February 2023. Powered by the latest AWS Graviton3 processors, the new instances deliver up to 25 percent higher performance, up to 2 times higher floating-point performance, and up to 2 times faster cryptographic workload performance compared to AWS Graviton2 processors.

Graviton3 processors deliver up to 3 times better performance compared to AWS Graviton2 processors for machine learning (ML) workloads, including support for bfloat16. They also support DDR5 memory that provides 50 percent more memory bandwidth compared to DDR4. Graviton3 also uses up to 60 percent less energy for the same performance as comparable EC2 instances, which helps you reduce your carbon footprint.

The C7g instances are well suited for compute-intensive workloads, such as high performance computing (HPC), batch processing, ad serving, video encoding, gaming, scientific modeling, distributed analytics, and CPU-based machine learning inference. The M7g instances are for general purpose workloads such as application servers, microservices, gaming servers, mid-sized data stores, and caching fleets. The R7g instances are a great fit for memory-intensive workloads such as open-source databases, in-memory caches, and real-time big data analytics.

Today, we’re adding a d variant to all three instance families. The new Amazon EC2 C7gd, M7gd, and R7gd instance types have up to 2 x 1.9 TB of locally attached NVM Express (NVMe) SSD drives that are physically connected to the host server and provide block-level storage that is coupled to the lifetime of the instance. These instances have up to 45 percent better real-time NVMe storage performance than comparable Graviton2-based instances.

These are a great fit for applications that need access to high-speed, low-latency local storage, including those that need temporary storage of data for scratch space, temporary files, and caches. The data on an instance store volume persists only during the life of the associated EC2 instance.
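Because instance store data is ephemeral, you typically format and mount the local NVMe volume at boot, for example via user data. A minimal illustrative sketch follows; device names such as /dev/nvme1n1 vary by instance, so verify with lsblk before relying on it:

```shell
#!/bin/bash
# Illustrative user-data sketch for a C7gd/M7gd/R7gd instance: format and
# mount the local NVMe instance store volume as scratch space at boot.
# Device naming varies; verify with `lsblk` before using /dev/nvme1n1.
set -euo pipefail

DEVICE=/dev/nvme1n1
MOUNT_POINT=/scratch

mkfs -t xfs "$DEVICE"      # contents are lost when the instance stops anyway
mkdir -p "$MOUNT_POINT"
mount "$DEVICE" "$MOUNT_POINT"
```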

Here are the specs for these instances:

Instance Size | vCPUs | Memory (GiB) C7gd / M7gd / R7gd | Local NVMe Storage (GB) | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps)
medium | 1 | 2 / 4 / 8 | 1 x 59 | Up to 12.5 | Up to 10
large | 2 | 4 / 8 / 16 | 1 x 118 | Up to 12.5 | Up to 10
xlarge | 4 | 8 / 16 / 32 | 1 x 237 | Up to 12.5 | Up to 10
2xlarge | 8 | 16 / 32 / 64 | 1 x 474 | Up to 15 | Up to 10
4xlarge | 16 | 32 / 64 / 128 | 1 x 950 | Up to 15 | Up to 10
8xlarge | 32 | 64 / 128 / 256 | 1 x 1900 | 15 | 10
12xlarge | 48 | 96 / 192 / 384 | 2 x 1425 | 22.5 | 15
16xlarge | 64 | 128 / 256 / 512 | 2 x 1900 | 30 | 20

These instances are built on the AWS Nitro System, a combination of AWS-designed dedicated hardware and a lightweight hypervisor that allows the delivery of isolated multitenancy, private networking, and fast local storage. They provide up to 20 Gbps Amazon Elastic Block Store (Amazon EBS) bandwidth and up to 30 Gbps network bandwidth. The 16xlarge instances also support Elastic Fabric Adapter (EFA) for applications that need a high level of inter-node communication.

Now Available
Amazon EC2 C7gd, M7gd, and R7gd instances are now available in the following AWS Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), and Europe (Ireland). As usual with Amazon EC2, you only pay for what you use. For more information, see the Amazon EC2 pricing page.

If you’re optimizing applications for Arm architecture, be sure to have a look at our Getting Started collection of resources or learn more about AWS Graviton3-based EC2 instances.

To learn more, visit our Amazon EC2 C7g, M7g, or R7g instance pages, and please send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy



from AWS News Blog https://ift.tt/FhECxTS
via IFTTT

New: AWS Local Zone in Phoenix, Arizona – More Instance Types, More EBS Storage Classes, and More Services

I am happy to announce that a new AWS Local Zone in Phoenix, Arizona is now open and ready for you to use, with more instance types, storage classes, and services than ever before.

We launched the first AWS Local Zone in 2019 (AWS Now Available from a Local Zone in Los Angeles) with the goal of making a select set of EC2 instance types, EBS volume types, and other AWS services available with single-digit millisecond latency when accessed from Los Angeles and other locations in Southern California. Since then, we have launched a second Local Zone in Los Angeles, along with 15 more in other parts of the United States and another 17 around the world, 34 in all. We are also planning to build 19 more Local Zones outside of the US (see the Local Zones Locations page for a complete list).

Local Zones In Action
Our customers make use of Local Zones in many different ways. Popular use cases include real-time gaming, hybrid migrations, content creation for media & entertainment, live video streaming, engineering simulations, and AR/VR at the edge. Here are a couple of great examples that will give you a taste of what is possible:

Arizona State University (ASU) – Known for its innovation and research, ASU is among the largest universities in the U.S. with 173,000 students and 20,000 faculty and staff. Local Zones help them to accelerate the delivery of online services and storage, giving them a level of performance that is helping them to transform the educational experience for students and staff.

DISH Wireless – Two years ago, they began to build a cloud-native, fully virtualized 5G network on AWS, making use of Local Zones to support latency-sensitive real-time 5G applications and workloads at the network edge (read Telco Meets AWS Cloud to learn more). The new Local Zone in Phoenix will allow them to further enhance the strength and reliability of their network by extending their 5G core to the edge.

We work closely with these and many other customers to make sure that the Local Zone(s) that they use are a great fit for their use cases. In addition to the already-strong set of instance types, storage classes, and services that are part-and-parcel of every Local Zone, we add others on an as-needed basis.

For example, Local Zones in Los Angeles, Miami, and other locations have additional instance types; several Local Zones have additional Amazon Elastic Block Store (Amazon EBS) storage classes, and others have extra services such as Application Load Balancer, Amazon FSx, Amazon EMR, Amazon ElastiCache, Amazon Relational Database Service (RDS), Amazon GameLift, and AWS Application Migration Service (AWS MGN). You can see this first-hand on the Local Zones Features page.

And Now, Phoenix
As I mentioned earlier, this Local Zone has more instance types, storage classes, and services than earlier Local Zones. Here’s what’s inside:

Instance Types – In addition to the T3, C5(d), R5(d), and G4dn instance types available in all other Local Zones, the Phoenix Local Zone includes C6i, M6i, R6i, and C6gn instances.

EBS Volume Types – In addition to the gp2 volumes that are available in all Local Zones, the Phoenix Local Zone includes gp3 (General Purpose SSD), io1 (Provisioned IOPS SSD), st1 (Throughput Optimized HDD), and sc1 (Cold HDD) storage.

Services – In addition to Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Block Store (Amazon EBS), AWS Shield, Amazon Virtual Private Cloud (Amazon VPC), Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (Amazon EKS), Application Load Balancer, and AWS Direct Connect, the Phoenix Local Zone includes NAT Gateway.

Pricing Models – In addition to On-Demand and Savings Plans, the Phoenix Local Zone includes Spot.

Going forward, we plan to launch more Local Zones that are similarly equipped.

Opting-In to the Phoenix Local Zone
The original Phoenix Local Zone was launched in 2022 and remains available to customers who have already enabled it. The Zone that we are announcing today can be enabled by new and existing customers.

To get started with this or any other Local Zone, I must first enable it. To do this, I open the EC2 Console, select the parent region (US West (Oregon)) from the menu, and then click EC2 Dashboard in the left-side navigation:

Then I click on Zones in the Account attributes box:

Next, I scroll down to the new Phoenix Local Zone (us-west-2-phx-2), and click Manage:

I click Enabled, and then Update zone group:

I confirm that I want to enable the Zone Group, and click Ok:

And I am all set. I can create EBS volumes, launch EC2 instances, and make use of the other services in this Local Zone.
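If you prefer to script the opt-in instead of using the console, the same Zone Group toggle is exposed through the EC2 ModifyAvailabilityZoneGroup API. Here is a minimal sketch; the zone group name us-west-2-phx-2 comes from the walkthrough above, and the function takes any EC2 client so it can be exercised without real credentials:

```python
def enable_zone_group(ec2_client, group_name="us-west-2-phx-2"):
    """Opt in to a Local Zone group, mirroring the console's
    'Update zone group' step via the ModifyAvailabilityZoneGroup API."""
    return ec2_client.modify_availability_zone_group(
        GroupName=group_name,
        OptInStatus="opted-in",
    )

# With real credentials, run this against the parent region:
#   import boto3
#   enable_zone_group(boto3.client("ec2", region_name="us-west-2"))
```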

Jeff;



from AWS News Blog https://ift.tt/bNWls0L
via IFTTT

Wednesday, July 26, 2023

New – Amazon EC2 P5 Instances Powered by NVIDIA H100 Tensor Core GPUs for Accelerating Generative AI and HPC Applications

In March 2023, AWS and NVIDIA announced a multipart collaboration focused on building the most scalable, on-demand artificial intelligence (AI) infrastructure optimized for training increasingly complex large language models (LLMs) and developing generative AI applications.

We preannounced Amazon Elastic Compute Cloud (Amazon EC2) P5 instances powered by NVIDIA H100 Tensor Core GPUs and AWS’s latest networking and scalability that will deliver up to 20 exaflops of compute performance for building and training the largest machine learning (ML) models. This announcement is the product of more than a decade of collaboration between AWS and NVIDIA, which has delivered visual computing, AI, and high performance computing (HPC) clusters across the Cluster GPU (cg1) instances (2010), G2 (2013), P2 (2016), P3 (2017), G3 (2017), P3dn (2018), G4 (2019), P4 (2020), G5 (2021), and P4de instances (2022).

Most notably, ML model sizes are now reaching trillions of parameters. But this complexity has increased customers’ time to train, where the latest LLMs are now trained over the course of multiple months. HPC customers also exhibit similar trends. With the fidelity of HPC customer data collection increasing and data sets reaching exabyte scale, customers are looking for ways to enable faster time to solution across increasingly complex applications.

Introducing EC2 P5 Instances
Today, we are announcing the general availability of Amazon EC2 P5 instances, the next-generation GPU instances to address those customer needs for high performance and scalability in AI/ML and HPC workloads. P5 instances are powered by the latest NVIDIA H100 Tensor Core GPUs and will provide a reduction of up to 6 times in training time (from days to hours) compared to previous generation GPU-based instances. This performance increase will enable customers to see up to 40 percent lower training costs.

P5 instances provide 8 x NVIDIA H100 Tensor Core GPUs with 640 GB of high bandwidth GPU memory, 3rd Gen AMD EPYC processors, 2 TB of system memory, and 30 TB of local NVMe storage. P5 instances also provide 3200 Gbps of aggregate network bandwidth with support for GPUDirect RDMA, enabling lower latency and efficient scale-out performance by bypassing the CPU on internode communication.
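The headline numbers above are internally consistent; a quick sanity check of the per-GPU and per-volume math (using the 80 GB per H100 GPU and 8 x 3.84 TB local volumes from the comparison later in this post):

```python
# Sanity-check the p5.48xlarge headline specs from the paragraph above.
gpus = 8
gpu_memory_gb = 80        # per NVIDIA H100 GPU
nvme_volumes = 8
nvme_volume_tb = 3.84     # per local NVMe volume

total_gpu_memory = gpus * gpu_memory_gb        # 640 GB, as stated
total_nvme_tb = nvme_volumes * nvme_volume_tb  # about 30 TB, as stated

print(total_gpu_memory, total_nvme_tb)
```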

Here are the specs for these instances:

Instance Size | vCPUs | Memory (GiB) | GPUs (H100) | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps) | Local Storage (TB)
p5.48xlarge | 192 | 2048 | 8 | 3200 | 80 | 8 x 3.84

Here’s a quick infographic that shows you how the P5 instances and NVIDIA H100 Tensor Core GPUs compare to previous instances and processors:

P5 instances are ideal for training and running inference for increasingly complex LLMs and computer vision models behind the most demanding and compute-intensive generative AI applications, including question answering, code generation, video and image generation, speech recognition, and more. P5 will provide up to 6 times lower time to train compared with previous generation GPU-based instances across those applications. Customers who can use lower precision FP8 data types in their workloads, common in many language models that use a transformer model backbone, will see a further benefit of up to a 6 times performance increase through support for the NVIDIA Transformer Engine.

HPC customers using P5 instances can deploy demanding applications at greater scale in pharmaceutical discovery, seismic analysis, weather forecasting, and financial modeling. Customers using dynamic programming (DP) algorithms for applications like genome sequencing or accelerated data analytics will also see further benefit from P5 through support for a new DPX instruction set.

This enables customers to explore problem spaces that previously seemed unreachable, iterate on their solutions at a faster clip, and get to market more quickly.

You can see the details of the instance specifications, along with a comparison between the p4d.24xlarge and the new p5.48xlarge, below:

Feature | p4d.24xlarge | p5.48xlarge | Comparison
Number & Type of Accelerators | 8 x NVIDIA A100 | 8 x NVIDIA H100 | –
FP8 TFLOPS per Server | – | 16,000 | 640% vs. A100 FP16
FP16 TFLOPS per Server | 2,496 | 8,000 | –
GPU Memory | 40 GB | 80 GB | 200%
GPU Memory Bandwidth | 12.8 TB/s | 26.8 TB/s | 200%
CPU Family | Intel Cascade Lake | AMD Milan | –
vCPUs | 96 | 192 | 200%
Total System Memory | 1152 GB | 2048 GB | 200%
Networking Throughput | 400 Gbps | 3200 Gbps | 800%
EBS Throughput | 19 Gbps | 80 Gbps | 400%
Local Instance Storage | 8 TB NVMe | 30 TB NVMe | 375%
GPU to GPU Interconnect | 600 GB/s | 900 GB/s | 150%

Second-generation Amazon EC2 UltraClusters and Elastic Fabric Adapter
P5 instances provide market-leading scale-out capability for multi-node distributed training and tightly coupled HPC workloads. They offer up to 3,200 Gbps of networking using second-generation Elastic Fabric Adapter (EFA) technology, 8 times the bandwidth of P4d instances.

To address customer needs for large-scale and low latency, P5 instances are deployed in the second-generation EC2 UltraClusters, which now provide customers with lower latency across up to 20,000+ NVIDIA H100 Tensor Core GPUs. Providing the largest scale of ML infrastructure in the cloud, P5 instances in EC2 UltraClusters deliver up to 20 exaflops of aggregate compute capability.

EC2 UltraClusters use Amazon FSx for Lustre, fully managed shared storage built on the most popular high-performance parallel file system. With FSx for Lustre, you can quickly process massive datasets on demand and at scale and deliver sub-millisecond latencies. The low-latency and high-throughput characteristics of FSx for Lustre are optimized for deep learning, generative AI, and HPC workloads on EC2 UltraClusters.

FSx for Lustre keeps the GPUs and ML accelerators in EC2 UltraClusters fed with data, accelerating the most demanding workloads. These workloads include LLM training, generative AI inferencing, and HPC workloads, such as genomics and financial risk modeling.

Getting Started with EC2 P5 Instances
To get started, you can use P5 instances in the US East (N. Virginia) and US West (Oregon) Regions.

When launching P5 instances, you can choose AWS Deep Learning AMIs (DLAMIs) that support P5 instances. DLAMIs provide ML practitioners and researchers with the infrastructure and tools to quickly build scalable, secure, distributed ML applications in preconfigured environments.

You will be able to run containerized applications on P5 instances with AWS Deep Learning Containers using libraries for Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). For a more managed experience, you can also use P5 instances via Amazon SageMaker, which helps developers and data scientists easily scale to tens, hundreds, or thousands of GPUs to train a model quickly at any scale without worrying about setting up clusters and data pipelines. HPC customers can leverage AWS Batch and ParallelCluster with P5 to help orchestrate jobs and clusters efficiently.

Existing P4 customers will need to update their AMIs to use P5 instances. Specifically, you will need to update your AMIs to include the latest NVIDIA driver with support for NVIDIA H100 Tensor Core GPUs. You will also need to install the latest CUDA version (CUDA 12), cuDNN version, framework versions (e.g., PyTorch, TensorFlow), and EFA driver with updated topology files. To make this process easy for you, we will provide new DLAMIs and Deep Learning Containers that come prepackaged with all the needed software and frameworks to use P5 instances out of the box.

Now Available
Amazon EC2 P5 instances are available today in the US East (N. Virginia) and US West (Oregon) AWS Regions. For more information, see the Amazon EC2 pricing page. To learn more, visit our P5 instance page, and please send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

You can choose a broad range of AWS services that have generative AI built in, all running on the most cost-effective cloud infrastructure for generative AI. To learn more, visit Generative AI on AWS to innovate faster and reinvent your applications.

Channy



from AWS News Blog https://ift.tt/k76ohDZ
via IFTTT

Preview – Enable Foundation Models to Complete Tasks With Agents for Amazon Bedrock

This April, Swami Sivasubramanian, Vice President of Data and Machine Learning at AWS, announced Amazon Bedrock and Amazon Titan models as part of new tools for building with generative AI on AWS. Amazon Bedrock, currently available in preview, is a fully managed service that makes foundation models (FMs) from Amazon and leading AI startups—such as AI21 Labs, Anthropic, Cohere, and Stability AI—available through an API.

Today, I’m excited to announce the preview of agents for Amazon Bedrock, a new capability for developers to create fully managed agents in a few clicks. Agents for Amazon Bedrock accelerate the delivery of generative AI applications that can manage and perform tasks by making API calls to your company systems. Agents extend FMs to understand user requests, break down complex tasks into multiple steps, carry on a conversation to collect additional information, and take actions to fulfill the request.

Agents for Amazon Bedrock

Using agents for Amazon Bedrock, you can automate tasks for your internal or external customers, such as managing retail orders or processing insurance claims. For example, an agent-powered generative AI e-commerce application can not only respond to the question, “Do you have this jacket in blue?” with a simple answer but can also help you with the task of updating your order or managing an exchange.

For this to work, you first need to give the agent access to external data sources and connect it to existing APIs of other applications. This allows the FM that powers the agent to interact with the broader world and extend its utility beyond just language processing tasks. Second, the FM needs to figure out what actions to take, what information to use, and in which sequence to perform these actions. This is possible thanks to an exciting emerging behavior of FMs—their ability to reason. You can show FMs how to handle such interactions and how to reason through tasks by building prompts that include definitions and instructions. The process of designing prompts to guide the model towards desired outputs is known as prompt engineering.

Introducing Agents for Amazon Bedrock
Agents for Amazon Bedrock automate the prompt engineering and orchestration of user-requested tasks. Once configured, an agent automatically builds the prompt and securely augments it with your company-specific information to provide responses back to the user in natural language. The agent is able to figure out the actions required to automatically process user-requested tasks. It breaks the task into multiple steps, orchestrates a sequence of API calls and data lookups, and maintains memory to complete the action for the user.

With fully managed agents, you don’t have to worry about provisioning or managing infrastructure. You’ll have seamless support for monitoring, encryption, user permissions, and API invocation management without writing custom code. As a developer, you can use the Bedrock console or SDK to upload the API schema. The agent then orchestrates the tasks with the help of FMs and performs API calls using AWS Lambda functions.

Primer on Advanced Reasoning and ReAct
You can help FMs to reason and figure out how to solve user-requested tasks with a reasoning technique called ReAct (synergizing reasoning and acting). Using ReAct, you can structure prompts to show an FM how to reason through a task and decide on actions that help find a solution. The structured prompts include a sequence of question-thought-action-observation examples.

The question is the user-requested task or problem to solve. The thought is a reasoning step that helps demonstrate to the FM how to tackle the problem and identify an action to take. The action is an API that the model can invoke from an allowed set of APIs. The observation is the result of carrying out the action. The actions that the FM is able to choose from are defined by a set of instructions that are prepended to the example prompt text. Here is an illustration of how you would build up a ReAct prompt:

Building up a ReAct prompt
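As a purely illustrative sketch (the action names, claim IDs, and wording are hypothetical, not Bedrock’s actual internal prompt), a ReAct-style prompt assembled from those pieces might look like this:

```python
# Hypothetical ReAct-style prompt, assembled from the pieces described
# above: instructions and allowed actions, then question/thought/action/
# observation turns. Illustrative only, not Bedrock's real prompt text.
react_prompt = """\
Instructions: Answer the question using only the actions listed below.
Allowed actions: get_open_claims(), get_missing_documents(claim_id)

Question: Which open claims are missing paperwork?
Thought: I should first retrieve the list of open claims.
Action: get_open_claims()
Observation: [claim-857, claim-006]
Thought: Now I should check each claim for outstanding documents.
Action: get_missing_documents(claim-857)
Observation: [driver-license-copy]
"""

print(react_prompt)
```

Each Thought line demonstrates the reasoning step, each Action names an allowed API, and each Observation feeds the result back into the next reasoning step.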

The good news is that Bedrock performs the heavy lifting for you! Behind the scenes, agents for Amazon Bedrock build the prompts based on the information and actions you provide.

Now, let me show you how to get started with agents for Amazon Bedrock.

Create an Agent for Amazon Bedrock
Let’s assume you’re a developer at an insurance company and want to provide a generative AI application that helps the insurance agency owners automate repetitive tasks. You create an agent in Bedrock and integrate it into your application.

To get started with the agent, open the Bedrock console, select Agents in the left navigation panel, then choose Create Agent.

Agents for Amazon Bedrock

This starts the agent creation workflow.

  1. Provide agent details including agent name, description (optional), whether the agent is allowed to request additional user inputs, and the AWS Identity and Access Management (IAM) service role that gives your agent access to other required services, such as Amazon Simple Storage Service (Amazon S3) and AWS Lambda.
  2. Select a foundation model from Bedrock that fits your use case. Here, you provide an instruction to your agent in natural language. The instruction tells the agent what task it’s supposed to perform and the persona it’s supposed to assume. For example, “You are an agent designed to help with processing insurance claims and managing pending paperwork.”
  3. Add action groups. An action is a task that the agent can perform automatically by making API calls to your company systems. A set of actions is defined in an action group. Here, you provide an API schema that defines the APIs for all the actions in the group. You also must provide a Lambda function that represents the business logic for each API. For example, let’s define an action group called ClaimManagementActionGroup that manages insurance claims by pulling a list of open claims, identifying outstanding paperwork for each claim, and sending reminders to policy holders. Make sure to capture this information in the action group description. The FM will use this information to reason and decide which action to take. The business logic for my action group is captured in the Lambda function InsuranceClaimsLambda. This AWS Lambda function implements methods for the following API calls: open-claims, identify-missing-documents, and send-reminders. Here’s a short extract from my InsuranceClaimsLambda:
    import json
    import time
     
    def open_claims():
        ...
    
    def identify_missing_documents(parameters):
        ...
     
    def send_reminders():
        ...
     
    def lambda_handler(event, context):
        responses = []
     
        for prediction in event['actionGroups']:
            response_code = ...
            action = prediction['actionGroup']
            api_path = prediction['apiPath']
            
            if api_path == '/claims':
                body = open_claims()
            elif api_path == '/claims/{claimId}/identify-missing-documents':
                parameters = prediction['parameters']
                body = identify_missing_documents(parameters)
            elif api_path == '/send-reminders':
                body = send_reminders()
            else:
                body = "{}::{} is not a valid api, try another one.".format(action, api_path)
     
            response_body = {
                'application/json': {
                    'body': str(body)
                }
            }
            
            action_response = {
                'actionGroup': prediction['actionGroup'],
                'apiPath': prediction['apiPath'],
                'httpMethod': prediction['httpMethod'],
                'responseCode': response_code,
                'responseBody': response_body
            }
            
            responses.append(action_response)
     
        api_response = {'response': responses}
     
        return api_response

    Note that you also must provide an API schema in the OpenAPI schema JSON format. Here’s what my API schema file insurance_claim_schema.json looks like:

    {"openapi": "3.0.0",
        "info": {
            "title": "Insurance Claims Automation API",
            "version": "1.0.0",
            "description": "APIs for managing insurance claims by pulling a list of open claims, identifying outstanding paperwork for each claim, and sending reminders to policy holders."
        },
        "paths": {
            "/claims": {
                "get": {
                    "summary": "Get a list of all open claims",
                    "description": "Get the list of all open insurance claims. Return all the open claimIds.",
                    "operationId": "getAllOpenClaims",
                    "responses": {
                        "200": {
                            "description": "Gets the list of all open insurance claims for policy holders",
                            "content": {
                                "application/json": {
                                    "schema": {
                                        "type": "array",
                                        "items": {
                                            "type": "object",
                                            "properties": {
                                                "claimId": {
                                                    "type": "string",
                                                    "description": "Unique ID of the claim."
                                                },
                                                "policyHolderId": {
                                                    "type": "string",
                                                    "description": "Unique ID of the policy holder who has filed the claim."
                                                },
                                                "claimStatus": {
                                                    "type": "string",
                                                    "description": "The status of the claim. Claim can be in Open or Closed state"
                                                }
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            },
            "/claims/{claimId}/identify-missing-documents": {
                "get": {
                    "summary": "Identify missing documents for a specific claim",
                    "description": "Get the list of pending documents that need to be uploaded by policy holder before the claim can be processed. The API takes in only one claim id and returns the list of documents that are pending to be uploaded by policy holder for that claim. This API should be called for each claim id",
                    "operationId": "identifyMissingDocuments",
                    "parameters": [{
                        "name": "claimId",
                        "in": "path",
                        "description": "Unique ID of the open insurance claim",
                        "required": true,
                        "schema": {
                            "type": "string"
                        }
                    }],
                    "responses": {
                        "200": {
                            "description": "List of documents that are pending to be uploaded by policy holder for insurance claim",
                            "content": {
                                "application/json": {
                                    "schema": {
                                        "type": "object",
                                        "properties": {
                                            "pendingDocuments": {
                                                "type": "string",
                                                "description": "The list of pending documents for the claim."
                                            }
                                        }
                                    }
                                }
                            }
    
                        }
                    }
                }
            },
            "/send-reminders": {
                "post": {
                    "summary": "API to send reminder to the customer about pending documents for open claim",
                    "description": "Send reminder to the customer about pending documents for open claim. The API takes in only one claim id and its pending documents at a time, sends the reminder and returns the tracking details for the reminder. This API should be called for each claim id you want to send reminders for.",
                    "operationId": "sendReminders",
                    "requestBody": {
                        "required": true,
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "properties": {
                                        "claimId": {
                                            "type": "string",
                                            "description": "Unique ID of open claims to send reminders for."
                                        },
                                        "pendingDocuments": {
                                            "type": "string",
                                            "description": "The list of pending documents for the claim."
                                        }
                                    },
                                    "required": [
                                        "claimId",
                                        "pendingDocuments"
                                    ]
                                }
                            }
                        }
                    },
                    "responses": {
                        "200": {
                            "description": "Reminders sent successfully",
                            "content": {
                                "application/json": {
                                    "schema": {
                                        "type": "object",
                                        "properties": {
                                            "sendReminderTrackingId": {
                                                "type": "string",
                                                "description": "Unique Id to track the status of the send reminder Call"
                                            },
                                            "sendReminderStatus": {
                                                "type": "string",
                                                "description": "Status of send reminder notifications"
                                            }
                                        }
                                    }
                                }
                            }
                        },
                        "400": {
                            "description": "Bad request. One or more required fields are missing or invalid."
                        }
                    }
                }
            }
        }
    }

    When a user asks your agent to complete a task, Bedrock will use the FM you configured for the agent to identify the sequence of actions and invoke the corresponding Lambda functions in the right order to solve the user-requested task.

  4. In the final step, review your agent configuration and choose Create Agent.
     Agents for Amazon Bedrock
  5. Congratulations, you’ve just created your first agent in Amazon Bedrock!
     Agents for Amazon Bedrock

Deploy an Agent for Amazon Bedrock
To deploy an agent in your application, you must create an alias. Bedrock then automatically creates a version for that alias.

  1. In the Bedrock console, select your agent, then select Deploy, and choose Create to create an alias.
     Agents for Amazon Bedrock
  2. Provide an alias name and description and choose whether to create a new version or use an existing version of your agent to associate with this alias.
    Agents for Amazon Bedrock
  3. This saves a snapshot of the agent code and configuration and associates an alias with this snapshot or version. You can use the alias to integrate the agent into your applications.
    Agents for Amazon Bedrock
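The same alias creation can also be done from application code. Here’s a hypothetical sketch using the AWS SDK for Python (boto3); the client name, parameters, and response shape below are assumptions based on the console workflow, so check the SDK reference for the exact API surface:

```python
# Hypothetical sketch: creating an agent alias with boto3's "bedrock-agent"
# client. Field names are assumptions, not confirmed by this post.

def create_alias(client, agent_id, alias_name, description=""):
    """Create an alias for an agent; Bedrock snapshots a version behind it."""
    response = client.create_agent_alias(
        agentId=agent_id,
        agentAliasName=alias_name,
        description=description,
    )
    # The response carries the alias metadata, including its generated ID,
    # which you then reference from your application.
    return response["agentAlias"]["agentAliasId"]
```

Passing the client in as a parameter keeps the helper easy to test and swap against different Regions or credentials.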

Now, let’s test the insurance agent! You can do this right in the Bedrock console.

Let’s ask the agent to “Send reminder to all policy holders with open claims and pending paperwork.” You can see how the FM-powered agent is able to understand the user request, break down the task into steps (collect the open insurance claims, look up the claim IDs, send reminders), and perform the corresponding actions.

Agents for Amazon Bedrock
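Beyond the console, your application would send user requests to the deployed alias and read back the agent’s answer. The following is a hypothetical sketch using boto3’s “bedrock-agent-runtime” client; the parameter names and streamed response shape are assumptions, so consult the SDK documentation for the specifics:

```python
# Hypothetical sketch of invoking a deployed agent from application code.
# The completion is assumed to arrive as a stream of chunk events.

def ask_agent(runtime_client, agent_id, alias_id, session_id, text):
    """Send a user request to the agent and collect the streamed completion."""
    response = runtime_client.invoke_agent(
        agentId=agent_id,
        agentAliasId=alias_id,
        sessionId=session_id,   # reuse the same ID to continue a conversation
        inputText=text,
    )
    # Concatenate the text carried by each chunk in the stream.
    parts = []
    for event in response["completion"]:
        if "chunk" in event:
            parts.append(event["chunk"]["bytes"].decode("utf-8"))
    return "".join(parts)
```

Reusing the same session ID across calls is what lets the agent hold a multi-turn conversation to gather missing information.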

Agents for Amazon Bedrock can help you increase productivity, improve your customer service experience, or automate DevOps tasks. I’m excited to see what use cases you will implement!

Learn the Fundamentals of Generative AI
If you’re interested in the fundamentals of generative AI and how to work with FMs, including advanced prompting techniques and agents, check out this new hands-on course that I developed with AWS colleagues and industry experts in collaboration with DeepLearning.AI:

Generative AI with large language models (LLMs) is an on-demand, three-week course for data scientists and engineers who want to learn how to build generative AI applications with LLMs. It’s the perfect foundation to start building with Amazon Bedrock. Enroll for generative AI with LLMs today.

Sign up to Learn More about Amazon Bedrock (Preview)
Amazon Bedrock is currently available in preview. Reach out to us if you’d like access to agents for Amazon Bedrock as part of the preview. We’re regularly providing access to new customers. Visit the Amazon Bedrock Features page and sign up to learn more about Amazon Bedrock.

— Antje


P.S. We’re focused on improving our content to provide a better customer experience, and we need your feedback to do so. Please take this quick survey to share insights on your experience with the AWS Blog. Note that this survey is hosted by an external company, so the link does not lead to our website. AWS handles your information as described in the AWS Privacy Notice.



from AWS News Blog https://ift.tt/ErtSaRN
via IFTTT

Tuesday, July 25, 2023

Top Announcements of the AWS Summit in New York, 2023

It’ll be a full house as the AWS Summit gets underway in New York City on Wednesday, July 26, 2023. The cloud event has something for everyone including a keynote, breakout sessions, opportunities to network, and of course, to learn about the latest exciting AWS product announcements.

Today, we’re sharing a selection of announcements to get the fun started. We’ll also share major updates from Wednesday’s keynote, so check back for more exciting news to come soon.

If you want to attend the event virtually, you can still register for the keynote livestream.

(This post was last updated: 5:35 p.m. PST, July 25, 2023.)

AWS product announcements from July 25, 2023

Announcing the general availability of AWS HealthImaging
This new HIPAA-eligible service empowers healthcare providers and their software partners to store, analyze, and share medical images at petabyte scale.

Amazon Redshift now supports querying Apache Iceberg tables (preview)
Apache Iceberg, one of the most recent open table formats, has been used by many customers to simplify data processing on rapidly expanding and evolving tables stored in data lakes.

AWS Glue Studio now supports Amazon Redshift Serverless
Before this launch, developers using Glue Studio only had access to Redshift tables in Redshift clusters. Now, those same developers can connect to Redshift Serverless tables directly without manual configuration.

Snowflake connectivity for AWS Glue for Apache Spark is now generally available
AWS Glue for Apache Spark now supports native connectivity to Snowflake, which enables users to read and write data without the need to install or manage Snowflake connector libraries.

AWS Glue jobs can now include AWS Glue DataBrew Recipes
The new integration makes it simpler to deploy and scale DataBrew jobs and gives DataBrew users access to AWS Glue features not available in DataBrew.




Monday, July 24, 2023

AWS Week in Review – Redshift+Forecast, CodeCatalyst+GitHub, Lex Analytics, Llama 2, and Much More – July 24, 2023

Summer is in full swing here in Seattle and we are spending more time outside and less at the keyboard. Nevertheless, the launch machine is running at full speed and I have plenty to share with you today. Let’s dive in and take a look!

Last Week’s Launches
Here are some launches that caught my eye:

Amazon Redshift – Amazon Redshift ML can now make use of an integrated connection to Amazon Forecast. You can now use SQL statements of the form CREATE MODEL to create and train forecasting models from your time series data stored in Redshift, and then use these models to make forecasts for revenue, inventory, demand, and so forth. You can also define probability metrics and use them to generate forecasts. To learn more, read the What’s New and the Developer’s Guide.
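A statement of that form might look like the following hedged sketch. The table, column, role, and bucket names are hypothetical, and the exact SETTINGS clauses supported may differ, so refer to the Developer’s Guide:

```sql
-- Hypothetical sketch: training a forecasting model from time series
-- data stored in Redshift via the new Amazon Forecast integration.
CREATE MODEL demand_forecast
FROM (SELECT item_id, order_date, demand FROM daily_demand)
TARGET demand
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftMLRole'
AUTO ON MODEL_TYPE FORECAST
SETTINGS (
    S3_BUCKET 'my-redshift-ml-bucket',
    HORIZON 30,       -- forecast 30 time steps ahead
    FREQUENCY 'D'     -- daily time series
);
```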

Amazon CodeCatalyst – You can now trigger Amazon CodeCatalyst workflows from pull request events in linked GitHub repositories. The workflows can perform build, test, and deployment operations, and can be triggered when the pull requests in the linked repositories are opened, revised, or closed. To learn more, read Using GitHub Repositories with CodeCatalyst.

Amazon Lex – You can now use the Analytics on Amazon Lex dashboard to review data-driven insights that will help you to improve the performance of your Lex bots. You get a snapshot of your key metrics, and the ability to drill down for more. You can use conversational flow visualizations to see how users navigate across intents, and you can review individual conversations to make qualitative assessments. To learn more, read the What’s New and the Analytics Overview.

Llama 2 Foundation Models – The brand-new Llama 2 foundation models from Meta are now available in Amazon SageMaker JumpStart. The Llama 2 model is available in three parameter sizes (7B, 13B, and 70B) with pretrained and fine-tuned variations. You can deploy and use the models with a few clicks in Amazon SageMaker Studio, and you can also use the SageMaker Python SDK (code and docs) to access them programmatically. To learn more, read Llama 2 Foundation Models from Meta are Now Available in Amazon SageMaker JumpStart and the What’s New.

X in Y – We launched some existing services and instance types in additional AWS Regions:

For a full list of AWS announcements, be sure to keep an eye on the What's New at AWS page.

Other AWS News
Here are some additional blog posts and news items that you might find interesting:

AWS Open Source News and Updates – My colleague Ricardo has published issue 166 of his legendary and highly informative AWS Open Source Newsletter!

CodeWhisperer in Action – My colleague Danilo wrote an interesting post to show you how to Reimagine Software Development With CodeWhisperer as Your AI Coding Companion.

News Blog Survey – If you have read this far, please consider taking the AWS Blog Customer Survey. Your responses will help us to gauge your satisfaction with this blog, and will help us to do a better job in the future. This survey is hosted by an external company, so the link does not lead to our web site. AWS handles your information as described in the AWS Privacy Notice.

CDK Integration Tests – The AWS Application Management Blog wrote a post to show you How to Write and Execute Integration Tests for AWS CDK Applications.

Event-Driven Architectures – The AWS Architecture Blog shared some Best Practices for Implementing Event-Driven Architectures in Your Organization.

Amazon Connect – The AWS Contact Center Blog explained how to Manage Prompts Programmatically with Amazon Connect.

Rodents – The AWS Machine Learning Blog showed you how to Analyze Rodent Infestation Using Amazon SageMaker Geospatial Capabilities.

Secrets Migration – The AWS Security Blog published a two-part series that discusses migrating your secrets to AWS Secrets Manager (Part 1: Discovery and Design, Part 2: Implementation).

Upcoming AWS Events
Check your calendar and sign up for these AWS events:

AWS Storage Day – Join us virtually on August 9th to learn about how to prepare for AI/ML, deliver holistic data protection, and optimize storage costs for your on-premises and cloud data. Register now.

AWS Global Summits – Attend the upcoming AWS Summits in New York (July 26), Taiwan (August 2 & 3), São Paulo (August 3), and Mexico City (August 30).

AWS Community Days – Attend upcoming AWS Community Days in The Philippines (July 29-30), Colombia (August 12), and West Africa (August 19).

re:Invent – Register now for re:Invent 2023 in Las Vegas (November 27 to December 1).

That’s a Wrap
And that’s about it for this week. I’ll be sharing additional news this coming Friday on AWS on Air – tune in and say hello!

Jeff;




Thursday, July 20, 2023

Amazon Route 53 Resolver Now Available on AWS Outposts Rack

Starting today, Amazon Route 53 Resolver is available on AWS Outposts rack, providing your on-premises services and applications with local DNS resolution directly from Outposts. Local Route 53 Resolver endpoints also enable DNS resolution between Outposts and your on-premises DNS server. Route 53 Resolver on Outposts helps to improve your on-premises applications’ availability and performance.

AWS Outposts provides a hybrid cloud solution that allows you to extend your AWS infrastructure and services to your on-premises data centers. This enables you to build and operate hybrid applications that seamlessly integrate with your existing on-premises infrastructure. Your applications deployed on Outposts benefit from low-latency access to on-premises systems. You also get a consistent management experience across AWS Regions and your on-premises environments. This includes access to the same AWS management tools, APIs, and services that you use when managing AWS services in a Region. Outposts uses the same security controls and policies as AWS in the cloud, providing you with a consistent security posture across your hybrid cloud environment. This includes data encryption, identity and access management, and network security.

One of the typical use cases for Outposts is to deploy applications that require low-latency access to on-premises systems, such as factory equipment, high-frequency trading applications, or medical diagnosis systems.

DNS stands for Domain Name System, which is the system that translates human-readable domain names like “example.com” into IP addresses like “93.184.216.34” that computers use to communicate with each other on the internet. A Route 53 Resolver is a component that is responsible for resolving domain names to IP addresses.

Until today, applications and services running on an Outpost forwarded their DNS queries to the parent AWS Region the Outpost is connected to. But remember, as Amazon CTO Dr Werner Vogels says: everything fails all the time. There can be temporary site disconnections—think about fiber cuts or weather events. When the on-premises facility becomes temporarily disconnected from the internet, local DNS resolution fails, making it difficult for applications and services to discover other services, even when they are running on the same Outposts rack. For example, applications running locally on the Outpost won’t be able to discover the IP address of a local database running on the same Outpost, or a microservice won’t be able to locate other microservices running locally.

Starting today, when you opt in for local Route 53 Resolvers on Outposts, applications and services will continue to benefit from local DNS resolution to discover other services—even in a parent AWS Region connectivity loss event. Local Resolvers also help to reduce latency for DNS resolutions as query results are cached and served locally from the Outposts, eliminating unnecessary round-trips to the parent AWS Region. All the DNS resolutions for applications in Outposts VPCs using private DNS are served locally.
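To illustrate why a local cache eliminates those round-trips, here is a minimal caching-resolver sketch. The upstream lookup and TTL handling are simplified stand-ins for illustration, not how Route 53 Resolver is actually implemented:

```python
# Minimal sketch of a caching DNS resolver: answers from a local cache
# when a fresh entry exists, otherwise forwards to the parent resolver.
import time

class CachingResolver:
    def __init__(self, upstream, ttl_seconds=300):
        self.upstream = upstream          # callable: name -> list of IPs
        self.ttl = ttl_seconds
        self.cache = {}                   # name -> (expiry_time, addresses)

    def resolve(self, name):
        entry = self.cache.get(name)
        if entry and entry[0] > time.monotonic():
            return entry[1]               # served locally, no round-trip
        addresses = self.upstream(name)   # forward to the parent resolver
        self.cache[name] = (time.monotonic() + self.ttl, addresses)
        return addresses
```

The first lookup pays the round-trip to the upstream resolver; repeat lookups within the TTL are answered from the cache, which mirrors the dig timings shown later in this post.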

In addition to local Resolvers, this launch also enables local Resolver endpoints. Route 53 Resolver endpoints are not new; creating inbound or outbound Resolver endpoints in a VPC has been available since November 2018. Today, you can also create endpoints inside the VPC on Outposts. Route 53 Resolver outbound endpoints enable Route 53 Resolvers to forward DNS queries to DNS resolvers that you manage, for example, on your on-premises network. In contrast, Route 53 Resolver inbound endpoints forward the DNS queries they receive from outside the VPC to the Resolver running on Outposts. It allows sending DNS queries for services deployed on a private Outposts VPC from outside of that VPC.

Let’s See It in Action
To create and test a local Resolver on Outposts, I first connect to the Outpost section of the AWS Management Console. I navigate to the Route 53 Outposts section and select Create Resolver.

Create local resolver on outpost

I select the Outpost on which I want to create the Resolver and enter a Resolver name. Then, I select the size of the instances to deploy the Resolver and the number of instances. The selection of instance size impacts the performance of the Resolver (the number of resolutions it can process per second). The default is an m5.large instance able to handle up to 7,000 queries per second. The number of instances impacts the availability of the Resolver, the default is four instances. I select Create Resolver to create the Resolver instances.

Create local resolver - choose instance type and number

After a few minutes, I should see the Resolver status becoming ✅ Operational.

Local resolver is operational

The next step is to create the Resolver endpoint. Inbound endpoints let you forward external DNS queries to the local Resolver on the Outpost. Outbound endpoints let you forward locally initiated DNS queries to external DNS resolvers that you manage. For this demo, I choose to create an inbound endpoint.

Under the Inbound endpoints section, I select Create inbound endpoint.

Local resolver - create inbound endpoint

I enter an Endpoint name, choose the VPC in the Region to attach this endpoint to, and select the previously created Security group for this endpoint.

Create inbound endpoint details

I select the IP address the endpoint will consume in each subnet. I can choose Use an IP address that is selected automatically or Use an IP address that I specify.

Create inbound endpoint - select an IP address

Finally, I select the instance type to bind to the inbound endpoint. The larger the instance, the more queries per second it will handle. The service creates two endpoint instances for high availability.

When I am ready, I select Create inbound endpoint to start the creation process.

Create inbound endpoint - select the instance type

After a few minutes, the endpoint Status becomes ✅ Operational.

Create inbound endpoint status operational

The setup is now ready to test. I SSH-connect to an EC2 instance running on the Outpost and test the time it takes to resolve an external DNS name. Local Resolvers cache queries on the Outpost itself, so I expect my first query to take a few milliseconds and the second one to be served immediately from the cache.

Indeed, the first query resolves in 13 ms (see the line ;; Query time: 13 msec).

➜  ~ dig amazon.com

; <<>> DiG 9.16.38-RH <<>> amazon.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35859
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;amazon.com.                    IN      A

;; ANSWER SECTION:
amazon.com.             797     IN      A       52.94.236.248
amazon.com.             797     IN      A       205.251.242.103
amazon.com.             797     IN      A       54.239.28.85

;; Query time: 13 msec
;; SERVER: 10.0.0.2#53(10.0.0.2)
;; WHEN: Sun May 28 09:47:27 CEST 2023
;; MSG SIZE  rcvd: 87

And when I repeat the same query, it resolves in zero milliseconds, showing it is now served from a local cache.

➜  ~ dig amazon.com

; <<>> DiG 9.16.38-RH <<>> amazon.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63500
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;amazon.com.                    IN      A

;; ANSWER SECTION:
amazon.com.             586     IN      A       54.239.28.85
amazon.com.             586     IN      A       205.251.242.103
amazon.com.             586     IN      A       52.94.236.248

;; Query time: 0 msec
;; SERVER: 10.0.0.2#53(10.0.0.2)
;; WHEN: Sun May 28 09:50:58 CEST 2023
;; MSG SIZE  rcvd: 87

Pricing and Availability
Remember that only the Resolver and the VPC endpoints are deployed on your Outposts. You continue to manage your Route 53 zones and records from the AWS Regions. The local Resolver and its endpoints will consume some capacity on the Outposts. You will need to provide four EC2 instances from your Outposts for the Route 53 Resolver and two other instances for each Resolver endpoint.

Your existing Outposts racks must have the latest Outposts software for you to use the local Route 53 Resolver and the Resolver endpoints. You can raise a ticket with us to have your Outpost updated (the console will also remind you to do so when needed).

The local Resolvers are provided without additional cost. The endpoints are charged per elastic network interface (ENI) per hour, as is already the case today.

You can configure local Resolvers and local endpoints in all AWS Regions where Outposts racks are available, except in AWS GovCloud (US) Regions. That’s a list of 22 AWS Regions as of today.

Go and configure local Route 53 Resolvers on Outposts now!

-- seb

 



