Monday, March 9, 2026

AWS Weekly Roundup: Amazon Connect Health, Bedrock AgentCore Policy, GameDay Europe, and more (March 9, 2026)

Fiti, AWS Student Community Kenya!

Last week was an incredible whirlwind: a round of meetups, hands-on workshops, and career discussions across Kenya. It culminated with the AWS Student Community Day at Meru University of Science and Technology, featuring keynotes from my colleagues Veliswa and Tiffany, sessions on everything from GitOps to cloud-native engineering, and a whole lot of AI agent building.

JAWS Days 2026, held on March 7, is the largest AWS Community Day in the world, with over 1,500 attendees. The event started with a keynote speech on building an AI-driven development team by Jeff Barr and included over 100 technical and community experience sessions, lightning talks, and workshops, as well as Game Days, Builders Card Challenges, and networking parties.

Now, let’s get into this week’s AWS news…

Last week’s launches
Here are some launches and updates from this past week that caught my attention:

  • Introducing Amazon Connect Health, Agentic AI Built for Healthcare — Amazon Connect Health is now generally available with five purpose-built AI agents for healthcare: patient verification, appointment management, patient insights, ambient documentation, and medical coding. All features are HIPAA-eligible and deployable within existing clinical workflows in days.
  • Policy in Amazon Bedrock AgentCore is now generally available — You can now use centralized, fine-grained controls for agent-tool interactions that operate outside your agent code. Security and compliance teams can define tool access and input validation rules using natural language that automatically converts to Cedar, the AWS open-source policy language.
  • Introducing OpenClaw on Amazon Lightsail to run your autonomous private AI agents — You can deploy a private AI assistant on your own cloud infrastructure with built-in security controls, sandboxed agent sessions, one-click HTTPS, and device pairing authentication. Amazon Bedrock serves as the default model provider, and you can connect to Slack, Telegram, WhatsApp, and Discord.
  • AWS announces pricing for VPC Encryption Controls — Starting March 1, 2026, VPC Encryption Controls transitions from free preview to a paid feature. You can audit and enforce encryption-in-transit of all traffic flows within and across VPCs in a region, with monitor mode to detect unencrypted traffic and enforce mode to prevent it.
  • Database Savings Plans now supports Amazon OpenSearch Service and Amazon Neptune Analytics — You can save up to 35% on eligible serverless and provisioned instance usage with a one-year commitment. Savings Plans automatically apply regardless of engine, instance family, size, or AWS Region.
  • AWS Elastic Beanstalk now offers AI-powered environment analysis — When your environment health is degraded, Elastic Beanstalk can now collect recent events, instance health, and logs and send them to Amazon Bedrock for analysis, providing step-by-step troubleshooting recommendations tailored to your environment’s current state.
  • AWS simplifies IAM role creation and setup in service workflows — You can now create and configure IAM roles directly within service workflows through a new in-console panel, without switching to the IAM console. The feature supports Amazon EC2, Lambda, EKS, ECS, Glue, CloudFormation, and more.
  • Accelerate Lambda durable functions development with new Kiro power — You can now build resilient, long-running multi-step applications and AI workflows faster with AI agent-assisted development in Kiro. The power dynamically loads guidance on replay models, step and wait operations, concurrent execution patterns, error handling, and deployment best practices.
  • Amazon GameLift Servers launches DDoS Protection — You can now protect session-based multiplayer games against DDoS attacks with a co-located relay network that authenticates client traffic using access tokens and enforces per-player traffic limits, at no additional cost to GameLift Servers customers.

For a full list of AWS announcements, be sure to keep an eye on the What’s New with AWS page.

From AWS community
Here are my personal favorite posts from AWS community and my colleagues:

  • I Built a Portable AI Memory Layer with MCP, AWS Bedrock, and a Chrome Extension — Learn how to build a persistent memory layer for AI agents using MCP and Amazon Bedrock, packaged as a Chrome extension that carries context across sessions and applications.
  • When the Model Is the Machine — Mike Chambers built an experimental app where an AI agent generates a complete, interactive web application at runtime from a single prompt — no codebase, no framework, no persistent state. A thought-provoking exploration of what happens when the model becomes the runtime.

Upcoming AWS events
Check your calendar and sign up for upcoming AWS events:

  • AWS Community GameDay Europe — Think you know AWS? Prove it at the AWS Community GameDay Europe on March 17, a gamified learning event where teams compete to solve real-world technical challenges using AWS services.
  • AWS at NVIDIA GTC 2026 — Join us at our AWS sessions, booths, demos, and ancillary events at NVIDIA GTC 2026 on March 16–19, 2026, in San Jose. You can receive 20% off event passes through AWS and request a 1:1 meeting at GTC.
  • AWS Summits — Join AWS Summits in 2026: free in-person events where you can explore emerging cloud and AI technologies, learn best practices, and network with industry peers and experts. Upcoming Summits include Paris (April 1), London (April 22), and Bengaluru (April 23–24).
  • AWS Community Days — Community-led conferences where content is planned, sourced, and delivered by community leaders. Upcoming events include Slovakia (March 11), Pune (March 21), and the AWSome Women Summit LATAM in Mexico City (March 28).

Browse here for upcoming AWS-led in-person and virtual events, startup events, and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— seb

from AWS News Blog https://ift.tt/HY6MKFr
via IFTTT

Wednesday, March 4, 2026

Introducing OpenClaw on Amazon Lightsail to run your autonomous private AI agents

Today, we’re announcing the general availability of OpenClaw on Amazon Lightsail. You can launch an OpenClaw instance, pair your browser, enable AI capabilities, and optionally connect messaging channels. Your Lightsail OpenClaw instance is pre-configured with Amazon Bedrock as the default AI model provider. Once you complete setup, you can start chatting with your AI assistant immediately, with no additional configuration required.

OpenClaw is an open-source, self-hosted autonomous private AI agent that acts as a personal digital assistant by running directly on your computer. You can run AI agents on OpenClaw through your browser and connect them to messaging apps like WhatsApp, Discord, or Telegram to perform tasks such as managing emails, browsing the web, and organizing files, rather than just answering questions.

AWS customers have asked if they can run OpenClaw on AWS, and some of them have blogged about running OpenClaw on Amazon EC2 instances. Having installed OpenClaw directly on my home device myself, I learned that this is not easy and that there are many security considerations.

So, let me show you how to launch a pre-configured OpenClaw instance on Amazon Lightsail and run it securely.

OpenClaw on Amazon Lightsail in action
To get started, go to the Amazon Lightsail console and choose Create instance in the Instances section. After choosing your preferred AWS Region and Availability Zone and the Linux/Unix platform for your instance, choose OpenClaw under Select a blueprint.

Choose your instance plan (the 4 GB memory plan is recommended for optimal performance) and enter a name for your instance. Finally, choose Create instance. Your instance will be in a Running state in a few minutes.
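If you prefer to script the launch, the same request can be sketched with the AWS SDK for Python (boto3). The blueprint and bundle IDs below are my assumptions for illustration, not confirmed values; look up the real ones with get_blueprints() and get_bundles() before using this.

```python
# Build the parameters for boto3's lightsail.create_instances() call.
# NOTE: "openclaw" and "medium_3_0" are hypothetical IDs used for
# illustration only; fetch the actual blueprint/bundle IDs from the API.

def build_openclaw_request(name: str, az: str = "us-east-1a") -> dict:
    """Return create_instances keyword arguments for an OpenClaw instance."""
    return {
        "instanceNames": [name],
        "availabilityZone": az,
        "blueprintId": "openclaw",   # hypothetical blueprint ID
        "bundleId": "medium_3_0",    # hypothetical 4 GB bundle ID
    }

params = build_openclaw_request("my-openclaw")
print(params)

# To actually launch (requires AWS credentials):
# import boto3
# boto3.client("lightsail", region_name="us-east-1").create_instances(**params)
```

Separating parameter construction from the API call also makes the request easy to inspect or unit-test before anything is provisioned.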

Before you can use the OpenClaw dashboard, you must pair your browser with OpenClaw, which creates a secure connection between your browser session and the instance. To do this, choose Connect using SSH in the Getting started tab.

When a browser-based SSH terminal opens, the welcome message displays the dashboard URL and security credentials. Copy them, open the dashboard in a new browser tab, and paste the copied access token into the Gateway Token field in the OpenClaw dashboard.

When prompted, press y to continue and a to approve device pairing in the SSH terminal. When pairing is complete, the OpenClaw dashboard shows an OK status and your browser is connected to your OpenClaw instance.

Your OpenClaw instance on Lightsail is configured to use Amazon Bedrock to power its AI assistant. To enable Bedrock API access, copy the script in the Getting started tab and run it in an AWS CloudShell terminal.

Once the script is complete, go to Chat in the OpenClaw dashboard to start using your AI assistant!

You can set up OpenClaw to work with messaging apps like Telegram and WhatsApp for interacting with your AI assistant directly from your phone or messaging client. To learn more, visit Get started with OpenClaw on Lightsail in the Amazon Lightsail User Guide.

Things to know
Here are key considerations to know about this feature:

  • Permission — You can customize the AWS IAM permissions granted to your OpenClaw instance. The setup script creates an IAM role with a policy that grants access to Amazon Bedrock, and you can customize this policy at any time. However, be careful when modifying permissions, because overly restrictive policies may prevent OpenClaw from generating AI responses. To learn more, visit AWS IAM policies in the AWS documentation.
  • Cost — You pay an on-demand hourly rate for the instance plan you selected, only for what you use. Every message sent to and received from the OpenClaw assistant is processed through Amazon Bedrock using a token-based pricing model. If you select a third-party model distributed through AWS Marketplace, such as Anthropic Claude or Cohere, there may be additional software fees on top of the per-token cost.
  • Security — Running a personal AI agent on OpenClaw is powerful, but it can create security risks if you are careless. I recommend never exposing your OpenClaw gateway to the open internet. The gateway auth token is effectively your password, so rotate it often and store it in an environment file rather than hardcoding it in a config file. To learn more about security tips, visit Security on OpenClaw gateway.
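The Permission item above mentions an IAM role whose policy grants Bedrock access. As a hedged sketch of what such a policy might look like (the policy the setup script actually creates may be broader or scoped differently), here it is built as a Python dict:

```python
import json

# A minimal IAM policy granting Bedrock model invocation.
# Illustrative sketch only; the policy the Lightsail setup script
# actually creates may differ in actions and resource scoping.
bedrock_invoke_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream",
            ],
            # Scope to specific model ARNs in production instead of "*".
            "Resource": "*",
        }
    ],
}

print(json.dumps(bedrock_invoke_policy, indent=2))

# To attach it to the instance role (requires credentials; the role and
# policy names here are hypothetical):
# import boto3
# boto3.client("iam").put_role_policy(
#     RoleName="OpenClawInstanceRole",
#     PolicyName="BedrockInvoke",
#     PolicyDocument=json.dumps(bedrock_invoke_policy),
# )
```

Narrowing the Resource element is the main lever for tightening this policy without breaking OpenClaw's ability to generate responses.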

Now available
OpenClaw on Amazon Lightsail is now available in all AWS commercial Regions where Amazon Lightsail is available. For Regional availability and the future roadmap, visit AWS Capabilities by Region.

Give it a try in the Lightsail console and send feedback to AWS re:Post for Amazon Lightsail or through your usual AWS Support contacts.

Channy



from AWS News Blog https://ift.tt/NR12Vkm
via IFTTT

Monday, March 2, 2026

AWS Weekly Roundup: OpenAI partnership, AWS Elemental Inference, Strands Labs, and more (March 2, 2026)

This past week, I’ve been deep in the trenches helping customers transform their businesses through AI-DLC (AI-Driven Lifecycle) workshops. Throughout 2026, I’ve had the privilege of facilitating these sessions for numerous customers, guiding them through a structured framework that helps organizations identify, prioritize, and implement AI use cases that deliver measurable business value.

Screenshot of GenAI Developer Hour

AI-DLC is a methodology that takes companies from AI experimentation to production-ready solutions by aligning technical capabilities with business outcomes. If you’re interested in learning more, check out this blog post that dives deeper into the framework, or watch as Riya Dani teaches me all about AI-DLC on our recent GenAI Developer Hour livestream!

Now, let’s get into this week’s AWS news…

OpenAI and Amazon announced a multi-year strategic partnership to accelerate AI innovation for enterprises, startups, and end consumers around the world. Amazon will invest $50 billion in OpenAI, starting with an initial $15 billion investment and followed by another $35 billion in the coming months when certain conditions are met. AWS and OpenAI are co-creating a Stateful Runtime Environment powered by OpenAI models, available through Amazon Bedrock, which allows developers to keep context, remember prior work, work across software tools and data sources, and access compute.

AWS will serve as the exclusive third-party cloud distribution provider for OpenAI Frontier, enabling organizations to build, deploy, and manage teams of AI agents. OpenAI and AWS are expanding their existing $38 billion multi-year agreement by $100 billion over 8 years, with OpenAI committing to consume approximately 2 gigawatts of Trainium capacity, spanning both Trainium3 and next-generation Trainium4 chips.

Last week’s launches
Here are some launches and updates from this past week that caught my attention:

  • AWS Security Hub Extended offers full-stack enterprise security with curated partner solutions — AWS launched Security Hub Extended, a plan that simplifies procurement, deployment, and integration of full-stack enterprise security solutions including 7AI, Britive, CrowdStrike, Cyera, Island, Noma, Okta, Oligo, Opti, Proofpoint, SailPoint, Splunk, Upwind, and Zscaler. With AWS as the seller of record, customers benefit from pre-negotiated pay-as-you-go pricing, a single bill, no long-term commitments, unified security operations within Security Hub, and unified Level 1 support for AWS Enterprise Support customers.
  • Transform live video for mobile audiences with AWS Elemental Inference — AWS launched Elemental Inference, a fully managed AI service that automatically transforms live and on-demand video for mobile and social platforms in real time. The service uses AI-powered cropping to create vertical formats optimized for TikTok, Instagram Reels, and YouTube Shorts, and automatically extracts highlight clips with 6-10 second latency. Beta testing showed large media companies achieved 34% or more savings on AI-powered live video workflows. Deep dive into the Fox Sports implementation.
  • MediaConvert introduces new video probe API — AWS Elemental MediaConvert introduced a free Probe API for quick metadata analysis of media files, reading header metadata to return codec specifications, pixel formats, and color space details without processing video content.
  • OpenAI-compatible Projects API in Amazon Bedrock — Projects API provides application-level isolation for your generative AI workloads using OpenAI-compatible APIs in the Mantle inference engine in Amazon Bedrock. You can organize and manage your AI applications with improved access control, cost tracking, and observability across your organization.
  • Amazon Location Service introduces LLM Context — Amazon Location launched curated AI Agent context as a Kiro power, Claude Code plugin, and agent skill in the open Agent Skills format, improving code accuracy and accelerating feature implementation for location-based capabilities.
  • Amazon EKS Node Monitoring Agent is now open source — The Amazon EKS Node Monitoring Agent is now open source on GitHub, allowing visibility into implementation, customization, and community contributions.
  • AWS AppConfig integrates with New Relic — AWS AppConfig launched integration with New Relic Workflow Automation for automated, intelligent rollbacks during feature flag deployments, reducing detection-to-remediation time from minutes to seconds.

For a full list of AWS announcements, be sure to keep an eye on the What’s New with AWS page.


Upcoming AWS events
Check your calendar and sign up for upcoming AWS events:

  • AWS at NVIDIA GTC 2026 — Join us at our AWS sessions, booths, demos, and ancillary events at NVIDIA GTC 2026 on March 16–19, 2026, in San Jose. You can receive 20% off event passes through AWS and request a 1:1 meeting at GTC.
  • AWS Summits — Join AWS Summits in 2026, free in-person events where you can explore emerging cloud and AI technologies, learn best practices, and network with industry peers and experts. Upcoming Summits include Paris (April 1), London (April 22), and Bengaluru (April 23–24).
  • AWS Community Days — Community-led conferences where content is planned, sourced, and delivered by community leaders. Upcoming events include JAWS Days in Tokyo (March 7), Chennai (March 7), Slovakia (March 11), and Pune (March 21).

Browse here for upcoming AWS-led in-person and virtual events, startup events, and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

 



from AWS News Blog https://ift.tt/WX2o1Fa
via IFTTT

Thursday, February 26, 2026

AWS Security Hub Extended offers full-stack enterprise security with curated partner solutions

At re:Invent 2025, we introduced a completely re-imagined AWS Security Hub that unifies AWS security services, including Amazon GuardDuty and Amazon Inspector, into a single experience. This unified experience automatically and continuously analyzes security findings in combination to help you prioritize and respond to your critical security risks.

Today, we’re announcing AWS Security Hub Extended, a plan of Security Hub that simplifies how you procure, deploy, and integrate a full-stack enterprise security solution across endpoint, identity, email, network, data, browser, cloud, AI, and security operations. With the Extended plan, you can expand your security portfolio beyond AWS to help protect your enterprise estate through a curated selection of AWS Partner solutions, including 7AI, Britive, CrowdStrike, Cyera, Island, Noma, Okta, Oligo, Opti, Proofpoint, SailPoint, Splunk, a Cisco company, Upwind, and Zscaler.

With AWS as the seller of record, you benefit from pre-negotiated pay-as-you-go pricing, a single bill, and no long-term commitments. You also get a unified security operations experience within Security Hub and unified Level 1 support for AWS Enterprise Support customers. You told us that managing multiple procurement cycles and vendor negotiations was creating unnecessary complexity, costing you time and resources. In response, we’ve curated these partner offerings so you can establish more comprehensive protection across your entire technology stack through a single, simplified experience.

Security findings from all participating solutions are emitted in the Open Cybersecurity Schema Framework (OCSF) schema and automatically aggregated in AWS Security Hub. With the Extended plan, you can combine AWS and partner security solutions to quickly identify and respond to risks that span boundaries.

The Security Hub Extended plan in action
You can access the partner solutions directly within the Security Hub console by selecting Extended plan under the Management menu. From there, you can review and deploy any combination of the curated partner offerings.

You can review details of each partner offering directly in the Security Hub console and subscribe. When you subscribe, you’ll be directed to an automated onboarding experience from each partner. Once onboarded, consumption-based metering is automatic, and you are billed monthly as part of your Security Hub bill.

Security findings from all solutions are automatically consolidated in AWS Security Hub, giving you immediate, direct access to all security findings in the normalized OCSF schema.
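To make the normalization point concrete, here is a heavily simplified sketch of an OCSF-style finding. The field subset, version string, and product names are illustrative assumptions, not the full OCSF class definition; consult the OCSF schema for authoritative fields.

```python
# Illustrative sketch of a normalized finding, OCSF-style.
# Field subset and values are simplified assumptions, not the real schema.
finding = {
    "class_name": "Detection Finding",
    "severity": "High",
    "time": 1772064000000,               # epoch milliseconds (example value)
    "message": "Suspicious credential use detected",
    "metadata": {
        "version": "1.1.0",              # assumed OCSF schema version
        "product": {
            "vendor_name": "ExamplePartner",   # hypothetical partner solution
            "name": "ExampleEDR",
        },
    },
}

def vendor_of(f: dict) -> str:
    """Because every source emits the same shape, one accessor works for all."""
    return f["metadata"]["product"]["vendor_name"]

print(vendor_of(finding))
```

The practical payoff of a shared schema is exactly this: one query or accessor works across findings from every participating partner, rather than one parser per vendor format.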

To learn more about how to enhance your security posture with these integrations for AWS Security Hub, visit the AWS Security Hub User Guide.

Now available
The AWS Security Hub Extended plan is now generally available in all AWS commercial Regions where Security Hub is available. You can use flexible pay-as-you-go or flat-rate pricing—no upfront investments or long-term commitments required. For more information about pricing, visit the AWS Security Hub pricing page.

Give it a try today in the Security Hub console and send feedback to AWS re:Post for Security Hub or through your usual AWS Support contacts.

Channy



from AWS News Blog https://ift.tt/7EZbQng
via IFTTT

Wednesday, February 25, 2026

Tuesday, February 24, 2026

Transform live video for mobile audiences with AWS Elemental Inference

Today, we’re announcing AWS Elemental Inference, a fully managed AI service that automatically transforms and optimizes live and on-demand video broadcasts to engage audiences at scale. At launch, you’ll be able to use AWS Elemental Inference to adapt video content into vertical formats optimized for mobile and social platforms in real time.

With AWS Elemental Inference, broadcasters and streamers can reach audiences on social and mobile platforms such as TikTok, Instagram Reels, and YouTube Shorts without manual postproduction work or AI expertise.

Today’s viewers consume content differently than they did even a few years ago. However, most broadcasts are produced in landscape format for traditional viewing. Converting these broadcasts into vertical formats for mobile platforms typically requires time-consuming manual editing that causes broadcasters and streamers to miss viral moments and lose audiences to mobile-first destinations.

Let’s try it out
AWS Elemental Inference offers flexible deployment options to fit your existing workflow. You can choose to create a feed through the standalone console or configure AWS Elemental Inference through the AWS Elemental MediaLive console.

AWS Elemental Inference console

To get started with AWS Elemental Inference, navigate to the AWS Management Console and choose AWS Elemental Inference. From the dashboard, choose Create feed to establish your top-level resource for AI-powered video processing. A feed contains your feature configurations and begins in CREATING state before transitioning to AVAILABLE when ready.

AWS Elemental Inference console

After creating your feed, you can configure outputs for either vertical video cropping or clip generation. For cropping, you can start with an empty feed. The service automatically manages cropping parameters based on your video specifications. For clip generation, choose Add output, provide a name (such as “highlight-clips”), select Clipping as the output type, and set the status to ENABLED.

This standalone interface provides a streamlined experience for configuring and managing your AI-powered video transformations, making it straightforward to get started with vertical video creation and clip generation.

AWS MediaLive inference

Alternatively, you can enable AWS Elemental Inference directly within your AWS Elemental MediaLive channel configuration. You can use this integrated approach to add AI capabilities to your existing live video workflows without modifying your architecture. Enable the features you need as part of your channel setup, and AWS Elemental Inference will work in parallel with your video encoding.

AWS MediaLive inference console

After it’s enabled, you can configure Smart Crop with outputs for different resolution specifications within an Output group.

AWS MediaLive inference console

AWS Elemental MediaLive now includes a dedicated AWS Elemental Inference tab on the channel details page, providing a centralized view of your AI-powered video transformation configuration. The tab displays the service Amazon Resource Name (ARN), data endpoints, and feed output details, including which features, such as Smart Crop, are enabled and their current operational status.

How AWS Elemental Inference works
The service uses an agentic AI application that analyzes video in real time and automatically applies the right optimizations at the right moments. Vertical video cropping and clip generation run independently, executing multistep transformations that require no human intervention to extract value.

AWS Elemental Inference analyzes video and automatically applies AI capabilities with no human-in-the-loop prompting required. While you focus on quality video production, the service autonomously optimizes content to create personalized content experiences for your audience.

AWS Elemental Inference applies AI capabilities in parallel with live video, achieving 6–10 second latency compared to minutes for traditional postprocessing approaches. This “process once, optimize everywhere” method runs multiple AI features simultaneously on the same video stream, eliminating the need to reprocess content for each capability.

The service integrates seamlessly with AWS Elemental MediaLive, so you can enable AI features without modifying your existing video architecture. AWS Elemental Inference uses fully managed foundation models (FMs) that are automatically updated and optimized, so you don’t need dedicated AI teams or specialized expertise.

Key features at launch
Enjoy the following key features when AWS Elemental Inference launches:

  • Vertical video creation – AI-powered cropping intelligently transforms landscape broadcasts into vertical formats (9:16 aspect ratio) optimized for social and mobile platforms. The service tracks subjects and keeps key action visible, maintaining broadcast quality while automatically reformatting content for mobile viewing.
  • Clip generation with advanced metadata analysis – Automatically detects and extracts clips from live content, highlighting moments for real-time distribution. For live broadcasts, this means identifying game-winning plays in soccer and basketball—reducing manual editing from hours to minutes.
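As a back-of-the-envelope illustration of the 9:16 reformatting above (my own arithmetic, not the service's algorithm): cropping a landscape frame to vertical keeps the full frame height and takes a 9:16-wide slice of it.

```python
def vertical_crop(src_width: int, src_height: int) -> tuple[int, int]:
    """Return (width, height) of a 9:16 crop that keeps the full frame height."""
    crop_w = src_height * 9 // 16
    crop_w += crop_w % 2        # keep width even, as video encoders expect
    if crop_w > src_width:
        raise ValueError("source is already narrower than 9:16")
    return crop_w, src_height

# A 1920x1080 landscape frame yields a 608x1080 vertical slice;
# the service's subject tracking decides *where* that slice sits.
print(vertical_crop(1920, 1080))
```

The geometry is the easy part; the value of the service is choosing the horizontal placement of that slice frame-by-frame so the key action stays visible.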

Keep an eye on this space as more features and capabilities will be introduced throughout this year, including tighter integration with core AWS Elemental services and features to help customers monetize their video content.

Now available
AWS Elemental Inference is available today in 4 AWS Regions: US East (N. Virginia), US West (Oregon), Europe (Ireland), and Asia Pacific (Mumbai). You can enable AWS Elemental Inference through the AWS Elemental MediaLive console or integrate it into your workflows using the AWS Elemental MediaLive APIs.

With consumption-based pricing, you pay only for the features you use and the video you process, with no upfront costs or commitments. This means you can scale during peak events and optimize costs during quieter periods.

To learn more about AWS Elemental Inference, visit the AWS Elemental Inference product page. For technical implementation details, see the AWS Elemental Inference documentation.

 



from AWS News Blog https://ift.tt/FPVUkcO
via IFTTT

Monday, February 23, 2026

AWS Weekly Roundup: Claude Sonnet 4.6 in Amazon Bedrock, Kiro in GovCloud Regions, new Agent Plugins, and more (February 23, 2026)

Last week, my team met many developers at Developer Week in San Jose. My colleague Vinicius Senger delivered a great keynote about renascent software, a new way of building and evolving applications where humans and AI collaborate as co-developers using Kiro. Other colleagues spoke about building and deploying production-ready AI agents. Everyone stayed to ask questions about agent memory, multi-agent patterns, meta-tooling, and hooks. It was striking how many developers are actually building agents.

We are continuing to meet developers and hear their feedback at third-party developer conferences. You can meet us at dev/nexus, the largest and longest-running Java ecosystem conference, on March 4-6 in Atlanta. My colleague James Ward will speak about building AI agents with Spring and MCP, and Vinicius Senger and Jonathan Vogel will speak about 10 tools and tips to upgrade your Java code with AI. I’ll keep sharing places for you to connect with us.

Last week’s launches
Here are some of the other announcements from last week:

  • Claude Sonnet 4.6 model in Amazon Bedrock – You can now use Claude Sonnet 4.6, which offers frontier performance across coding, agents, and professional work at scale. Claude Sonnet 4.6 approaches Opus 4.6 intelligence at a lower cost. It enables faster, high-quality task completion, making it ideal for high-volume coding and knowledge work use cases.
  • Amazon EC2 Hpc8a instances powered by 5th Gen AMD EPYC processors – You can use new Hpc8a instances delivering up to 40% higher performance, increased memory bandwidth, and 300 Gbps Elastic Fabric Adapter networking. You can accelerate compute-intensive simulations, engineering workloads, and tightly coupled HPC applications.
  • Amazon SageMaker Inference for custom Amazon Nova models – You can now configure the instance types, auto-scaling policies, and concurrency settings for custom Nova model deployments with Amazon SageMaker Inference to best meet your needs.
  • Nested virtualization on virtual Amazon EC2 instances – You can create nested virtual machines by running KVM or Hyper-V on virtual EC2 instances. You can leverage this capability for use cases such as running emulators for mobile applications, simulating in-vehicle hardware for automobiles, and running Windows Subsystem for Linux on Windows workstations.
  • Server-Side Encryption by default in Amazon Aurora – Amazon Aurora further strengthens your security posture by automatically applying server-side encryption by default to all new database clusters using AWS-owned keys. This encryption is fully managed, transparent to users, and has no cost or performance impact.
  • Kiro in AWS GovCloud (US) Regions – Kiro is now available to the development teams behind government missions. Developers in regulated environments can now leverage Kiro’s agentic AI tool with the rigorous security controls they require.

For a full list of AWS announcements, be sure to keep an eye on the What’s New with AWS page.

Additional updates
Here are some additional news items that you might find interesting:

  • Introducing Agent Plugins for AWS – You can see how new open-source Agent Plugins for AWS extend coding agents with skills for deploying applications to AWS. Using the deploy-on-aws plugin, you can generate architecture recommendations, cost estimates, and infrastructure-as-code directly from your coding agent.
  • A chat with Byron Cook on automated reasoning and trust in AI systems – You can hear how automated reasoning is used to verify that AI systems do the right thing when they generate code or manage critical decisions. Byron Cook’s team has spent a decade proving correctness at AWS and is now applying those techniques to agentic systems.
  • Best practices for deploying AWS DevOps Agent in production – You can read best practices for setting up DevOps Agent Spaces that balance investigation capability with operational efficiency. According to Swami Sivasubramanian, AWS DevOps Agent, a frontier agent that resolves and proactively prevents incidents, has handled thousands of escalations, with an estimated root cause identification rate of over 86% within Amazon.

From AWS community

Join the AWS Builder Center to connect with community, share knowledge, and access content that supports your development.

Upcoming AWS events
Check your calendar and sign up for upcoming AWS events:

  • AWS Summits – Join AWS Summits in 2026, free in-person events where you can explore emerging cloud and AI technologies, learn best practices, and network with industry peers and experts. Upcoming Summits include Paris (April 1), London (April 22), and Bengaluru (April 23–24).
  • Amazon Nova AI Hackathon – Join developers worldwide to build innovative generative AI solutions using frontier foundation models and compete for $40,000 in prizes across five categories including agentic AI, multimodal understanding, UI automation, and voice experiences during this six-week challenge from February 2nd to March 16th, 2026.
  • AWS Community Days – Community-led conferences where content is planned, sourced, and delivered by community leaders, featuring technical discussions, workshops, and hands-on labs. Upcoming events include Ahmedabad (February 28), JAWS Days in Tokyo (March 7), Chennai (March 7), Slovakia (March 11), and Pune (March 21).

Browse here for upcoming AWS-led in-person and virtual events, startup events, and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Channy



from AWS News Blog https://ift.tt/BszQFro
via IFTTT

Monday, February 16, 2026

Amazon EC2 Hpc8a Instances powered by 5th Gen AMD EPYC processors are now available

Today, we’re announcing the general availability of Amazon Elastic Compute Cloud (Amazon EC2) Hpc8a instances, a new high performance computing (HPC) optimized instance type powered by the latest 5th Generation AMD EPYC processors with frequencies of up to 4.5 GHz. These instances are ideal for compute-intensive, tightly coupled HPC workloads, including computational fluid dynamics, simulations for faster design iterations, high-resolution weather modeling within tight operational windows, and complex crash simulations that require rapid time-to-results.

The new Hpc8a instances deliver up to 40% higher performance, 42% greater memory bandwidth, and up to 25% better price-performance compared to previous generation Hpc7a instances. Customers benefit from the high core density, memory bandwidth, and low-latency networking that help them scale efficiently and reduce job completion times for their compute-intensive simulation workloads.

Hpc8a instances
Hpc8a instances are available with 192 cores, 768 GiB memory, and 300 Gbps Elastic Fabric Adapter (EFA) networking to run applications requiring high levels of inter-node communication at scale.

Instance Name  | Physical Cores | Memory (GiB) | EFA Network Bandwidth (Gbps) | Network Bandwidth (Gbps) | Attached Storage
Hpc8a.96xlarge | 192            | 768          | Up to 300                    | 75                       | EBS Only

Hpc8a instances are available in a single 96xlarge size with a 1:4 core-to-memory ratio. You can right-size for your HPC workload requirements by customizing the number of active cores at instance launch. These instances also use sixth-generation AWS Nitro Cards, which offload CPU virtualization, storage, and networking functions to dedicated hardware and software, enhancing performance and security for your workloads.
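Core customization at launch is done through the EC2 CpuOptions setting. The following is a minimal boto3 sketch; the AMI ID is a placeholder, and the launch call itself is deliberately left unexecuted:

```python
def build_run_instances_params(core_count):
    """Build EC2 RunInstances parameters that launch an Hpc8a instance
    with only the cores you need active. SMT is disabled on Hpc8a, so
    ThreadsPerCore is 1."""
    return {
        "ImageId": "ami-0123456789abcdef0",  # placeholder HPC-ready AMI
        "InstanceType": "hpc8a.96xlarge",
        "MinCount": 1,
        "MaxCount": 1,
        "CpuOptions": {
            "CoreCount": core_count,  # any value up to the full 192 cores
            "ThreadsPerCore": 1,
        },
    }


def launch(params):
    """Submit the launch request. Requires AWS credentials and is not
    executed in this sketch."""
    import boto3  # imported lazily so the sketch stays self-contained

    ec2 = boto3.client("ec2")
    return ec2.run_instances(**params)


print(build_run_instances_params(96)["CpuOptions"])
# → {'CoreCount': 96, 'ThreadsPerCore': 1}
```

Launching with fewer active cores keeps the full memory and network bandwidth of the 96xlarge size while reducing per-core licensing costs for some HPC applications.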

You can use Hpc8a instances with AWS ParallelCluster and AWS Parallel Computing Service (AWS PCS) to simplify workload submission and cluster creation, and with Amazon FSx for Lustre for sub-millisecond latencies and up to hundreds of gigabytes per second of storage throughput. To achieve the best performance for HPC workloads, these instances have Simultaneous Multithreading (SMT) disabled.

Now available
Amazon EC2 Hpc8a instances are now available in US East (Ohio) and Europe (Stockholm) AWS Regions. For Regional availability and a future roadmap, search the instance type in the CloudFormation resources tab of AWS Capabilities by Region.

You can purchase these instances as On-Demand Instances or with Savings Plans. To learn more, visit the Amazon EC2 Pricing page.

Give Hpc8a instances a try in the Amazon EC2 console. To learn more, visit the Amazon EC2 Hpc8a instances page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy




Announcing Amazon SageMaker Inference for custom Amazon Nova models

Since we launched Amazon Nova customization in Amazon SageMaker AI at the AWS NY Summit 2025, customers have been asking for the same capabilities with Amazon Nova that they have when customizing open weights models in Amazon SageMaker Inference. They also wanted more control and flexibility in custom model inference over the instance types, auto-scaling policies, context length, and concurrency settings that production workloads demand.

Today, we’re announcing the general availability of custom Nova model support in Amazon SageMaker Inference, a production-grade, configurable, and cost-efficient managed inference service to deploy and scale full-rank customized Nova models. You can now experience an end-to-end customization journey: train Nova Micro, Nova Lite, and Nova 2 Lite models with reasoning capabilities using Amazon SageMaker Training Jobs or Amazon HyperPod, and seamlessly deploy them with the managed inference infrastructure of Amazon SageMaker AI.

With Amazon SageMaker Inference for custom Nova models, you can reduce inference cost through optimized GPU utilization by using Amazon Elastic Compute Cloud (Amazon EC2) G5 and G6 instances instead of P5 instances, auto-scaling based on 5-minute usage patterns, and configurable inference parameters. This feature enables deployment of customized Nova models with continued pre-training, supervised fine-tuning, or reinforcement fine-tuning for your use cases. You can also set advanced configurations for context length, concurrency, and batch size to optimize the latency-cost-accuracy tradeoff for your specific workloads.

Let’s see how to deploy customized Nova models on SageMaker AI real-time endpoints, configure inference parameters, and invoke your models for testing.

Deploy custom Nova models in SageMaker Inference
At AWS re:Invent 2025, we introduced new serverless customization in Amazon SageMaker AI for popular AI models, including Nova models. With a few clicks, you can select a model and customization technique, and handle model evaluation and deployment. If you already have a trained custom Nova model artifact, you can deploy the model on SageMaker Inference through SageMaker Studio or the SageMaker AI SDK.

In SageMaker Studio, choose a trained Nova model from the Models menu. To deploy it, choose Deploy, then SageMaker AI, and then Create new endpoint.

Choose the endpoint name, instance type, and advanced options such as instance count, maximum instance count, and permissions and networking, and then choose Deploy. At GA launch, you can use g5.12xlarge, g5.24xlarge, g5.48xlarge, g6.12xlarge, g6.24xlarge, g6.48xlarge, and p5.48xlarge instance types for the Nova Micro model; g5.24xlarge, g5.48xlarge, g6.24xlarge, g6.48xlarge, and p5.48xlarge for the Nova Lite model; and p5.48xlarge for the Nova 2 Lite model.

Creating your endpoint requires time to provision the infrastructure, download your model artifacts, and initialize the inference container.

After model deployment completes and the endpoint status shows InService, you can perform real-time inference using the new endpoint. To test the model, choose the Playground tab and input your prompt in the Chat mode.

You can also use the SageMaker AI SDK to create two resources: a SageMaker AI model object that references your Nova model artifacts, and an endpoint configuration that defines how the model will be deployed.

The following code creates a SageMaker AI model that references your Nova model artifacts:

# Create a SageMaker AI model that references your Nova model artifacts
import boto3

sagemaker = boto3.client('sagemaker')

model_response = sagemaker.create_model(
    ModelName='Nova-micro-ml-g5-12xlarge',
    PrimaryContainer={
        'Image': '123456789012.dkr.ecr.us-east-1.amazonaws.com/nova-inference-repo:v1.0.0',
        'ModelDataSource': {
            'S3DataSource': {
                'S3Uri': 's3://your-bucket-name/path/to/model/artifacts/',
                'S3DataType': 'S3Prefix',
                'CompressionType': 'None'
            }
        },
        # Model parameters (environment variable values must be strings)
        'Environment': {
            'CONTEXT_LENGTH': '8000',
            'CONCURRENCY': '16',
            'DEFAULT_TEMPERATURE': '0.0',
            'DEFAULT_TOP_P': '1.0'
        }
    },
    ExecutionRoleArn=SAGEMAKER_EXECUTION_ROLE_ARN,
    EnableNetworkIsolation=True
)
print("Model created successfully!")

Next, create an endpoint configuration that defines your deployment infrastructure and deploy your Nova model by creating a SageMaker AI real-time endpoint. This endpoint will host your model and provide a secure HTTPS endpoint for making inference requests.

# Create an endpoint configuration
production_variant = {
    'VariantName': 'primary',
    'ModelName': 'Nova-micro-ml-g5-12xlarge',
    'InitialInstanceCount': 1,
    'InstanceType': 'ml.g5.12xlarge',
}

config_response = sagemaker.create_endpoint_config(
    EndpointConfigName='Nova-micro-ml-g5-12xlarge-Config',
    ProductionVariants=[production_variant]  # must be a list of variants
)
print("Endpoint configuration created successfully!")

# Deploy your Nova model
endpoint_response = sagemaker.create_endpoint(
    EndpointName='Nova-micro-ml-g5-12xlarge-endpoint',
    EndpointConfigName='Nova-micro-ml-g5-12xlarge-Config'
)
print("Endpoint creation initiated successfully!")

After the endpoint is created, you can send inference requests to generate predictions from your custom Nova model. Amazon SageMaker AI supports synchronous endpoints for real-time inference in streaming and non-streaming modes, and asynchronous endpoints for batch processing.

For example, the following code defines a helper function to invoke your endpoint and then sends a streaming chat request for text generation:

import json

import boto3
from botocore.exceptions import ClientError

runtime_client = boto3.client('sagemaker-runtime')
ENDPOINT_NAME = 'Nova-micro-ml-g5-12xlarge-endpoint'

def invoke_nova_endpoint(request_body):
    """
    Invoke the Nova endpoint with automatic streaming detection.

    Args:
        request_body (dict): Request payload containing the prompt and parameters

    Returns:
        dict: Response from the model (for non-streaming requests)
        None: For streaming requests (prints output directly)
    """
    body = json.dumps(request_body)
    is_streaming = request_body.get("stream", False)

    try:
        print(f"Invoking endpoint ({'streaming' if is_streaming else 'non-streaming'})...")

        if is_streaming:
            response = runtime_client.invoke_endpoint_with_response_stream(
                EndpointName=ENDPOINT_NAME,
                ContentType='application/json',
                Body=body
            )

            event_stream = response['Body']
            for event in event_stream:
                if 'PayloadPart' in event:
                    chunk = event['PayloadPart']
                    if 'Bytes' in chunk:
                        data = chunk['Bytes'].decode()
                        print("Chunk:", data)
        else:
            # Non-streaming inference
            response = runtime_client.invoke_endpoint(
                EndpointName=ENDPOINT_NAME,
                ContentType='application/json',
                Accept='application/json',
                Body=body
            )

            response_body = response['Body'].read().decode('utf-8')
            result = json.loads(response_body)
            print("✅ Response received successfully")
            return result

    except ClientError as e:
        error_code = e.response['Error']['Code']
        error_message = e.response['Error']['Message']
        print(f"❌ AWS Error: {error_code} - {error_message}")
    except Exception as e:
        print(f"❌ Unexpected error: {str(e)}")

# Streaming chat request with comprehensive parameters
streaming_request = {
    "messages": [
        {"role": "user", "content": "Compare our Q4 2025 actual spend against budget across all departments and highlight variances exceeding 10%"}
    ],
    "max_tokens": 512,
    "stream": True,
    "temperature": 0.7,
    "top_p": 0.95,
    "top_k": 40,
    "logprobs": True,
    "top_logprobs": 2,
    "reasoning_effort": "low",  # Options: "low", "high"
    "stream_options": {"include_usage": True}
}

invoke_nova_endpoint(streaming_request)
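For batch processing, the asynchronous counterpart is InvokeEndpointAsync, where the payload is staged in Amazon S3 and passed by reference. Here's a minimal sketch; the endpoint name and S3 URI are placeholders, and the endpoint must have been created with an asynchronous inference configuration:

```python
def build_async_invoke_params(endpoint_name, input_s3_uri):
    """Build parameters for InvokeEndpointAsync: the request payload is
    staged in S3 and referenced by location instead of being sent inline."""
    return {
        "EndpointName": endpoint_name,
        "InputLocation": input_s3_uri,
        "ContentType": "application/json",
    }


def invoke_async(params):
    """Submit the async request. Requires AWS credentials and an
    async-enabled endpoint; not executed in this sketch."""
    import boto3  # imported lazily so the sketch stays self-contained

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint_async(**params)
    # The response includes an OutputLocation where the result will land.
    return response["OutputLocation"]


params = build_async_invoke_params(
    "Nova-micro-ml-g5-12xlarge-async",           # placeholder endpoint name
    "s3://your-bucket-name/batch/request.json",  # placeholder payload location
)
print(params["ContentType"])  # → application/json
```

Asynchronous invocations return immediately; you poll the output S3 location (or subscribe to a notification) to collect results, which suits large batch payloads better than holding a synchronous connection open.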

For full code examples, visit Customizing Amazon Nova models on Amazon SageMaker AI. To learn more about best practices for deploying and managing models, visit Best Practices for SageMaker AI.

Now available
Amazon SageMaker Inference for custom Nova models is available today in the US East (N. Virginia) and US West (Oregon) AWS Regions. For Regional availability and a future roadmap, visit AWS Capabilities by Region.

The feature supports Nova Micro, Nova Lite, and Nova 2 Lite models with reasoning capabilities, running on EC2 G5, G6, and P5 instances with auto-scaling support. You pay only for the compute instances you use, with per-hour billing and no minimum commitments. For more information, visit the Amazon SageMaker AI Pricing page.

Give it a try in the Amazon SageMaker AI console and send feedback to AWS re:Post for SageMaker or through your usual AWS Support contacts.

Channy




AWS Weekly Roundup: Amazon EC2 M8azn instances, new open weights models in Amazon Bedrock, and more (February 16, 2026)

I joined AWS in 2021, and since then I’ve watched the Amazon Elastic Compute Cloud (Amazon EC2) instance family grow at a pace that still surprises me. From AWS Graviton-powered instances to specialized accelerated computing options, it feels like every few months there’s a new instance type landing that pushes performance boundaries further. As of February 2026, AWS offers over 1,160 Amazon EC2 instance types, and that number keeps climbing.

This week’s opening news is a good example: The general availability of Amazon EC2 M8azn instances. These are general purpose, high-frequency, high-network instances powered by fifth generation AMD EPYC processors, offering the highest maximum CPU frequency in the cloud at 5 GHz. Compared to the previous generation M5zn instances, M8azn instances deliver up to 2x compute performance, 4.3x higher memory bandwidth, and a 10x larger L3 cache. They also provide up to 2x networking throughput and up to 3x Amazon Elastic Block Store (Amazon EBS) throughput compared with M5zn.

Built on the AWS Nitro System using sixth generation Nitro Cards, M8azn instances target workloads such as real-time financial analytics, high-performance computing, high-frequency trading, CI/CD pipelines, gaming, and simulation modeling across automotive, aerospace, energy, and telecommunications. The instances feature a 4:1 ratio of memory to vCPU and are available in 9 sizes ranging from 2 to 96 vCPUs with up to 384 GiB of memory, including two bare metal variants. For more information, visit the Amazon EC2 M8azn instance page.

Last week’s launches
Here are some of the other announcements from last week:

  • Amazon Bedrock adds support for six fully managed open weights models – Amazon Bedrock now supports DeepSeek V3.2, MiniMax M2.1, GLM 4.7, GLM 4.7 Flash, Kimi K2.5, and Qwen3 Coder Next. These models span frontier reasoning and agentic coding workloads. DeepSeek V3.2 and Kimi K2.5 target reasoning and agentic intelligence, GLM 4.7 and MiniMax M2.1 support autonomous coding with large output windows, and Qwen3 Coder Next and GLM 4.7 Flash provide cost-efficient alternatives for production deployment. These models are powered by Project Mantle and provide out-of-the-box compatibility with OpenAI API specifications. With this launch, you can also use the new open weights models DeepSeek V3.2, MiniMax M2.1, and Qwen3 Coder Next in Kiro, a spec-driven AI development tool.
  • Amazon Bedrock expands support for AWS PrivateLink – Amazon Bedrock now supports AWS PrivateLink for the bedrock-mantle endpoint, in addition to existing support for the bedrock-runtime endpoint. The bedrock-mantle endpoint is powered by Project Mantle, a distributed inference engine for large-scale machine learning model serving on Amazon Bedrock. Project Mantle provides serverless inference with quality of service controls, higher default customer quotas with automated capacity management, and out-of-the-box compatibility with OpenAI API specifications. AWS PrivateLink support for OpenAI API-compatible endpoints is available in 14 AWS Regions. To get started, visit the Amazon Bedrock console or the OpenAI API compatibility documentation.
  • Amazon EKS Auto Mode announces enhanced logging for managed Kubernetes capabilities – You can now configure log delivery sources using Amazon CloudWatch Vended Logs in Amazon EKS Auto Mode. This helps you collect logs from Auto Mode’s managed Kubernetes capabilities for compute autoscaling, block storage, load balancing, and pod networking. Each Auto Mode capability can be configured as a CloudWatch Vended Logs delivery source with built-in AWS authentication and authorization at a reduced price compared to standard CloudWatch Logs. You can deliver logs to CloudWatch Logs, Amazon S3, or Amazon Data Firehose destinations. This feature is available in all Regions where EKS Auto Mode is available.
  • Amazon OpenSearch Serverless now supports Collection Groups – You can use new Collection Groups to share OpenSearch Compute Units (OCUs) across collections with different AWS Key Management Service (AWS KMS) keys. Collection Groups reduce overall OCU costs through a shared compute model while maintaining collection-level security and access controls. They also introduce the ability to specify minimum OCU allocations alongside maximum OCU limits, providing guaranteed baseline capacity at startup for latency-sensitive applications. Collection Groups are available in all Regions where Amazon OpenSearch Serverless is currently available.
  • Amazon RDS now supports backup configuration when restoring snapshots – You can view and modify the backup retention period and preferred backup window before and during snapshot restore operations. Previously, restored database instances and clusters inherited backup parameter values from snapshot metadata and could only be modified after restore was complete. You can now view backup settings as part of automated backups and snapshots, and specify or modify these values when restoring, eliminating the need for post-restoration modifications. This is available for all Amazon RDS database engines (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Db2) and Amazon Aurora (MySQL-Compatible and PostgreSQL-Compatible editions) in all AWS commercial Regions and AWS GovCloud (US) Regions at no additional cost.
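The restore-time backup settings in the Amazon RDS item above can be sketched with boto3. Note that the BackupRetentionPeriod and PreferredBackupWindow parameter names here are assumptions based on the announcement; check the RestoreDBInstanceFromDBSnapshot API reference for the exact names:

```python
def build_restore_params(snapshot_id, instance_id):
    """Build parameters for restoring an RDS snapshot while specifying
    backup settings up front, instead of modifying them after the
    restore completes. Parameter names for the new backup settings are
    assumptions based on the announcement."""
    return {
        "DBInstanceIdentifier": instance_id,
        "DBSnapshotIdentifier": snapshot_id,
        "BackupRetentionPeriod": 7,              # days of automated backups (assumed name)
        "PreferredBackupWindow": "03:00-04:00",  # daily UTC window (assumed name)
    }


def restore(params):
    """Submit the restore request. Requires AWS credentials; not
    executed in this sketch."""
    import boto3  # imported lazily so the sketch stays self-contained

    rds = boto3.client("rds")
    return rds.restore_db_instance_from_db_snapshot(**params)


print(build_restore_params("my-snapshot", "restored-db")["BackupRetentionPeriod"])
# → 7
```

Setting these values at restore time avoids the window between restore completion and a follow-up ModifyDBInstance call during which the instance runs with inherited backup settings.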

For a full list of AWS announcements, be sure to keep an eye on the What’s New with AWS page.

Upcoming AWS events
Check your calendar and sign up for upcoming AWS events:

AWS Summits – Join AWS Summits in 2026, free in-person events where you can explore emerging cloud and AI technologies, learn best practices, and network with industry peers and experts. Upcoming Summits include Paris (April 1), London (April 22), and Bengaluru (April 23–24).

AWS AI and Data Conference 2026 – A free, single-day in-person event on March 12 at the Lyrath Convention Centre in Ireland. The conference covers designing, training, and deploying agents with Amazon Bedrock, Amazon SageMaker, and QuickSight, integrating them with AWS data services, and applying governance practices to operate them at scale. The agenda includes strategic guidance and hands-on labs for architects, developers, and business leaders.

AWS Community Days – Community-led conferences where content is planned, sourced, and delivered by community leaders, featuring technical discussions, workshops, and hands-on labs. Upcoming events include Ahmedabad (February 28), Slovakia (March 11), and Pune (March 21).

Join the AWS Builder Center to connect with builders, share solutions, and access content that supports your development. Browse here for upcoming AWS-led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Esra

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!
