Software Development Engineer, AI/ML, AWS Neuron, Model Inference

Cupertino, CA, United States • Posted May 29, 2026

Job Type: Full-time

Category: other-general

Application Deadline: June 08, 2026

Role Description

The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon’s custom machine learning accelerators, Inferentia and Trainium.
  
The SDK includes an ML compiler, a runtime, and an application framework that integrates with popular ML frameworks such as PyTorch and JAX, enabling high-performance ML inference and training.
  
The Inference Enablement and Acceleration team is at the forefront of running a wide range of models and supporting novel architectures while maximizing their performance on AWS's custom ML accelerators. Working across the stack from PyTorch down to the hardware-software boundary, our engineers build systematic infrastructure, ...
