Sr. Tech Product Manager - Neuron Runtime and ML Infrastructure - Amazon
Dallas, TX 75215
About the Job
Sr. Tech Product Manager - Neuron Runtime and ML Infrastructure
AWS Neuron is looking for an experienced Technical Product Manager to define and drive product strategy for Neuron Runtime and ML Infrastructure integration. You will be part of the AWS Neuron Product Management team, driving innovation in machine learning acceleration software. AWS Neuron is the software stack for Trainium and Inferentia, the AWS Machine Learning chips, delivering best-in-class ML performance in the cloud. You will lead runtime and infrastructure requirements, working backward from customer needs, and drive performance and scalability features across Neuron Runtime and container ecosystems. These features enable ML training and inference at scale, optimal orchestration, efficient resource management, efficient execution of machine learning models, and integration with AWS services. This role will empower customers to successfully deploy and scale ML workloads on AWS Neuron through a deep understanding of runtime systems, infrastructure design, and cloud service integration.
The ideal candidate will have a deep understanding of runtime systems, distributed computing, container orchestration, and ML infrastructure, with expertise in performance optimization, collective communication, Kubernetes ecosystems, Linux systems and enterprise distributions, and ML infrastructure deployment.
Key Responsibilities
- Drive and execute product strategy and roadmap, working backward from customer requirements, in collaboration with engineering technical leadership.
- Assess technical implications of product architecture and optimization decisions.
- Drive technical alignment across Neuron components, Neuron workflows and dependencies.
- Work directly with software engineering teams to define and execute on new features.
- Produce clear and concise documents such as PRFAQs and PRDs.
- Write user stories and validate that features meet developer needs.
- Drive feature discussions with customers, engineering, and other stakeholders.
- Anticipate bottlenecks, manage risks and escalations, and balance technical constraints.
- Find opportunities to innovate on behalf of our customers, design features related to these opportunities, and always push to improve our product's developer experience.
- Build ecosystem partnerships and stay connected with industry trends.
- Represent the product at relevant industry events.
BASIC QUALIFICATIONS
- Bachelor's degree in computer science, engineering, analytics, mathematics, statistics, IT or equivalent.
- 10+ years of industry experience, including 5+ years in technical product management and 3+ years in software development.
- Solid knowledge of container orchestration and Kubernetes.
- Solid knowledge of computer architecture fundamentals and operating systems concepts.
- Excellent written and verbal communication abilities.
PREFERRED QUALIFICATIONS
- Experience with Linux systems and kernel development.
- Track record of driving developer libraries.
- Experience with Machine Learning accelerators.
- Experience with concepts such as performance optimization, profiling and tooling.
- Experience with Deep Learning model training or inference.
- Experience with distributed computing and parallel processing.
- Hands-on experience with a major ML framework: JAX or PyTorch.
- Familiarity with AWS services and cloud infrastructure engineering.
- Track record of driving open standards and ecosystem integration.
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.