Sr Staff AI Platform Engineer - Golang and Kubernetes
Synopsys
Hyderabad, Telangana, India
Synopsys’ Generative AI Center of Excellence defines the technology strategy to advance applications of Generative AI across the company. The Gen AI COE pioneers the core technologies (platforms, processes, data, and foundation models) that enable generative AI solutions, and partners with business groups and corporate functions to advance AI-focused roadmaps.
We are looking for an experienced, passionate, and self-driven individual who combines broad technical strategy with the ability to tackle architectural and modernization challenges. The ideal candidate will help build an enterprise Machine Learning platform, working with a team of enthusiastic and dynamic ML engineers and data scientists to build a platform that helps Synopsys R&D teams experiment, train models, and build Gen AI and ML products.
You will be responsible for:
- Building an AI platform for Synopsys to orchestrate enterprise-wide data pipelines, ML training, and inference servers
- Developing an "AI App Store" ecosystem that enables R&D teams to host Gen AI applications in the cloud
- Developing capabilities to ship cloud-native (containerized) AI applications and AI systems to on-premises customers
- Orchestrating GPU scheduling from within the Kubernetes ecosystem (e.g. NVIDIA GPU Operator, MIG, and so on)
- Creating a reliable and cost-effective hybrid cloud architecture using cutting-edge technologies (e.g. Kubernetes Cluster Federation, Azure Arc, and so on)
Required Qualifications
- BS/MS/PhD in Computer Science/Software Engineering or an equivalent degree
- 12+ years of total experience building systems software, enterprise software applications, and microservices
- Expertise and/or experience in the following programming languages: Go and Python
- Experience building highly scalable REST APIs
- Experience with event-driven software architecture and message brokers (NATS / Kafka)
- Experience designing complex distributed systems (high-level and low-level system design)
- In-depth knowledge of the CAP theorem and experience applying it when building real-world distributed systems
- In-depth Kubernetes knowledge: ability to deploy Kubernetes on-prem, plus working experience with managed Kubernetes services (AKS/EKS/GKE) and the Kubernetes APIs
- Strong systems knowledge of the Linux kernel, cgroups, namespaces, and Docker
- Experience with at least one cloud provider (AWS/GCP/Azure)
- Ability to solve complex problems using efficient algorithms
- Experience using an RDBMS (PostgreSQL preferred) for storing and queuing large sets of data
Nice to have:
- Experience with service meshes (Istio)
- Experience with Kubernetes cluster federation
- Prior experience with AI/ML workflows and tools (PyTorch, MLflow, Airflow, …)
- Experience prototyping, experimenting, and testing with large datasets and analytic data flows in production
- Strong fundamentals in Statistics, Machine Learning, and/or Deep Learning
Inclusion and Diversity are important to us. Synopsys considers all applicants for employment without regard to race, color, religion, national origin, gender, sexual orientation, gender identity, age, military veteran status, or disability.