Principal Engineer, GKE Platform for AI Inference Workloads
Company: Google
Location: Seattle
Posted on: April 2, 2026
Job Description:
In accordance with Washington state law, we are highlighting our
comprehensive benefits package, which is available to all eligible
US-based employees. Benefits for this role include:

Health, dental, vision, life, disability insurance
Retirement Benefits: 401(k) with company match
Paid Time Off: 20 days of vacation per year, accruing at a rate of
6.15 hours per pay period for the first five years of employment
Sick Time: 40 hours/year (statutory, where applicable); 5 days/event
(discretionary)
Maternity Leave (Short-Term Disability + Baby Bonding): 28-30 weeks
Baby Bonding Leave: 18 weeks
Holidays: 13 paid days per year

Note: By applying to this position you will have an opportunity to
share your preferred working location from the following: Seattle, WA,
USA; Kirkland, WA, USA; Sunnyvale, CA, USA.

Minimum qualifications:
Bachelor's degree in Computer Science, a related technical field, or
equivalent practical experience.
15 years of experience in software engineering, or 15 years of
experience with an advanced degree.
Experience building distributed systems and driving technical
strategy for platform-level infrastructure.
Experience with Kubernetes, container runtimes, and AI/ML
infrastructure (e.g., inference serving, LLM, hardware
accelerators).

Preferred qualifications:
Master's degree or PhD in
Computer Science or related technical field.
Experience interacting with senior customer stakeholders (CTOs, Chief
Architects) to represent the technical vision of the organization.
Deep technical understanding of high-performance networking (RDMA,
NCCL), storage/caching architectures for massive model weights, and
accelerator virtualization/sharing mechanisms.
Demonstrated track record of significant technical contributions to
the Kubernetes open-source project or related CNCF AI/ML projects
(e.g., Kueue).
Demonstrated track record of influencing cross-functional teams
(Product, Engineering, Research) to deliver complex technical
outcomes.

About the job
Google Kubernetes Engine (GKE) is the
industry standard for container orchestration and the core of
Google Cloud’s modernization strategy. We are now embarking on a
mission to reinvent GKE and Kubernetes as the premier substrate for
the next generation of computing: AI Inference at massive scale. We
believe that serving foundation models and large language models
represents a paradigm shift in cloud computing. These workloads
demand a fundamental rethink of orchestration, moving from
CPU-bound microservices to accelerator-bound, memory-bandwidth
intensive workloads that require specialized scheduling,
heterogeneous compute pools, and ultra-high-speed networking.

As the Principal Engineer, you will lead the technical and
architectural reinvention of GKE to become the "Inference Engine"
for the world. This leader will provide critical llm-d leadership,
defining and driving the long-term strategic
technical priorities for integrating high-scale AI Inference and
the llm-d stack as a core competency into the GKE platform, while
leading our contributions to the broader open-source ecosystem.

Google Cloud accelerates every organization’s ability to digitally
transform its business and industry. We deliver enterprise-grade
solutions that leverage Google’s cutting-edge technology, and tools
that help developers build more sustainably. Customers in more than
200 countries and territories turn to Google Cloud as their trusted
partner to enable growth and solve their most critical business
problems.

The US base salary range for this full-time position is
$307,000-$427,000 + bonus + equity + benefits. Our salary ranges are
determined by role, level, and location. Within the range,
individual pay is determined by work location and additional
factors, including job-related skills, experience, and relevant
education or training. Your recruiter can share more about the
specific salary range for your preferred location during the hiring
process. Please note that the compensation details listed in US
role postings reflect the base salary only, and do not include
bonus, equity, or benefits. Learn more about benefits at Google.

Responsibilities
Lead the architectural direction for llm-d, ensuring a highly
optimized, scalable foundation for distributed LLM and Reinforcement
Learning (RL) serving across the GKE fleet.
Define GKE's evolution to support massive-scale inference and RL,
solving novel orchestration problems in dynamic resource allocation,
multi-host TPU/GPU scheduling, and high-throughput networking.
Partner with strategic AI model builders, Google DeepMind, and Vertex
AI to co-develop an AI-first roadmap, leveraging Google's custom
silicon to optimize throughput and compute density.
Lead the broader Kubernetes ecosystem and Open Source Software (OSS)
community, driving key upstream initiatives to establish industry
standards for AI, RL, and accelerator orchestration.