Description
WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.
THE TEAM:
AMD's Data Center GPU organization is transforming the industry with our Instinct AI Accelerator Solutions. Our primary objective is to design exceptional products that drive the evolution of computing experiences, serving as the cornerstone for cutting-edge AI model development and inference in Mega Data Centers, Enterprise Data Centers, HPC and Embedded systems. If this resonates with you, come and join our Data Center GPU organization, where we are building amazing AI-powered products with amazing people.
THE ROLE:
This is a unique opportunity to own and define the most technically demanding and business-critical aspects of AI datacenter hardware platforms. You will set the standard for system-level customer debug, directly shaping the reliability and success of our Instinct GPU product deployments worldwide. Your work will span silicon, board-level hardware, platform, high-speed interconnect, memory subsystems (DDR, HBM), power management, networking and data center infrastructure integration.
If you are a technical executive with deep expertise in hardware and platform debug—thriving at the intersection of leadership, architecture, and execution—we want to speak with you.
THE PERSON:
We are seeking a highly accomplished Technical Director to lead the Instinct Customer Engineering Platform Debug and Qualification team, with a primary focus on hardware and platform excellence for next-generation AI datacenter products. This high-impact, high-visibility role drives critical debug efforts from Day 0, ensuring the reliability, stability, and customer success of advanced datacenter hardware solutions.
This is not a typical management role. We need a deep technical leader with proven expertise in solving complex, multi-layered hardware and platform challenges, collaborating with world-class architects, engineers and customers.
KEY RESPONSIBILITIES:
- Lead customer debug engagements for end-to-end datacenter hardware platforms—including silicon, board-level hardware, platform, and system integration.
- Collaborate with silicon, firmware, and hardware architects, post-silicon bring-up leaders, and validation experts to define and execute robust debug and co-engineering strategies for technologies such as PCIe, CXL, Ethernet, DDR/HBM memory, RAS, and power management.
- Drive hands-on platform and system-level debug and validation, including complex stress testing, irritator-driven reproductions, and root cause analysis at the hardware and platform level.
- Oversee functional areas such as GPU and system management (BMC) firmware, networking/switch hardware (Ethernet, InfiniBand), OS/hardware interfaces, kernel, drivers, security, telemetry, PCIe, memory, and schedulers.
- Represent hardware debug progress and insights with clarity and impact at the executive level, ensuring alignment and accountability across cross-functional teams.
- Build a culture of ownership, accountability, and technical excellence, while mentoring senior engineers and technical leaders in hardware and platform domains.
PREFERRED SKILLS:
- Proven leadership in datacenter/platform hardware debug within large-scale, complex hardware/software systems.
- Hands-on debug expertise across system architecture, platform firmware, and hardware (board, silicon, and data center rack).
- Strong track record in building and leading high-performing debug/customer engineering teams focused on hardware and platform reliability.
- Exceptional ability to articulate technical depth upward to executives and outward to customers, simplifying hardware complexity without losing precision.
- High adaptability, thriving in fast-paced, ambiguous environments where hardware priorities shift quickly.
- Extensive experience in debug and validation roles involving OS, firmware, silicon, and hardware issues, with a strong emphasis on GPU technologies.
- Expert knowledge of x86 architecture, SoC design, memory, RAS, power management, system management, BMC, PCIe, CXL, and data center infrastructure.
- Detail-oriented, highly organized, and capable of leading multiple hardware work streams within tight deadlines.
ACADEMIC CREDENTIALS:
- Bachelor's or Master's degree in Computer Engineering with 15+ years of applicable experience, with a significant portion in GPU technologies and customer-facing roles.
LOCATION:
Austin, TX (Remote is an option)
Benefits offered are described in AMD benefits at a glance.
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.