Senior Compiler Engineer, GPU Code Object Rewriting & Tooling
San Jose
Thursday, 07 May 2026
We are building first-class compilation and code-object tooling for HIP, OpenCL, OpenMP, and the broader ROCm stack. Our compilers, loaders, and post-link tools underpin every HPC application and AI framework that runs on AMD GPUs. We are investing heavily in on-the-fly ISA rewriting and hot-patching infrastructure, inside the Code Object Manager (COMGR) and the AMDGPU backend, that lets us ship hardware fixes, errata workarounds, instrumentation, and performance experiments without recompiling user code. We are looking for a versatile Senior Compiler Engineer who can move fluidly between LLVM MC-level rewriting, ELF/DWARF manipulation, AMDGPU codegen, and the tooling that ties it all together.

This is a multi-year investment area: the rewriting infrastructure starts as an errata-mitigation platform and grows into a long-term foundation for post-link transformation, binary instrumentation, and experimentation across multiple generations of AMD GPU silicon. You will own this codebase as it matures.

THE PERSON:
If you are a Compiler Engineer who is equally comfortable reading an LLVM IR pass, disassembling a .text section by hand, reasoning about VGPR liveness across a CFG, and debugging an ELF loader, we would love to talk to you. You enjoy owning problems end-to-end, from hardware erratum to shipped mitigation, and you thrive when the work crosses the traditional compiler/runtime/loader boundaries. You are energized by deep, long-lived systems: the kind of engineer who wants to master a domain across multiple architecture generations, not rotate between short-lived projects.

KEY RESPONSIBILITIES:
Design, implement, and maintain the HotSwap ISA rewriting subsystem in COMGR (amd/comgr/src/comgr-hotswap- - ), including ELF patching, DWARF debug-line adjustment, trampoline growth, NOP-sled management, and branch encoding.
Build and extend LLVM MC-based disassembly, assembly, and re-encoding pipelines used by post-link transformation tools.
Prototype and evaluate raising-based rewriting pipelines: lifting disassembled AMDGPU machine code into a structured intermediate representation (LLVM MachineIR or a domain-specific in-tree IR) for analysis and transformation, then lowering back to valid code objects.
Author ISA-specific rewrite policies (e.g., GFX1250 B0-to-A0 style errata mitigations) and generalize them into reusable, ISA-parametric infrastructure.
Implement and harden CFG construction, backward liveness analysis, and scratch VGPR allocation on raw AMDGPU machine code.
Adjust ELF section/program headers, AMDGPU notes, kernel descriptors, and code-object metadata safely on malformed or adversarial inputs.
Contribute to the AMDGPU LLVM backend, Clang driver, and LLD where rewriting needs first-class compiler support.
Participate in new architecture and silicon bring-ups, owning the compiler/tooling path from bring-up workarounds to long-term codegen quality.
Analyze, reproduce, and fix issues across the compiler, loader, and runtime boundary; build unit tests, fuzzers, and regressions for each fix.
Collaborate with ROCm runtime, HSA, and hardware architecture teams spread across geographic locations.
Represent AMD in open-source communities (e.g., LLVM) and relevant standards bodies (e.g., DWARF Committee) through upstream patches, RFCs, and design reviews.

PREFERRED EXPERIENCE:
Strong C/C++ programming skills, with a demonstrated ability to write careful, bounds-checked code against untrusted binary input.
Strong background in compilers and compiler IRs: LLVM IR, MachineIR, or an equivalent production compiler stack.
Hands-on experience with the LLVM MC layer (MCInst, MCDisassembler, MCCodeEmitter, MCStreamer, TargetRegistry).
Experience designing or extending custom in-tree IRs (pass infrastructure, dataflow analyses, SSA construction, dominance, and target-specific lowering), particularly in the context of lifting low-level code into a more analyzable form.
Exposure to binary lifting / raising (llvm-mctoll, QEMU TCG lifting, RetDec, BAP, angr, or Ghidra P-code) and the practical challenges of reconstructing SSA and control flow from disassembled machine code.
Working knowledge of ELF, DWARF, and related object-file formats; comfort reading and modifying binaries at the byte level.
Familiarity with GPU ISAs (AMDGPU / GCN / RDNA / CDNA, or NVIDIA PTX/SASS): registers, encodings, branch ranges, scheduling constraints.
Experience with dataflow analyses (liveness, reaching-definitions, dominance) and basic register allocation.
Understanding of GPU execution models: waves/warps, VGPRs/SGPRs, LDS, kernel descriptors, launch bounds, occupancy.
Clang/LLVM upstream contribution experience.
Exposure to the ROCm stack (COMGR, HIP, HSA runtime, hipify) or an equivalent heterogeneous toolchain.
Background in any of: debug information (DWARF/PDB), binary instrumentation, dynamic binary translation, JIT engines, linker internals, or code-object loaders.

ACADEMIC CREDENTIALS:
Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent.

This role is not eligible for visa sponsorship.

#LI-G11 #LI-HYBRID