Senior Compiler Engineer, GPU Code Object Rewriting & Tooling

San Jose

Thursday, 07 May 2026

We are building first-class compilation and code-object tooling for HIP, OpenCL, OpenMP, and the broader ROCm stack. Our compilers, loaders, and post-link tools underpin every HPC application and AI framework that runs on AMD GPUs. We are investing heavily in on-the-fly ISA rewriting and hot-patching infrastructure, inside the Code Object Manager (COMGR) and the AMDGPU backend, that lets us ship hardware fixes, errata workarounds, instrumentation, and performance experiments without recompiling user code. We are looking for a versatile Senior Compiler Engineer who can move fluidly between LLVM MC-level rewriting, ELF/DWARF manipulation, AMDGPU codegen, and the tooling that ties it all together.

This is a multi-year investment area: the rewriting infrastructure starts as an errata-mitigation platform and grows into a long-term foundation for post-link transformation, binary instrumentation, and experimentation across multiple generations of AMD GPU silicon. You will own this codebase as it matures.

THE PERSON:

If you are a Compiler Engineer who is equally comfortable reading an LLVM IR pass, disassembling a .text section by hand, reasoning about VGPR liveness across a CFG, and debugging an ELF loader, we would love to talk to you. You enjoy owning problems end-to-end, from hardware erratum to shipped mitigation, and you thrive when the work crosses the traditional compiler/runtime/loader boundaries. You are energized by deep, long-lived systems: the kind of engineer who wants to master a domain across multiple architecture generations, not rotate between short-lived projects.

KEY RESPONSIBILITIES:

Design, implement, and maintain the HotSwap ISA rewriting subsystem in COMGR (amd/comgr/src/comgr-hotswap- - ), including ELF patching, DWARF debug-line adjustment, trampoline growth, NOP-sled management, and branch encoding.
Build and extend LLVM MC-based disassembly, assembly, and re-encoding pipelines used by post-link transformation tools.
Prototype and evaluate raising-based rewriting pipelines: lifting disassembled AMDGPU machine code into a structured intermediate representation (LLVM MachineIR or a domain-specific in-tree IR) for analysis and transformation, then lowering back to valid code objects.
Author ISA-specific rewrite policies (e.g., GFX1250 B0-to-A0 style errata mitigations) and generalize them into reusable, ISA-parametric infrastructure.
Implement and harden CFG construction, backward liveness analysis, and scratch VGPR allocation on raw AMDGPU machine code.
Adjust ELF section/program headers, AMDGPU notes, kernel descriptors, and code-object metadata safely on malformed or adversarial inputs.
Contribute to the AMDGPU LLVM backend, Clang driver, and LLD where rewriting needs first-class compiler support.
Participate in new architecture and silicon bring-ups, owning the compiler/tooling path from bring-up workarounds to long-term codegen quality.
Analyze, reproduce, and fix issues across the compiler, loader, and runtime boundary; build unit tests, fuzzers, and regressions for each fix.
Collaborate with ROCm runtime, HSA, and hardware architecture teams spread across geographic locations.
Represent AMD in open-source communities (e.g., LLVM) and relevant standards bodies (e.g., DWARF Committee) through upstream patches, RFCs, and design reviews.

PREFERRED EXPERIENCE:

Strong C/C++ programming skills, with a demonstrated ability to write careful, bounds-checked code against untrusted binary input.
Strong background in compilers and compiler IRs: LLVM IR, MachineIR, or an equivalent production compiler stack.
Hands-on experience with the LLVM MC layer (MCInst, MCDisassembler, MCCodeEmitter, MCStreamer, TargetRegistry).
Experience designing or extending custom in-tree IRs (pass infrastructure, dataflow analyses, SSA construction, dominance, and target-specific lowering), particularly in the context of lifting low-level code into a more analyzable form.
Exposure to binary lifting/raising (llvm-mctoll, QEMU TCG lifting, RetDec, BAP, angr, or Ghidra P-code) and the practical challenges of reconstructing SSA and control flow from disassembled machine code.
Working knowledge of ELF, DWARF, and related object-file formats; comfort reading and modifying binaries at the byte level.
Familiarity with GPU ISAs (AMDGPU / GCN / RDNA / CDNA, or NVIDIA PTX/SASS): registers, encodings, branch ranges, scheduling constraints.
Experience with dataflow analyses (liveness, reaching-definitions, dominance) and basic register allocation.
Understanding of GPU execution models: waves/warps, VGPRs/SGPRs, LDS, kernel descriptors, launch bounds, occupancy.
Clang/LLVM upstream contribution experience.
Exposure to the ROCm stack (COMGR, HIP, HSA runtime, hipify) or an equivalent heterogeneous toolchain.
Background in any of: debug information (DWARF/PDB), binary instrumentation, dynamic binary translation, JIT engines, linker internals, or code-object loaders.

ACADEMIC CREDENTIALS:

Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent.

This role is not eligible for visa sponsorship.

#LI-G11 #LI-HYBRID

