Senior Performance & Infrastructure Engineer - HPC

's-Hertogenbosch, Europe (wider region), NL
// Job Type
Full Time
// Salary
Not disclosed
// Posted
2 weeks ago

About the Role

<h5>The organization</h5><p>Our client operates one of the largest GPU infrastructures in the world, with 100,000+ GPUs. Their infrastructure doubles in size every year. We’re looking for engineers who love getting deep into Linux systems, pushing hardware and software to their limits, and making the world’s fastest AI and HPC workloads run even faster.</p><h5>The role</h5><p>You’ll join a small, senior team that works between the hardware and Linux OS layers, solving performance problems that affect tens of thousands of GPUs. This is hands-on, high-impact engineering where microsecond gains matter and every optimization is felt at global scale.</p><h5>What you’ll do</h5><ul><li><p><strong>Trace, profile, tune and optimize</strong> the Linux kernel &amp; subsystems (CPU scheduling, memory management, networking stack) for GPU clusters and InfiniBand fabrics</p></li><li><p><strong>Troubleshoot</strong> and resolve complex performance bottlenecks</p></li><li><p><strong>Integrate and validate</strong> new GPU hardware &amp; infrastructure (KVM/QEMU, PCIe devices, Kubernetes)</p></li><li><p>Improve <strong>monitoring, alerting, and automation</strong> for large-scale, distributed systems</p></li><li><p>Occasionally assist customers in optimizing workloads</p></li></ul><h5>Your profile</h5><p>Key requirements (non-negotiable):</p><ul><li><p><strong>Solid Linux internals knowledge</strong>, with kernel <strong>tracing, profiling and tuning</strong> experience (e.g. <strong>perf, ftrace</strong>, eBPF, sysctl, kgdb, etc.)</p></li><li><p>Excellent programming skills in <strong>C or C++ system-level code</strong>, with a good grasp of data structures &amp; algorithms</p></li><li><p>Experience in <strong>performance optimization</strong> (e.g. 
high-load/high-throughput, low-latency, low-jitter, memory bypasses, zero-copy, lock-free, synchronization across large-scale clusters, etc.)</p></li><li><p>Scripting or development skills in Go, Python, or similar</p></li></ul><p>Nice-to-haves (not key):</p><ul><li><p>Large-scale clusters (GPU or CPU)</p></li><li><p>Virtualization stacks (KVM/QEMU), Slurm, Kubernetes</p></li><li><p>Deep learning frameworks (e.g. PyTorch, TensorFlow, ...)</p></li><li><p>GPU-specific stack (e.g. CUDA, NCCL, ...)</p></li></ul><h5>This is for you if you</h5><p>Love solving deep technical challenges, care about performance down to the microsecond, and want to work on infrastructure that pushes the limits of what’s possible.</p><h5>What's offered</h5><ul><li><p>Salary: up to 160k + 25% bonus.</p></li><li><p>Flexible working arrangements.</p></li><li><p>A dynamic and collaborative work environment that values initiative and innovation.</p></li><li><p>Location: Amsterdam or fully remote from anywhere within the EU/EEA</p></li></ul>

This position is at The Next Chapter W&S.