RESEARCH ENGINEER

United Kingdom, Remote, via direct
// Job Type
Full Time
// Salary
USD 280,000 - 370,000/year
// Posted
1 week ago
// Work Mode
remote

About the Role

Permanent

Title: Research Engineer, GPU

Location: USA Remote

Compensation: Up to $400k + Equity

We're partnered with a well-funded AI research company focused on building next-generation multimodal models for media and interactive experiences. Their work spans cutting-edge generative systems and is increasingly moving toward real-time, interactive environments, pushing beyond static outputs into dynamic, AI-driven applications.

This is a deeply technical, high-impact role focused on making large-scale AI systems faster, more efficient, and capable of running in real time. You'll work across the stack, from low-level GPU kernels to distributed training systems, directly influencing what is computationally possible for next-generation AI models.

What You'll Do

  • Optimize training throughput across large GPU clusters, improving efficiency and utilization
  • Implement techniques such as mixed precision (FP8, BF16), memory-efficient attention, and checkpointing
  • Design and scale distributed training systems (tensor parallelism, FSDP, multi-node setups)
  • Profile and optimize inference pipelines for real-time multimodal generation
  • Improve latency through CUDA Graphs, KV cache optimization, and operator fusion
  • Contribute across the stack, from kernel-level optimization to system-level architecture

Requirements

  • 4+ years of experience in systems engineering, ML infrastructure, or performance optimization
  • Strong experience with GPU programming (CUDA, Triton, or similar)
  • Experience with distributed systems and large-scale training (NCCL, model parallelism)
  • Familiarity with ML framework internals such as PyTorch or JAX
  • Experience with mixed or low-precision techniques (FP8, INT8, BF16)
  • Proven experience building and operating scalable, fault-tolerant training systems
  • Strong interest in pushing the limits of performance for cutting-edge AI systems

Nice to Have

  • Experience with compiler optimizations or model compilation (e.g., PyTorch's torch.compile)
  • Background working on large multimodal or generative models
  • Exposure to real-time inference systems

If you're interested in working on the systems that enable next-generation AI models to train faster and run in real time, this is a rare opportunity to operate at the cutting edge of research and infrastructure.

CONTACT

George Ethell

Principal Recruitment Consultant

SIMILAR JOB RESULTS

Research Engineer

USA Remote

$280,000 - $370,000

+ Data Science & AI

Permanent

USA

Title: Research Engineer, Data

Location: USA Remote

Compensation: Up to $400k + Equity

We’re partnered with a well-funded AI research company focused on building next-generation multimodal models for media and interactive experiences. Their work spans cutting-edge generative systems and is increasingly moving toward real-time, interactive environments, pushing beyond static outputs into dynamic, AI-driven applications.

This is a high-impact, research-meets-engineering role focused on the data that powers advanced AI systems. You’ll own how models learn, designing datasets, running experiments, and building data pipelines that directly shape model capabilities across a wide range of applications.

What You’ll Do

  • Design multimodal, multitask datasets to unlock new model capabilities
  • Run controlled experiments to understand how data impacts model performance
  • Build and scale pipelines for synthetic data generation, filtering, and quality control
  • Define evaluation frameworks and benchmarks to measure real-world model improvement
  • Partner with cross-functional teams to translate product goals into data strategies

Requirements

  • 4+ years of experience in machine learning, ideally with a data-centric focus
  • Experience working with large multimodal datasets and generative models
  • Strong intuition for how data quality and composition impact model behavior
  • Experience across the full ML lifecycle, from data to training to evaluation
  • Proficiency with ML frameworks such as PyTorch or JAX
  • Experience with distributed systems or compute tools (e.g., Ray, Kubernetes)
  • Strong interest in advancing next-generation AI systems

Nice to Have

  • Experience with synthetic data generation or data curation at scale
  • Background working on multimodal or video-based models
  • Exposure to evaluation and benchmarking for generative systems

If you’re interested in shaping the data that defines what AI models can learn and do, this is a unique opportunity to work at the forefront of multimodal AI.

Senior AI Software Engineer

San Francisco

$170,000 - $240,000

+ Data Science & AI

Permanent

San Francisco, California

Title: Senior AI Engineer

Location: Remote, United States

Compensation: Up to $240,000 + Equity

We’re partnered with a high-growth, mission-driven SaaS company transforming how businesses build and maintain trust, with AI at the core of their next phase of innovation. The platform is redefining how critical enterprise workflows are automated, particularly in areas where reliability, auditability, and security are essential.

This is a high-impact role where you will help define how AI is architected across the company. You won’t just be building features. You’ll make foundational decisions around systems, evaluation, and long-term technical direction, working across LLMs, retrieval systems, and agent-based workflows in production environments.

What You’ll Do

  • Design and own production AI systems end-to-end, including LLM pipelines, retrieval systems, and orchestration layers
  • Build and scale RAG systems, reranking pipelines, and vector-based search infrastructure
  • Define evaluation frameworks to measure retrieval quality, reasoning accuracy, and system performance
  • Analyze production behavior, identify failure modes, and drive improvements based on data
  • Make key architectural decisions across model infrastructure, tooling, and workflows
  • Partner closely with product, platform, and domain teams to translate complex requirements into scalable systems
  • Lead best practices for building reliable, observable, and cost-efficient AI systems

Requirements

  • 6+ years of software engineering experience, including 3+ years working on ML or AI systems
  • Proven experience owning and deploying production LLM systems
  • Strong background in RAG, embeddings, reranking, and vector databases (e.g., Pinecone, FAISS, Chroma)
  • Experience designing evaluation systems and improving models through quantitative analysis
  • Strong Python skills, with solid software engineering fundamentals
  • Experience making architectural decisions that influence team or org direction
  • Strong understanding of production systems, including reliability, observability, and cost tradeoffs
  • Ability to break down ambiguous problems and operate with a high degree of ownership
  • Clear communication skills and experience working cross-functionally

Nice to Have

  • Experience in regulated domains such as compliance or security
  • Familiarity with data platforms or analytics tooling
  • Experience with orchestration frameworks (e.g., Temporal, Airflow)
  • Exposure to LLM evaluation platforms or tooling
  • Contributions to open source, research, or technical communities

If you’re interested in shaping how AI systems are built, evaluated, and deployed in high-trust environments, this is an opportunity to have direct influence on both technical direction and real-world impact at a fast-growing company.

Staff AI Software Engineer

San Francisco, CA

$220,000 - $280,000

+ Data Science & AI

Permanent

San Francisco, California

Title: Staff AI Engineer

Location: Remote, United States

Compensation: Up to $280,000 + Equity

We’re partnered with a high-growth, mission-driven SaaS company transforming how businesses build and maintain trust, with AI at the core of their next phase of innovation. The platform is redefining how critical enterprise workflows are automated, particularly in areas where reliability, auditability, and security are essential.

This is a high-impact role where you will help define how AI is architected across the company. You won’t just be building features. You’ll make foundational decisions around systems, evaluation, and long-term technical direction, working across LLMs, retrieval systems, and agent-based workflows in production environments.

What You’ll Do

  • Design and own production AI systems end-to-end, including LLM pipelines, retrieval systems, and orchestration layers
  • Build and scale RAG systems, reranking pipelines, and vector-based search infrastructure
  • Define evaluation frameworks to measure retrieval quality, reasoning accuracy, and system performance
  • Analyze production behavior, identify failure modes, and drive improvements based on data
  • Make key architectural decisions across model infrastructure, tooling, and workflows
  • Partner closely with product, platform, and domain teams to translate complex requirements into scalable systems
  • Lead best practices for building reliable, observable, and cost-efficient AI systems

Requirements

  • 10+ years of software engineering experience, including 3+ years working on ML or AI systems
  • Proven experience owning and deploying production LLM systems
  • Strong background in RAG, embeddings, reranking, and vector databases (e.g., Pinecone, FAISS, Chroma)
  • Experience designing evaluation systems and improving models through quantitative analysis
  • Strong Python skills, with solid software engineering fundamentals
  • Experience making architectural decisions that influence team or org direction
  • Strong understanding of production systems, including reliability, observability, and cost tradeoffs
  • Ability to break down ambiguous problems and operate with a high degree of ownership
  • Clear communication skills and experience working cross-functionally

Nice to Have

  • Experience in regulated domains such as compliance or security
  • Familiarity with data platforms or analytics tooling
  • Experience with orchestration frameworks (e.g., Temporal, Airflow)
  • Exposure to LLM evaluation platforms or tooling
  • Contributions to open source, research, or technical communities

If you’re interested in shaping how AI systems are built, evaluated, and deployed in high-trust environments, this is an opportunity to have direct influence on both technical direction and real-world impact at a fast-growing company.

Recruiter: Harnham - Data & Analytics Recruitment