Senior Research Engineer (AI Inference) in City of London

Energy Jobline is the largest and fastest growing global Energy Job Board and Energy Hub. We have an audience reach of over 7 million energy professionals, 400,000+ monthly advertised global energy and engineering jobs, and work with the leading energy companies worldwide.

We focus on the Oil & Gas, Renewables, Engineering, Power, and Nuclear markets as well as emerging technologies in EV, Battery, and Fusion. We are committed to ensuring that we offer the most exciting career opportunities from around the world for our jobseekers.

Job Description

Role: Staff Software Engineer (Rust & AI Inference)

Location: London (2–3 days a week onsite)

Compensation: Up to £160,000 + equity


We’re supporting an emerging deep tech company building high-performance infrastructure to support AI model deployment at scale. They’re focused on enabling enterprises to run advanced AI systems in production, with a strong emphasis on privacy, performance, and full-stack control.


This is a hands-on senior IC role for an engineer who thrives in low-latency, compute-intensive environments, and enjoys mentoring others. You’ll play a key role in shaping the core inference layer powering complex real-time AI workflows.


If you're excited by the challenge of optimising distributed systems, designing for reliability, and scaling cutting-edge AI applications, this could be the role for you.


You'll be:

  • Building scalable, low-latency LLM inference infrastructure
  • Optimising performance with caching, quantisation, and speculative decoding
  • Contributing to core systems in Rust (we are happy for individuals to learn this on the job)
  • Designing distributed GPU orchestration and inference servers
  • Mentoring engineers and influencing technical direction across the team


You should bring:

  • 5+ years of software engineering experience, with deep backend and systems-level expertise
  • Strong coding skills in a typed systems language such as C++, Go, or Rust (Python would also be considered)
  • Familiarity with Kubernetes and cloud infrastructure
  • A strong engineering mindset with a bias for clean abstractions, reliability, and performance


Bonus points for:

  • Experience deploying open-source LLMs or VLMs in production
  • Experience with ML inference systems, PyTorch, Triton, or CUDA kernels
  • Background in document intelligence, enterprise search, or NLP pipelines
  • Prior exposure to multi-agent systems or complex orchestration workflows


This is a high-impact, high-autonomy role in a technically elite team that’s pushing the boundaries of what’s possible in AI infrastructure.

If you are interested in applying for this job please press the Apply Button and follow the application process. Energy Jobline wishes you the very best of luck in your next career move.

City of London, UK
Full time

Published on 11/08/2025
