AI Inference Engineer

Perplexity AI
London
Perplexity is an AI-powered answer engine founded in December 2022 and growing rapidly as one of the world's leading AI platforms. Perplexity has raised over $1B in venture investment from some of the world's most visionary and successful leaders, including Elad Gil, Daniel Gross, Jeff Bezos, Accel, IVP, NEA, NVIDIA, Samsung, and many more. Our objective is to build accurate, trustworthy AI that powers decision-making for people and assistive AI wherever decisions are being made. Throughout human history, change and innovation have always been driven by curious people. Today, curious people use Perplexity to answer more than 780 million queries every month, a number that's growing rapidly for one simple reason: everyone can be curious.

We are looking for an AI Inference Engineer to join our growing team. Our current stack is Python, Rust, C++, PyTorch, Triton, CUDA, and Kubernetes. You will have the opportunity to work on large-scale deployment of machine learning models for real-time inference.

Responsibilities
  • Develop APIs for AI inference that will be used by both internal and external customers
  • Benchmark and address bottlenecks throughout our inference stack
  • Improve the reliability and observability of our systems and respond to system outages
  • Explore novel research and implement LLM inference optimizations

Qualifications
  • Experience with ML systems and deep learning frameworks (e.g. PyTorch, TensorFlow, ONNX)
  • Familiarity with common LLM architectures and inference optimization techniques (e.g. continuous batching, quantization)
  • Experience with deploying reliable, distributed, real-time model serving at scale
  • (Optional) Understanding of GPU architectures or experience with GPU kernel programming using CUDA
At Perplexity, we've experienced tremendous growth and adoption since publicly launching the world's first fully functional conversational answer engine just over a year ago. Our AI-powered search assistant has amassed 10 million monthly active users as of early 2024, with our mobile apps installed over 1 million times across iOS and Android devices. In 2023 alone, we served over 500 million queries from users around the globe.

To support our rapid expansion, we've raised significant funding from some of the most respected investors in technology. In January 2024, we raised $73.6 million in a Series B round led by IVP, with participation from NVIDIA, Jeff Bezos' investment fund, NEA, Databricks, and other prominent firms. We followed that up with a $62.7 million Series B1 round in April 2024 led by Daniel Gross, valuing Perplexity at over $1 billion.
Our prominent investor base includes IVP, NEA, Jeff Bezos, NVIDIA, Databricks, Bessemer Venture Partners, Elad Gil, Nat Friedman, Naval Ravikant, Tobi Lutke, and many other visionary individuals.

Final offer amounts are determined by multiple factors, including experience and expertise, and may vary from the amounts listed above.

Equity: In addition to the base salary, equity may be part of the total compensation package.

Benefits: Comprehensive health, dental, and vision insurance for you and your dependents. Includes a 401(k) plan.
Posted 2025-10-15
