Perplexity AI Inference Intern London | 2–3 Spots
| Field | Details |
|---|---|
| Company | Perplexity |
| Role | UK Internship Program, AI Inference Team |
| Location | London, UK (hybrid: 3 days office / 2 days WFH) |
| Duration | 13 weeks (full-time or part-time) |
| Class size | Only 2–3 intern spots (2026 class) |
| Eligibility | Pursuing a Master's or PhD in CS/Engineering (2025–2026 academic year) |
| Visa | No visa sponsorship; student visa holders need university work approval |
Overview
Perplexity is hiring 2–3 exceptional Master's or PhD interns for its AI Inference team in London. You'll optimize serving latency and throughput for models ranging from single-node embedding models to distributed sparse Mixture-of-Experts deployments, working across the stack from GPU kernels through networking and monitoring.
Key Requirements & Critical Rules
- Degree: Pursuing a Master's or PhD in Computer Science or Engineering (2025–2026 academic year).
- Focus: Performance-related subjects — HPC, compilers, distributed systems.
- Technical depth: Strong systems fundamentals, including multi-threading, networking, compilation, and systems programming.
- ML/GPU: PyTorch/JAX; CUDA, Triton; OpenMPI / HPC experience.
- Work: Improve inference latency and throughput; add support for new models; work on quantization and stack-wide optimization.
- Schedule: 13 weeks; hybrid 3 days office / 2 days WFH in London.
- Spots: Only 2–3 interns in the 2026 class — highly selective.
- Visa: No visa sponsorship; student visa holders need their university to approve work eligibility.
- Not provided: No housing or health insurance for interns (full-time employees get benefits).
- Outcome: Outstanding performers may receive full-time offers (no fixed cap).
- Apply: Applications must be submitted through the official Perplexity website.