AI Engineer

What it's about

We're building the infrastructure that makes AI in the legal industry actually reliable, not as a demo but in production, with measurable legal quality. We're looking for people who want to dig deep into the mathematics and architecture of modern LLM systems, not just chain APIs.

If you have spent the past twelve months wondering whether there is a job that actually demands the full depth of modern AI, all at once: this is it.

What you'll build

A multi-provider LLM council architecture where multiple models work in parallel and vote on contested legal questions, with deterministic replay for reproducible evaluations.
An evaluation framework from the ground up: stratified gold-standard datasets, inter-rater reliability against human lawyers, backtesting pipelines, confidence calibration.
A mathematically defensible answer to the core research question: how do you quantify "legal reliability"?
Original research output. Conference papers and open-source releases are an explicit project goal.
Production-grade orchestration infrastructure coordinating tens of thousands of inference calls per day across multiple model families, with full observability and fault tolerance.

Why this is exciting

Research depth like an academic group, with production leverage like a top startup. Most AI roles are API wrappers. Here you build the layer underneath.
Law is one of the largest still-unautomated industries. Whoever makes AI work in a traditional, highly regulated sector faces a meaningfully harder problem than building web demos, and gains a correspondingly larger lever.
Clear deadlines, clear deliverables. A 24-month plan with a hard milestone at 12 months and concrete research output.
In 24 months you will have learned more about production-grade LLM systems than you would in five years at a large company. That's the promise.

What you should bring

Required

Master's or PhD in computer science, mathematics, physics, statistics, or a related quantitative field.
Several years of experience building production-grade software. You are not a pure researcher, but not a pure machine learning engineer (MLE) either.
Deep understanding of probability, statistics, and calibration. Mathematical proofs don't scare you.
Deep experience in a technically demanding field where math, scale, and reliability all matter simultaneously.
Comfortable with modern LLMs and agent frameworks.
High pace and team orientation. You want to work with other ambitious people in a lean team toward a big goal, not at corporate cadence.

Strongly preferred

Scientific publications.
Open-source contributions to AI frameworks.
Experience in a regulated domain (e.g. legal, medical, financial).
Familiarity with modern workflow systems and uncertainty quantification.

(The full technical requirements profile is available on request as a separate document.)

What we offer

A problem you won't find at this depth anywhere else. AI in law is greenfield.
Competitive salary + equity. You build with us, you participate in the upside.
Berlin/Potsdam.
A dedicated compute budget for your research workloads.
Direct input on architecture and research direction. You sit at the table where decisions get made, from day one.
Plus one month per year of working from anywhere.

Application

Send us your CV, links to relevant code or publications, and one concrete example of a technical problem you found interesting in the past 12 months, and why.

We respond to every serious application within two weeks.
