Automated two-stage screening pipeline: Claude evaluates candidates against configurable binary signals, then deterministic tier logic shortlists the top performers.
Hiring for AI safety roles requires evaluating candidates across dozens of nuanced dimensions—alignment research intuition, policy fluency, technical depth, publication quality. Each signal demands careful reading of CVs, LinkedIn profiles, GitHub repos, and personal websites. A single recruiter screening 200 candidates against 15 signals is looking at 3,000 individual judgement calls.
The process was slow, inconsistent, and expensive. Different reviewers weighted the same evidence differently. Re-screening when role requirements shifted meant starting from scratch. And the best candidates—the ones who clear every bar—were buried in the same queue as everyone else.
Stage one uses Claude to evaluate each candidate against configurable binary signals—pass, fail, or unknown—with evidence citations drawn from assembled context: Airtable profiles, CV PDFs, LinkedIn data, GitHub, and personal websites. Signal definitions live in Airtable as the source of truth. Evaluations can be scoped by role, run in batch mode via Anthropic’s Batches API for 50% cost savings, or executed synchronously for quick iterations.
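As a rough sketch of how stage one might handle a verdict, the snippet below parses a model reply into a pass/fail/unknown signal result with evidence citations. The names (`Verdict`, `parse_verdict`) and the JSON schema are illustrative assumptions, not the project's actual code; the real pipeline sends the assembled context and signal definition to Claude and parses its reply similarly.

```python
import json
from dataclasses import dataclass

VALID_VERDICTS = {"pass", "fail", "unknown"}

@dataclass
class Verdict:
    signal: str
    verdict: str          # "pass" | "fail" | "unknown"
    evidence: list[str]   # citations drawn from the assembled context

def parse_verdict(signal: str, raw: str) -> Verdict:
    """Parse the model's JSON reply; fall back to 'unknown' on malformed output."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return Verdict(signal, "unknown", [])
    verdict = data.get("verdict", "unknown")
    if verdict not in VALID_VERDICTS:
        verdict = "unknown"
    return Verdict(signal, verdict, data.get("evidence", []))

# Hypothetical model reply for one signal:
reply = '{"verdict": "pass", "evidence": ["first-author alignment paper, 2023"]}'
v = parse_verdict("publication_quality", reply)
```

Falling back to `unknown` rather than raising keeps a malformed reply from poisoning a batch run; the verdict can be re-evaluated later.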
Stage two is pure threshold math. Role configs define three tiers—hard requirements (all must pass), core competencies (configurable threshold), and differentiators (configurable threshold)—and the shortlister applies them deterministically against existing verdicts. No API calls, fully auditable, free to re-run whenever requirements change. The two stages are decoupled: evaluate incrementally, shortlist on demand.
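The tier logic above can be sketched in a few lines. The names (`RoleConfig`, `shortlisted`) and field layout are assumptions for illustration; the point is that the decision is a pure function of the existing verdicts and the role config, with no API calls.

```python
from dataclasses import dataclass

@dataclass
class RoleConfig:
    hard_requirements: list[str]   # all must pass
    core_competencies: list[str]
    core_threshold: int            # minimum passes among core competencies
    differentiators: list[str]
    diff_threshold: int            # minimum passes among differentiators

def shortlisted(verdicts: dict[str, str], cfg: RoleConfig) -> bool:
    """Deterministic threshold math over pass/fail/unknown verdicts."""
    def passes(signals: list[str]) -> int:
        return sum(verdicts.get(s) == "pass" for s in signals)

    if passes(cfg.hard_requirements) < len(cfg.hard_requirements):
        return False
    return (passes(cfg.core_competencies) >= cfg.core_threshold
            and passes(cfg.differentiators) >= cfg.diff_threshold)

# Hypothetical role config and verdicts:
cfg = RoleConfig(
    hard_requirements=["technical_depth"],
    core_competencies=["policy_fluency", "publication_quality"],
    core_threshold=1,
    differentiators=["oss_contributions"],
    diff_threshold=0,
)
verdicts = {"technical_depth": "pass", "publication_quality": "pass",
            "policy_fluency": "unknown"}
```

Because `unknown` simply fails to count toward any threshold, re-running the shortlister after more signals are evaluated, or after thresholds change, is free and fully auditable.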