ML Research Talent Map — 50K Profiles, Sub-100ms Search

Built a 50,000-profile ML research talent map from 5 years of top-tier conference proceedings with semantic search via vector embeddings.

Technologies: Python, Vector Embeddings, Elasticsearch, ICLR/ICML/CVPR Data

Project Tags

Data Engineering, NLP, Vector Search, Python, Elasticsearch
Before

50,000 Researchers, Zero Map: The Invisible Talent Pool

Five years of ICLR, ICML, and CVPR proceedings. Tens of thousands of researchers publishing cutting-edge work in machine learning. And yet no systematic way to find who among them was interested in AI safety, available for new roles, or a match for a specific lab’s needs.

Sourcing meant manual keyword searches, trawling Google Scholar pages, and maintaining sprawling spreadsheets that went stale within weeks. Labs were hiring from networks, not from the full landscape of available talent.

After

Semantic Search Across 50K Profiles in Under 100 Milliseconds

Built a 50,000-profile ML research talent map from five years of top-tier conference proceedings. Each profile is enriched with publication history, co-author networks, and research vectors.
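The enrichment step can be illustrated with a minimal sketch. Everything named here is hypothetical: `Profile`, `build_profiles`, and `toy_embed` are illustrative names, the hashed bag-of-words `toy_embed` stands in for whatever embedding model the project actually used, and treating a researcher's vector as the normalised mean of their paper vectors is an assumption, not a documented detail of the pipeline.

```python
import zlib
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional

import numpy as np


@dataclass
class Profile:
    """Hypothetical per-researcher record built from proceedings data."""
    name: str
    papers: list = field(default_factory=list)
    coauthors: set = field(default_factory=set)
    vector: Optional[np.ndarray] = None


def toy_embed(text, dim=64):
    """Stand-in embedding: hashed bag-of-words, L2-normalised."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[zlib.crc32(token.encode()) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec


def build_profiles(papers, embed=toy_embed):
    """Group paper records by author and attach co-authors and a research vector."""
    profiles = {}
    vectors_by_author = defaultdict(list)
    for paper in papers:
        vec = embed(paper["title"])
        for author in paper["authors"]:
            profile = profiles.setdefault(author, Profile(name=author))
            profile.papers.append(paper["title"])
            profile.coauthors.update(a for a in paper["authors"] if a != author)
            vectors_by_author[author].append(vec)
    for author, vecs in vectors_by_author.items():
        mean = np.mean(vecs, axis=0)
        norm = np.linalg.norm(mean)
        # Assumed aggregation: normalised mean of the author's paper vectors.
        profiles[author].vector = mean / norm if norm else mean
    return profiles


papers = [
    {"title": "Reward hacking in reinforcement learning", "authors": ["A. Author", "B. Author"]},
    {"title": "Scaling vision transformers", "authors": ["A. Author", "C. Author"]},
]
profiles = build_profiles(papers)
print(sorted(profiles["A. Author"].coauthors))
```

In a real pipeline the paper records would come from parsed proceedings metadata, and author names would need disambiguation before grouping, which this sketch skips.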

A two-stage candidate matching system using vector embeddings delivers sub-100ms semantic search. Recruiters can now query by research area, methodology, or even the conceptual neighbourhood of a specific paper, and get ranked results in the time it takes to blink.
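The page does not say what the two stages are, so the following is only one plausible reading, sketched in NumPy rather than Elasticsearch: a cheap first pass that shortlists profiles by cosine similarity against one vector per researcher, then a second pass that re-ranks the shortlist by the best-matching individual paper. The function name `two_stage_search`, its parameters, and the random demo data are all hypothetical.

```python
import numpy as np


def two_stage_search(query, profile_vecs, paper_vecs, shortlist=100, top_k=10):
    """Return (profile_index, score) pairs, best first.

    profile_vecs: (N, D) array, one unit-norm row per researcher.
    paper_vecs:   list of N arrays, each (P_i, D) with unit-norm rows.
    """
    query = query / np.linalg.norm(query)

    # Stage 1 (assumed): coarse recall using one vector per profile.
    coarse = profile_vecs @ query
    shortlist = min(shortlist, len(coarse))
    candidates = np.argpartition(-coarse, shortlist - 1)[:shortlist]

    # Stage 2 (assumed): re-rank the shortlist by each candidate's closest paper.
    fine = np.array([np.max(paper_vecs[i] @ query) for i in candidates])
    order = np.argsort(-fine)[:top_k]
    return [(int(candidates[j]), float(fine[j])) for j in order]


def unit(x):
    """L2-normalise along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


# Random demo data standing in for real embeddings.
rng = np.random.default_rng(0)
paper_vecs = [unit(rng.normal(size=(int(rng.integers(1, 6)), 32))) for _ in range(1000)]
profile_vecs = unit(np.stack([p.mean(axis=0) for p in paper_vecs]))

# Query by "the conceptual neighbourhood of a specific paper".
results = two_stage_search(paper_vecs[42][0], profile_vecs, paper_vecs, top_k=5)
for index, score in results:
    print(index, round(score, 3))
```

The split keeps the expensive per-paper comparison confined to a small shortlist. A production system would more likely delegate the first stage to an approximate nearest-neighbour index, for example in Elasticsearch, which the project lists among its technologies.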