An LLM-native psychometric instrument does not predict LLM behavior: Evidence across 25 models
A psychometric instrument designed for LLMs yields coherent, reliable self-reports that nonetheless fail to predict model behavior across 25 models.
AI evaluation and data science leader, hands-on with model behavior, safety, and measurement.
Co-Founder & CEO of Aymara · Ph.D., Harvard · ex-Uber Applied Science
I build and lead teams that measure how AI systems behave, and I design the methodology, benchmarks, and infrastructure that determine when a system is ready to deploy in high-stakes settings.
I'm Co-Founder and CEO of Aymara, a bootstrapped AI safety evaluation company serving Fortune 50 enterprises and leading AI labs. Before this, I led applied science for Uber's legal and regulatory function, where my team's analyses were cited in court decisions and shaped regulatory outcomes across jurisdictions.
I hold a PhD in cognitive neuroscience from Harvard and an AB in social psychology from Princeton.
Born in New York City and raised in Bolivia and the U.S., I run marathons, write fiction, and am starting a band with AI musicians.
Co-Founder & CEO, 2023–Present
Data Science Manager → Applied Science Manager I → Applied Science Manager II, 2019–2023
Head of Data Science, 2018–2019
Principal Data Scientist, 2016–2018
Associate Fellow, 2015–2016
Senior Data Scientist, 2014–2015
Selected publications on AI safety evaluation and model behavior.
A psychometric instrument designed for LLMs yields coherent, reliable self-reports that nonetheless fail to predict model behavior across 25 models.
Automated evaluation of gender bias in image generation across 13 leading multimodal models.
Framework and tooling for evaluating risk and responsibility of large language models across 10 real-world safety dimensions.
Essays and creative work on AI, cognition, and culture.
The Flynn Effect, the rise in IQ scores across generations, suggests that cognitive tools make us smarter, not dumber. AI may be the next such tool.
Podcast interviews and conference talks on AI safety, research, and technology.
Email me at jm.contreras.phd [at] gmail [dot] com or send me a message below.