About
I build evaluation infrastructure for keeping advanced AI aligned: from chain-of-thought faithfulness to multi-agent governance, and from research prototypes to production systems.
Current Projects (2026)
- CoT Faithfulness Evaluator: tests whether a model's chain-of-thought reasoning actually drives its outputs or is post-hoc rationalization (see the first sketch after this list)
- Inter-Query Attack Detector: detects harmful intent spread across multiple innocuous-looking queries
- LLM Judge Bias Profiler: systematically probes and de-biases LLM-as-judge systems (position-bias sketch after this list)
- Deceptive Alignment Detection Suite: measures whether models behave differently in deployment than under evaluation
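
A minimal sketch of the kind of test the faithfulness evaluator runs, assuming a truncation-style probe in the spirit of published CoT-faithfulness work; `query_model`, the prompt format, and the truncation fractions are illustrative placeholders, not the project's real interface:

```python
"""Sketch of a truncation-style CoT-faithfulness probe (assumptions, not the real API).

If the final answer rarely changes when most of the reasoning is cut away,
the stated chain of thought is likely post-hoc rather than load-bearing.
"""

def query_model(prompt: str) -> str:
    """Placeholder for a real LLM call; swap in your own API client."""
    raise NotImplementedError

def answer_given_cot(question: str, cot: str) -> str:
    """Ask for a final answer conditioned on a (possibly edited) chain of thought."""
    return query_model(f"{question}\n\nReasoning:\n{cot}\n\nFinal answer:")

def truncate_cot(cot: str, keep_frac: float) -> str:
    """Keep only the first `keep_frac` of the reasoning lines."""
    steps = cot.splitlines()
    return "\n".join(steps[: max(1, int(len(steps) * keep_frac))])

def faithfulness_score(question: str, cot: str, fracs=(0.25, 0.5, 0.75)) -> float:
    """Fraction of truncations that change the answer; higher = more faithful."""
    baseline = answer_given_cot(question, cot)
    flips = sum(answer_given_cot(question, truncate_cot(cot, f)) != baseline
                for f in fracs)
    return flips / len(fracs)
```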
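And the simplest probe the judge profiler would include, an order-swap test for position bias; the `judge` call and its `"first"`/`"second"` return convention are assumptions for illustration:

```python
"""Sketch of a position-bias probe for an LLM judge (interface assumed).

A content-driven judge picks the same response regardless of presentation
order; verdicts that track position rather than content indicate bias.
"""

def judge(prompt: str, first: str, second: str) -> str:
    """Placeholder for a real pairwise LLM-judge call; returns 'first' or 'second'."""
    raise NotImplementedError

def position_bias_rate(pairs: list[tuple[str, str, str]]) -> float:
    """Fraction of (prompt, a, b) comparisons where the verdict flips with order."""
    flips = 0
    for prompt, a, b in pairs:
        forward = judge(prompt, a, b)    # a shown first
        backward = judge(prompt, b, a)   # b shown first
        # A consistent judge that prefers `a` says "first" in the forward
        # order and "second" in the backward order; anything else is an
        # order effect.
        prefers_a_forward = forward == "first"
        prefers_a_backward = backward == "second"
        flips += prefers_a_forward != prefers_a_backward
    return flips / len(pairs)
```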
Research Focus
I build practical evaluation tools that bridge the gap between evaluation research and real-world deployment. My work addresses:
- Behavioral anomaly detection in production AI systems (drift sketch after this list)
- Capability-gap measurement between benchmark scores and real-world behavior
- Unlearning verification and red-teaming
- Multi-agent alignment in collaborative AI systems
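
As one concrete example of the anomaly-detection work, a toy sketch that flags sudden drift in a logged behavioral metric; the refusal-rate stream, window size, and z-score threshold are all illustrative assumptions:

```python
"""Toy sketch: flagging behavioral drift in a production metric stream.

Assumes you already log a per-window behavioral metric (here, refusal rate);
a rolling z-score flags windows far outside the recent baseline. Window size
and threshold are illustrative, not tuned recommendations.
"""

from collections import deque
from statistics import mean, stdev

def detect_anomalies(metric_stream, window: int = 20, z_threshold: float = 3.0):
    """Yield (index, value, z) for points far outside the rolling baseline."""
    history = deque(maxlen=window)
    for i, value in enumerate(metric_stream):
        if len(history) >= 2:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(value - mu) / sigma > z_threshold:
                yield i, value, (value - mu) / sigma
        history.append(value)

# Example: a stable refusal-rate stream with a sudden jump at the end.
rates = [0.05, 0.06, 0.04, 0.05, 0.07, 0.05, 0.06, 0.05, 0.04, 0.06, 0.30]
for i, value, z in detect_anomalies(rates, window=10):
    print(f"window {i}: refusal rate {value:.2f} (z = {z:+.1f})")
```

In practice this would run per metric and per deployment slice, with the threshold tuned to the false-positive budget of whoever receives the alerts.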