§5 — Writing

Notes & Essays

On AI evaluation, causal inference, and doing rigorous data science in settings where it actually matters.

Coming soon

What good LLM evaluation actually looks like

Lessons from building evaluation pipelines in a regulated healthcare setting: what vibes-based testing misses and why it matters.

AI Evaluation · LLM

Coming soon

A bridge for researchers moving into data science: what transfers, what doesn't, and where the real differences lie.

Causal Inference · Methods

Coming soon

Why standard accuracy metrics are the wrong frame for high-stakes classification, and how to think about it instead.

ML · Safeguarding · Evaluation