This blog is a field journal, not a polished publication. Expect rough edges, failure reports, wacky ideas and the occasional thing that actually worked.

What I write about:

  • Eval experiments — testing model claims against real tasks
  • Tooling and workflows — building with LLMs, not just talking to them
  • Lab notes — shorter observations that don’t warrant a full post
  • Math - occasional dabbling in mathematics, written up here

If something here is useful to you, I’d love to hear about it.