
Google Scholar / GitHub / X / Bluesky
Blogposts
- Procedural Knowledge in Pretraining Drives LLM Reasoning
- Large language models are not zero-shot communicators
- Learning in High Dimension Always Amounts to Extrapolation
- Structured Prediction part three - Training a linear-chain CRF
- Structured Prediction part two - Implementing a linear-chain CRF
- Structured Prediction part one - Deriving a Linear-chain CRF