Is an LLM fine-tuned on transcribed behavioral tasks better at predicting behavior from transcribed intention than its base model?

LLM

I compare the Centaur model (Binz et al., 2025) against its base version, Llama 3.1 (70B), at predicting decisions in data that are truly out-of-sample for Centaur.

Author

Sabou Rani Stocker

Published

September 28, 2025

to follow :)