Do Language Models Decide Like Us? Evidence from Biases in Decision-Making under Risk and Uncertainty
Sun 29.03 12:30 - 13:00
- Graduate Student Seminar
- Bloomfield 526
Abstract: Large language models (LLMs) are commonly assumed to generate economically rational and unbiased decisions due to their probabilistic architecture. This study examined whether LLMs display systematic decision biases similar to those observed in human judgment. We focused on ChatGPT-4o, ChatGPT-3.5, and RoBERTa, evaluating their behaviour across classic decision-making paradigms under risk and uncertainty. The models were tested on decision-from-description problems derived from prospect theory, including lotteries designed to isolate probability weighting and outcome valuation, framing-effect tasks, and insurance dilemmas. In addition, decision-from-experience tasks were administered, including the Iowa Gambling Task and repeated-choice versions of risk and loss-aversion problems. Results revealed consistent humanlike distortions in probability weighting: ChatGPT-4o overweighted small probabilities and underweighted moderate probabilities in description-based choices, while showing the reverse pattern (underweighting rare events) in experience-based decisions. Similar patterns emerged across models. In contrast, outcome-related biases diverged from human behaviour: the LLMs did not reliably exhibit diminishing subjective returns and demonstrated only minimal loss aversion, indicating near-linear valuation of outcomes. A key explanation for this dissociation may lie in the models' training environment. Human discourse frequently emphasizes rare events (e.g., lotteries, catastrophic risks, insurance), potentially reinforcing probability-related heuristics. By contrast, proportional valuation of monetary amounts is deeply embedded in language and cultural representations, encouraging linear treatment of outcome magnitudes. Consequently, LLMs appear to reproduce culturally salient probability biases while exhibiting more normatively consistent valuation of outcomes.
These findings highlight both parallels and divergences between human and AI decision processes, with implications for human-AI interaction in contexts involving risk and uncertainty.
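The probability-weighting pattern the abstract reports for description-based choices (overweighting small probabilities, underweighting moderate ones) matches the inverse-S weighting function of Tversky and Kahneman (1992). A minimal Python sketch, using their published parameter estimate gamma = 0.61 (an illustrative choice, not a value from this study), shows the shape:

```python
# Tversky and Kahneman (1992) inverse-S probability weighting function.
# With gamma < 1, small probabilities receive decision weights above their
# objective values, while moderate and large probabilities receive weights
# below them; gamma = 1 recovers linear (unbiased) weighting.

def weight(p: float, gamma: float = 0.61) -> float:
    """Subjective decision weight for objective probability p."""
    return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)

if __name__ == "__main__":
    for p in (0.01, 0.10, 0.50, 0.90):
        print(f"p = {p:.2f}  ->  w(p) = {weight(p):.3f}")
```

With gamma = 0.61, w(0.01) comes out near 0.055 (overweighted) and w(0.50) near 0.42 (underweighted), which is the description-based pattern attributed to ChatGPT-4o above.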

