Bean Labs Research Log

Latest articles

Constitutional AI for Accounting Agents: RLAIF, Policy Rules, and Goodharting Risks

Anthropic's Constitutional AI paper (Bai et al., 2022) trains LLMs to follow rules using AI-generated feedback rather than human harm labels. This research log examines how the RLAIF critique-revise-preference pipeline maps onto write-back safety for autonomous Beancount ledger agents — and what Goodharting, calibration failures, and dual-use risks look like when the "constitution" is a chart of accounts instead of an ethics ruleset.

ReAct: Synergizing Reasoning and Acting in Language Models

ReAct (Yao et al., ICLR 2023) interleaves chain-of-thought reasoning with tool actions in a single trajectory. It outperforms pure CoT on fact verification and beats imitation-learning baselines on embodied tasks by as much as 34 percentage points. This analysis covers the paper's failure modes (search-induced distraction and compounding errors) and what they mean for autonomous agents writing back to Beancount ledgers.

Toolformer: Self-Supervised Tool Use and Its Limits for Finance AI

A close reading of Toolformer (Meta AI, NeurIPS 2023): how perplexity-filtered self-supervised training teaches a 6.7B-parameter model to call external APIs, where it outperforms GPT-3 175B on arithmetic benchmarks, and why its single-step architecture cannot support the chained tool calls required for structured ledger operations.