Using AI/LLMs with Beancount - Anyone Experimenting?

newbie_accountant · October 14, 2025, 8:51am

I’ve been hearing about people using Claude and ChatGPT to help with their Beancount workflows. Has anyone tried this?

Potential use cases I’m thinking about:

Help writing complex BQL queries
Categorizing unusual transactions
Learning Beancount syntax faster
Converting receipts to transaction entries
Explaining error messages

But I’m also worried about:

Privacy (sending financial data to AI)
Accuracy for tax-critical stuff
Hallucinated transactions

What’s been your experience?

helpful_veteran · October 14, 2025, 8:51am

I’ve been cautiously experimenting! Here’s what I’ve learned:

What works GREAT:

Query writing - Claude is amazing at generating BQL queries
Learning syntax - better than reading docs sometimes
Explaining errors - it understands Beancount errors well
Modeling edge cases - “how do I record X?” type questions

What I DON’T trust yet:

Auto-generating actual transactions (always verify!)
Tax calculations (too critical for mistakes)
Bulk categorization without review

My workflow:

Ask AI to draft something
Review carefully
Run bean-check immediately
Check in Fava for sanity
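
The bean-check step in that workflow can be scripted, so a draft never reaches the main ledger unchecked. This is a minimal sketch, assuming `bean-check` is on the PATH and exits non-zero when it reports errors. The `check_ledger` helper and its `runner` parameter are hypothetical names used only for illustration.

```python
import subprocess


def check_ledger(path, runner=subprocess.run):
    """Run bean-check on a ledger file and return (ok, output)."""
    result = runner(["bean-check", path], capture_output=True, text=True)
    # Assumption: bean-check exits non-zero when the ledger has errors.
    output = (result.stdout or "") + (result.stderr or "")
    return result.returncode == 0, output.strip()
```

Calling `check_ledger("drafts.beancount")` on a scratch file that includes the AI draft gives a quick pass/fail before opening Fava. The file name here is only an example.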

Privacy approach:
I sanitize data before sending it to an AI: I replace real amounts with fake ones, use generic merchant names, and so on.

It’s a tool, not autopilot!

practical_adviser · October 14, 2025, 8:51am

I use AI daily now - huge time saver!

Best use case: Receipt → Transaction

I take a photo, OCR it, then prompt:

AI generates:

2024-01-15 * "Whole Foods" "Groceries"
  Expenses:Groceries  67.43 USD
  Liabilities:CreditCard:Chase

90% accuracy! I just review and commit.
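
One way to make the review step easier is to have the model return only the fields (date, payee, account, amount) and render the entry with ordinary code, so formatting mistakes cannot come from the model. This is a sketch under that assumption, not the workflow described above. `render_transaction` and its parameters are hypothetical, and the default `!` flag is a choice made here to mark drafts as unreviewed.

```python
from datetime import date
from decimal import Decimal


def render_transaction(txn_date, payee, narration, expense_account, amount,
                       currency="USD",
                       funding_account="Liabilities:CreditCard:Chase",
                       flag="!"):
    """Format a two-posting Beancount transaction as text."""
    # Assumption: swapping double quotes for single quotes is acceptable,
    # so payee and narration cannot break the string literals.
    payee = payee.replace('"', "'")
    narration = narration.replace('"', "'")
    lines = [
        f'{txn_date.isoformat()} {flag} "{payee}" "{narration}"',
        f"  {expense_account}  {Decimal(amount):.2f} {currency}",
        # Amount left off the second posting so Beancount infers it.
        f"  {funding_account}",
    ]
    return "\n".join(lines)


draft = render_transaction(date(2024, 1, 15), "Whole Foods", "Groceries",
                           "Expenses:Groceries", "67.43")
```

After review, changing the flag from `!` to `*` marks the entry as confirmed.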

Privacy: I use local Ollama models for sensitive data. For learning/queries, ChatGPT is fine.

Productivity: Saves 2-3 hours/month on data entry and query writing.

newbie_accountant · October 14, 2025, 8:22pm

I’ve been experimenting with Claude for query generation and it’s incredible!

My workflow:

Describe what I want in plain English:

"Show me all dining expenses over $100 in the last 6 months,
including merchant name and category breakdown"

Claude generates BQL:

SELECT date, payee, narration, account, position
WHERE account ~ 'Expenses:Dining'
  AND number > 100
  AND date >= 2024-07-01
ORDER BY date DESC

I review, run, and refine if needed
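
A hard-coded cutoff like `2024-07-01` goes stale, so "the last 6 months" may be easier to keep correct if the date is computed when the query is built. This is a small sketch. It assumes the posting-level `number` column for the amount filter, and `months_ago` and `dining_query` are hypothetical helpers, not part of Beancount.

```python
import calendar
from datetime import date


def months_ago(today, months):
    """Return the date `months` calendar months before `today`."""
    total = today.year * 12 + (today.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day so Aug 31 minus 6 months lands on the last day of Feb.
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(today.day, last_day))


def dining_query(today, months=6, minimum=100):
    """Build the BQL text with a cutoff relative to `today`."""
    cutoff = months_ago(today, months)
    return (
        "SELECT date, payee, narration, account, position\n"
        "WHERE account ~ 'Expenses:Dining'\n"
        f"  AND number > {minimum}\n"
        f"  AND date >= {cutoff.isoformat()}\n"
        "ORDER BY date DESC"
    )
```

The resulting string can be pasted into bean-query or the Fava query page.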

Success rate: ~95% for queries, ~90% for transaction drafts

Best use cases:

Complex queries I don’t know how to write
Modeling unusual transactions (stock options, crypto, etc.)
Understanding error messages
Learning Beancount syntax faster

Time saved: Easily 3-4 hours/month on query writing and learning

helpful_veteran · October 14, 2025, 8:22pm

For those concerned about privacy, here’s my setup:

Option 1: Local models (Ollama)

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a model
ollama pull llama2

# Use for sensitive queries
ollama run llama2 "Convert this transaction to Beancount..."

Completely local, no data leaves your machine.
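
For scripting rather than typing prompts by hand, Ollama also serves a local HTTP API. This is a rough sketch, assuming the default endpoint on port 11434 and the `llama2` model pulled above. `build_request` and `generate` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama2", url=OLLAMA_URL):
    """Build a non-streaming generate request for the local server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama2", opener=urllib.request.urlopen):
    """Send the prompt and return the text of the model's reply."""
    with opener(build_request(prompt, model)) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

The request only ever targets `localhost`, which keeps the "nothing leaves the machine" property as long as the URL is not changed.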

Option 2: Data sanitization
Before sending to Claude/GPT:

Replace real amounts: $1,234.56 → $XXX.XX
Use generic merchants: “Coffee Shop” not “Starbucks on Main St”
Remove identifying info: account numbers, real names
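
Those three rules can be sketched as a small filter that runs before anything is pasted into a hosted model. The patterns below are assumptions about what amounts and account numbers look like (US-style dollar amounts, digit runs of 8 or more) and would need tuning. `sanitize` and the `merchant_aliases` mapping are hypothetical names.

```python
import re

# Dollar amounts such as $5, $47.23 or $1,234.56.
AMOUNT_RE = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")
# Assumption: any run of 8 or more digits is an account or card number.
ACCOUNT_NUMBER_RE = re.compile(r"\b\d{8,}\b")


def sanitize(text, merchant_aliases):
    """Mask amounts, account numbers, and known merchant names."""
    for real, generic in merchant_aliases.items():
        text = re.sub(re.escape(real), generic, text, flags=re.IGNORECASE)
    text = AMOUNT_RE.sub("$XXX.XX", text)
    text = ACCOUNT_NUMBER_RE.sub("[ACCOUNT]", text)
    return text
```

Real names are not covered here. They would need their own entries in the alias mapping.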

Option 3: Use for learning only
Ask conceptual questions:

“How do I model X?”
“What’s the syntax for Y?”
“Explain this error: Z”

No actual financial data shared.

My approach: Ollama for sensitive data, Claude for learning and queries with sanitized data.

practical_adviser · October 14, 2025, 8:22pm

Advanced technique: Using AI for smart categorization with context.

Instead of simple rule matching, I use AI to categorize based on:

Context-aware prompts:

Categorize this transaction considering:
- Merchant: "AMZN MKTP US*1A2B3C4D5"
- Amount: $47.23
- Previous Amazon purchases: Books, Kitchen supplies
- Date: Weekend
- My typical spending patterns

Return Beancount transaction.
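
A prompt in this shape can be assembled from the transaction fields rather than typed each time. This is a sketch only: `build_prompt` and its parameters are hypothetical, the weekend test is simply Saturday or Sunday, and the "typical spending patterns" line is left out because it depends on data this example does not have.

```python
from datetime import date
from decimal import Decimal


def build_prompt(merchant, amount, txn_date, previous_categories):
    """Assemble a context-aware categorization prompt as plain text."""
    # weekday() is 5 for Saturday and 6 for Sunday.
    day_type = "Weekend" if txn_date.weekday() >= 5 else "Weekday"
    history = ", ".join(previous_categories) or "none on record"
    return "\n".join([
        "Categorize this transaction considering:",
        f'- Merchant: "{merchant}"',
        f"- Amount: ${Decimal(amount):.2f}",
        f"- Previous purchases from this merchant: {history}",
        f"- Date: {day_type}",
        "",
        "Return Beancount transaction.",
    ])


prompt = build_prompt("AMZN MKTP US*1A2B3C4D5", "47.23", date(2025, 10, 11),
                      ["Books", "Kitchen supplies"])
```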

AI considers:

Amount (books vs electronics)
Day of week (weekend = likely personal)
Historical patterns
Merchant variations

Results:

Better accuracy than simple regex (92% vs 85%)
Handles edge cases naturally
Adapts to my patterns through the history included in each prompt

Implementation:
Python script that:

Reads uncategorized transactions
Builds context from history
Queries AI with context
Outputs draft transactions
I review and commit
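
The steps above might be wired together roughly as follows. This is not the actual script, only a sketch. The dictionary keys, `build_context`, and `draft_transactions` are hypothetical, and the model call is passed in as a plain `categorize` callable, so any backend (local or hosted) can be plugged in.

```python
from collections import Counter


def build_context(history, payee, top=3):
    """Most frequent past accounts for a payee.

    `history` is an iterable of (payee, account) pairs from the ledger.
    """
    counts = Counter(account for p, account in history if p == payee)
    return [account for account, _ in counts.most_common(top)]


def draft_transactions(uncategorized, history, categorize):
    """Return Beancount drafts flagged '!' so they stand out for review."""
    drafts = []
    for txn in uncategorized:
        context = build_context(history, txn["payee"])
        # `categorize` wraps the AI call and returns an account name.
        account = categorize(txn, context)
        drafts.append(
            f'{txn["date"]} ! "{txn["payee"]}" ""\n'
            f'  {account}  {txn["amount"]} {txn["currency"]}\n'
            f'  {txn["source_account"]}'
        )
    return drafts
```

Writing the drafts to a separate file keeps the review-then-commit step explicit.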

Caution: Always review AI output before committing to ledger!