Using AI/LLMs with Beancount - Anyone Experimenting?

I’ve been hearing about people using Claude and ChatGPT to help with their Beancount workflows. Has anyone tried this?

Potential use cases I’m thinking about:

  • Help writing complex BQL queries
  • Categorizing unusual transactions
  • Learning Beancount syntax faster
  • Converting receipts to transaction entries
  • Explaining error messages

But I’m also worried about:

  • Privacy (sending financial data to AI)
  • Accuracy for tax-critical stuff
  • Hallucinated transactions

What’s been your experience?

I’ve been cautiously experimenting! Here’s what I’ve learned:

What works GREAT:

  • Query writing - Claude is amazing at generating BQL queries
  • Learning syntax - better than reading docs sometimes
  • Explaining errors - it understands Beancount errors well
  • Modeling edge cases - “how do I record X?” type questions

What I DON’T trust yet:

  • Auto-generating actual transactions (always verify!)
  • Tax calculations (mistakes are too costly)
  • Bulk categorization without review

My workflow:

  1. Ask AI to draft something
  2. Review carefully
  3. Run bean-check immediately
  4. Check in Fava for sanity
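Step 3 is easy to script so it never gets skipped. Here is a minimal sketch, assuming bean-check is on your PATH and exits non-zero when it finds errors. The `run_check` and `report` names are just illustrative, and the `command` parameter is only there so the checker can be swapped out.

```python
import subprocess
import sys


def run_check(ledger_path, command=("bean-check",)):
    """Run a checker command on the ledger and return (ok, output).

    Assumption: the command exits non-zero when the ledger has errors.
    """
    result = subprocess.run(
        [*command, ledger_path],
        capture_output=True,
        text=True,
    )
    # Collect both streams so no error message is lost.
    output = (result.stdout + result.stderr).strip()
    return result.returncode == 0, output


def report(ledger_path):
    """Print any checker output to stderr and return True if the ledger is clean."""
    ok, output = run_check(ledger_path)
    if not ok:
        print(output, file=sys.stderr)
    return ok
```

Calling `report("main.beancount")` right after pasting an AI draft gives a quick pass/fail before opening Fava.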

Privacy approach:
I sanitize data before sending it to an AI: I replace real amounts with fake ones, use generic merchant names, and so on.

It’s a tool, not autopilot!

I use AI daily now - huge time saver!

Best use case: Receipt → Transaction

I take a photo, OCR it, and prompt the AI with the OCR text. It generates:

2024-01-15 * "Whole Foods" "Groceries"
  Expenses:Groceries  67.43 USD
  Liabilities:CreditCard:Chase

90% accuracy! I just review and commit.
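One way to cut down on formatting mistakes is to have the AI return only the fields and render the entry yourself. This is a rough sketch, not anyone's actual setup: `draft_transaction` is a made-up helper, and flagging drafts with `!` (Beancount's "needs review" flag) instead of `*` is my own assumption about how you might mark unreviewed entries.

```python
from datetime import date
from decimal import Decimal


def draft_transaction(txn_date, payee, narration, expense_account, amount,
                      payment_account, currency="USD", flag="!"):
    """Render a two-posting Beancount entry from already-extracted fields."""
    day = date.fromisoformat(txn_date)  # raises ValueError on a malformed date
    number = Decimal(amount).quantize(Decimal("0.01"))  # pass amounts as strings
    # Double quotes would end the Beancount string early, so swap them out.
    payee = payee.replace('"', "'")
    narration = narration.replace('"', "'")
    return (
        f'{day.isoformat()} {flag} "{payee}" "{narration}"\n'
        f"  {expense_account}  {number} {currency}\n"
        f"  {payment_account}\n"
    )
```

The second posting is left without an amount so Beancount infers it, same as in the example above.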

Privacy: I use local Ollama models for sensitive data. For learning/queries, ChatGPT is fine.

Productivity: Saves 2-3 hours/month on data entry and query writing.

I’ve been experimenting with Claude for query generation and it’s incredible!

My workflow:

  1. Describe what I want in plain English:
"Show me all dining expenses over $100 in the last 6 months,
including merchant name and category breakdown"
  2. Claude generates BQL:
SELECT date, payee, narration, position
WHERE account ~ 'Expenses:Dining'
  AND number > 100
  AND date >= 2024-07-01
ORDER BY date DESC
  3. I review, run, and refine if needed
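Once a query like this works, the hard-coded date is the part that goes stale. Here is a small sketch of templating it, with a few assumptions: `months_ago` and `dining_query` are hypothetical names, "last 6 months" is snapped to the first day of the month, and the amount filter uses the `number` column.

```python
from datetime import date


def months_ago(today, months):
    """Return the first day of the month `months` months before `today`."""
    index = today.year * 12 + (today.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


def dining_query(today, months=6, min_amount=100, account="Expenses:Dining"):
    """Build the BQL text for postings above min_amount since the cutoff."""
    since = months_ago(today, months)
    return (
        "SELECT date, payee, narration, position\n"
        f"WHERE account ~ '{account}'\n"
        f"  AND number > {min_amount}\n"
        f"  AND date >= {since.isoformat()}\n"
        "ORDER BY date DESC"
    )
```

`print(dining_query(date.today()))` gives text you can paste into bean-query or Fava's query page.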

Success rate: ~95% for queries, ~90% for transaction drafts

Best use cases:

  • Complex queries I don’t know how to write
  • Modeling unusual transactions (stock options, crypto, etc.)
  • Understanding error messages
  • Learning Beancount syntax faster

Time saved: Easily 3-4 hours/month on query writing and learning

For those concerned about privacy, here’s my setup:

Option 1: Local models (Ollama)

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a model
ollama pull llama2

# Use for sensitive queries
ollama run llama2 "Convert this transaction to Beancount..."

Completely local, no data leaves your machine.
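If you want to call the local model from a script instead of the CLI, Ollama also serves an HTTP API on localhost. This is a minimal sketch, assuming the default port 11434 and the non-streaming `/api/generate` endpoint. The function names are mine.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama2"):
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt, model="llama2", timeout=120):
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["response"]
```

The request only ever goes to localhost, so the privacy property is the same as with `ollama run`.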

Option 2: Data sanitization
Before sending to Claude/GPT:

  • Replace real amounts: $1,234.56 → $XXX.XX
  • Use generic merchants: “Coffee Shop” not “Starbucks on Main St”
  • Remove identifying info: account numbers, real names
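The three rules above can be roughed out in a few lines. Treat this as a starting point only: the regexes are my own guesses at what amounts and account numbers look like (decimal amounts with two places, digit runs of 8 or more), and the merchant alias map is something you would maintain yourself.

```python
import re

# Assumption: amounts have exactly two decimal places, with optional $ and commas.
AMOUNT_RE = re.compile(r"(\$?)\d[\d,]*\.\d{2}\b")
# Assumption: any run of 8+ digits is an account or card number.
ACCOUNT_NUMBER_RE = re.compile(r"\b\d{8,}\b")


def sanitize(text, merchant_aliases=None):
    """Mask amounts, long digit runs, and known merchant names in text."""
    for real, generic in (merchant_aliases or {}).items():
        pattern = re.compile(re.escape(real), flags=re.IGNORECASE)
        text = pattern.sub(lambda _match, g=generic: g, text)
    # Amounts first, so the digits inside them are not mistaken for account numbers.
    text = AMOUNT_RE.sub(r"\1XXX.XX", text)
    text = ACCOUNT_NUMBER_RE.sub("[ACCOUNT]", text)
    return text
```

For example, `sanitize("Starbucks on Main St $1,234.56", {"Starbucks on Main St": "Coffee Shop"})` returns `"Coffee Shop $XXX.XX"`. Real names still need a manual pass, since no regex will catch those reliably.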

Option 3: Use for learning only
Ask conceptual questions:

  • “How do I model X?”
  • “What’s the syntax for Y?”
  • “Explain this error: Z”

No actual financial data shared.

My approach: Ollama for sensitive data, Claude for learning and queries with sanitized data.

Advanced technique: Using AI for smart categorization with context.

Instead of simple rule matching, I give the AI a context-aware prompt:

Categorize this transaction considering:
- Merchant: "AMZN MKTP US*1A2B3C4D5"
- Amount: $47.23
- Previous Amazon purchases: Books, Kitchen supplies
- Date: Weekend
- My typical spending patterns

Return Beancount transaction.

AI considers:

  • Amount (books vs electronics)
  • Day of week (weekend = likely personal)
  • Historical patterns
  • Merchant variations

Results:

  • Better accuracy than simple regex (92% vs 85%)
  • Handles edge cases naturally
  • Adapts to my patterns through the history included in the prompt

Implementation:
Python script that:

  1. Reads uncategorized transactions
  2. Builds context from history
  3. Queries AI with context
  4. Outputs draft transactions
  5. Leaves the review and commit to me
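To make those steps concrete, here is a stripped-down sketch of such a script. It is not the actual implementation: every name in it is hypothetical, `ask_model` is whatever callable wraps your model of choice, and the account-name check plus the `Expenses:Uncategorized` fallback are assumptions I added so a rambling answer can't end up in the ledger.

```python
import re
from collections import Counter
from datetime import date

# Assumption: only Expenses accounts are valid answers.
ACCOUNT_RE = re.compile(r"Expenses(:[A-Z0-9][A-Za-z0-9-]*)+")

# Shape of one uncategorized transaction (illustrative values).
EXAMPLE_TXN = {
    "date": date(2024, 1, 13),
    "merchant": "AMZN MKTP US*1A2B3C4D5",
    "amount": "47.23",
    "currency": "USD",
    "source_account": "Liabilities:CreditCard:Chase",
}


def merchant_key(raw):
    """Collapse merchant variations by dropping the part after '*'."""
    return raw.split("*")[0].strip().upper()


def build_context(history, key, limit=3):
    """Return the accounts used most often for this merchant key."""
    counts = Counter(account for past_key, account in history if past_key == key)
    return [account for account, _ in counts.most_common(limit)]


def build_prompt(txn, past_accounts):
    """Assemble the context-aware prompt text for one transaction."""
    past = ", ".join(past_accounts) or "none"
    day = txn["date"]
    return (
        "Categorize this transaction considering:\n"
        f"- Merchant: {txn['merchant']}\n"
        f"- Amount: {txn['amount']} {txn['currency']}\n"
        f"- Date: {day.isoformat()} ({day.strftime('%A')})\n"
        f"- Accounts used before for this merchant: {past}\n\n"
        "Reply with a single Beancount expense account name."
    )


def draft_entries(uncategorized, history, ask_model,
                  fallback="Expenses:Uncategorized"):
    """Return draft entries flagged '!' so they stand out for review."""
    drafts = []
    for txn in uncategorized:
        past = build_context(history, merchant_key(txn["merchant"]))
        answer = ask_model(build_prompt(txn, past)).strip()
        # Anything that is not a plain account name falls back for manual handling.
        account = answer if ACCOUNT_RE.fullmatch(answer) else fallback
        day, merchant = txn["date"].isoformat(), txn["merchant"]
        drafts.append(
            f'{day} ! "{merchant}" "AI draft"\n'
            f"  {account}  {txn['amount']} {txn['currency']}\n"
            f"  {txn['source_account']}\n"
        )
    return drafts
```

Here `history` is a list of `(merchant_key, account)` pairs pulled from already-categorized entries. Passing a stub like `lambda prompt: "Expenses:Books"` as `ask_model` lets you test the plumbing without calling any model.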

Caution: Always review AI output before committing to ledger!