I’ve been hearing about people using Claude and ChatGPT to help with their Beancount workflows. Has anyone tried this?
Potential use cases I’m thinking about:
- Help writing complex BQL queries
- Categorizing unusual transactions
- Learning Beancount syntax faster
- Converting receipts to transaction entries
- Explaining error messages
But I’m also worried about:
- Privacy (sending financial data to AI)
- Accuracy for tax-critical stuff
- Hallucinated transactions
What’s been your experience?
I’ve been cautiously experimenting! Here’s what I’ve learned:
What works GREAT:
- Query writing - Claude is amazing at generating BQL queries
- Learning syntax - better than reading docs sometimes
- Explaining errors - it understands Beancount errors well
- Modeling edge cases - “how do I record X?” type questions
What I DON’T trust yet:
- Auto-generating actual transactions (always verify!)
- Tax calculations (too critical for mistakes)
- Bulk categorization without review
My workflow:
- Ask AI to draft something
- Review carefully
- Run bean-check immediately
- Check in Fava for sanity
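If you want the "run bean-check immediately" step to be automatic, here is a minimal sketch of a wrapper. It assumes `bean-check` is on your PATH, exits non-zero when it finds errors, and reports them on stderr. The function name `check_ledger` and the `command` parameter are my own, not part of Beancount.

```python
import subprocess


def check_ledger(path, command=("bean-check",)):
    """Run a checker command on a ledger file and return (ok, error_text).

    `command` defaults to bean-check; it is a parameter only so the
    wrapper can be exercised without Beancount installed.
    """
    result = subprocess.run(
        [*command, str(path)],
        capture_output=True,
        text=True,
    )
    # Assumption: exit status 0 means no errors were reported,
    # and any error messages are written to stderr.
    return result.returncode == 0, result.stderr.strip()
```

Call it as `ok, errors = check_ledger("main.beancount")` right after pasting in an AI draft, and only open Fava once `ok` is true.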
Privacy approach:
I sanitize data before sending to AI - replace real amounts with fake ones, use generic merchant names, etc.
It’s a tool, not autopilot!
I use AI daily now - huge time saver!
Best use case: Receipt → Transaction
I take a photo, OCR it, then prompt:
AI generates:
2024-01-15 * "Whole Foods" "Groceries"
  Expenses:Groceries            67.43 USD
  Liabilities:CreditCard:Chase
90% accuracy! I just review and commit.
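One way to cut down on formatting mistakes is to have the model return only the fields and render the entry yourself. A rough sketch, assuming a simple two-posting transaction where the second posting is left for Beancount to balance; `render_transaction` is a hypothetical helper, not a Beancount API.

```python
from datetime import date
from decimal import Decimal


def render_transaction(txn_date, payee, narration, expense_account,
                       amount, currency, funding_account):
    """Format one two-posting Beancount transaction as text."""
    for field in (payee, narration):
        # Keep the sketch simple: refuse strings that would need escaping.
        if '"' in field:
            raise ValueError("double quotes are not handled in this sketch")
    lines = [
        f'{txn_date.isoformat()} * "{payee}" "{narration}"',
        # Postings must be indented; the amount is written with 2 decimals.
        f"  {expense_account}  {amount:.2f} {currency}",
        # No amount here, so Beancount infers the balancing value.
        f"  {funding_account}",
    ]
    return "\n".join(lines) + "\n"


print(render_transaction(date(2024, 1, 15), "Whole Foods", "Groceries",
                         "Expenses:Groceries", Decimal("67.43"), "USD",
                         "Liabilities:CreditCard:Chase"))
```

You still need to review the account and amount, but the indentation and date format can no longer go wrong.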
Privacy: I use local Ollama models for sensitive data. For learning/queries, ChatGPT is fine.
Productivity: Saves 2-3 hours/month on data entry and query writing.
I’ve been experimenting with Claude for query generation and it’s incredible!
My workflow:
- Describe what I want in plain English:
"Show me all dining expenses over $100 in the last 6 months,
including merchant name and category breakdown"
- Claude generates BQL:
SELECT date, payee, narration, account, position
WHERE account ~ 'Expenses:Dining'
  AND number > 100
  AND date >= 2024-07-01
ORDER BY date DESC
- I review, run, and refine if needed
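A hard-coded cutoff date goes stale, so a query like this is a good candidate for a small generator. The sketch below is an assumption-laden example: `months_ago` and `dining_query` are made-up helper names, the cutoff is rounded to the first day of the month, and the filter uses the `number` posting column.

```python
from datetime import date


def months_ago(today, months):
    """Return the first day of the month `months` months before `today`."""
    index = today.year * 12 + (today.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


def dining_query(today, months=6, min_amount=100):
    """Build a BQL query for large dining expenses in a rolling window."""
    cutoff = months_ago(today, months)
    return (
        "SELECT date, payee, narration, account, position "
        "WHERE account ~ 'Expenses:Dining' "
        f"AND number > {min_amount} "
        f"AND date >= {cutoff.isoformat()} "
        "ORDER BY date DESC"
    )


print(dining_query(date.today()))
```

Paste the printed query into bean-query or Fava's query page and review the result as usual.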
Success rate: ~95% for queries, ~90% for transaction drafts
Best use cases:
- Complex queries I don’t know how to write
- Modeling unusual transactions (stock options, crypto, etc.)
- Understanding error messages
- Learning Beancount syntax faster
Time saved: Easily 3-4 hours/month on query writing and learning
For those concerned about privacy, here’s my setup:
Option 1: Local models (Ollama)
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a model
ollama pull llama2
# Use for sensitive queries
ollama run llama2 "Convert this transaction to Beancount..."
Completely local, no data leaves your machine.
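If you would rather call the local model from a script than from the shell, Ollama also exposes an HTTP API on localhost. This is a minimal sketch using only the standard library; it assumes the default port 11434 and the `llama2` model pulled above, and the function names are my own.

```python
import json
import urllib.request


def build_generate_request(prompt, model="llama2",
                           host="http://localhost:11434"):
    """Build a POST request for Ollama's /api/generate endpoint."""
    # stream=False asks for one JSON object instead of a token stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt, **kwargs):
    """Send the prompt to the local Ollama server and return its text."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

`ask_local_model("Convert this transaction to Beancount...")` needs the Ollama server running; the request never leaves your machine.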
Option 2: Data sanitization
Before sending to Claude/GPT:
- Replace real amounts: $1,234.56 → $XXX.XX
- Use generic merchants: “Coffee Shop” not “Starbucks on Main St”
- Remove identifying info: account numbers, real names
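The three replacements above can be scripted so you don't forget one. A rough sketch, assuming dollar amounts written like $1,234.56, account numbers that are runs of 8 or more digits, and a merchant alias table you maintain yourself. The patterns are illustrative and will miss other formats, so still skim the text before pasting it anywhere.

```python
import re

# Dollar amounts such as $47, $47.23 or $1,234.56.
AMOUNT_RE = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")
# Assumption: any standalone run of 8+ digits is an account number.
ACCOUNT_NUMBER_RE = re.compile(r"\b\d{8,}\b")


def sanitize(text, merchant_aliases):
    """Mask amounts, account numbers and known merchant names in text."""
    for real_name, generic_name in merchant_aliases.items():
        text = text.replace(real_name, generic_name)
    text = AMOUNT_RE.sub("$XXX.XX", text)
    text = ACCOUNT_NUMBER_RE.sub("[ACCOUNT]", text)
    return text


aliases = {"Starbucks on Main St": "Coffee Shop"}
print(sanitize("Paid $1,234.56 at Starbucks on Main St from acct 123456789",
               aliases))
```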
Option 3: Use for learning only
Ask conceptual questions:
- “How do I model X?”
- “What’s the syntax for Y?”
- “Explain this error: Z”
No actual financial data shared.
My approach: Ollama for anything sensitive, Claude for learning and queries, and only with sanitized data.
Advanced technique: Using AI for smart categorization with context.
Instead of simple rule matching, I use AI to categorize based on:
Context-aware prompts:
Categorize this transaction considering:
- Merchant: "AMZN MKTP US*1A2B3C4D5"
- Amount: $47.23
- Previous Amazon purchases: Books, Kitchen supplies
- Date: Weekend
- My typical spending patterns
Return Beancount transaction.
AI considers:
- Amount (books vs electronics)
- Day of week (weekend = likely personal)
- Historical patterns
- Merchant variations
Results:
- Better accuracy than simple regex (92% vs 85%)
- Handles edge cases naturally
- Adapts to my patterns through the history I include in the prompt
Implementation:
Python script that:
- Reads uncategorized transactions
- Builds context from history
- Queries AI with context
- Outputs draft transactions
- I review and commit
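The steps above could be sketched roughly like this. It is not my actual script, just an outline under some assumptions: history is a list of (payee, account) pairs, each uncategorized transaction is a dict with `date`, `merchant` and `amount`, and `ask_model` is whatever function you use to call your model (local or hosted). All names are illustrative.

```python
from collections import Counter
from datetime import date


def build_context(history, merchant_prefix):
    """Return up to three accounts most often used for this merchant."""
    counts = Counter(
        account for payee, account in history
        if payee.startswith(merchant_prefix)
    )
    return [account for account, _ in counts.most_common(3)]


def build_prompt(txn, past_accounts):
    """Assemble the context-aware prompt for one transaction."""
    merchant = txn["merchant"]
    day_kind = "Weekend" if txn["date"].weekday() >= 5 else "Weekday"
    previous = ", ".join(past_accounts) or "none on record"
    return (
        "Categorize this transaction considering:\n"
        f'- Merchant: "{merchant}"\n'
        f"- Amount: ${txn['amount']}\n"
        f"- Previous accounts for this merchant: {previous}\n"
        f"- Date: {day_kind}\n"
        "Return Beancount transaction."
    )


def draft_categories(uncategorized, history, ask_model):
    """Return (transaction, model_answer) pairs for manual review."""
    drafts = []
    for txn in uncategorized:
        # "AMZN MKTP US*1A2B3C4D5" -> "AMZN MKTP US"
        prefix = txn["merchant"].split("*")[0].strip()
        prompt = build_prompt(txn, build_context(history, prefix))
        drafts.append((txn, ask_model(prompt)))
    return drafts
```

The drafts are only returned, never written to the ledger, so the review step stays manual.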
Caution: Always review AI output before committing to ledger!