The other day I was at Trader Joe’s, loaded down with groceries, and thought “I should log this $45 expense.” By the time I got to my car, I’d already forgotten the exact amount. By the time I got home, I’d given up on logging it at all.
Sound familiar?
Mobile friction is killing my expense tracking compliance. Pulling out my phone, unlocking it, opening a text editor, typing out a properly formatted Beancount transaction—it’s just enough friction that I skip it when I’m busy. And I’m busy most of the time.
What I Want: Voice-Powered Expense Entry
I want to speak my expense and have it magically append to my Beancount ledger. Something like:
- Voice: “Hey Siri, log 45 dollars for groceries at Trader Joe’s”
- iOS Shortcuts parses: amount=$45, category=Expenses:Groceries, payee=Trader Joe’s
- Result appends to my .beancount file or staging location
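As a rough illustration of what the last step could append, here is a minimal sketch. The function name `format_transaction`, the `Liabilities:CreditCard` funding account, the USD currency, and the date are all assumptions made for the example, not part of any existing tool. It writes the entry with Beancount's `!` flag so it stands out as unreviewed.

```python
from datetime import date
from decimal import Decimal


def format_transaction(day, payee, narration, amount, expense_account,
                       funding_account="Liabilities:CreditCard",  # assumed default
                       currency="USD", flag="!"):
    """Render one two-posting Beancount transaction as text."""
    # Beancount strings are double-quoted; swap embedded double quotes for
    # single quotes so a dictated payee cannot break the line.
    payee = payee.replace('"', "'")
    narration = narration.replace('"', "'")
    amount = Decimal(amount).quantize(Decimal("0.01"))
    return (
        f'{day.isoformat()} {flag} "{payee}" "{narration}"\n'
        f"  {expense_account}  {amount} {currency}\n"
        f"  {funding_account}  {-amount} {currency}\n"
    )


# Hypothetical date and accounts, matching the spoken example above.
print(format_transaction(date(2026, 3, 14), "Trader Joe's", "Groceries",
                         "45", "Expenses:Groceries"))
```

Using `Decimal` rather than `float` keeps the amounts exact, which matters once entries are summed.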
What I’ve Found (2026 State of the Art)
I’ve been researching voice-to-text expense automation, and here’s what’s possible in 2026:
iOS Shortcuts + Siri Integration
- iOS 26’s Apple Intelligence now supports LLM-based Shortcuts actions
- People are building voice-controlled expense trackers that talk to Siri, process via n8n, and write to Google Sheets or Notion
- The Transaction automation trigger can capture Apple Pay transactions automatically
- Voice commands can trigger shortcuts hands-free for immediate logging
Android: Tasker + IFTTT
- Tasker offers deep device control with custom triggers and parameters
- IFTTT provides 1000+ app integrations for cross-platform automation
- Tasker ranked #1 for Android automation in 2026, IFTTT #2 for ecosystem breadth
The Plain Text Accounting Challenge
- Most solutions target Google Sheets, Notion, or proprietary apps
- Very few bridge voice input to the plain-text .beancount file format
- Beancount’s human-readable format is perfect for this, but I haven’t found tooling that does it yet
The Technical Puzzle
To build this, I’d need:
1. Voice capture: Siri Shortcuts, Google Assistant, or Tasker voice input
2. Natural language parsing: Extract amount, category, payee from spoken text
3. Transaction formatting: Convert to valid Beancount syntax
4. File sync: Append to .beancount file via Git sync, Dropbox, or cloud service
5. Validation: Catch malformed entries before they corrupt the ledger
6. Review workflow: Staging area for questionable transactions
In 2026, steps 1 and 4 are solved problems. Step 2 is possible with LLM-based parsing. Steps 3, 5, and 6 are where it gets interesting.
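To make the validation step concrete, here is a minimal sketch of a structural pre-check on one staged entry, using only the standard library. It understands a deliberately small subset of Beancount (a header with one or two quoted strings, plain `number CURRENCY` postings, no costs, prices, or metadata), and the name `check_entry` is hypothetical. Running `bean-check` on the full ledger remains the real test; this would only filter out obviously broken entries earlier.

```python
import re
from collections import defaultdict
from decimal import Decimal

# Simplified patterns: real Beancount syntax allows much more than this.
HEADER = re.compile(r'^\d{4}-\d{2}-\d{2} [*!]( "[^"]*"){1,2}$')
POSTING = re.compile(
    r"^\s+(?P<account>[A-Z][A-Za-z0-9-]*(?::[A-Z0-9][A-Za-z0-9-]*)+)"
    r"(?:\s+(?P<number>-?\d+(?:\.\d+)?)\s+(?P<currency>[A-Z]{3,}))?\s*$"
)


def check_entry(text):
    """Return a list of problems found in one staged transaction (empty = OK)."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines or not HEADER.match(lines[0]):
        return ["header line is not 'YYYY-MM-DD flag \"payee\" \"narration\"'"]
    problems, totals, elided = [], defaultdict(Decimal), 0
    for line in lines[1:]:
        match = POSTING.match(line)
        if match is None:
            problems.append(f"unparseable posting: {line.strip()!r}")
        elif match["number"] is None:
            elided += 1  # Beancount can infer one missing amount itself.
        else:
            totals[match["currency"]] += Decimal(match["number"])
    if len(lines) < 3:
        problems.append("need at least two postings")
    if elided > 1:
        problems.append("more than one posting without an amount")
    # Only check the sum when every posting states its amount explicitly.
    if elided == 0 and any(total != 0 for total in totals.values()):
        problems.append(f"postings do not balance: {dict(totals)}")
    return problems
```

An entry that returns a non-empty list would stay in the staging area (step 6) instead of being appended to the ledger.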
Has Anyone Built This?
I’m throwing this out to the community:
- Has anyone built a working voice-to-Beancount workflow? Even a hacky prototype?
- Are there existing projects I should look at or contribute to?
- What approaches have you tried that didn’t work?
- Would you use this if it existed? What’s your minimum bar for good enough?
I can hack together iOS Shortcuts and write basic parsers. But before I reinvent the wheel, I want to know if someone’s already solved this or if the community thinks this is even a good idea.
The dream is: Zero-friction expense logging that doesn’t compromise Beancount’s data integrity.
Is that dream achievable? Let’s discuss.
This is such a great idea! I’ve been thinking about this exact problem for years.
I actually tried building something similar about 6 months ago using iOS Shortcuts. My approach was pretty simple:
- Siri captures the voice command
- Shortcut parses it and appends to a note in Apple Notes
- At the end of each day, I copy-paste from Apple Notes into my main .beancount file
It’s not elegant, but it works. The key insight: Don’t aim for perfection on day one.
Start Simple, Iterate Later
My advice: Build an 80% solution now instead of dreaming about a 100% solution you’ll never finish. Here’s what I’d recommend:
Phase 1: Voice to Staging File
- Use iOS Shortcuts to capture: “log [amount] for [category] at [payee]”
- Append raw text to a staging file (Apple Notes, Dropbox text file, whatever)
- Don’t worry about perfect Beancount syntax yet
- Review and format manually once a day (takes 5 minutes)
Phase 2: Add Basic Parsing
- Once you have data flowing, add simple parsing
- Use regex or basic string manipulation to extract: amount, category, payee
- Generate properly formatted transactions
- Still review before merging to main ledger
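A minimal sketch of the Phase 2 regex approach, assuming dictation returns the amount as digits and the speaker sticks to the "log [amount] for [category] at [payee]" template. The `CATEGORY_ACCOUNTS` mapping and the account names in it are hypothetical placeholders for whatever your ledger uses.

```python
import re
from decimal import Decimal

# Hypothetical mapping from spoken category words to ledger accounts.
CATEGORY_ACCOUNTS = {
    "groceries": "Expenses:Groceries",
    "coffee": "Expenses:Food:Coffee",
    "parking": "Expenses:Transport:Parking",
}

PHRASE = re.compile(
    r"^\s*log\s+\$?(?P<amount>\d+(?:\.\d{1,2})?)\s*(?:dollars?|bucks)?\s+"
    r"for\s+(?P<category>.+?)\s+at\s+(?P<payee>.+?)\s*$",
    re.IGNORECASE,
)


def parse_phrase(text):
    """Return a dict of parsed fields, or None if the phrase does not match."""
    match = PHRASE.match(text)
    if match is None:
        return None
    category = match["category"].lower()
    return {
        "amount": Decimal(match["amount"]),
        # Unknown categories map to None so the review step can catch them.
        "account": CATEGORY_ACCOUNTS.get(category),
        "payee": match["payee"],
    }


print(parse_phrase("log 45 dollars for groceries at Trader Joe's"))
```

Anything that returns `None` (spelled-out numbers, a different word order) is best left as raw text in the staging file for manual review rather than guessed at.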
Phase 3: Confidence and Automation
- After a month of using it daily, you’ll know what works
- Add validation: balance checks, duplicate detection
- Maybe auto-merge transactions that match patterns
- Keep manual review for anything ambiguous
My Current Shortcut
I can share my iOS Shortcut if you want it. It’s super basic:
- Takes dictation input
- Formats as: “[date] [text]”
- Appends to a shared note in iCloud
Then at night I go through that note and convert entries to Beancount format. Takes maybe 5 minutes for a day’s worth of transactions.
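For anyone who would rather do the append step in a script than in a note (say, on a machine that syncs the same folder), a minimal sketch of the same "[date] [text]" capture might look like this. The function name `append_to_staging` and the idea of a plain-text staging file are assumptions; the Shortcut described above writes to an iCloud note instead.

```python
from datetime import datetime
from pathlib import Path


def append_to_staging(text, staging_path, now=None):
    """Append one dictated line to the staging file as "[date] [text]"."""
    now = now or datetime.now()
    # Collapse newlines and repeated spaces so each capture stays on one line.
    cleaned = " ".join(text.split())
    line = f"{now:%Y-%m-%d} {cleaned}\n"
    path = Path(staging_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode never rewrites earlier captures.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(line)
    return line
```

Because the file is only ever appended to, a failed or garbled capture costs one bad line in the staging file and never touches the main ledger.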
The 80/20 rule applies here: Voice capture solves 80% of the friction. You can handle the last 20% (formatting, categorization) in a quick daily review.
Don’t let perfect be the enemy of good! Start with something that captures the data, even if it’s not fully automated yet. You can always improve it once it’s working.
This sounds really useful! I’m pretty new to Beancount (only been using it for 2 months), and voice entry would make it way less intimidating.
Quick question though: Is this premature optimization?
I’m still struggling to build the habit of tracking expenses regularly. I wonder if adding voice automation before I have a solid manual workflow might be jumping ahead?
Alternative Idea: Photo Receipt Capture?
What if instead of voice, we focused on photo-to-transaction? Like:
- Take a photo of the receipt
- OCR extracts: date, merchant, amount, items
- Review and categorize
- Generate Beancount transaction
There are already receipt scanner apps that do this (Expensify, etc.). Could we adapt their approach for Beancount?
Seems like it might be more accurate than voice since you have the actual receipt data, not just what you remember saying.
Technical Skill Requirements
For voice-to-Beancount, do I need to know programming? I’m not a developer—just an accountant trying to level up my personal finance game.
If there’s a no-code solution (like a pre-built iOS Shortcut I can download), I’d try it. But if it requires writing Python scripts or running servers, that’s beyond my current skillset.
Would love to see:
- Step-by-step setup guides
- Pre-made shortcuts/templates
- Community-maintained solutions I can just install
Is anyone working on something like this that non-technical users can adopt?
As a CPA who deals with client records all day, I see both the appeal and the risks here.
The Professional Perspective
Voice-to-transaction automation sounds convenient, but in my experience, convenience without controls leads to messy books.
Here’s what I see go wrong with client data:
Voice Recognition Errors
- “Fifteen” vs “fifty” (more than a 3x difference!)
- “Marketing” vs “parking” (completely different categories)
- Missed words: “Coffee for client meeting” → “Coffee client” (context lost)
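The "fifteen" vs "fifty" case is narrow enough to check mechanically. Here is a minimal sketch, assuming the transcript contains amounts as digits, that flags any whole-number amount with a sound-alike twin so the user can confirm it. The name `confusable_amounts` is hypothetical, and this does nothing for the category and missed-word problems.

```python
import re

# Amounts whose spoken forms are easily confused: "fifteen" vs "fifty", etc.
TEEN_TO_TY = {13: 30, 14: 40, 15: 50, 16: 60, 17: 70, 18: 80, 19: 90}
TY_TO_TEEN = {ty: teen for teen, ty in TEEN_TO_TY.items()}

# Whole numbers only: skip digits that are part of a decimal such as 45.50.
WHOLE_NUMBER = re.compile(r"(?<![\d.])\d+(?!\.?\d)")


def confusable_amounts(transcript):
    """Return (heard, alternative) pairs worth confirming with the user."""
    pairs = []
    for token in WHOLE_NUMBER.findall(transcript):
        value = int(token)
        other = TEEN_TO_TY.get(value) or TY_TO_TEEN.get(value)
        if other is not None:
            pairs.append((value, other))
    return pairs


print(confusable_amounts("log 15 dollars for parking at the garage"))
```

A non-empty result could trigger a spoken confirmation ("Did you say fifteen or fifty?") before the entry is even staged.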
Categorization Ambiguity
- Same merchant, different purposes (Office Depot for office supplies vs client gifts)
- Context matters: Is this meal personal or business? Entertainment or travel?
- Tax implications: Some categories are fully deductible, others aren’t
Data Integrity Concerns
- If bad data gets auto-committed to your main ledger, you’re compounding errors
- During tax season, untangling months of mis-categorized transactions is expensive
- IRS audits require documentation—“I think I said that” isn’t sufficient
What Would Make This Work
That said, I’m not opposed to automation—I just want guardrails:
1. Pending Transaction Queue
Voice input should create draft transactions that require approval before becoming permanent. Think of it like a shopping cart: you can add items quickly, but you review before checkout.
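Beancount already has a natural marker for this kind of queue: a transaction flagged `!` is one that still needs attention, while `*` marks a completed one. A minimal sketch of "checkout" as a flag flip, with hypothetical function names and assuming each draft starts with a `YYYY-MM-DD !` header line:

```python
import re

# Matches the date and pending flag at the start of a transaction header.
PENDING_HEADER = re.compile(r"^(\d{4}-\d{2}-\d{2}) !(?= )", re.MULTILINE)


def approve(entry_text):
    """Return the entry with its pending flag (!) replaced by the cleared flag (*)."""
    approved, count = PENDING_HEADER.subn(r"\1 *", entry_text, count=1)
    if count == 0:
        raise ValueError("entry has no pending '!' flag to approve")
    return approved


def count_pending(staging_text):
    """Count the drafts in the staging text that still await approval."""
    return len(PENDING_HEADER.findall(staging_text))
```

Keeping approval as an explicit, separate call means nothing dictated becomes permanent without a human step in between.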
2. Validation Rules
- Flag unusual amounts (“Did you really spend $4,700 at Starbucks?”)
- Detect duplicates (same amount, same merchant, same day)
- Require receipts for transactions over a threshold (say, $50)
- Balance assertions to catch cumulative errors
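The first three rules can be sketched as plain checks on a draft against recent history. Everything here is an assumption made for illustration: the `Draft` record, the function name `review_flags`, and the "more than 5x the typical amount at this payee" definition of unusual. The fourth rule needs no custom code, since Beancount's own `balance` directive covers it.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Draft:
    day: date
    payee: str
    amount: Decimal
    has_receipt: bool = False


RECEIPT_THRESHOLD = Decimal("50")  # the "say, $50" threshold from the list
UNUSUAL_MULTIPLE = 5               # hypothetical cutoff for "unusual"


def review_flags(draft, history):
    """Return the reasons, if any, that a draft needs a closer look."""
    flags = []
    same_payee = sorted(h.amount for h in history if h.payee == draft.payee)
    if same_payee:
        typical = same_payee[len(same_payee) // 2]  # upper median
        if draft.amount > typical * UNUSUAL_MULTIPLE:
            flags.append("unusual amount")
    # Same amount, same merchant, same day.
    if any(h.day == draft.day and h.payee == draft.payee
           and h.amount == draft.amount for h in history):
        flags.append("possible duplicate")
    if draft.amount > RECEIPT_THRESHOLD and not draft.has_receipt:
        flags.append("receipt required")
    return flags
```

A draft with no flags could move straight to the approval queue, while flagged drafts would be shown with their reasons attached.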
3. Professional-Grade Audit Trail
- Timestamp when transaction was voice-captured vs when it was approved
- Store original voice transcription alongside formatted transaction
- Keep receipt images linked to transactions
- Generate audit reports for review
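Beancount transactions can carry `key: "value"` metadata lines under the header, which is one place the first two items could live. A minimal sketch, where the key names `captured-at`, `transcript`, and `approved-at` and the function name are all hypothetical choices rather than any established convention:

```python
from datetime import datetime


def with_audit_metadata(entry_text, transcript, captured_at, approved_at=None):
    """Insert audit metadata lines right after the transaction header."""
    header, _, rest = entry_text.partition("\n")
    # Metadata strings are double-quoted, so neutralise embedded quotes.
    safe_transcript = transcript.replace('"', "'")
    captured = captured_at.isoformat(timespec="seconds")
    meta = [
        f'  captured-at: "{captured}"',
        f'  transcript: "{safe_transcript}"',
    ]
    if approved_at is not None:
        approved = approved_at.isoformat(timespec="seconds")
        meta.append(f'  approved-at: "{approved}"')
    return "\n".join([header, *meta, rest])
```

Storing the raw transcript next to the formatted entry lets a reviewer see exactly what was heard when an amount or category looks wrong months later.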
Would I Use This?
If someone built this with proper controls, yes, I’d recommend it to clients.
Small business owners often don’t track expenses because it’s tedious. A voice-first tool that still maintains data quality would be genuinely useful.
But it can’t be a black box. The user needs to understand what’s happening and have checkpoints to verify accuracy.
I’d even be willing to beta test such a tool if someone’s building it. Just make sure you build the boring stuff (validation, review workflows, error handling) alongside the exciting stuff (voice capture, AI parsing).