Beancount.io v3's Document Linking Changed How I Handle Receipts and Invoices

As someone who runs a small consulting business and used to dread receipt management, I need to share how beancount.io v3’s document linking feature has completely transformed my workflow.

The Old Way (Pre-v3): Receipt Hell

Before v3, my receipt management was a disaster:

My chaotic process:

  1. Receive invoice/receipt (email, paper, screenshot)
  2. Save to Downloads folder (random filenames)
  3. Enter transaction in beancount file
  4. ???
  5. 6 months later, need receipt for audit → Can’t find it

The problem:

  • Receipts scattered across: Email, Downloads, Desktop, Phone photos
  • No connection between beancount transaction and physical document
  • Tax time = panic searching for receipts
  • Audits = nightmare

Year-end tax prep:

Me: "Where's the receipt for that $1,200 software purchase?"
*Searches email for 30 minutes*
*Checks Downloads folder - 500 random PDFs*
*Gives up, hopes IRS doesn't ask*

The v3 Solution: Automatic Document Linking

Beancount.io v3 automatically discovers and links documents in your Git repository to transactions.

How it works:

1. Save document to /documents folder in your repo

Your repo structure:

my-finances/
├── personal.beancount
├── business.beancount
└── documents/
    ├── 2025-01-15-invoice-acme-software.pdf
    ├── 2025-02-03-receipt-office-depot.pdf
    └── 2025-03-20-contract-client-abc.pdf

2. Commit to Git

git add documents/2025-01-15-invoice-acme-software.pdf
git commit -m "Add Acme Corp software invoice"
git push

3. v3 auto-links based on date/amount

When you create this transaction:

2025-01-15 * "Acme Corp - Annual software license"
  Expenses:Business:Software  1200.00 USD
  Liabilities:CreditCard     -1200.00 USD

v3 automatically finds 2025-01-15-invoice-acme-software.pdf and links it!

The Magic: Fuzzy Matching

v3’s algorithm:

  1. Date matching: Documents with dates matching transaction dates
  2. Amount matching (if PDF parseable): Links documents with matching amounts
  3. Filename keywords: Matches keywords in narration to filename

Example:

Transaction:

2025-02-20 * "Office Depot - Printer paper and supplies"
  Expenses:Business:Office  85.43 USD
  Liabilities:CreditCard   -85.43 USD

Document: 2025-02-20-receipt-office-depot.pdf

v3 matches based on:

  • ✅ Date: 2025-02-20
  • ✅ Keyword: “Office Depot” in both
  • ✅ Amount: $85.43 (if parseable from PDF)

Result: Instant link
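I don't know v3's internals, so treat this as a rough Python sketch of how I picture the three signals combining. The function names and the one-point-per-signal scoring are my own assumptions, not v3's actual code, and pdf_text stands for whatever text was already extracted from the PDF:

```python
import re
from datetime import date
from decimal import Decimal

DATE_PREFIX = re.compile(r"^(\d{4})-(\d{2})-(\d{2})")


def filename_date(filename):
    """Return the date encoded as a YYYY-MM-DD filename prefix, or None."""
    m = DATE_PREFIX.match(filename)
    if not m:
        return None
    try:
        return date(*map(int, m.groups()))
    except ValueError:  # e.g. month 13
        return None


def keywords(text):
    """Lowercase alphabetic words of three or more letters."""
    return set(re.findall(r"[a-z]{3,}", text.lower()))


def match_score(txn_date, narration, amount, filename, pdf_text=""):
    """Hypothetical score: one point for each signal that agrees."""
    score = 0
    if filename_date(filename) == txn_date:
        score += 1  # signal 1: date in filename equals transaction date
    stem = filename.rsplit(".", 1)[0]
    if keywords(narration) & keywords(stem):
        score += 1  # signal 2: narration and filename share a word
    if pdf_text and f"{amount:.2f}" in pdf_text:
        score += 1  # signal 3: amount appears in the extracted PDF text
    return score


print(match_score(
    date(2025, 2, 20),
    "Office Depot - Printer paper and supplies",
    Decimal("85.43"),
    "2025-02-20-receipt-office-depot.pdf",
    pdf_text="TOTAL $85.43",
))  # 3
```

The amount check is a plain substring test, so a real matcher would need something stricter (85.43 would also be found inside 185.43).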

My New Workflow

Before (30 minutes per receipt):

  1. Receive invoice by email
  2. Download PDF
  3. Rename file manually: invoice-vendor-date-amount.pdf
  4. Move to organized folder structure
  5. Update spreadsheet tracking invoices
  6. Enter transaction in beancount
  7. Manually note where receipt is stored

After (2 minutes per receipt):

  1. Receive invoice by email
  2. Save to documents/ folder (any filename works, but I use date prefix for sorting)
  3. git add documents/2025-*.pdf && git commit -m "Add receipts" && git push
  4. Enter transaction in beancount (v3 web UI)
  5. v3 automatically links document → DONE

Time saved: 93%

Real Example: Tax Season

February 2025: Time to file 2024 taxes

The old way:

  • Spend 2 weeks gathering receipts
  • Search email for “invoice” (2,000 results)
  • Dig through Downloads folder
  • Call vendors to re-send invoices
  • Still missing 20% of receipts

With v3:

# Generate expense report with all linked documents
bean-query business.beancount "
  SELECT
    date,
    narration,
    cost(position) as amount,
    account
  WHERE account ~ 'Expenses:Business'
    AND year = 2024
  ORDER BY date
"

Every transaction has a linked PDF: click to view it immediately in the v3 UI.

Tax prep time: 4 hours (vs 2 weeks)

Document Linking in v3 Web UI

When you view a transaction in the v3 web interface:

Transaction view:

2025-01-15 * "Acme Corp - Annual software license"
  Expenses:Business:Software  1200.00 USD
  Liabilities:CreditCard     -1200.00 USD

📎 Linked Documents (1):
   📄 2025-01-15-invoice-acme-software.pdf
      [View] [Download]

Click “View” → PDF opens in browser

No more searching!

Supported File Formats

v3 links these document types:

  • PDFs: Invoices, receipts, contracts
  • Images: JPG, PNG (phone photos of receipts)
  • Office docs: DOCX, XLSX (contracts, expense reports)
  • Text: TXT, MD (notes, documentation)

My workflow for paper receipts:

  1. Take photo with phone
  2. Save to synced folder (iCloud/Dropbox → documents/)
  3. Auto-commits to Git when synced
  4. v3 auto-links

Advanced: Manual Linking

v3 auto-links most documents, but you can also manually specify document links in beancount syntax:

2025-03-15 * "ClientABC - Website redesign contract" ^contract-abc-2025.pdf
  Assets:Receivable:ClientABC  15000.00 USD
  Income:Projects:ClientABC   -15000.00 USD

The ^contract-abc-2025.pdf part uses Beancount's standard ^link syntax; v3 reads the link name as a document filename and attaches that file to the transaction.

Use manual linking when:

  • Document name doesn’t match transaction date
  • Multiple documents for one transaction
  • Want to ensure specific document is linked

Version Control for Documents

Since documents live in Git:

Every change is tracked:

git log documents/2025-01-15-invoice-acme-software.pdf

See who changed what:

commit abc123
Author: patricia
Date:   2025-01-15

    Add Acme Corp invoice

commit def456
Author: patricia
Date:   2025-01-16

    Replace invoice with corrected version

Audit trail built-in!

Cloud Storage Integration

v3 works great with cloud-synced Git repos:

My setup:

~/Dropbox/my-finances/  ← Dropbox syncs
└── (Git repo)
    ├── business.beancount
    └── documents/

Workflow:

  1. Receive invoice on phone
  2. Save to Dropbox/my-finances/documents/ from phone
  3. Dropbox syncs to Mac
  4. Automatic Git commit (via cron script)
  5. v3 auto-links

Phone → v3 linked in 30 seconds

Benefits Over Traditional Filing

Old system (physical files):

  • Filing cabinet with folders
  • “Where did I file this?”
  • Can’t search
  • No backup
  • Lost in office fire/flood

Old system (digital folders):

  • Nested folder structure
  • “Was this in 2024/Q1/Software or Vendors/Acme?”
  • Search by filename only
  • No connection to accounting records

v3 Git + Document Linking:

  • ✅ Flat structure (just /documents/)
  • ✅ Connected to transactions
  • ✅ Full-text search (if the PDFs have a text layer)
  • ✅ Version control
  • ✅ Cloud backup
  • ✅ Accessible from v3 web UI anywhere

Compliance Benefits

For my business, I need to:

  • Keep receipts for up to 7 years (the IRS generally asks for 3, longer in some cases)
  • Provide receipts on demand (audits)
  • Track expense categories (tax deductions)

v3 makes this trivial:

IRS audit request: “Provide all software expense receipts for 2024”

bean-query business.beancount "
  SELECT date, narration, cost(position), account
  WHERE account ~ 'Expenses:Business:Software'
    AND year = 2024
"

Result: List of all software expenses with linked PDFs

Click “Download All Documents” → Zip file with all receipts

Audit response time: 5 minutes (vs 2 days)
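If you ever need the same bundle outside the web UI, it is easy to build locally. This is a sketch of an offline equivalent, not how v3's "Download All Documents" button works; bundle_documents and its arguments are hypothetical names, and it assumes you already have the list of document paths relative to the repo root:

```python
import zipfile
from pathlib import Path


def bundle_documents(repo_root, relative_paths, zip_path):
    """Write the listed documents into one zip file.

    Returns the relative paths that were not found, so missing
    receipts are visible before the bundle goes to an auditor.
    """
    repo_root = Path(repo_root)
    missing = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for rel in relative_paths:
            source = repo_root / rel
            if source.is_file():
                # Keep the repo-relative path inside the archive.
                archive.write(source, arcname=rel)
            else:
                missing.append(rel)
    return missing
```

For example, bundle_documents("my-finances", ["documents/2025-01-15-invoice-acme-software.pdf"], "audit-2024.zip") writes the zip and returns an empty list when every file exists.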

My Specific Use Cases

1. Client invoices (income)

2025-03-15 * "Invoice ClientABC - March deliverables" ^invoice-abc-2025-03.pdf
  Assets:Receivable:ClientABC  5000.00 USD
  Income:Projects:ClientABC   -5000.00 USD

Link my own invoice (outgoing)

2. Vendor invoices (expenses)

2025-03-20 * "Acme Hosting - Q1 2025 hosting bill" ^invoice-acme-q1.pdf
  Expenses:Business:Hosting  300.00 USD
  Liabilities:CreditCard    -300.00 USD

Link vendor invoice (incoming)

3. Contracts

2025-01-10 * "ClientXYZ - Annual retainer contract" ^contract-xyz-2025.pdf
  Assets:Receivable:ClientXYZ  60000.00 USD
  Income:Projects:ClientXYZ   -60000.00 USD

Kevin, excellent questions! Let me answer all of them based on my experience with v3 document linking.

File Format Support

1. Scanned documents (TIFF, multi-page PDF):
✅ YES - v3 links multi-page scans perfectly. I regularly scan 10-20 page vendor contracts as a single PDF. Works great.

2. Email (.eml) files:
✅ YES - v3 treats .eml like any other document. I save important email threads and link them:

2025-03-10 * "ClientABC contract negotiation" ^email-thread-abc.eml

3. Spreadsheets (.xlsx, Google Sheets exports):
✅ YES - I link quarterly reports all the time:

2025-01-31 * "Q4 2024 vendor report" ^vendor-report-q4.xlsx

4. Compressed archives (.zip):
⚠️ PARTIAL - v3 will link the .zip file, but it can’t peek inside. I recommend extracting first, then linking the individual files. Or link the zip if you want to preserve the bundle.

Repository Structure

Flat vs Nested:

v3 auto-linking works with nested directories recursively. I use year-based organization (not flat):

My structure:

documents/
├── 2024/
│   ├── 2024-01-15-invoice-acme-software.pdf
│   ├── 2024-02-03-receipt-office-depot.pdf
│   └── ... (500 files)
├── 2025/
│   ├── 2025-01-10-invoice-dell.pdf
│   ├── 2025-01-20-contract-client-abc.pdf
│   └── ... (growing)
└── contracts/
    ├── client-abc-2025.pdf
    └── vendor-xyz-nda.pdf

v3 searches ALL subdirectories when auto-linking. Works perfectly!

Why I chose year-based over flat:

  • 500 files in one folder = hard to browse
  • Year folders = natural archive (move old years to cold storage)
  • Still get auto-linking (v3 recursive search)
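For anyone curious what a recursive search over nested folders looks like, here is a small Python sketch that indexes every file under documents/ by its date prefix. It illustrates the idea only and is not v3's code; index_by_date and the "undated" bucket are my own inventions:

```python
import re
from collections import defaultdict
from pathlib import Path

DATE_PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2}")


def index_by_date(documents_dir):
    """Map 'YYYY-MM-DD' to a sorted list of paths relative to documents_dir.

    Files without a date prefix (e.g. contracts/client-abc-2025.pdf)
    are collected under the key 'undated'.
    """
    root = Path(documents_dir)
    index = defaultdict(list)
    for path in root.rglob("*"):  # walks every subdirectory
        if not path.is_file():
            continue
        m = DATE_PREFIX.match(path.name)
        key = m.group(0) if m else "undated"
        index[key].append(path.relative_to(root).as_posix())
    return {key: sorted(paths) for key, paths in index.items()}
```

Because the lookup key comes from the filename and not the folder, it makes no difference whether a file sits in 2024/, vendors/acme-corp/ or directly in documents/.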

Vendor-based folders also work:

documents/
├── vendors/
│   ├── acme-corp/
│   ├── office-depot/
├── clients/
│   ├── client-abc/

v3 finds documents regardless of structure. Use whatever makes sense for you!

Cloud Storage Integration

1. Google Drive sync:
✅ YES - I use Google Drive! Here’s my setup:

~/Google Drive/MyFinances/  ← Google Drive syncs this folder
└── (Git repo)
    ├── business.beancount
    └── documents/

Workflow:

  1. Save receipt to Google Drive (phone or desktop)
  2. Google Drive syncs to Mac
  3. Git auto-commits (cron job)
  4. v3 auto-links

Works perfectly!

2. Sync conflicts on binary PDFs:

Good question! Git doesn’t merge binary files, so if you and your accountant both add different PDFs under the same name, you get a conflict.

Solution: Use different naming:

  • Me: 2025-03-15-invoice-acme-corp.pdf
  • Accountant: 2025-03-15-tax-forms.pdf

Different filenames = no conflict.

If you both change the SAME PDF (for example, each of you replaces it with a different version), Git reports a conflict:

$ git pull
CONFLICT (content): Merge conflict in documents/receipt.pdf

Resolution: Pick one version:

# Keep mine
git checkout --ours documents/receipt.pdf

# Keep theirs
git checkout --theirs documents/receipt.pdf

git add documents/receipt.pdf
git commit

In practice: Rare issue. We name files differently.

3. Mobile workflow with Google Drive:

iPhone → Google Drive → Git:

  1. Take photo with Google Drive app
  2. Save to Google Drive/MyFinances/documents/2025/
  3. Google Drive syncs to Mac
  4. Cron job commits to Git
  5. v3 auto-links

Total time: 30 seconds

Alternative (my preferred):

  • Use Adobe Scan app (better OCR)
  • Scan to PDF
  • Save to Google Drive
  • Same flow from there

Auto-Linking Algorithm Details

1. Multiple transactions same day, same vendor:

v3 matches by amount parsing + filename keywords.

Example:

2025-03-15 * "Office Depot - Printer ink"
  Expenses:Supplies  45.00 USD
  ...

2025-03-15 * "Office Depot - Paper restock"
  Expenses:Supplies  30.00 USD
  ...

Documents:

  • 2025-03-15-receipt-office-depot-45.00.pdf (amount in filename)
  • 2025-03-15-receipt-office-depot-30.00.pdf

OR:

  • 2025-03-15-receipt-office-depot-ink.pdf (keyword “ink”)
  • 2025-03-15-receipt-office-depot-paper.pdf (keyword “paper”)

v3 matches:

  • First transaction → “ink” keyword OR “$45.00” amount
  • Second transaction → “paper” keyword OR “$30.00” amount

If ambiguous (no keywords/amounts), v3 shows both as candidates in web UI. You manually pick correct one.
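Here is how I'd sketch that tie-breaking in Python. Again, this is my guess at the logic and not v3's implementation; pick_document is a made-up name, and returning None stands in for "show both candidates in the web UI":

```python
import re


def words(text):
    """Lowercase words of three or more letters."""
    return set(re.findall(r"[a-z]{3,}", text.lower()))


def pick_document(narration, amount, candidates):
    """Pick one same-day candidate filename, or None if still ambiguous."""
    if len(candidates) == 1:
        return candidates[0]
    # 1) Amount written into the filename, e.g. ...-45.00.pdf
    amount_text = f"{amount:.2f}"
    by_amount = [c for c in candidates if amount_text in c]
    if len(by_amount) == 1:
        return by_amount[0]
    # 2) Keywords: words shared by every candidate (vendor name, "receipt")
    #    cannot tell them apart, so only the remaining words are compared.
    stems = {c: words(c.rsplit(".", 1)[0]) for c in candidates}
    common = set.intersection(*stems.values()) if stems else set()
    wanted = words(narration)
    by_keyword = [c for c, w in stems.items() if (w - common) & wanted]
    return by_keyword[0] if len(by_keyword) == 1 else None
```

With the Office Depot files above, "Printer ink" resolves to the -ink.pdf (or -45.00.pdf) file and "Paper restock" to the other one; two files named only by date and vendor come back as None.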

2. Transactions without exact date match:

Invoice date vs payment date:

2025-03-20 * "Acme Corp - March hosting"
  Expenses:Hosting  100.00 USD
  ...

Document: 2025-03-01-invoice-acme-march-hosting.pdf

v3 matches based on:

  • ✅ Keywords: “Acme”, “March”, “hosting”
  • ✅ Amount: $100.00 (if parseable from PDF)
  • ⚠️ Date: 19-day difference

v3 allows fuzzy date matching (±30 days). It will link, but with a lower confidence score.
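I don't know the exact formula v3 uses for that confidence score. One simple way to model it is a linear falloff inside the window, so the sketch below is an assumption for illustration only (the function name and the linear shape are mine):

```python
from datetime import date

WINDOW_DAYS = 30  # the ±30-day tolerance mentioned above


def date_confidence(txn_date, doc_date, window=WINDOW_DAYS):
    """Return 1.0 for an exact date match, falling linearly to 0.0 at the
    edge of the window. Returns None when the document is outside it."""
    gap = abs((txn_date - doc_date).days)
    if gap > window:
        return None
    return 1.0 - gap / window


# Acme example: invoice dated 2025-03-01, paid 2025-03-20 (19 days apart).
print(round(date_confidence(date(2025, 3, 20), date(2025, 3, 1)), 2))  # 0.37
```

Under this model the Acme invoice still links, but with a score well below a same-day match, which is why the keyword and amount signals matter here.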

Better approach: Include both dates in filename:
2025-03-01-invoice-2025-03-20-payment-acme.pdf

Or manually link:

2025-03-20 * "Acme Corp - March hosting" ^2025-03-01-invoice-acme.pdf

3. Recurring transactions:

Monthly subscription (same vendor, same amount):

2025-01-01 * "Acme SaaS - Monthly subscription"
2025-02-01 * "Acme SaaS - Monthly subscription"
2025-03-01 * "Acme SaaS - Monthly subscription"

Option A: Link contract to all (manual):

2025-01-01 * "Acme SaaS - Monthly subscription" ^contract-acme-saas.pdf
2025-02-01 * "Acme SaaS - Monthly subscription" ^contract-acme-saas.pdf
2025-03-01 * "Acme SaaS - Monthly subscription" ^contract-acme-saas.pdf

Option B: Link monthly invoices:

documents/
├── 2025-01-acme-saas-invoice.pdf
├── 2025-02-acme-saas-invoice.pdf
└── 2025-03-acme-saas-invoice.pdf

v3 auto-links by month (date matching).

I use Option A for contracts (reference), Option B for monthly invoices.
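For Option B, matching by month comes down to comparing the YYYY-MM prefix of the filename with the transaction date. A minimal sketch of that comparison, assuming the naming scheme shown above (invoices_for_month is a hypothetical helper, not a v3 function):

```python
import re

# 'YYYY-MM' at the start of the name, not followed by another digit.
MONTH_PREFIX = re.compile(r"^(\d{4})-(\d{2})(?!\d)")


def invoices_for_month(txn_date, filenames):
    """Return the filenames whose YYYY-MM prefix equals the month of
    txn_date, given as an ISO 'YYYY-MM-DD' string."""
    wanted = txn_date[:7]
    matches = []
    for name in filenames:
        m = MONTH_PREFIX.match(name)
        if m and f"{m.group(1)}-{m.group(2)}" == wanted:
            matches.append(name)
    return matches


files = [
    "2025-01-acme-saas-invoice.pdf",
    "2025-02-acme-saas-invoice.pdf",
    "2025-03-acme-saas-invoice.pdf",
]
print(invoices_for_month("2025-02-01", files))  # ['2025-02-acme-saas-invoice.pdf']
```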

Manual Linking Syntax

1. Multiple documents:

2025-05-01 * "Dell - Laptop" ^invoice.pdf ^warranty.pdf ^receipt.pdf

✅ No limit! I’ve linked 10+ documents to complex transactions.

2. Relative paths:

^vendors/dell/invoice-2025-05-01.pdf

✅ Relative paths work! Relative to repo root.

3. Spaces in filenames:

^my invoice from vendor.pdf

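⚠️ Spaces do NOT work in a link: a Beancount link ends at the first whitespace, so only ^my is read as the link and the rest of the line is a syntax error. Rename the file with dashes/underscores, which is also more shell-friendly: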

^my-invoice-from-vendor.pdf
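A tiny Python helper can do that renaming for you. Beancount link names may only contain letters, digits, '-', '_', '/' and '.', so this sketch replaces whitespace with dashes and drops everything else; link_safe_name is my own name for it:

```python
import re

# Characters Beancount accepts after the '^' of a link.
LINK_SAFE = re.compile(r"^[A-Za-z0-9\-_/.]+$")


def link_safe_name(filename):
    """Turn whitespace runs into '-' and drop characters a link cannot hold."""
    name = re.sub(r"\s+", "-", filename.strip())
    return re.sub(r"[^A-Za-z0-9\-_/.]", "", name)


print(link_safe_name("my invoice from vendor.pdf"))  # my-invoice-from-vendor.pdf
```

Rename the file on disk to the returned name before committing, then use the same name after the ^ in the transaction.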

PDF Parsing

1. Scanned PDFs (no OCR text layer):

v3 can’t parse amount from image-only PDFs, but:

  • ✅ Still auto-links by date + filename keywords
  • ⚠️ No amount matching (can’t read text from image)

Recommendation: Use scanner with OCR (Adobe Scan, etc.) to embed text layer.

2. Password-protected PDFs:

❌ v3 can’t parse password-protected PDFs for amounts.

Workaround:

  • Unlock PDF before saving
  • Or manually link (don’t rely on amount parsing)
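To show why the text layer matters, here is a sketch of amount matching on text that has already been extracted from a PDF. The extraction step itself is out of scope, and the regex and function names are my assumptions, not v3's parser. An image-only or locked PDF yields empty text, so nothing can match:

```python
import re
from decimal import Decimal

# Dollars-and-cents figures such as $85.43, 1,200.00 or 1200.00.
AMOUNT = re.compile(r"(?<![\d.,])\$?\s?(\d{1,3}(?:,\d{3})*|\d+)\.(\d{2})(?!\d)")


def amounts_in_text(text):
    """Return every dollars-and-cents figure found in text as a Decimal."""
    return [
        Decimal(whole.replace(",", "") + "." + cents)
        for whole, cents in AMOUNT.findall(text)
    ]


def has_amount(text, amount):
    """True if the given amount appears somewhere in the text."""
    return Decimal(str(amount)) in amounts_in_text(text)


invoice_text = "Subtotal $1,150.00 Tax $50.00 Total $1,200.00"
print(has_amount(invoice_text, "1200.00"))  # True
print(has_amount("", "1200.00"))            # False (no text layer)
```

The pattern is deliberately simple and will also pick up things like version numbers with two decimals, so treat a hit as a hint and not as proof.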

Git Commit Automation Script

My exact script:

#!/bin/bash
# ~/bin/auto-commit-documents.sh

REPO_PATH="$HOME/Google Drive/MyFinances"
cd "$REPO_PATH" || exit 1

# Check for new documents
if [ -n "$(git status --porcelain documents/)" ]