Smart Defaults: Training Your Importers to Categorize Like You Would

After two years of refining my Beancount importers, I’ve finally achieved what I call “90/10 automation” - 90% of my transactions auto-categorize correctly, and I only manually review 10%. Here’s how I built a system that learns from my historical decisions.

The Problem: Starting from Zero

When I first started with Beancount, every transaction needed manual categorization:

2025-01-15 * "AMAZON" ""
  Assets:Bank:Checking  -47.23 USD
  Expenses:Uncategorized   ; What category?

I was spending 30+ minutes per month just categorizing transactions. With 100+ transactions monthly, that’s a lot of tedious work.

The Solution: Layered Categorization

I now use a three-layer approach that progressively handles transactions:

Layer 1: Deterministic Rules (The Foundation)

For payees I ALWAYS know the category, I use explicit rules:

# In my importer config
DETERMINISTIC_RULES = {
    # Exact matches - these never change
    "SPOTIFY": "Expenses:Subscriptions:Music",
    "NETFLIX": "Expenses:Subscriptions:Streaming",
    "VERIZON WIRELESS": "Expenses:Phone",
    "COMCAST": "Expenses:Internet",
    "GEICO": "Expenses:Insurance:Auto",

    # Rent and mortgage - critical to get right
    "AVALON APARTMENTS": "Expenses:Housing:Rent",

    # Payroll - always the same
    "ACME CORP PAYROLL": "Income:Salary",
}

def categorize_deterministic(payee):
    """Apply deterministic rules first (case-insensitive substring match)."""
    payee_upper = payee.upper()
    for pattern, account in DETERMINISTIC_RULES.items():
        # Substring test, so "SPOTIFY" also matches "SPOTIFY USA 8772"
        if pattern in payee_upper:
            return account
    return None

These rules are “set in stone” - they override everything else because I know they’re 100% correct.

Layer 2: Pattern Matching (The Workhorse)

For categories with variation, I use regex patterns:

import re

PATTERN_RULES = [
    # Groceries - multiple stores
    (r'WHOLE\s*FOODS|TRADER\s*JOE|SAFEWAY|KROGER|PUBLIX',
     'Expenses:Food:Groceries'),

    # Gas stations
    (r'SHELL|CHEVRON|EXXON|BP\s|MOBIL|COSTCO\s*GAS',
     'Expenses:Auto:Gas'),

    # Restaurants - harder to enumerate
    (r'DOORDASH|UBER\s*EATS|GRUBHUB|POSTMATES',
     'Expenses:Food:Delivery'),

    # General restaurant pattern
    (r'CAFE|BISTRO|GRILL|PIZZA|SUSHI|BURGER|TACO|BBQ',
     'Expenses:Food:Restaurant'),

    # Amazon is tricky - could be anything
    # Leave for ML layer
]

def categorize_pattern(payee):
    """Apply regex pattern matching."""
    for pattern, account in PATTERN_RULES:
        if re.search(pattern, payee, re.IGNORECASE):
            return account
    return None
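One caveat with loose alternations like these: they match substrings, so SHELL also fires on "SHELLFISH HOUSE", and BP\s misses a payee that is just "BP". Here is a minimal sketch of a stricter variant, assuming word boundaries suit your bank's payee format. STRICT_PATTERN_RULES and categorize_pattern_strict are illustrative names, not part of the config above.

```python
import re

# Illustrative stricter variants of the gas and restaurant patterns.
# \b anchors each alternative to word boundaries; patterns are compiled once.
STRICT_PATTERN_RULES = [
    (re.compile(r'\b(SHELL|CHEVRON|EXXON|EXXONMOBIL|BP|MOBIL)\b', re.IGNORECASE),
     'Expenses:Auto:Gas'),
    (re.compile(r'\b(CAFE|BISTRO|GRILL|PIZZA|SUSHI|BURGER|TACO|BBQ)\b', re.IGNORECASE),
     'Expenses:Food:Restaurant'),
]

def categorize_pattern_strict(payee):
    """Return the first account whose whole-word pattern matches, else None."""
    for regex, account in STRICT_PATTERN_RULES:
        if regex.search(payee):
            return account
    return None

print(categorize_pattern_strict("BP"))               # matched even without a trailing space
print(categorize_pattern_strict("SHELLFISH HOUSE"))  # None: no longer mistaken for gas
```

The trade-off is that plurals and run-together names (TACOS, EXXONMOBIL) only match if you list them explicitly, which is why EXXONMOBIL appears as its own alternative.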

Layer 3: Machine Learning (The Smart Layer)

For everything else, I use smart_importer to learn from my historical categorizations:

from smart_importer import PredictPostings, PredictPayees

class MyBankImporter:
    # ... importer setup ...
    pass

# Wrap with ML decorators
# (as far as I know, newer smart_importer releases document hooks via
# apply_hooks() rather than class decorators, so check which style
# your installed version expects)
@PredictPostings()
@PredictPayees()
class SmartBankImporter(MyBankImporter):
    pass

The smart_importer learns from your existing ledger. Key insight: it only needs 2-3 examples per category to start making decent predictions.

Training Your ML Model

The Bootstrap Problem

When you’re starting fresh, you have no training data. Here’s my approach:

Week 1-4: Manual categorization

  • Categorize everything by hand
  • Be consistent with your account names
  • This builds your training set

Month 2+: Enable ML

  • Turn on smart_importer
  • It learns from your month of data
  • Review predictions and correct mistakes

Month 3+: Refinement

  • Add deterministic rules for 100% certain categories
  • ML handles the ambiguous cases
  • Accuracy improves as data grows

Minimum Viable Training Data

; You need at least 2-3 examples per category
; Example: To categorize "AMAZON" correctly, you need:

2024-01-15 * "AMAZON" "Household supplies"
  Expenses:Shopping:Online  45.00 USD
  Liabilities:CreditCard

2024-02-20 * "AMAZON" "Books"
  Expenses:Shopping:Online  23.99 USD
  Liabilities:CreditCard

; Now smart_importer can predict future AMAZON transactions

My Complete Importer Stack

Here’s how I combine all three layers:

from beancount.ingest import importer
from smart_importer import PredictPostings

class BankImporter(importer.ImporterProtocol):

    def extract(self, file, existing_entries):
        transactions = self.parse_csv(file)
        entries = []

        for txn in transactions:
            # Layer 1: Deterministic rules
            account = categorize_deterministic(txn['payee'])

            # Layer 2: Pattern matching
            if account is None:
                account = categorize_pattern(txn['payee'])

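            # Layer 3: Fall back to a placeholder account.
            # Caveat: as far as I can tell, smart_importer only fills in
            # transactions that have a single posting, so for ML to take
            # over, create_transaction() needs to emit just the bank leg here.
            if account is None:
                account = 'Expenses:Uncategorized'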

            entry = self.create_transaction(txn, account)
            entries.append(entry)

        return entries

# Wrap with ML prediction
@PredictPostings()
class SmartBankImporter(BankImporter):
    pass

Measuring Success

I track my categorization accuracy monthly:

2025-01-31 custom "import-metrics" "January"
  transactions-imported: 127
  auto-categorized-correct: 114     ; 89.8%
  auto-categorized-wrong: 8
  manual-categorization: 5
  accuracy-rate: "89.8%"            ; 114 / 127, quoted so the value parses as a string

My target is 90%+ accuracy. When I dip below, I review what’s failing and add rules.
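The accuracy figure is just correct predictions divided by everything imported. A minimal sketch of that check, using the January counts above; accuracy_rate and TARGET are illustrative names:

```python
def accuracy_rate(correct, wrong, manual):
    """Share of imported transactions that were auto-categorized correctly."""
    total = correct + wrong + manual
    if total == 0:
        return 0.0
    return correct / total

TARGET = 0.90  # the 90% goal

rate = accuracy_rate(correct=114, wrong=8, manual=5)
print(f"{rate:.1%}")  # 89.8%
if rate < TARGET:
    print("Below target: review failures and add rules")
```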

Common Gotchas

Gotcha 1: Payee Name Variations

Banks show the same merchant differently:

AMAZON.COM*123ABC
AMZN MKTP US*456
AMAZON PRIME*789

Solution: Normalize payees before matching:

def normalize_payee(payee):
    """Standardize payee names."""
    payee = payee.upper()
    # Remove transaction IDs
    payee = re.sub(r'\*[A-Z0-9]+$', '', payee)
    # Common variations
    payee = re.sub(r'AMZN|AMAZON\.COM', 'AMAZON', payee)
    return payee.strip()
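Running the three example strings through this function shows that it cleans them up but does not collapse them to one merchant. The sketch below repeats the function so it runs on its own and adds a crude first-word key. merchant_key is a hypothetical helper, and taking the first word will be wrong for multi-word merchants such as WHOLE FOODS.

```python
import re

def normalize_payee(payee):
    """Standardize payee names (same logic as above)."""
    payee = payee.upper()
    payee = re.sub(r'\*[A-Z0-9]+$', '', payee)            # drop trailing transaction ID
    payee = re.sub(r'AMZN|AMAZON\.COM', 'AMAZON', payee)  # common variations
    return payee.strip()

def merchant_key(payee):
    """Hypothetical helper: first word of the normalized payee."""
    parts = normalize_payee(payee).split()
    return parts[0] if parts else ""

for raw in ["AMAZON.COM*123ABC", "AMZN MKTP US*456", "AMAZON PRIME*789"]:
    print(f"{raw!r} -> {normalize_payee(raw)!r} -> {merchant_key(raw)!r}")
# 'AMAZON.COM*123ABC' -> 'AMAZON' -> 'AMAZON'
# 'AMZN MKTP US*456' -> 'AMAZON MKTP US' -> 'AMAZON'
# 'AMAZON PRIME*789' -> 'AMAZON PRIME' -> 'AMAZON'
```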

Gotcha 2: Context Matters

“COSTCO” could be:

  • Groceries (food purchase)
  • Gas (Costco gas station)
  • Membership fee

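Smart_importer can handle this better than payee-only rules because it also learns from the narration and the day of the month. As far as I can tell, its default features don’t include the amount, so amount-based distinctions still need a rule.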

Gotcha 3: One-Time Expenses

ML struggles with one-off transactions. For big purchases, I just accept manual categorization.

The 90/10 Goal

My realistic targets:

Transaction Type     Handling              % of Total
Recurring bills      Deterministic rules   30%
Common merchants     Pattern matching      35%
Variable spending    ML prediction         25%
One-off/unusual      Manual review         10%

If you’re spending more than 10 minutes per month on categorization, you haven’t trained your importers enough.

Questions for Discussion

  1. What’s your current categorization accuracy? I’m curious if others are hitting 90%+

  2. Anyone using LLMs for categorization? I’ve seen that Beanborg uses ChatGPT as a fallback. Is it worth the API cost?

  3. How do you handle splits? When one transaction needs multiple categories (like Costco groceries + household goods), what’s your approach?

Would love to hear other strategies for reducing manual categorization work!

Fred, this is incredibly useful! As someone managing bookkeeping for 15+ small business clients, I’ve spent a lot of time thinking about categorization efficiency. Let me add a professional bookkeeper’s perspective.

The Multi-Client Challenge

When you’re managing multiple ledgers, you can’t hand-tune importers for each client. I needed a more scalable approach:

The Shared Rule Library

I maintain a master rule library that works across all clients:

# rules/shared_rules.py
UNIVERSAL_MERCHANT_RULES = {
    # These work for virtually everyone
    "PAYPAL": None,  # Requires inspection - could be anything
    "SQUARE": None,  # Same - payment processor

    # Universally consistent
    "USPS": "Expenses:Shipping",
    "UPS STORE": "Expenses:Shipping",
    "FEDEX": "Expenses:Shipping",

    # Office supplies
    "STAPLES": "Expenses:Office:Supplies",
    "OFFICE DEPOT": "Expenses:Office:Supplies",

    # Software subscriptions
    "ADOBE": "Expenses:Software:Subscriptions",
    "MICROSOFT": "Expenses:Software:Subscriptions",
    "GOOGLE\\s*\\*": "Expenses:Software:Cloud",
}

# Client-specific overrides
CLIENT_OVERRIDES = {
    "restaurant_abc": {
        "SYSCO": "Expenses:COGS:Food",  # Food service distributor
        "US FOODS": "Expenses:COGS:Food",
    },
    "consulting_xyz": {
        "ZOOM": "Expenses:Software:Communication",
        "CALENDLY": "Expenses:Software:Scheduling",
    }
}

The Categorization Hierarchy

def categorize_for_client(payee, client_id, amount):
    """Multi-client categorization with priority."""

    # 1. Client-specific rules first
    if client_id in CLIENT_OVERRIDES:
        result = match_rules(payee, CLIENT_OVERRIDES[client_id])
        if result:
            return result

    # 2. Universal rules
    result = match_rules(payee, UNIVERSAL_MERCHANT_RULES)
    if result:
        return result

    # 3. Amount-based heuristics
    if amount > 1000:
        return "Expenses:Large:ToReview"  # Flag for manual review

    # 4. Fall back to ML
    return None
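match_rules isn't shown above. Here is a minimal sketch of what it could look like, assuming every rule key is treated as a case-insensitive regex (one key, "GOOGLE\\s*\\*", already uses regex syntax, and plain names like USPS are valid regexes too). A None value means "inspect manually", so the caller's `if result:` check falls through to the next layer.

```python
import re

def match_rules(payee, rules):
    """Return the account of the first rule whose pattern occurs in payee.

    Returns None when nothing matches, or when the matching rule is
    deliberately mapped to None (payment processors such as PAYPAL).
    """
    for pattern, account in rules.items():
        if re.search(pattern, payee, re.IGNORECASE):
            return account
    return None

# Small excerpt of the shared rules, for demonstration only
rules = {
    "PAYPAL": None,
    "USPS": "Expenses:Shipping",
    "GOOGLE\\s*\\*": "Expenses:Software:Cloud",
}

print(match_rules("GOOGLE *CLOUD STORAGE", rules))  # Expenses:Software:Cloud
print(match_rules("PAYPAL *ETSY SELLER", rules))    # None
```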

Smart_Importer in Production

I’ve been using smart_importer for about 18 months now. Here’s what I’ve learned:

Training Data Quality Matters More Than Quantity

A common misconception: “I need thousands of transactions to train the model.”

Reality: Clean, consistent data beats volume. I’ve seen better results from 200 well-categorized transactions than 2000 sloppy ones.

The key is consistency in your account names:

; BAD - inconsistent naming
Expenses:Food
Expenses:Food:Groceries
Expenses:Groceries
Expenses:Food&Drink:Groceries

; GOOD - consistent hierarchy
Expenses:Food:Groceries
Expenses:Food:Restaurant
Expenses:Food:Coffee
Expenses:Food:Delivery

The “Confidence Threshold” Trick

Smart_importer can tell you how confident it is. I use this to flag low-confidence predictions:

from smart_importer import PredictPostings

# Keyword arguments have changed between smart_importer releases,
# so check these names against the version you have installed.
@PredictPostings(
    string_tokenizer=lambda s: s.lower().split(),
    predict_payees=False,
    overwrite_existing_entries=False,
)
class SmartImporter(MyImporter):
    pass

# Then in post-processing, I check prediction confidence
# Low confidence -> flag for review
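The post-processing step itself isn't shown. Here is a rough sketch of the flagging logic, assuming you already have a confidence score between 0 and 1 for each prediction. How you get that score out of the classifier is not covered here, and Prediction, REVIEW_THRESHOLD and triage are hypothetical names.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.75  # assumed cutoff; tune it against your own error log

@dataclass
class Prediction:
    payee: str
    account: str
    confidence: float  # assumed to be a score in [0, 1]

def triage(predictions, threshold=REVIEW_THRESHOLD):
    """Split predictions into (accepted, needs_review) by confidence."""
    accepted, needs_review = [], []
    for p in predictions:
        (accepted if p.confidence >= threshold else needs_review).append(p)
    return accepted, needs_review

preds = [
    Prediction("SAFEWAY", "Expenses:Food:Groceries", 0.93),
    Prediction("COSTCO", "Expenses:Food:Groceries", 0.55),
]
accepted, needs_review = triage(preds)
print([p.payee for p in needs_review])  # ['COSTCO']
```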

When ML Gets It Wrong

I keep a log of ML failures to improve my rules:

ML_FAILURES_LOG = [
    # Date, Payee, ML Predicted, Correct
    ("2025-01-15", "TARGET", "Expenses:Food:Groceries", "Expenses:Household"),
    ("2025-01-18", "COSTCO", "Expenses:Food:Groceries", "Expenses:Auto:Gas"),
]

# After accumulating failures, I add deterministic rules
# to prevent the same mistakes
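One way to turn that log into rule candidates is to count repeated corrections. This is a sketch only: rule_candidates is an illustrative name, and the third log row is made up so the example has a repeat to find.

```python
from collections import Counter

ML_FAILURES_LOG = [
    # Date, Payee, ML Predicted, Correct
    ("2025-01-15", "TARGET", "Expenses:Food:Groceries", "Expenses:Household"),
    ("2025-01-18", "COSTCO", "Expenses:Food:Groceries", "Expenses:Auto:Gas"),
    ("2025-02-02", "TARGET", "Expenses:Food:Groceries", "Expenses:Household"),  # made-up row
]

def rule_candidates(log, min_repeats=2):
    """Return (payee, correct_account) pairs mispredicted at least min_repeats times."""
    counts = Counter((payee, correct) for _date, payee, _predicted, correct in log)
    return [pair for pair, n in counts.items() if n >= min_repeats]

print(rule_candidates(ML_FAILURES_LOG))  # [('TARGET', 'Expenses:Household')]
```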

My Time Investment vs. Savings

Here’s the honest math for one client:

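Phase                 Hours Invested   Monthly Time Saved
Initial setup         4 hours          -
Rule writing          6 hours          -
ML training           2 hours          -
Total setup           12 hours         -
Ongoing maintenance   30 min/month     2.5 hours/month

Payback period: about 5 months (12 setup hours / 2.5 hours saved per month ≈ 4.8, treating the 2.5 hours as net of maintenance)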

After that, it’s pure time savings. For 15 clients, the math is even better since I reuse 70% of the rules.

The “Flag for Review” Pattern

Instead of wrong categorization, I prefer “uncertain” categorization:

; Instead of guessing wrong:
2025-01-15 * "AMBIGUOUS MERCHANT"
  Expenses:Miscellaneous    ; ML guessed, probably wrong

; I prefer:
2025-01-15 * "AMBIGUOUS MERCHANT" #needs-review
  Expenses:ToReview         ; Explicit flag

Then I have a query that finds all items needing review:

SELECT date, payee, narration, account, number
WHERE 'needs-review' IN tags

This way I never have to hunt for miscategorized transactions.

Question for Fred

You mentioned LLMs as a fallback. I’ve experimented with this and found the latency problematic - each API call adds 1-2 seconds per transaction. For a statement with 100 transactions, that’s 2-3 minutes of waiting.

Have you found a way to batch LLM calls? Or do you run them offline and cache the results?

This thread is hitting close to home. I’ve been using Beancount for 4+ years and my categorization approach has evolved significantly. Here’s what I’ve learned, including some mistakes you can avoid.

The Beginner’s Mistake: Over-Engineering

When I first heard about smart_importer, I went ALL IN. I configured complex ML pipelines, wrote hundreds of regex rules, built elaborate normalization functions…

Result: A fragile system that broke constantly and was hard to debug.

The Veteran’s Approach: Progressive Complexity

Now I follow a simpler philosophy: Start dumb, get smarter gradually.

Phase 1: Just Get Data In (Month 1-3)

When starting with a new account, I do zero categorization automation:

class DumbImporter(Importer):
    def extract(self, file, existing_entries):
        # Just import with a generic expense account
        # Don't try to be clever yet
        # (return a list rather than yielding: the extract tooling
        # expects a list of entries)
        return [
            self.make_transaction(txn, account='Expenses:Uncategorized')
            for txn in self.parse_file(file)
        ]

Why? Because I don’t know the patterns yet. I need to see what data actually looks like.

Phase 2: Obvious Rules Only (Month 4-6)

After 3 months, I have data. Now I add rules for things I’m 100% certain about:

CERTAIN_CATEGORIES = {
    # Monthly bills - same every time
    "SPOTIFY USA": "Expenses:Subscriptions:Music",
    "NETFLIX.COM": "Expenses:Subscriptions:Streaming",

    # My employer
    "ACME CORP DIRECT DEP": "Income:Salary",

    # My landlord
    "AVALON PROP MGMT": "Expenses:Housing:Rent",
}

That’s it. Maybe 10-15 rules. Everything else stays Uncategorized.

Phase 3: Enable ML (Month 7+)

Only after I have 6+ months of categorized data do I enable smart_importer:

@PredictPostings()
class SmartImporter(MyImporter):
    pass

By now, the ML has plenty of training data and my rules handle the easy cases.

The “Teach Twice” Rule

This is my personal rule: Before adding an ML training example, I should have categorized that pattern at least twice manually.

Why? Because I might have made a mistake the first time. If I see “COSTCO” and categorize it as groceries, but next month realize some COSTCO charges are gas… that’s a training error.

Two observations give me confidence. Then I add the rule or let ML learn from it.
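The rule can be checked mechanically. Here is a minimal sketch, assuming you keep a list of (payee, account) pairs from manual review; confirmed_patterns is an illustrative name.

```python
from collections import defaultdict

def confirmed_patterns(manual_history, min_seen=2):
    """Return payees categorized the same way at least min_seen times.

    manual_history is a list of (payee, account) pairs from manual review.
    Payees that were ever assigned more than one account are excluded.
    """
    seen = defaultdict(list)
    for payee, account in manual_history:
        seen[payee].append(account)
    return {
        payee: accounts[0]
        for payee, accounts in seen.items()
        if len(accounts) >= min_seen and len(set(accounts)) == 1
    }

history = [
    ("SPOTIFY", "Expenses:Subscriptions:Music"),
    ("SPOTIFY", "Expenses:Subscriptions:Music"),
    ("COSTCO", "Expenses:Food:Groceries"),
    ("COSTCO", "Expenses:Auto:Gas"),         # conflicting, so not confirmed
    ("NEW CAFE", "Expenses:Food:Restaurant"),  # seen only once
]
print(confirmed_patterns(history))  # {'SPOTIFY': 'Expenses:Subscriptions:Music'}
```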

When NOT to Use ML

Some categories shouldn’t be predicted:

  1. Transfers between accounts - Too easy to create fake transactions
  2. Tax-sensitive categories - Deductions need human verification
  3. Large amounts - Anything over $500 gets manual review
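
# Example set only; list your own tax-relevant accounts here
TAX_SENSITIVE = {"Expenses:Medical", "Expenses:Charity"}

def should_predict(transaction):
    """Decide if ML should attempt categorization.

    Assumes a parsed-row object with narration, amount and
    suspected_category attributes.
    """
    # Never auto-categorize transfers
    if 'transfer' in transaction.narration.lower():
        return False

    # Large transactions need human eyes
    if abs(transaction.amount) > 500:
        return False

    # Tax-relevant categories need verification
    if transaction.suspected_category in TAX_SENSITIVE:
        return False

    return True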

My Current Stats

After 4+ years of refinement:

Metric                       Value
Deterministic rules          45
Regex patterns               12
Monthly transactions         ~150
Auto-categorized correctly   88%
Needs review                 8%
Incorrectly categorized      4%

That 4% error rate sounds bad, but it’s usually small transactions under $20 that don’t materially affect my analysis.

The Debugging Mindset

When categorization fails, I ask:

  1. Is the payee name different than expected? Banks format names weirdly
  2. Is this a new merchant? ML can’t predict what it hasn’t seen
  3. Did I categorize inconsistently in the past? Garbage in, garbage out
  4. Is the amount unusual? Same merchant, different category (Costco grocery vs gas)

Most failures fall into category 1 or 2. Adding a normalization rule or waiting for more training data usually fixes it.

Encouragement for Newcomers

If you’re just starting with Beancount, don’t stress about automation. The categorization problem gets solved naturally over time:

  • Month 1-3: Manual is fine
  • Month 4-6: Add obvious rules
  • Month 7-12: Enable ML
  • Year 2+: Fine-tune as needed

The goal isn’t perfect automation. The goal is spending less time on tedious work so you can focus on actually understanding your finances.

Fred’s 90/10 target is realistic. I’d even say 85/15 is perfectly acceptable for personal finances.

Great thread! I wanted to add some practical tips from my experience managing multiple ledgers for personal and rental property finances.

The beancount-categorizer + smart_importer Combo

Fred mentioned the three-layer approach, and I want to emphasize how well beancount-categorizer works alongside smart_importer. Here’s my setup:

from beancount_categorizer import Categorizer
from smart_importer import PredictPostings

# Layer 1: Deterministic rules via categorizer
categorizer_rules = """
# Fixed bills - no guessing
SPOTIFY        -> Expenses:Subscriptions:Music
NETFLIX        -> Expenses:Subscriptions:Streaming
XFINITY        -> Expenses:Utilities:Internet
PG&E           -> Expenses:Utilities:Electric

# Landlord patterns (rental properties)
.* PROPERTY TAX -> Expenses:Property:Tax
.* HOA .*      -> Expenses:Property:HOA

# Regex patterns
(SAFEWAY|WHOLE FOODS|TRADER JOE) -> Expenses:Food:Groceries
(SHELL|CHEVRON|76 STATION)       -> Expenses:Auto:Gas
"""

@PredictPostings()  # Layer 2: ML for remaining
@Categorizer(rules=categorizer_rules)  # Layer 1: Rules first
class MyImporter(BaseImporter):
    pass

The key insight: order matters. The categorizer runs first and handles definite cases. Smart_importer only sees transactions that weren’t matched by rules.

Handling Multi-Category Merchants

Fred asked about splits. Here’s my approach for merchants like Costco that could be multiple categories:

Option 1: Amount-Based Heuristics

def categorize_costco(amount):
    """Costco-specific logic based on amount."""
    if amount < 50:
        return "Expenses:Auto:Gas"  # Likely gas fillup
    elif amount > 300:
        return "Expenses:Shopping:Bulk"  # Big shopping trip
    else:
        return "Expenses:Food:Groceries"  # Default assumption

This is crude but works 80% of the time.

Option 2: Leave for Manual Split

For complex transactions, I import without categorizing and split manually:

; Imported as:
2025-01-15 * "COSTCO"
  Assets:Bank:Checking    -247.83 USD
  Expenses:ToSplit

; Manually split to:
2025-01-15 * "COSTCO" "Monthly shopping"
  Assets:Bank:Checking    -247.83 USD
  Expenses:Food:Groceries    180.00 USD
  Expenses:Household          47.83 USD
  Expenses:Auto:Gas           20.00 USD

It’s extra work, but accurate tracking matters more than automation for me.
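When splitting by hand, the expense legs have to add up exactly to the amount that left the bank, otherwise Beancount will report the transaction as unbalanced. Here is a small pre-check sketch using Decimal to avoid float rounding; split_balances is an illustrative name.

```python
from decimal import Decimal

def split_balances(total, legs):
    """True if the expense legs add up exactly to the amount that left the bank."""
    return sum(legs, Decimal("0")) == abs(total)

total = Decimal("-247.83")  # the Assets:Bank:Checking posting
legs = [Decimal("180.00"), Decimal("47.83"), Decimal("20.00")]

print(split_balances(total, legs))      # True
print(split_balances(total, legs[:2]))  # False: 20.00 is unaccounted for
```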

Option 3: Separate Payment Methods

My actual solution: I use different payment methods for different Costco purchases:

  • Costco Visa card → In-store Costco purchases (I split later)
  • Debit card → Gas only

This way the account already tells me the category.

The Rental Property Challenge

For rental properties, categorization has tax implications. I’m extra careful:

# Rental-specific rules with explicit property tagging
RENTAL_RULES = {
    "HOME DEPOT": {
        "default": "Expenses:Property:Maintenance",
        "metadata": {"property": "123_MAIN_ST"}
    },
    "LOWES": {
        "default": "Expenses:Property:Maintenance",
        "metadata": {"property": "123_MAIN_ST"}
    },
}

def categorize_rental(txn, rules):
    """Add property metadata for tax tracking."""
    if txn.payee in rules:
        rule = rules[txn.payee]
        txn.account = rule["default"]
        txn.meta.update(rule.get("metadata", {}))
    return txn

I can’t afford ML errors on Schedule E deductions.

Training Data Audit

One thing I do annually: audit my training data for consistency.

from collections import defaultdict

def audit_categorizations(entries):
    """Find inconsistent categorizations."""
    payee_accounts = defaultdict(set)

    for entry in entries:
        if hasattr(entry, 'payee') and entry.payee:
            for posting in entry.postings:
                if posting.account.startswith('Expenses:'):
                    payee_accounts[entry.payee].add(posting.account)

    # Find payees with multiple expense categories
    inconsistent = {
        payee: accounts
        for payee, accounts in payee_accounts.items()
        if len(accounts) > 1
    }

    return inconsistent

# Output:
# COSTCO: {Expenses:Food:Groceries, Expenses:Auto:Gas}
# TARGET: {Expenses:Food:Groceries, Expenses:Household}

This shows me where my categorization has been inconsistent - exactly the data that confuses ML models.
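To see the shape of the result without loading a ledger, here is a self-contained run of the same audit logic. The namedtuples are stand-ins for Beancount's Transaction and Posting, and the sample entries are invented for the demo. Against a real ledger you would pass in the entries returned by Beancount's loader instead.

```python
from collections import defaultdict, namedtuple

# Stand-ins for Beancount's Transaction and Posting, just for this demo
Posting = namedtuple("Posting", "account")
Txn = namedtuple("Txn", "payee postings")

def audit_categorizations(entries):
    """Map each payee to its expense accounts, keeping only payees with more than one."""
    payee_accounts = defaultdict(set)
    for entry in entries:
        if getattr(entry, "payee", None):
            for posting in entry.postings:
                if posting.account.startswith("Expenses:"):
                    payee_accounts[entry.payee].add(posting.account)
    return {p: a for p, a in payee_accounts.items() if len(a) > 1}

entries = [
    Txn("COSTCO", [Posting("Liabilities:CreditCard"), Posting("Expenses:Food:Groceries")]),
    Txn("COSTCO", [Posting("Liabilities:CreditCard"), Posting("Expenses:Auto:Gas")]),
    Txn("SPOTIFY", [Posting("Liabilities:CreditCard"), Posting("Expenses:Subscriptions:Music")]),
]

for payee, accounts in sorted(audit_categorizations(entries).items()):
    print(f"{payee}: {sorted(accounts)}")
# COSTCO: ['Expenses:Auto:Gas', 'Expenses:Food:Groceries']
```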

My Accuracy Numbers

Broken down by account:

Source            Auto-Correct   Manual Review   Wrong
Credit Card       92%            5%              3%
Checking          85%            10%             5%
Rental Property   70%            25%             5%

The rental property numbers are lower because I intentionally flag more for manual review due to tax sensitivity.

One More Tip: The “New Merchant” Flag

I add a flag for merchants I haven’t seen before:

def known_payees(existing_entries):
    """Collect every payee already in the ledger (build this once per import)."""
    return {e.payee for e in existing_entries if getattr(e, 'payee', None)}

def is_new_merchant(payee, known):
    """Check if this is a payee we've never seen."""
    return payee not in known

# In importer:
known = known_payees(existing)
if is_new_merchant(txn.payee, known):
    txn.tags.add('new-merchant')

New merchants always need human review - ML literally can’t help because there’s no training data.

This has been an incredibly valuable thread. Let me consolidate what we’ve learned into a practical framework that anyone can implement.

The Smart Defaults Framework

Based on everyone’s input, here’s a complete implementation strategy:

Level 1: Deterministic Rules (Day 1)

Start here. These handle your most predictable transactions:

# rules.py - Your first categorization layer
DETERMINISTIC_RULES = {
    # Monthly bills (100% predictable)
    "SPOTIFY": "Expenses:Subscriptions:Music",
    "NETFLIX": "Expenses:Subscriptions:Streaming",
    "HULU": "Expenses:Subscriptions:Streaming",
    "VERIZON": "Expenses:Phone",

    # Income sources
    "DIRECT DEPOSIT": "Income:Salary",
    "INTEREST PAYMENT": "Income:Interest",

    # Transfers (important to catch; the right counter-account depends
    # on which account this importer reads, so adjust these two)
    "TRANSFER FROM": "Assets:Bank:Savings",
    "TRANSFER TO": "Assets:Bank:Checking",
}

def apply_deterministic(payee):
    payee_upper = payee.upper()
    for pattern, account in DETERMINISTIC_RULES.items():
        if pattern in payee_upper:
            return account
    return None

Level 2: Pattern Matching (Month 2)

Add after you see data patterns:

import re

PATTERN_RULES = [
    # Groceries
    (r'SAFEWAY|WHOLE\s*FOODS|TRADER|KROGER|PUBLIX|ALDI',
     'Expenses:Food:Groceries'),

    # Gas
    (r'SHELL|CHEVRON|EXXON|MOBIL|BP\s|COSTCO\s*GAS',
     'Expenses:Auto:Gas'),

    # Dining
    (r'DOORDASH|UBER\s*EATS|GRUBHUB',
     'Expenses:Food:Delivery'),

    # General restaurants (looser pattern)
    (r'CAFE|GRILL|PIZZA|SUSHI|THAI|CHINESE|BBQ',
     'Expenses:Food:Restaurant'),
]

def apply_patterns(payee):
    for pattern, account in PATTERN_RULES:
        if re.search(pattern, payee, re.IGNORECASE):
            return account
    return None

Level 3: ML Prediction (Month 6+)

Enable smart_importer after building training data:

from smart_importer import PredictPostings

@PredictPostings()
class SmartBankImporter(BaseBankImporter):
    """ML-augmented importer."""

    def extract(self, file, existing_entries):
        entries = []
        for txn in self.parse_file(file):
            # Try deterministic first
            account = apply_deterministic(txn['payee'])

            # Try patterns second
            if not account:
                account = apply_patterns(txn['payee'])

            # Default to Uncategorized (ML will predict)
            if not account:
                account = 'Expenses:Uncategorized'

            entry = self.make_transaction(txn, account)
            entries.append(entry)

        return entries

The Quality Metrics Dashboard

Track your categorization quality:

2025-01-31 custom "categorization-metrics" "January 2025"
  ; Layer 1: Deterministic rules
  rule-matched: 45
  rule-accuracy: "100%"

  ; Layer 2: Pattern matching
  pattern-matched: 35
  pattern-accuracy: "97%"

  ; Layer 3: ML prediction
  ml-predicted: 32
  ml-accuracy: "85%"

  ; Manual review required
  manual-required: 15

  ; Overall (derived from the layer counts above)
  total-transactions: 127
  auto-correct: "83%"        ; about (45 + 34 + 27) / 127
  review-required: "12%"     ; 15 / 127
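The overall figures follow from the per-layer counts: weight each layer's accuracy by how many transactions it matched, then divide by the total including manual items. Here is a minimal sketch with round illustrative numbers (not the January figures); overall_rates is a hypothetical helper.

```python
def overall_rates(layers, manual):
    """Return (auto_correct_rate, review_rate).

    layers: list of (matched_count, accuracy) per automated layer
    manual: number of transactions left for manual review
    """
    total = sum(count for count, _ in layers) + manual
    expected_correct = sum(count * accuracy for count, accuracy in layers)
    return expected_correct / total, manual / total

# Illustrative month: 50 rule matches at 100%, 30 ML predictions at 90%, 20 manual
auto_correct, review = overall_rates([(50, 1.00), (30, 0.90)], manual=20)
print(f"auto-correct: {auto_correct:.0%}, review-required: {review:.0%}")
# auto-correct: 77%, review-required: 20%
```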

Decision Tree for New Importers

START: Setting up new importer
│
├─ Do you have 6+ months of data for this account?
│  │
│  ├─ NO: Start with rules only
│  │  └─ Add deterministic rules for known merchants
│  │  └─ Use Expenses:Uncategorized for unknowns
│  │  └─ Wait until you have more data
│  │
│  └─ YES: Enable full stack
│     └─ Deterministic rules first
│     └─ Pattern matching second
│     └─ smart_importer for remaining
│
├─ Is this a tax-sensitive account?
│  └─ YES: Require higher confidence before auto-accepting
│  └─ Flag more transactions for review
│
└─ Are there multi-category merchants?
   └─ YES: Consider amount-based heuristics
   └─ OR: Use different payment methods
   └─ OR: Accept manual splitting

Key Takeaways from This Thread

Principle                Implementation
Start simple             Deterministic rules before ML
Build training data      6 months manual before automation
Layer your approach      Rules → Patterns → ML → Manual
Audit regularly          Check for inconsistent categorizations
Accept imperfection      90% auto + 10% manual is success
Flag uncertainty         Better to review than miscategorize
Separate tax-sensitive   Don’t rely on ML for deductions

The 90/10 Achievement Checklist

You’ve achieved smart defaults when:

  • Monthly bills auto-categorize with 100% accuracy
  • Known merchants categorize without review
  • New merchants are flagged for review
  • Less than 10% of transactions need manual work
  • Tax-sensitive transactions are always verified
  • Your monthly categorization time is under 15 minutes

What’s Next

I’m now exploring LLM integration for that remaining 10% - using Claude or GPT as a fallback for genuinely ambiguous transactions. But that’s a topic for another thread.

Thanks everyone for the excellent contributions. This discussion has improved my own setup significantly.