Mastering BQL: from basic SELECT to complex inventory aggregations

Introduction to Beancount Query Language (BQL)

After years of working with Beancount, I’ve come to appreciate BQL (Beancount Query Language) as one of its most powerful features. While it looks like SQL at first glance, it’s specifically designed for double-entry accounting and has some unique characteristics worth understanding deeply.

What Makes BQL Different from SQL?

BQL operates on a two-level filtering process that’s crucial to understand:

  1. FROM clause: Filters at the transaction level
  2. WHERE clause: Filters at the posting level (individual entries within transactions)

This distinction is important because in double-entry bookkeeping, every transaction has multiple postings. You might want to filter transactions by date, but then further filter which specific postings within those transactions you care about.

Real-World Query Examples

Example 1: Monthly Expense Analysis

SELECT 
  account,
  MONTH(date) as month,
  sum(position) as total
FROM OPEN ON 2024-01-01 CLOSE ON 2025-01-01
WHERE account ~ 'Expenses:'
GROUP BY account, month
ORDER BY month, total

This query:

  • Opens a period from Jan 1, 2024 to Jan 1, 2025 (FROM clause)
  • Filters only expense account postings (WHERE clause)
  • Groups by account and month
  • Returns monthly expense totals by category

Example 2: Holdings Report with Cost Basis

SELECT 
  account,
  sum(cost(position)) as cost_basis,
  sum(units(position)) as units,
  sum(convert(position, 'USD')) as market_value
WHERE account ~ 'Assets:Investments'
AND currency != 'USD'

Here we use three critical BQL functions:

  • cost(position): Returns the cost basis (what you paid)
  • units(position): Returns just the quantity of holdings
  • convert(position, 'USD'): Converts to current market value using price directives

Example 3: Payee Analysis

SELECT 
  payee,
  count(*) as num_transactions,
  sum(position) as total
WHERE account ~ 'Expenses:'
AND year = 2024
GROUP BY payee
ORDER BY total DESC
LIMIT 20

This shows your top 20 payees by expense amount—incredibly useful for budget analysis.

Advanced Aggregation and Inventory Handling

One of BQL’s most powerful features is sum(position), which doesn’t just add numbers—it maintains inventories. This means:

SELECT account, sum(position)
WHERE account ~ 'Assets:Investments:Vanguard'

Will return something like:

account                              sum_position
------------------------------------ ------------------
Assets:Investments:Vanguard          450.00 VTSAX {250.00 USD}
                                    +320.00 VTIAX {180.50 USD}

It preserves lot information, cost basis, and handles multiple currencies/commodities automatically. This is fundamentally different from SQL’s SUM(), which would just error on non-numeric data.

Integration with Fava

What makes BQL even more powerful is integration with Fava (the web interface). You can:

  1. Write BQL queries in Fava’s query interface
  2. Save them as custom reports
  3. Generate charts directly from query results
  4. Export to CSV for further analysis

In Fava, navigate to “Query” and run:

SELECT year, sum(position) 
WHERE account ~ 'Income:'
GROUP BY year

Then click “Chart” to visualize income trends over time.

Key Resources

For those wanting to dive deeper:

Advanced Topics Worth Exploring

  • Using OPEN ON and CLOSE ON for period analysis
  • The PIVOT functionality for cross-tab reports
  • Combining BQL with Python scripts for automation
  • Performance optimization for large ledgers (100k+ transactions)

What BQL queries do you find most useful? I’m particularly interested in hearing about creative uses of the inventory aggregation features.

Excellent overview! I want to add some practical documentation links and clarify a few points about BQL that often trip up newcomers.

Official Documentation Structure

The official BQL documentation has been significantly improved as of 2024. Here’s what you should read in order:

  1. Core BQL Reference: Beancount Query Language - Beancount Documentation

    • Complete syntax reference
    • All built-in functions documented
    • FROM/WHERE distinction explained
  2. Cookbook Examples: Running Beancount and Generating Reports - Beancount Documentation

    • Real-world query patterns
    • Common reporting scenarios
  3. bean-query CLI: Running Beancount and Generating Reports - Beancount Documentation

    • Command-line interface documentation
    • Output format options (text, csv, beancount)

Important Clarifications on Functions

@practical_adviser covered the key functions well, but I want to emphasize the difference between cost() and value():

cost() vs units() vs convert()

-- Example: You bought 10 shares of AAPL at $150 each
-- Current price is $180

SELECT 
  units(position),      -- Returns: 10 AAPL
  cost(position),       -- Returns: 1500.00 USD (what you paid)
  convert(position, 'USD')  -- Returns: 1800.00 USD (current value)
WHERE account = 'Assets:Investments:AAPL'

The distinction is crucial for:

  • Tax reporting: Use cost() for cost basis
  • Portfolio valuation: Use convert() for current value
  • Quantity tracking: Use units() for just the number of shares

Advanced: The PIVOT Functionality

One underdocumented feature is PIVOT, which creates cross-tab reports:

SELECT 
  account,
  sum(position) as balance
WHERE account ~ 'Expenses:'
PIVOT BY MONTH(date)

This creates a table with accounts as rows and months as columns—perfect for comparing expenses across time periods.

Performance Tips for Large Ledgers

If you have 50k+ transactions, BQL queries can get slow. Here’s what helps:

  1. Use date filtering early: Put date restrictions in the FROM clause

    FROM OPEN ON 2024-01-01 CLOSE ON 2024-12-31
    -- Much faster than WHERE year = 2024
    
  2. Limit account regex complexity: Simple prefixes are fastest

    -- Fast
    WHERE account ~ 'Expenses:'
    
    -- Slower
    WHERE account ~ 'Expenses:(Food|Entertainment|Travel)'
    
  3. Use bean-query with --format=csv for programmatic processing instead of parsing text output

Community Query Repository

Beyond aumayr’s repo (which is excellent), also check out:

Integration with Python Scripts

For automation, you can use BQL from Python:

from beancount import loader
from beancount.query import query_execute
from beancount.query import query_compile

entries, errors, options = loader.load_file('my.beancount')

query = "SELECT account, sum(position) WHERE account ~ 'Expenses:'"
result_types, result_rows = query_execute.execute_query(
    entries, options, query
)

for row in result_rows:
    print(row)

This enables scheduled reports, automated analysis, and custom tooling built on BQL.

Great post! The inventory aggregation point is especially important—I’ve seen too many people try to export to SQL databases and lose the lot information that makes Beancount so powerful.

This is incredibly helpful! I’ve been using Beancount for about 3 months now, and BQL has been one of the steeper learning curves for me. I want to share some beginner mistakes I made and how I overcame them.

Mistake #1: Confusing Transaction vs Posting Filtering

When I first started, I wrote queries like this:

-- WRONG: This doesn't work as expected
SELECT account, sum(position)
FROM date >= 2024-01-01
WHERE account ~ 'Expenses:'

I thought FROM date >= 2024-01-01 would filter transactions by date. But BQL doesn’t have a FROM date syntax! The correct way is:

-- CORRECT
SELECT account, sum(position)
FROM OPEN ON 2024-01-01
WHERE account ~ 'Expenses:'

Or alternatively:

SELECT account, sum(position)
WHERE account ~ 'Expenses:'
AND date >= 2024-01-01

Understanding that FROM operates on transactions while WHERE operates on postings took me a while. @practical_adviser’s explanation finally made it click.

Mistake #2: Not Understanding sum(position) Returns Inventory

I tried to do math on sum(position) results like this:

-- This doesn't work!
SELECT account, sum(position) * 2
WHERE account ~ 'Assets:'

Because sum(position) returns an Inventory object (not a number), you can’t multiply it directly. You need to use cost() or convert() first:

-- Correct way
SELECT account, sum(cost(position)) * 2
WHERE account ~ 'Assets:'

Mistake #3: Regex Syntax Confusion

Coming from other tools, I initially tried:

-- WRONG
WHERE account LIKE 'Expenses:%'

But BQL uses the ~ operator for regex matching, not LIKE:

-- CORRECT
WHERE account ~ 'Expenses:'

Note that it’s a regex, so special characters need escaping. To match Assets:US:Bank, you’d technically need Assets:US:Bank because : has special meaning in some regex engines (though in practice it works fine).

Beginner-Friendly Query: Net Worth Over Time

One query I run monthly to track progress:

SELECT 
  YEAR(date) as year,
  MONTH(date) as month,
  sum(convert(position, 'USD')) as net_worth
WHERE account ~ 'Assets:' OR account ~ 'Liabilities:'
GROUP BY year, month
ORDER BY year, month

This gives me a time series of net worth. I export it to CSV and graph it in a spreadsheet.

Question About Advanced Features

@doc_linker mentioned the PIVOT functionality—I tried this but got confused about the syntax. Can you show a complete example?

For instance, if I want to see my monthly expenses by category in a grid (categories as rows, months as columns), would it be:

SELECT account, sum(position)
WHERE account ~ 'Expenses:'
AND year = 2024
GROUP BY account
PIVOT BY MONTH(date)

Or does PIVOT require different syntax? The official docs weren’t super clear on this.

Tools That Helped Me Learn

  1. Fava’s query interface (http://localhost:5000/query) - Interactive and shows results immediately
  2. bean-query command line - Great for scripting once you know what you’re doing
  3. The aumayr repository @practical_adviser mentioned - I literally copied queries from there and modified them to learn

Request for Beginners

Could someone create a “BQL Cheat Sheet” with:

  • Common query patterns (expenses by month, income summary, net worth, holdings)
  • Function reference (cost, units, convert, sum)
  • Common regex patterns for accounts

I’d love to print it out and keep it next to my desk while I’m learning!

Thanks again for this detailed post—bookmarked for future reference!