Commit b6f7d37

adding post on hsa expense analyzer cli
1 parent f7d3f69 commit b6f7d37

2 files changed: 202 additions & 0 deletions
@@ -0,0 +1,202 @@
1+
---
2+
title: 'HSA Expense Analyzer CLI Tool'
3+
author: Josh Johanning
4+
date: 2025-09-04 10:00:00 -0500
5+
description: A Node.js CLI tool that analyzes and visualizes HSA expense and reimbursement totals by year from a folder of receipts.
6+
categories: [Hobbies, Tools]
7+
tags: [Node.js, Personal Finance]
8+
media_subpath: /assets/screenshots/2025-09-04-hsa-expense-analyzer
9+
image:
10+
path: hsa-expense-analyzer.png
11+
width: 100%
12+
height: 100%
13+
alt: HSA Expense Analyzer sample output showing yearly expense breakdown and charts
14+
---
15+
16+
## Overview
17+
18+
After a few years of saving HSA receipts, I had hundreds of PDFs and images with completely inconsistent naming. Then I saw [this post on X](https://x.com/VicVijayakumar/status/1733869605645385959) where someone shared a utility they built that standardizes receipt naming and graphs yearly totals. That's exactly what I needed! So I built my own Node.js CLI tool (with the help of GitHub Copilot 🤖) that parses receipt filenames and generates visual reports showing healthcare expenses and reimbursements year over year.
19+
20+
The tool works by reading filenames like `2023-05-15 - dentist - $150.00.pdf` and automatically categorizing them by year, tracking reimbursements, and creating ASCII charts. It's saved me hours of manual spreadsheet work, which was my previous method of tracking this data.
21+
22+
You can find the complete source code on GitHub: [joshjohanning/hsa-expense-analyzer-cli](https://github.com/joshjohanning/hsa-expense-analyzer-cli), and the npm package is published at [@joshjohanning/hsa-expense-analyzer-cli](https://www.npmjs.com/package/@joshjohanning/hsa-expense-analyzer-cli).
23+
24+
## The Problem
25+
26+
My HSA receipt folder looked like this mess:
27+
28+
- `receipt_scan_jan_2021.pdf`{: .filepath}
29+
- `Doctor visit Jan 15 2022 $45.pdf`{: .filepath}
30+
- `IMG_2847.jpg`{: .filepath} (a pharmacy receipt from who knows when and how much)
31+
- `2023-dental-cleaning-receipt.pdf`{: .filepath}
32+
33+
Sound familiar? After years of saving receipts, I had three main problems:
34+
35+
- **Tracking reimbursements**: Which expenses have been reimbursed and which haven't?
36+
- **Yearly analysis**: I wanted an orderly way to track receipts (expenses) and reimbursements by year without manually entering data into a spreadsheet
37+
- **Reimbursable amount**: Since I don't submit my HSA claims right away, I wanted to know how much I could withdraw at any given time
38+
39+
## The Solution
40+
41+
Instead of tracking 400+ files by hand, I decided to establish a naming convention going forward and build a tool to parse it. The format is simple but consistent:
42+
43+
```text
44+
<yyyy-mm-dd> - <description> - $<total>.pdf|png|jpg|whatever
45+
```
46+
{: .nolineno}
47+
48+
When I get reimbursed, I just add `.reimbursed` before the file extension, and the tool detects it automatically:
49+
50+
```text
51+
<yyyy-mm-dd> - <description> - $<total>.reimbursed.pdf|png|jpg|whatever
52+
```
53+
{: .nolineno}
54+
55+
It took me a little while to go through each receipt and add a date and total, but it was worth it to have a consistent format going forward.
56+
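As a rough illustration of how a filename in this convention can be taken apart, here is a minimal sketch. It is not copied from the tool's source: the function name `parseReceiptFilename`, the returned object shape, and the exact regular expressions are assumptions made for this example.

```javascript
// Hypothetical helper for illustration; the real tool's parsing may differ.
function parseReceiptFilename(filename) {
  // Expect at least: "<yyyy-mm-dd>", "<description>", "$<total>.<ext>"
  const parts = filename.split(' - ');
  if (parts.length < 3) return null;

  const date = parts[0];
  const last = parts[parts.length - 1];
  // A description that itself contains " - " is re-joined.
  const description = parts.slice(1, -1).join(' - ');

  // Format check only (yyyy-mm-dd); this does not validate the calendar date.
  if (!/^\d{4}-\d{2}-\d{2}$/.test(date)) return null;

  // Last part looks like "$150.00.pdf" or "$150.00.reimbursed.pdf".
  const match = last.match(/^\$(\d+(?:\.\d{2})?)(\.reimbursed)?\.([A-Za-z0-9]+)$/);
  if (!match) return null;

  return {
    date,
    year: Number(date.slice(0, 4)),
    description,
    amount: Number(match[1]),
    reimbursed: match[2] !== undefined,
    extension: match[3],
  };
}

console.log(parseReceiptFilename('2023-05-15 - dentist - $150.00.pdf'));
console.log(parseReceiptFilename('2021-02-15 - pharmacy - $30.00.reimbursed.pdf'));
console.log(parseReceiptFilename('IMG_2847.jpg')); // null: does not follow the convention
```
{: .nolineno}

Returning `null` for anything that does not fit keeps the "is this a valid receipt name?" decision in one place, which makes it easy to collect the rejects into a warning list later.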
57+
### Example File Structure
58+
59+
Here's what my receipt folder looks like now:
60+
61+
```text
62+
receipts/
63+
├── 2021-01-01 - doctor - $45.00.pdf # Expense
64+
├── 2021-02-15 - pharmacy - $30.00.reimbursed.pdf # Reimbursed expense
65+
├── 2022-02-01 - doctor - $50.00.reimbursed.pdf # Reimbursed expense
66+
├── 2022-03-15 - dentist - $150.00.png # Expense
67+
├── 2023-05-01 - doctor - $45.00.pdf # Expense
68+
├── 2024-07-15 - doctor - $50.00.pdf # Expense
69+
└── 2025-01-15 - doctor - $125.00.pdf # Expense
70+
```
71+
{: .nolineno}
72+
73+
## Technical Implementation
74+
75+
I built this with Node.js because I wanted something I could run locally without dependencies on external services. Three main packages do the heavy lifting:
76+
77+
- **[`chartscii`](https://github.com/tool3/chartscii)** - Creates those ASCII bar charts you see in the output
78+
- **`yargs`** - Handles command-line arguments (`--dirPath`, `--help`, etc.)
79+
- **`prettyjson`** - Formats the summary tables nicely
80+
81+
The core logic is surprisingly simple - it's mostly string manipulation and basic math.
82+
83+
### What It Actually Does
84+
85+
1. **Reads filenames** - Splits on " - " and validates the date/amount format
86+
2. **Groups by year** - Extracts the year from each date
87+
3. **Tracks reimbursements** - Looks for `.reimbursed.` in the filename
88+
4. **Warns about mismatched files** - Lists files that don't match the expected pattern
89+
5. **Does the math** - Calculates the yearly totals for expenses and reimbursements
90+
6. **Generates reports** - Summary tables plus three different ASCII charts
91+
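The grouping and totaling steps above can be sketched in a few lines. This is an illustration rather than the tool's actual code: `summarizeByYear`, the record shape, and the use of integer cents are assumptions. It also assumes that a reimbursed receipt counts toward both the expense total and the reimbursement total for its year.

```javascript
// Hypothetical helper; records are assumed to be already parsed from filenames.
function summarizeByYear(records) {
  const byYear = {};
  for (const { year, amountCents, reimbursed } of records) {
    if (!byYear[year]) {
      byYear[year] = { expenses: 0, reimbursements: 0, receipts: 0 };
    }
    const totals = byYear[year];
    totals.expenses += amountCents; // every receipt is an expense
    if (reimbursed) totals.reimbursements += amountCents;
    totals.receipts += 1;
  }
  return byYear;
}

// Amounts are kept in integer cents to avoid floating-point rounding surprises.
const dollars = (cents) => `$${(cents / 100).toFixed(2)}`;

// The seven files from the example folder, written out as parsed records.
const records = [
  { year: 2021, amountCents: 4500, reimbursed: false },
  { year: 2021, amountCents: 3000, reimbursed: true },
  { year: 2022, amountCents: 5000, reimbursed: true },
  { year: 2022, amountCents: 15000, reimbursed: false },
  { year: 2023, amountCents: 4500, reimbursed: false },
  { year: 2024, amountCents: 5000, reimbursed: false },
  { year: 2025, amountCents: 12500, reimbursed: false },
];

const summary = summarizeByYear(records);
for (const [year, t] of Object.entries(summary)) {
  console.log(
    `${year}: expenses ${dollars(t.expenses)}, ` +
      `reimbursements ${dollars(t.reimbursements)}, receipts ${t.receipts}`
  );
}
```
{: .nolineno}

A plain object keyed by year is enough here because the data set is small and the only lookup is "which year does this receipt belong to?".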
92+
## Usage
93+
94+
The easiest way is to install it as a global package from [npm](https://www.npmjs.com/package/@joshjohanning/hsa-expense-analyzer-cli) and run it:
95+
96+
```bash
97+
npm install -g @joshjohanning/hsa-expense-analyzer-cli
98+
hsa-expense-analyzer-cli --dirPath="/path/to/your/receipts"
99+
```
100+
{: .nolineno}
101+
102+
103+
Or, if you want to clone the repo locally and hack on the code:
104+
105+
```bash
106+
git clone https://github.com/joshjohanning/hsa-expense-analyzer-cli
107+
cd hsa-expense-analyzer-cli
108+
npm install
109+
npm run test # runs with test data
110+
npm run start -- --dirPath="/your/receipts/folder" # runs with your data
111+
```
112+
{: .nolineno}
113+
114+
## What The Output Looks Like
115+
116+
Here's what you get when you run it on a folder with a few years of receipts:
117+
118+
```text
119+
2021:
120+
expenses: $75.00
121+
reimbursements: $30.00
122+
receipts: 2
123+
2022:
124+
expenses: $250.00
125+
reimbursements: $100.00
126+
receipts: 3
127+
2023:
128+
expenses: $100.00
129+
reimbursements: $55.00
130+
receipts: 2
131+
2024:
132+
expenses: $50.00
133+
reimbursements: $0.00
134+
receipts: 1
135+
2025:
136+
expenses: $125.00
137+
reimbursements: $0.00
138+
receipts: 1
139+
Total:
140+
expenses: $600.00
141+
reimbursements: $185.00
142+
receipts: 9
143+
144+
Expenses by year
145+
2021 ╢██████░░░░░░░░░░░░░░ $75.00
146+
2022 ╢████████████████████ $250.00
147+
2023 ╢████████░░░░░░░░░░░░ $100.00
148+
2024 ╢████░░░░░░░░░░░░░░░░ $50.00
149+
2025 ╢██████████░░░░░░░░░░ $125.00
150+
╚════════════════════
151+
152+
Reimbursements by year
153+
2021 ╢██████░░░░░░░░░░░░░░ $30.00
154+
2022 ╢████████████████████ $100.00
155+
2023 ╢███████████░░░░░░░░░ $55.00
156+
2024 ╢░░░░░░░░░░░░░░░░░░░░ $0.00
157+
2025 ╢░░░░░░░░░░░░░░░░░░░░ $0.00
158+
╚════════════════════
159+
160+
Expenses vs Reimbursements by year
161+
2021 Expenses ╢██████░░░░░░░░░░░░░░ $75.00
162+
2021 Reimbursements ╢██░░░░░░░░░░░░░░░░░░ $30.00
163+
2022 Expenses ╢████████████████████ $250.00
164+
2022 Reimbursements ╢████████░░░░░░░░░░░░ $100.00
165+
2023 Expenses ╢████████░░░░░░░░░░░░ $100.00
166+
2023 Reimbursements ╢████░░░░░░░░░░░░░░░░ $55.00
167+
2024 Expenses ╢████░░░░░░░░░░░░░░░░ $50.00
168+
2024 Reimbursements ╢░░░░░░░░░░░░░░░░░░░░ $0.00
169+
2025 Expenses ╢██████████░░░░░░░░░░ $125.00
170+
2025 Reimbursements ╢░░░░░░░░░░░░░░░░░░░░ $0.00
171+
╚════════════════════
172+
```
173+
{: .nolineno}
174+
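The totals also answer the "how much could I withdraw right now?" question with one subtraction. The sketch below uses the `Total` figures from the sample output above; it assumes the expense total includes receipts that were later reimbursed, and the `unreimbursed` helper is made up for this example (whether the tool prints this number itself is not shown here).

```javascript
// Totals copied from the sample output; the subtraction is an illustration.
const totals = { expenses: 600.0, reimbursements: 185.0 };

// Hypothetical helper: convert to integer cents before subtracting
// so floating-point rounding cannot creep into the result.
function unreimbursed({ expenses, reimbursements }) {
  const cents = Math.round(expenses * 100) - Math.round(reimbursements * 100);
  return cents / 100;
}

console.log(`Still reimbursable: $${unreimbursed(totals).toFixed(2)}`);
// $600.00 - $185.00 = $415.00
```
{: .nolineno}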
175+
If you have files that don't match the expected naming pattern, you'll see a warning at the top of the output:
176+
177+
```text
178+
⚠️ WARNING: The following files do not match the expected pattern:
179+
Expected pattern: <yyyy-mm-dd> - <description> - $<amount>.<ext>
180+
Files with issues:
181+
- 2021-01-15- doctor - 50.00.pdf
182+
- 2021-01-15-wrong-format-missing-dashes.pdf
183+
- wrong-format.pdf
184+
```
185+
{: .nolineno}
186+
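To see why names like these get flagged, a small diagnostic can report which part of the pattern is missing. This is a sketch only: `explainMismatch` and its three checks are assumptions for illustration, and the tool's real validation may be stricter or looser.

```javascript
// Hypothetical diagnostic helper; not part of the published CLI.
function explainMismatch(filename) {
  const problems = [];

  if (filename.split(' - ').length < 3) {
    problems.push('needs three parts separated by " - "');
  }
  if (!/^\d{4}-\d{2}-\d{2} - /.test(filename)) {
    problems.push('does not start with "<yyyy-mm-dd> - "');
  }
  if (!/ - \$\d+(\.\d{2})?\.[^ ]+$/.test(filename)) {
    problems.push('does not end with " - $<amount>.<ext>"');
  }

  return problems; // an empty array means the name looks fine
}

const flagged = [
  '2021-01-15- doctor - 50.00.pdf', // missing space before the first dash, and no "$"
  '2021-01-15-wrong-format-missing-dashes.pdf',
  'wrong-format.pdf',
];

for (const name of flagged) {
  console.log(`${name}: ${explainMismatch(name).join('; ')}`);
}
```
{: .nolineno}

Reporting every failed check, instead of stopping at the first one, makes it quicker to fix a filename in a single rename.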
187+
## What's Next
188+
189+
A few things I'm thinking about adding:
190+
191+
- **CSV export** - For people who want to analyze the data in Excel
192+
- **Category parsing** - Extract "doctor" vs "pharmacy" vs "dental" from descriptions
193+
- **Family member tracking** - Parse names to see spending per person
194+
- **Year reimbursed vs. year incurred** - Track the year expenses were reimbursed vs. the year they were incurred (but does this matter??)
195+
196+
## Summary
197+
198+
This started as a weekend project to solve my own problem, but I realized other people probably have the same messy folder of HSA receipts. I love the [`chartscii`](https://github.com/tool3/chartscii) package: its ASCII charts provide immediate visual feedback.
199+
200+
Plus, it was built entirely with GitHub Copilot 🤖, which was quite fun. I also learned a lot about creating and publishing `npm` packages, which has been useful for my day job.
201+
202+
The complete source code and documentation are available on GitHub at [joshjohanning/hsa-expense-analyzer-cli](https://github.com/joshjohanning/hsa-expense-analyzer-cli), and the npm package is published at [@joshjohanning/hsa-expense-analyzer-cli](https://www.npmjs.com/package/@joshjohanning/hsa-expense-analyzer-cli). Feel free to fork, contribute, or adapt it for your own needs!