andrewhinh/formless

formless

Hard handwriting understanding.

Usage

Use the web app:

https://bit.ly/formless-fe

Or hit the API:

curl -X POST -H "Content-Type: application/json" -d '{"image_url": "<image-url>"}' https://bit.ly/formless-api
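The same request can be built from Python. This is a minimal sketch using only the standard library. The helper names `build_request` and `post_image` are hypothetical, and the response format is not documented here, so the body is returned as raw text. The short link presumably redirects to the deployed endpoint, and HTTP clients differ in how they forward a POST body across a redirect, so the endpoint URL is passed in explicitly:

```python
import json
import urllib.request


def build_request(image_url: str, api_url: str) -> urllib.request.Request:
    """Build the JSON POST request shown in the curl example."""
    payload = json.dumps({"image_url": image_url}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_image(image_url: str, api_url: str, timeout: float = 60.0) -> str:
    """Send the request and return the raw response body as text.

    The response schema is an unknown here, so no parsing is attempted.
    """
    request = build_request(image_url, api_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `post_image("<image-url>", "<api-endpoint>")` would then mirror the curl command above.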

Or use the CLI:

uv run formless -i <image-url> [-v]
or
uv run formless -p <local-image-path> [-v]

Or use in Python:

from formless import scan
scan(image_url="<image-url>", verbose=1)
scan(image_path="<local-image-path>", verbose=1)
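To scan several local images in one go, the call above can be wrapped in a small loop. This is a sketch under assumptions: `scan_folder` and `IMAGE_SUFFIXES` are hypothetical names, the return type of `scan` is not stated here so results are stored as-is, and the scan function is passed in as an argument so the helper does not depend on how `formless` is installed:

```python
from pathlib import Path
from typing import Any, Callable, Dict

# Assumed set of image extensions; adjust to whatever your files use.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def scan_folder(folder: str, scan_fn: Callable[..., Any]) -> Dict[str, Any]:
    """Call scan_fn(image_path=..., verbose=1) on every image file in folder.

    Returns a dict mapping file name to whatever scan_fn returned.
    """
    results: Dict[str, Any] = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            results[path.name] = scan_fn(image_path=str(path), verbose=1)
    return results
```

With the package installed, usage would look like `scan_folder("<local-image-dir>", scan)` after `from formless import scan`.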

Development

Set Up

Set up the environment:

make setup

Create a .env file (and a .env.dev file) with the following keys:

HF_TOKEN=

POSTGRES_URL=
POSTGRES_PRISMA_URL=
SUPABASE_URL=
NEXT_PUBLIC_SUPABASE_URL=
POSTGRES_URL_NON_POOLING=
SUPABASE_JWT_SECRET=
POSTGRES_USER=
NEXT_PUBLIC_SUPABASE_ANON_KEY=
POSTGRES_PASSWORD=
POSTGRES_DATABASE=
SUPABASE_SERVICE_ROLE_KEY=
POSTGRES_HOST=
SUPABASE_ANON_KEY=

LIVE=
DEBUG=
STRIPE_PUBLISHABLE_KEY=
STRIPE_SECRET_KEY=
STRIPE_WEBHOOK_SECRET=
DOMAIN=
API_URL=

WANDB_API_KEY=
WANDB_ENTITY=

AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_REGION=
OPENAI_API_KEY=
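With this many keys, it can help to check for blanks before starting anything. The following is a minimal sketch, not part of the repository: `missing_env_keys` is a hypothetical helper, and which keys each component actually needs is not stated here, so the caller supplies the list:

```python
from typing import Dict, Iterable, List


def missing_env_keys(env_text: str, required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent or empty in KEY=VALUE text."""
    values: Dict[str, str] = {}
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return [key for key in required if not values.get(key)]
```

For example, `missing_env_keys(open(".env").read(), ["HF_TOKEN", "API_URL"])` lists whichever of those two keys still need a value.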

Useful Commands

Migrate the database (do this before running the frontend or the API):

make migrate ENV=<env> MSG=<message>

Repository Structure

.
├── api                 # API.
├── frontend            # Frontend.
├── src/formless        # Python bindings.
└── training            # Training.

API

Test the API with an example input:

modal run api/app.py

Serve the API locally:

uv run api/app.py

Serve the API on Modal:

modal serve api/app.py

Deploy on dev:

modal deploy api/app.py

Deploy on main:

modal deploy --env=main api/app.py

Frontend

Serve the web app locally:

uv run frontend/app.py
stripe listen --forward-to <url>/webhook
# update API_URL, STRIPE_WEBHOOK_SECRET, and DOMAIN in .env.dev

Serve the web app on Modal:

modal serve frontend/app.py
stripe listen --forward-to <url>/webhook
# update API_URL, STRIPE_WEBHOOK_SECRET, and DOMAIN in .env.dev

Deploy on dev:

modal deploy frontend/app.py
# update API_URL, STRIPE_WEBHOOK_SECRET, and DOMAIN in .env.dev

Deploy on main:

modal deploy --env=main frontend/app.py

PyPI

Run the package:

uv run formless -v
# update API_URL in src/formless/__init__.py

Build the package:

uvx --from build pyproject-build --installer uv

Upload the package:

uvx twine upload dist/*

Test the uploaded package:

uv run --with formless --no-project -- formless -v

Training

Download data:

make data

Upload to S3 (if using Modal):

make sync

Label a subset of the data (~1000 samples) to train the writing-quality classifier:

modal run training/etl.py --cls

or:

uv run training/etl.py --cls

Run classifier training:

modal run training/train.py --cls

or:

uv run training/train.py --cls

Use the trained classifier to filter the train/val/test data (down to ~10k samples), which is then used for full SFT and evaluation of the VLM:

modal run training/etl.py --sft

Run SFT:

modal run training/train.py --sft

Run the trained VLM on the validation data, then collect and manually label the worst examples (~50 samples) for DPO training:

modal run training/etl.py --dpo
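Conceptually, collecting the worst examples is a bottom-k selection. How examples are ranked is not specified here, so the sketch below assumes each record carries a hypothetical `score` field where higher means better. It illustrates the idea only and is not the logic in `training/etl.py`:

```python
import heapq
from typing import Any, Dict, List


def worst_examples(records: List[Dict[str, Any]], k: int = 50) -> List[Dict[str, Any]]:
    """Return the k records with the lowest score, worst first.

    The "score" key is an assumption made for illustration.
    """
    return heapq.nsmallest(k, records, key=lambda record: record["score"])
```

The selected records would then go to manual labeling before DPO training.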

Run DPO:

modal run training/train.py --dpo

Quantize the DPO model:

modal run training/quantize.py
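The training steps above run in a fixed order, which can be written down as a checklist. This is a sketch, not a script from the repository: `training_commands` and `run_pipeline` are hypothetical names, only the Modal variants of the commands are listed, and because the labeling steps involve manual work, `dry_run` defaults to `True` so the commands are printed instead of executed:

```python
import subprocess
from typing import List


def training_commands() -> List[List[str]]:
    """Return the training commands in the order given above."""
    return [
        ["make", "data"],                               # download data
        ["make", "sync"],                               # upload to S3
        ["modal", "run", "training/etl.py", "--cls"],   # label classifier subset
        ["modal", "run", "training/train.py", "--cls"], # train classifier
        ["modal", "run", "training/etl.py", "--sft"],   # filter data for SFT
        ["modal", "run", "training/train.py", "--sft"], # run SFT
        ["modal", "run", "training/etl.py", "--dpo"],   # collect worst examples
        ["modal", "run", "training/train.py", "--dpo"], # run DPO
        ["modal", "run", "training/quantize.py"],       # quantize the DPO model
    ]


def run_pipeline(dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it in order."""
    for command in training_commands():
        print(" ".join(command))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(command, check=True)
```

Calling `run_pipeline()` prints the nine commands without running anything.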
