Extract Invoice Fields from Dropbox PDFs into Supabase

Watches a Dropbox folder for new PDF invoices, pulls structured fields with an OpenAI vision model, and writes clean rows to Supabase.

Category: Document Ops
Engine: sim
Difficulty: intermediate
Trigger: event
Steps: 5
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: New PDF added to Dropbox folder (Dropbox)
  • Action: Download PDF file contents (Dropbox)
  • Action: Extract invoice fields with OpenAI structured output (OpenAI)
  • Logic: Validate and normalize fields, route failures to review
  • Output: Insert clean row into Supabase invoices table (Supabase)

What it does

This workflow turns a Dropbox folder into a hands-off invoice intake pipeline. Whenever a new PDF lands in the watched folder, it downloads the file, sends it to an OpenAI model with structured-output instructions, and parses out the fields you care about: vendor name, invoice number, issue and due dates, line items, subtotal, tax, and total. The validated result is inserted as a typed row in a Supabase table, ready for reporting or reconciliation. Files that fail extraction are flagged for review rather than silently dropped.
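As a rough illustration, the fields listed above could be described to the model with a JSON schema along these lines. This is a minimal sketch, not the workflow's actual schema: every field name (`vendor_name`, `line_items`, and so on) is an assumption chosen for the example, and the `RESPONSE_FORMAT` wrapper follows the shape OpenAI documents for strict JSON-schema output, which you should verify against the current API reference.

```python
import json

# Hypothetical field names; rename them to match your own table columns.
LINE_ITEM_SCHEMA = {
    "type": "object",
    "properties": {
        "description": {"type": "string"},
        "quantity": {"type": "number"},
        "unit_price": {"type": "number"},
        "amount": {"type": "number"},
    },
    # Strict structured output expects every property to be listed as required.
    "required": ["description", "quantity", "unit_price", "amount"],
    "additionalProperties": False,
}

INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor_name": {"type": "string"},
        "invoice_number": {"type": "string"},
        "issue_date": {"type": "string"},
        # Optional values are expressed as a union with null, not by omission.
        "due_date": {"type": ["string", "null"]},
        "line_items": {"type": "array", "items": LINE_ITEM_SCHEMA},
        "subtotal": {"type": "number"},
        "tax": {"type": "number"},
        "total": {"type": "number"},
    },
    "required": [
        "vendor_name", "invoice_number", "issue_date", "due_date",
        "line_items", "subtotal", "tax", "total",
    ],
    "additionalProperties": False,
}

# Assumed wrapper for the request's response-format setting (check current docs).
RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {"name": "invoice", "strict": True, "schema": INVOICE_SCHEMA},
}

if __name__ == "__main__":
    # Print the payload so it can be pasted into the extraction step's settings.
    print(json.dumps(RESPONSE_FORMAT, indent=2))
```

Keeping the schema as plain data means the same definition can drive both the model request and any downstream checks on the returned fields.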

When to use it

Reach for this when finance or ops receives invoices as PDFs and someone is currently retyping them into a spreadsheet or ERP. It is ideal for accounts-payable teams, bookkeeping firms handling many clients, or any back office that needs every incoming document captured as queryable data within seconds of arrival. It also works well as the first stage of a larger approval or payment flow, since the Supabase row becomes the single source of truth other automations can build on.

How it works

The workflow is triggered by Dropbox's file-added event on a specific folder. The new file is downloaded as binary, then passed to OpenAI with a JSON schema describing the invoice shape, so the model returns strict structured output instead of free text. A logic step validates required fields (invoice number and total must be present and numeric) and normalizes dates and currency. Valid records are inserted into a Supabase `invoices` table via the service-role client; records that fail validation are written to an `invoices_review` table with the raw extraction so a human can correct them. Because it runs on Hive's Sim engine, each colony processes its own documents in isolation with no shared state.
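The validate-and-route step described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the workflow's real logic: the accepted date formats, the column names, the `raw_extraction` column, and the helper names `normalize_date`, `normalize_amount`, `route`, and `write_row` are all hypothetical. `write_row` assumes a supabase-py style client whose `table(...).insert(...).execute()` chain performs the insert.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Assumed input date formats; extend this tuple to match your vendors.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def normalize_date(value):
    """Return an ISO 8601 date string, or None if no assumed format matches."""
    if not isinstance(value, str):
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_amount(value):
    """Parse amounts such as "$1,234.5" into a Decimal rounded to cents."""
    if value is None or isinstance(value, bool):
        return None
    cleaned = str(value).replace(",", "").replace("$", "").strip()
    try:
        amount = Decimal(cleaned)
    except InvalidOperation:
        return None
    return amount.quantize(Decimal("0.01")) if amount.is_finite() else None


def route(raw):
    """Return (table_name, row) for one raw extraction dict."""
    invoice_number = str(raw.get("invoice_number") or "").strip()
    total = normalize_amount(raw.get("total"))
    # The rule above requires a numeric invoice number and total;
    # relax the isdigit() check if your vendors use alphanumeric IDs.
    if not invoice_number.isdigit() or total is None:
        return "invoices_review", {"raw_extraction": raw}
    row = {
        "invoice_number": invoice_number,
        "vendor_name": raw.get("vendor_name"),
        "issue_date": normalize_date(raw.get("issue_date")),
        "due_date": normalize_date(raw.get("due_date")),
        "total": str(total),  # sent as a string to avoid float rounding
    }
    return "invoices", row


def write_row(client, raw):
    """Insert into whichever table route() selects.

    `client` is assumed to be a supabase-py client created with the
    service-role key.
    """
    table, row = route(raw)
    client.table(table).insert(row).execute()
    return table
```

For example, `route({"invoice_number": "1042", "total": "$1,234.5", "issue_date": "03/15/2024"})` selects the `invoices` table with the total normalized to `"1234.50"`, while an extraction with a missing total is routed to `invoices_review` with its raw payload intact.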

Set it up

What you configure once, before turning it on.

  1. Connect Dropbox: Files and folders.
  2. Connect OpenAI: Models, embeddings, files.
  3. Connect Supabase: Tables, auth, storage, edge functions.
  4. Set each agent's model: We leave models unset so you pick the tier, fast and cheap or top-quality.
  5. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
