PDF Invoice OCR Extraction with Three-Way Match Validation

Watches a Google Drive folder for new invoice PDFs, extracts line items with an LLM, matches them against the PO and receipt, and files clean invoices while queuing mismatches.

Category: Invoice Processing
Engine: Sim + Paperclip
Difficulty: advanced
Trigger: event
Steps: 5
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: New invoice PDF in Google Drive folder (Google Drive)
  • Action: Extract line items from PDF with LLM (OpenAI)
  • Action: Fetch PO + goods receipt from Postgres (PostgreSQL)
  • Logic: Match qty/price within tolerance, flag discrepancies
  • Output: File matched invoice or queue exception (Google Drive)

What it does

This workflow handles invoices that arrive as PDFs. It reads each new file, uses an LLM to extract the vendor, PO number, and line-item detail, then runs a three-way match against your purchase order and goods receipt data. Invoices that pass the match are filed; anything that fails is held for review with the extracted data attached.
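To make the extraction step more concrete, here is a minimal sketch of how the model's reply could be parsed into typed records before matching. It assumes the prompt asks for a JSON object; the field names (`vendor`, `po_number`, `line_items`, `sku`, `quantity`, `unit_price`, `total`) and the `parse_extraction` helper are illustrative assumptions, not the template's actual schema.

```python
import json
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LineItem:
    sku: str
    quantity: Decimal
    unit_price: Decimal


@dataclass(frozen=True)
class ExtractedInvoice:
    vendor: str
    po_number: str
    lines: list[LineItem]
    total: Decimal


def parse_extraction(raw: str) -> ExtractedInvoice:
    """Parse the model's JSON reply (hypothetical schema) into typed records.

    Raises ValueError when the reply is not valid JSON, a field is missing,
    or a number cannot be read, so the caller can queue the file for review.
    """
    try:
        data = json.loads(raw)
        lines = [
            LineItem(
                sku=str(item["sku"]),
                # Go through str() so floats such as 12.5 do not pick up
                # binary rounding noise when converted to Decimal.
                quantity=Decimal(str(item["quantity"])),
                unit_price=Decimal(str(item["unit_price"])),
            )
            for item in data["line_items"]
        ]
        return ExtractedInvoice(
            vendor=str(data["vendor"]),
            po_number=str(data["po_number"]),
            lines=lines,
            total=Decimal(str(data["total"])),
        )
    except (json.JSONDecodeError, KeyError, TypeError, ArithmeticError) as exc:
        raise ValueError(f"unusable extraction: {exc!r}") from exc
```

Using `Decimal` rather than `float` keeps currency comparisons exact, and treating any malformed reply as an error means a bad extraction lands in the review queue instead of being matched on partial data.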

When to use it

Use this when vendors send invoices as email attachments or scanned PDFs rather than structured e-invoices, and someone is currently typing those line items in by hand.

How it works

  1. A new PDF appears in the monitored Google Drive invoices folder.
  2. The file content is read and passed to OpenAI to extract structured fields: vendor, PO reference, line items, unit prices, and totals.
  3. The flow fetches the referenced PO and goods receipt from Postgres.
  4. A logic step matches quantities and prices within tolerance and flags discrepancies.
  5. Clean invoices are filed to a 'matched' Drive subfolder; exceptions are recorded with the parsed data for a reviewer.
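The matching logic in step 4 can be sketched roughly as follows. This is an illustration under stated assumptions rather than the template's actual logic step: the `Line` record, the `three_way_match` function, the SKU-based join, and the default tolerances (no quantity slack, 2% price slack) are all hypothetical, and it assumes only over-billing counts as a quantity discrepancy.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Line:
    sku: str
    quantity: Decimal
    unit_price: Decimal = Decimal("0")  # goods receipts carry no price


def three_way_match(invoice, po, receipt,
                    qty_tol=Decimal("0"), price_tol_pct=Decimal("0.02")):
    """Compare invoice lines to PO and receipt lines.

    Returns a list of discrepancy messages; an empty list means the
    invoice matched within tolerance.
    """
    po_by_sku = {line.sku: line for line in po}

    # A PO line may be received in several deliveries, so sum per SKU.
    received = {}
    for line in receipt:
        received[line.sku] = received.get(line.sku, Decimal("0")) + line.quantity

    issues = []
    for line in invoice:
        po_line = po_by_sku.get(line.sku)
        if po_line is None:
            issues.append(f"{line.sku}: not on PO")
            continue
        # Assumption: billing for less than ordered or received is fine
        # (partial invoice); only over-billing is flagged.
        if line.quantity - po_line.quantity > qty_tol:
            issues.append(f"{line.sku}: billed {line.quantity}, ordered {po_line.quantity}")
        got = received.get(line.sku, Decimal("0"))
        if line.quantity - got > qty_tol:
            issues.append(f"{line.sku}: billed {line.quantity}, received {got}")
        # Price tolerance is relative to the PO unit price.
        if abs(line.unit_price - po_line.unit_price) > po_line.unit_price * price_tol_pct:
            issues.append(f"{line.sku}: price {line.unit_price}, PO price {po_line.unit_price}")
    return issues
```

An empty result would route the invoice to the 'matched' subfolder; a non-empty one would be stored alongside the parsed data for the reviewer. Whether under-billing should also be flagged, and how wide the tolerances should be, depends on your purchasing policy.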

Set it up

What you configure once, before turning it on.

  1. Connect Google Drive. Docs, sheets, slides, files.
  2. Connect OpenAI. Models, embeddings, files.
  3. Connect Postgres. Any Postgres URL: query, write, migrate.
  4. Set each agent's model. We leave models unset so you pick the tier: fast + cheap, or top-quality.
  5. Tune it to your data. Edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on. Run once against a sample, confirm the output, then enable the trigger.
