Internal tools · AI automation

AI Document Analysis & Import Consolidation Engine

A building-materials import business relied on manual reconciliation of invoices, purchase orders, and logistics paperwork. Each supplier used different formats, making consolidation slow, error-prone, and impossible to scale.

Hours saved every week

Manual reconciliation cut substantially across invoices, purchase orders, and logistics documents.

Flexible document ingestion

OCR + AI handled new formats without custom import logic.

Linked business insights

Vector embeddings connected data across suppliers, shipments, and timelines.

Approach

Designed a 2-stage pipeline: OCR → AI extraction with confidence scoring and human-review gates. Normalized data into a single schema and used vector embeddings to cross-reference shipments, vendors, and dates, surfacing patterns such as consistently late suppliers and documents with missing data.

Outcome

Turned hours of weekly manual reconciliation into minutes, improved data accuracy, and gave the operations team live visibility into shipments and supplier performance.

Internal tools · AI automation · Workflow design

Where things were stuck

The business imported building materials from multiple suppliers, each sending invoices, packing lists, and logistics documentation in different formats. None of it lined up cleanly.

Every week, the team spent hours manually extracting numbers, cross-checking totals, and matching documents to shipments. Any incorrect figure could cascade into pricing mistakes, late payments, or missed operational decisions.

Leadership didn’t need a big ERP or a full system replacement, just a reliable way to ingest documents, validate them, and surface insight from the data they already had.

What we built

Over a 2-month period, I built an AI-assisted import engine that automated document intake and streamlined review. The system began with OCR to extract raw text, then passed the output through a tuned AI model that handled entity extraction, table parsing, and field mapping.
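To make the field-mapping step concrete, here is a minimal sketch of normalizing extracted strings into one typed record. The `DocumentRecord` fields, the accepted date layouts, and the sample values are illustrative assumptions, not the production schema, and the amount parser assumes a dot as the decimal separator.

```python
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal


# Hypothetical unified schema: field names are illustrative only.
@dataclass(frozen=True)
class DocumentRecord:
    supplier: str
    doc_type: str
    doc_number: str
    shipment_ref: str
    issued_on: date
    total: Decimal
    currency: str


# Assumed date layouts; a real deployment would extend this per supplier.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")


def parse_date(raw: str) -> date:
    """Try each known layout in turn and fail loudly if none matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {raw!r}")


def parse_amount(raw: str) -> Decimal:
    """Drop currency symbols and thousands separators, keep digits, '.' and '-'."""
    cleaned = "".join(ch for ch in raw if ch.isdigit() or ch in ".-")
    return Decimal(cleaned)


def normalize(extracted: dict[str, str]) -> DocumentRecord:
    """Convert raw extracted strings into the typed, unified record."""
    return DocumentRecord(
        supplier=extracted["supplier"].strip(),
        doc_type=extracted["doc_type"].strip().lower(),
        doc_number=extracted["doc_number"].strip(),
        shipment_ref=extracted["shipment_ref"].strip(),
        issued_on=parse_date(extracted["issued_on"]),
        total=parse_amount(extracted["total"]),
        currency=extracted["currency"].strip().upper(),
    )


# Made-up sample values standing in for the extraction model's output.
record = normalize({
    "supplier": " Acme Timber Co. ",
    "doc_type": "Invoice",
    "doc_number": "INV-1042",
    "shipment_ref": "SHP-77",
    "issued_on": "14/03/2024",
    "total": "$12,450.00",
    "currency": "usd",
})
```

Using `Decimal` rather than `float` for totals avoids rounding drift when amounts are later cross-checked against purchase orders.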

A human-in-the-loop review screen allowed staff to verify or correct extracted fields. This kept accuracy high while preserving most of the time savings.
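One way the confidence scoring and review gate could fit together is sketched below. The `ExtractedField` shape, the 0.90 threshold, and the sample scores are assumptions for illustration; the real cut-off would be tuned per field.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the extraction step


# Assumed cut-off for illustration; not the production value.
REVIEW_THRESHOLD = 0.90


def split_for_review(fields, threshold=REVIEW_THRESHOLD):
    """Return (accepted, needs_review) based on each field's confidence."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


# Made-up sample fields: one confident, one that should reach a human.
fields = [
    ExtractedField("invoice_total", "12450.00", 0.98),
    ExtractedField("shipment_ref", "SHP-7?", 0.62),
]
accepted, needs_review = split_for_review(fields)
```

Only the fields in `needs_review` would be highlighted on the review screen, so staff spend their time on uncertain values rather than re-checking everything.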

To unify the data, I used vector embeddings to cross-reference shipments, suppliers, SKUs, amounts, and dates. This made it possible to ask questions like: 'Which supplier is consistently late?', 'What's the landed cost variance this month?', or 'Which documents are missing data?'
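The cross-referencing idea can be sketched with cosine similarity over document embeddings. The three-dimensional vectors and document IDs below are hand-written stand-ins; real embeddings would come from an embedding model and have far more dimensions, and the lookup shown here is a plain linear scan rather than whatever vector store the system used.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query, index, top_k=2):
    """Rank indexed documents by similarity to the query vector."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real embeddings (illustrative only).
index = {
    "invoice-A": [0.9, 0.1, 0.0],
    "packing-list-A": [0.8, 0.2, 0.1],
    "invoice-B": [0.0, 0.1, 0.9],
}

# Find the document most related to invoice-A, excluding itself.
others = {k: v for k, v in index.items() if k != "invoice-A"}
matches = nearest(index["invoice-A"], others, top_k=1)
```

In this toy data the packing list for the same shipment ranks first, which is the kind of link that lets an invoice be matched to its related paperwork without a shared ID.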

Because the system relied on OCR + AI rather than brittle templates, it could ingest new document formats without any additional development.

Results & handoff

The business went from hours of manual reconciliation per week to quick, reliable reviews completed in minutes. Teams gained real-time insight into relationships across shipments, invoices, and suppliers.

More importantly, the system created a repeatable workflow: documents flowed in consistently, data synced cleanly, and the team could track exactly what was happening across their import pipeline without digging through email chains or spreadsheets.

I left behind a set of clear workflows, extraction prompts, and schema docs so the business could extend the system as it grows.