Three primary scores. Eight pillars. Ten to fourteen questions we work through with you on a call, tailored to your company and use case. Calibrated against 47 production pipelines we have built and run since 2023, with project outcomes tracked at six-month checkpoints.
The three primary scores
Three numbers. Each answers a different procurement question.
Most readiness tools deliver one composite score, then ask you to trust it. We deliver three because operations leaders need to answer three different questions before any vendor conversation: am I ready, how hard is this build, and what is the payback if we get it right.
01 / Readiness
0 to 100, higher is better
How prepared the operation is for document automation today. Aggregates the 8 pillars below using regression-derived weights.
Σ (pillar_score × pillar_weight) for 8 pillars
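A minimal sketch of the aggregate above, assuming each pillar score is on a 0 to 100 scale and the weights are fractions that sum to 1. The function name and the equal weights in the example are placeholders for illustration, not the regression-derived weights.

```python
from typing import Mapping

PILLARS = [
    "Document inflow", "Volume signature", "Rules clarity",
    "Owner & change capacity", "Systems & integration",
    "Data quality", "Hand-off path", "Build readiness",
]


def readiness_score(pillar_scores: Mapping[str, float],
                    pillar_weights: Mapping[str, float]) -> float:
    """Weighted sum of pillar scores: sum(pillar_score * pillar_weight)."""
    if set(pillar_scores) != set(pillar_weights):
        raise ValueError("scores and weights must cover the same pillars")
    total_weight = sum(pillar_weights.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError(f"weights must sum to 1.0, got {total_weight}")
    return sum(pillar_scores[name] * pillar_weights[name]
               for name in pillar_scores)


# Placeholder inputs: equal weights, NOT the calibrated ones.
example_weights = {name: 1 / len(PILLARS) for name in PILLARS}
example_scores = {name: 50.0 for name in PILLARS}
example_total = readiness_score(example_scores, example_weights)
```

Because the weights sum to 1, the aggregate stays on the same 0 to 100 scale as the individual pillars.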
02 / Build complexity
0 to 100, higher is harder
How difficult the engineering work would be. Driven by document type variability, system count, exception rate, and rules ambiguity. A high readiness operation can still have a complex build.
03 / 12-month value
Estimated payback in the first 12 months, given the volume, hours-per-document, error rate, and loaded labor cost we capture with you. Assumptions are footnoted and we adjust them together.
volume × hours × error_rate × loaded_labor
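The product above can be sketched as a single function. The expression does not state units, so this sketch assumes volume is a monthly count annualized over 12 months, hours are per document, error rate is a fraction between 0 and 1, and loaded labor is a cost per hour. Those assumptions, the function name, and the parameter names are illustrative only.

```python
def twelve_month_value(monthly_volume: float,
                       hours_per_document: float,
                       error_rate: float,
                       loaded_labor_per_hour: float,
                       months: int = 12) -> float:
    """volume x hours x error_rate x loaded_labor, with volume annualized.

    Assumed units (not stated in the source expression):
      monthly_volume         documents per month
      hours_per_document     hours of manual handling per document
      error_rate             fraction between 0 and 1
      loaded_labor_per_hour  fully loaded labor cost per hour
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be a fraction between 0 and 1")
    annual_volume = monthly_volume * months
    return (annual_volume * hours_per_document
            * error_rate * loaded_labor_per_hour)


# Illustrative inputs only: 2,000 documents a month, 6 minutes each,
# a 5% error rate, and a loaded labor cost of 40 per hour.
example_value = twelve_month_value(2_000, 0.1, 0.05, 40.0)
```

Keeping every assumption as a named parameter makes it easy to adjust one input and see how the estimate moves.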
The 8 pillars
Eight pillars, weighted by how often each one decides the project.
Weights come from regression against the 47-project history. Rules clarity, Build readiness, and Owner & change capacity carry the most weight because, when they go wrong, they end the project. Document inflow, Systems, and Data quality are technical work; we know how to scope those. Volume signature decides whether the work is worth doing at all. Hand-off path decides whether the pipeline survives day 90.
Pillar weights, summing to 100% · n = 47 production pipelines
1. Document inflow
Clarity of source channels, format mix, and intake variability. How documents arrive, from how many places, and how consistent each channel is.
Sample questions
How do rate confirmations from your top eight carriers actually arrive (email PDF, portal, EDI, fax, driver text)?
Are inbound formats consistent enough for one classifier, or do they fork by source?
Low 0–30: Many channels, no documented format library, scanned faxes still common.
Mid 31–65: Two or three channels documented, occasional format drift.
High 66–100: Channels named and counted, format library current, drift is the exception.
2. Volume signature · 10% weight
Monthly volume, peak versus steady-state ratio, and growth trajectory. Whether the workflow at current volume is genuinely painful enough to fund automation.
Sample questions
What is your monthly document volume, and what is the seasonal multiplier at peak?
How many hours per week does the current manual workflow consume?
Low 0–30: Sub-500 per month, payback math marginal.
Mid 31–65: 500 to 10K per month, growing, defensible payback.
High 66–100: 10K+ per month or steep peak ratio, payback obvious.
Why no compliance pillar in v1.
We do not score generic AI maturity, and we do not score compliance as a standalone pillar. If specific regulatory pressure surfaces in research (HIPAA, FDA 21 CFR Part 11, SOC 2 boundaries, state privacy law), the personalized question pass surfaces it as a complexity factor, not a separate score. Compliance is binary by document type, not a maturity gradient.
Confidence indicator
Three confidence tiers, surfaced under every score.
The score is only as honest as the inputs behind it. Every report carries a confidence tier based on input completeness multiplied by research-signal density. The tier appears as a single line under the score: "Confidence: Moderate (5 verified signals, 8 of 10 questions answered)."
High
All numeric anchors filled, every question answered, four or more verified research findings, no skipped prelim questions. The score is robust enough for procurement.
Triggers
4+ verified research findings
All numeric anchors filled (volume, exceptions, hours, errors)
0 skipped questions
Moderate
Most questions answered, some research signal, one or two anchors estimated. Treat the score as directional and expect a five- to eight-point band on either side of the headline number.
Triggers
2 to 3 verified research findings
1 or 2 numeric anchors estimated, not measured
1 to 2 skipped or "I'm not sure" answers
Limited
Low public-signal company, three or more skipped questions, or no numeric anchors. The diagnostic still has value, but the headline number should not drive procurement on its own. Re-take after a discovery conversation.
Triggers
Sparse public research signal (no website, no LinkedIn)
3+ skipped or "I'm not sure" answers
No numeric anchors set
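The triggers for the three tiers can be sketched as a small classifier. The trigger lists do not say how conflicting signals combine, so this sketch assumes the weakest signal decides the tier, and it treats fewer than two verified findings as a sparse research signal. The class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AssessmentInputs:
    verified_findings: int   # verified research findings
    anchors_measured: int    # numeric anchors given as measured values
    anchors_estimated: int   # numeric anchors estimated, not measured
    anchors_missing: int     # numeric anchors left unset
    skipped_answers: int     # skipped or "I'm not sure" answers


def confidence_tier(inputs: AssessmentInputs) -> str:
    """Assumed rule: the weakest signal decides the tier."""
    anchors_set = inputs.anchors_measured + inputs.anchors_estimated
    # Limited: sparse research (assumed: under 2 findings),
    # 3+ skipped answers, or no numeric anchors set.
    if (inputs.verified_findings < 2
            or inputs.skipped_answers >= 3
            or anchors_set == 0):
        return "Limited"
    # High: 4+ findings, every anchor measured, nothing skipped.
    if (inputs.verified_findings >= 4
            and inputs.anchors_estimated == 0
            and inputs.anchors_missing == 0
            and inputs.skipped_answers == 0):
        return "High"
    return "Moderate"


# The example line from the text: 5 verified signals, 8 of 10 answered.
example_tier = confidence_tier(AssessmentInputs(5, 3, 0, 0, 2))
```

Under the weakest-signal assumption, the example line in the text comes out as Moderate: five findings would qualify for High, but two unanswered questions hold the tier down.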
How questions feed scoring
Ten to fourteen questions. Every one tied to a pillar.
We tailor the question set to your company; some assessments need ten, some fourteen. No generic maturity questions, no vanity ROI math, no "rate yourself one to five" pseudoscience. The questions come in three layers, each with a defined job.
Layer 1
Prelim questions
3 questions, tailored to your company
We research your company first, then open with these: the document workflow shape, the use case in scope, and who owns it today. Quick to answer, and they set up everything that follows.
Layer 2
Personalized questions
5 to 7 questions, tailored to your company and use case
Drawn from the research, corrected facts, and your first answers. These reference your specific systems, document types, and named exceptions, and never ask the same thing twice.
Layer 3
Numeric anchors
2 to 4 anchors, woven into the personalized pass
Volume, exception rate, hours-per-document, optional error rate. These drive the 12-month value math directly and feed the Volume signature, Rules clarity, and Data quality pillars. We capture a range plus an exact number for each.
The four numeric anchors in detail
Monthly document volume
Bands (sub-500 / 500–2K / 2K–10K / 10K–50K / 50K+) plus an exact number.
Feeds: Volume signature, 12-month value
Exception rate
0 to 30%, with markers at 5%, 15%, and 25% to flag the cliff.
Feeds: Rules clarity, Build complexity
Hours per document
Ranges (under 2 minutes / 2–5 / 5–15 / 15–30 / 30+ minutes) plus an exact number.
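As an illustration of how an exact number maps onto the bands listed for monthly volume and handling time, here is a minimal sketch using the standard-library `bisect` module. The band edges come from the lists above. Treating each lower edge as inclusive is an assumption, and the names are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence

VOLUME_EDGES = [500, 2_000, 10_000, 50_000]
VOLUME_LABELS = ["sub-500", "500–2K", "2K–10K", "10K–50K", "50K+"]

MINUTES_EDGES = [2, 5, 15, 30]
MINUTES_LABELS = ["under 2 minutes", "2–5", "5–15", "15–30", "30+ minutes"]


def band(value: float, edges: Sequence[float], labels: Sequence[str]) -> str:
    """Return the label of the band that contains value.

    Assumption: a value equal to an edge belongs to the upper band,
    so exactly 500 documents falls in "500–2K".
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    return labels[bisect_right(edges, value)]


def volume_band(monthly_volume: float) -> str:
    return band(monthly_volume, VOLUME_EDGES, VOLUME_LABELS)


def minutes_band(minutes_per_document: float) -> str:
    return band(minutes_per_document, MINUTES_EDGES, MINUTES_LABELS)
```

Capturing both the band and the exact number lets the band act as a sanity check on the exact figure.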
Readiness bands
Three bands. Each maps to a different next step.
The Readiness score determines the offer ladder, not the salesperson's mood. We have found the bands stable across industries: a 47/100 logistics operation and a 47/100 commercial real estate operation both land on the discovery-call path because both are the same distance from a working pipeline.
Not ready yet
0 to 30
Fix the basics first.
Foundational gaps in two or more pillars. A pipeline build right now would fail and erode trust across the operation. You still get the full diagnostic, plus a checklist of what to put in place first.
What we recommend
A short list of things to put in place first. We re-assess with you in 60 to 90 days.
Worth a conversation
31 to 65
Worth a conversation.
Most of what is needed is in place. One or two specific gaps drag the score down, and they typically resolve in week one of an engagement.
What we recommend
Tier 0 (named engineer review, no call) or Tier 1 (30-minute discovery call).
Ready to build now
66 to 100
Straight to a build.
Documented systems, named owner, written rules. You are ready to move straight to a build against your real documents.
What we recommend
Tier 3: working proof in a week. Five business days, fixed price, against your real documents.
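The three bands above amount to a lookup from the Readiness score to a next step. A minimal sketch, assuming the score is reported as a whole number; the function name is hypothetical and the recommendation strings paraphrase the text above.

```python
def readiness_band(score: int) -> tuple[str, str]:
    """Map a 0-100 Readiness score to (band name, recommended next step)."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score <= 30:
        return ("Not ready yet",
                "Put the listed basics in place; re-assess in 60 to 90 days")
    if score <= 65:
        return ("Worth a conversation",
                "Tier 0 engineer review or Tier 1 discovery call")
    return ("Ready to build now",
            "Tier 3 working proof in five business days")


# The 47/100 example from the text lands in the middle band.
example_band, example_step = readiness_band(47)
```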
Calibration
What past assessment scores predicted about real outcomes.
Each dot is one of the 47 pipelines we have put into production to date. The horizontal axis is the Readiness score we calculated at intake; the vertical axis is the actual time-to-production in weeks. The dotted line is the regression. Mid-band engagements (31 to 65) reached production in a median of 5.5 weeks. High-band engagements landed in 3 weeks. Low-band engagements that proceeded against our recommendation took 11+ weeks or stalled entirely.
Readiness score (x) vs weeks to production (y) · n = 47 · Legend: Low band, Mid band, High band
Source: 47 Hexaa engagements between 2023 Q3 and 2026 Q1. Weeks to production measured from kickoff to first live production batch. Two stalled engagements were removed from the regression; both were low-band engagements that proceeded against our recommendation.
Why we do not publish per-industry calibration yet.
The 47-pipeline cohort breaks down into roughly 14 logistics, 11 manufacturing, 9 commercial real estate, 7 banking, and 6 across education and government. Sample sizes per vertical are too small for the regression to be honest. We will publish per-vertical calibration once any single industry crosses 50 production pipelines. Until then, the cross-industry regression above is the calibration we trust.
How we re-calibrate.
Every six months we re-run the regression against the latest engagement set. If a pillar weight changes by more than 5 percentage points, the assessment scoring is adjusted and the methodology page is dated. Last calibration: April 2026.
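The re-calibration trigger can be sketched as a comparison of two weight sets. This assumes weights are expressed in percentage points and that "more than 5" is a strict comparison. The function name and the example numbers are hypothetical, not figures from a real calibration run.

```python
from typing import Mapping


def pillars_needing_adjustment(old_weights: Mapping[str, float],
                               new_weights: Mapping[str, float],
                               threshold_pp: float = 5.0) -> list[str]:
    """Return pillars whose weight moved by more than threshold_pp points."""
    if set(old_weights) != set(new_weights):
        raise ValueError("both calibrations must cover the same pillars")
    return sorted(
        name for name in old_weights
        if abs(new_weights[name] - old_weights[name]) > threshold_pp
    )


# Hypothetical before/after weights, in percentage points.
previous = {"Pillar A": 15.0, "Pillar B": 11.0, "Pillar C": 10.0}
latest = {"Pillar A": 21.0, "Pillar B": 13.0, "Pillar C": 5.0}
flagged = pillars_needing_adjustment(previous, latest)
```

In this made-up example only Pillar A is flagged: it moved 6 points, while Pillar C moved exactly 5 and stays under the strict threshold.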
How to read your three scores
Three numbers, one decision.
The Readiness score reflects the operation as it stands at intake. It does not predict your team's capacity to act on the report. We have watched a 38 become a 70 in three weeks because a single named owner walked into the room. We have also watched an 80 stall because the named owner was named and then promoted out.
Build complexity is independent of Readiness. A high-readiness operation with 14 systems and a 22% exception rate still has a complex build. A low-readiness operation with 2 systems and a 4% exception rate is simple to build once the readiness gaps close.
12-month value tells you whether the work pays for itself inside the first year using the volume and labor numbers you give us. We adjust the assumptions with you if any of them feel off, and the math updates as we go.
Three practical rules:
If your Readiness is Not ready yet, do not argue with the report. The three things to fix are listed. Fix them. Re-run.
If your Readiness is Ready to build now and Complexity is low, take the working-proof offer. The discovery call would only confirm what the report already says.
If your Readiness is high and Complexity is also high, book the call. A complex build needs scoping; a 30-minute conversation saves a month of misalignment.
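The three rules can be read as a small decision function over Readiness and Build complexity. The text gives no cut-off between low and high complexity, so `high_complexity_at` below is a hypothetical parameter with an arbitrary default. The mid-band branch falls back to the band recommendation because the three rules do not cover it.

```python
def next_step(readiness: int, complexity: int,
              high_complexity_at: int = 50) -> str:
    """Apply the three practical rules to a pair of 0-100 scores.

    high_complexity_at is an assumed threshold; the methodology
    does not publish one.
    """
    for value in (readiness, complexity):
        if not 0 <= value <= 100:
            raise ValueError("scores must be between 0 and 100")
    if readiness <= 30:
        return "Fix the listed gaps, then re-run the assessment"
    if readiness >= 66:
        if complexity >= high_complexity_at:
            return "Book the 30-minute call to scope the build"
        return "Take the working-proof offer"
    # Mid band: not covered by the three rules, so use the band's own step.
    return "Worth a conversation: Tier 0 review or Tier 1 discovery call"
```

For example, `next_step(80, 20)` returns the working-proof offer, while `next_step(80, 70)` returns the scoping call.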
Want this rubric applied to your operation?
Book a 30-minute call and we will walk your document workflow against these pillars with you.
3. Rules clarity
Whether exception-handling logic is documented or lives only in three dispatchers' heads. The single highest-weight pillar because rules ambiguity is what stalls every other pillar.
Sample questions
When a BOL arrives mismatched, what happens today? Is there a written rule, or is it judgment?
What is your current exception rate as a percentage of monthly volume?
High 66–100: Rules library current, exceptions categorized, exception rate under 5%.
4. Owner & change capacity · 13% weight
Whether there is a named pipeline owner today and the operations team has muscle for change management. Pipelines with no named owner stall in week three.
Sample questions
Who would own the pipeline after handover? Name a person.
What was the most recent operational change your team absorbed, and how long did it take to stabilize?
Low 0–30: No named owner, no recent change history, change-fatigued team.
High 66–100: Named owner with capacity, recent successful changes, ops team trained for adoption.
Technical · 3 pillars · 35% of Readiness aggregate
5. Systems & integration · 14% weight
Destination systems, API availability, master-data quality. Whether the systems we need to write into have documented APIs and credentials sit with named owners.
Sample questions
Where does the validated, clean document data need to land (TMS, ERP, AP system, spreadsheet)?
Is there a sandbox we can integrate against, and who owns the API credential today?
Low 0–30: Closed systems, no API, no sandbox, master data inconsistent.
Mid 31–65: One or two integrated systems, partial API coverage, credential owner identifiable.
High 66–100: Modern systems with documented APIs, sandbox available, credential governance in place.
6. Data quality · 11% weight
Extraction-readiness, structured versus unstructured ratio, accuracy of upstream master data. Garbage in produces garbage automation.
Sample questions
What percentage of inbound documents are structured (forms, EDI) versus unstructured (scanned PDF, photo)?
What is your current data-entry error rate downstream of these documents?
Low 0–30: Mostly unstructured, master data dirty, error rate above 8%.
Mid 31–65: Mixed structured and unstructured, master data usable, error rate 3 to 8%.
High 66–100: Mostly structured or extraction-friendly, master data clean, error rate under 3%.
Strategic · 1 pillar · 15% of Readiness aggregate
8. Build readiness · 15% weight
Executive air cover, time and budget alignment, internal capacity for the engagement. Tied with Rules clarity for the highest weight: a perfect operation with no executive air cover still fails.
Sample questions
Is there an executive sponsor by name, and what is the budget envelope for the first 90 days?
Does your team have 4 to 6 hours per week of capacity to engage during the build, or is everyone already over-allocated?
Low 0–30: No sponsor, aspirational budget, team over-allocated.
High 66–100: Sponsor named, budget approved, capacity blocked on calendars.
Feeds: 12-month value
Error rate (optional)
0 to 20%. Skipped if you do not measure it, and flagged in the confidence tier.
Feeds: Data quality
Question principles
"I'm not sure" is always a valid answer. It handles uncertainty better than fabricated certainty, and a "not sure" carries diagnostic weight ("you do not measure your error rate" is itself signal).
We tell you why we ask each question. Which pillar it feeds and how it moves your score. No black-box scoring.
Question count is dynamic. The set is built for your company, never pulled from a fixed list.
The owner question is asked once. If you tell us early that Sarah owns it today, we cite Sarah by name when we ask who owns the pipeline after handover.
7. Hand-off path · 10% weight
Post-go-live ownership, monitoring, and day-2 operations readiness. Pipelines without a hand-off path get rebuilt by the second vendor in 18 months.
Sample questions
Who monitors the pipeline once it is live, and what does their alert console look like today?
Does your ops team have a runbook discipline for production systems, or is it ad hoc?
Low 0–30: No runbook discipline, no monitoring habit, no day-2 owner.