Case studies/Commercial real estate
A commercial real estate portfolio manager was running 220 assets, with abstractors reading every executed lease end to end before the data hit Yardi. We built an abstraction pipeline that pre-populates the Yardi record against a 36-field schema, so analysts verify against the lease rather than key from scratch.
| Asset | Suite | Tenant | Term end | Status |
|---|---|---|---|---|
| 100 Main St | 0420 | Northline Capital | 2029-08-31 | posted |
| 240 Park Ave | 1100 | Brightline Advisory | 2031-04-30 | posted |
| 875 Market St | 0210 | Foothill Health Group | 2027-12-31 | pending |
| 500 Wacker Dr | 3300 | Cedar Mill Software | 2028-06-30 | posted |
| 120 Beale St | 0840 | Harbor Trust Bank | 2032-03-15 | review |
| 50 Liberty Pl | 2200 | Westbrook Insurance | 2026-11-30 | posted |
| 900 Broadway | 0150 | Maple City Diner | 2027-05-31 | review |
| 15 Federal St | 0900 | Ridgepoint Legal | 2030-09-30 | pending |
At a glance
One portfolio, 220 assets, one Yardi tenancy. The pipeline had to carry scanned leases and executed PDFs without losing fidelity, and it had to feed Yardi a clean abstract the analyst could verify in minutes.
The engagement
The stack
ISO 27001 · ISO 9001 · DPA and NDA signed at kickoff.
Before, the abstractor desk
Abstractors were experienced. They knew where to look for the rent step, where to look for the CAM cap, where the estoppel obligations hid. These were the three patterns we found in discovery.
Base rent, escalation clause, option to renew, CAM exclusions, tenant improvement, security deposit. Each field sat in a different part of the lease. On a clean lease, 4 hours. On a heavily amended one, closer to 6.
Pre-build baseline: 4 to 6 hours per complex lease, keyed field-by-field into Yardi.
A base lease from 2019, three amendments, one assignment. Abstractors abstracted each one, then held all five documents in their heads when they updated the Yardi record. The reconciliation errors surfaced at the next refinance.
Pre-build baseline: ~14% of Yardi records carried a stale clause against the latest amendment.
Older assets in the portfolio carried scanned leases with poor OCR. Abstractors worked off a printed copy, highlighted the 36 fields, then keyed them into Yardi at the end of the day. The scanned workflow ran about 40% slower than the digital one.
Pre-build baseline: scanned lease abstraction ran at approximately 0.6x the speed of digital.
What we built
The pipeline follows the same five stages we run on every CRE engagement. The details below are the ones we implemented for this portfolio, mapped one-for-one against the client's 36-field Yardi abstract schema.
Asset-management data room synced on a nightly run. Property manager email routed via a shared mailbox. Executed-document archive crawled for net-new. All normalised to a single lease ID before classification runs.
Document type tagged on ingest. Scanned originals routed to LandingAI, digital executions to LlamaParse. Amendments and estoppels attached to the parent lease by property and tenant identifier.
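As a rough illustration of the classification stage, the sketch below picks a parser route by scan status and groups amendments and estoppels under their parent lease. The `Document` fields, the function names, and the sample identifiers are all hypothetical; the code only returns the name of a route and does not call either vendor's SDK.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    doc_type: str      # "lease", "amendment", "estoppel" (hypothetical tag values)
    property_id: str
    tenant_id: str
    is_scanned: bool   # True for scanned originals, False for digital executions


def choose_parser(doc: Document) -> str:
    """Return the name of the parsing route; no parser is actually invoked."""
    return "LandingAI" if doc.is_scanned else "LlamaParse"


def group_by_parent(docs):
    """Attach amendments and estoppels to the lease sharing property and tenant."""
    groups = defaultdict(lambda: {"lease": None, "attachments": []})
    for doc in docs:
        key = (doc.property_id, doc.tenant_id)
        if doc.doc_type == "lease":
            groups[key]["lease"] = doc
        else:
            groups[key]["attachments"].append(doc)
    return dict(groups)


# Illustrative documents only; identifiers are made up for the example.
docs = [
    Document("d1", "lease", "100-main", "northline", is_scanned=True),
    Document("d2", "amendment", "100-main", "northline", is_scanned=False),
    Document("d3", "estoppel", "100-main", "northline", is_scanned=False),
]
routes = {d.doc_id: choose_parser(d) for d in docs}
groups = group_by_parent(docs)
```

Keying the grouping on property and tenant mirrors the attachment rule described above; a production version would also need to handle an amendment that arrives before its parent lease.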
Clause-level extraction against the Yardi schema. Base rent, escalation, option to renew, CAM cap, TI, security deposit, assignment, subletting, holdover. Every field carries a source-span citation to the page and clause.
Amendments rolled up against the base lease, so the analyst sees the current effective clause with a history trail. Missing fields or low-confidence extractions held for analyst review. Below 0.90 confidence routes to the named abstractor.
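The review gate can be sketched in a few lines. This is a minimal example that assumes each extracted field carries a value, a confidence score, and a page-and-clause citation; the `ExtractedField` type, the field names, and the sample scores are hypothetical, and only the 0.90 threshold comes from the engagement.

```python
from dataclasses import dataclass
from typing import Optional

REVIEW_THRESHOLD = 0.90  # below this, the field routes to the named abstractor


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: Optional[str]     # None when the field was not found in the lease
    confidence: float
    page: Optional[int]      # source-span citation: page
    clause: Optional[str]    # source-span citation: clause reference


def needs_review(field: ExtractedField) -> bool:
    """Hold a field when it is missing or its confidence is below the threshold."""
    return field.value is None or field.confidence < REVIEW_THRESHOLD


def split_for_review(fields):
    """Return (ready, held): fields ready to post and fields held for review."""
    ready = [f for f in fields if not needs_review(f)]
    held = [f for f in fields if needs_review(f)]
    return ready, held


# Illustrative values only.
fields = [
    ExtractedField("base_rent", "$42.00/sf", 0.97, 3, "4.1"),
    ExtractedField("cam_cap", "5% cumulative", 0.82, 11, "7.3(b)"),
    ExtractedField("security_deposit", None, 0.95, None, None),
    ExtractedField("holdover", "150% of base rent", 0.90, 27, "19.2"),
]
ready, held = split_for_review(fields)
```

In this sketch a score of exactly 0.90 passes, because the rule is "below 0.90". Missing fields are held regardless of score.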
Pre-populated records posted to Yardi Voyager with the lease PDF attached to the deal record. Analysts open the Yardi record and verify against the cited clause, rather than read the whole lease.
After, the numbers the desk signs off
Same abstractors, same analysts, same Yardi tenancy. The pipeline fed each record a pre-populated 36-field abstract with source citations. Analysts verified against the lease, rather than read the lease cover to cover.
Analysts still own every abstract sign-off. They still read the citation for every non-standard clause. The difference is that on a clean lease, the Yardi record is ready to verify by the time the analyst opens it. On a lease with three amendments, the analyst sees the effective clause with a history trail, not a stack of four documents to reconcile.
From the desk
Analysts now verify against the lease instead of keying from scratch, and that is the difference between reading 180 leases and reviewing 180 abstracts.
Director of asset management · Portfolio manager, New York
Handover
The engagement ends at a clean handover. The asset management team runs the pipeline; Hexaa stays on call for a fixed retention period, then steps back.
Related cases
Each links to a named client, a named document, and the system the clean data lands in. We publish only what the client signed off to publish.
Estoppels read and reconciled to the master lease, variances flagged before they reach the closing package. 98% variance catch pre-closing.
→ Construction · 2026
General contractor · AIA G702 and G703 intake
Pay applications extracted against the schedule of values, retainage and stored materials validated before billing.
→ Real estate · 2025
REIT asset manager · rent roll normalisation
Monthly rolls from 11 third-party managers normalised to the portfolio schema, every field traceable back to the source cell.
The base lease carries an escalation clause. The amendment revises it. The current Yardi record needs the effective clause, not either one in isolation. The pipeline reconciles the base, the amendment, and the target schema, flags the revision, and hands the analyst one cited effective clause with a history trail.
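A minimal sketch of that roll-up, assuming the simplest rule: the most recently effective version of a clause supersedes earlier ones in full. Real amendments can modify a clause only in part, so this illustrates the shape of the output rather than the reconciliation logic itself. The `ClauseVersion` type, the dates, and the clause text are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ClauseVersion:
    source: str      # e.g. "base lease", "amendment 1"
    effective: date  # date this version takes effect
    text: str
    page: int        # citation back to the source document


def effective_clause(versions):
    """Return (current, history): the latest version and earlier ones, oldest first."""
    if not versions:
        raise ValueError("no clause versions to reconcile")
    ordered = sorted(versions, key=lambda v: v.effective)
    return ordered[-1], ordered[:-1]


# Illustrative escalation clause: base lease revised by one amendment.
versions = [
    ClauseVersion("amendment 1", date(2022, 1, 1), "2.5% annual escalation", 2),
    ClauseVersion("base lease", date(2019, 3, 1), "3% annual escalation", 5),
]
current, history = effective_clause(versions)
revised = len(history) > 0  # flag shown to the analyst when a clause was amended
```

The analyst would see `current` as the cited effective clause and `history` as the trail behind it.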