Platform · Raw input → sealed record

How a raw file becomes a record that survives discovery.

The connector is the easy part. The hard part is everything that happens next — to every email, PDF, photo, statement, telemetry stream, and voice note you let in. Seven stages, each one inspectable, each one reproducible. The pipeline is what turns a Tesla session, a 14-line HELOC statement, or a parking-lot voice note into a typed record the Provenance Ledger will sign and keep forever.


MULTIMODAL INGESTION PIPELINE · 7 stages · 51 steps

OCR, diarization, statement parsing, entity extraction, linking, and embedding — every record made legible before it lands in the model.

OCR · Diarization · Parsing · Extraction · Linking · Embedding · Classification · Validation · Indexing

Why this matters

A file you can't find isn't a record. It's just clutter.

Most of your life already lives in files — but a PDF in a folder doesn't know what it is, who it's about, or when it matters. The pipeline is the part of Lossless that turns a pile of documents into something you can actually ask questions of.

✓ It runs on its own — connect an account and walk away
✓ It reads what people can read: scans, handwriting, audio, video, photos
✓ It files every record into 19 domains and 130+ record types
✓ It writes a plain-language summary for every single record
✓ It never edits or deletes the original — only ever reads

What you connected

scan_2026-05-12_0007.pdf
— 2 pages, no text layer, slightly skewed

↓

What the pipeline gave back

Auto insurance — policy renewal

Domain: Vehicles → Vehicle Insurance

Renews: Jun 1, 2026

Premium: $1,284 / 6 mo (▲ $90)

Category: Vehicle Expenses → Vehicle insurance

Summary written · Sealed · 1 source

The seven stages

Open any stage. See what's actually happening.

Under the hood there are 51 individual steps across 7 stages. You don't need to think about any of them — but you should be able to see them. Tap a stage to look inside.

Stage 1 · Ingestion

Nothing sits in a folder waiting to be noticed. The instant a file comes in, whether from your inbox, your drive, an upload, or a voice note, Lossless gives it a stable address and opens a record for it. From here on, it can't get lost.

Given a permanent, addressable home
A record of its own is opened
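
One way to picture the "stable address" step is content addressing: hash the file's bytes and use the digest as its permanent name. This is only a sketch under that assumption. The page does not say how Lossless derives addresses, and `Record`, `stable_address`, and `ingest` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Record:
    address: str   # hypothetical stable address derived from the file bytes
    source: str    # where the file came from: inbox, drive, upload, voice note
    filename: str
    opened_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    status: str = "opened"


def stable_address(data: bytes) -> str:
    """Same bytes always map to the same address (assumed scheme, not confirmed)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def ingest(data: bytes, filename: str, source: str) -> Record:
    """Assign an address, then open a record that points at it."""
    return Record(address=stable_address(data), source=source, filename=filename)


record = ingest(b"%PDF-1.7 example bytes", "scan_2026-05-12_0007.pdf", "upload")
print(record.address[:16], record.status)
```

Because the address depends only on the bytes, re-ingesting the same file lands on the same address, which is one simple way a file "can't get lost."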

Stage 2 · Extraction

A scanned receipt isn't text until someone reads it. Lossless reads everything: the words on a page, the words inside a photo, what's spoken in an audio note, what happens in a video. If it's in another language, it's translated to English. It even reads the quiet details (when a photo was taken, where, and on what device) and picks out the people, businesses, amounts, orders, and dates along the way.

Printed text & handwriting (OCR)
Images inside documents
Audio transcribed
Video transcribed & narrated
Photos described
Anything translated to English
When & where it was created
The file's own metadata
People mentioned
Businesses & organizations
Amounts & money
Orders & receipts
Events & dates
Anything you've taught it to look for
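
To make "amounts" and "dates" concrete, here is a deliberately small sketch that pulls both out of text that has already been OCR'd. It uses plain regular expressions as a stand-in; the real extractors are not described on this page, and the patterns and function names below are assumptions for illustration only.

```python
import re
from datetime import date

# Dollar amounts such as $90, $1,284 or $1,284.50
AMOUNT = re.compile(r"\$\s?(\d+(?:,\d{3})*(?:\.\d{2})?)")

# Dates written like "Jun 1, 2026" (abbreviated English month names only)
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
DATE = re.compile(r"\b(" + "|".join(MONTHS) + r") (\d{1,2}), (\d{4})\b")


def extract_amounts(text: str) -> list:
    """Return every dollar amount in the text as a float."""
    return [float(raw.replace(",", "")) for raw in AMOUNT.findall(text)]


def extract_dates(text: str) -> list:
    """Return every 'Mon D, YYYY' date in the text as a datetime.date."""
    return [
        date(int(year), MONTHS.index(month) + 1, int(day))
        for month, day, year in DATE.findall(text)
    ]


text = "Auto insurance policy renewal. Renews Jun 1, 2026. Premium $1,284 / 6 mo, up $90."
print(extract_amounts(text))  # [1284.0, 90.0]
print(extract_dates(text))    # [datetime.date(2026, 6, 1)]
```

A production extractor would also need to handle other currencies, date formats, and OCR noise; the point here is only the shape of the output: typed values rather than raw strings.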

Stage 3 · AI Analysis

Now the file is understood, not just transcribed: what it's about, who it concerns, why it matters, and what (if anything) you need to do about it. Lossless then connects the new file to the ones it belongs with, such as the confirmation email, the earlier statement, the right person, the right property. The pile starts becoming a story.

What it's about
Key quotes pulled out
Action items spotted
Routed to the right handler
Linked to related files
Tied to the right people
Attributed to the right property
Patterns & behavior noticed
Importance scored
Sorted into clean, structured fields
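
Linking a new file to "the ones it belongs with" can be sketched as entity overlap: two files are candidates for a link when they mention enough of the same things. The scoring rule, the threshold, and the sample entity strings below are all hypothetical; the page does not describe how Lossless actually ranks related files.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    name: str
    entities: frozenset  # normalized entity keys, e.g. "vehicle:car-1"


def related(new: Doc, existing: list[Doc], min_shared: int = 2) -> list[tuple[str, int]]:
    """Return (name, shared_entity_count) for documents that overlap enough."""
    scored = []
    for doc in existing:
        shared = len(new.entities & doc.entities)
        if shared >= min_shared:
            scored.append((doc.name, shared))
    # Strongest overlap first; ties broken by name so the order is deterministic.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


renewal = Doc("scan_0007.pdf", frozenset({"insurer:example", "vehicle:car-1", "amount:1284.00"}))
library = [
    Doc("confirmation_email.eml", frozenset({"insurer:example", "vehicle:car-1"})),
    Doc("earlier_statement.pdf", frozenset({"insurer:example", "vehicle:car-1", "amount:1194.00"})),
    Doc("grocery_receipt.jpg", frozenset({"amount:1284.00"})),
]
print(related(renewal, library))
```

The grocery receipt shares only an amount with the renewal, so it stays below the threshold and is not linked.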

Stage 4 · Enrichment & Summarization

Every record gets a short, plain-language summary and a one-line brief, the kind of thing you'd jot on a sticky note. It's also categorized: filed into the taxonomy of domains, document families, and spend categories. And if something's missing or thin, Lossless goes back and fills the gap rather than leaving a hole.

Filed into a category
Classified by document type
Gaps healed
A plain-language summary
A one-line brief
How relevant it is to you
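
"Gaps healed" can be read as a two-part loop: detect which expected fields are empty, then try a fallback for each. The sketch below assumes a flat dictionary record and a hypothetical set of required fields and filler functions; none of these names come from Lossless itself.

```python
# Hypothetical list of fields every enriched record is expected to carry.
REQUIRED = ("category", "document_type", "summary", "brief")


def find_gaps(record: dict) -> list:
    """Names of required fields that are missing or blank."""
    return [key for key in REQUIRED if not str(record.get(key) or "").strip()]


def heal(record: dict, fillers: dict) -> dict:
    """Return a copy with each gap filled where a filler exists; the input is not modified."""
    healed = dict(record)
    for key in find_gaps(record):
        filler = fillers.get(key)
        if filler is not None:
            healed[key] = filler(healed)
    return healed


record = {
    "category": "Vehicle Expenses → Vehicle insurance",
    "summary": "Auto insurance policy renewal. Renews Jun 1, 2026.",
    "brief": "",
}
# Illustrative filler: derive the one-line brief from the first sentence of the summary.
fillers = {"brief": lambda r: r["summary"].split(".")[0]}

healed = heal(record, fillers)
print(healed["brief"])    # Auto insurance policy renewal
print(find_gaps(healed))  # ['document_type']
```

Anything still listed by `find_gaps` after healing is a natural candidate for a "needs review" flag rather than a silent hole.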

Stage 5 · AI Search Readiness

A record is only useful if it shows up when you need it. Lossless tags each one with the topics and the bigger themes it belongs to, drawn from a three-tier map of your life that the system builds for you, so a question three years from now still finds the answer.

Topics extracted
Themes assigned

Stage 6 · Persistence

This is the part the name is about. The finished record is kept in several places at once, made searchable by meaning rather than just keywords, woven into the graph of people and places and things you own, and sealed with its provenance: payload, origin, seal. The original file is never touched. Nothing gets lost.

Searchable by meaning
Stored in the database
Original kept in cloud storage
Added to the vector store
Woven into your entity graph
Metadata written back
Sealed with provenance
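
The "payload, origin, seal" triple can be illustrated with a keyed hash over a canonical serialization. This is a minimal sketch, assuming an HMAC-SHA256 tag as the seal; the Provenance Ledger's actual signature scheme is not specified on this page, and `canonical`, `seal`, and `verify` are hypothetical helpers.

```python
import hashlib
import hmac
import json


def canonical(obj) -> bytes:
    """Serialize deterministically so the same content always yields the same bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def seal(payload: dict, origin: dict, key: bytes) -> dict:
    """Attach a tag computed over payload and origin together."""
    body = {"payload": payload, "origin": origin}
    tag = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    return {**body, "seal": tag}


def verify(sealed: dict, key: bytes) -> bool:
    """Recompute the tag from payload and origin and compare in constant time."""
    body = {"payload": sealed["payload"], "origin": sealed["origin"]}
    expected = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sealed["seal"])


key = b"demo-key-not-for-production"
sealed = seal(
    {"record_type": "Vehicle Insurance", "renews": "2026-06-01"},
    {"source": "upload", "filename": "scan_2026-05-12_0007.pdf"},
    key,
)
print(verify(sealed, key))  # True

tampered = {**sealed, "payload": {**sealed["payload"], "renews": "2027-06-01"}}
print(verify(tampered, key))  # False
```

Canonical serialization is what makes re-derivation meaningful: if the same source produces the same payload, the recomputed tag matches the stored one regardless of key order in memory.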

Stage 7 · Quality Assurance

Before Lossless calls a record done, one more pass looks it over, checking for anything thin, missing, or off, and fixes it. The bar is simple: would this hold up if you actually had to rely on it?

A final quality review & gap repair

7 STAGES · 51 STEPS · RUNS IN THE BACKGROUND · YOU NEVER PRESS A BUTTON

Inside the app

You can watch it work.

Open any record — or any upload batch — and there's an AI Pipeline tab. Every step, timed. Every file in a batch, mapped. Nothing hidden behind a spinner.

lossless · records · evidence_photo_001.jpg

Live

The Overview tab shows the AI summary, the one-line brief, and every extracted entity. These are among the six things every record carries, covered further down this page.

68% · Elapsed 0:41 · Remaining 0:19 · Steps 35/51 · Stage 3 / 7

Currently running: Topic Extraction — AI Analysis

📥 Ingestion 2/2

🔍 Extraction 16/16

🧠 AI Analysis 7/15

✓ Scores Extraction · 1.4s
✓ Quotes Extraction · 2.1s
✓ Action Items Extraction · 1.8s
✓ Document Routing · 0.9s
✓ Related Files Discovery · 3.2s
✓ Person-Document Linking · 2.0s
✓ Behavior Pattern Analysis · 2.7s
… Topic Extraction · running…
· AI Structured Extraction (pending)

✨ Enrichment & Summarization 0/6

🔗 AI Search Readiness 0/2

💾 Persistence 0/9

✅ Quality Assurance 0/1

File completeness heatmap: 240 files in this batch (186 complete)

Legend: Complete · Processing · Pending · Needs review

Every cell is one file. Hover to inspect. A batch of years of paperwork runs the whole 51-step pipeline, file by file — and tells you exactly which five need a human glance.

The Entities tab lists every person, business, place, amount, date, topic, and action item the pipeline pulled — each one tappable, each one linked back to the source.

Follow one file

Pick something ordinary. Watch it go through.

The pipeline treats a voice memo and a nine-page bank statement with the same care. Choose one and see.


The taxonomy

Every record gets a place.

"Categorized" isn't a vague promise here. There's a real, structured taxonomy underneath — domains, record types, document families, and a deep spend map — and every file is routed into it.

Step 1

A file arrives

Any type — email, PDF, photo, statement, voice note.

"scan_0007.pdf"

Step 2

Routed to a domain

One of 19 life domains — the broad area it belongs to.

→ Vehicles

Step 3

Typed as a record

One of 130+ record types — what kind of thing it is.

→ Vehicle Insurance

Step 4

Filed by category

Document family + spend category, down to the sub-category.

→ Vehicle Expenses → Vehicle insurance

Step 5

Tagged & linked

Entities, topics and themes — so it surfaces on demand.

→ people · amounts · "Auto & Insurance"
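
The five routing steps above can be sketched as a small typed structure plus a check against the taxonomy. The `Filing` class, the `file_record` helper, and the one-entry `TAXONOMY` table are hypothetical; only the example values (Vehicles, Vehicle Insurance, Vehicle Expenses → Vehicle insurance) are taken from this page.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Tiny illustrative slice of the taxonomy: domain -> allowed record types.
TAXONOMY = {"Vehicles": {"Vehicle Insurance"}}


@dataclass
class Filing:
    filename: str                 # Step 1: the file that arrived
    domain: str                   # Step 2: one of the life domains
    record_type: str              # Step 3: what kind of thing it is
    spend_path: tuple[str, ...]   # Step 4: spend category down to the sub-category
    tags: list[str] = field(default_factory=list)  # Step 5: entities, topics, themes


def file_record(filename, domain, record_type, spend_path, tags=()) -> Filing:
    """Refuse to file a record under a type its domain does not define."""
    if record_type not in TAXONOMY.get(domain, set()):
        raise ValueError(f"{record_type!r} is not a record type in domain {domain!r}")
    return Filing(filename, domain, record_type, tuple(spend_path), list(tags))


filing = file_record(
    "scan_0007.pdf",
    "Vehicles",
    "Vehicle Insurance",
    ["Vehicle Expenses", "Vehicle insurance"],
    ["people", "amounts", "Auto & Insurance"],
)
print(" → ".join(filing.spend_path))  # Vehicle Expenses → Vehicle insurance
```

Validating at filing time is one way to keep "categorized" from drifting into free-text labels: a record either fits the taxonomy or is rejected for review.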

19 domains — the broad areas of a life

Every record belongs to one. Together they hold 130+ record types and 25+ sub-types.

Finance & Banking

20 record types

Accounts, credit cards, statements, transactions, loans, tax documents.

Properties & Rentals

25 record types

Units, leases, bookings, guests, utility bills, P&L reports.

Vehicles

10 record types

Trips, tolls, tickets, service history, mileage, insurance, claims.

Legal & Divorce

12 record types

Cases, disclosures, custody arrangements, support orders, chronologies.

People & Relationships

6 record types

People, contacts, relationships, life events, promises, grievances.

Email & Messaging

5 record types

Emails, text messages, notes, unified messages, threads.

Voice

3 record types

Voice sessions, audio recordings, voice profiles.

Calendar & Events

3 record types

Calendar events, universal events, trips.

Documents & Records

4 record types

Records, upload batches, quotes, extracted entities.

Trips & Travel

8 record types

Flights, hotels, car rentals, itineraries, travel documents.

Health & Medical

6 record types

Medical records, prescriptions, evaluations, insurance policies.

E-Commerce & Orders

4 record types

Vendors, orders, order items, receipts.

+ Memory & Knowledge · Projects & Action Items · Topics & Pulse · Photos & Media · AI Chat · Device Sync · System — 19 domains in all

39 document families — what kind of document it is

A second axis, running across the domains: what kind of document it is, sorted into the eight groups below.

Financial 10

Bank statements · Credit-card statements · Tax documents · Investment statements · Bills · Receipts · Invoices · Collection notices · Order & return receipts

Personal 8

Text messages · Emails · Voice recordings · Voicemails · Social media · Photographs · Video recordings · Screenshots

Medical 6

Medical records · Psychiatric evaluations · Therapy notes · Prescription records · Medical insurance policies · Hospital records

Legal 5

Police reports · Court documents · Restraining orders · Motions & filings · Legal correspondence

Rental management 5

Property documents · Rental income records · Maintenance & repairs · HOA documents · Insurance policies & claims

Childcare 4

School records · Childcare documents · Custody schedules · Child support records

Other 3

Witness statements · Timeline documents · Unknown / unclassified

Plus file formats 18

PDF · Word · Sheets · Slides · Photos · Screenshots · iPhone messages · Audio · Video · Code · Archives

The spend map — 21 categories, 100+ sub-categories

When a record involves money, it's filed down to the leaf. Tap a category to see how deep it goes.

Gas · Charging · Parking · Tolls · Registration · Service · Oil change · Tires · Brake replacement · Repairs · Vehicle cleaning · Towing · Vehicle insurance · Parking tickets · Speeding tickets · Vehicle purchase · Vehicle lease · Vehicle financing · Vehicle accessories

Furnishings · Smart home devices · Furniture · Fixtures · Large appliance · Kitchen cabinetry · Bathroom tilework · Plumbing · Electrical · HVAC systems · Painting · Drywall · Framing & lumber · Roof · Flooring · Renovation · Demolition · Permits · Contractor labor · Interior design · Architecture · Staging

Cleaning · Repairs · Listings fees · Legal services · CPA services · Marketing & ads · Guest toiletries · Bedding & towels · Cleaning supplies · Guest gifts · Decor · Kitchenware · Smart home devices · Wi-Fi devices · Utilities · Internet · Mortgage components · HELOC · HOA fees · Property insurance · Subscriptions

Payroll deposit · Payout deposit · Account transfer · Cash withdrawal · ATM withdrawal · Credit card payment · Mortgage payment · HELOC payment · Digital wallet funding · Interest fee · Overdraft fee · Late fee · Wire fee · Service fee · Processing fee · Management fee · Dividend credit · Interest credit · Reversal · Clawback

Movies · Shows · Theater · Concerts · Festivals · Comedy · Sport events · Theme parks · Zoos & aquariums · Museums & galleries · Arcades & recreation · Casinos & gambling · Nightclubs · Raves · Parties · Bars · Event tickets & fees

Dental · Pharmacy · Supplements · Vision · Medical insurance · Medical appointment · Medical procedure · Medical supplies · Labwork · Skin treatments · Addiction treatment · Therapy · Spa · Massage · Yoga class · Gym membership · Personal training

+ Groceries & Food · Housing · Digital Services · Shopping & Retail · Travel · Transit · Education · Childcare · Charity & Gifts · Professional Services · Beauty & Personal Care — 21 in all

What you get back

Six things every record carries — that the raw file never did.

The plain-language summary

A short, human paragraph that explains what the record is — readable in about five seconds, no jargon, no skimming a PDF.

The one-line brief

An entity-first headline with the significant stuff flagged — the deadline, the amount, the thing you'd actually want to know first.

The right shelf

Routed into the taxonomy — a domain, a record type, a document family, and a spend category down to the sub-category.

The entity graph

Every person, business, place, amount, and date it touched — linked, so you can walk from a person to a property to an account.

Themes & topics

The throughlines of your life, tagged onto the record across a three-tier topic map — so a question years later still surfaces it.

The provenance seal

Payload, origin, seal. Every finished record carries proof of where it came from — so any agent you authorize can trust it.

By the numbers

One pipeline. Every kind of file you own.

7 stages · 51 steps

19 domains · 130+ record types

39 document families

100+ spend sub-categories

"I connected eighteen years of Gmail and a folder of scanned paperwork I'd been avoiding. A day later it was all just… records — summarized, dated, filed under the right category. I didn't lift a finger. The pipeline did the part I'd been dreading for a decade."
— Beta user · two-Tesla household · San Francisco

"Reproducibility at the parser level is the part the security auditor actually cares about — re-run the source, re-derive the record, the signature still verifies."

— What the technical evaluator writes in their report

Continue the architecture tour

You've seen the kiln. Now see what feeds it.

Next pillar: the schema-aware connectors that hand the pipeline its raw material — Gmail receipts, Plaid statements, Tesla telemetry, iMessage threads — each one parsed into a typed record, not a blob.
