Expense Guard lab

A demo-specific comparison of Postgres app glue versus Synapsor durable run state.

Overview

Demo-specific comparison, not a general database benchmark.

Expense Guard covers five review cases: low-risk auto-approval, hotel manager review, duplicate/fraud review, receipt prompt-injection security review, and manager review triggered by a duplicate signal. Both lanes use an agent workflow; the difference is where trust-sensitive workflow state lives.

Postgres plus pgvector plus an agent framework can implement the workflow, but the app handles retrieval, tenant/policy filtering, evidence shape, proposal state, guardrail logic, direct update/audit glue, and replay reconstruction. Synapsor moves hidden session bindings, agent context, hybrid policy retrieval, reason codes, evidence bundles, branch-staged proposals, settlement policy, replay, and audit handles into durable run primitives.

Synapsor's token savings come mostly from moving repeated context/evidence/policy/workflow state into approved capabilities and compact handles, not from making the LLM inherently smaller. The metrics below are controlled demo measurements from five comparable OpenAI Agents SDK runs generated on 2026-05-19; they are not a general benchmark.

Postgres + app glue vs Synapsor

The lab compares where the trust logic lives, not generic OLTP performance.

Postgres + pgvector + agent SDK
  • App owns row fetches and context bundle queries
  • App owns pgvector/text policy retrieval and tenant filtering
  • App owns evidence shape and proposal state machine
  • App owns guardrail checks, direct update path, and audit rows
  • Replay must be reconstructed from logs and app workflow tables
Synapsor + agent SDK
  • DB owns hidden session bindings
  • DB owns context, hybrid policy retrieval, and reason codes
  • DB owns evidence bundles and compact resource ids
  • DB owns write proposals, branch diffs, and settlement policy
  • Replay and audit are persisted capability invocation records

Expense Guard cases

Each case was run through both lanes in the seeded 2026-05-19 controlled demo measurement pass.

1. Green auto-approval: A low-risk receipt with a matching card transaction can auto-settle when policy allows it.
2. Hotel manager review: A hotel expense above the nightly threshold should stage a manager-review proposal.
3. Duplicate/fraud review: A duplicate receipt or high-risk signals should route to finance review or rejection.
4. Receipt injection review: Instruction-like receipt text is treated as untrusted data and routed to security review.

Expense Guard table relationship diagram

The diagram uses the local Expense Guard Synapsor SQL schema plus Synapsor-native evidence, write proposal, branch, and settlement resources.

Diagram: Expense Guard schema relationship diagram. Table relationship diagram showing core rows, reference data, knowledge chunks, evidence, proposals, settlement, branches, and audit/replay tables.
  • tenants: id (PK), name, region
  • employees: id (PK), tenant_id, manager_id
  • expenses: id (PK), employee_id, state
  • card_transactions: id (PK), employee_id, amount_cents
  • expense_policy_chunks: chunk_id (PK), topic/status, body (hybrid)
  • duplicate_signals: expense_id, candidate_expense_id, risk_score
  • expense_guardrail_signals: expense_id, signal_code, severity
  • evidence bundle: source ids, policy hits, guardrail facts
  • write proposal: wrp:// handle, target expenses, allowed columns
  • branch + settlement: auto branch, preview/settle, merge record
  • expense_audit: principal, capability, resource/action
  • Relationship labels: 1:N, card owner, card match, duplicate risk, security risk, hybrid policy, context row, risk facts, guardrail facts, supports, stage, audit
  • Legend: core rows, reference, knowledge, proposal/audit

Synapsor DBMS table design

The Synapsor lane models operational state, append-only transaction history, hybrid policy retrieval, and durable audit/proposal evidence as Synapsor-managed tables.

Tuple hot/reference tables
  • tenants: id, name, region - reference_data
  • employees: id, tenant_id, role, manager_id, spending_limit_cents - hot_state
  • expenses: id, tenant_id, employee_id, card_transaction_id, vendor, amount_cents, receipt_sha, state, reviewer - hot_state
Tuple log/audit tables
  • card_transactions: id, tenant_id, employee_id, merchant, amount_cents, status - append_log
  • duplicate_signals: expense_id, candidate_expense_id, risk_score, reason - audit_log
  • expense_guardrail_signals: expense_id, signal_code, severity, source, reason - audit_log
  • expense_audit: principal, capability, resource, action - audit_log
Hybrid knowledge table
  • expense_policy_chunks: chunk_id, tenant_id, topic, status, allowed_role, body - searchable_knowledge
  • lexical_index='body' and vector_index='body' support hybrid retrieval
  • filter_keys='tenant_id,topic,status,allowed_role' keep tenant/policy scope in the DB
  • zone_map='tenant_id,topic,status' helps skip irrelevant policy segments
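
The filter keys and hybrid index described above can be illustrated with a small in-memory sketch. This is not Synapsor's API or query syntax: `PolicyChunk`, `hybrid_search`, the toy token-overlap lexical score, the hand-written two-dimensional embeddings, the tenant and role values, the `POL-GLOBEX-HOTEL-1` row, and the 0.5 blend weight are all assumptions made for illustration. The sketch only shows the idea of applying tenant, topic, status, and role filters before ranking by a blended lexical and vector score.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class PolicyChunk:
    chunk_id: str
    tenant_id: str
    topic: str
    status: str
    allowed_role: str
    body: str
    embedding: tuple[float, ...]  # hypothetical precomputed vector for `body`


def lexical_score(query: str, body: str) -> float:
    """Fraction of query tokens that also appear in the body (toy lexical match)."""
    q = set(query.lower().split())
    b = set(body.lower().split())
    return len(q & b) / len(q) if q else 0.0


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(chunks, query, query_vec, *, tenant_id, topic, role,
                  status="active", alpha=0.5, k=3):
    """Filter by scope keys first, then rank the survivors by a blended score."""
    in_scope = [
        c for c in chunks
        if c.tenant_id == tenant_id and c.topic == topic
        and c.status == status and c.allowed_role == role
    ]
    scored = [
        (alpha * lexical_score(query, c.body)
         + (1 - alpha) * cosine(query_vec, c.embedding), c)
        for c in in_scope
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c.chunk_id for _, c in scored[:k]]


# Illustrative rows; the GLOBEX row and all embeddings are made up.
CHUNKS = [
    PolicyChunk("POL-ACME-MEALS-1", "acme", "meals", "active", "expense_agent",
                "Meals under $75 with a receipt are normally auto-approved", (0.9, 0.1)),
    PolicyChunk("POL-ACME-HOTEL-1", "acme", "hotel", "active", "expense_agent",
                "Hotels above $250/night require manager review", (0.1, 0.9)),
    PolicyChunk("POL-GLOBEX-HOTEL-1", "globex", "hotel", "active", "expense_agent",
                "Hotels above $400/night require manager review", (0.1, 0.9)),
]

acme_hits = hybrid_search(CHUNKS, "hotels manager review", (0.2, 0.8),
                          tenant_id="acme", topic="hotel", role="expense_agent")
print(acme_hits)
```

Because the scope filter runs before scoring, a chunk from another tenant cannot be returned no matter how well its text or vector matches the query.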

Sample seed data

Representative seeded rows from the Expense Guard demo. The page reports measured workflow metrics from this controlled seed set, not a universal benchmark.

Expense cases
  • EXP-1001 | Blue Bottle Coffee | $38.00 meal with receipt and matching card transaction
  • EXP-1002 | Marriott Marquis | $780.00 hotel, two nights, manager-review threshold
  • EXP-1003 | Staples | $119.99 office supplies receipt containing prompt-injection text
  • EXP-1004 | Delta Airlines | $642.00 airfare with duplicate receipt signal
  • EXP-2001 | Uber | $92.00 ground transport under the old auto-approval policy
Evidence and policy rows
  • POL-ACME-MEALS-1 | Meals under $75 with a receipt are normally auto-approved
  • POL-ACME-HOTEL-1 | Hotels above $250/night require manager review
  • POL-ACME-FRAUD-1 | Duplicate receipts and receipt prompt injection require review
  • DUP-1004 | Same receipt hash/vendor/amount as already approved EXP-1005
  • GRD-1003 | receipt_instruction_injection signal from receipt text
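
The seeded policy rows above imply a deterministic routing order, which can be sketched as follows. The function name `route_expense`, the lane labels, the field and signal names other than `receipt_instruction_injection`, the default lane, and the order of the checks (security signals first, then duplicates, then amount thresholds) are assumptions for illustration. Only the $75 meal and $250/night hotel thresholds come from the policy rows.

```python
from dataclasses import dataclass, field

MEAL_AUTO_APPROVE_LIMIT_CENTS = 75_00  # from POL-ACME-MEALS-1
HOTEL_NIGHTLY_LIMIT_CENTS = 250_00     # from POL-ACME-HOTEL-1


@dataclass
class Expense:
    expense_id: str
    category: str                 # hypothetical category label
    amount_cents: int
    nights: int = 1
    has_receipt: bool = True
    signals: set[str] = field(default_factory=set)


def route_expense(e: Expense) -> str:
    """Return a review lane; the check order is an assumption, not the demo's code."""
    if "receipt_instruction_injection" in e.signals:
        return "security_review"   # receipt text is untrusted data
    if "duplicate_receipt" in e.signals:
        return "finance_review"
    if e.category == "hotel":
        per_night = e.amount_cents / max(e.nights, 1)
        if per_night > HOTEL_NIGHTLY_LIMIT_CENTS:
            return "manager_review"
    if (e.category == "meal" and e.has_receipt
            and e.amount_cents < MEAL_AUTO_APPROVE_LIMIT_CENTS):
        return "auto_approve"
    return "manager_review"        # assumed conservative default


SEEDED = [
    Expense("EXP-1001", "meal", 38_00),
    Expense("EXP-1002", "hotel", 780_00, nights=2),
    Expense("EXP-1003", "office_supplies", 119_99,
            signals={"receipt_instruction_injection"}),
    Expense("EXP-1004", "airfare", 642_00, signals={"duplicate_receipt"}),
]

routes = {e.expense_id: route_expense(e) for e in SEEDED}
print(routes)
```

Checking signals before amounts means a small expense with an injection or duplicate signal never reaches the auto-approval branch.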

Controlled demo measured averages

Source artifact: docs/labs/expense-guard-metrics-20260519.json. These demo-specific measured averages describe this workflow only and are not a latency or universal benchmark claim.

Category | Metric | Postgres + app lane | Synapsor lane | Result
Token pressure | Average input tokens | 5,143 | 2,139 | 58.4% fewer input tokens
Token pressure | Average output tokens | 268 | 220 | 18.1% fewer output tokens
Token pressure | Average total tokens | 5,411 | 2,359 | 56.4% fewer total tokens
Workflow overhead | Average tool calls | 2.8 | 2.0 | 28.6% fewer tool calls
Workflow overhead | Average DB round trips | 13.8 | 2.0 | 85.5% fewer DB round trips
Workflow overhead | Elapsed time | 20.2s average | 20.7s average | not a speed claim
Workflow overhead | App-owned glue LOC | 503 | 68 | 86.5% less app-owned glue
Trust and audit output | Evidence completeness | app-assembled, not stored by Synapsor | evidence bundle recorded | Synapsor records evidence lookup records
Trust and audit output | Write proposal objects | app proposal ids | wrp:// proposals | Synapsor stages writes on branches
Trust and audit output | Replay/audit records | reconstructed from app tables | durable run/evidence lookup records | Synapsor records replayable decision state
Trust and audit output | Policy duplication points | 4 app-owned points | 0 app-owned points | policy checks move into Synapsor capability/settlement logic
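
The "Result" percentages are relative reductions against the Postgres + app lane. A minimal sketch of that arithmetic follows; the helper name `percent_reduction` is an assumption, and the inputs are the rounded averages published in the table. It covers only rows whose published percentage can be reproduced from the rounded values shown.

```python
def percent_reduction(baseline: float, candidate: float) -> float:
    """Relative reduction of candidate versus baseline, in percent, to one decimal."""
    return round((baseline - candidate) / baseline * 100, 1)


# (Postgres + app lane, Synapsor lane) averages copied from the table.
ROWS = {
    "input tokens": (5143, 2139),
    "total tokens": (5411, 2359),
    "tool calls": (2.8, 2.0),
    "DB round trips": (13.8, 2.0),
    "app-owned glue LOC": (503, 68),
}

reductions = {name: percent_reduction(pg, syn) for name, (pg, syn) in ROWS.items()}
for name, pct in reductions.items():
    print(f"{name}: {pct}% lower in the Synapsor lane")
```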

Case-level input-token results

Seeded demo cases from the Expense Guard workflow. Savings vary by case and prompt shape.

Category | Metric | Postgres + app lane | Synapsor lane | Result
Cases | EXP-1001 low-risk auto approval | 5,467 | 2,301 | 3,166 saved, 57.9%
Cases | EXP-1002 hotel manager review | 3,541 | 1,987 | 1,554 saved, 43.9%
Cases | EXP-1003 duplicate/fraud review | 5,549 | 2,112 | 3,437 saved, 61.9%
Cases | EXP-1004 receipt injection security review | 5,658 | 1,994 | 3,664 saved, 64.8%
Cases | EXP-2001 manager review with duplicate signal | 5,499 | 2,302 | 3,197 saved, 58.1%
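
The per-case savings and the averages reported earlier can be cross-checked from these five rows. The sketch below is illustrative only; the names `CASE_INPUT_TOKENS` and `savings` are assumptions, and the numbers are copied from the table.

```python
# (Postgres + app lane, Synapsor lane) input tokens per seeded case.
CASE_INPUT_TOKENS = {
    "EXP-1001": (5467, 2301),
    "EXP-1002": (3541, 1987),
    "EXP-1003": (5549, 2112),
    "EXP-1004": (5658, 1994),
    "EXP-2001": (5499, 2302),
}


def savings(pg: int, syn: int) -> tuple[int, float]:
    """Return (tokens saved, percent saved relative to the Postgres lane)."""
    saved = pg - syn
    return saved, round(saved / pg * 100, 1)


per_case = {cid: savings(pg, syn) for cid, (pg, syn) in CASE_INPUT_TOKENS.items()}

n = len(CASE_INPUT_TOKENS)
avg_pg = sum(pg for pg, _ in CASE_INPUT_TOKENS.values()) / n
avg_syn = sum(syn for _, syn in CASE_INPUT_TOKENS.values()) / n

for cid, (saved, pct) in per_case.items():
    print(f"{cid}: {saved} saved, {pct}%")
print(f"average input tokens: {round(avg_pg)} vs {round(avg_syn)}")
```

The case means round to 5,143 and 2,139, matching the average input tokens in the measured-averages table.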

Expense review proposal flow

The Synapsor lane keeps the risky write staged on a review branch until policy or a reviewer approves it.

Diagram: Expense review proposal flow. Expense Guard stages reimbursement decisions on a branch with policy chunks, receipt evidence, reason codes, and replayable settlement state. Lanes: expense main, expense review branch.

1. main: The expense table visible to the application remains unchanged while the agent evaluates the case.
2. proposal branch: Synapsor stages the suggested category, approval status, or reimbursement change away from main.
3. preview diff: Reviewers see the row-level diff plus policy evidence and reason codes.
4. approve/settle: A human reviewer or deterministic low-risk settlement policy decides the outcome.
5. commit: Only approved changes merge back to main; rejected proposals leave production unchanged.
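
The five steps above can be sketched as a small in-memory state machine. This is not Synapsor's interface: `ExpenseStore`, `Proposal`, the method names, the `wrp-demo-1` id, the reason code, and the `state` values are hypothetical, and a real implementation would persist branches and audit records durably. The sketch only shows that staged changes stay off main until a settle decision.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    proposal_id: str              # hypothetical id; the demo uses wrp:// handles
    expense_id: str
    changes: dict                 # staged column -> proposed value
    reason_codes: list = field(default_factory=list)
    status: str = "staged"        # staged -> approved | rejected


class ExpenseStore:
    def __init__(self, rows):
        self.main = {k: dict(v) for k, v in rows.items()}  # step 1: main rows
        self.proposals = {}
        self.audit = []                                    # append-only log

    def stage(self, proposal):
        """Step 2: record the proposed change without touching main."""
        self.proposals[proposal.proposal_id] = proposal
        self.audit.append(("stage", proposal.proposal_id))

    def preview_diff(self, proposal_id):
        """Step 3: column -> (current value, proposed value) for changed columns."""
        p = self.proposals[proposal_id]
        row = self.main[p.expense_id]
        return {col: (row.get(col), new)
                for col, new in p.changes.items() if row.get(col) != new}

    def settle(self, proposal_id, approved):
        """Steps 4 and 5: merge into main only when the proposal is approved."""
        p = self.proposals[proposal_id]
        if p.status != "staged":
            raise ValueError(f"{proposal_id} is already {p.status}")
        if approved:
            self.main[p.expense_id].update(p.changes)
        p.status = "approved" if approved else "rejected"
        self.audit.append((p.status, proposal_id))


store = ExpenseStore({"EXP-1002": {"state": "submitted", "reviewer": None}})
store.stage(Proposal("wrp-demo-1", "EXP-1002",
                     {"state": "approved", "reviewer": "manager-1"},
                     ["hotel_over_nightly_limit"]))
diff = store.preview_diff("wrp-demo-1")
state_before_settle = store.main["EXP-1002"]["state"]  # still "submitted"
store.settle("wrp-demo-1", approved=True)
print(diff, state_before_settle, store.main["EXP-1002"])
```

Calling `settle(..., approved=False)` instead marks the proposal rejected and leaves the main row as it was.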

Developer notes

  • All published numbers are tied to docs/labs/expense-guard-metrics-20260519.json.
  • Token counts, tool calls, DB round trips, elapsed time, and app-owned glue LOC were captured for both lanes.
  • The two lanes compared are Postgres + pgvector + OpenAI Agents SDK and Synapsor + OpenAI Agents SDK.
  • The useful takeaway is workflow ownership, trust, and auditability, with token reduction measured for this seeded demo.