The week of work that ends every RFP — done in one pass. Paste your weighted requirement list on the left, the vendor proposals on the right (2-3 vendors, separated by a delimiter line). One structured-output call returns a 1-5 scored comparison matrix with a rationale per cell, weighted totals per vendor, ranked from highest to lowest, and a recommendation with explicit confidence and caveats. The first procurement deck slide, populated from nothing.
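For concreteness, a hypothetical input pair. The [N] weight prefix is the optional syntax described below; the --- delimiter line between proposals is an assumed convention, and the exact syntax is whatever the demo's parser accepts:

requirements_block:
    [5] SSO via SAML or OIDC
    [3] EU data residency
    [2] REST API with webhooks

proposals_block:
    <Vendor A's full proposal text>
    ---
    <Vendor B's full proposal text>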
Browser
  └─→ POST /api/lab/chat
        - system: scoring prompt with the schema below
          (versioned: rfp.v1)
        - user: { requirements_block, proposals_block }
        - temperature: 0.2
        - max_tokens: 3000
  ← single response, parsed as JSON: {
        rfp_summary, category,
        requirements[], vendors[], scores[],
        totals[], recommendation
      }
  ← matrix render is a CSS grid:
        grid-template-columns: 240px repeat(V, 1fr)
  ← weighted_total computed client-side as Σ(score × weight)
        and re-validated against model output
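In TypeScript, the round trip is a single fetch. A minimal sketch, assuming the endpoint, payload fields, and sampling parameters shown in the diagram; the inner shape of each response field, the ScoreCell breakdown, and the SYSTEM_PROMPT_RFP_V1 constant are illustrative guesses, since only the top-level field names appear above.

declare const SYSTEM_PROMPT_RFP_V1: string; // the versioned rfp.v1 scoring prompt + schema

interface ScoreCell {
  requirement: string;            // requirement text (assumed join key)
  vendor: string;
  score: 1 | 2 | 3 | 4 | 5;       // integer scale per the prompt
  rationale: string;              // 1-2 sentence justification
  evidence: string;               // short pull from the proposal
}

interface RfpResult {
  rfp_summary: string;
  category: string;
  requirements: { text: string; weight: number }[];
  vendors: string[];
  scores: ScoreCell[];
  totals: { vendor: string; weighted_total: number }[];
  recommendation: { vendor: string; confidence: string; caveats: string[] };
}

async function scoreRfp(
  requirementsBlock: string,
  proposalsBlock: string,
): Promise<RfpResult> {
  const res = await fetch("/api/lab/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      system: SYSTEM_PROMPT_RFP_V1,
      user: { requirements_block: requirementsBlock, proposals_block: proposalsBlock },
      temperature: 0.2,
      max_tokens: 3000,
    }),
  });
  if (!res.ok) throw new Error(`chat call failed: ${res.status}`);
  return (await res.json()) as RfpResult; // single response, parsed as JSON
}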
One LLM call. The model parses requirements (with optional [N] weights), parses each vendor proposal as a distinct block, and scores every (requirement, vendor) pair on a 1-5 integer scale with a 1-2 sentence rationale and a short evidence pull. Weighted totals are computed by the model and double-checked client-side. The recommendation includes an explicit confidence level, the most important discipline for a procurement output.
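The client-side double-check is a straight recomputation of Σ(score × weight) per vendor. A sketch reusing the hypothetical RfpResult shape from the earlier snippet; matching cells to requirements by text, the default weight of 1 for unweighted requirements, and the mismatch tolerance are all assumptions.

// Recompute Σ(score × weight) per vendor and flag disagreements with
// the model-reported totals instead of silently trusting its arithmetic.
function revalidateTotals(result: RfpResult, tolerance = 1e-6): string[] {
  const mismatches: string[] = [];
  for (const vendor of result.vendors) {
    const recomputed = result.scores
      .filter((cell) => cell.vendor === vendor)
      .reduce((sum, cell) => {
        const req = result.requirements.find((r) => r.text === cell.requirement);
        return sum + cell.score * (req?.weight ?? 1); // assumed default weight of 1
      }, 0);
    const reported = result.totals.find((t) => t.vendor === vendor)?.weighted_total;
    if (reported === undefined || Math.abs(reported - recomputed) > tolerance) {
      mismatches.push(vendor); // surface in the UI for review
    }
  }
  return mismatches;
}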
Procurement RFPs are slow not because the questions are hard but because the volume of vendor prose is high. A 50-page Salesforce response, a 38-page HubSpot response, a 60-page Microsoft response — all answering the same 30 requirements in different formats. A procurement analyst spends days normalizing those into a comparison matrix, two days writing rationales, half a day computing weighted totals, and another day drafting the recommendation memo.
This demo collapses the normalization. The matrix gets built. The rationales get drafted. The totals get computed. The recommendation gets scaffolded with confidence and caveats. The procurement analyst still owns the final call — but starts from a complete first draft rather than an empty spreadsheet.
The same shape works for corp dev teams evaluating acquisition targets against a thesis, IT teams comparing vendor proposals against an architecture standard, and consulting teams running structured comparisons for clients.
Honest caveat: the model scores from what's in each vendor's proposal, and proposals are sales documents. A 5-out-of-5 on a requirement means "the vendor said yes confidently," not "the vendor proved it works." Treat scores as the top of the funnel; reference checks, demos, and proof-of-concepts still own the bottom. The confidence field on the recommendation is the model's honest assessment of how cleanly the evidence pointed at one winner.
Run the demo to see telemetry.