How PRISM works
Technology is a commodity. Judgment is the product. PRISM encodes 10+ years of category operating expertise into a system that AI executes at scale — every Monday morning, a weekly report concrete enough to act on before lunch.
Each model is assigned to the task it's actually best at — bulk extraction goes to Flash-Lite, forecasting to TimesFM, deep analysis to Claude. Multi-model orchestration is the differentiation.
Time-series forecasting
200M-param decoder that produces 26-week forecasts with p10/p50/p90 quantile bands. Loads in 4s on a free GitHub Actions CPU runner; forecasts 22 keywords in 0.4s.
Bulk attribute extraction
Tags every Amazon listing against the canonical taxonomy. 500 RPD free tier; ~1k input tokens per listing. Validated client-side against the operator's curated dictionary.
Weekly executive synthesis
Reads forecasts + opportunities + Reddit complaint patterns, writes the markdown weekly report in operator's voice (Prototype/Iterate/Monitor/Skip/Kill). Direct-routed to Azure since Gemini Pro free tier ended 2026-04 (W7 decision); ~13s latency per category, deterministic quality.
Operator-driven deep analysis (Track B)
Operator clicks 'Open in Claude' on any /[category] or /migration page. The 129KB export pack drops into claude.ai with 3 ready-to-paste prompts — improvement focus, risk audit, supplier brief draft. For the 20% of decisions worth deep dialogue.
Bulk-extraction failover
Last line of defense for bulk extraction (listings + Reddit aspects). After dual Gemini keys both hit daily quota, requests transparently re-route to Azure GPT-4o-mini. ~$0.04/wk worst case, well under the non-profit grant.
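The failover chain above can be sketched as an ordered list of providers tried in sequence. The provider names are real; the client functions below are hypothetical stubs standing in for the actual SDK calls.

```python
class QuotaExhausted(Exception):
    """Raised when a provider's daily request quota is used up."""

def extract_with_failover(listing, providers):
    """Try each provider in order; return (provider_name, result).

    `providers` is an ordered list of (name, call) pairs, e.g. the two
    Gemini keys followed by the Azure GPT-4o-mini deployment.
    """
    last_err = None
    for name, call in providers:
        try:
            return name, call(listing)
        except QuotaExhausted as err:
            last_err = err  # quota hit: fall through to the next provider
    raise RuntimeError("all extraction providers exhausted") from last_err

# toy demonstration: both Gemini keys over quota, Azure succeeds
def over_quota(_listing):
    raise QuotaExhausted()

def azure_ok(listing):
    return f"tagged:{listing}"

providers = [("gemini-key-1", over_quota),
             ("gemini-key-2", over_quota),
             ("azure-gpt4o-mini", azure_ok)]
name, result = extract_with_failover("SKU-123", providers)
```

The re-route is transparent to callers: they see a tagged listing either way, only the provider name in the logs changes.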
Triggered Mondays at 02:00 UTC. Each stage runs in a try/except — a transient failure (Gemini 503, pytrends rate-limit) doesn't bring down the whole run; later stages still work off cached parquets.
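The per-stage isolation can be sketched as below; the stage names and the in-memory cache (standing in for the parquet files on disk) are illustrative, not the actual module API.

```python
import logging

def run_pipeline(stages, cache):
    """Run each (name, fn) stage; a failure logs and moves on.

    Each stage writes its output into `cache`; downstream stages read
    whatever is there, so a transient failure upstream degrades the run
    instead of killing it.
    """
    for name, fn in stages:
        try:
            cache[name] = fn(cache)
        except Exception:
            logging.exception("stage %s failed; downstream uses cached data", name)

# toy run: forecasting raises (a simulated transient 503), scoring still
# works off last week's cached artifact
cache = {"forecast": "last_week.parquet"}
stages = [
    ("ingest",   lambda c: "trends+reddit+amazon"),
    ("forecast", lambda c: 1 / 0),                      # simulated failure
    ("score",    lambda c: f"scored from {c['forecast']}"),
]
run_pipeline(stages, cache)
```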
pytrends · Reddit (.json) · Amazon (Apify) → Gemini Flash-Lite tags listings → 26-week × p10/p50/p90 forecasts → 4-factor scoring + confidence discount → Azure GPT-4o writes operator's voice → Supabase · dashboard · export pack
The IP. Five subsystems sit between AI models and decisions: canonical attribute taxonomies, signal weighting rules, cross-source validation, opportunity scoring, and a decision feedback loop. The scoring formula is committed in code (and unit tested with 70+ tests, not buried in an LLM prompt):
score = 0.40·trend + 0.20·(1 − saturation) + 0.20·quality_gap + 0.20·strategic_fit
Then multiplied by (0.5 + 0.5·confidence). An attribute we know nothing about gets halved; an attribute with strong multi-source signal keeps full weight. Soft enough to preserve ranking among sparse-data peers, hard enough that real signal outranks neutral defaults.
Worked example: material = titanium
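The committed formula, restated as code and evaluated on illustrative titanium inputs — the factor values below are stand-ins, not the live snapshot numbers.

```python
def opportunity_score(trend, saturation, quality_gap, strategic_fit, confidence):
    """Weighted 4-factor score, then the confidence discount."""
    base = (0.40 * trend
            + 0.20 * (1 - saturation)
            + 0.20 * quality_gap
            + 0.20 * strategic_fit)
    # confidence discount: zero confidence halves the score, full keeps it
    return base * (0.5 + 0.5 * confidence)

# material = titanium, hypothetical factor values
score = opportunity_score(trend=0.8, saturation=0.3,
                          quality_gap=0.6, strategic_fit=0.7,
                          confidence=0.9)
# base = 0.32 + 0.14 + 0.12 + 0.14 = 0.72; discounted by 0.95 -> 0.684

# the same attribute with no supporting signal keeps half the base score
floor = opportunity_score(0.8, 0.3, 0.6, 0.7, confidence=0.0)  # 0.36
```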
The system's value compounds because of three feedback loops: every operator decision retunes the scoring weights; every approved novel attribute grows the dictionary's precision; every cross-category signal is a candidate for the next category's keyword map. None produce dramatic data this quarter — but the rails are in.
surface: /decisions
Operator logs every action against an opportunity. Backfills the actual outcome + 1-5 success rating once it lands. The calibration job (scoring/calibration.py) computes per-factor Spearman correlation between scores and ratings — surfacing which scoring factors actually predict success.
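The calibration check can be sketched as below. Spearman is hand-rolled here to stay dependency-free (the real job could equally call `scipy.stats.spearmanr`); the factor values and ratings are illustrative.

```python
def spearman(xs, ys):
    """Spearman rho for tie-free samples: Pearson correlation of ranks."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

# per-factor values for 5 closed decisions, plus their 1-5 success ratings
factors = {
    "trend":       [0.9, 0.7, 0.5, 0.3, 0.1],
    "quality_gap": [0.2, 0.9, 0.4, 0.8, 0.6],
}
ratings = [5, 4, 3, 2, 1]
calibration = {name: spearman(vals, ratings) for name, vals in factors.items()}
# here "trend" correlates perfectly with outcomes; "quality_gap" is noise
```

A factor whose rho stays near zero across enough closed decisions is a candidate for re-weighting in the committed formula.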
surface: /submit (review queue)
Gemini extraction emits novel candidate (dim, value) pairs each week. Operator approves → next deploy adds them to the canonical yaml. Operator rejects → marked ignored so they don't resurface. Dictionary precision compounds weekly.
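The approve/reject merge can be sketched as below; the in-memory dict stands in for the canonical yaml, and the field layout is an assumption, not the repo's actual schema.

```python
canonical = {"material": {"cotton", "polyester"}, "feature": {"upf_50"}}
ignored = {("material", "poly-blend")}  # previously rejected candidates

def review(candidates, decisions):
    """Apply operator decisions to this week's (dim, value) candidates.

    `decisions` maps a candidate pair to 'approve' or 'reject'. Approved
    pairs join the canonical taxonomy; rejected pairs are remembered so
    they never resurface in the queue.
    """
    for pair in candidates:
        dim, value = pair
        if pair in ignored or value in canonical.get(dim, set()):
            continue  # already settled: skip silently
        if decisions.get(pair) == "approve":
            canonical.setdefault(dim, set()).add(value)
        elif decisions.get(pair) == "reject":
            ignored.add(pair)

review(
    candidates=[("material", "titanium"), ("material", "poly-blend"),
                ("feature", "magnetic_clasp")],
    decisions={("material", "titanium"): "approve",
               ("feature", "magnetic_clasp"): "reject"},
)
```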
surface: /migration
Patterns observed in one category register as candidate signals in others. The matrix surfaces attributes hot in one category and missing from another — operator decides whether to add the keyword. Pickleball is the canonical example: #2 in sun_hat, #5 in water_bottle, absent from tank_top.
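The matrix lookup reduces to: attributes ranked in one category's top N but absent from another. A minimal sketch using the pickleball example from the text (the function name and data shape are illustrative):

```python
def migration_candidates(rankings, all_categories, top_n=5):
    """Return (attribute, source_cat, rank, target_cat) tuples for
    attributes hot (top N) in one category and absent from another."""
    out = []
    for attr, per_cat in rankings.items():
        hot_in = {cat: r for cat, r in per_cat.items() if r <= top_n}
        for target in sorted(all_categories - set(per_cat)):
            for source, rank in hot_in.items():
                out.append((attr, source, rank, target))
    return out

all_categories = {"sun_hat", "water_bottle", "tank_top"}
rankings = {  # attribute -> {category: rank}
    "pickleball": {"sun_hat": 2, "water_bottle": 5},
}
candidates = migration_candidates(rankings, all_categories)
# pickleball surfaces twice as a tank_top candidate (via both source categories)
```

Each tuple is only a candidate signal; the operator still decides whether to add the keyword to the target category's map.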
Recurring monthly target: $0. Hard rule — no paid services without operator approval. Verified live; numbers below are not aspirational.
| Layer | Service | Tier | Monthly |
|---|---|---|---|
| Frontend | Vercel | Free tier (Hobby) | $0 |
| Database | Supabase | Free tier | $0 |
| Scheduler | GitHub Actions | 2,000 min/mo free | $0 |
| Time-series | TimesFM 1.0 200M | OSS · runs on Actions CPU | $0 |
| LLM (extraction + synthesis) | Gemini 3.1 Flash Lite Preview | 500 RPD free | $0 |
| LLM (failover) | Azure OpenAI GPT-4o | Non-profit grant | $0 |
| Demo endpoint | Hugging Face Spaces | Free CPU Space | $0 |
| Total recurring infrastructure | | | $0/month |
Three pages live behind the operator login — that's where the flywheels actually turn. For walk-throughs without a passphrase, each link below opens the page in demo mode (frozen 2026-W19 snapshot, read-only, no Supabase calls).
flywheel 1 surface
9 example decisions across 3 categories. Closed entries show backfilled outcomes + 1-5 ratings.
gap analysis surface
5 SKUs scored against this week's opportunities. Improvement candidates surface SKUs whose attributes score < 0.3.
flywheel 2 surface
~60 grouped (dim, value) candidates from this week's extractions with operator approve/reject state.
PRISM ships with a project constitution — every decision, attribute taxonomy, and architectural constraint encoded in a single Markdown file that AI sessions read before any work. It's also a demo asset.
Last verified (2026-W19): TimesFM forecast in 0.4s · Gemini synthesis in 18-21s · 72 unit tests passing (49 opportunity + 4 migration + 13 calibration + 6 export packager) · 62 forecasts + 150 opportunities + 147 attribute matrix rows + 144 cross-cat migration rows + 3 weekly reports in Supabase. Decision Log + My Products + Submit (operator-input loops) live behind the Operator login. Track B export pack at /api/export/2026-W19 (operator-only, signed-URL).