How PRISM works
Technology is a commodity. Judgment is the product. PRISM encodes 10+ years of category operating expertise into a system that AI executes at scale — every Monday morning, a weekly report concrete enough to act on before lunch.
Each model is assigned to the task it's actually best at — bulk extraction goes to Flash-Lite, forecasting to TimesFM, deep analysis to Claude. Multi-model orchestration is the differentiation.
Time-series forecasting
200M-param decoder that produces 26-week forecasts with p10/p50/p90 quantile bands. Loads in 4s on a free GitHub Actions CPU runner; forecasts 22 keywords in 0.4s.
Bulk attribute extraction
Tags every Amazon listing against the canonical taxonomy. 500 RPD free tier; ~1k input tokens per listing. Validated client-side against the operator's curated dictionary.
Weekly executive synthesis
Reads forecasts + opportunities + Reddit complaint patterns, writes the markdown weekly report in operator's voice (Prototype/Iterate/Monitor/Skip/Kill). Direct-routed to Azure since Gemini Pro free tier ended 2026-04 (W7 decision); ~13s latency per category, deterministic quality.
Operator-driven deep analysis (Track B)
Operator clicks 'Open in Claude' on any /[category] or /migration page. The 129KB export pack drops into claude.ai with 3 ready-to-paste prompts — improvement focus, risk audit, supplier brief draft. For the 20% of decisions worth deep dialogue.
Bulk-extraction failover
Last line of defense for bulk extraction (listings + Reddit aspects). After dual Gemini keys both hit daily quota, requests transparently re-route to Azure GPT-4o-mini. ~$0.04/wk worst case, well under the non-profit grant.
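The failover chain above can be sketched as an ordered list of providers tried in sequence. The provider names are real; the client functions below are hypothetical stubs standing in for the actual SDK calls.

```python
class QuotaExhausted(Exception):
    """Raised when a provider's daily request quota is used up."""

def extract_with_failover(listing, providers):
    """Try each provider in order; return (provider_name, result).

    `providers` is an ordered list of (name, call) pairs, e.g. the two
    Gemini keys followed by the Azure GPT-4o-mini deployment.
    """
    last_err = None
    for name, call in providers:
        try:
            return name, call(listing)
        except QuotaExhausted as err:
            last_err = err  # quota hit: fall through to the next provider
    raise RuntimeError("all extraction providers exhausted") from last_err

# toy demonstration: both Gemini keys over quota, Azure succeeds
def over_quota(_listing):
    raise QuotaExhausted()

def azure_ok(listing):
    return f"tagged:{listing}"

providers = [("gemini-key-1", over_quota),
             ("gemini-key-2", over_quota),
             ("azure-gpt4o-mini", azure_ok)]
name, result = extract_with_failover("SKU-123", providers)
```

The re-route is transparent to callers: they see a tagged listing either way, only the provider name in the logs changes.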
Triggered Mondays at 02:00 UTC. Each stage runs in a try/except — a transient failure (Gemini 503, pytrends rate-limit) doesn't bring down the whole run; later stages still work off cached parquets.
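The per-stage isolation can be sketched as below; the stage names and the in-memory cache (standing in for the parquet files on disk) are illustrative, not the actual module API.

```python
import logging

def run_pipeline(stages, cache):
    """Run each (name, fn) stage; a failure logs and moves on.

    Each stage writes its output into `cache`; downstream stages read
    whatever is there, so a transient failure upstream degrades the run
    instead of killing it.
    """
    for name, fn in stages:
        try:
            cache[name] = fn(cache)
        except Exception:
            logging.exception("stage %s failed; downstream uses cached data", name)

# toy run: forecasting raises (a simulated transient 503), scoring still
# works off last week's cached artifact
cache = {"forecast": "last_week.parquet"}
stages = [
    ("ingest",   lambda c: "trends+reddit+amazon"),
    ("forecast", lambda c: 1 / 0),                      # simulated failure
    ("score",    lambda c: f"scored from {c['forecast']}"),
]
run_pipeline(stages, cache)
```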
pytrends · Reddit (.json) · Amazon (Apify) → Gemini Flash-Lite tags listings → 26-week × p10/p50/p90 forecasts → 4-factor scoring + confidence discount → Azure GPT-4o writes operator's voice → Supabase · dashboard · export pack
The IP. Five subsystems sit between AI models and decisions: canonical attribute taxonomies, signal weighting rules, cross-source validation, opportunity scoring, and a decision feedback loop. The scoring formula is committed in code (and unit tested with 70+ tests, not buried in an LLM prompt):
score = 0.40·trend + 0.20·(1 − saturation) + 0.20·quality_gap + 0.20·strategic_fit
Then multiplied by (0.5 + 0.5·confidence). An attribute we know nothing about gets halved; an attribute with strong multi-source signal keeps full weight. Soft enough to preserve ranking among sparse-data peers, hard enough that real signal outranks neutral defaults.
Worked example: material = titanium
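The committed formula, restated as code and evaluated on illustrative titanium inputs — the factor values below are stand-ins, not the live snapshot numbers.

```python
def opportunity_score(trend, saturation, quality_gap, strategic_fit, confidence):
    """Weighted 4-factor score, then the confidence discount."""
    base = (0.40 * trend
            + 0.20 * (1 - saturation)
            + 0.20 * quality_gap
            + 0.20 * strategic_fit)
    # confidence discount: zero confidence halves the score, full keeps it
    return base * (0.5 + 0.5 * confidence)

# material = titanium, hypothetical factor values
score = opportunity_score(trend=0.8, saturation=0.3,
                          quality_gap=0.6, strategic_fit=0.7,
                          confidence=0.9)
# base = 0.32 + 0.14 + 0.12 + 0.14 = 0.72; discounted by 0.95 -> 0.684

# the same attribute with no supporting signal keeps half the base score
floor = opportunity_score(0.8, 0.3, 0.6, 0.7, confidence=0.0)  # 0.36
```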
The system's value compounds because of three feedback loops: every operator decision retunes the scoring weights; every approved novel attribute grows the dictionary's precision; every cross-category signal is a candidate for the next category's keyword map. None produce dramatic data this quarter — but the rails are in.
surface: /decisions
Operator logs every action against an opportunity. Backfills the actual outcome + 1-5 success rating once it lands. The calibration job (scoring/calibration.py) computes per-factor Spearman correlation between scores and ratings — surfacing which scoring factors actually predict success.
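The calibration check can be sketched as below. Spearman is hand-rolled here to stay dependency-free (the real job could equally call `scipy.stats.spearmanr`); the factor values and ratings are illustrative.

```python
def spearman(xs, ys):
    """Spearman rho for tie-free samples: Pearson correlation of ranks."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

# per-factor values for 5 closed decisions, plus their 1-5 success ratings
factors = {
    "trend":       [0.9, 0.7, 0.5, 0.3, 0.1],
    "quality_gap": [0.2, 0.9, 0.4, 0.8, 0.6],
}
ratings = [5, 4, 3, 2, 1]
calibration = {name: spearman(vals, ratings) for name, vals in factors.items()}
# here "trend" correlates perfectly with outcomes; "quality_gap" is noise
```

A factor whose rho stays near zero across enough closed decisions is a candidate for re-weighting in the committed formula.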
surface: /submit (review queue)
Gemini extraction emits novel candidate (dim, value) pairs each week. Operator approves → next deploy adds them to the canonical yaml. Operator rejects → marked ignored so they don't resurface. Dictionary precision compounds weekly.
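The approve/reject merge can be sketched as below; the in-memory dict stands in for the canonical yaml, and the field layout is an assumption, not the repo's actual schema.

```python
canonical = {"material": {"cotton", "polyester"}, "feature": {"upf_50"}}
ignored = {("material", "poly-blend")}  # previously rejected candidates

def review(candidates, decisions):
    """Apply operator decisions to this week's (dim, value) candidates.

    `decisions` maps a candidate pair to 'approve' or 'reject'. Approved
    pairs join the canonical taxonomy; rejected pairs are remembered so
    they never resurface in the queue.
    """
    for pair in candidates:
        dim, value = pair
        if pair in ignored or value in canonical.get(dim, set()):
            continue  # already settled: skip silently
        if decisions.get(pair) == "approve":
            canonical.setdefault(dim, set()).add(value)
        elif decisions.get(pair) == "reject":
            ignored.add(pair)

review(
    candidates=[("material", "titanium"), ("material", "poly-blend"),
                ("feature", "magnetic_clasp")],
    decisions={("material", "titanium"): "approve",
               ("feature", "magnetic_clasp"): "reject"},
)
```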
surface: /migration
Patterns observed in one category register as candidate signals in others. The matrix surfaces attributes hot in one category and missing from another — operator decides whether to add the keyword. Pickleball is the canonical example: #2 in sun_hat, #5 in water_bottle, absent from tank_top.
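The matrix lookup reduces to: attributes ranked in one category's top N but absent from another. A minimal sketch using the pickleball example from the text (the function name and data shape are illustrative):

```python
def migration_candidates(rankings, all_categories, top_n=5):
    """Return (attribute, source_cat, rank, target_cat) tuples for
    attributes hot (top N) in one category and absent from another."""
    out = []
    for attr, per_cat in rankings.items():
        hot_in = {cat: r for cat, r in per_cat.items() if r <= top_n}
        for target in sorted(all_categories - set(per_cat)):
            for source, rank in hot_in.items():
                out.append((attr, source, rank, target))
    return out

all_categories = {"sun_hat", "water_bottle", "tank_top"}
rankings = {  # attribute -> {category: rank}
    "pickleball": {"sun_hat": 2, "water_bottle": 5},
}
candidates = migration_candidates(rankings, all_categories)
# pickleball surfaces twice as a tank_top candidate (via both source categories)
```

Each tuple is only a candidate signal; the operator still decides whether to add the keyword to the target category's map.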
Recurring monthly target: $0. Hard rule — no paid services without operator approval. Verified live; numbers below are not aspirational.
| Layer | Service | Tier | Monthly |
|---|---|---|---|
| Frontend | Vercel | Free tier (Hobby) | $0 |
| Database | Supabase | Free tier | $0 |
| Scheduler | GitHub Actions | 2,000 min/mo free | $0 |
| Time-series | TimesFM 1.0 200M | OSS · runs on Actions CPU | $0 |
| LLM (extraction + synthesis) | Gemini 3.1 Flash Lite Preview | 500 RPD free | $0 |
| LLM (failover) | Azure OpenAI GPT-4o | Non-profit grant | $0 |
| Demo endpoint | Hugging Face Spaces | Free CPU Space | $0 |
| Total recurring infrastructure | | | $0/month |
Three pages live behind the operator login — that's where the flywheels actually turn. For walk-throughs without a passphrase, each link below opens the page in demo mode (frozen 2026-W19 snapshot, read-only, no Supabase calls).
flywheel 1 surface
9 example decisions across 3 categories. Closed entries show backfilled outcomes + 1-5 ratings.
gap analysis surface
5 SKUs scored against this week's opportunities. Improvement candidates surface SKUs whose attributes score < 0.3.
flywheel 2 surface
~60 grouped (dim, value) candidates from this week's extractions with operator approve/reject state.
PRISM ships with a project constitution — every decision, attribute taxonomy, and architectural constraint encoded in a single Markdown file that AI sessions read before any work. It's also a demo asset.
Last verified (2026-W19): TimesFM forecast in 0.4s · Gemini synthesis in 18-21s · 72 unit tests passing (49 opportunity + 4 migration + 13 calibration + 6 export packager) · 62 forecasts + 150 opportunities + 147 attribute matrix rows + 144 cross-cat migration rows + 3 weekly reports in Supabase. Decision Log + My Products + Submit (operator-input loops) live behind the Operator login. Track B export pack at /api/export/2026-W19 (operator-only, signed-URL).