VTKL Evals Dashboard

⚡ Discovery-First Methodology

How we collapsed three sequential phases (elicitation → documentation → validation) into one parallel process that produces validated specs, working prototypes, and production schemas simultaneously.

Scrumerfall

1,440h

4-12 weeks, 30-50% rework

Traditional Scrum

48h

Per sprint, no client deliverable

Discovery-First

14h

Spec + Prototype + Schema

Efficiency Ratio

103:1

vs Scrumerfall worst case
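
The 103:1 figure follows from dividing the Scrumerfall worst case by the Discovery-First total. A quick arithmetic check in Python, using only the hour counts shown in the cards above (the function and constant names are illustrative, not part of the dashboard):

```python
# Hour figures copied from the dashboard cards above.
SCRUMERFALL_HOURS = 1440      # Scrumerfall worst case
DISCOVERY_FIRST_HOURS = 14    # spec + prototype + schema


def efficiency_ratio(baseline_hours: float, discovery_hours: float) -> float:
    """Return how many baseline hours correspond to one Discovery-First hour."""
    return baseline_hours / discovery_hours


if __name__ == "__main__":
    ratio = efficiency_ratio(SCRUMERFALL_HOURS, DISCOVERY_FIRST_HOURS)
    # 1440 / 14 = 102.857..., which the dashboard rounds to 103:1
    print(f"vs Scrumerfall worst case: {ratio:.1f}:1")
```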

Phase Collapse — The Key Innovation

Traditional (Sequential)

1. Elicitation: 2-4 weeks

2. Documentation: 4-8 weeks

3. Validation: 2-4 weeks

Total: 8-16 weeks

Discovery-First (Parallel)

All three phases: 4-6 hours

Spec: ✓ simultaneous

Prototype: ✓ simultaneous

Schema: ✓ simultaneous

Evidence: Mari's Garden Case Study

📖 Product Specification

Executive summary, 4 customer journeys, and 7 business-rule (BR) sets with testable acceptance criteria (ACs). Client-presentable spec site.

maris-gardens-spec.pages.dev

🖥️ Interactive Prototype

20+ screens: admin dashboard, product CRUD, ordering portal, customer portal. Real images, working navigation.

maris-gardens-demo.pages.dev

🗄️ Production Schema

19 tables, 14 enums, 20+ RLS policies. Atomic stored procedures. Every deferred decision tagged.

PostgreSQL + Supabase RLS

Eval Integration

The Discovery-First methodology is integrated into the evals system through a dedicated rubric (6 PASS criteria, 6 FAIL triggers, 4 process discipline checks) and 9 calibration corpus entries.

Rubric: discovery-methodology.yaml

Corpus entries: 9 (TJ-087 → TJ-095)

Judge model: GLM-5.1
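
As a rough illustration of how a rubric with PASS criteria, FAIL triggers, and process discipline checks might be represented and applied, here is a minimal Python sketch. The class name, field names, placeholder criterion IDs, and the verdict rule (every PASS criterion met and no FAIL trigger fired) are all assumptions made for illustration; the actual contents and schema of discovery-methodology.yaml are not shown on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rubric:
    """Hypothetical in-memory shape for a rubric; not the real YAML schema."""
    name: str
    pass_criteria: tuple[str, ...]
    fail_triggers: tuple[str, ...]
    process_checks: tuple[str, ...]  # stored only; not used by verdict() below


def verdict(rubric: Rubric, met: set[str], fired: set[str]) -> str:
    """Assumed rule: PASS only if all criteria are met and no trigger fired."""
    if fired & set(rubric.fail_triggers):
        return "FAIL"
    if set(rubric.pass_criteria) <= met:
        return "PASS"
    return "FAIL"


# Placeholder IDs only; the counts (6 / 6 / 4) match the figures above.
discovery = Rubric(
    name="discovery-methodology",
    pass_criteria=tuple(f"pass-{i}" for i in range(1, 7)),
    fail_triggers=tuple(f"fail-{i}" for i in range(1, 7)),
    process_checks=tuple(f"process-{i}" for i in range(1, 5)),
)

if __name__ == "__main__":
    print(verdict(discovery, met=set(discovery.pass_criteria), fired=set()))
```

How the process discipline checks feed into a verdict is not stated here, so the sketch stores them without scoring them.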

🎧 Full Narration

Complete methodology walkthrough with audio narration and detailed visual comparison.

View Full Presentation →

Pipeline Status

Phase 1 — Intelligence Intake

Operational

Drive intake: —

Slack monitoring: Active

Stakeholder files: —

Phase 2 — Shadow Review

Operational

Total runs: —

Judge model: GLM-5.1

Rubrics: 6

Phase 3 — Correlation Engine

Operational

Correlation runs: —

Decisions tracked: —

Intel items: —

Cron Schedule

Job	Schedule	Description	Status
shadow-review	`0 3 * * *`	Nightly shadow review of agent outputs	Active
memory-consolidation	`0 4 * * *`	Consolidate daily memory into long-term storage	Active
drive-intake	`*/30 * * * *`	Sync Google Drive shared files for analysis	Active
tony-task-capture	`0 8,12,17 * * 1-5`	Capture and triage Tony's DM task backlog	Active
bd-daily	`0 9 * * 1-5`	Generate and post BD daily briefing	Active
correlation-engine	`0 5 * * 0`	Weekly cross-domain correlation analysis	Active
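
To sanity-check when a schedule in the table fires, the following standard-library Python sketch tests a timestamp against a 5-field cron expression. It is a simplified matcher written for illustration, not the scheduler used by the pipeline: it assumes day-of-week 0 means Sunday, does not accept 7 for Sunday or month/day names, and combines day-of-month and day-of-week with AND (real cron ORs them when both are restricted, which does not matter for expressions whose day-of-month is `*`). The time zone the jobs run in is not stated on this page.

```python
from datetime import datetime


def _field_matches(field: str, value: int, lo: int, hi: int) -> bool:
    """Check one cron field; supports *, */n, a-b, a-b/n, and comma lists."""
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step_n = int(step) if step else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)
        else:
            start = end = int(rng)
        if start <= value <= end and (value - start) % step_n == 0:
            return True
    return False


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` is a minute selected by a 5-field cron expression."""
    minute, hour, dom, month, dow = expr.split()
    cron_dow = (when.weekday() + 1) % 7  # Python: Monday=0; cron: Sunday=0
    return (
        _field_matches(minute, when.minute, 0, 59)
        and _field_matches(hour, when.hour, 0, 23)
        and _field_matches(dom, when.day, 1, 31)
        and _field_matches(month, when.month, 1, 12)
        and _field_matches(dow, cron_dow, 0, 6)
    )


if __name__ == "__main__":
    monday_noon = datetime(2024, 1, 1, 12, 0)  # 2024-01-01 was a Monday
    print(cron_matches("0 8,12,17 * * 1-5", monday_noon))  # True
    print(cron_matches("0 5 * * 0", monday_noon))          # False
```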

Memory Layer

Stakeholder Profiles

—

Individual intelligence files

Rubrics Registered

behavioral, discovery, effort, process, product, sales

MLflow Experiment

warren-evals

Experiment ID: 1 • MLflow 3.12.0
