Service Catalog
Last updated: 2026-06-03
Quick reference for every deployable service in the Grizzlebear platform.
API Gateway (app.py)
Entry point for all services. Composes sub-apps into a single Modal deployment.
Includes: user_data_app (users + data), low_priority_app (voices + geocoding + capture + static_site), livekit_ts.app, ml.app, model_proxy.app, tsweb.app
Modal Function Topology
Services are grouped into consolidated Modal functions to reduce cold-start surface:
| Function | Services | Notes |
|---|---|---|
| user_data_app | users, data | In-process dispatch (no HTTP self-calls) |
| low_priority_app | voices, geocoding, capture, static_site | Hosts deploy dashboard |
| livekit_ts | livekit server + agent | Separate due to heavy Go/ML deps |
| ml | ML gateway, training, serving, eval | GPU-attached |
| model_proxy | model proxy | Static egress IP |
| tsweb | tsweb | Quarantine boundary |
Users Service (users/)
| Endpoint | Method | Description |
|---|---|---|
| /v1/register | POST | Register new user |
| /v1/login | POST | Authenticate user |
| /v1/profile | GET | Get user profile |
| /v1/resolve-location | GET | Resolve location with geocoding context |
Auth: Supabase magic links + OTP. This replaces the original badauth key/secret system.
Traction Admin (users/traction.py)
| Endpoint | Method | Description |
|---|---|---|
| /admin/traction/stats | GET | User totals (registered/anonymous, homeowner/pro) |
| /admin/traction/daily-signups | GET | Per-day signup counts for calendar heatmap |
| /admin/traction/daily-projects | GET | Per-day project creation counts |
| /admin/traction/projects | GET | Paginated project list (20/page) |
| /admin/traction/projects/{id} | GET | Project detail (prompt, refinement Q&A, plan tree) |
Auth: TradesparkEmailAdmin dependency — email allowlist via TS_ADMIN_EMAILS env var (from TradesparkAdmins Modal Secret). Decoupled from Supabase profiles.role.
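A minimal sketch of the allowlist check described above. It assumes TS_ADMIN_EMAILS holds a comma-separated list and that matching is case-insensitive; neither detail is confirmed here, and the function names are hypothetical rather than the actual TradesparkEmailAdmin dependency.

```python
import os


def load_admin_emails(env_var: str = "TS_ADMIN_EMAILS") -> frozenset[str]:
    """Parse the allowlist from the environment.

    Assumption: the value is a comma-separated list of addresses.
    """
    raw = os.environ.get(env_var, "")
    return frozenset(part.strip().lower() for part in raw.split(",") if part.strip())


def is_admin(email: str, allowlist: frozenset[str]) -> bool:
    """Return True only when the normalized email is on the allowlist."""
    return email.strip().lower() in allowlist
```

With this shape an unset or empty variable yields an empty allowlist, so every request is denied by default.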
~~Billing Service~~ (archived)
Deprecated May 21; the Stripe integration is moving elsewhere. Code archived at dev/_archived/billing/.
Data Service (data/)
| Endpoint | Method | Description |
|---|---|---|
| /v1/assets | GET | List assets (supports projectId=all with deduplication) |
| /v1/asset | GET/POST/DELETE | CRUD for individual assets |
| /v1/asset/chunked/init | POST | Initialize chunked upload |
| /v1/asset/chunked/upload | POST | Upload chunk (set final=true to complete) |
| /v1/sync-project | POST | Pull Supabase project data into location markdown |
Storage: SQLite per-location (S3-backed, writes protected by ETag CAS + per-key locking via with_location_db), S3 blobs for files. Paths follow account/geo_prefix/location/ convention.
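The write protection described above (ETag compare-and-swap combined with a per-key lock) can be sketched roughly as follows. This is an illustration under stated assumptions, not the real with_location_db: the in-memory FakeObjectStore stands in for S3, the ETag is a SHA-256 digest chosen for the example, and every name in the block is hypothetical.

```python
import hashlib
import threading
from collections import defaultdict
from typing import Callable


class PreconditionFailed(Exception):
    """Raised when the stored object changed since it was read."""


def _etag(data: bytes) -> str:
    # Assumption: any content hash works as an ETag for this sketch.
    return hashlib.sha256(data).hexdigest()


class FakeObjectStore:
    """In-memory stand-in for S3 with If-Match style conditional writes."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}
        self._guard = threading.Lock()

    def get(self, key: str) -> tuple[bytes, str]:
        with self._guard:
            data = self._objects.get(key, b"")
        return data, _etag(data)

    def put_if_match(self, key: str, data: bytes, expected_etag: str) -> str:
        with self._guard:
            if _etag(self._objects.get(key, b"")) != expected_etag:
                raise PreconditionFailed(key)
            self._objects[key] = data
        return _etag(data)


_key_locks: dict[str, threading.Lock] = defaultdict(threading.Lock)
_key_locks_guard = threading.Lock()


def update_object(store: FakeObjectStore, key: str,
                  mutate: Callable[[bytes], bytes], retries: int = 3) -> str:
    """Read-modify-write one key: local lock first, then CAS against the store."""
    with _key_locks_guard:
        lock = _key_locks[key]
    with lock:  # serializes writers inside this process
        for _ in range(retries):
            data, etag = store.get(key)
            try:
                return store.put_if_match(key, mutate(data), etag)
            except PreconditionFailed:
                continue  # another process won the race; re-read and retry
    raise PreconditionFailed(f"gave up on {key} after {retries} attempts")
```

The lock covers writers in the same process, while the ETag check covers writers in other containers that the lock cannot see.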
Voices Service (voices/)
| Endpoint | Method | Description |
|---|---|---|
| /v1/tts | POST | Text-to-speech (streaming audio) |
Providers: ElevenLabs, OpenAI. Voice selection via ranked voices.json config.
Geocoding Service (geocoding/)
| Endpoint | Method | Description |
|---|---|---|
| /v1/reverse-geocode | GET | GPS coordinates to address |
| /v1/map-image | GET | Static map image |
Providers: Mapbox (primary), Google Maps (fallback).
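The primary/fallback arrangement above can be sketched as an ordered list of provider callables. The callables here are placeholders, not real Mapbox or Google Maps clients, and the function name and return shape are assumptions made for illustration.

```python
from typing import Callable, Sequence

# A provider takes (lat, lon) and returns a formatted address.
Geocoder = Callable[[float, float], str]


def reverse_geocode(lat: float, lon: float,
                    providers: Sequence[tuple[str, Geocoder]]) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, address) from the first success."""
    errors: list[str] = []
    for name, lookup in providers:
        try:
            return name, lookup(lat, lon)
        except Exception as exc:  # any provider failure falls through to the next one
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all geocoding providers failed: " + "; ".join(errors))
```

Returning the provider name alongside the address makes it easy to log how often the fallback is actually used.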
Capture Service (capture/)
AR session metadata collection (color images, depth maps, device metadata).
LiveKit Service (livekit_ts/)
WebRTC room management and agent worker for real-time video sessions. Handles room creation, token generation, and the server-side agent that participates in sessions (LiveKit SDK 1.5.6).
Recording: The Go-based livekit-recorder captures RGB and depth video tracks, encrypts them at rest using AES-CBC (account DEK passed via DEK_BASE64 env var), and writes directly to the S3-mounted path. Recording assets are registered in the location's metadata DB.
3D Reconstruction: Session recordings feed into the splatting pipeline (queues/session_to_splat.py) which routes to either the standard or video-3D-reconstruction variant. The latter (session_to_splat.video_3d_reconstruction.py) decrypts recordings → extracts frames via ffmpeg → runs COLMAP SfM (with checkpoint-based preemption tolerance) → produces Gaussian splats via fastgs.
ML Gateway (ml/)
| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Liveness check |
| /ml/generate | POST | Single-shot LLM completion (Gemini) |
| /ml/generate-plan | POST | Question-answering + plan generation flow |
| /ml/chat/stream | POST | SSE streaming chat (Gemini + Ollama multiplexed). Supports optional messages_by_provider for per-provider conversation threads |
| /ml/data/inspect | GET | Browse captured training data |
| /ml/data/stats | GET | Training data statistics |
| /ml/training/start | POST | Trigger model training (stub) |
| /ml/training/status | GET | Training job status (stub) |
| /ml/serving/models | GET | List deployed model versions (stub) |
| /ml/serving/promote | POST | Promote model version (stub) |
| /ml/eval/run | POST | Non-blocking multi-model eval: spawns one worker per model, returns {run_id, jobs} |
| /ml/eval/jobs/{job_id} | GET | Eval job status + per-record progress (with zombie probe) |
| /ml/eval/runs | GET | List recent eval runs |
| /ml/eval/runs/{run_id} | GET | Eval run detail with per-model jobs |
| /ml/eval/golden-set | POST | Create golden set from raw eval data |
| /comparison | GET | Side-by-side streaming comparison across all 6 models |
| /dashboard | GET | Web UI for ML pipeline operations |
| /ml/serve/{slot}/warmup | POST | Spawn non-blocking vLLM container warmup |
| /ml/serve/warmup/{call_id} | GET | Poll warmup status (starting/ready/error) |
| /ml/serve/{slot}/status | GET | Live runner count + backlog for a vLLM class |
| /ml/mobile/ingest-litert | POST | Admin: download base LiteRT bundle from HF → S3 |
| /ml/mobile/convert-litert | POST | Admin: convert finetuned weights to LiteRT (returns 501; awaiting litert-torch E-series support) |
| /ml/mobile/manifest/{slot} | GET | iOS client: version, size, sha256, presigned S3 URL |
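The manifest row above lists a size and a sha256 alongside the download URL, which suggests the client can verify a bundle after fetching it. A minimal sketch of that check follows, written in Python for illustration even though the real consumer is the iOS client. The JSON key names ("size", "sha256") are assumed from the column description and may differ in the actual response.

```python
import hashlib


def verify_bundle(data: bytes, manifest: dict) -> bool:
    """Check a downloaded bundle against the manifest's size and SHA-256 digest.

    Assumption: the manifest exposes integer "size" (bytes) and hex "sha256" keys.
    """
    if len(data) != manifest["size"]:
        return False  # cheap check first: truncated or padded download
    return hashlib.sha256(data).hexdigest() == manifest["sha256"].lower()
```

Checking the size before hashing rejects truncated downloads without reading the whole payload through the digest.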
Data capture: All LLM requests/responses logged to S3 JSONL for training.
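As a rough illustration of the JSONL capture format mentioned above, the sketch below appends one JSON object per line to a file-like buffer. The record fields (ts, request, response) are assumptions, and the upload of the buffer to S3 is deliberately left out.

```python
import io
import json
import time
from typing import Any, Optional, TextIO


def append_capture(buf: TextIO, request: dict[str, Any], response: dict[str, Any],
                   ts: Optional[float] = None) -> None:
    """Write one request/response pair as a single JSONL line.

    Field names are hypothetical; the real capture schema is not shown here.
    """
    record = {
        "ts": time.time() if ts is None else ts,
        "request": request,
        "response": response,
    }
    # json.dumps escapes embedded newlines, so each record stays on one line.
    buf.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Because every record occupies exactly one line, downstream training jobs can stream the file and parse it line by line.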
LLM providers: Gemini (primary), Claude, OpenAI (teacher models for comparison/distillation), on-prem Ollama (via model proxy), vLLM (Modal @modal.cls() + .remote.aio() RPC — live for all 3 Gemma 4 slots).
ML Sub-modules
- gateway/: LLM routing (Gemini, Ollama), streaming provider adapters
- contracts/: v0 data contracts (capture-bundle, scene-facts, data-pile). JSON Schema (draft 2020-12), ARKit world frame. Shared loader (loader.py) for schema validation and markdown front-matter parsing. Validator script checks all fixtures
- perception/: Track 2 (video/capture → structured scene_facts). SAM2 video segmentation (Modal app, per-frame masks with track IDs), Cosmos 3 video-language model (NVIDIA NIM Modal app, room/surface captions), merge step combining both into validated scene_facts. Hallucination judge (LLM cross-check of captions vs frames). Eval scripts produce artifact reports
- reasoning/: Track 3 (grounded Q&A over progressive-markdown knowledge bases). Loads data-pile docs, validates front-matter, answers questions via LLM. A/B eval compares grounded vs video-only reasoning. Fixture pile for local testing
- data_pipeline/: Supabase scraper (nightly), synthetic data generator, format converters
- training/: Model registry, LoRA/QLoRA configs, Unsloth trainer stubs
- serving/: vLLM inference (live for E2B/E4B/31B via .remote.aio())
- mobile/: LiteRT on-device model distribution (HF → Volume → S3 → iOS manifest)
- eval/: Non-blocking evaluation framework with golden-set management, multi-model comparison, per-record metrics + judge scores, and a zombie probe on running workers
Static Site (static_site/)
Centralized demo hub, blog, docs viewer, and deploy dashboard at static-{env}.grizzlebear.io. Consolidates 8 demo pages under a shared TradeSpark-themed shell with the canonical design system tokens.
| Endpoint | Method | Description |
|---|---|---|
| / | GET | Landing page with links to all demos, docs, and blog |
| /demos/ml-comparison | GET | Side-by-side model comparison UI |
| /demos/ml-generation | GET | Synthetic project generation pipeline (formerly ml-dashboard; old slug 301-redirects) |
| /demos/ml-eval | GET | Multi-model eval dashboard (golden-set picker, comparison table, record diffs) |
| /demos/ml-training | GET | Training job monitor with progress bars |
| /demos/traction | GET | Admin analytics: user/project signal from Supabase (email-allowlist gated) |
| /demos/mobile-session | GET | Mobile session simulator (project/property dropdowns from tsweb API) |
| /demos/model-proxy | GET | On-prem model proxy test page |
| /demos/websocket-test | GET | WebSocket session test client |
| /docs | GET | Docs index (architecture, services, changelogs, IMPROVE.md) |
| /docs/{path} | GET | Individual doc page (runtime markdown rendering, login-required) |
| /login | GET | Login page |
| /posts | GET | Blog post index |
| /posts/{slug} | GET | Individual blog post (Markdown with YAML frontmatter) |
Shared assets: CSS design tokens (TradeSpark theme), JS utility layer (Site.apiFetch with auto-auth + 401 refresh, Site.serviceUrl for per-env URL resolution).
Docs viewer (docs.py): Renders docs/ corpus and IMPROVE.md at runtime via markdown-it-py. Login-required (not in publicPrefixes). Path-traversal protection via _safe_doc_path(). FastAPI's built-in Swagger UI is disabled to free the /docs route.
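The actual _safe_doc_path() is not reproduced in this catalog. One common way to implement this kind of path-traversal guard, shown here as a hypothetical safe_doc_path under the assumption that all docs live beneath a single root directory, is to resolve the requested path and confirm it still sits inside that root.

```python
from pathlib import Path
from typing import Optional


def safe_doc_path(root: Path, requested: str) -> Optional[Path]:
    """Resolve `requested` under `root`; return None if it escapes the root.

    Hypothetical helper for illustration. Requires Python 3.9+ for is_relative_to.
    """
    resolved_root = root.resolve()
    # Joining with an absolute `requested` discards the root, which the check below catches.
    candidate = (resolved_root / requested).resolve()
    if not candidate.is_relative_to(resolved_root):
        return None
    return candidate
```

Resolving before comparing matters because a plain string prefix check would accept inputs such as docs/../secrets.env.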
Deploy Dashboard (deploy_dashboard.py)
Pipeline visualization at /deploy-dashboard/ showing all environments and their promotion flow.
| Endpoint | Method | Description |
|---|---|---|
| /deploy-dashboard/ | GET | Dashboard HTML shell (Cache-Control: no-store) |
| /deploy-dashboard/api/envs | GET | All env states, branch tips, test results (parallelized) |
| /deploy-dashboard/api/promote | POST | Trigger promotion between environments |
| /deploy-dashboard/api/test | POST | Trigger test run on an environment |
| /deploy-dashboard/api/approve-prod | POST | Approve and deploy to production (gated) |
Features: env cards with status chips (SHA, test pass/fail, commit delta), pipeline connectors, action toasts with per-stage elapsed timers, live test progress (X/Y folders), adaptive polling, skeleton first-paint. Approve-prod button gated: only enabled when beta tests pass on current origin/beta tip.
Backend: Queries ci/_git_in_modal for branch tips (with 60s TTL cache), Modal AppList RPC for env states, and Bruno test results. All data sources fetched in parallel.
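The 60s TTL cache on branch tips can be illustrated with a small single-value cache. This is a sketch rather than the dashboard's actual code: the class name, the injectable clock, and the fetch callable are all assumptions introduced for the example.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TTLCache(Generic[T]):
    """Cache one fetched value and refresh it once the TTL has elapsed."""

    def __init__(self, fetch: Callable[[], T], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests do not need to sleep
        self._value: Optional[T] = None
        self._expires_at: Optional[float] = None

    def get(self) -> T:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch()          # slow call, e.g. a GitLab request
            self._expires_at = now + self._ttl
        return self._value  # type: ignore[return-value]
```

A monotonic clock is used by default so that wall-clock adjustments cannot extend or shorten the cache lifetime.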
tsweb (tsweb/)
Quarantine boundary for all tsweb-app Supabase integrations at tsweb-{env}.grizzlebear.io. Isolates project/property/task reads so the rest of the codebase stays clean — when tsweb-app is retired, cleanup is a single-directory delete.
| Endpoint | Method | Description |
|---|---|---|
| / | GET | Service health (temporary status JSON) |
| /projects | GET | User-scoped projects with joined property data (bearer auth) |
Nightly cron: tsweb.scheduled runs the Supabase scraper on schedule (migrated from ml_endpoint).
Internal consumers: data/sync.py (project sync), users/auth.py (location resolution), static_site mobile-session demo (dropdowns).
Dependency management: tsweb is imported lazily at call sites (not top-level) and mounted only on the users and data functions via Modal's Image.add_local_python_source("tsweb"). Top-level imports would crash all services that import AuthContext but don't mount tsweb.
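The lazy-import pattern described above usually amounts to placing `import tsweb` inside the function body instead of at module top level. The sketch below expresses the same idea through importlib so it can run without tsweb installed; the helper name call_optional and the tsweb function list_projects are hypothetical.

```python
import importlib
from typing import Any


def call_optional(module_name: str, func_name: str, *args: Any) -> Any:
    """Import a module only when the call path is exercised, then call into it."""
    try:
        module = importlib.import_module(module_name)  # deferred until first use
    except ImportError as exc:
        raise RuntimeError(f"{module_name} is not mounted on this function") from exc
    return getattr(module, func_name)(*args)


def list_projects(user_id: str) -> Any:
    # "tsweb" comes from the catalog; "list_projects" is a made-up entry point.
    return call_optional("tsweb", "list_projects", user_id)
```

Importing the module that defines list_projects never touches tsweb, so services that do not mount it can still import shared code and fail only if they actually call into it.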
Model Proxy (model_proxy/)
Reverse proxy to on-prem model servers (vLLM/Ollama on Mac Mini).
| Domain | Flow |
|---|---|
| model-{env}.grizzlebear.io | Client -> Modal (static egress IP) -> UDM firewall -> Mac Mini |
Auth: JWT bearer token validation before forwarding. Static IP allowlisted on UDM.
CI/CD Pipeline (ci/)
Modal-driven promotion DAG running in the cicd environment. Replaces the previous GitLab CI shell scripts.
Webhook Dispatcher (webhook.py)
| Endpoint | Method | Description |
|---|---|---|
| /promote-to-dev | POST | Merge feature branch → dev, deploy, test |
| /dev-test | POST | Deploy dev + run tests (self-heals on uncertain state) |
| /promote-dev-to-beta | POST | Merge dev → beta, deploy, test |
| /approve-prod | POST | Deploy to production (requires X-Approve-Token) |
| /action-status/{id} | GET | Poll in-flight action progress |
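A header-token gate like the X-Approve-Token requirement on /approve-prod is typically checked with a constant-time comparison. The sketch below assumes the expected token is a shared secret supplied by the caller of the check; how the real dispatcher stores or validates it is not stated here, and the function name is hypothetical.

```python
import hmac
from typing import Mapping


def approve_token_ok(headers: Mapping[str, str], expected: str) -> bool:
    """Return True when the X-Approve-Token header matches the expected secret."""
    supplied = next(
        (value for name, value in headers.items() if name.lower() == "x-approve-token"),
        "",
    )
    if not expected:
        return False  # never accept requests when no secret is configured
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(supplied.encode(), expected.encode())
```

Rejecting an empty expected value prevents a misconfigured deployment from accepting requests that omit the header.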
Supporting Modules
| Module | Purpose |
|---|---|
| deploy_in_modal.py | Remote modal deploy execution (slim image, no ML deps) |
| bruno_in_modal.py | Remote Bruno test runner: parallel tier-aware, live progress, Modal Secret creds |
| _git_in_modal.py | Git operations (merge, branch tips with commits, env states via AppList RPC, TTL-cached GitLab calls) |
| scheduled_cleanup.py | Cron: non-prod app cleanup + daily main canary test (14:00 UTC) |
Headless Claude Runner (automation/)
Separate Modal app (grizzlebear-claude-runner) that runs the /improve and /document skills headless via the Claude Code CLI. Isolated from the main grizzlebear-api blast radius (its own image + secrets).
| Schedule | Skill | Scope |
|---|---|---|
| Mon 03:00 UTC | /improve | IMPROVE.md only |
| Mon 04:00 UTC | /document | docs/ only |
Safety guards (hardening.py): per-skill allowed-path enforcement, pre-push hook blocking force/delete pushes and non-dev branch pushes, env scrubbing, gitleaks scanning, commit/diff validation.
Commits use reserved prefixes: IMPROVE: and DOCUMENT: . Secrets sourced from environment_name="main".
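Per-skill allowed-path enforcement of the kind listed above could look roughly like this. The scopes mirror the schedule table (IMPROVE.md for /improve, docs/ for /document), but the rule format, function names, and normalization choices are assumptions and not the contents of hardening.py.

```python
from pathlib import PurePosixPath

# Rules ending in "/" match a directory prefix; other rules match one exact file.
ALLOWED_PATHS: dict[str, tuple[str, ...]] = {
    "improve": ("IMPROVE.md",),
    "document": ("docs/",),
}


def path_allowed(skill: str, path: str) -> bool:
    """Return True when a repo-relative path falls inside the skill's scope."""
    parsed = PurePosixPath(path)
    if parsed.is_absolute() or ".." in parsed.parts:
        return False  # refuse anything that could point outside the repo
    normalized = parsed.as_posix()
    for rule in ALLOWED_PATHS.get(skill, ()):
        if rule.endswith("/"):
            if normalized.startswith(rule):
                return True
        elif normalized == rule:
            return True
    return False


def violations(skill: str, changed_paths: list[str]) -> list[str]:
    """List every changed path that the skill is not permitted to touch."""
    return [p for p in changed_paths if not path_allowed(skill, p)]
```

A runner could compute violations over the paths in a pending commit and abort before pushing when the list is non-empty.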
Docker Base Images
Heavy base images are pre-baked to AWS ECR to avoid repeated pip installs during deploys:
| Base | Contents | Built by |
|---|---|---|
| dockerfile.base | Core Python deps | build_ecr_base_on_ec2.py |
| dockerfile.ai.base | torch, transformers, CUDA | build_ecr_base_on_ec2.py |
| dockerfile.livekit_server.base | LiveKit SDK + deps | build_ecr_base_on_ec2.py |
| dockerfile.livekit_agent.base | Go compiler + ML deps | build_ecr_base_on_ec2.py |
| dockerfile.ml_training.base | Unsloth, training deps | build_ecr_base_on_ec2.py |
| dockerfile.session_to_splat.base | COLMAP, fastgs, ffmpeg | build_ecr_base_on_ec2.py |
Automation Runner (dev/automation/)
Separate Modal app (grizzlebear-claude-runner) — not part of app.py — that runs the /improve and /document Claude Code skills headless. Isolated image (Node + Claude CLI + gitleaks) and secrets (sourced from env=main) keep it off the grizzlebear-api blast radius.
| Entrypoint | Description |
|---|---|
| run_improve | Refresh IMPROVE.md; commits IMPROVE: … to origin/dev |
| run_document | Refresh /docs/; commits DOCUMENT: … to origin/dev |
Secrets (reused, from env=main): TradeSpark (AWS + ANTHROPIC_API_KEY), GitlabPushToken (GITLAB_TOKEN), ClaudeCodeOAuth (headless CLI auth).
Hardening (hardening.py): per-skill allowed-path enforcement, git pre-push hook (no force/delete, dev-only), env scrubbing, gitleaks scan. Manual-only for now (uv run modal run -e jh automation/claude_runner.py::run_improve [--no-push]); the cron schedule is commented out in place.