Service Catalog

Last updated: 2026-06-03

Quick reference for every deployable service in the Grizzlebear platform.

API Gateway (app.py)

Entry point for all services. Composes sub-apps into a single Modal deployment.

Includes: user_data_app (users + data), low_priority_app (voices + geocoding + capture + static_site), livekit_ts.app, ml.app, model_proxy.app, tsweb.app

Modal Function Topology

Services are grouped into consolidated Modal functions to reduce cold-start surface:

Function	Services	Notes
`user_data_app`	users, data	In-process dispatch (no HTTP self-calls)
`low_priority_app`	voices, geocoding, capture, static_site	Hosts deploy dashboard
`livekit_ts`	livekit server + agent	Separate due to heavy Go/ML deps
`ml`	ML gateway, training, serving, eval	GPU-attached
`model_proxy`	model proxy	Static egress IP
`tsweb`	tsweb	Quarantine boundary

Users Service (users/)

Endpoint	Method	Description
`/v1/register`	POST	Register new user
`/v1/login`	POST	Authenticate user
`/v1/profile`	GET	Get user profile
`/v1/resolve-location`	GET	Resolve location with geocoding context

Auth: Supabase magic links + OTP. Replaces the original badauth key/secret system.

Traction Admin (users/traction.py)

Endpoint	Method	Description
`/admin/traction/stats`	GET	User totals (registered/anonymous, homeowner/pro)
`/admin/traction/daily-signups`	GET	Per-day signup counts for calendar heatmap
`/admin/traction/daily-projects`	GET	Per-day project creation counts
`/admin/traction/projects`	GET	Paginated project list (20/page)
`/admin/traction/projects/{id}`	GET	Project detail (prompt, refinement Q&A, plan tree)

Auth: TradesparkEmailAdmin dependency — email allowlist via TS_ADMIN_EMAILS env var (from TradesparkAdmins Modal Secret). Decoupled from Supabase profiles.role.
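
A minimal sketch of the allowlist check, assuming TS_ADMIN_EMAILS holds a comma-separated list of addresses. The separator, the case-insensitive comparison, and the helper names `parse_admin_emails` / `is_admin_email` are assumptions for illustration, not taken from users/traction.py.

```python
from __future__ import annotations

import os


def parse_admin_emails(raw: str) -> frozenset[str]:
    """Split a comma-separated allowlist into normalized addresses."""
    return frozenset(
        part.strip().lower() for part in raw.split(",") if part.strip()
    )


def is_admin_email(email: str, raw_allowlist: str | None = None) -> bool:
    """Return True when the email is on the allowlist.

    Falls back to the TS_ADMIN_EMAILS env var when no list is passed.
    An empty or missing allowlist denies everyone.
    """
    if raw_allowlist is None:
        raw_allowlist = os.environ.get("TS_ADMIN_EMAILS", "")
    return email.strip().lower() in parse_admin_emails(raw_allowlist)
```

Because the check reads only the env var, it stays independent of Supabase profiles.role, which matches the decoupling described above.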

Billing Service (archived)

Deprecated May 21 — Stripe integration moving elsewhere. Code archived at dev/_archived/billing/.

Data Service (data/)

Endpoint	Method	Description
`/v1/assets`	GET	List assets (supports `projectId=all` with deduplication)
`/v1/asset`	GET/POST/DELETE	CRUD for individual assets
`/v1/asset/chunked/init`	POST	Initialize chunked upload
`/v1/asset/chunked/upload`	POST	Upload chunk (set `final=true` to complete)
`/v1/sync-project`	POST	Pull Supabase project data into location markdown
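
The chunked upload flow (init, then one upload call per chunk with `final=true` on the last) can be sketched client-side as below. Only the `final` flag comes from the table above; the chunk index, the tuple shape, and the `iter_chunks` name are hypothetical.

```python
from typing import Iterator, Tuple


def iter_chunks(data: bytes, chunk_size: int) -> Iterator[Tuple[int, bytes, bool]]:
    """Yield (index, chunk, final) tuples covering `data` in order.

    `final` is True only for the last chunk, mirroring the `final=true`
    flag that completes the upload. Empty input yields one empty final chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Ceiling division; at least one chunk so the upload can be completed.
    total = max(1, -(-len(data) // chunk_size))
    for index in range(total):
        start = index * chunk_size
        yield index, data[start:start + chunk_size], index == total - 1
```

A caller would POST each chunk to `/v1/asset/chunked/upload` in order and pass the third tuple element as the `final` flag.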

Storage: one SQLite database per location, backed by S3. Writes are protected by ETag compare-and-swap (CAS) plus per-key locking via with_location_db. Files are stored as S3 blobs. Paths follow the account/geo_prefix/location/ convention.
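
To illustrate how ETag CAS and per-key locking combine, here is a sketch against an in-memory stand-in for the object store. It is not the with_location_db implementation: `FakeObjectStore`, `update_with_cas`, and the retry count are all hypothetical.

```python
from __future__ import annotations

import hashlib
import threading
from typing import Callable


class PreconditionFailed(Exception):
    """The stored ETag no longer matches the one the writer read."""


class FakeObjectStore:
    """In-memory stand-in for an object store with conditional writes."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}
        self._guard = threading.Lock()

    @staticmethod
    def _etag(body: bytes) -> str:
        return hashlib.sha256(body).hexdigest()

    def get(self, key: str) -> tuple[bytes, str | None]:
        with self._guard:
            body = self._objects.get(key)
        return (b"", None) if body is None else (body, self._etag(body))

    def put_if_match(self, key: str, body: bytes, expected: str | None) -> None:
        with self._guard:
            current = self._objects.get(key)
            current_etag = None if current is None else self._etag(current)
            if current_etag != expected:
                raise PreconditionFailed(key)
            self._objects[key] = body


_locks_guard = threading.Lock()
_key_locks: dict[str, threading.Lock] = {}


def _lock_for(key: str) -> threading.Lock:
    with _locks_guard:
        return _key_locks.setdefault(key, threading.Lock())


def update_with_cas(store: FakeObjectStore, key: str,
                    mutate: Callable[[bytes], bytes], max_attempts: int = 3) -> bytes:
    """Read-modify-write one key; retry if another writer changed it."""
    with _lock_for(key):  # serializes writers inside this process
        for _ in range(max_attempts):
            body, etag = store.get(key)
            new_body = mutate(body)
            try:
                store.put_if_match(key, new_body, etag)  # CAS guards other processes
                return new_body
            except PreconditionFailed:
                continue  # lost the race: re-read and retry
    raise RuntimeError(f"gave up updating {key} after {max_attempts} attempts")
```

The lock avoids wasted retries between threads in one container, while the ETag comparison is what protects against writers in other containers.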

Voices Service (voices/)

Endpoint	Method	Description
`/v1/tts`	POST	Text-to-speech (streaming audio)

Providers: ElevenLabs, OpenAI. Voice selection via ranked voices.json config.

Geocoding Service (geocoding/)

Endpoint	Method	Description
`/v1/reverse-geocode`	GET	GPS coordinates to address
`/v1/map-image`	GET	Static map image

Providers: Mapbox (primary), Google Maps (fallback).

Capture Service (capture/)

AR session metadata collection (color images, depth maps, device metadata).

LiveKit Service (livekit_ts/)

WebRTC room management and agent worker for real-time video sessions. Handles room creation, token generation, and the server-side agent that participates in sessions (LiveKit SDK 1.5.6).

Recording: The Go-based livekit-recorder captures RGB and depth video tracks, encrypts them at rest using AES-CBC (account DEK passed via DEK_BASE64 env var), and writes directly to the S3-mounted path. Recording assets are registered in the location's metadata DB.
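
As a hedged illustration of consuming the DEK on the Python side (for example before decrypting a recording), the sketch below decodes DEK_BASE64 and checks that the result is a valid AES key length. The function name `load_dek` and the injectable `env` mapping are assumptions; the recorder itself is Go and is not shown here.

```python
from __future__ import annotations

import base64
import os
from typing import Mapping

# AES accepts 128-, 192-, or 256-bit keys.
VALID_AES_KEY_LENGTHS = (16, 24, 32)


def load_dek(env: Mapping[str, str] | None = None) -> bytes:
    """Decode the account DEK from DEK_BASE64 and validate its length."""
    env = os.environ if env is None else env
    raw = env.get("DEK_BASE64")
    if not raw:
        raise ValueError("DEK_BASE64 is not set")
    # validate=True rejects characters outside the base64 alphabet.
    key = base64.b64decode(raw, validate=True)
    if len(key) not in VALID_AES_KEY_LENGTHS:
        raise ValueError(f"unexpected DEK length: {len(key)} bytes")
    return key
```

Failing fast on a malformed key keeps a misconfigured secret from surfacing later as an opaque decryption error.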

3D Reconstruction: Session recordings feed into the splatting pipeline (queues/session_to_splat.py) which routes to either the standard or video-3D-reconstruction variant. The latter (session_to_splat.video_3d_reconstruction.py) decrypts recordings → extracts frames via ffmpeg → runs COLMAP SfM (with checkpoint-based preemption tolerance) → produces Gaussian splats via fastgs.

ML Gateway (ml/)

Endpoint	Method	Description
`/health`	GET	Liveness check
`/ml/generate`	POST	Single-shot LLM completion (Gemini)
`/ml/generate-plan`	POST	Question-answering + plan generation flow
`/ml/chat/stream`	POST	SSE streaming chat (Gemini + Ollama multiplexed). Supports optional `messages_by_provider` for per-provider conversation threads
`/ml/data/inspect`	GET	Browse captured training data
`/ml/data/stats`	GET	Training data statistics
`/ml/training/start`	POST	Trigger model training (stub)
`/ml/training/status`	GET	Training job status (stub)
`/ml/serving/models`	GET	List deployed model versions (stub)
`/ml/serving/promote`	POST	Promote model version (stub)
`/ml/eval/run`	POST	Non-blocking multi-model eval — spawns one worker per model, returns `{run_id, jobs}`
`/ml/eval/jobs/{job_id}`	GET	Eval job status + per-record progress (with zombie probe)
`/ml/eval/runs`	GET	List recent eval runs
`/ml/eval/runs/{run_id}`	GET	Eval run detail with per-model jobs
`/ml/eval/golden-set`	POST	Create golden set from raw eval data
`/comparison`	GET	Side-by-side streaming comparison across all 6 models
`/dashboard`	GET	Web UI for ML pipeline operations
`/ml/serve/{slot}/warmup`	POST	Spawn non-blocking vLLM container warmup
`/ml/serve/warmup/{call_id}`	GET	Poll warmup status (starting/ready/error)
`/ml/serve/{slot}/status`	GET	Live runner count + backlog for a vLLM class
`/ml/mobile/ingest-litert`	POST	Admin: download base LiteRT bundle from HF → S3
`/ml/mobile/convert-litert`	POST	Admin: convert finetuned weights to LiteRT (returns 501 — awaiting litert-torch E-series support)
`/ml/mobile/manifest/{slot}`	GET	iOS client: version, size, sha256, presigned S3 URL
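
The manifest fields (version, size, sha256, presigned S3 URL) let a client verify a downloaded bundle before using it. The real consumer is the iOS client; the Python sketch below only illustrates the check, and the `ModelManifest` type, its `url` field name, and `verify_bundle` are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelManifest:
    version: str
    size: int      # expected bundle size in bytes
    sha256: str    # hex digest of the bundle
    url: str       # presigned S3 URL (field name assumed)


def verify_bundle(manifest: ModelManifest, payload: bytes) -> bool:
    """Return True when the payload matches the manifest size and digest."""
    if len(payload) != manifest.size:
        return False  # cheap check first: truncated or padded download
    digest = hashlib.sha256(payload).hexdigest()
    return hmac.compare_digest(digest, manifest.sha256.lower())
```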

Data capture: All LLM requests/responses logged to S3 JSONL for training.

LLM providers: Gemini (primary), Claude, OpenAI (teacher models for comparison/distillation), on-prem Ollama (via model proxy), vLLM (Modal @modal.cls() + .remote.aio() RPC — live for all 3 Gemma 4 slots).

ML Sub-modules

gateway/: LLM routing (Gemini, Ollama), streaming provider adapters
contracts/: v0 data contracts — capture-bundle, scene-facts, data-pile. JSON Schema (draft 2020-12), ARKit world frame. Shared loader (loader.py) for schema validation and markdown front-matter parsing. Validator script checks all fixtures
perception/: Track 2 — video/capture → structured scene_facts. SAM2 video segmentation (Modal app, per-frame masks with track IDs), Cosmos 3 video-language model (NVIDIA NIM Modal app, room/surface captions), merge step combining both into validated scene_facts. Hallucination judge (LLM cross-check of captions vs frames). Eval scripts produce artifact reports
reasoning/: Track 3 — grounded Q&A over progressive-markdown knowledge bases. Loads data-pile docs, validates front-matter, answers questions via LLM. A/B eval compares grounded vs video-only reasoning. Fixture pile for local testing
data_pipeline/: Supabase scraper (nightly), synthetic data generator, format converters
training/: Model registry, LoRA/QLoRA configs, Unsloth trainer stubs
serving/: vLLM inference (live for E2B/E4B/31B via .remote.aio())
mobile/: LiteRT on-device model distribution (HF → Volume → S3 → iOS manifest)
eval/: Non-blocking evaluation framework — golden-set management, multi-model comparison, per-record metrics + judge scores, zombie probe on running workers

Static Site (static_site/)

Centralized demo hub, blog, docs viewer, and deploy dashboard at static-{env}.grizzlebear.io. Consolidates 8 demo pages under a shared TradeSpark-themed shell with the canonical design system tokens.

Endpoint	Method	Description
`/`	GET	Landing page with links to all demos, docs, and blog
`/demos/ml-comparison`	GET	Side-by-side model comparison UI
`/demos/ml-generation`	GET	Synthetic project generation pipeline (formerly ml-dashboard; old slug 301-redirects)
`/demos/ml-eval`	GET	Multi-model eval dashboard (golden-set picker, comparison table, record diffs)
`/demos/ml-training`	GET	Training job monitor with progress bars
`/demos/traction`	GET	Admin analytics — user/project signal from Supabase (email-allowlist gated)
`/demos/mobile-session`	GET	Mobile session simulator (project/property dropdowns from tsweb API)
`/demos/model-proxy`	GET	On-prem model proxy test page
`/demos/websocket-test`	GET	WebSocket session test client
`/docs`	GET	Docs index (architecture, services, changelogs, IMPROVE.md)
`/docs/{path}`	GET	Individual doc page (runtime markdown rendering, login-required)
`/login`	GET	Login page
`/posts`	GET	Blog post index
`/posts/{slug}`	GET	Individual blog post (Markdown with YAML frontmatter)

Shared assets: CSS design tokens (TradeSpark theme), JS utility layer (Site.apiFetch with auto-auth + 401 refresh, Site.serviceUrl for per-env URL resolution).

Docs viewer (docs.py): Renders docs/ corpus and IMPROVE.md at runtime via markdown-it-py. Login-required (not in publicPrefixes). Path-traversal protection via _safe_doc_path(). FastAPI's built-in Swagger UI is disabled to free the /docs route.
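
One common way to implement path-traversal protection of this kind is to resolve the requested path and confirm it still sits under the docs root. The sketch below shows that approach; it is not the actual _safe_doc_path() code, and the name `safe_doc_path_sketch` and its signature are assumptions.

```python
from __future__ import annotations

from pathlib import Path


def safe_doc_path_sketch(root: Path, requested: str) -> Path | None:
    """Resolve `requested` under `root`, or return None if it escapes.

    Resolving collapses `..` segments and follows symlinks, so the
    containment check runs against the real target path.
    """
    resolved_root = root.resolve()
    # Joining with an absolute `requested` discards the root entirely,
    # which the containment check below then rejects.
    candidate = (resolved_root / requested).resolve()
    if candidate != resolved_root and resolved_root not in candidate.parents:
        return None
    return candidate
```

A route handler would return 404 when the helper yields None, rather than echoing the rejected path back to the caller.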

Deploy Dashboard (deploy_dashboard.py)

Pipeline visualization at /deploy-dashboard/ showing all environments and their promotion flow.

Endpoint	Method	Description
`/deploy-dashboard/`	GET	Dashboard HTML shell (Cache-Control: no-store)
`/deploy-dashboard/api/envs`	GET	All env states, branch tips, test results (parallelized)
`/deploy-dashboard/api/promote`	POST	Trigger promotion between environments
`/deploy-dashboard/api/test`	POST	Trigger test run on an environment
`/deploy-dashboard/api/approve-prod`	POST	Approve and deploy to production (gated)

Features: env cards with status chips (SHA, test pass/fail, commit delta), pipeline connectors, action toasts with per-stage elapsed timers, live test progress (X/Y folders), adaptive polling, skeleton first-paint. The approve-prod button is gated: it is enabled only when beta tests have passed on the current origin/beta tip.
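
The approve-prod gate reduces to a small predicate: the latest beta test run must have passed, and it must have run against the commit that origin/beta currently points to. A sketch under those assumptions (the `BetaTestResult` type and `can_approve_prod` name are hypothetical):

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BetaTestResult:
    sha: str      # commit the test run executed against
    passed: bool


def can_approve_prod(beta_tip_sha: str, last_result: BetaTestResult | None) -> bool:
    """Enable approval only for a passing run on the current beta tip."""
    if last_result is None:
        return False  # no test run recorded yet
    return last_result.passed and last_result.sha == beta_tip_sha
```

Comparing SHAs means a new commit on beta disables the button again until tests rerun.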

Backend: Queries ci/_git_in_modal for branch tips (with 60s TTL cache), Modal AppList RPC for env states, and Bruno test results. All data sources fetched in parallel.
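
A 60-second TTL cache like the one used for branch tips can be sketched as a small wrapper around a fetch function. This is an illustration, not the ci/_git_in_modal code; the `TTLCache` class and its injectable `clock` parameter are assumptions.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TTLCache(Generic[T]):
    """Cache a single fetched value and refresh it after `ttl_seconds`."""

    def __init__(
        self,
        fetch: Callable[[], T],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests need not sleep
        self._value: Optional[T] = None
        self._expires_at: Optional[float] = None

    def get(self) -> T:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Miss or expired entry: refetch and restart the TTL window.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value  # type: ignore[return-value]
```

Using a monotonic clock by default avoids spurious expiry or staleness when the wall clock is adjusted.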

tsweb (tsweb/)

Quarantine boundary for all tsweb-app Supabase integrations at tsweb-{env}.grizzlebear.io. Isolates project/property/task reads so the rest of the codebase stays clean — when tsweb-app is retired, cleanup is a single-directory delete.

Endpoint	Method	Description
`/`	GET	Service health (temporary status JSON)
`/projects`	GET	User-scoped projects with joined property data (bearer auth)

Nightly cron: tsweb.scheduled runs the Supabase scraper on schedule (migrated from ml_endpoint).

Internal consumers: data/sync.py (project sync), users/auth.py (location resolution), static_site mobile-session demo (dropdowns).

Dependency management: tsweb is imported lazily at call sites (not top-level) and mounted only on the users and data functions via Modal's Image.add_local_python_source("tsweb"). Top-level imports would crash all services that import AuthContext but don't mount tsweb.
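
The lazy-import pattern looks roughly like this. The function name `load_user_projects` and the `tsweb.list_projects` helper are hypothetical; only the placement of the import is the point.

```python
def load_user_projects(bearer_token: str) -> list:
    """Hypothetical call site that needs tsweb only when invoked."""
    # Deferred import: importing the enclosing module never touches tsweb,
    # so functions that do not mount it can still import this file.
    import tsweb  # available only where add_local_python_source("tsweb") is used

    return tsweb.list_projects(bearer_token)  # hypothetical helper name
```

In a container without tsweb, the module defining this function still imports cleanly; the ImportError surfaces only if the function is actually called.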

Model Proxy (model_proxy/)

Reverse proxy to on-prem model servers (vLLM/Ollama on Mac Mini).

Domain	Flow
`model-{env}.grizzlebear.io`	Client -> Modal (static egress IP) -> UDM firewall -> Mac Mini

Auth: JWT bearer token validation before forwarding. Static IP allowlisted on UDM.

CI/CD Pipeline (ci/)

Modal-driven promotion DAG running in the cicd environment. Replaces the previous GitLab CI shell scripts.

Webhook Dispatcher (webhook.py)

Endpoint	Method	Description
`/promote-to-dev`	POST	Merge feature branch → dev, deploy, test
`/dev-test`	POST	Deploy dev + run tests (self-heals on uncertain state)
`/promote-dev-to-beta`	POST	Merge dev → beta, deploy, test
`/approve-prod`	POST	Deploy to production (requires `X-Approve-Token`)
`/action-status/{id}`	GET	Poll in-flight action progress

Supporting Modules

Module	Purpose
`deploy_in_modal.py`	Remote `modal deploy` execution (slim image, no ML deps)
`bruno_in_modal.py`	Remote Bruno test runner — parallel tier-aware, live progress, Modal Secret creds
`_git_in_modal.py`	Git operations (merge, branch tips with commits, env states via AppList RPC, TTL-cached GitLab calls)
`scheduled_cleanup.py`	Cron: non-prod app cleanup + daily main canary test (14:00 UTC)

Headless Claude Runner (automation/)

Separate Modal app (grizzlebear-claude-runner) that runs the /improve and /document skills headless via the Claude Code CLI. Isolated from the main grizzlebear-api blast radius (its own image + secrets).

Schedule	Skill	Scope
Mon 03:00 UTC	`/improve`	`IMPROVE.md` only
Mon 04:00 UTC	`/document`	`docs/` only

Safety guards (hardening.py): per-skill allowed-path enforcement, pre-push hook blocking force/delete pushes and non-dev branch pushes, env scrubbing, gitleaks scanning, commit/diff validation.
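
The pre-push rules can be sketched as a pure function over the lines git feeds a pre-push hook on stdin (`<local ref> <local oid> <remote ref> <remote oid>`). This is not the hardening.py code: `check_push`, the `refs/heads/dev` constant, and the injected `is_ancestor` callable are assumptions. In a real hook, `is_ancestor` could wrap `git merge-base --is-ancestor`.

```python
from __future__ import annotations

from typing import Callable, Iterable

ALLOWED_REMOTE_REF = "refs/heads/dev"  # assumed name of the only pushable branch


def _is_zero_oid(oid: str) -> bool:
    return set(oid) == {"0"}


def check_push(
    lines: Iterable[str],
    is_ancestor: Callable[[str, str], bool],
) -> list[str]:
    """Return violation messages; an empty list means the push is allowed."""
    violations: list[str] = []
    for line in lines:
        if not line.strip():
            continue
        _local_ref, local_oid, remote_ref, remote_oid = line.split()
        if remote_ref != ALLOWED_REMOTE_REF:
            violations.append(f"{remote_ref}: only {ALLOWED_REMOTE_REF} may be pushed")
            continue
        if _is_zero_oid(local_oid):
            # git reports a deletion with an all-zero local object id.
            violations.append(f"{remote_ref}: branch deletion is blocked")
            continue
        # An all-zero remote oid means the ref is new, so nothing is rewritten.
        if not _is_zero_oid(remote_oid) and not is_ancestor(remote_oid, local_oid):
            violations.append(f"{remote_ref}: non-fast-forward (force) push is blocked")
    return violations
```

The hook script would print the violations and exit non-zero when the list is non-empty, which makes git abort the push.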

Commits use two reserved prefixes, IMPROVE: and DOCUMENT:, and secrets are sourced from environment_name="main".

Docker Base Images

Heavy base images are pre-baked to AWS ECR to avoid repeated pip installs during deploys:

Base	Contents	Built by
`dockerfile.base`	Core Python deps	`build_ecr_base_on_ec2.py`
`dockerfile.ai.base`	torch, transformers, CUDA	`build_ecr_base_on_ec2.py`
`dockerfile.livekit_server.base`	LiveKit SDK + deps	`build_ecr_base_on_ec2.py`
`dockerfile.livekit_agent.base`	Go compiler + ML deps	`build_ecr_base_on_ec2.py`
`dockerfile.ml_training.base`	Unsloth, training deps	`build_ecr_base_on_ec2.py`
`dockerfile.session_to_splat.base`	COLMAP, fastgs, ffmpeg	`build_ecr_base_on_ec2.py`

Automation Runner (dev/automation/)

Separate Modal app (grizzlebear-claude-runner) — not part of app.py — that runs the /improve and /document Claude Code skills headless. Isolated image (Node + Claude CLI + gitleaks) and secrets (sourced from env=main) keep it off the grizzlebear-api blast radius.

Entrypoint	Description
`run_improve`	Refresh `IMPROVE.md` — commits `IMPROVE: …` to `origin/dev`
`run_document`	Refresh `/docs/` — commits `DOCUMENT: …` to `origin/dev`

Secrets (reused, from env=main): TradeSpark (AWS + ANTHROPIC_API_KEY), GitlabPushToken (GITLAB_TOKEN), ClaudeCodeOAuth (headless CLI auth).

Hardening (hardening.py): per-skill allowed-path enforcement, git pre-push hook (no force/delete, dev-only), env scrubbing, gitleaks scan. Manual-only for now (uv run modal run -e jh automation/claude_runner.py::run_improve [--no-push]); the cron schedule is commented out in place.