
Service Catalog

Last updated: 2026-06-03

Quick reference for every deployable service in the Grizzlebear platform.

API Gateway (app.py)

Entry point for all services. Composes sub-apps into a single Modal deployment.

Includes: user_data_app (users + data), low_priority_app (voices + geocoding + capture + static_site), livekit_ts.app, ml.app, model_proxy.app, tsweb.app

Modal Function Topology

Services are grouped into consolidated Modal functions to reduce cold-start surface:

| Function | Services | Notes |
| --- | --- | --- |
| user_data_app | users, data | In-process dispatch (no HTTP self-calls) |
| low_priority_app | voices, geocoding, capture, static_site | Hosts deploy dashboard |
| livekit_ts | livekit server + agent | Separate due to heavy Go/ML deps |
| ml | ML gateway, training, serving, eval | GPU-attached |
| model_proxy | model proxy | Static egress IP |
| tsweb | tsweb | Quarantine boundary |
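
The "in-process dispatch (no HTTP self-calls)" note for user_data_app can be illustrated with a small handler registry. This is a minimal sketch, not the platform's actual code: the `register` decorator, the `dispatch` function, and the request/response dict shapes are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical registry: service name -> handler living in the same process.
Handler = Callable[[dict], dict]
_HANDLERS: Dict[str, Handler] = {}


def register(service: str) -> Callable[[Handler], Handler]:
    """Decorator that records a handler under a service name."""
    def wrap(fn: Handler) -> Handler:
        _HANDLERS[service] = fn
        return fn
    return wrap


def dispatch(service: str, request: dict) -> dict:
    """Route a request to a co-located handler; raises KeyError if unknown."""
    return _HANDLERS[service](request)


@register("users")
def users_handler(request: dict) -> dict:
    # Stand-in for the users service: echoes the user id back as a profile.
    return {"user_id": request["user_id"], "plan": "free"}


@register("data")
def data_handler(request: dict) -> dict:
    # The data service reaches users through a plain function call,
    # not an HTTP request back into the same deployment.
    profile = dispatch("users", {"user_id": request["user_id"]})
    return {"owner": profile["user_id"], "assets": []}
```

Calling `dispatch("data", {"user_id": "u1"})` resolves both services inside one process, which is the property the table's note describes.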

Users Service (users/)

| Endpoint | Method | Description |
| --- | --- | --- |
| /v1/register | POST | Register new user |
| /v1/login | POST | Authenticate user |
| /v1/profile | GET | Get user profile |
| /v1/resolve-location | GET | Resolve location with geocoding context |

Auth: Supabase magic links + OTP. This replaces the original badauth key/secret system.

Traction Admin (users/traction.py)

| Endpoint | Method | Description |
| --- | --- | --- |
| /admin/traction/stats | GET | User totals (registered/anonymous, homeowner/pro) |
| /admin/traction/daily-signups | GET | Per-day signup counts for calendar heatmap |
| /admin/traction/daily-projects | GET | Per-day project creation counts |
| /admin/traction/projects | GET | Paginated project list (20/page) |
| /admin/traction/projects/{id} | GET | Project detail (prompt, refinement Q&A, plan tree) |

Auth: TradesparkEmailAdmin dependency — email allowlist via TS_ADMIN_EMAILS env var (from TradesparkAdmins Modal Secret). Decoupled from Supabase profiles.role.
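
An email allowlist read from TS_ADMIN_EMAILS could be parsed roughly as follows. This is a sketch under assumptions: the comma-separated format and case-insensitive matching are guesses, and `load_admin_emails` / `is_admin` are hypothetical names, not the TradesparkEmailAdmin dependency itself.

```python
import os


def load_admin_emails(env_var: str = "TS_ADMIN_EMAILS") -> frozenset:
    """Parse the allowlist; the comma-separated format is an assumption."""
    raw = os.environ.get(env_var, "")
    return frozenset(e.strip().lower() for e in raw.split(",") if e.strip())


def is_admin(email: str, allowlist: frozenset) -> bool:
    """Case-insensitive membership check; an empty allowlist admits nobody."""
    return email.strip().lower() in allowlist
```

Keeping the check independent of any profile table matches the stated goal of decoupling admin access from Supabase profiles.role.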

~~Billing Service~~ (archived)

Deprecated May 21 — Stripe integration moving elsewhere. Code archived at dev/_archived/billing/.

Data Service (data/)

| Endpoint | Method | Description |
| --- | --- | --- |
| /v1/assets | GET | List assets (supports projectId=all with deduplication) |
| /v1/asset | GET/POST/DELETE | CRUD for individual assets |
| /v1/asset/chunked/init | POST | Initialize chunked upload |
| /v1/asset/chunked/upload | POST | Upload chunk (set final=true to complete) |
| /v1/sync-project | POST | Pull Supabase project data into location markdown |

Storage: SQLite per-location (S3-backed, writes protected by ETag CAS + per-key locking via with_location_db), S3 blobs for files. Paths follow account/geo_prefix/location/ convention.
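
The combination of ETag compare-and-swap and per-key locking can be sketched against an in-memory stand-in for the object store. Nothing here reproduces with_location_db: `FakeObjectStore`, `update_with_cas`, and the retry count are hypothetical, and the ETag is simply a SHA-256 of the body for illustration.

```python
import hashlib
import threading
from typing import Callable, Dict, Optional, Tuple


class PreconditionFailed(Exception):
    """The stored ETag no longer matches the one the writer read."""


class FakeObjectStore:
    """In-memory stand-in for an object store with conditional writes."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        self._guard = threading.Lock()

    @staticmethod
    def _etag(body: bytes) -> str:
        return hashlib.sha256(body).hexdigest()

    def get(self, key: str) -> Tuple[bytes, Optional[str]]:
        with self._guard:
            body = self._objects.get(key)
            return (b"", None) if body is None else (body, self._etag(body))

    def put_if_match(self, key: str, body: bytes, expected: Optional[str]) -> str:
        with self._guard:
            current = self._objects.get(key)
            current_etag = None if current is None else self._etag(current)
            if current_etag != expected:
                raise PreconditionFailed(key)
            self._objects[key] = body
            return self._etag(body)


_registry_guard = threading.Lock()
_key_locks: Dict[str, threading.Lock] = {}


def _lock_for(key: str) -> threading.Lock:
    with _registry_guard:
        return _key_locks.setdefault(key, threading.Lock())


def update_with_cas(store: FakeObjectStore, key: str,
                    mutate: Callable[[bytes], bytes], retries: int = 3) -> str:
    """Read-modify-write under a local per-key lock, guarded by ETag CAS."""
    with _lock_for(key):
        for _ in range(retries):
            body, etag = store.get(key)
            try:
                return store.put_if_match(key, mutate(body), etag)
            except PreconditionFailed:
                continue  # another writer won the race; re-read and retry
    raise PreconditionFailed(f"gave up on {key} after {retries} attempts")
```

The local lock only serializes writers inside one process; the conditional write is what rejects a stale update from a different container.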

Voices Service (voices/)

| Endpoint | Method | Description |
| --- | --- | --- |
| /v1/tts | POST | Text-to-speech (streaming audio) |

Providers: ElevenLabs, OpenAI. Voice selection via ranked voices.json config.
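
Ranked voice selection might look like the sketch below. The real voices.json schema is not shown in this document, so the `id` / `provider` / `rank` fields and the `pick_voice` helper are assumptions made purely for illustration.

```python
import json
from typing import Iterable, Optional

# Hypothetical voices.json content; lower rank means more preferred.
SAMPLE_VOICES_JSON = """
[
  {"id": "voice-b", "provider": "openai", "rank": 2},
  {"id": "voice-a", "provider": "elevenlabs", "rank": 1},
  {"id": "voice-c", "provider": "elevenlabs", "rank": 3}
]
"""


def pick_voice(voices: Iterable[dict], available: set) -> Optional[dict]:
    """Return the best-ranked voice whose provider is currently available."""
    candidates = [v for v in voices if v["provider"] in available]
    return min(candidates, key=lambda v: v["rank"], default=None)
```

With both providers available this picks `voice-a`; if only OpenAI is reachable it falls through to `voice-b`.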

Geocoding Service (geocoding/)

| Endpoint | Method | Description |
| --- | --- | --- |
| /v1/reverse-geocode | GET | GPS coordinates to address |
| /v1/map-image | GET | Static map image |

Providers: Mapbox (primary), Google Maps (fallback).
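
A primary/fallback provider chain can be expressed as an ordered list of callables. The sketch below is provider-agnostic and makes no real Mapbox or Google Maps calls; `reverse_geocode_with_fallback` and the `Geocoder` signature are hypothetical.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical provider signature: (lat, lon) -> formatted address.
Geocoder = Callable[[float, float], str]


class AllProvidersFailed(Exception):
    """Every provider in the chain raised."""


def reverse_geocode_with_fallback(
    lat: float, lon: float, providers: Sequence[Tuple[str, Geocoder]]
) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, address) from the first success."""
    errors = []
    for name, lookup in providers:
        try:
            return name, lookup(lat, lon)
        except Exception as exc:  # sketch only: a real service would narrow this
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Returning the provider name alongside the result makes it easy to log how often the fallback is actually used.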

Capture Service (capture/)

AR session metadata collection (color images, depth maps, device metadata).

LiveKit Service (livekit_ts/)

WebRTC room management and agent worker for real-time video sessions. Handles room creation, token generation, and the server-side agent that participates in sessions (LiveKit SDK 1.5.6).

Recording: The Go-based livekit-recorder captures RGB and depth video tracks, encrypts them at rest using AES-CBC (account DEK passed via DEK_BASE64 env var), and writes directly to the S3-mounted path. Recording assets are registered in the location's metadata DB.

3D Reconstruction: Session recordings feed into the splatting pipeline (queues/session_to_splat.py) which routes to either the standard or video-3D-reconstruction variant. The latter (session_to_splat.video_3d_reconstruction.py) decrypts recordings → extracts frames via ffmpeg → runs COLMAP SfM (with checkpoint-based preemption tolerance) → produces Gaussian splats via fastgs.
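
Preemption tolerance through checkpoints can be illustrated at the stage level. This is a simplified sketch: the actual pipeline may checkpoint inside the COLMAP step rather than between stages, and `run_with_checkpoints` plus the JSON checkpoint format are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, List, Sequence, Tuple

# Hypothetical stage description: (name, zero-argument callable).
Stage = Tuple[str, Callable[[], None]]


def run_with_checkpoints(stages: Sequence[Stage], checkpoint: Path) -> List[str]:
    """Run stages in order, skipping any already recorded as done.

    Returns the names of the stages executed during this call.
    """
    done = json.loads(checkpoint.read_text()) if checkpoint.exists() else []
    executed = []
    for name, fn in stages:
        if name in done:
            continue
        fn()
        done.append(name)
        # Persist after every stage so a preempted worker resumes from here.
        checkpoint.write_text(json.dumps(done))
        executed.append(name)
    return executed
```

If the worker is killed during a stage, the next invocation with the same checkpoint file repeats only that stage and the ones after it.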

ML Gateway (ml/)

| Endpoint | Method | Description |
| --- | --- | --- |
| /health | GET | Liveness check |
| /ml/generate | POST | Single-shot LLM completion (Gemini) |
| /ml/generate-plan | POST | Question-answering + plan generation flow |
| /ml/chat/stream | POST | SSE streaming chat (Gemini + Ollama multiplexed). Supports optional messages_by_provider for per-provider conversation threads |
| /ml/data/inspect | GET | Browse captured training data |
| /ml/data/stats | GET | Training data statistics |
| /ml/training/start | POST | Trigger model training (stub) |
| /ml/training/status | GET | Training job status (stub) |
| /ml/serving/models | GET | List deployed model versions (stub) |
| /ml/serving/promote | POST | Promote model version (stub) |
| /ml/eval/run | POST | Non-blocking multi-model eval: spawns one worker per model, returns {run_id, jobs} |
| /ml/eval/jobs/{job_id} | GET | Eval job status + per-record progress (with zombie probe) |
| /ml/eval/runs | GET | List recent eval runs |
| /ml/eval/runs/{run_id} | GET | Eval run detail with per-model jobs |
| /ml/eval/golden-set | POST | Create golden set from raw eval data |
| /comparison | GET | Side-by-side streaming comparison across all 6 models |
| /dashboard | GET | Web UI for ML pipeline operations |
| /ml/serve/{slot}/warmup | POST | Spawn non-blocking vLLM container warmup |
| /ml/serve/warmup/{call_id} | GET | Poll warmup status (starting/ready/error) |
| /ml/serve/{slot}/status | GET | Live runner count + backlog for a vLLM class |
| /ml/mobile/ingest-litert | POST | Admin: download base LiteRT bundle from HF → S3 |
| /ml/mobile/convert-litert | POST | Admin: convert finetuned weights to LiteRT (returns 501; awaiting litert-torch E-series support) |
| /ml/mobile/manifest/{slot} | GET | iOS client: version, size, sha256, presigned S3 URL |
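
The size and sha256 fields returned by /ml/mobile/manifest/{slot} let a client confirm that a downloaded bundle is intact. The production client is iOS; the Python sketch below only illustrates the check, and the `size` / `sha256` key names and the `verify_bundle` helper are assumptions.

```python
import hashlib
from pathlib import Path


def verify_bundle(path: Path, manifest: dict, chunk_size: int = 1 << 20) -> bool:
    """Check a downloaded file against the manifest's size and sha256."""
    # Cheap size check first, so a truncated download is rejected without hashing.
    if path.stat().st_size != manifest["size"]:
        return False
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Hash in chunks so large model bundles are never fully loaded in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == manifest["sha256"].lower()
```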

Data capture: All LLM requests/responses logged to S3 JSONL for training.
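
Logging request/response pairs as JSONL means writing one JSON object per line. The sketch below writes to any text stream rather than S3, and the record fields (`ts`, `provider`, `request`, `response`) are hypothetical, not the platform's capture schema.

```python
import io
import json
from datetime import datetime, timezone
from typing import List, TextIO


def append_capture(sink: TextIO, provider: str, request: dict, response: dict) -> None:
    """Append one request/response pair as a single JSONL line."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "provider": provider,
        "request": request,
        "response": response,
    }
    # json.dumps escapes embedded newlines, so each record stays on one line.
    sink.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_captures(source: TextIO) -> List[dict]:
    """Parse a JSONL stream back into records, ignoring blank lines."""
    return [json.loads(line) for line in source if line.strip()]
```

Because each line is independently parseable, a training job can stream the file without loading it whole.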

LLM providers: Gemini (primary), Claude, OpenAI (teacher models for comparison/distillation), on-prem Ollama (via model proxy), vLLM (Modal @modal.cls() + .remote.aio() RPC — live for all 3 Gemma 4 slots).

ML Sub-modules

  • gateway/: LLM routing (Gemini, Ollama), streaming provider adapters
  • contracts/: v0 data contracts — capture-bundle, scene-facts, data-pile. JSON Schema (draft 2020-12), ARKit world frame. Shared loader (loader.py) for schema validation and markdown front-matter parsing. Validator script checks all fixtures
  • perception/: Track 2 — video/capture → structured scene_facts. SAM2 video segmentation (Modal app, per-frame masks with track IDs), Cosmos 3 video-language model (NVIDIA NIM Modal app, room/surface captions), merge step combining both into validated scene_facts. Hallucination judge (LLM cross-check of captions vs frames). Eval scripts produce artifact reports
  • reasoning/: Track 3 — grounded Q&A over progressive-markdown knowledge bases. Loads data-pile docs, validates front-matter, answers questions via LLM. A/B eval compares grounded vs video-only reasoning. Fixture pile for local testing
  • data_pipeline/: Supabase scraper (nightly), synthetic data generator, format converters
  • training/: Model registry, LoRA/QLoRA configs, Unsloth trainer stubs
  • serving/: vLLM inference (live for E2B/E4B/31B via .remote.aio())
  • mobile/: LiteRT on-device model distribution (HF → Volume → S3 → iOS manifest)
  • eval/: Non-blocking evaluation framework — golden-set management, multi-model comparison, per-record metrics + judge scores, zombie probe on running workers
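
The multiplexed SSE stream served by /ml/chat/stream (built on the gateway/ streaming adapters) can be approximated by merging several async token streams into one sequence of Server-Sent Events frames. This is only a sketch: `fake_provider` stands in for a real adapter, and the event names and payload shape are assumptions.

```python
import asyncio
import json
from typing import AsyncIterator, Dict, List


async def fake_provider(tokens: List[str], delay: float) -> AsyncIterator[str]:
    """Stand-in for a streaming provider adapter."""
    for tok in tokens:
        await asyncio.sleep(delay)
        yield tok


def sse_event(event: str, data: dict) -> str:
    """Format one SSE frame: event line, data line, blank-line terminator."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


async def multiplex(streams: Dict[str, AsyncIterator[str]]) -> AsyncIterator[str]:
    """Interleave tokens from all providers as they arrive, tagged by provider."""
    queue: asyncio.Queue = asyncio.Queue()

    async def pump(name: str, stream: AsyncIterator[str]) -> None:
        try:
            async for tok in stream:
                await queue.put(sse_event("token", {"provider": name, "text": tok}))
        finally:
            await queue.put(None)  # sentinel: this provider is finished

    tasks = [asyncio.create_task(pump(n, s)) for n, s in streams.items()]
    remaining = len(tasks)
    while remaining:
        frame = await queue.get()
        if frame is None:
            remaining -= 1
        else:
            yield frame
    await asyncio.gather(*tasks)  # surface any provider exception
    yield sse_event("done", {})
```

Tagging every frame with its provider lets a client keep separate per-provider threads while reading a single connection.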

Static Site (static_site/)

Centralized demo hub, blog, docs viewer, and deploy dashboard at static-{env}.grizzlebear.io. Consolidates 8 demo pages under a shared TradeSpark-themed shell with the canonical design system tokens.

| Endpoint | Method | Description |
| --- | --- | --- |
| / | GET | Landing page with links to all demos, docs, and blog |
| /demos/ml-comparison | GET | Side-by-side model comparison UI |
| /demos/ml-generation | GET | Synthetic project generation pipeline (formerly ml-dashboard; old slug 301-redirects) |
| /demos/ml-eval | GET | Multi-model eval dashboard (golden-set picker, comparison table, record diffs) |
| /demos/ml-training | GET | Training job monitor with progress bars |
| /demos/traction | GET | Admin analytics: user/project signal from Supabase (email-allowlist gated) |
| /demos/mobile-session | GET | Mobile session simulator (project/property dropdowns from tsweb API) |
| /demos/model-proxy | GET | On-prem model proxy test page |
| /demos/websocket-test | GET | WebSocket session test client |
| /docs | GET | Docs index (architecture, services, changelogs, IMPROVE.md) |
| /docs/{path} | GET | Individual doc page (runtime markdown rendering, login-required) |
| /login | GET | Login page |
| /posts | GET | Blog post index |
| /posts/{slug} | GET | Individual blog post (Markdown with YAML frontmatter) |

Shared assets: CSS design tokens (TradeSpark theme), JS utility layer (Site.apiFetch with auto-auth + 401 refresh, Site.serviceUrl for per-env URL resolution).

Docs viewer (docs.py): Renders docs/ corpus and IMPROVE.md at runtime via markdown-it-py. Login-required (not in publicPrefixes). Path-traversal protection via _safe_doc_path(). FastAPI's built-in Swagger UI is disabled to free the /docs route.
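
Path-traversal protection usually amounts to resolving the requested path and confirming it still sits under the docs root. The sketch below shows that idea; it is not the real _safe_doc_path(), and `resolve_doc_path` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional


def resolve_doc_path(root: Path, requested: str) -> Optional[Path]:
    """Return the resolved path if it stays strictly inside root, else None."""
    root = root.resolve()
    # Joining with an absolute `requested` discards root, which the check below catches.
    candidate = (root / requested).resolve()
    # After collapsing ".." and symlinks, root must be an ancestor of the target.
    if root not in candidate.parents:
        return None
    return candidate
```

Requests such as `../secrets.md` or `/etc/passwd` resolve outside the root and are rejected before any file is opened.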

Deploy Dashboard (deploy_dashboard.py)

Pipeline visualization at /deploy-dashboard/ showing all environments and their promotion flow.

| Endpoint | Method | Description |
| --- | --- | --- |
| /deploy-dashboard/ | GET | Dashboard HTML shell (Cache-Control: no-store) |
| /deploy-dashboard/api/envs | GET | All env states, branch tips, test results (parallelized) |
| /deploy-dashboard/api/promote | POST | Trigger promotion between environments |
| /deploy-dashboard/api/test | POST | Trigger test run on an environment |
| /deploy-dashboard/api/approve-prod | POST | Approve and deploy to production (gated) |

Features: env cards with status chips (SHA, test pass/fail, commit delta), pipeline connectors, action toasts with per-stage elapsed timers, live test progress (X/Y folders), adaptive polling, skeleton first-paint. Approve-prod button gated: only enabled when beta tests pass on current origin/beta tip.

Backend: Queries ci/_git_in_modal for branch tips (with 60s TTL cache), Modal AppList RPC for env states, and Bruno test results. All data sources fetched in parallel.
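
A 60-second TTL cache in front of the branch-tip lookup can be sketched as follows. The `TTLCache` class and its injectable clock are hypothetical and are not taken from ci/_git_in_modal.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache values per key and refetch once an entry is older than the TTL."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests do not need to sleep
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = fetch()
        self._entries[key] = (now, value)
        return value
```

A dashboard that polls every few seconds would then hit the upstream Git API at most once per minute per branch.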

tsweb (tsweb/)

Quarantine boundary for all tsweb-app Supabase integrations at tsweb-{env}.grizzlebear.io. Isolates project/property/task reads so the rest of the codebase stays clean — when tsweb-app is retired, cleanup is a single-directory delete.

| Endpoint | Method | Description |
| --- | --- | --- |
| / | GET | Service health (temporary status JSON) |
| /projects | GET | User-scoped projects with joined property data (bearer auth) |

Nightly cron: tsweb.scheduled runs the Supabase scraper on schedule (migrated from ml_endpoint).

Internal consumers: data/sync.py (project sync), users/auth.py (location resolution), static_site mobile-session demo (dropdowns).

Dependency management: tsweb is imported lazily at call sites (not top-level) and mounted only on the users and data functions via Modal's Image.add_local_python_source("tsweb"). Top-level imports would crash all services that import AuthContext but don't mount tsweb.
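
The lazy-import pattern moves the import statement from module top level into the function that needs it, so modules without the package mounted can still be imported. A minimal sketch, with `load_optional` as a hypothetical helper:

```python
import importlib
from types import ModuleType


def load_optional(module_name: str) -> ModuleType:
    """Import a module at call time so that importing this file never fails."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        raise RuntimeError(
            f"{module_name!r} is not importable in this function's image"
        ) from exc


# A call site would do `tsweb = load_optional("tsweb")` inside the handler body,
# so only code paths that actually need tsweb pay for (or fail on) the import.
```

A service that imports AuthContext but never calls the tsweb-backed code path would then start cleanly even without the package mounted.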

Model Proxy (model_proxy/)

Reverse proxy to on-prem model servers (vLLM/Ollama on Mac Mini).

| Domain | Flow |
| --- | --- |
| model-{env}.grizzlebear.io | Client -> Modal (static egress IP) -> UDM firewall -> Mac Mini |

Auth: JWT bearer token validation before forwarding. Static IP allowlisted on UDM.

CI/CD Pipeline (ci/)

Modal-driven promotion DAG running in the cicd environment. Replaces the previous GitLab CI shell scripts.

Webhook Dispatcher (webhook.py)

| Endpoint | Method | Description |
| --- | --- | --- |
| /promote-to-dev | POST | Merge feature branch → dev, deploy, test |
| /dev-test | POST | Deploy dev + run tests (self-heals on uncertain state) |
| /promote-dev-to-beta | POST | Merge dev → beta, deploy, test |
| /approve-prod | POST | Deploy to production (requires X-Approve-Token) |
| /action-status/{id} | GET | Poll in-flight action progress |
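
A header-token gate such as the X-Approve-Token requirement on /approve-prod is typically checked with a constant-time comparison. The sketch below assumes a single shared secret; how the real dispatcher stores or validates the token is not described here, and `approve_token_ok` is a hypothetical helper.

```python
import hmac
from typing import Mapping


def approve_token_ok(headers: Mapping[str, str], expected: str) -> bool:
    """Constant-time comparison of the X-Approve-Token header."""
    if not expected:
        return False  # refuse everything if no secret is configured
    # HTTP header names are case-insensitive, so match on the lowered name.
    supplied = next(
        (v for k, v in headers.items() if k.lower() == "x-approve-token"), ""
    )
    return hmac.compare_digest(supplied.encode(), expected.encode())
```

Using `hmac.compare_digest` avoids leaking how many leading characters matched through response timing.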

Supporting Modules

| Module | Purpose |
| --- | --- |
| deploy_in_modal.py | Remote modal deploy execution (slim image, no ML deps) |
| bruno_in_modal.py | Remote Bruno test runner: parallel tier-aware, live progress, Modal Secret creds |
| _git_in_modal.py | Git operations (merge, branch tips with commits, env states via AppList RPC, TTL-cached GitLab calls) |
| scheduled_cleanup.py | Cron: non-prod app cleanup + daily main canary test (14:00 UTC) |

Headless Claude Runner (automation/)

Separate Modal app (grizzlebear-claude-runner) that runs the /improve and /document skills headless via the Claude Code CLI. Isolated from the main grizzlebear-api blast radius (its own image + secrets).

| Schedule | Skill | Scope |
| --- | --- | --- |
| Mon 03:00 UTC | /improve | IMPROVE.md only |
| Mon 04:00 UTC | /document | docs/ only |

Safety guards (hardening.py): per-skill allowed-path enforcement, pre-push hook blocking force/delete pushes and non-dev branch pushes, env scrubbing, gitleaks scanning, commit/diff validation.
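
Per-skill allowed-path enforcement can be sketched as a filter over the files a run has changed. The scopes below mirror the schedule table above, but the rule format and the `disallowed_changes` helper are hypothetical and do not reproduce hardening.py.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, Tuple

# Assumed rule format: a trailing "/" means "anything under this directory",
# otherwise the rule must match the file path exactly.
ALLOWED_PATHS: Dict[str, Tuple[str, ...]] = {
    "improve": ("IMPROVE.md",),
    "document": ("docs/",),
}


def disallowed_changes(skill: str, changed: Iterable[str]) -> List[str]:
    """Return the changed paths that fall outside the skill's allowed scope."""
    allowed = ALLOWED_PATHS[skill]
    bad = []
    for raw in changed:
        path = PurePosixPath(raw)
        # Reject absolute paths and any ".." component outright.
        if path.is_absolute() or ".." in path.parts:
            bad.append(raw)
            continue
        normalized = path.as_posix()
        ok = any(
            normalized.startswith(rule) if rule.endswith("/") else normalized == rule
            for rule in allowed
        )
        if not ok:
            bad.append(raw)
    return bad
```

A runner could refuse to commit whenever this returns a non-empty list, which keeps each skill confined to its own part of the repository.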

Commits use reserved prefixes: IMPROVE: and DOCUMENT: . Secrets sourced from environment_name="main".

Docker Base Images

Heavy base images are pre-baked to AWS ECR to avoid repeated pip installs during deploys:

| Base | Contents | Built by |
| --- | --- | --- |
| dockerfile.base | Core Python deps | build_ecr_base_on_ec2.py |
| dockerfile.ai.base | torch, transformers, CUDA | build_ecr_base_on_ec2.py |
| dockerfile.livekit_server.base | LiveKit SDK + deps | build_ecr_base_on_ec2.py |
| dockerfile.livekit_agent.base | Go compiler + ML deps | build_ecr_base_on_ec2.py |
| dockerfile.ml_training.base | Unsloth, training deps | build_ecr_base_on_ec2.py |
| dockerfile.session_to_splat.base | COLMAP, fastgs, ffmpeg | build_ecr_base_on_ec2.py |

Automation Runner (dev/automation/)

Separate Modal app (grizzlebear-claude-runner) — not part of app.py — that runs the /improve and /document Claude Code skills headless. Isolated image (Node + Claude CLI + gitleaks) and secrets (sourced from env=main) keep it off the grizzlebear-api blast radius.

| Entrypoint | Description |
| --- | --- |
| run_improve | Refresh IMPROVE.md; commits IMPROVE: … to origin/dev |
| run_document | Refresh /docs/; commits DOCUMENT: … to origin/dev |

Secrets (reused, from env=main): TradeSpark (AWS + ANTHROPIC_API_KEY), GitlabPushToken (GITLAB_TOKEN), ClaudeCodeOAuth (headless CLI auth).

Hardening (hardening.py): per-skill allowed-path enforcement, git pre-push hook (no force/delete, dev-only), env scrubbing, gitleaks scan. Runs are manual-only for now (uv run modal run -e jh automation/claude_runner.py::run_improve [--no-push]); the cron schedule is commented out in place.