Backend Overview
Goal
The Hephaes backend is the local API and orchestration layer for the data pipeline.
It connects the desktop UI to the hephaes package for asset registration, authoring,
conversion execution, job tracking, and output cataloging.
Local-Only Design
The backend is designed for local use only. It runs on the user’s machine and
stores data in the package-owned hephaes workspace (workspace.db) plus local
filesystem directories for uploads, outputs, and logs.
There is no authentication, multi-tenancy, or cloud deployment model in the current open-source release.
What It Does
- Asset management: register, upload, scan directories, index `.bag` and `.mcap` files, and manage tags
- Conversion authoring: inspect asset topics, generate draft specs, preview sample rows, and persist reusable configs with revision history
- Conversion execution: run conversions via the `hephaes` library to produce TFRecord or Parquet output
- Job tracking: durable index and convert job records
- Output catalog: track output artifacts, surface manifest and artifact metadata, and serve file content directly
- Dashboard: aggregate summary metrics, trend views, and blocker counts for the local workflow
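As a rough illustration of the directory-scan step in asset management, the sketch below collects `.bag` and `.mcap` files under a root directory. The function name `scan_for_assets`, the case-insensitive suffix match, and the recursive default are assumptions made for this example; the backend's actual scan logic is not shown in this overview.

```python
from pathlib import Path

# Extensions named in the overview; matching them case-insensitively is an assumption.
ASSET_SUFFIXES = {".bag", ".mcap"}


def scan_for_assets(root: Path, recursive: bool = True) -> list[Path]:
    """Return candidate asset files under root, sorted for stable output.

    Hypothetical helper: it only finds files and does not register or index them.
    """
    pattern = "**/*" if recursive else "*"
    return sorted(
        path
        for path in root.glob(pattern)
        if path.is_file() and path.suffix.lower() in ASSET_SUFFIXES
    )
```

Calling `scan_for_assets(Path("recordings"))` would return the matching files in sorted order; registration and indexing would then happen through the `hephaes` workspace.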
What It Does Not Do Today
- replay or visualization APIs
- output-action APIs
- authentication, organizations, or cloud-hosted multi-user workflows
Architecture
Frontend (React + Vite / Tauri) <-> Backend (FastAPI) <-> hephaes (Workspace + conversion library)
↕
workspace.db + local filesystem

The backend is intentionally thin:
- `hephaes` owns workspace persistence, conversion semantics, spec validation, and data processing
- Backend owns HTTP contracts, local runtime configuration, background job submission, and response mapping
- Frontend owns presentation and guided user workflows
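The ownership split can be sketched as a service function that only delegates and maps. Everything named here (`WorkspaceAsset`, `AssetResponse`, `map_asset`, `list_assets`, and their fields) is hypothetical and stands in for the real workspace records and Pydantic schemas, which this overview does not show. Plain dataclasses are used so the example runs without FastAPI installed.

```python
from dataclasses import asdict, dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class WorkspaceAsset:
    """Hypothetical stand-in for a record owned by the hephaes workspace."""
    asset_id: str
    file_path: str
    size_bytes: int


@dataclass(frozen=True)
class AssetResponse:
    """Hypothetical response shape owned by the backend's HTTP contract."""
    id: str
    file_name: str
    size_bytes: int


def map_asset(record: WorkspaceAsset) -> AssetResponse:
    """Mapper layer: translate a workspace record into the API shape."""
    return AssetResponse(
        id=record.asset_id,
        file_name=PurePosixPath(record.file_path).name,
        size_bytes=record.size_bytes,
    )


def list_assets(records: list[WorkspaceAsset]) -> list[dict]:
    """Service layer: no data processing of its own, only delegation and mapping."""
    return [asdict(map_asset(record)) for record in records]
```

Keeping the mapping in one place means a change to the workspace record shape touches the mapper only, while the HTTP contract seen by the frontend stays stable.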
Stack
| Layer | Technology |
|---|---|
| Framework | FastAPI (Python 3.11+) |
| Server | Uvicorn (ASGI) |
| Persistence | hephaes workspace SQLite database (workspace.db) |
| Background jobs | In-process ThreadPoolExecutor wrapper |
| Conversion | hephaes (internal library) |
| Testing | pytest, httpx |
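To make the "in-process ThreadPoolExecutor wrapper" row concrete, here is a minimal sketch of what such a wrapper could look like. The class name `JobRunner`, its methods, and the status strings are assumptions for illustration and are not the backend's actual interface.

```python
import threading
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Optional


class JobRunner:
    """Hypothetical wrapper: submits work to a thread pool and tracks futures by job id."""

    def __init__(self, max_workers: int = 2) -> None:
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._futures: dict[str, Future] = {}
        self._lock = threading.Lock()

    def submit(self, fn: Callable[..., Any], *args: Any) -> str:
        """Queue fn(*args) and return an opaque job id."""
        job_id = uuid.uuid4().hex
        future = self._executor.submit(fn, *args)
        with self._lock:
            self._futures[job_id] = future
        return job_id

    def status(self, job_id: str) -> str:
        """Report 'running' (queued or executing), 'succeeded', or 'failed'."""
        with self._lock:
            future = self._futures[job_id]
        if not future.done():
            return "running"
        return "failed" if future.exception() is not None else "succeeded"

    def result(self, job_id: str, timeout: Optional[float] = None) -> Any:
        """Block until the job finishes and return its value (re-raises on failure)."""
        with self._lock:
            future = self._futures[job_id]
        return future.result(timeout=timeout)

    def shutdown(self) -> None:
        """Wait for queued jobs to finish, then stop the pool."""
        self._executor.shutdown(wait=True)
```

State held like this lives only in process memory, so a sketch of this kind would need to be paired with persisted job records for status to survive a restart.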
Configuration
The backend is configured via environment variables:
| Variable | Purpose | Default |
|---|---|---|
| `HEPHAES_BACKEND_DATA_DIR` | Root local data directory | backend/data/ (dev) or ~/.hephaes/backend/ (desktop) |
| `HEPHAES_WORKSPACE_ROOT` | Workspace storage path | <data_dir>/workspace/ |
| `HEPHAES_BACKEND_RAW_DATA_DIR` | Staged uploaded asset files | <data_dir>/raw/ |
| `HEPHAES_BACKEND_OUTPUTS_DIR` | Conversion output artifacts | <data_dir>/outputs/ |
| `HEPHAES_BACKEND_LOG_DIR` | Backend log files | <data_dir>/logs/ |
| `HEPHAES_DESKTOP_MODE` | Desktop sidecar mode toggle | 0 |
| `HEPHAES_BACKEND_APP_NAME` | FastAPI app title | Hephaes Backend |
| `HEPHAES_BACKEND_DEBUG` | Debug mode | 0 |
| `HEPHAES_BACKEND_CORS_ALLOW_ORIGIN_REGEX` | CORS origin allowlist | `https?://(localhost\|127\.0\.0\.1)(:\d+)?` |
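The way the directory defaults derive from the data directory, and what the default CORS pattern accepts, can be sketched as follows. This is not the contents of `app/config.py`: the `Settings` dataclass, `load_settings`, and `origin_allowed` are hypothetical names, and two behaviors are assumptions, namely that `HEPHAES_DESKTOP_MODE=1` selects the desktop default directory and that origins are matched against the whole string.

```python
import os
import re
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping

# Default pattern copied from the configuration table.
DEFAULT_CORS_REGEX = r"https?://(localhost|127\.0\.0\.1)(:\d+)?"


@dataclass(frozen=True)
class Settings:
    data_dir: Path
    workspace_root: Path
    raw_data_dir: Path
    outputs_dir: Path
    log_dir: Path
    cors_allow_origin_regex: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Resolve settings from environment variables, falling back to the documented defaults."""
    # Assumption: the desktop toggle decides which default data directory applies.
    desktop = env.get("HEPHAES_DESKTOP_MODE", "0") == "1"
    default_data = (
        Path.home() / ".hephaes" / "backend" if desktop else Path("backend") / "data"
    )
    data_dir = Path(env.get("HEPHAES_BACKEND_DATA_DIR", default_data))

    def subdir(variable: str, name: str) -> Path:
        # An explicit variable wins; otherwise the path hangs off data_dir.
        return Path(env.get(variable, data_dir / name))

    return Settings(
        data_dir=data_dir,
        workspace_root=subdir("HEPHAES_WORKSPACE_ROOT", "workspace"),
        raw_data_dir=subdir("HEPHAES_BACKEND_RAW_DATA_DIR", "raw"),
        outputs_dir=subdir("HEPHAES_BACKEND_OUTPUTS_DIR", "outputs"),
        log_dir=subdir("HEPHAES_BACKEND_LOG_DIR", "logs"),
        cors_allow_origin_regex=env.get(
            "HEPHAES_BACKEND_CORS_ALLOW_ORIGIN_REGEX", DEFAULT_CORS_REGEX
        ),
    )


def origin_allowed(origin: str, pattern: str = DEFAULT_CORS_REGEX) -> bool:
    """Return True when the whole origin string matches the allowlist pattern."""
    return re.fullmatch(pattern, origin) is not None
```

Under these assumptions, `origin_allowed("http://localhost:5173")` is true while `origin_allowed("http://localhost.evil.com")` is false, because the pattern permits only an optional port after the host.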
Code Organization
| Directory | Purpose |
|---|---|
| `app/api/` | HTTP route handlers |
| `app/services/` | Business logic and orchestration |
| `app/mappers/` | Workspace-to-API response mapping |
| `app/schemas/` | Request/response Pydantic models |
| `app/config.py` | Environment-based configuration |
| `app/main.py` | App creation, middleware, router registration, lifespan |
| `app/workspace_bootstrap.py` | Workspace bootstrap and resolution |
| `tests/` | API and sidecar test suite |
Future Direction
- Worker queue for long-running conversion jobs with retry and cancellation support
- Custom computation scripts as a conversion option
- Richer backend-side filtering, pagination, and output inspection metadata