Hephaes Overview
Goal
The hephaes Python package turns ROS logs into standardized, ML-ready datasets with stable schemas across runs, robots, and recording variations.
It is the core data-conversion engine used by the backend and frontend, but it is also usable directly as a standalone Python library.
What The Package Solves
- Ingest ROS1 .bag and ROS2 .mcap logs
- Profile recordings to understand topics, rates, and time bounds
- Map variable source topic names into a canonical output schema
- Synchronize asynchronous streams onto a shared timeline
- Convert episodes into Parquet or TFRecord outputs
- Emit sidecar manifests with provenance and metadata
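To make the mapping and synchronization steps above concrete, here is a minimal standalone sketch. It does not use the hephaes API. The alias table, the function names (map_topics, make_timeline, sample_and_hold), and the sample-and-hold policy are all assumptions chosen for illustration; the package may resolve topics and align streams differently.

```python
from bisect import bisect_right

# Hypothetical alias table: canonical field -> source topic names that
# different robots might use for the same signal.
CANONICAL_TOPICS = {
    "camera_front": ["/camera/front/image_raw", "/front_cam/image"],
    "joint_state": ["/joint_states", "/robot/joint_states"],
}


def map_topics(source_topics):
    """Return {canonical_name: source_topic} for each alias present in the log."""
    available = set(source_topics)
    mapping = {}
    for canonical, aliases in CANONICAL_TOPICS.items():
        for alias in aliases:
            if alias in available:
                mapping[canonical] = alias
                break  # the first matching alias wins
    return mapping


def make_timeline(start_ns, stop_ns, step_ns):
    """Evenly spaced ticks from start_ns to stop_ns inclusive (integer nanoseconds)."""
    return list(range(start_ns, stop_ns + 1, step_ns))


def sample_and_hold(timestamps, values, timeline):
    """Resample one stream onto `timeline`.

    Each tick takes the latest value stamped at or before it. `timestamps`
    must be sorted ascending. Ticks before the first message yield None.
    """
    resampled = []
    for tick in timeline:
        i = bisect_right(timestamps, tick)
        resampled.append(values[i - 1] if i > 0 else None)
    return resampled


# Example: one log whose topic names differ from the canonical schema.
mapping = map_topics(["/front_cam/image", "/joint_states", "/tf"])
timeline = make_timeline(0, 100, 50)
aligned = sample_and_hold([10, 60], ["a", "b"], timeline)
```

Sample-and-hold is only one possible policy; nearest-neighbor matching or interpolation are common alternatives, and the right choice depends on the signal type.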
Design Principles
- Keep conversion semantics explicit and versioned
- Preserve source fidelity while producing practical ML features
- Support schema standardization across heterogeneous fleets
- Keep outputs deterministic and reproducible for training pipelines
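As an illustration of the versioning and determinism principles above, the following sketch builds a sidecar manifest whose serialized form depends only on its inputs. The field names, the CONVERSION_SPEC_VERSION constant, and both helper functions are hypothetical; the manifest layout that hephaes actually emits is not specified here.

```python
import hashlib
import json

# Hypothetical version tag for the conversion semantics.
CONVERSION_SPEC_VERSION = "1"


def build_manifest(source_name, source_bytes, topic_mapping, output_format):
    """Collect provenance for one converted log.

    No wall-clock timestamps or random IDs are included, so identical
    inputs always produce an identical manifest.
    """
    return {
        "spec_version": CONVERSION_SPEC_VERSION,
        "source": {
            "name": source_name,
            "sha256": hashlib.sha256(source_bytes).hexdigest(),
        },
        "topic_mapping": dict(topic_mapping),
        "output_format": output_format,
    }


def manifest_to_json(manifest):
    """Serialize with sorted keys so the output is byte-stable."""
    return json.dumps(manifest, sort_keys=True, indent=2)


# The same mapping supplied in a different insertion order serializes identically.
first = build_manifest(
    "run1.mcap", b"raw-bytes", {"a": "/x", "b": "/y"}, "parquet"
)
second = build_manifest(
    "run1.mcap", b"raw-bytes", {"b": "/y", "a": "/x"}, "parquet"
)
stable = manifest_to_json(first) == manifest_to_json(second)
```

Hashing the source bytes ties each output to the exact recording it came from, which makes it possible to detect when a log has been re-recorded or modified between runs.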
Where It Fits
ROS/MCAP logs -> hephaes (profile + map + convert) -> Parquet/TFRecord + manifest
In the full local stack:
Frontend (Next.js) <-> Backend (FastAPI) <-> hephaes (Python package)
Current Scope
- Python 3.11+
- Local filesystem inputs and outputs
- One output dataset file per input log
- Library-first interface (no separate CLI in this repo)
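The "one output dataset file per input log" rule can be pictured as a simple path plan. The sketch below is not part of hephaes: plan_outputs, the output file suffixes, and the ".manifest.json" naming are assumptions made for illustration. It works on path strings only and does not touch the filesystem.

```python
from pathlib import PurePosixPath

SUPPORTED_INPUTS = {".bag", ".mcap"}
# Hypothetical output suffixes for each dataset format.
OUTPUT_SUFFIX = {"parquet": ".parquet", "tfrecord": ".tfrecord"}


def plan_outputs(input_paths, output_dir, output_format="parquet"):
    """Map each supported input log to exactly one dataset path and one manifest path.

    Returns {input_path: (dataset_path, manifest_path)}. Inputs are processed
    in sorted order, and unsupported extensions are skipped.
    """
    suffix = OUTPUT_SUFFIX[output_format]
    out_dir = PurePosixPath(output_dir)
    plan = {}
    used_stems = set()
    for raw in sorted(str(p) for p in input_paths):
        src = PurePosixPath(raw)
        if src.suffix.lower() not in SUPPORTED_INPUTS:
            continue
        if src.stem in used_stems:
            # Two logs with the same stem would overwrite each other's output.
            raise ValueError(f"duplicate log name: {src.stem}")
        used_stems.add(src.stem)
        dataset = out_dir / (src.stem + suffix)
        manifest = out_dir / (src.stem + ".manifest.json")
        plan[str(src)] = (str(dataset), str(manifest))
    return plan


plan = plan_outputs(["logs/run2.mcap", "logs/run1.bag", "logs/notes.txt"], "out")
```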