Egocentric Action Baselines
Start with what the wearer is doing, move into how the camera sees 3D space, then build a structured memory of objects and interactions. Each project has its own live tutorial, tests, visual artifacts, and reproducible commands.
The projects are designed to be read in order. Each one turns the same sample episode into a different technical view of egocentric video.
Compare RGB, hand-joint, and fusion baselines. Learn temporal windows, feature vectors, softmax classification, ablations, and F1 metrics.
Prepare frames, calibration, SLAM poses, COLMAP commands, NeRF templates, and 3D Gaussian Splatting templates.
Convert annotations into objects, relations, timestamps, provenance, and queryable world-state memory.
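To make the action-baseline vocabulary concrete, here is a minimal sketch of temporal windows, feature vectors, softmax, and macro-averaged F1 in plain Python. It is not taken from the project code. Every function name is hypothetical, mean pooling is only one possible way to build a window feature, and fusion by concatenating the RGB and hand-joint vectors is an assumption about how such a baseline could work.

```python
import math


def sliding_windows(frames, size, stride):
    """Split a per-frame sequence into fixed-length windows (hypothetical helper)."""
    return [frames[i:i + size] for i in range(0, len(frames) - size + 1, stride)]


def mean_feature(window):
    """Average the per-frame vectors of one window into a single feature vector."""
    dim = len(window[0])
    return [sum(frame[d] for frame in window) / len(window) for d in range(dim)]


def fuse(rgb_feature, hand_feature):
    """Assumed fusion scheme: concatenate the two modality vectors."""
    return list(rgb_feature) + list(hand_feature)


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data: four frames with 2-D features, cut into two non-overlapping windows.
frames = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
windows = sliding_windows(frames, size=2, stride=2)
features = [mean_feature(w) for w in windows]
fused = fuse(features[0], [0.5])  # pretend [0.5] is a hand-joint feature
probs = softmax([2.0, 1.0, 0.1])
score = macro_f1(["a", "b", "a"], ["a", "b", "b"], labels=["a", "b"])
```

An ablation in this setting would mean scoring the same windows with the RGB vector alone, the hand-joint vector alone, and the fused vector, then comparing the F1 values.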
Classify actions from first-person video and hand motion.
Relate frames, calibration, camera poses, and reconstruction tools.
Represent objects and interactions as a temporal graph.
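As an illustration of the temporal-graph idea, the sketch below stores relations between objects with start and end timestamps plus a provenance string, then answers "what was true at time t" queries. It is an assumption about how such a memory could be structured, not the project's actual schema. The class names, field names, and the example annotations are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    """One timed edge in the graph (hypothetical schema)."""
    subject: str
    predicate: str
    obj: str
    start: float  # seconds from the start of the episode
    end: float    # exclusive end time in seconds
    source: str   # provenance, e.g. the annotation the edge came from


@dataclass
class TemporalSceneGraph:
    objects: set = field(default_factory=set)
    relations: list = field(default_factory=list)

    def add(self, relation):
        """Register both endpoints as objects and store the timed edge."""
        self.objects.update({relation.subject, relation.obj})
        self.relations.append(relation)

    def state_at(self, t):
        """Return every relation that holds at time t."""
        return [r for r in self.relations if r.start <= t < r.end]

    def history(self, name):
        """Return all relations that involve an object, ordered by start time."""
        involved = [r for r in self.relations if name in (r.subject, r.obj)]
        return sorted(involved, key=lambda r: r.start)


# Made-up annotations for illustration only.
graph = TemporalSceneGraph()
graph.add(Relation("cup", "on", "table", 3.5, 9.0, "annotation:1"))
graph.add(Relation("hand", "holds", "cup", 1.0, 3.5, "annotation:0"))

held = graph.state_at(2.0)           # the hand is holding the cup
resting = graph.state_at(4.0)        # the cup is on the table
cup_history = graph.history("cup")   # both relations, earliest first
```

Keeping the provenance string on each edge lets a query result be traced back to the annotation that produced it.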
Install each project when you want the clean console commands shown in the tutorials.
git clone https://github.com/ChaoYue0307/egocentric-action-baselines
git clone https://github.com/ChaoYue0307/egocentric-3d-reconstruction-demo
git clone https://github.com/ChaoYue0307/scene-graph-from-egocentric-video