First-person video · action · 3D · world memory

A guided path through egocentric vision.

Start with what the wearer is doing, move into how the camera sees 3D space, then build a structured memory of objects and interactions. Each project has its own live tutorial, tests, visual artifacts, and reproducible commands.


Three-part egocentric vision learning path (animated preview)

3 connected tutorials

CI-tested repos with GitHub Actions

Live interactive GitHub Pages

Learning Sequence

The projects are designed to be read in order. The same sample episode becomes three different technical views of egocentric video.

Video + hand joints → Camera poses + geometry → Objects + relations

Part 1

Egocentric Action Baselines

Compare RGB, hand-joint, and fusion baselines. Learn temporal windows, feature vectors, softmax classification, ablations, and F1 metrics.
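The ideas named above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from the repository: the function names, the toy feature vectors, and the choice of concatenation as the fusion step are all assumptions made for the example.

```python
import math
from typing import List, Sequence


def sliding_windows(n_frames: int, size: int, stride: int) -> List[range]:
    """Split a clip into fixed-size temporal windows of frame indices."""
    return [range(s, s + size) for s in range(0, n_frames - size + 1, stride)]


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw class scores into probabilities (shifted by max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], n_classes: int) -> float:
    """Average the per-class F1 scores, weighting every class equally."""
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


# Hypothetical per-window features: 3 RGB values and 2 hand-joint values.
rgb_feat = [0.2, 0.7, 0.1]
hand_feat = [0.9, 0.4]
fused = rgb_feat + hand_feat  # assumed fusion: simple concatenation

windows = sliding_windows(n_frames=10, size=4, stride=2)
probs = softmax([2.0, 1.0, 0.1])
score = macro_f1(y_true=[0, 1, 2, 1], y_pred=[0, 1, 1, 1], n_classes=3)
```

An ablation in this setting amounts to scoring the same windows with `rgb_feat` alone, `hand_feat` alone, and `fused`, then comparing the resulting F1 values.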


Part 2

Egocentric 3D Reconstruction

Prepare frames, calibration, SLAM poses, COLMAP commands, NeRF templates, and 3D Gaussian Splatting templates.
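Calibration and camera poses meet in the pinhole projection, which maps a 3D point to a pixel. The sketch below assumes a world-to-camera pose (rotation `R`, translation `t`) and made-up intrinsics. Pose conventions differ between tools, so check whether a given pose is world-to-camera or camera-to-world before reusing it. All names here are hypothetical and not taken from the repository.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def world_to_camera(R: Sequence[Sequence[float]], t: Vec3, p_world: Vec3) -> Vec3:
    """Apply a world-to-camera pose: p_cam = R @ p_world + t."""
    x, y, z = (
        sum(R[i][j] * p_world[j] for j in range(3)) + t[i] for i in range(3)
    )
    return (x, y, z)


def project(fx: float, fy: float, cx: float, cy: float, p_cam: Vec3) -> Tuple[float, float]:
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x, y, z = p_cam
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + cx, fy * y / z + cy)


# Assumed example values: identity pose and a 640x480-style intrinsic matrix.
R_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_zero = (0.0, 0.0, 0.0)

p_cam = world_to_camera(R_identity, t_zero, (0.5, -0.25, 2.0))
pixel = project(fx=500.0, fy=500.0, cx=320.0, cy=240.0, p_cam=p_cam)
# pixel == (445.0, 177.5)
```

This sketch ignores lens distortion, which real calibration data usually includes.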


Part 3

Scene Graph Memory

Convert annotations into objects, relations, timestamps, provenance, and queryable world-state memory.
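One way to picture such a memory is as a list of timestamped relations that can be filtered by time. The sketch below is an assumed data layout for illustration only: the `Relation` and `SceneMemory` names, the field names, and the sample annotations are hypothetical and may not match the repository's schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Relation:
    subject: str
    predicate: str
    obj: str
    t_start: float  # seconds, inclusive
    t_end: float    # seconds, exclusive
    source: str     # provenance: where this fact came from


class SceneMemory:
    """Stores relations and answers simple world-state queries."""

    def __init__(self) -> None:
        self._relations: List[Relation] = []

    def add(self, relation: Relation) -> None:
        self._relations.append(relation)

    def state_at(self, t: float) -> List[Relation]:
        """Return every relation that holds at time t."""
        return [r for r in self._relations if r.t_start <= t < r.t_end]

    def last_seen(self, name: str, t: float) -> Optional[Relation]:
        """Return the most recently started relation involving `name` up to time t."""
        hits = [
            r for r in self._relations
            if name in (r.subject, r.obj) and r.t_start <= t
        ]
        return max(hits, key=lambda r: r.t_start) if hits else None


memory = SceneMemory()
memory.add(Relation("hand", "holds", "cup", 1.0, 3.0, "annotation:frame_030"))
memory.add(Relation("cup", "on", "table", 3.0, 8.0, "annotation:frame_090"))

holding = memory.state_at(2.0)         # the hand is holding the cup
latest = memory.last_seen("cup", 10.0)  # the cup was last seen on the table
```

Keeping `source` on every relation means each answer can be traced back to the annotation that produced it.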


What Each Project Teaches

Understanding

Classify actions from first-person video and hand motion.

Geometry

Relate frames, calibration, camera poses, and reconstruction tools.

Memory

Represent objects and interactions as a temporal graph.

Local Setup Commands

Clone and install each project locally when you want to run the console commands shown in the tutorials.

git clone https://github.com/ChaoYue0307/egocentric-action-baselines
git clone https://github.com/ChaoYue0307/egocentric-3d-reconstruction-demo
git clone https://github.com/ChaoYue0307/scene-graph-from-egocentric-video