Publications
- What's next?
- CV4Animals Workshop, CVPR 2024 · 2024
AnimalMotionCLIP: Embedding Motion in CLIP for Animal Behavior Analysis
We extend CLIP for animal behavior recognition by interleaving video frames with optical flow, adding motion awareness to a model designed for static images. We compare multiple temporal aggregation strategies (dense, semi-dense, sparse) and achieve state-of-the-art results on the Animal Kingdom dataset.
- MDPI Sensors · 2023
Real-time monocular skeleton-based hand gesture recognition using 3D-Jointsformer
Automatic hand gesture recognition in video sequences has widespread applications, ranging from home automation to sign language interpretation and clinical operations. A hybrid approach combining 3D Convolutional Neural Networks (3D-CNNs) and Transformers is proposed: a 3D-CNN computes high-level semantic skeleton embeddings that capture local spatial and temporal characteristics, while a Transformer with self-attention efficiently captures long-range temporal dependencies. On the Briareo and Multimodal Hand Gesture datasets, the method achieves accuracies of 95.49% and 97.25%, respectively, while running in real time on a standard CPU.
Featured Projects
Side projects at the intersection of AI and the physical world.
- EgoAgent
Building a wearable egocentric AI agent on Raspberry Pi 5 with Hailo AI HAT 2 — from hardware assembly to real-time first-person vision-language interaction.
- Infrastructure
Local compute and networking stack supporting EgoAgent and other projects — DGX Spark, Tailscale mesh, and development tooling.
- NFC Personal Pin
3D-printed wearable pin with an embedded NFC chip. Tap it with your phone and it opens this website — a physical personal card for networking events.
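For the curious, here is a rough sketch of how a tap-to-open link like the one on the NFC pin is typically stored: as a single NDEF URI record, following the NFC Forum URI record layout. This is an illustration under assumptions, not the pin's actual configuration. The URL is a placeholder, and `encode_ndef_uri` is a hypothetical helper name.

```python
# Sketch: encode one short NDEF URI record (NFC Forum well-known type "U").
# Assumption: the URL below is a placeholder, not the real site address.

# Standard URI identifier codes that abbreviate common prefixes to one byte.
URI_PREFIXES = {
    "http://www.": 0x01,
    "https://www.": 0x02,
    "http://": 0x03,
    "https://": 0x04,
}


def encode_ndef_uri(url: str) -> bytes:
    """Return the bytes of a single-record NDEF message holding `url`."""
    code, rest = 0x00, url  # 0x00 means "no abbreviation"
    # Try the longest prefixes first so "https://www." wins over "https://".
    for prefix in sorted(URI_PREFIXES, key=len, reverse=True):
        if url.startswith(prefix):
            code, rest = URI_PREFIXES[prefix], url[len(prefix):]
            break

    payload = bytes([code]) + rest.encode("utf-8")
    if len(payload) > 255:
        raise ValueError("short-record form holds at most 255 payload bytes")

    # Header 0xD1: MB=1, ME=1 (only record), SR=1 (short record), TNF=0x01.
    header = 0xD1
    type_length = 0x01  # the type field is the single byte "U"
    return bytes([header, type_length, len(payload)]) + b"U" + payload


record = encode_ndef_uri("https://example.com")
print(record.hex(" "))
```

Writing the record to a physical tag adds container framing that depends on the tag type, which is not shown here. In practice a phone app or tag-writing tool usually handles both steps.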