PARSE-Ego4D: Personal Action Recommendation Suggestions for Egocentric Videos

Steven Abreu; Tiffany D Do; Karan Ahuja; Eric J Gonzalez; Lee Payne; Daniel McDuff; Mar Gonzalez-Franco

doi:10.48550/arxiv.2407.09503

Preprint

Open access

ArXiv.org

14 Jun 2024

Files and links (1)

url

https://arxiv.org/abs/2407.09503

Preprint (Author's original), arXiv.org - Non-exclusive license to distribute, Open

Abstract

Computer Science - Computer Vision and Pattern Recognition

Computer Science - Human-Computer Interaction

Computer Science - Neural and Evolutionary Computing

Intelligent assistance involves not only understanding but also action. Existing egocentric video datasets contain rich annotations of the videos, but not of the actions that an intelligent assistant could perform in the moment. To address this gap, we release PARSE-Ego4D, a new set of personal action recommendation annotations for the Ego4D dataset. We take a multi-stage approach to generating and evaluating these annotations. First, we use a prompt-engineered large language model (LLM) to generate context-aware action suggestions, identifying over 18,000 of them. While these synthetic action suggestions are valuable, the inherent limitations of LLMs necessitate human evaluation. To ensure high-quality and user-centered recommendations, we conduct a large-scale human annotation study that grounds all of PARSE-Ego4D in human preferences. We analyze the inter-rater agreement and evaluate the subjective preferences of participants. Based on our synthetic dataset and complete human annotations, we propose several new tasks for action suggestions based on egocentric videos. We encourage novel solutions that reduce latency and energy requirements. The annotations in PARSE-Ego4D will support researchers and developers who are building action recommendation systems for augmented and virtual reality.

Details

Title: PARSE-Ego4D: Personal Action Recommendation Suggestions for Egocentric Videos
Creators: Steven Abreu; Tiffany D Do; Karan Ahuja; Eric J Gonzalez; Lee Payne; Daniel McDuff; Mar Gonzalez-Franco
Publication Details: ArXiv.org
Resource Type: Preprint
Language: English
Academic Unit: Computer Science
Other Identifier: 991021916804104721