Daniele Materia
Computer Science Master Student
DMI, University of Catania
Human-Object Interaction Anticipation in Egocentric Video
My current research investigates how Vision-Language Models can anticipate future human-object interactions from first-person videos by combining Set-of-Mark visual prompting with recent gaze trajectories.
Improving intent understanding and short-term object interaction anticipation is essential to make assistive systems more reliable in procedural scenarios, where timely and context-aware guidance directly impacts user support.
Research Directions
- 01 Egocentric perception for activity and intent understanding.
- 02 Multimodal alignment between vision, language and context.
- 03 Human-centered AI systems for real-world assistance.
Resume
I'm a Computer Science Master's student at the Department of Mathematics and Computer Science of the University of Catania and a member of the LIVE research group within the Image Processing Laboratory (IPLAB). My academic path is focused on AI and Computer Vision, with particular interest in Vision-Language Models for egocentric video understanding and in intelligent assistive systems designed to support users during complex procedural tasks. Beyond research and technology, I have a long-standing passion for Karate, which has been an important part of my life over the years.
Feel free to contact me through one of my emails or any of my social media profiles.
Education
Current
University of Catania
Master's Degree in Computer Science
Study path focused on AI and Computer Vision.
Relevant courses: Deep Learning, Advanced Robotics, Computer Vision, Multimedia, Principles of Parallel Computing and GPU Programming.
Jul 2025
University of Catania
Bachelor's Degree in Computer Science
Study path focused on Artificial Intelligence and Robotics.
Relevant courses: AI, Robotics, Social Media Data Analysis, Fundamentals of Data Analysis, ML.
Thesis title: Predicting Next-Active Objects with Vision Large Language Models in Egocentric Scenarios.
Final grade: 110/110 cum laude
Jul 2022
Archimede Technical Institute
High School Degree
Information and Communication Technologies.
Final grade: 100/100 with Honours
Experience
Apr 2026 - Current
University of Catania - LIVE@IPLAB
Research Scholar
Study and development of Multimodal Learning algorithms for procedural task assistance and mistake detection.
Feb 2025 - Jun 2025
Next Vision s.r.l.
Student Researcher - Internship
Conducted research and designed multimodal LLM-based Computer Vision algorithms, with a focus on Object Anticipation tasks in Egocentric Vision.
Milestones
1st Dan Black Belt in Shotokan Karate
Projects
Comparison of Classical and DL-based Watermarking Methods
Multimedia Project
Q-Learning-Based Robot Navigation in a Simulated Maze Environment
Robotic Systems Project
Hybrid Genetic Algorithm to solve the WFVS problem
AI Project
Danny: a multi-purpose modular Discord bot
Discord Bot
ChiostroVR
VR Experience
Publications
Leveraging Gaze and Set-of-Mark in VLLMs for Human-Object Interaction Anticipation from Egocentric Videos
Daniele Materia, Francesco Ragusa, Giovanni Maria Farinella