Neuro-symbolic AI (NeSy AI) seeks to bridge the gap between data-driven learning and knowledge-based reasoning, combining the adaptability of neural networks with the structure of symbolic systems. While deep learning models excel at perception and pattern recognition, they often lack transparency and compositional understanding. Symbolic reasoning, on the other hand, offers interpretability and rule-based consistency, but struggles with learning from raw data, handling ambiguity, and scaling. Integrating the two promises AI systems that can both learn from raw data and reason in a structured way, enabling more reliable decision-making, explainable inference, and generalization beyond narrow, task-specific settings.
At AIRLab, we are exploring NeSy AI techniques to enhance the video understanding capabilities of multimodal large language models (MLLMs). By leveraging graph-based representations that model objects, people, events, and their relationships, we aim to structure complex visual narratives into interpretable reasoning pathways. These symbolic abstractions can help MLLMs verify their own responses, maintain narrative coherence, and support richer, more trustworthy interactions with users, advancing both comprehension and explainability in multimodal AI systems.
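As a rough illustration of this graph-based verification idea (a minimal sketch, not AIRLab's actual pipeline), the Python snippet below represents a video segment as a set of (subject, relation, object) triples and checks whether claims extracted from an MLLM's answer are supported by the graph. All names and the toy data are hypothetical.

```python
from dataclasses import dataclass

# A scene-graph triple: (subject, relation, object). Hypothetical schema.
Triple = tuple[str, str, str]


@dataclass
class SceneGraph:
    """Toy symbolic abstraction of one video segment (illustrative only)."""
    triples: set[Triple]

    def supports(self, claim: Triple) -> bool:
        """Return True if the claim is directly present in the graph."""
        return claim in self.triples


def verify_answer(graph: SceneGraph, claims: list[Triple]) -> dict[Triple, bool]:
    """Check each claim against the scene graph.

    In a real system the claims would be parsed from the model's free-text
    answer; here they are given explicitly for illustration.
    """
    return {claim: graph.supports(claim) for claim in claims}


if __name__ == "__main__":
    # Hypothetical graph built from detected objects and relations in a clip.
    graph = SceneGraph(triples={
        ("person_1", "holds", "cup"),
        ("person_1", "sits_on", "sofa"),
        ("cup", "on", "table"),
    })

    # Claims that might be extracted from an MLLM's answer about the clip.
    claims = [
        ("person_1", "holds", "cup"),        # supported by the graph
        ("person_1", "drinks_from", "cup"),  # not in the graph -> flagged
    ]

    for claim, ok in verify_answer(graph, claims).items():
        print(f"{claim}: {'supported' if ok else 'unsupported'}")
```

In practice, the verification step would use richer graph matching (e.g., temporal relations or entailment over relation labels) rather than exact triple lookup, but the sketch conveys how a symbolic abstraction can ground and check an MLLM's narrative.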
Contacts:

