PhD Student – Cristian Sbrolli
Cristian Sbrolli is a PhD student working on multimodal learning, with a particular focus on contrastive multimodal representation learning. He’s inspired by the human ability to seamlessly integrate information from multiple senses – vision, hearing, touch, and more – into robust and versatile internal representations of the world. His central research question is whether artificial intelligence can achieve something similar.
He investigates methods for training models on diverse data sources – images, text, 3D data, audio, and potentially others – in a way that encourages them to learn shared, abstract representations. Much like our own, these representations could then be leveraged for a wide range of downstream tasks, enabling machines to reason about and interact with the world in a more holistic, human-like manner. Ultimately, his goal is to contribute to general multimodal representations that transfer to real-world scenarios and tasks, leading to more adaptable and intelligent AI systems.
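As a rough illustration of the contrastive setup this line of research builds on (not Cristian’s specific method), the sketch below aligns embeddings from two modality encoders with a symmetric InfoNCE, CLIP-style loss; the embedding dimension, batch size, and temperature are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(emb_a, emb_b, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings from two
    modalities (e.g. image/text). Matching pairs share the same row index;
    every other row in the batch serves as a negative."""
    emb_a = F.normalize(emb_a, dim=-1)
    emb_b = F.normalize(emb_b, dim=-1)
    logits = emb_a @ emb_b.t() / temperature            # scaled cosine similarities
    targets = torch.arange(emb_a.size(0), device=emb_a.device)
    loss_a = F.cross_entropy(logits, targets)            # modality A -> B direction
    loss_b = F.cross_entropy(logits.t(), targets)        # modality B -> A direction
    return 0.5 * (loss_a + loss_b)

# Toy usage with random vectors standing in for image and text encoder outputs.
image_emb = torch.randn(8, 512)
text_emb = torch.randn(8, 512)
print(contrastive_loss(image_emb, text_emb))
```

Training with a loss of this kind pulls matched cross-modal pairs together in a shared embedding space while pushing mismatched pairs apart, which is one common way to obtain the shared representations described above.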