Federico Espositi graduated in Automation Engineering from Politecnico di Milano, studying social robotics and artificial intelligence inspired by biological neural processes. Alongside his technical path, he has explored various creative and storytelling mediums, including graphic novels and videogames. Since 2010 he has studied theater acting, writing and directing, developing short and feature plays. He has worked as a theater pedagogist since 2017, addressing storytelling, acting, musical and immersive theater. Since 2021, he has been pursuing a PhD at Politecnico di Milano investigating nonverbal communication through new forms of technological expression, such as virtual reality, robotics and wearable devices, presented as interactive installations at Milan Digital Week 2022 and 2023, and hybridising phygital devices with performative arts. His artistic work is characterized by collaboration and an investigation of communicative potential across different media.
Reference for First Contact research line @AIRLAB. Read about the project here.
In “The Game”, since the creatures involved are more familiar than those in “The Room”, the main objective is to study whether the specific robot designs (shape and movement, but also control and sensory perception systems) allow human subjects to interact within the specific contexts given to them. For example, are the human subjects able to complete together an escape room that requires them to instantiate an emergent communication, exchanging specific types of information according to our game design? Do they feel empathy for the creatures they are interacting with?
Thus, the subjects’ motivations are quite structured during “The Game”, and the main research question is whether our expectations are met, with respect to the quality of our system and its ability to let the interaction unfold according to this structure; and, if this is not the case, to what extent and when the subjects deviate from this structure.
In this track we ask whether the human subjects are able to instantiate an emergent communication and exchange specific types of information. As this is necessary to complete the game, the number of victories quantifies the quality of the setup.
PROJECTS
Odile’s Escape Room
The focus of this phase is on the experience of the “Visitor”, the user playing with the physical robot in the real world, and on the design of the robot’s body and the expressive capability of its motions. The robot is controlled in a Wizard of Oz manner, so that the control system does not need to be precise or ready for testing with inexperienced users.
Project By: Erica Panelli
Supervision: Federico Espositi and Andrea Bonarini
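To make the Wizard-of-Oz setup concrete, here is a minimal sketch of what such a relay might look like: a hidden operator’s console input is mapped to velocity commands for the robot, so the robot appears autonomous to the Visitor. The Robot interface, the UDP protocol and the key bindings are illustrative assumptions, not the project’s actual code.

```python
# Hypothetical Wizard-of-Oz relay: operator keypresses become motion
# commands for the robot. The robot command channel is assumed UDP.
import socket


class Robot:
    """Stand-in for the real robot's command channel (assumed)."""

    def __init__(self, host: str = "192.168.1.10", port: int = 5005):
        self.addr = (host, port)
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def drive(self, linear: float, angular: float) -> None:
        # Encode a simple "v,w" velocity pair; the real protocol may differ.
        self.sock.sendto(f"{linear:.2f},{angular:.2f}".encode(), self.addr)


KEY_TO_COMMAND = {
    "w": (0.3, 0.0),   # forward
    "s": (-0.3, 0.0),  # backward
    "a": (0.0, 0.8),   # rotate left
    "d": (0.0, -0.8),  # rotate right
    " ": (0.0, 0.0),   # stop
}


def wizard_loop(robot: Robot) -> None:
    """Read operator input from the console and relay it to the robot."""
    while True:
        key = input("command [w/a/s/d/space, q to quit]: ").strip() or " "
        if key == "q":
            robot.drive(0.0, 0.0)
            break
        linear, angular = KEY_TO_COMMAND.get(key, (0.0, 0.0))
        robot.drive(linear, angular)


if __name__ == "__main__":
    wizard_loop(Robot())
```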
The focus of this phase is on the experience of the “Controller”, the user controlling the robot playing in the real world, and on the design of the sensory translation system and the expressive capability of its translations of the human.
Project By: Giuseppe Epifani (phase I); Maurizio Vetere (phase II)
Supervision: Federico Espositi and Andrea Bonarini
“The Room” places participants in a more demanding situation. Not only is the body of a room completely unfamiliar as a social being; this alternate environment also introduces a completely new pattern to the very nature of the interaction, in which communication needs to be rebuilt from the ground up.
Conceptually, the setup is the same as for “The Game”, with two humans interacting through a digital filter: a Controller subject embodying a highly non-anthropomorphic avatar, and a Visitor subject interacting with this avatar in the same shared space. The core of the Room project lies in the nature of the avatar: an enclosed space, designed to be an organic living being. In this context, it is not possible to start, as is done in “The Game”, from specific motivations and structures for the characters involved in the interaction, namely the Controller, embodying the Room, and the Visitor, who enters it.
In “The Game”, the two participants had clear motives, clear goals, a clear narrative. This could work because the context of interaction, even if the creature is new and strange, is still to some extent familiar and recognizable. With “The Room”, there is no clear narrative designed from the start. The main point itself is to understand how an actuated space could become a social being, believable and authentic. What could its motivations be? Its drives? The same holds for the Visitor entering the Room. Thus, the main objective of “The Room” is to build the Room as a character, to understand how it could be an authentic other.
The Room is so radically different as an avatar from those in existing research that embodiment theory and its results offered no directly applicable guidance. Instead, to start building a framework to guide the design of The Room, we made a thorough review of theories on how humans interact with spaces (the Room IS a space), and on spaces as interactive installations. As a result, an initial framework was designed covering the possible elements that could compose the Room, how the Room could “express itself”, and what types of interaction mechanisms with the Visitor could be designed for it.
Immediately after the framework was formulated, we designed a pilot study to test a subset of its elements, an experiment in the form of an interactive installation, displayed at the xCities expositions within Politecnico di Milano in Fall 2023.
PROJECTS
Pilot
Developing a Room prototype in VR, where prototyping is fast, alongside the creation of smaller physical modules as the first results become available. The different designs aim to cover all the possibilities identified in the framework and to single out the most promising ones. Sensory translation focuses on haptic mediums, coherently with the Room’s body being a distributed system. Two initial virtual rooms and prototypes for haptic feedback suits have already been developed. The output of this phase became an interactive installation, displayed at the xCities expositions within Politecnico di Milano in Fall 2023.
Concept and VR development: Federico Espositi
Haptic devices: Amor Madhkour
Supervision: Federico Espositi and Andrea Bonarini
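As an illustration of what a haptic sensory translation could look like for a distributed body, the sketch below maps a touch on the Room’s surface to vibration intensities on a suit worn by the Controller. The motor layout, the event format and the falloff rule are assumptions for illustration only, not the project’s actual design.

```python
# Illustrative haptic "sensory translation": a contact event on the
# virtual Room's surface becomes per-motor vibration intensities on
# a suit, so the Controller feels *where* the Visitor touched.
import math

# Hypothetical motor positions on the suit, expressed in the Room's
# normalized surface coordinates (u, v in [0, 1]).
MOTORS = {
    "chest": (0.5, 0.6),
    "back": (0.5, 0.2),
    "left_arm": (0.2, 0.5),
    "right_arm": (0.8, 0.5),
}


def translate_touch(u: float, v: float, pressure: float) -> dict[str, float]:
    """Map one touch on the Room's wall to per-motor intensities.

    Intensity falls off with distance from the touch point, preserving
    the touch's location in the Controller's haptic percept.
    """
    intensities = {}
    for motor, (mu, mv) in MOTORS.items():
        distance = math.hypot(u - mu, v - mv)
        falloff = max(0.0, 1.0 - 2.0 * distance)  # zero beyond half the surface
        intensities[motor] = round(pressure * falloff, 3)
    return intensities


# Example: the Visitor presses firmly near the upper-middle of a wall;
# the chest motor responds strongest (~0.81), the back weakest (~0.27).
print(translate_touch(u=0.5, v=0.55, pressure=0.9))
```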
The focus of this phase is on the experience of the “Visitor”, the user entering the Room and interacting with it, and on the design of the Room’s body and the expressive capability of its motions. The first complete physical mechatronic room is developed as a modular system. The robot is controlled in a Wizard of Oz manner, testing specific behaviours to try to establish communication with the Visitor.
Project By: Sobhan Esfandiar (Design & Mechanics); Giuseppe Bonanno (Hardware and Development)
Supervision: Federico Espositi and Andrea Bonarini
“First Contact” is a research project aimed at exploring embodiment and interaction with non-human entities, social robots and virtual shapes, through non-verbal communication, focusing on creating authentic, believable, and socially interactive characters. Subjects interact with each other through a digital filter that transforms their bodies into new creatures that the other will interact with, creating a context to test genuine understanding of this alien otherness.
The project follows two main tracks, “The Game” and “The Room”, experimenting with radically non-anthropomorphic bodies with different narratives and interaction contexts.
Hugh Herr believes that in the 21st century humans might extend their bodies into non-anthropomorphic forms, like wings, and control and feel these movements through their nervous system, drastically changing our morphology and dynamics. This concept raises questions about the nature of our bodies, identity, and potential virtual existences. Avatars are crucial in shaping our social lives and identities, as they become the means through which we embody and express ourselves.
It investigates the challenges of embodiment and genuine human interaction with non-anthropomorphic beings (in terms of shape, perception and expressive capabilities), using only non-verbal communication. By imbuing these entities with unique narratives, motivations, and desires, the study seeks to create truly authentic and relatable characters.
This research originated as the PhD of Federico Espositi and has developed into several multidisciplinary projects, from robot and interaction design, to the study of affordances in Virtual Reality, to the history of the very concepts of animacy and embodiment. Projects are highly experimental, and experiments often take the shape of interactive installations or are inherently mixed with performance.
This study challenges traditional notions of what constitutes a living and social body, leveraging new technologies to expand our understanding of embodiment and cognition. As the concept of embodying avatars different from human morphology gains attention, this research investigates how far we can push the level of non-anthropomorphism while still retaining the possibility to understand the world and act in it through the avatar. Research on embodied cognition emphasizes the body’s central role in processing and understanding reality, while also showing humans’ capacity to re-adapt to different body morphologies, a phenomenon called homuncular flexibility. Thus, exploring radically altered bodies can help us surpass our current understanding of cognition.
At the same time, this research aims to determine what makes a non-human entity appear “alive” to humans by examining the physical elements, movements, sounds, and actions that contribute to this perception. This aim is shared with social psychology research, such as the famous Heider and Simmel study on social intelligence, which showed that humans can attribute human-like behavior and motivations even to simple geometric shapes, and with the entire field of animation, starting from Disney’s “The Illusion of Life”, which laid out the principles that could make a drawing of a half-empty sack of flour display emotions. Applications of this research range from social robotics to creating more relatable and empathetic characters in entertainment. The project aims at offering insights and guidelines for the development of realistic non-humanoid social creatures and their impact on human-technology engagement in an increasingly digital world.
The Digital Filter
Both projects depend on a ‘digital filter’, a mechanism that allows participants to interact while concealing their identities. One human acts as the Controller, the other as the Visitor. The Controller embodies the avatar, driving it through embodied controllers and perceiving reality in an altered way through a VR headset: the visual information the Controller sees is a real-time translation of the data received by the avatar’s sensors. These representations are abstract, so the Visitor appears to the Controller as no recognizable human entity. Meanwhile, the Visitor interacts with this non-anthropomorphic avatar.
These two entities, Controller and Visitor, are in direct interaction within the same shared space, but act through the digital filter described above. Crucially, the two cannot speak; they can only communicate through the motion of their digitally filtered bodies. Neither is able to perceive the other human directly: instead, each perceives the other human ‘translated’ into an alien presence.
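As a schematic illustration of the filter’s two halves, the sketch below shows how avatar sensor data might be abstracted for the Controller, and how the Controller’s embodied input might be turned into avatar actuation. All class names, data shapes and mappings here are hypothetical; the point is only that neither human ever perceives the other directly.

```python
# Schematic sketch of the two-way "digital filter": illustrative only.
from dataclasses import dataclass


@dataclass
class SensorFrame:
    """Raw data the avatar's sensors capture about the Visitor."""
    distance_m: float      # e.g. from a range sensor
    bearing_deg: float     # direction of the Visitor
    motion_energy: float   # how much the Visitor is moving, 0..1


@dataclass
class AbstractPercept:
    """What the Controller sees in VR: no human silhouette, only an
    abstract presence derived from the sensor data."""
    glow_intensity: float  # closer Visitor -> brighter glow
    glow_direction: float
    pulse_rate: float      # more motion -> faster pulsing


def filter_to_controller(frame: SensorFrame) -> AbstractPercept:
    """Visitor -> Controller half of the filter: sensors to abstraction."""
    return AbstractPercept(
        glow_intensity=max(0.0, 1.0 - frame.distance_m / 5.0),
        glow_direction=frame.bearing_deg,
        pulse_rate=0.5 + 2.0 * frame.motion_energy,
    )


def filter_to_avatar(controller_gesture: dict) -> dict:
    """Controller -> Visitor half: embodied input to avatar actuation.
    A gesture like raising an arm might become, say, a wall rippling."""
    amplitude = controller_gesture.get("arm_height", 0.0)
    return {"actuator": "wall_ripple", "amplitude": amplitude}


# One tick of the loop: each human only ever sees the translated other.
percept = filter_to_controller(SensorFrame(2.0, 45.0, 0.7))
command = filter_to_avatar({"arm_height": 0.6})
```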
Both instances can be considered a “first contact” between two beings with non-anthropomorphic bodies that do not know or understand each other at first, and that need to find a way to learn how to communicate non-verbally to achieve some sort of objective.
Pilot
This first prototype of the entire system gave rise to an initial overall conceptual framework; it was refined and turned into an interactive installation, “Connect to the Machine”, presented at Milano Digital Week 2022. This art installation was instrumental for testing the system outside the laboratory with a variety of users, and the feedback data collected drove the following steps of the research. The initial prototype, the framework and the art installation were the subject of a paper published at the xCoAx conference on aesthetics and computation.
The research is divided into two main projects, each an incarnation of the main setup: “The Game” and “The Room”. The two share objectives and methodologies, but each project’s main focus introduces slight variations.
The Game
The Controller and Visitor play a collaborative game. The Visitor interacts with a mobile robot in the real world while the Controller plays a game in VR, unknowingly controlling the robot in the physical space in real time. They both find themselves in a maze, and they have to escape it in a short time.
Each is privy to information that the other requires, but only that other player can actually perform the actions required to win. To each user, the other is an unfamiliar alien. Moreover, the two cannot speak: they must communicate non-verbally and learn how to cooperate to escape. With this project we push the boundaries of communication, driving the users to the extreme with a time-limited objective.
The Visitor enters a closed space, a room, and finds it to be an entire abstract living organism. A space that has come to life. This space is none other than the avatar that the Controller is embodying. This research pushes the boundaries of interaction itself. Notions of gaze, body, personal distance, attention, they are all disrupted, as these two beings find themselves one inside the other. What kind of interaction is possible in such a condition? How can a space be alive?
Beyond the two main tracks, objective-oriented spin-off projects address a subset of the topics of the main research.
Raw Machines
“Raw Machines” was born as a device for a workshop merging physical theatre and robotics. Taking place in a theatrical space, participants control non-humanoid social robots via wearable devices and wireless systems, promoting flexibility and creative experimentation. The workshop enhances body awareness and non-verbal communication, transcending cultural barriers, and fosters digital literacy by demystifying technology components. Led by myself and a professional dancer, it culminates in group performances. The aims of this project are to foster digital literacy, reaching a broad and diverse audience, and to push the boundaries of the expressive potential of the developed robot bodies.
This page introduces the “POLIMI-ITW-S” dataset. “POLIMI-ITW-S” contains 37 action classes and 22,164 video samples in total. The average duration of each clip is about 7 seconds. The dataset contains RGB videos, 2-D skeletal data, bounding boxes and labels for each sample. The data was recorded with the RGB cameras of two smartphones (1920×1080 pixels, 30 fps), hand-held by two recorders about 90 cm from the floor. The 2-D skeletal data and person bounding boxes were generated by OpenPifPaf. The 2-D skeletal data contains the 2-D coordinates of 17 body joints at each frame. The recorders imitated a mobile robot, moving or staying still while looking around to capture the persons performing the actions. We did not mount the camera on a robot in order to avoid the uncommon situations that the presence of a robot could trigger.
1. Action Classes
As shown in the tables below, the actions in the dataset are distributed on three levels (a small illustrative mapping between levels follows the lists):
• General Level: labels describe a single action.
• Modifier Level: labels describe actions performed together by multiple persons.
• Aggregate Level: detailed labels describe multiple simultaneous actions in a single label.
1.1 General Level Actions (10)
A1: cleaning
A2: crouching
A3: jumping
A4: laying
A5: riding
A6: running
A7: scooter
A8: sitting
A16: standing
A27: walking
1.2 Modifier Level Actions (3)
A9: sittingTogether
A17: standingTogether
A28: walkingTogether
1.3 Aggregate Level Actions (24)
A10: sittingWhileCalling
A11: sittingWhileDrinking
A12: sittingWhileEating
A13: sittingWhileHoldingBabyInArms
A14: sittingWhileTalkingTogether
A15: sittingWhileWatchingPhone
A18: standingWhileCalling
A19: standingWhileDrinking
A20: standingWhileEating
A21: standingWhileHoldingBabyInArms
A22: standingWhileHoldingCart
A23: standingWhileHoldingStroller
A24: standingWhileLookingAtShops
A25: standingWhileTalkingTogether
A26: standingWhileWatchingPhone
A29: walkingWhileCalling
A30: walkingWhileDrinking
A31: walkingWhileEating
A32: walkingWhileHoldingBabyInArms
A33: walkingWhileHoldingCart
A34: walkingWhileHoldingStroller
A35: walkingWhileLookingAtShops
A36: walkingWhileTalkingTogether
A37: walkingWhileWatchingPhone
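As mentioned above, the levels are related: every modifier- and aggregate-level label refines one of the general-level base actions. The small helper below, which is purely illustrative and not part of the dataset release, makes the relation explicit.

```python
# Illustrative helper showing how the three label levels relate:
# modifier labels end in "Together", aggregate labels contain "While",
# and both refine a general-level base action.
GENERAL = {"cleaning", "crouching", "jumping", "laying", "riding",
           "running", "scooter", "sitting", "standing", "walking"}


def base_action(label: str) -> str:
    """Map any label to its general-level base action.

    e.g. 'walkingWhileCalling' -> 'walking',
         'sittingTogether'     -> 'sitting',
         'running'             -> 'running'
    """
    for separator in ("While", "Together"):
        if separator in label:
            label = label.split(separator)[0]
    return label if label in GENERAL else "unknown"


assert base_action("standingWhileHoldingStroller") == "standing"
assert base_action("walkingTogether") == "walking"
```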
2. Size of Datasets
The dataset includes three types of files:
RGB videos: collected RGB videos.
2-D skeletons + bounding boxes + labels: JSON format files including 2-D skeletons, bounding boxes and labels for each RGB video.
pre-processed data: split into “training” (70%) and “test” (30%) sets, in “.npy” format for joint body data and “.pkl” format for label data.
The size of each type is shown below:
• RGB videos: 335 GB
• 2-D skeletons + bounding boxes + labels: 39.4 GB
• pre-processed data (.npy and .pkl): 17.7 GB
• Total: 392.1 GB
3. More Information (FAQs and Sample Codes)
We have developed an annotation tool that can be used to visualize poses, bounding boxes and labels on the video clips.
We provide this annotation tool, the data pre-processing script, information about the data, answers to FAQs, sample codes to read the data, and the latest published results on our datasets here.
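As a starting point, a minimal loading sketch is shown below. The file names and the array layout are assumptions based on the description above (joint data in “.npy”, labels in “.pkl”, 70/30 train/test split); refer to the provided sample codes for the actual format.

```python
# Minimal sketch of loading the pre-processed data. File names and
# the exact array layout are assumptions, not the official release.
import pickle

import numpy as np

# Assumed file names; the actual release may use different ones.
train_joints = np.load("train_joints.npy")  # e.g. (N, T, 17, 2): samples,
                                            # frames, 17 joints, (x, y)
with open("train_labels.pkl", "rb") as f:
    train_labels = pickle.load(f)           # e.g. a list of N label ids

print(train_joints.shape, len(train_labels))
```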
The datasets are released for academic research only, and are free to researchers from educational or research institutes for non-commercial purposes. The use of the dataset is governed by the following terms and conditions:
Without the express permission of the AIRLab, any of the following will be considered illegal: redistribution, derivation or generation of a new dataset from this dataset, and commercial usage of any of these datasets in any way or form, either partially or in their entirety.
For the sake of privacy, images of all subjects in any of these datasets are only allowed for the demonstration in academic publications and presentations.
All users of “POLIMI-ITW-S” dataset agree to indemnify, defend and hold harmless, the AIRLab and its officers, employees, and agents, individually and collectively, from any and all losses, expenses, and damages.
Everyday objects can be animated to improve their functionality and to provide a more engaging environment. Moreover, animation makes it possible to explore new interaction situations. Proper design of shape and interaction is needed to obtain compelling objects, and emotional expression is a particularly interesting aspect to explore.
We have developed a couple of emotional trash bins that go around inviting people to trash selected materials through lid movements and sounds; a coat-hanger (IGHOR) that welcomes people entering, asks for their coats, and shows sadness if they keep them on; a naughty fan that comes close and suddenly blasts the person with an air flow; a naughty money saver that has to be chased to be given money; and a kind of pillow that reacts with sounds to the way it is touched.
Robots can be used as artistic media, able to perform and interact with people in artistic representations.
Interacting with a different entity: First contact
In order to explore the interaction possibilities, we are developing experiences where a person can interact with another entity with a different form and abilities, and must find strategies to communicate, possibly to perform shared tasks. The experiences are carried out both with robots and in virtual reality. More details in this post.
Robot actor
We have developed an autonomous robotic actor, able to participate in a public performance with a defined role and script, and we are developing an Improv robotic actor able to adapt its performance to external stimuli. We first developed a robot able to move in classical scenes (e.g., the balcony scene of Romeo and Juliet), selecting the proper emotional expressions for the situation, and a framework to define emotional expressions according to the social setting among characters and the situation. We then developed Robocchio, a robot able to follow a script and adapt to the timing of its partner on stage, tested on an adapted excerpt of Pinocchio by Collodi. We then improved Robocchio to improvise, visually recognizing 17 “scenic actions” and reacting to them with scenic actions influenced by one of 16 different psychological types. The final step, including improvised verbal interaction, is under development.
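As a conceptual illustration of the improv loop just described (recognize a scenic action, respond with one biased by a psychological type), here is a toy sketch. The action names, the two example types and the weights are invented placeholders, not Robocchio’s actual repertoire.

```python
# Toy sketch of an improv policy: an observed "scenic action" is
# answered with a scenic action, biased by the robot's psychological
# type. The real system distinguishes 17 actions and 16 types.
import random

# A tiny stand-in for the recognizable scenic actions.
SCENIC_ACTIONS = ["approach", "retreat", "greet", "threaten", "ignore"]

# Each psychological type weights the candidate responses differently.
TYPE_WEIGHTS = {
    "extraverted": {"approach": 3, "greet": 3, "retreat": 1,
                    "threaten": 1, "ignore": 1},
    "introverted": {"approach": 1, "greet": 1, "retreat": 3,
                    "threaten": 1, "ignore": 3},
}


def respond(observed_action: str, psych_type: str) -> str:
    """Pick a scenic response to the partner's action, biased by type."""
    weights = TYPE_WEIGHTS[psych_type]
    # A threatening partner narrows the plausible responses for everyone.
    candidates = (["retreat", "threaten"] if observed_action == "threaten"
                  else SCENIC_ACTIONS)
    return random.choices(candidates,
                          weights=[weights[c] for c in candidates])[0]


print(respond("greet", "extraverted"))     # likely 'approach' or 'greet'
print(respond("threaten", "introverted"))  # 'retreat' more often than not
```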
Interactive robotic art
Robots can have different shapes and play different roles in interactive artistic performances. We are exploiting materials like nets, polyethylene sheets, polyurethane foams, and others to obtain shapes that are interesting to move in interactive exhibits. Emotional expression is, in this area too, an interesting feature to explore.