2024 Jan 23;14(1):1945. doi: 10.1038/s41598-024-52299-7.

On computational models of theory of mind and the imitative reinforcement learning in spiking neural networks


Ashena Gorgan Mohammadi et al. Sci Rep.

Abstract

Theory of Mind refers to the ability to infer others' mental states, and it plays a crucial role in social cognition and learning. Biological evidence indicates that complex circuits are involved in this ability, including the mirror neuron system. The mirror neuron system supports imitation and action understanding, enabling learning through the observation of others. To simulate this imitative learning behavior, a Theory-of-Mind-based Imitative Reinforcement Learning (ToM-based ImRL) framework is proposed. Employing bio-inspired spiking neural networks and the mechanisms of the mirror neuron system, ToM-based ImRL is a bio-inspired computational model which enables an agent to learn effectively how to act in an interactive environment by observing an expert, inferring its goals, and imitating its behaviors. The aim of this paper is to review computational attempts at modeling ToM and to explain the proposed ToM-based ImRL framework, which is tested in the environment of the River Raid game from the Atari 2600 series.


Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
The schematic representation of the network architecture. The input layer (STS) consists of different neural populations, each corresponding to a group of components in the game. The neurons in this layer have no dynamics and only pass the spike trains to the output action layer, which is composed of LIF neurons. The information from each game frame is extracted via convolutional template matching. This spatial information about the components is then encoded into explicit spikes over time. In other words, the neuron corresponding to the presence of a component in a region of the game frame spikes at a time correlated with that component's vertical distance from the plane: the shorter the distance, the earlier the spike.
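The distance-to-latency scheme described in the caption is a form of time-to-first-spike coding. A minimal sketch of such an encoder is below; the function name, the maximum latency, and the distance range are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def encode_latency(distances, t_max=50.0, d_max=160.0):
    """Map each component's vertical distance from the plane to a spike
    time: closer components spike earlier (time-to-first-spike coding).
    t_max (ms) and d_max (pixels) are illustrative, not from the paper."""
    d = np.clip(np.asarray(distances, dtype=float), 0.0, d_max)
    # Distance 0 spikes at t = 0; the farthest component spikes at t_max.
    return t_max * d / d_max

# A component just ahead of the plane spikes before a distant one.
times = encode_latency([10.0, 80.0, 160.0])
```

With a linear map the ordering of spike times mirrors the ordering of distances, which is all the downstream LIF layer needs to favor nearby components.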
Figure 2
An overview of the ToM-based ImRL framework. The first pass demonstrates the actions chosen by the expert and the ToM-based agent for the current frame of the game, while the second pass indicates the actual learning process, depressing the contradicting self-action and potentiating the mirrored expert action.
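The second pass of the framework, as described in the caption, pairs depression of the agent's contradicting choice with potentiation of the mirrored expert action. The sketch below is a hedged illustration of that idea on a simple weight matrix; the function, the rate parameter, and the exact update rule are assumptions, not the paper's actual learning rule.

```python
import numpy as np

def imitative_update(w, pre_activity, self_action, expert_action, lr=0.01):
    """Illustrative second-pass update: potentiate synapses onto the
    mirrored expert's action neuron and depress those onto the agent's
    own, contradicting choice. w has shape (n_inputs, n_actions)."""
    w = w.copy()
    if self_action != expert_action:
        w[:, expert_action] += lr * pre_activity  # potentiate imitated action
        w[:, self_action] -= lr * pre_activity    # depress contradicting action
    return w

w0 = np.full((4, 3), 0.5)          # 4 input neurons, 3 candidate actions
x = np.ones(4)                     # presynaptic activity for this frame
w1 = imitative_update(w0, x, self_action=0, expert_action=2)
```

When the agent already matches the expert, no weights change, so learning pressure exists only where the two disagree.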
Figure 3
Actions taken by the ToM-based agent and the expert. (a) The distribution of the actions chosen by the ToM-based agent and the expert, averaged over 100 independent runs. Note that the ToM-based agent tries to imitate the expert's behavior by predicting its goals, rather than copying its actions. (b) An Atari 2600 joystick. The joystick has a single button corresponding to the fire action, and a stick that can be pushed in one of eight directions. The stick and the button can be used simultaneously, resulting in a total of 17 possible actions. The player can also neither move nor fire, the so-called 'no action' in the environment. The image is adapted from amazon.co.uk.
Figure 4
The comparative illustration of the ToM-based and RL-based agents throughout the course of learning (significance confirmed by Welch's t-test: the ToM-based agent's score is larger than the RL-based agent's score at each episode with p<0.001). Here, the ToM-based agent learns by observing an expert agent that achieves an average score of 14,185 over 100 independent runs.
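The per-episode comparison in the caption uses Welch's t-test, which does not assume equal variances between the two score samples. A sketch of that test on synthetic stand-in data is below; the score distributions are fabricated for illustration and are not the paper's results.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic per-episode scores standing in for the two agents' 100 runs;
# the means and spreads here are arbitrary, not the paper's data.
tom_scores = rng.normal(loc=12000, scale=1500, size=100)
rl_scores = rng.normal(loc=8000, scale=2500, size=100)

# equal_var=False selects Welch's t-test; alternative="greater" tests the
# one-sided claim that the ToM-based agent scores higher (scipy >= 1.6).
t_stat, p_value = stats.ttest_ind(
    tom_scores, rl_scores, equal_var=False, alternative="greater"
)
```

Welch's variant is the appropriate choice here because the two agents' score variances need not match across episodes.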
Figure 5
A sample frame of the River Raid game, along with the visual access of the ToM-based agent and the expert.

