Artificial intelligence and human-machine interaction are becoming increasingly important in both science and everyday life. Virtual assistants such as Alexa or Siri, in particular, have already gained widespread popularity all over the world. Yet there has been little research on how people process information when interacting with virtual assistants.
This article summarizes the findings of my master's thesis, which I wrote during my first months at Norstat Germany. It sheds light on the most intuitive forms of human-machine communication.
Verbal communication with virtual assistants fundamentally changes the way people interact with technical devices.
To investigate which modality of presenting information leads to the highest short-term memory capacity, two use cases were tested, each measuring the number of correctly remembered items. Participants were asked to reproduce the contents of six different videos in which a virtual assistant presented information in auditory, visual, or combined visual-auditory form.
In the first use case, a virtual assistant presented the top ten football clubs of three different seasons of the German Bundesliga. Afterwards, participants were asked to rank the clubs from memory of the video they had just seen, using drag and drop.
In the second use case, a virtual assistant presented three different types of shopping lists, each consisting of ten everyday items. Participants were asked to remember the items and then select the ten shown in the video from twenty multiple-choice options.
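To make the measure "number of correctly remembered items" concrete, the two tasks could be scored roughly as in the following Python sketch. This is only an illustration under stated assumptions: the article does not describe the exact scoring rule, so counting a club as correct only when it sits at exactly the presented rank is an assumption, and the function names and the placeholder clubs and items are hypothetical.

```python
def score_ranking(presented, answered):
    """Count clubs placed at exactly the rank shown in the video.

    Assumption: only exact position matches count as correct.
    """
    return sum(1 for shown, given in zip(presented, answered) if shown == given)


def score_selection(presented, selected):
    """Count selected options that were actually on the presented list."""
    return len(set(presented) & set(selected))


# Hypothetical placeholder data, not taken from the study.
presented_ranking = ["Club A", "Club B", "Club C", "Club D"]
answered_ranking = ["Club A", "Club C", "Club B", "Club D"]
ranking_score = score_ranking(presented_ranking, answered_ranking)

presented_list = ["milk", "bread", "eggs"]
selected_options = ["milk", "eggs", "soap"]
selection_score = score_selection(presented_list, selected_options)

print(ranking_score, selection_score)
```

In this toy example, two clubs are at the correct rank and two of the three selected options were on the list, so both scores are 2. A more lenient ranking rule, such as giving partial credit for near misses, would be an equally plausible choice.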
The results of this study show that the combination of visual and auditory stimuli leads to the highest number of correctly remembered items.
The fact that fewer items were remembered in the first use case can be explained by the smaller amount of information provided. In addition, previous studies have found that short words are memorized better than longer ones, which may also contribute to the higher number of correctly remembered items in the second use case. The strong results for the combination of visual and auditory stimuli could explain the launch of Amazon Echo Spot and Echo Show, which have not only microphones and speakers but also a display to show the information requested by the user.
A comparison of female and male participants showed no significant differences in the first use case (football teams), whereas the differences in the second use case were highly significant: women remembered the items on the shopping lists better than men.
For the first use case, it had been suggested that men would remember the football teams better than women, as football is still a male-dominated topic. The results of this study do not confirm this, as there is only a slight trend in the suggested direction.
The importance of presenting information visually has increased in recent years, and not only because of the heavy use of smartphones and tablets. At the same time, language, as the most natural medium of human communication, is becoming increasingly important in human-machine interaction. Even in online surveys, verbal communication is already used to answer open text boxes by means of speech-to-text transcription. This can save time and improve the quality of answers to open-ended questions. Verbal communication as part of artificial intelligence is therefore already finding its way into market research and will continue to influence it in the future.