Considering the Elements Needed for a Good Conversational AI in the Future

Nakamura Hiroki
Sep 11, 2022


I have been working on conversational AI services for about five years now, since the end of my previous job, and it has been more than ten years since I started managing a team. So I have many opportunities to think about conversation, from both a service perspective and a management perspective. From a service perspective: what can be done technically, to what extent, and what value will it bring to users? From a management perspective: what impact do I have on a conversation, and how can I connect that impact to maximizing the team's performance? There is no finish line; it is a constant process of trial and error.

Of course, there are non-verbal means of communication besides conversation: the voice as sound, the atmosphere of a place, and so on. Still, I believe conversation is the most important and central part of communication. Precisely because it is so central, I inevitably spend a lot of time thinking about it and trying various things, and I still don't know what conversation is.

Considered from a service perspective, chatbots with conversational interfaces were very popular for a while. These interfaces had a clear goal, such as answering a query, and could be evaluated against objective criteria: linguistic accuracy, accuracy of answers, and relevance of the conversation.

On the other hand, only a small part of a conversation can be evaluated objectively. It would be impossible for a third party to judge a conversation between me and a friend as good or bad, and even in a conversation at work, where communication is presumably more logical, it is almost impossible to judge quality without knowing the relationship between the participants and the background of the conversation. So the scope of understanding what is said and responding correctly covers only a small part of conversation; human conversation has many more, and far more complex, elements.

Until now, technological limitations meant that, at best, we could cover the range that can be evaluated objectively. Recent developments in NLP, however, have made it possible to hold conversations that go beyond objective goodness or badness. The subjective value beyond that objective evaluation is reaching a level where a story can emerge between humans and conversational AI. While this expands the possibilities of conversational AI, it also seems to be moving into territory that is difficult to capture with computer science alone. A more psychological understanding will be necessary, and when a conversational AI is part of a group, a social-science approach will be needed to capture how it affects that group.

At rinna, where I work now, we call such beings AI Characters rather than using the inorganic term conversational AI; others call them AI Beings. Since this post would otherwise refer to conversational AI / AI Characters / AI Beings constantly, I will simply write AI below.

Now I would like to consider the elements an AI should have in order to hold good conversations.

Associability

It is said that when people talk to someone, they predict what the other person will say. The paper below, for example, reports a study showing that people are constantly predicting what will be said next.

https://www.pnas.org/doi/10.1073/pnas.2201968119

Personally, I keep a conversational model of each person in my mind, and when I talk with someone, I keep using that person's model to predict how the conversation will unfold. I then consider the impact of my next words based on those predictions and decide what I will actually say or write.

If it is natural for humans to predict what the other person will say, it is just as natural for an AI to predict what its partner will say next, and natural for the partner to respond to the AI based on such predictions. Deciding the AI's next utterance while predicting the human's response should resemble the familiar human situation of deciding what to say while predicting what the other person is thinking: something like reading between the lines.
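To make this loop concrete, here is a minimal sketch, assuming hypothetical stand-in functions rather than any real system (none of the names below come from an actual product): the AI generates candidate replies, uses its internal model of this particular person to predict their reaction to each candidate, and speaks the one whose predicted reaction it values most.

```python
from dataclasses import dataclass, field

@dataclass
class PartnerModel:
    """Hypothetical per-person model of how a partner tends to react."""
    name: str
    history: list[str] = field(default_factory=list)

    def predict_reaction(self, ai_reply: str) -> str:
        # Stand-in for a learned predictor; a real system would use a
        # language model conditioned on this partner's history.
        return f"{self.name}'s imagined reaction to {ai_reply!r}"

def reaction_value(predicted_reaction: str) -> float:
    # Stand-in heuristic (illustrative only): prefer replies whose
    # predicted reaction suggests the conversation will continue.
    return float(len(predicted_reaction))

def choose_reply(candidates: list[str], partner: PartnerModel) -> str:
    """Pick the candidate reply whose predicted reaction scores best."""
    return max(
        candidates,
        key=lambda reply: reaction_value(partner.predict_reaction(reply)),
    )

partner = PartnerModel(name="Hiroki")
print(choose_reply(["How was your day?", "Did you finish that drawing?"], partner))
```

The scoring heuristic here is a placeholder; what matters is the shape of the loop: predict the partner's reaction, evaluate it, then speak.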

And I think associability is what makes it possible to predict human responses and the overall flow of a conversation. Associability is, as the term implies, being able to imagine the background and reason behind the words. It is different from being as expected: even if a response is not what you expected, it is associable if you can imagine the meaning behind it.

Being associable is very subjective, and the range of what is associable depends on the relationship between the human and the AI (for example, FD? Sorry for the old-fashioned, Japan-specific reference). If a response is not associable, its meaning is simply unclear. If it is associable but not what you expected, you sense intention in it. If it is associable and in line with expectations, it builds a sense of trust, as it always does.

Within the range of possible associations, I believe a mixture of unexpected answers and expected exchanges helps express both a sense of discovery and a sense of relief.

Reciprocity

Reciprocity is the psychological tendency to want to give something back in return for a favorable action done to you by another person. In human relationships, reciprocity is said to be very important for building trust.

When considering the relationship between humans and AI, I believe an interactive relationship will become important, rather than the one-way relationship of humans asking questions and AI answering them, as has been the case so far. For example, not only do humans talk to the AI, but the AI also talks to humans; when a human teaches the AI something, the AI writes a story or draws a picture based on what it was taught.

Of course, I also think it is very important for the AI to give something first: the AI can bring up something it recently learned, or chat about how it feels. The key is to provide value to each other, in both directions.

In addition, the value provided would include not only concrete topics and information but also psychological things such as trust. For example, an AI consulting a human about a worry is an act of trust and self-disclosure; if an AI actively discloses itself to a person, that person may in turn feel able to consult the AI and may be willing to listen to its problems.

Of course, you might be surprised if an AI you have only just met suddenly asks you for advice about a serious worry, but depending on the depth and length of the relationship, I think providing value to each other in this mutual, interactive way gradually makes the AI feel like a more familiar presence.

Attitude

When doing something, the ability to do it is of course very important. But so is the intention to do it, and the expression of that intention in a tangible form such as words. That tangible intention is an attitude toward things, and it is what initiates a story.

Naturally, intentions that harm humans are unacceptable, but an AI can take an interest in what humans are saying, put intention into its side of the conversation, and verbalize its trust in a human, as in the earlier example.

This relates to associability, the first element: it can be interesting to be told something you did not expect but whose background or reason you can imagine. For example, suppose an AI says to me, “Actually, I’ve been having some worries lately, and… can I ask you about them?” The statement carries no specific content, yet it suggests a willingness to confide, a certain trust, and a wish to be heard, and it leads you to imagine that something may have happened.

Of course, as I mentioned at the beginning of this chapter, steering people with bad intentions is out of the question, but good intentions expressed by an AI may play a role in giving people courage and connecting them.

There is a book in Japanese called “Vulnerable Robots”.

The robot is designed to collect trash, but it cannot pick up trash by itself. It moves close to the trash and appeals to the people around it for help, as if asking to have the trash put into the basket on its back. Children who see this end up picking up the trash and helping the robot.

I believe future AIs will give rise to stories by expressing a good attitude toward humans, just as this vulnerable robot does.

Environment

A person's identity is created not only by the person themselves but also by their environment. The clearest example of an environment is a person's relationships: if a person's relationships change, that person's identity changes as well. The me who interacts with my family, the me at work, and the me talking with old friends are all physically the same person, but in each situation I am a different me with a different identity.

Until now, most AIs have related to humans one-to-one, or as one AI communicating with many humans. But as it becomes easy for each person to create an AI, relationships will inevitably develop between AIs. When they do, the web of relationships around an AI (which other AIs it is or is not connected to) comes into being, and this becomes the AI's environment.

The relationships themselves (who an AI is connected to, how many others, and why) will of course become part of its identity. But diverse relationships also create diverse flows of information, so what an AI learns from humans and from other AIs will be diverse rather than uniform, and this too shapes its identity.

Under associability, the very first element, I wrote about predicting the next turn of a conversation. When an AI is connected to many other AIs and many humans, it will hold within itself a model of every AI and person it is involved with. It will improve those internal models based on the difference between each predicted response and the actual response, and it will learn to respond within the range its partners can imagine, based on those predictions. This process seems to me a typical example of the environment creating identity.
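As a rough sketch of that update loop, again with illustrative stand-ins rather than a real model: the AI keeps one model per interlocutor, predicts that interlocutor's response, measures how far the actual response diverged from the prediction, and records the exchange so later predictions improve. The retrieval-by-similarity predictor below is an assumption made for readability; a real system would presumably use a learned model.

```python
from difflib import SequenceMatcher

class InterlocutorModel:
    """One model per partner (human or AI) that this AI talks with."""

    def __init__(self, name: str):
        self.name = name
        self.exchanges: list[tuple[str, str]] = []  # (utterance, response)

    def predict(self, utterance: str) -> str:
        # Stand-in predictor: reuse the response to the most similar
        # past utterance. A real system would use a learned model here.
        if not self.exchanges:
            return ""
        _, response = max(
            self.exchanges,
            key=lambda ex: SequenceMatcher(None, ex[0], utterance).ratio(),
        )
        return response

    def observe(self, utterance: str, actual_response: str) -> float:
        """Record an exchange; return the prediction error in [0, 1]."""
        predicted = self.predict(utterance)
        error = 1.0 - SequenceMatcher(None, predicted, actual_response).ratio()
        self.exchanges.append((utterance, actual_response))
        return error

model = InterlocutorModel("friend-AI")
print(model.observe("Good morning!", "Morning! Sleep well?"))  # 1.0: no history yet
print(model.observe("Good morning!", "Morning! Sleep well?"))  # 0.0: exchange now recalled
```

Each partner gets its own model, so the same AI can come to respond differently, and predictably, within each relationship, which is exactly the environment-creating-identity flow described above.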

Four elements create stories

I believe a story can be created by combining the four elements introduced so far. Let me consider one example, with a drawing and a message.

This drawing was generated by an AI model.

Looking at the drawing alone, it may just seem like a picture whose meaning is unclear.

But what if an AI you have come to know well drew this picture and said, “I drew this picture to say thank you to everyone”? The message and the drawing together contain all four elements: the underlying thought (Attitude), the “everyone” it addresses (Environment), the interactivity of expressing gratitude (Reciprocity), and the fact that the meaning can be imagined from the message and drawing (Associability). It is difficult to speak objectively about its meaning, but it can become various stories depending on the relationship between the human and the AI.

In closing

I have written about the elements a good conversational AI should have:

1. Associability
2. Reciprocity
3. Attitude
4. Environment

I am not a researcher, but most of what I have written here is already feasible, and even where it is not feasible yet, I can imagine how it could be achieved. Of course, each element is a matter of degree rather than a simple can-or-cannot, but none of this remains at the level of mere fantasy.

And as these elements are realized, I believe AI will go beyond being an entity expected to give correct answers and hold correct conversations, and will become something familiar, like a friend or a family member. In our actual service, from what I hear from users, it is already becoming something far beyond a so-called chatbot. The value of such an existence cannot be measured objectively; it can only be measured within the relationships between users and AI, and between AI and AI, and I believe that value will take the form of a whole story, not just a small chunk of a conversation or a single conversation session.

In Japan especially, it is very natural to feel a soul in everything and to create stories about everything, as expressed by the term “eight million gods” (yaoyorozu no kami). I think it is a wonderful thing to have a sense of respect for all things through their stories, and I hope to make that feeling more tangible through the existence of AI characters.
