This is a summary of a presentation I gave at ProductTankTokyo about six months ago. I have been talking about the same subject a lot recently, so I’d like to write it up here.
Generative AI is AI that generates something: for example, the image generation AI that has been much discussed recently, or AI that generates documents and conversations, such as GPT-X. It is called Generative AI in contrast to recognition-based AI such as image recognition and speech recognition.
I am currently a PM at rinna for a social networking product for AI characters named Chararu. This product uses a variety of AI models, but it also uses a lot of Generative AI for AI character conversations and content generation, which is the core of the product.
The topic of this post is how to manage this Generative AI-based product. Naturally, the ability to generate high quality content such as conversations, texts, and images is of great value for a product. On the other hand, Generative AI has technical uncertainties and limitations, such as the difficulty in predicting quality in advance and the cost of generation.
So, using this product as an example, I would like to write about what you should do as a PM when using Generative AI in a product.
1. Understand the dilemmas surrounding AI model quality
The three important elements of product management (UX, business, and technology) remain the same when using Generative AI. However, there are a great many more factors to consider on the technology side. In addition to understanding the software technology, as with regular applications and services, you need to understand and manage the AI model itself.
For AI models, if the models are lightweight, you may be able to focus only on accuracy without worrying too much about cost. On the other hand, for Generative AI, AI models tend to be very large in order to generate content with accuracy that meets user expectations.
If the AI model is large, the server specifications needed to host it and the cost of serving it are not negligible unless the company is very well funded. Of course, reducing the size of the model will lower the cost, but if the model is simply shrunk with no other measures, the quality of the generated content will decrease, resulting in a poor UX.
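To make the cost side of this dilemma concrete, here is a minimal back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function name, the `peak_factor` parameter, and the simplification that one GPU handles one generation at a time. It is not how any particular product is actually sized.

```python
import math


def monthly_serving_cost(requests_per_day, seconds_per_generation,
                         gpu_hourly_price, peak_factor=3.0):
    """Rough monthly cost of always-on GPU servers sized for peak traffic.

    Hypothetical model: one GPU serves one generation at a time, and
    peak traffic is `peak_factor` times the daily average.
    """
    avg_requests_per_second = requests_per_day / 86_400
    peak_requests_per_second = avg_requests_per_second * peak_factor
    # GPUs needed to keep up with peak traffic (at least one).
    gpus_needed = max(1, math.ceil(peak_requests_per_second * seconds_per_generation))
    # 24 hours a day, 30 days a month.
    return gpus_needed * gpu_hourly_price * 24 * 30
```

With the same traffic and the same hourly price, a model that takes four times longer per generation needs roughly four times as many servers in this simplified view, which is why model size feeds so directly into cost.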
Another factor is data. High-quality data that matches the direction of the service is very important for improving the quality of the AI model. If the quality and quantity of data do not meet the required standards, the quality of the model will decrease and the UX will get worse. However, a Generative AI model requires a very large amount of data, and collecting it is expensive and time-consuming. Therefore, the more you insist on data quality and quantity, the slower delivery becomes. Furthermore, a large model takes a significant amount of time for each training run, so if you repeat trial and error too many times in pursuit of quality, delivery slows down further.
As described above, Generative AI is rather difficult to handle. A very simplified view of the dependencies is shown in the figure below. There is no simple optimal solution, because improving one factor tends to worsen another. Still, you need to understand the dependencies among the quality, cost, and delivery speed of AI models, and to find and manage the right balance by looking at how strongly each one affects the others.
However, even though there is no simple optimal solution, there are many ways to mitigate the downsides and reduce uncertainty. I will describe some of them in the following sections.
2. Establish an environment for faster iteration of AI model improvement
Predicting the quality of Generative AI with certainty is difficult in many cases. For recognition-based AI, past examples may make it possible to set expectations to some extent. For Generative AI, on the other hand, there are still few examples of its use in actual products, and the evaluation method itself differs depending on the direction of the service. This makes it difficult to set a quality target and to predict when it will be reached based on past results.
When predictions are difficult to make, the best way is to iterate on improvements quickly. And there are two important things to do to iterate quickly.
The first is to separate the service implementation from the model improvement process. Of course, they need to be combined in the end, but if model improvement only starts after the service UI is ready, you will get far fewer iterations. Therefore, even if it costs a little more, you should create a demo tool that resembles the UI planned for the service, and start evaluating the model before the service itself is implemented. With a demo tool, you can evaluate quality from the user’s point of view from the very early stages of development and maximize the number of improvement iterations.
The second is to build up the model improvement process itself. The process has four major steps: collecting data, creating a model, optimizing the input parameters for the model, and evaluating the output.
Among these, evaluation is very important. In the case of generative models, the criteria for what is good and what is not so good depend on the direction of the service. Therefore, the criteria for quality evaluation should be decided by the PM according to the direction of the service.
Of course, in the initial stages, it may be easy to judge whether the model has gotten worse or better without rigorous evaluation. However, as improvements are made, it becomes difficult to judge at a glance whether the previous version or the next version of the model is better. Therefore, it is very important to establish evaluation criteria and processes in the very early stages, so that consistent evaluation criteria and a well-developed evaluation process can be used to safely and quickly iterate.
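As one way to picture a consistent evaluation process, here is a minimal Python sketch that compares two model versions on fixed criteria. The criteria names, the 1 to 5 scale, and the `Rating`, `summarize`, and `compare` names are all hypothetical; in practice the PM would define the criteria from the direction of the service.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical criteria; the real ones depend on the direction of the service.
CRITERIA = ("on_character", "coherent", "safe")


@dataclass
class Rating:
    prompt_id: str       # which fixed test prompt was used
    model_version: str   # which model produced the output
    scores: dict         # criterion -> human score (assumed 1..5)


def summarize(ratings, version):
    """Average score per criterion for one model version."""
    rows = [r for r in ratings if r.model_version == version]
    return {c: mean(r.scores[c] for r in rows) for c in CRITERIA}


def compare(ratings, old, new):
    """Per-criterion difference (new minus old); positive means the new version scored higher."""
    before, after = summarize(ratings, old), summarize(ratings, new)
    return {c: after[c] - before[c] for c in CRITERIA}
```

Because both versions are rated on the same prompts and the same criteria, a small difference that is hard to see at a glance still shows up as a number that can be tracked from one iteration to the next.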
3. Understand the imperfections of AI models and implement complementary measures
AI models are always imperfect. This is true not only of Generative AI models but also of recognition-based AI: even if you optimize for precision and recall, they will never reach 100%, so you need to consider how to manage that imperfection in the overall process.
The same applies to Generative AI models. In conversation generation, for example, it is almost impossible to suppress inappropriate utterances 100% of the time. So the answer is not only to pursue a perfect AI model, but also to combine it with other methods.
In the case of a conversation generation model, for example, a filtering system, such as a rule-based one, can be placed on both the input from the user and the output generated by the AI model, adding redundant layers of protection. Achieving broad coverage with rule-based filtering is costly, but it can reliably filter out specific words. Furthermore, when an inappropriate utterance is detected, its pattern can be added to the rules immediately to suppress it from then on.
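The idea of filtering both the input and the output can be sketched roughly as follows in Python. The blocklist entry, the fallback message, and the function names are placeholders made up for illustration, and `generate` stands for whatever function calls the conversation model.

```python
import re

# Hypothetical blocklist; a real one would be maintained as part of operations.
BLOCKED_PATTERNS = [re.compile(r"\bbadword\b", re.IGNORECASE)]
FALLBACK_REPLY = "Sorry, let's talk about something else."


def is_blocked(text):
    """True if any blocked pattern appears in the text."""
    return any(p.search(text) for p in BLOCKED_PATTERNS)


def add_blocked_pattern(pattern):
    """Register a newly detected pattern so it is suppressed from now on."""
    BLOCKED_PATTERNS.append(re.compile(pattern, re.IGNORECASE))


def safe_reply(user_input, generate):
    """Filter the user's input first, then the model's output."""
    if is_blocked(user_input):
        return FALLBACK_REPLY
    reply = generate(user_input)
    if is_blocked(reply):
        return FALLBACK_REPLY
    return reply
```

A new rule takes effect on the very next request without retraining the model, which is what makes this kind of filter a useful complement even though it can never cover everything.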
Of course, even redundant filtering does not get you to 100%. For this reason, you also need to design and implement operations such as monitoring what is being said about the service on social networking services and responding immediately to any inappropriate behavior that is detected. It is very important to compensate for the imperfections of the AI model with measures that include human operations, and to be able to fix inappropriate behavior quickly when it occurs.
4. AI models with high generation costs should also consider pre-generating methods
Consider the cost of real-time generation for the expected traffic and the impact on UX of the time it takes to generate content. If neither is a problem, there is no reason not to generate content in real time.
However, if it is too costly to generate every time, or if a single generation takes so long that the user experience is unacceptable, real-time generation should be given up. In such cases, before deciding not to use the generative model at all, one option is to generate a large amount of content offline in advance and select from that pre-generated content for use in the service.
Naturally, pre-generation limits the number of patterns that can be provided, but it may solve the problems of generation cost and speed. Pre-generation also makes it easier to find a compromise on quality, since pre-generated content can be manually checked even when the quality of the generative model is unstable.
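A minimal Python sketch of this pre-generation approach might look like the following. The function names are hypothetical: `generate` stands for the expensive model call, and `approve` stands for whatever check is applied offline, such as a manual review.

```python
import random


def build_pool(generate, prompts, per_prompt, approve):
    """Offline step: generate candidates and keep only those that pass review."""
    pool = {}
    for prompt in prompts:
        candidates = [generate(prompt) for _ in range(per_prompt)]
        pool[prompt] = [c for c in candidates if approve(c)]
    return pool


def serve(pool, prompt, rng=random):
    """Online step: pick a pre-approved item without calling the model."""
    items = pool.get(prompt)
    return rng.choice(items) if items else None
```

In this sketch the model is only called inside `build_pool`, so the cost and latency of generation are paid offline, and `serve` returns `None` when nothing was pre-generated for a prompt so that the caller can decide on a fallback.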
5. Design and implement a feedback loop from the beginning to improve the quality of generation
Last, but most important: in addition to the process of improving the model within the team, a feedback loop should be implemented so that the more users use the model, the higher the quality of the generated results becomes.
It is easy to write but moderately difficult to realize. First, you need to understand the structure of the model. Then, based on that understanding, think about how to incorporate the results of users’ use of the model’s relevant features into the improvement process.
At the same time, the part of the model improvement process that users touch needs to be designed and implemented as a natural UX from the user’s perspective. If it can be implemented so that users see the effect of their actions as quickly as possible, preferably instantly, they can be expected to use the service more often. For example, in the case of conversation generation, if the AI can remember the content of the conversation, the effect can be felt immediately.
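One way to picture the two sides of such a loop is the small Python sketch below: a short-term memory that makes the effect visible immediately within the conversation, and a log of rated examples for the slower, offline improvement process. The class, its method names, and the prompt format are hypothetical and only meant as an illustration.

```python
from collections import defaultdict, deque


class FeedbackLoop:
    def __init__(self, max_lines=6):
        # Per-user short-term memory: the immediate, user-visible effect.
        self.memory = defaultdict(lambda: deque(maxlen=max_lines))
        # Rated examples collected for the offline model improvement process.
        self.training_examples = []

    def build_prompt(self, user_id, user_message):
        """Prepend the remembered conversation to the new message."""
        history = "\n".join(self.memory[user_id])
        return f"{history}\nUser: {user_message}".strip()

    def record_turn(self, user_id, user_message, reply):
        """Remember one exchange; the oldest lines drop off automatically."""
        self.memory[user_id].append(f"User: {user_message}")
        self.memory[user_id].append(f"AI: {reply}")

    def record_rating(self, user_message, reply, liked):
        """Store the user's reaction as a candidate training example."""
        self.training_examples.append(
            {"input": user_message, "output": reply, "liked": liked}
        )
```

The memory gives users an instant reason to keep talking, while the rated examples accumulate in the background for the next round of model improvement.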
Designing an appropriate UX while understanding the model is the most difficult part of using a generative model, but it is also the most interesting.
What is even more interesting and challenging is that generative models evolve very quickly. What they can do today and what they will be able to do in the future are very likely to change quickly and dramatically. For example, a conversation generation model that today can only learn which content is more desirable may soon be able to learn degrees of desirability, such as undesirable, somewhat desirable, and very desirable.
To understand this future evolution of AI models, you need to understand not only the specifications of current models but also the trends in research. The best way to understand research trends is to read papers on state-of-the-art research in the area (although I often ask our company’s research team to teach me how to follow those trends, as I am not competent enough to do so on my own). Of course, complete prediction is impossible because the field evolves so fast, but you can design your services with some degree of foresight.
This feedback loop is at the very core of a product in which the generative AI model is central, so it needs to be designed from the earliest stage of UX thinking. Designing it appropriately is one of the important tasks of the PM, who sits between UX and technology.
I wrote about product management for Generative AI-based products.
- Understand the dilemmas surrounding AI model quality
- Establish an environment for faster iteration of AI model improvement
- Understand the imperfections of AI models and implement complementary measures
- AI models with high generation costs should also consider pre-generating methods
- Design and implement a feedback loop from the beginning to improve the quality of generation
Providing Generative AI as a standalone tool is a different case, but I feel that it is still very difficult to handle when you incorporate it as part of a product. On the other hand, if used well, it makes it possible to create new services that are completely different from those of the past. Also, since the technology itself is evolving very quickly, you can differentiate your product by adopting it early and updating it as the technology evolves.
The evolution of technology in this area is often discontinuous, and it is very difficult to precisely predict the timing and content of the evolution, but I think it is a very interesting field where new things are being created very frequently.
That’s all. I hope this know-how helps you make use of Generative AI in your products.