In this episode, Richard H. Miller shares keys to creating enterprise-grade Gen AI-infused applications that resonate with users.
The Keys to Quality Enterprise-Grade Gen AI Apps
About This Episode
Special Guest
Transcript
(This transcript has been edited for brevity.)
DAVID CARTY: 3D printing can enable artists to create some amazing designs in very short amounts of time. It can also be a great resource for home improvement tasks. Of all things to capture Richard H. Miller’s interest in the technology, it was a broken backyard swing that needed attention, and that sparked that classic build-versus-buy decision.
RICHARD H. MILLER: Yeah, it’s actually a pretty funny story. I had a swing in the backyard, one of those swings where two or three people can sit on and you swing. And every few years, they’d have to buy a new mattress for it, a new cover. But eventually, the feet on the bottom started to degrade.
So I was like, hey, I’ll just go to China, and I’ll find somebody. I’ll buy the four feet. It’ll cost like $50. Doesn’t exist, doesn’t exist in the world. So I’m like, OK, I’m in Tech Valley. I’ll get somebody to print them for me. And so I’m like, oh, yeah, we can do that. The model is $300. Each one to print will be $200. It was maybe $900 to make four feet for a swing that is worth $50, $100. So I’m like, well, if I have to spend that much for the feet, maybe I should just make them myself. And that’s what I did.
It was my first model. I made little miniature versions of the models to make sure the angles were right and all that stuff. And then eventually, I printed them out of ABS, which is a little more complicated to print out of. You need to vent it. You don’t want to breathe the fumes and all that. So I had to build an enclosure for the printer. So it took me a while just to get ready to print the thing I needed to print because I had to print all the parts to create the enclosure and all that. And so eventually I printed them, and they each took like 18 hours to print. And then I learned about different nozzle sizes and different types of filaments, the plastic that you use. And now I’ve designed maybe over 100 models. So just fun stuff that helped me be more efficient or more productive or help me adapt something for something else.
CARTY: Richard has since built all kinds of items for use around the house, sometimes even letting his printer run while he sleeps. And as much as he enjoys creating those items, he enjoys sharing them too. Think open-source contribution but for 3D designs.
MILLER: Usually when I do these new models, I post them back up to one of the sites. Because of course, it’s a community. You’re sharing, and it’s like sharing software, it's just that you're sharing models instead. And it’s a lot of fun, and people are really supportive and give good feedback. And it’s just a fun thing to do. And in fact, MIT licenses are used on this stuff. Sometimes they want attribution, usually not. I don’t. I couldn't care less. Please use my models.
CARTY: For a design-minded guy like Richard, 3D printing is a practical need, but it scratches a creative itch, as well.
MILLER: It’s just fun. I usually just do tools and things like that. I don’t build art projects. I do things that solve problems. It’s a lot of fun. And it makes me think about problem solving, which is, again, what UX designers do.
This summer, I went to this thing called Burning Man, which I strongly recommend people try. It’s definitely tough living, but the upside is the people there are just spectacular. And the innovation and the creativity is just beyond mind-boggling. Plus they’ve got a lot of solar and LED lights and computer control and stuff that I think is very interesting. And I helped build the temple this summer, which is one of the big things on the playa. Eventually, in the end, they burn it down. And people like to share and give things out. So what I did for my sharing, where a lot of people do stickers, for example, was build little 3D-printed Men. There’s a Burning Man symbol; you can just google it, burning man. I built these little symbols out of plastic, and people put them on the back of their car, sort of as a 3D version of a sticker. And that was my share. I would go around, and when you’d meet somebody, I’d hand them one of these. And they’d probably hand me a sticker or something, which I put on my water bottles.
I would just say, try it. If you feel like you have that interest to be a maker, there’s not a big barrier to entry. It’s not that hard to get started. I built my printer from scratch. They sent me instructions. I followed the instructions exactly. I learned a lot about the printers along the way. Not a big barrier to entry. Really, I’ve taught many people how to do it. And I can teach you the basics in 30 minutes, so it’s a lot of fun if you have that kind of itch.
CARTY: This is the Ready, Test, Go podcast, brought to you by Applause. I’m David Carty. Today’s guest is 3D-printing tinkerer and conversational AI expert, Richard H. Miller.
Richard is a technology advisor and coach guiding startups in emerging technologies across CRM, social networking, and other sectors. He spent 17 years at Oracle, most recently as a senior director-level architect specializing in conversational AI and UX design strategy. He also wrote the book UX for Enterprise ChatGPT Solutions, which was published just a couple of months ago.
We talk a lot about the theoretical implementation of generative AI in the enterprise, but Richard’s book aims to provide some practical guidance for creating engaging, generative AI-based experiences. His book really takes more of a UX-centered approach and deals with matters like usability and functional quality, which are all too important for systems that return outputs in a matter of seconds. Let’s learn from Richard how to harness the power of Gen AI in a practical way.
Richard, your book focuses on designing enterprise-grade solutions using ChatGPT and other generative AI platforms, not necessarily specific to ChatGPT, but that’s kind of where your focal point is in that book. But let’s start by asking, what are the guiding principles of designing a high-quality, usable, Gen AI-infused application or feature?
MILLER: So is this going to be two hours on just this one answer? Because that’s the kind of time it takes to cover that. But obviously, we’re going to go a little shorter than that. There are a lot of things in the design and development process, and in my book, I go through this whole life cycle. So there are principles, practices, and guidelines that might apply to the research end.
Applause is really well known for this kind of stuff. How many users should we talk to? How do we gather feedback properly? How do we answer survey questions? How do we figure out what’s the right thing to build? Because if you don’t build the right thing, you might build it well, but then who needs it? And what I found, deeper into some of the chapters, was I started looking at the guidelines and principles that have been around a long time in UX design. In 1984, Smith and Mosier wrote this compendium of user interface guidelines. Now, if you do the math, for many of the audience, that’s before they were born. GUIs, graphical user interfaces, weren’t around; the Mac came out in 1984. People didn’t know about the Xerox Star and things before that. It wasn’t mainstream. No one had them. Well, it turns out what’s old is new again. Smith and Mosier, for example, contains 944 guidelines that were based on user research methods and studies.
And a lot of designers today don’t come from that kind of, I almost want to say hardcore, interaction design background. They might come from a visual design background or other areas that maybe don’t have that scientific underpinning. And you go back and look at these guidelines, and you realize that what’s old is new again: the things they said about command-line interfaces are totally applicable to the kind of interfaces we’re doing now with chat and conversational assistants.
And so if you read those, you realize there are things we need to put into place, either through the kind of content we’re creating or through how we’re wrapping up our conversational assistant with instructions, the instructions that tell it how to behave or the type of behavior we want out of it, in order for our end customer to then ask a question, because we usually create these wrappers. So here are a few that I’m reminded of, to inject into, say, prompt engineering. Consistent display format, that’s one of the guidelines in there. Well, if I talk to you in military time, then I would want the generative assistant to talk back to me in military time. And I’d want dollars and dates and currencies, or whatever, all to be consistent with my expectations. It’s very common for people to say, I want something on June 5, but if I said I want it on 6/1, is that June 1, or is that the 6th of January?
CARTY: It depends on where you are in the world, right?
MILLER: Exactly, right. So understanding the user’s context and who they are and their profile, and then being able to apply these correctly, has a big impact. And just one other brief example: only necessary information displayed. There’s an old guideline on that, and we talk about this a lot with things like active mirroring. If I say to the conversational assistant, I need to book a trip for 3:00 PM on Thursday to San Francisco, it comes back to me and says, OK, you’d like something at 3:00 PM on Thursday the 5th to go to the San Francisco airport, or whatever. So not only is it mirroring back to me what I’m doing, but it’s giving me the information so that I know that it knows what I was asking.
And so all these kinds of guidelines that are out there apply to how we need to design our solutions today. And I don’t mean just conversational systems where we’re going back and forth with a chatbot, a term I don’t like, and we can get into that if you want. There are also applications of generative systems that are behind the scenes or part of a hybrid solution, like a GUI that also has conversational feedback. If you’re doing something in an enterprise solution like Salesforce or ServiceNow, it might be providing guidance or feedback: oh, you should add this offer to the deal, or here’s a better way of saying that. So these are hybrid experiences. All of those have that sort of prompt engineering, those instructions wrapping the model to help guide it in the right direction. And these kinds of guidelines can be used to help you form those solutions.
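Guidelines like consistent display format and active mirroring usually end up encoded in the instruction wrapper Miller describes. Here is a minimal Python sketch of such a wrapper; the function name, instruction wording, and parameters are illustrative assumptions, not taken from the book or any specific product:

```python
def build_system_prompt(locale: str, time_format: str) -> str:
    """Compose wrapper instructions that encode classic UI guidelines
    (consistent display format, active mirroring, only necessary
    information) for a generative assistant."""
    return (
        "You are a booking assistant.\n"
        f"- Consistent display format: write all dates in the {locale} "
        f"convention and all times in {time_format} notation, matching "
        "the format the user themselves used.\n"
        "- Active mirroring: before acting, restate the user's request "
        "(date, time, destination) so they can confirm you understood.\n"
        "- Only necessary information: omit details the user did not "
        "ask about.\n"
    )

# The resulting text would be sent as the system message that wraps
# every end-customer question.
print(build_system_prompt(locale="en-GB", time_format="24-hour"))
```

The point is that each line of the wrapper traces back to a researched guideline rather than to guesswork.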
CARTY: Yeah, I always thought it was interesting having your answer read back to you. Through the conversational assistant sort of lens, it helps the user, who’s providing the input, feel like they’re heard. So it’s kind of a humanistic quality that it’s giving to the AI.
MILLER: It is. It is.
CARTY: Which I’m sure is not unintentional. But to jump into the next question: in the book, you reference Boomi’s six tenets of AI readiness, and you align them with your own findings. Now, it’s too much to cover in one question, and we don’t have two hours for this podcast, unfortunately, so let’s frame it in a different way. Based on what organizations should have in place for AI implementation, where do you see them making mistakes, rushing and taking a wrong turn?
MILLER: Yeah, I think it goes back to the history of conversational AI and the stuff we saw when Facebook first came out with chatbots. Everybody built a chatbot, and 99% of them failed. Why? Well, I have a term for that: care and feeding. They didn’t care for and feed their conversational assistant. They turned something on, and then they just never looked at it, never monitored it, never improved it. They were just like, oh, well, this must just work. And of course, it absolutely did not. The few that succeeded invested in creating a full-service life cycle, thinking about this as, oh, this isn’t just a technology, this is a solution.
And we need to figure out how to do that in a way that makes sense for our organization. Therefore, you’re going to have to monitor it. You’re going to want to create a life cycle of improvement, care and feeding. And so I used to show this picture of a fifth grader because there used to be this Jeff Foxworthy TV show where they would ask fifth-grade questions, and you’d have to answer them. And sometimes people are surprised, like, wow, fifth graders should know that. It’s the same with these conversational assistants. They start as babies, or they’re first graders. You’ve got to mature them. You’ve got to teach them. You’ve got to coach them. And that’s all part of creating a generative AI solution. This concept of fine-tuning, of including additional data, which I’m sure we’ll talk about: Retrieval-Augmented Generation, RAG. All these kinds of things help you create an enterprise solution that is fed with the right data and provides the right results, and you know that because you’ve monitored it. You’ve addressed it. You see where it’s failed, and then you go back and improve it. And I think that’s the biggest mistake: people think they can just throw this stuff over the wall, just turn it on, and it just doesn’t work that way.
CARTY: Yeah, well, you got into RAG. Let’s jump into that. You write quite a bit about Retrieval-Augmented Generation assessments, and those identify problems from the user’s perspective. You also write about metric-driven development, which is sort of a way to iterate on data-driven findings. How do these elements come together to help individuals refine those LLM-based outputs?
MILLER: Yeah, so I’m a big fan of making sure that we say what we’re going to do, we do what we said we were going to do, and then we verify that we did it. Tell them what you’re going to tell them, tell them, then tell them what you told them, in the presentation sense.
So these metrics. There are a lot of metrics out there; I don’t know how much of them we really want to go into. I spent a chapter on them. There are user-experience metrics, and there are the harder-core conversational AI metrics, like answer semantic similarity, these kinds of things. And the importance isn’t what the score is. The importance is understanding how to get them to be better. How to say, OK, this is where we are today, and it might not be very good. There are some great videos from OpenAI that show their life cycle and how they approached solutions and improved them over time. I reference them in the book. In my reference section, I have all the links as QR codes and things like that, so people can get to this content easily, even if they buy the physical book. And it’s really quite compelling because they show you, in those examples, that hey, we might have started at like 50% good, then we were like 70% good, and then we were like 80% good.
And I have this really great graph in the book, that I designed, that talks about, well, how good is good enough? If you use these metrics, and I also go into how to create your own metrics, you can then decide what is good enough. And the funny thing is, some people will say, oh, well, 60% is good enough, or 80% is good enough. Another will say 100% is good enough. I’m like, well, what percentage are your human agents achieving now, if you’re talking about agent support or FAQs or support on, say, some type of social media channel? How good are they now? How accurate are they? Are they 85% accurate, 90% accurate? So you probably can do better.
But just like that agent, the agent who’s 95% accurate, I bet you there’s something unique about him or her compared to the agent that’s 70% accurate. So how can we translate that into a digital assistant agent that, of course, runs 24/7 and can support 1,000 interactions simultaneously? How do you model that? These metrics are what you can use to see that you’re approaching your goals. And of course, you might have this expectation that the closer you get to your goal, the more costly it is to get there. It might take 20% of the effort to get from 70% to 90% success, but 60% of the effort to get from 90% to 97% success. So there’s a cost-benefit ratio to quality. Well, there always has been; cars, anything, it’s the same way. So I think if you look at those metrics, you can use them to drive your success. The engineers can use them. Your product management can use them. And certainly designers and researchers can use them, so that you can make better-informed decisions.
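One of the conversational AI metrics mentioned here, answer semantic similarity, typically reduces to a cosine similarity between embedding vectors of a reference answer and the generated answer. A toy sketch of that calculation follows; the vectors are made up for illustration, since in practice they would come from an embedding model:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors: 1.0 means the
    answers point in the same semantic direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for embeddings of a reference answer and a
# model-generated answer.
reference = [0.9, 0.1, 0.3]
candidate = [0.8, 0.2, 0.4]
print(f"answer semantic similarity: {cosine_similarity(reference, candidate):.3f}")
```

As Miller says, the raw score matters less than tracking whether it trends upward release over release.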
CARTY: Yeah, you mentioned the metrics. Some of the ones you mention in the book include semantic similarity and answer correctness.
MILLER: Right.
CARTY: Definitely make sure you check that out in the book. It’s really well outlined. But it sounds like what’s really more important is that you are improving, for one, and that your team has trust in the metrics that you’re using. Is that correct?
MILLER: Yeah, it’s a good point. There are metrics, and I even go through this in the book, that I struggle with, that I don’t think are on point yet. And I think what we see here is that we have an immature space. I mean, I’ve been doing conversational assistants for about seven years, and some of the metrics only came out this year. And so I saw some examples where, if you run the metric again, because of the way the math works, the answer is not the same. It’s not deterministic. Because if you understand conversational assistants and the way these models are built, the models themselves are not deterministic. What we mean by that is, if you ask the same question again, you’re not going to get the same answer, and that’s just the way they work.

Whereas with the older systems, the conversational systems we’ve had traditionally, like Google and Alexa and the things they were originally based on, if you ask the same question, you’re going to get the same answer, or at least in the same format. Obviously, the data might change. If you ask the weather, it’s going to tell you the new number. But the structure of that sentence is basically a template, and that template was filled in with the data.

Now, with conversational assistants based on generative AI, you can adjust. There are these sliders, effectively, these variables you can set to say, well, how deterministic do you want this answer to be? If you push it to the right, it’s going to tend to pick a higher-likelihood answer to repeat. It’ll give you more consistent answers. But it’s also not as, I don’t want to say human, but it’s going to be not as natural as you might expect. And we want that natural quality, because that also helps build the trust you’re mentioning. Trust is really in the data, but it’s also in the way you say something. If somebody comes up to you and they’re a salesman, and they say, oh, trust me, I’m going to get you the best deal.
What’s your first instinct?
CARTY: Why should I trust you, right?
MILLER: Of course, there’s no way. Of course, this person is not going to be trustworthy. Anybody who tells you they’re trustworthy is probably not. So we have to be careful with that, and that’s just one set of approaches to creating a conversational system that’s trustworthy. And from my research, the thousands and thousands of analytics we’ve gone through, people do tend to trust this stuff. So when these systems lie or create hallucinations, or just go off the rails, as we would say, meaning they start answering questions or saying things that aren’t actually about what you’re talking about, they’re answering something else entirely.
And the user generally doesn’t read. That’s a mantra we talk about: users don’t read. So they don’t notice that the system is actually off track, and it starts answering questions based on something that’s not even in the right ballpark. The answers are wrong because the question was wrong. There’s a lot of this going on in conversational systems, and people complain about it. So these hallucinations and these issues are things that erode trust. The more we can get rid of them, and the more we can keep the user on track, the better the solutions will be.
And I’m pretty bullish on it. There’s a lot of pace and speed in this industry to really improve this kind of stuff. And there are a lot of smart people, a lot of people way smarter than me: the data analysts, the data scientists, and the people behind all these different models. We don’t expect most enterprises to build their own model. It’s just not practical, just like most companies don’t build their own operating system or their own Java language. They use what’s out there, but they can adapt it to be their own. And the thing that interests me in the enterprise space is that these companies have a wealth of data about their customers that is not exposed to these models, not exposed to the internet, and that’s where their value is. So if they can build that trust, and use these metrics and things like that to create a high-quality solution that includes generative AI, they’re going to be able to create a very high return on investment.
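The "sliders" Miller mentioned a few answers back map to sampling parameters such as temperature. The short, self-contained illustration below shows why a low temperature makes output more deterministic; the logit values are invented for the demo:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw token scores into probabilities. Dividing by a small
    temperature sharpens the distribution (the top token dominates, so
    sampling is near-deterministic); a large temperature flattens it,
    giving more varied, natural-sounding output."""
    scaled = [score / temperature for score in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At temperature 0.2 nearly all the probability mass lands on the top token; at 2.0 the three candidates are much closer together, which is the "more natural but less consistent" trade-off described above.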
CARTY: Yeah, a lot of smart people, and a lot of investment was the other thing I was going to say there. But the QA team should be establishing a test matrix for these types of systems, and you write about the value of having different kinds of test suites. Specifically, you write about having in-domain, out-of-domain, random, neighbor, and language test cases for the business areas under test. At a high level, can you give us an idea of what those different test cases cover, and how do you go about finding that right mix?
MILLER: Yeah, the right mix is actually tough to find. I cover it extensively in the book. I adapted this from some friends of mine who were building a test matrix for a conversational assistant that was doing an FAQ kind of thing for their product. They’re like, oh, what platform can I use, and what version of the software can I use, and how do I get it to do this? And they eventually wound up growing their test cases into the hundreds of thousands. There’s a reason why it grows, and of course, this is where automation comes in.
I mean, Applause again, this is why you guys should be talking about this kind of stuff. The automation that comes into play means it doesn’t matter if I have hundreds of thousands of test cases, because the automation will run them for me. But you have to have the right test cases, and I think that’s your point. So in-domain is the idea that if I have a conversational system that covers, say, 10 areas, 10 topics, each of those topics alone should have a set of test cases. So if I’m going to cover something like iPhone battery and screen fixes (we all have phones, they all break screens, they all need batteries), that’s a domain: how to fix your iPhone for performance or a broken screen. You’re going to have a set of questions that are very specific to that area, and you’re going to generate a set of test cases to make sure the system understands those questions.
But if you take something else that has a broken screen, like a TV, and you ask, well, how do I fix my broken screen, you don’t want it answering about an iPhone when you’re really talking about the TV. Let’s take Samsung as an example. Samsung has a phone and a TV: dead pixels on the phone and dead pixels on the TV, broken screen and broken screen. You don’t want those to overlap, or you might wind up with multiple in-domain tests that look very similar. So by creating a matrix out of this, with each domain having its own standalone questions, you can then go across the matrix and ask, well, if I ask the Samsung TV questions in a larger context that includes phones and TVs, how will the system respond? You want it to respond correctly, which means you might need some kind of additional clarification: oh, did you mean a phone, or did you mean a TV? The conversational assistant needs to understand how to disambiguate between these two very common questions. And then you have things like random, which are like, can I mambo dogface to the banana patch?
CARTY: I can’t tell you how many times I’ve asked that question to generative AI.
MILLER: It happens all the time, right? That’s actually Steve Martin, from my favorite album of all time, A Wild and Crazy Guy. It’s the bit about how to teach a kid wrong: you teach the kid the wrong language, and then on the first day of school, he raises his hand and says, can I mambo dogface to the banana patch?
What are you saying? So you want to see how your system responds to random or off-axis questions. And if you think about it as 10 areas, 10 products or areas of interest, going down the left of a spreadsheet, and across the top you have in-domain, out-of-domain, random, and neighbor (neighbor means ask me the Samsung TV question, but in the context of the phone), you wind up with a matrix of a lot of questions. And that’s OK, right? Then you layer in language support, and all of a sudden, all of that gets multiplied. And that’s why this matrix can get so big so quickly.
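The matrix Miller describes is essentially a cross product of business areas, test-case types, and languages, which is why the counts multiply so fast. A brief Python sketch; the domain names and language codes are invented examples:

```python
from itertools import product

# Invented examples of business areas, test-case types, and languages.
domains = ["Samsung phone screen", "Samsung TV screen"]
test_types = ["in-domain", "out-of-domain", "random", "neighbor"]
languages = ["en", "nl"]

# Every (domain, type, language) combination is one cell in the matrix,
# and each cell holds one or more concrete test cases. Adding a single
# language or domain multiplies the total, not just adds to it.
matrix = [
    {"domain": d, "type": t, "language": lang}
    for d, t, lang in product(domains, test_types, languages)
]
print(len(matrix))  # 2 domains x 4 types x 2 languages = 16 cells
```

Scale the same cross product to 10 domains and a handful of languages, and hundreds of thousands of generated test cases stop looking surprising.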
To answer your question about how many, I have some rules of thumb in the book. You’re going to look at, well, how big is my area? Is it a really niche thing where I could only possibly ask five questions? Then even if you ask ChatGPT to synthetically generate other questions to build into your test cases, they might all look the same. It’s a very small area, so you’ll have fewer test cases.
But overall, when we’re talking about businesses, generally speaking, there’s a lot going on. I’ll give you one brief example, if you want. There was a company that has a hotel, I think in the Netherlands, and they said, oh, we’re going to build a conversational system. There are 25 things that a customer would ask when they call the front desk. And of course, the front desk gets busy in the morning because everybody’s checking out. So they wanted to create a conversational system that would handle these kinds of questions, like, can I get a late checkout? Or, what time is checkout? Or, what time is breakfast? These common questions. Well, it turns out that by the time they were done, they had over 500 topics. Not 25, 500. So their perception of what was in the data was off by more than an order of magnitude.
And I go into that extensively in the first few chapters of the book. If you do your user research, for example, if you look at existing chat logs or the types of questions customers ask through service requests, or you go into social media, you can get a very good idea of what’s going on, and then you can start prioritizing that backlog, if you will: what kind of data do we need to answer these questions? What kind of APIs do we need so that we can get the right answers for the right model, for the right version, so that we don’t give them the wrong answer? If you want to reboot a Mac, it’s different depending on the kind of Mac you have. So that creates this wealth of test cases that you’d want to have. And the worst part is that when you go to a new model, there’s no guarantee that it’s backward compatible. So having all these test cases in place is really important, because you might not be at the level you think you’re going to be when you go from 3.5 to 4.0 in a model like OpenAI’s.
So a lot depends on good test cases. And the other part about test cases that I think is important: it’s not just about the QA engineer anymore. English might not be the QA engineer’s first language, and they might speak a language that’s useful for that language matrix. But who’s going to be able to interpret the quality of those results? Because it’s not black and white. It’s not a simple yes or no, did it pass the test case. How do you judge that? You have to read the results. You have to see how it responded. Because again, we said it’s not deterministic. So the team has to be more than just the QA engineer who’s really good with the automation and the systems. It’s got to include the linguists and the writers and the product managers and the designers, contributing to what good answers to those test metrics look like.
CARTY: This kind of dovetails a little bit into the dynamic nature of generative-AI technology and the demands of the business. Those things are kind of fluid. Throughout the book, you encourage the reader, perhaps most importantly, to continue to learn and evolve. Tell us why this is so important in this area, especially within the context of design thinking?
MILLER: Yeah, OK, it’s a good question. So if you read a newspaper from 50 years ago, you can read it fine. Now, 300 years ago might be a little more of a challenge. Language does evolve. But humans have not really evolved. My response rate, my ability to react to something, my visual system, my brain, the way that I remember things, none of that has changed during the life cycle of the computer era. We just don’t evolve that fast. But what has changed is the speed at which technology moves, the speed at which change now occurs. I mean, we get new iPhones every year. Some are more innovative than others.
Now, generative AI, you could say the dawn of that was basically two years ago with OpenAI’s model release. Granted, we’d been working in the field for years before that, but that was really when it stepped into the limelight. So when you think about that and all the channels involved, the voice channels, the GUI channels, the text channels, we as design-thinking people need to take the stuff we started this conversation with, the historical underpinnings of interaction design, and apply them. The psychology, the human factors, the linguistics, the product-design expertise: apply them now. ChatGPT is not very good with novel solutions. It can’t invent things, not very well. That’s still on us.
So we need to evolve our understanding of how people interact. And the thing is, in our space, the design space, there are people who come from interaction design and psychology and human factors, and then there are a lot of designers who don’t. So I think it behooves us to get those designers to understand the psychological underpinnings and the human factors of humans, so that they can adapt their solutions to be more functional, usable, necessary, and engaging. And I say it in that order because in my book, that’s one of my little mantras. It’s F-U-N-E, funny; that’s how I remember it. Because we want functionality: it’s got to do the things we want. We want it to be usable; we’ve learned that from the iPhone. We want products that we can connect with. But it has to be the necessary features. Don’t give me stuff I don’t need that gets in my way. And of course, engagement is this thing about color and style and feeling and tone. We do a lot of that with conversational AI, like creating the right tone for a conversational system. If you’re a bank, you’re going to have a very different tone than if you’re a surf shop: dude, let’s hang 10. And we want that connection in our conversational systems when they’re interacting like that. So yeah, I think there’s still a lot for us to learn and evolve and adapt to the current technology.
CARTY: Especially when you're talking usability across these generative AI platforms, with the massive, rapid adoption they've all had. They're going to be used by a wide range of people, and they all have to offer some measure of usability. But I want to get into the final question here for you, Richard.
MILLER: OK.
CARTY: Your career has spanned engineering, UI, and conversational design, among many other areas. Over the long term, how do you see AI, and Gen AI in particular, disrupting those disciplines? What’s a realistic assessment there?
MILLER: Yeah, I mean, we all want to pontificate on the future. I once spent the day with the famous author Alvin Toffler, who wrote Future Shock and about his waves of innovation and things like that. And I think when you write a book like this, you try not to... I tried not to write the book so that it would be out of date the next day. I wrote it around generalizations, things that we can apply two, three, five years down the road. And that was hard to do, because I had to stay away from certain kinds of decisions or things I would execute on, because they're going to be out of date.
And so what's coming next? It's hard for me to say. I think that we still need to get past this chasm of chaos and the crap that's out there, and there's plenty of that. I mean, I just think there's a wealth of dead soldiers on the side of the road, so to speak, if you take that concept. What's going to be left, though, are things that are functional and will grow and mature over time and will keep you on track. I don't think there's one good answer here, but you mentioned something a minute ago that I'd like to bring into this part of the conversation: different types of audiences. We know that conversational assistants can have biases because the data is biased, and we try to get that out of the data. And even enterprise data can be biased. I don't necessarily mean social bias, against African-Americans or Chinese people; the data can just be biased. It could be biased against another product, for example, against a third party, because it was written up that way. And then all of a sudden, the system thinks, oh, I should say bad things about that product.
But I would also focus on things like the age of the adults: who's using this product? Is it for kids? Then we have to be more careful. Is it for younger adults? They're probably more flexible. Can we get it to the older adults? When older adults start to experience this stuff and use it effectively, I think that's when things have arrived. When somebody who's not naturally going to use this kind of technology can use it, then it's arrived. Many people in their 60s and 70s now use iPhones. My dad was 90-something and never quite got it, but his wife, my stepmom, uses her iPhone. She texts us regularly. So when you get that level of adoption, then you're going to see more disruption, because it's going to permeate far more parts of life.
And I think ChatGPT, and AI in general, is going to be everywhere. Some areas are going to be harder to get there; it'll take more time. There are certain areas where we want to be more careful and more cautious. But I think there's a bigger return on investment when you go broader, and I think it's going to be disruptive everywhere. So I'm hopeful that we will make the right decisions.
CARTY: All right, Richard, lightning round questions for you today. First, what is your definition of digital quality?
MILLER: Does it meet the user's expectations? If it meets the expectations, then we have quality. Do we want higher quality? Great, let's keep going.
CARTY: What is one digital quality trend that you find promising?
MILLER: User metric-driven development. We talked a lot about that today. I would even settle for something as simple as a net promoter score. If I had to start with something really trivial, it's that one; some people probably know it. It's that one question that asks: how likely are you to recommend this service or product to your friends? It's on a 0 to 10 scale. That's a start. Some people don't think it's enough, and I can agree with that, but at least it's a start. Just getting to data-driven decisions, I think, is a big deal now.
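[Editor's note: The Net Promoter Score Miller mentions is straightforward to compute. Respondents answer on a 0 to 10 scale; 9s and 10s count as promoters, 0 through 6 as detractors, and the score is the percentage of promoters minus the percentage of detractors. A minimal Python sketch, with illustrative sample responses:]

```python
def net_promoter_score(scores):
    """Compute NPS from 0-10 survey responses.

    Promoters score 9-10, passives 7-8, detractors 0-6.
    NPS = %promoters - %detractors, so it ranges from -100 to +100.
    """
    if not scores:
        raise ValueError("need at least one response")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)

# Ten hypothetical survey responses: 5 promoters, 2 detractors
responses = [10, 9, 9, 8, 7, 7, 6, 5, 9, 10]
print(net_promoter_score(responses))  # -> 30.0
```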
CARTY: What is your favorite app to use in your downtime?
MILLER: Well, I kind of mentioned Tinkercad and Prusa 3D earlier, so I don't want to belabor that. I am a word person now because we write copy and talk about style and tone. So I actually play a lot of word games. I don't play 3D shooter games and things like that. But I play Words with Friends and Wordle and even the LinkedIn games. Anything with words, I like to play those kinds of things. And that's my 10 minutes of downtime a day.
CARTY: Finally, Richard, what is something you are hopeful for?
MILLER: I want to see more successes and fewer failures. I want to be able to read The New York Times and not have them discuss all the crappy AI by the side of the road. I want the rhetoric to die down. And I want us to get onto that upward curve of successful implementations.
CARTY: Well, Richard, this has been a lot of fun. Thank you so much, and congratulations on the book.
MILLER: Thank you so much. It was fun talking with you today.