Mark Lippett CEO of XMOS on the Chips that Make Voice Assistants Work - Voicebot Podcast Ep 87


This is episode number eighty-seven of the Voicebot Podcast. Today's guest: Mark Lippett, CEO of XMOS. Welcome back, Voicebot nation. We have a great guest today. Mark Lippett is CEO of XMOS. They make the chips and embedded software that enable voice interactions on a wide variety of devices, ranging from smart speakers to smart TVs and remote controls. We often focus in this show on voice assistants and what they can do. Today we learn about the enabling layer and the role that hardware plays. I think you're really going to like it. Before we get to Mark, I have a quick listener shout-out. Shawn Weathers from Jargon left us a five-star review in the Alexa skill store for our flash briefing called Voicebot Daily. He says, and I quote: I listen to the Voicebot Podcast religiously and love the insights it brings from conversations with our friends in the voice community. Voicebot Daily is a great supplement to that. It's an excellent way to quickly get the voice industry news I need to know to start my day. No other source provides this convenience and value to me. Great job. End quote. There you have it. You Alexa fans can add Voicebot Daily as a flash briefing and get a daily one-minute summary of the top story of the day. You can also just say anytime, Alexa, or Hey Google, launch Voicebot Says. You'll hear the same content, and you can hear it in sequence. So check both of those out. But in particular, if you use flash briefings, you should set that up, because it's really quick and you get information that'll be useful to you. Thank you so much, Shawn. I appreciate your review and the comments. If you, the listener today, leave us a review in the Alexa skill store, the Google Action directory, or on Apple Podcasts, you too may get a shout-out on the Voicebot Podcast. Okay.
It's time for our interview with Mark Lippett of XMOS. Mark Lippett joined XMOS in 2006 as vice president of engineering, was later promoted to COO, and then became CEO about three years ago. Before XMOS, Mark was CTO and co-founder of silicon IP and embedded software provider Ignios. Earlier in his career, he was a network systems engineer at Texas Instruments. Mark earned an MBA from Henley Management College and a master's in electrical and electronic engineering at the University of Surrey. Mark Lippett, welcome to the Voicebot Podcast.
Thanks very much for having me.
Well, I'm glad to have you. We don't normally talk about hardware. We have a few times, but we tend to talk more about software and the big voice assistants, and then design and a lot of different things. So I was really intrigued when I saw some news coverage of XMOS around CES, and I thought, wow, we're long overdue to have a good hardware and systems-level discussion for Voicebot listeners.
Okay. Well, glad to oblige.
Great. So I think a good place to start is XMOS. Let's talk about what the company does and how you came to work there.
Sure. Well, this is going a way back, as you say, to 2006. I exited my company Ignios, which you mentioned in the intro there. I had a summer off, during which the then CEO of XMOS got in contact with me, looking for somebody to run the engineering team. The company was really small. I was employee number six. I'd known the CEO for many years. We were good friends. I was very happy to join. It was also an opportunity to work with David May, who's one of the prominent computer scientists of our era. So I was very, very happy to work very closely with him in the early days too. So around 2006, I guess Q3 2006, I joined the company. The company was Series A funded in 2007. As you mentioned, we're a hardware company.
Our business model is to sell silicon chips and software, and development of silicon chips is somewhat capital intensive and takes a little bit of time. We were out in the market in around 2009 with our novel microcontroller architecture. The problem that we were solving was that consumer electronics designers in particular were very constrained by the platforms that were available to them. Firstly, they were very cost constrained in general, as we all know. To some extent those cost constraints had driven them into a world in which they could really only build me-too products. So they'd been forced into a situation where they couldn't really differentiate their end products. And so we had a solution, a novel microcontroller architecture, that allowed those kinds of engineers to continue developing C-based solutions but to create really, really differentiated products right from the start. They could actually write I/O protocols in software and change the hardware behavior of their designs, while still writing control software and DSP for the high-level application. So we went to market and almost immediately got involved in the audio space. It was around the time that FireWire was being dropped from MacBooks, and the prosumer audio business was heavily dependent on MacBooks and consequently on FireWire. Apple was suggesting that that community needed to use USB Audio 2.0. There wasn't an available chipset for that. But we had this platform technology that was well suited to a fast spin for a new kind of interface class. We were in the market in a number of months, and the rest is really history.
We've dominated that USB multichannel business pretty much ever since. Alongside that, we obviously built significant audio expertise, during which time we observed the emergence of voice. Around about 2014 we started to see some of our customers using the solution for arrays of microphones, so taking the software that we'd written for the platform but modifying it to interface to numerous microphones. And it was, I think, two or three months later that Amazon launched the Echo, and we decided at that point that the audio space that we'd been in had just grown by a factor of ten or a hundred, and we really have been doubling down on voice ever since.
So what percentage of your current business is voice related?
So it's between ten and twenty percent of our current business. But it's comfortably growing. The growth rate is very strong right now. We've had revenue in the voice space, so this is selling those chips with voice-specific firmware embedded within them. We've been doing that since 2017, we grew substantially in 2018, and we expect to grow substantially again in that part of the business this year and beyond.
And what about audio in general? What percent of your business is audio based?
So it does vary a little. The audio business that we're in, as I mentioned, is the prosumer space. It's quite a mature market. It's very demanding. And so we've built a reputation for quality in the years that we've been playing in that space, which is paying dividends in the voice space. But it's eighty percent, I would say.
Okay. Yeah. So really significant. So between the two, that's most of your business.
Correct. Yeah. We have some. So the platform that we developed back in the early years is almost entirely general purpose, aside from the sort of performance and cost footprint.
It's not specifically dedicated to audio. It can do many, many other things. So we have a very active community of users using the technology for all sorts of things, from toys to robots. But as far as our own go-to-market emphasis is concerned, it's really all about audio and increasingly about voice.
Right. So I think it's correct to say you're a fabless semiconductor company.
That's right.
And so explain to the Voicebot listeners: what does that mean?
Well, what that means is that we don't own manufacturing capabilities. So I wear a somewhat wry smile when I'm saying this, but if you're being successful as a fabless semiconductor company, you rarely ever see your products. We ask a fab, a chip fab, to manufacture wafers for us using our design. We ask them to ship those wafers to a packaging company that packages and tests our chips. We then send those chips to a logistics company, which is another third party. And then we instruct that logistics company to send those chips to customers. So we orchestrate the entire supply chain at arm's length, and what that means for us as a business is that we trade primarily in intellectual property, and it's primarily engineers in the business.
Right. So in this case, your customers actually do buy chips from you. But you don't manufacture them; you send the design to somebody else and you coordinate that the product will be shipped to your customer.
That's right. Yeah. So those chips get shipped with XMOS written on the lid.
Right. And the embedded software that you supply: is your only software the software that runs on the chip and is shipped with the chip, or do you have other embedded software that you sell as an add-on?
Typically, we can do either. But the vast majority of the software that forms part of our product is software that runs on the XMOS chips themselves. We have in the past licensed technology that might, for example, run on the laptop.
So, you know, USB multichannel hosts, for example. But primarily our IP, our expertise, is really about deeply embedded technologies in the end product.
And so what are some examples of products that XMOS technology is in, out in the market today?
Okay. So looking at the voice space in particular, there's a fantastic soundbar that we're in. There is the Freebox Delta, if you've seen that, which was released a few months ago. It's a convergence product: a set-top box, a home hub, and a wireless speaker. It's being marketed to the French market at the moment, primarily, although they have plans to expand. Just picking another category at random, you may have seen the Pillo Health home wellness robot that's being marketed by Black & Decker. That has an XMOS voice solution in it. We have a TV accessory with Skyworth in China with an XMOS voice solution, and numerous conferencing products, in China as well. So we have many, actually many products already in the field. They've quite recently been introduced, over the last six to nine months.
Okay. Yeah. And we have written about Pillo Health and Black & Decker's Pria product, which is the exact same thing.
Exactly.
Which Black & Decker was upset that we pointed out, that it was just basically a new label on the exact same thing.
Ridiculous. It's the same, isn't it?
It's a perfectly fine strategy to do that, because they're focused on other things, and Black & Decker has had sort of a mixed history in terms of acquisitions and building products, so it was a good way for them to get into that market. But in any event, we digress. So just sort of at the high level, you know, we're talking about you getting into voice and some of these other applications and devices. Has it surprised you how popular smart speakers have become?
Yes, it has.
I think it's one of those technologies where, whilst in the early days, at least, the products really only opened the door a crack, and there was lots of potential for improvement in those early products, looking beyond that there was clearly a tremendous amount of potential. I think what's been most surprising is the rate of adoption, not the fact that they're selling millions of products. That was always going to happen. But it's happened over a comparatively short space of time, and it's really grabbed the public's mindshare. Back in the day, back in 2014, part of the pitch I would take around companies would be, you know, have you heard of this voice thing? It's pretty important. It's coming. You need to be prepared for it. That's a part of my presentation I absolutely don't need to give anymore. It's really more about where voice is going to be found next. And is it just going to be voice, or is it going to be voice plus contextual awareness, and so on and so forth. So it's been a tremendous help to us, really, that the adoption has been as quick as it has. I still think that the hype is bigger than the market right now. But if anything, we're seeing that turn around, and we're really starting to see the market start to arrive.
Just on the smart speaker side, I've got some new data out, or we'll have it out shortly, by the time this publishes, that the adoption rate in the US is a little over twenty-six percent of US adults. So sixty-six million is the size of that. And that does not necessarily include other devices: set-top boxes, soundbars, things like that. Most people don't characterize those as smart speakers, so the universe is even slightly broader than that. And what I see there is that the UK and Germany are a little bit behind that.
A couple of other markets are actually going to jump ahead of the US this year, which I thought was interesting. But yeah, really, really popular. I think it's interesting for someone like you to say that you never questioned it, but when it happened you were surprised at the pace of adoption. And I wonder what your reaction would be if I said that the rate of adoption of voice in general right now has been catalyzed more by the introduction of far-field voice than it has by recent advances in ASR.
That viewpoint is one that we share. I'm actually seeing some data behind that. One of our customers has a number of products in the market, across the range of cost points: with push-to-talk, with near-field microphones, and with far-field microphones. Their indication has been that the far-field microphones are being used by more individuals within a family context, and more frequently, than the push-to-talk and near-field variants. So I'm not in the tiniest bit surprised by that. I think in particular push-to-talk sort of misses the point. I mean, if you're looking at something like a smart TV, for example, the point of far-field voice is that you don't need the remote control anymore. As a father of six children, you know, I can never find the remote controls, and that's one of my pet peeves as far as voice is concerned. Having to find the remote control to push the button and talk into it doesn't really cut it for me. So versus push-to-talk, I think it's very clear that far-field is very important. Now, having said that, if a company's adopting a voice interface, I think push-to-talk is a sensible interim step, because it's not just about the bit that we do. It's not just about the voice isolation and the contextual awareness. There's a tremendous amount of effort on the user interface, which shouldn't be underestimated. Obviously, push-to-talk gives you an opportunity to get that nailed before you add the far-field.
So I can see why people are taking that route. But really it's about far-field to really deliver on the potential of the smart voice interfaces that we work on.
And even the Amazon Echo offers push-to-talk as an alternative, which is useful particularly in testing environments. It actually makes me think of my friend Ahmed Bouzid, who runs a software development company in the States called Witlingo, and he wrote a book a couple of years ago called Don't Make Me Tap. You know, this whole idea of far-field. We think about it: ASR definitely got a lot better, and the NLU has gotten much, much better. I think everyone recognizes that those were critical developments. But I think there's something else going on here in the explosion in the use of voice, the next step-level change after people had gotten accustomed to Siri as sort of something in the background. They were using it, but they were only using it narrowly. It's this idea that smart speakers actually became training devices to reintroduce people to voice, to introduce to them the broad set of use cases that they had never even considered. And so I look at last year at IFA, which for US listeners is sort of the CES equivalent for Europe. If you had a high-end speaker product (the low end is a little bit different) at IFA and you didn't have voice, there was really nothing to talk about. It essentially went within two years from an interesting novelty to a must-have feature for anybody in the audio space for this ambient audio type of solution.
Sure. The way we characterize the emergence of voice is in three phases. I think the speaker is definitely, obviously, the entry product.
And as you say, it's almost a must-have; it's table stakes for many, many wireless speaker categories. Adding voice interfaces to existing categories is really, for me at least, the first phase of adoption. And whilst that may not be proliferating yet in the market, we're at least seeing things like TVs adopting voice interfaces, soundbars with voice interfaces, and increasingly convergence products, some combination of TV, soundbar and set-top box, all including voice interfaces. So we're seeing that in the sort of design funnel, if you like. And we're starting to see phase two, which I think is characterized by somewhat more entrepreneurial ideas, where you've got power and potentially you've got Wi-Fi, adding voice interfaces to other things. We have a light switch design, for example, with a voice interface. We have a sunshade, would you believe, by ShadeCraft, that has XMOS in it.
We wrote about that. That was you?
Yes. Yeah. Who knew? You know, it's one of those things, right? When you think about it, it's potentially, obviously, entertainment, but also, if you think about the hospitality industry, it's a way of ordering a drink. You don't want to get up from your sun lounger; you just want to get a gin and tonic every half an hour, say, or whatever it is. So there are various different categories that are starting to introduce this kind of technology. And then phase three, I think, is what I would describe as ambient: voice is everywhere. There's ambient technology everywhere. There's so much technology that the only way of dealing with it is by interacting using a natural language interface. And at that point, I think the whole consumer electronics space starts to ask the question that the speaker people are currently asking, which is: well, how could we exist without a voice interface?
What else is there to sell our products? At that point, you're talking about an enormous market. So, pretty exciting.
Absolutely. So I would add into phase two, the way I think about this, I sort of go beyond ambient. I think it's starting to be a catalyst for near-field mics again, in headphones, earbuds, and hearing aids, all getting voice assistants. And so when we talk about that proliferation, are you seeing that as well for near-field, in the in-ear and over-ear devices?
Yeah, that's not so much our space at the moment, but certainly in the next twelve to eighteen months we'll start to look more closely at the wearable side. It's certainly a market that we brush up against regularly. But the space we're in right now is mostly tethered products, where we hang on the end of a USB cable or we have power of some other description. Our flexibility gives us certain capabilities in smart home and smart devices, whereas wearable devices have a slightly different set of requirements, and primarily I'm talking here about delivering these kinds of processing loads at very, very low power. At the moment, we don't have a platform for that. But by this time next year, we will.
Got it. Yeah, low power and low processing overhead are important because of battery life. How do you see that playing out over time? I think there was a lot of discussion a couple of years ago, largely by people who were skeptical about smart speakers and ambient voice, that, you know, we're going to have the phone, the watch, the earbuds. Voice is going to be a personal tool that people are going to use through their wearables and the devices that they have, as opposed to in the places they are. How do you see this playing out over time? What's going to be the general breakdown of usage in those different scenarios?
So I think it's important to be able to walk between, for example, your home and your car and continue to maintain a dialogue with your voice assistant. That would suggest something like an earbud or a phone or some wearable would be the appropriate medium for that. But I think there's a place for all of it, to be honest. I think there'll be some use cases that will be appropriate for mobile phones. I can't remember where I saw some statistics about how many people tend to ditch the mobile phone as soon as they walk into their home. They have some psychological attachment between mobile phone and work, and they distance themselves from it. Similarly, in bedrooms, bathrooms, and so on, people don't tend to be wearing electronics, don't tend to be wearing wearables. So I think there are certain scenarios where, for example, a static smart home device is appropriate, and then there are certain scenarios where things like wearables and phones are more appropriate. So I don't really see the dominance of any one of those categories. I think there'll be, really in the same way as there is today, a mix of solutions. But the important thing will be the continuity of the user experience: you can move between those spaces and those devices without encountering vastly different characteristics in the voice interface.
Yeah, that's great. Okay. So let's transition a little bit. I want to talk about some of the technical aspects of what you do, because you have a technical background, and I like to take advantage of that when that's an option. I'm going to read something from your website, which I don't normally do that often, quoting marketing materials, but I thought this could help spur some discussion of the engineering that goes into making far-field voice recognition work. Here's what it said. So today, we detect voice commands accurately from across the room, even in busy environments when the person's speaking softly. There are significant challenges in doing that.
For example, the output of the device needs to adapt to the acoustic environment. Soft furnishings absorb noise, whereas hard surfaces reflect sound, bouncing it around the room. And of course, the user may be moving around the room while talking, altering the quality of the voice feed. Added to that, there'll be a range of background noise, and the voice-enabled device itself may already be playing music. So there's a lot in that statement. That's why I thought, could you break that down for us a little bit? Just talk about the technology that goes into what XMOS does for these devices around voice.
Yeah, sure. So the North Star for a company like us is to create a boom microphone experience. Imagine you've got a room full of people and they're all wearing their own boom microphones, a microphone directly under the nose, and all the other noise is cancelled by virtue of the proximity of that microphone. So that's what we try to shoot for, and the environment conspires against us in many, many ways, and to some extent the consumer electronics devices themselves conspire against us. So the first thing is, obviously, in any environment there is potentially noise. I'm sat here in a meeting room with an air conditioning unit above my head. That's a fairly diffuse noise, but it's pretty noisy nonetheless. And that noise needs to be cancelled to give us that boom microphone experience. The second type of noise that we might experience is a point noise source, so a distractor, something that will distract us. For example, say I'm speaking to my Amazon Echo unit, but I'm actually listening to the TV in the kitchen. The Amazon Echo would need to deal with the fact that there is a persistent point noise source, which it needs to eliminate in order to hear that voice with that virtual boom microphone. So that's a point noise source. So we've got diffuse noise.
How do you eliminate that? Is that really just frequency range?
There are various ways of doing it. One way is to actively direct the microphone. So we talk a lot about beamforming microphones and sound source separation. A beamforming microphone is a bit like shining a torch in the direction of the sound that you want to listen to. So you can imagine a sonic torch that picks up noise from a particular direction.
A flashlight, for our US listeners.
Yeah, flashlight. So that can help, obviously, because it gives you a comparatively high gain in the direction of the flashlight, so by definition everything else is attenuated. There's another technique, which is almost the inverse of that, which is an interference canceller, where every other direction gets a uniform gain except the direction of the interfering sound source, which actually gets attenuated significantly. So that's like an inverse of that flashlight concept. Then there's a third way, which is sound source separation. In my history at the beginning of this podcast, I didn't mention that between 2016 and 2018 we acquired a company, Boston-based Setem Technologies, and we acquired that company because they had a really, really exciting sound source separation technology, which allows individuals essentially to be picked out in a room and listened to individually. So we could pick out three or four individuals in a noisy space and simply listen to their voices as though they were wearing this boom microphone. That uses sound source separation, which is a different technique. It basically breaks the soundscape down into individual pieces, figures out which pieces are related to one another, and then puts them back together according to the source, eliminating sources that you don't want and amplifying those that you do.
So similar to human hearing, right? I mean, because humans can do this. Yeah.
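As an editorial aside, the torch idea described above can be sketched with the simplest textbook beamformer, delay-and-sum. This is not XMOS's implementation, which the interview does not describe; the function names, the whole-sample delays, and the toy signals below are all assumptions chosen so the arithmetic is easy to follow.

```python
import math

def delay_and_sum(mic_signals, steering_delays):
    """Average the microphone signals after undoing per-microphone arrival delays.

    mic_signals: list of equal-length sample lists, one per microphone.
    steering_delays: arrival delay of the wanted source at each microphone,
    in whole samples (hypothetical values; a real array would derive them
    from its geometry and the look direction).
    """
    n = len(mic_signals[0])
    out = [0.0] * n
    for signal, delay in zip(mic_signals, steering_delays):
        for t in range(n):
            src = t + delay  # advance the signal to undo its arrival delay
            if 0 <= src < n:
                out[t] += signal[src]
    return [v / len(mic_signals) for v in out]

def delayed(signal, delay):
    """Return the signal delayed by `delay` samples, zero-padded at the start."""
    return [0.0] * delay + signal[: len(signal) - delay]

def rms(samples):
    return math.sqrt(sum(v * v for v in samples) / len(samples))

# Toy scene (assumed): a wanted tone and an interfering tone reach four
# microphones with different delay patterns, i.e. from different directions.
n = 400
voice = [math.sin(2 * math.pi * t / 20) for t in range(n)]
noise = [math.sin(2 * math.pi * t / 8) for t in range(n)]
voice_delays = [0, 1, 2, 3]
noise_delays = [0, 3, 6, 9]
mics = [
    [v + w for v, w in zip(delayed(voice, dv), delayed(noise, dn))]
    for dv, dn in zip(voice_delays, noise_delays)
]

beam = delay_and_sum(mics, voice_delays)  # point the "torch" at the voice
middle = slice(20, n - 20)                # ignore zero-padded edges
residual = [b - v for b, v in zip(beam, voice)]
print("single mic RMS:", rms(mics[0][middle]))
print("beam minus voice RMS:", rms(residual[middle]))
```

In this contrived geometry the interfering tone lands a quarter period apart on each steered channel, so it cancels almost completely while the wanted tone adds up in phase. Real rooms are far less tidy, which is why the interview goes on to mention interference cancellers and source separation as further techniques.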
I mean, that's one of the challenges when you're demoing this stuff. We were at Mobile World Congress last week, and the ambient noise in the room that we were in was around seventy-five dB. So it was a bit like constantly having a truck next to you with its engine running. But humans are really, really good at this. That's one of the challenges of demoing this stuff: you've got human beings in front of you saying, hey, you know, this stuff's not so hard. To do it with signal processing techniques is a significant challenge. So that's one of the perceptions that we sometimes face: I can do this, so it can't be that difficult.
That's interesting.
So actually, just to come back to your previous question about sources of noise, the one thing that we didn't touch on was self-noise, the fact that most of these units make their own noise. Probably music, in the case of a speaker. Of course, we have to eliminate that. And that's called barge-in. Everybody's familiar with barge-in even if they don't know it by that name. It allows you to interrupt smart speakers whilst they're playing music, and that's a significant challenge because you have to eliminate that music, in its original form but also all of the reflections that hit all of the surfaces in the room, and that's called an echo canceller. And that's where a lot of the horsepower gets used up in these solutions.
So barge-in is the biggest challenge?
I would say it's one of the biggest challenges. I think that barge-in is a technology whose cost is proportional to the number of output channels.
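As an editorial aside, the echo canceller idea can be sketched with a normalized LMS (NLMS) adaptive filter, a standard textbook approach. The interview does not say which algorithm XMOS uses, so treat this purely as an illustration; the function name, the tap count, the step size, and the toy room response are assumptions.

```python
import math
import random

def nlms_echo_cancel(reference, mic, taps=8, step=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the loudspeaker echo from the mic signal.

    reference: samples sent to the loudspeaker (the music being played).
    mic: samples captured by the microphone (echo of the music, plus any speech).
    Returns the cleaned signal and the learned filter weights.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    cleaned = []
    for x, d in zip(reference, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate  # what remains once the estimated echo is removed
        power = sum(h * h for h in history) + eps
        weights = [w + step * error * h / power for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned, weights

# Toy scene (assumed): white-noise "music" and a short, made-up room response
# that includes the direct path and two reflections.
rng = random.Random(0)
music = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
room = [0.6, 0.0, 0.3, -0.2]
mic = [
    sum(room[k] * music[t - k] for k in range(len(room)) if t - k >= 0)
    for t in range(len(music))
]

cleaned, weights = nlms_echo_cancel(music, mic)

def rms(samples):
    return math.sqrt(sum(v * v for v in samples) / len(samples))

print("echo RMS (last 500 samples):", rms(mic[-500:]))
print("residual RMS (last 500 samples):", rms(cleaned[-500:]))
```

The sketch covers one loudspeaker channel with nobody talking. A product also has to cope with speech arriving while the filter adapts, and, following the point made in the interview, would need a filter like this for every output channel, which is one reason the processing cost grows with the channel count.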
So if you've got a stereo image that you project into the room, you have to have echo cancellers on each one of those two channels, and obviously if you're talking about a surround sound system, where you might have 7.1 emitting into the room, then in theory you have to echo cancel all of those channels to get a good user experience. So it becomes quite a significant problem as you scale the number of channels. Yes, it's one of the key technologies, but it typically takes three, four or five separate techniques to really, really clean up an audio signal, a voice signal that's been recorded in a reverberant, noisy room.
Is high-range or low-range frequency more difficult?
I'm not sure I'm qualified to answer that question. But typically low-range frequency isn't very directional. We tend to find that to some extent you can ignore the low-range frequencies, and the ASR will be tolerant to us doing that. So for example, beamforming would struggle to eliminate low-range noise because it would appear to come from every direction. So you can use techniques that eliminate those low frequencies before sending the voice to the speech recognition back ends. But it's slightly beyond my expertise to comment much on that.
Well, despite your protestations, you did answer the question, so we'll take it. All right. So let's talk about on-device versus cloud speech recognition. I don't know if you mentioned it today, but I know you've done work with AVS, Alexa Voice Service. It's clearly cloud-based, and maybe some others as well. I think on your website or in some of your material you talk about KITT.AI for local. How do you think about those in terms of your system, and how do you work with the ASR?
It's a really good question. I think there are three cases to consider. Obviously there's the cloud-based case, which we're all very familiar with.
There's a hybrid, which has some ASR capability local and some in the cloud. The local capability may operate offline, so when there's no cloud connection, or it may operate as a fast responder to well-known queries, and it will defer to the cloud for things that it can't address locally. And then there's purely offline, which has some use cases, depending on what sort of technology you deploy. If you put some sort of heavy-iron processing offline, then obviously you're addressing any privacy concerns you might have about speech going into the cloud. Or you might use a somewhat less capable processing device but just have a very small dictionary. That is perfectly reasonable for something like a light switch, for example. I think the problem comes when you transition. I don't think I've heard a convincing case for a quality user experience where you flip between a small-dictionary local implementation and a very rich natural user interface online. I think that managing the user experience through that transition is problematic. So I think that's one limitation of this hybrid mixing of online and offline.
Right now, most of the hybrid is being used for wake word, right? So a wake word locally, and maybe you also verify it in the cloud, but think of wake word locally, and then all the other queries are cloud. That seems to be a pretty good use case. I'm not familiar with anybody who's doing... well, actually, I do know somebody who's doing it. I think it's fair to say that some of the work that Mercedes does in the car does do this type of negotiation between what types of queries it's going to handle locally versus which it's going to send off to the cloud.
Yeah, I think that is a great example of a use case which is quite safety conscious. So, you know, having a fully online experience
He's very limiting, and tens of what it can be useful in automated context do need some local processing capability, so he's a good use case the other one that I've heard Cullman league quoted as if you're if you will in the presence of a robot the domestic Roble or a service Roble in store, for example, and you shout stall, you don't want to go to the cloud to determine what you actually meant by that camman you wanted to actually to stop moving. So there are certain cases around security. I think where I think it makes sense to have a very school dictionary. But the transition between the flying on the online dictionary has to be carefully managed to thing. Yeah. Yeah. I do that. That's that would take a lot of attention. I could manage imagine over time. So what is your expectation going for in terms of cloud based versus offline and local do expect the vast majority of the solutions to really focus on this cloud base solution because of the robustness of it or do you see a future where we're gonna have so many of these devices with voice interaction, many of which you don't want to always have to have wifi connection or persistent or that maybe that's not necessary because narrow vocabulary is going to be more important where you thinking about this in terms of how it's gonna play out local versus cloud. So I think first of all I think they'll be some countries that will just have local dictionaries. But that kind of you know, they'll be limited in scope, they'll be they'll have a limited amounts of functionality on probably below end of the mall kit things, I'd like switches, and so on in terms of connected components. I think there's two architectures that the will prevail. One is. Weather's local intelligence. So there's some sort of a hub and amongst out customer base was seeing competition for ownership of that that entity. So that could potentially run I SA locally it could run a off-line ISR reasonably ritual. 
Flonase ARIN deferred to the cloud on certain occasions. I think that's valid architecture where you've got sort of Central Intelligence in the home, potentially in the car. And then this the case where you get what you devote to the cloud all the time. And that's really the model that we're in now. And I think that's okay. And it depends certain geographies feel better about that than others. I think in the west we tend to concern us a little bit more with security in our experience in China is less concerns. And consequently, the always connected WI fi abled everything is more of a tractable solution for them. But I think today's market is really all about everything connected to the cloud as say this in the customer base now designed funnel. We starting to see hybrid products. That are starting to try to dominate the small hob- space. But that's that's yet to shake out. I think that makes sense now I have had previous gas talk about China and India particular when you get outside the large cities where the the lower reliability of broadband connections has potential to undermine some of these in Home Solutions, and that's why they're focused more on LT EBay solutions, if they're going to do cloud and everything else might have more local processing because if the natural wifi. But if the internet connection is down than there. Centrally nonfunctional, and I even had a conversation with an Emma's on executive. It's ES in twenty eight teen said that they're looking at ways that they can make the echo functional even if internet goes out, so we'll see where that goes ahead. Another thing I noticed on your website. What looked exactly like Deutsche Telekom's, magenta, smart speaker? What did you do for them? So we collaborated with a I'm being slightly hesitant because this is in the public domain. We liberated with frown Hoffa only microphone design for that. Okay. Great. 
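[Editor's sketch] The hybrid architecture described above, where a very small local dictionary answers well-known or safety-critical commands such as "stop" and everything else is deferred to the cloud when a connection exists, could be sketched roughly as follows. This is a minimal illustration, not XMOS's or any vendor's implementation: the command table, the `route` function, the `Result` type and the `fake_cloud_asr` stand-in are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical small local dictionary: recognised phrase -> intent name.
LOCAL_COMMANDS = {
    "stop": "halt_motion",
    "lights on": "light_on",
    "lights off": "light_off",
}


@dataclass
class Result:
    handled_by: str          # "local", "cloud" or "none"
    intent: Optional[str]    # None when nothing could handle the request


def route(utterance: str, cloud_available: bool,
          cloud_asr: Callable[[str], str]) -> Result:
    """Answer locally when the phrase is in the small dictionary,
    otherwise defer to the cloud if a connection is available."""
    text = utterance.strip().lower()
    if text in LOCAL_COMMANDS:
        # Fast, offline-capable path: no round trip to the cloud.
        return Result("local", LOCAL_COMMANDS[text])
    if cloud_available:
        # Rich natural-language path, handled by the cloud back end.
        return Result("cloud", cloud_asr(text))
    # Offline and outside the local dictionary: nothing can answer.
    return Result("none", None)


def fake_cloud_asr(text: str) -> str:
    """Stand-in for a real cloud speech service (assumption for the demo)."""
    return "cloud_intent:" + text


if __name__ == "__main__":
    print(route("Stop", cloud_available=False, cloud_asr=fake_cloud_asr))
    print(route("what's the weather", cloud_available=True, cloud_asr=fake_cloud_asr))
    print(route("what's the weather", cloud_available=False, cloud_asr=fake_cloud_asr))
```

The third call shows the transition problem raised in the conversation: once the connection drops, the same request that worked a moment ago silently falls outside what the device can do, which is the part of the user experience that has to be managed.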
And just on the microphone design, you mentioned that when we talked about the beamforming: linear versus circular. How do you have to handle those differently?
It's really just a question of refining the algorithms, so they don't change dramatically. So obviously a lot of the market is two-microphone at the moment, so with two mics the question of linear or circular essentially doesn't arise. But where we've spent most of our time is focusing on what we perceive to be more of a trend, which is towards linear microphone arrays. The thesis there was that linear microphones fit well into flat surfaces, and most consumer electronics actually sits flat against the wall. And we had a particular eye on smart TVs when we were thinking about this. So we weren't interested in the sort of Coke-can designs that have been quite successful lately in the smart speaker space, but we were interested in other categories. And if you look at white goods and TVs and soundbars and so on, they're typically flat surfaces. They have a 180-degree range of interest, if you will, and that favors a linear solution. So, broadly speaking, the algorithms are the same; they adapt differently. A circular microphone array obviously has the capability of sweeping a 360-degree range. Typically a linear microphone array aliases front and back, so you'll have 180 degrees, but you can't tell whether the sound is in front of or behind the array. But in the electronics that we've fitted into, that's not relevant, because behind is wall. So it's generally better at addressing, as I say, flat-panel solutions. And we do have a circular; in fact, one of the customers we talked about is using a circular array. And obviously we have a product roadmap as well. But most of our success has been because we made that selection to focus on linear microphone arrays.
Yeah, that makes sense. Last question on that. Okay. So for those of you listeners who really didn't want to talk about the technology, we're moving on to use cases.
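[Editor's sketch] The point above about a linear array aliasing front and back can be illustrated with a simple far-field (plane-wave) model: the relative arrival time at each microphone depends only on the microphone's position projected onto the direction of the source. The geometry below (four microphones, 33 mm spacing, 40 mm radius) and all function names are assumptions chosen for illustration, not XMOS product parameters.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def arrival_delays(mics, azimuth_deg):
    """Relative far-field arrival times (seconds) at each (x, y) microphone
    for a plane wave arriving from the given azimuth."""
    az = math.radians(azimuth_deg)
    ux, uy = math.cos(az), math.sin(az)  # unit vector pointing at the source
    # A microphone further along the source direction hears the wave earlier.
    return [-(x * ux + y * uy) / SPEED_OF_SOUND for x, y in mics]


def linear_array(n, spacing):
    """n microphones in a line along the x axis (e.g. along a TV bezel)."""
    return [(i * spacing, 0.0) for i in range(n)]


def circular_array(n, radius):
    """n microphones evenly spaced on a circle."""
    return [(radius * math.cos(2 * math.pi * i / n),
             radius * math.sin(2 * math.pi * i / n)) for i in range(n)]


def same_delays(a, b, tol=1e-9):
    """True when two delay patterns are indistinguishable within tol seconds."""
    return all(abs(p - q) < tol for p, q in zip(a, b))


linear = linear_array(4, 0.033)      # hypothetical 33 mm spacing
circular = circular_array(4, 0.04)   # hypothetical 40 mm radius

# +60 degrees is in front of the array; -60 degrees is its mirror image behind it.
front, behind = 60.0, -60.0
linear_ambiguous = same_delays(arrival_delays(linear, front),
                               arrival_delays(linear, behind))
circular_ambiguous = same_delays(arrival_delays(circular, front),
                                 arrival_delays(circular, behind))

if __name__ == "__main__":
    print("linear array confuses front and back:", linear_ambiguous)
    print("circular array confuses front and back:", circular_ambiguous)
```

In this model the linear array produces identical delays for the two mirrored directions, while the circular array does not, which is why a wall behind the device makes the linear array's ambiguity irrelevant in practice.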
And one of the things, Mark, that you said in an interview this year was that getting rid of the remote control is among the strongest use cases for voice technology, and you just mentioned that a moment ago. So let's talk about that. Like, why is that, in particular? It's more than just losing or misplacing the remote, right?
Yes, it is. I think, as we touched on a moment ago, quite often if a customer is new to this kind of technology, and they know they want a voice interface but they haven't processed it through from front to back, as it were, the thing that often gets forgotten is that the user experience needs quite a bit of attention. For things like TVs and set-top boxes and so on, it's one thing to be able to capture the voice and to understand the words that were being spoken, but to actually reflect that meaningfully in the user interface is quite a different problem. So we just have to make sure that those customers are aware that they need to consider that from the early days. And essentially a typical customer engagement for us, and we've done this a few times now, is that we tend to be involved with the design right from the conception. We'll do evaluations and testing with customers, and then we'll pretty much hold their hand through the whole process. Generally speaking, we'll introduce the back end, the ASR provider, at some point as well, just to make sure that all of the interfaces required to work together to make an excellent voice interface are in place. So it's quite an involved process.
Yeah. That makes sense to me. I wonder too, like, how is this different whether it's push-to-talk, which we talked about earlier, or far-field voice? I have both in my house because I experiment with a lot of things. My set-top box provided by my cable provider has a push-to-talk option, which is great if the remote is near you. And then I have one of the Fire TV Cubes, which is far-field, which also has a remote with push-to-talk.
So you've got both options there. How do you think about those differently? Or are you only working in the far field with your customers?
Yeah, we only work in the far field. Actually, when we encounter a customer that's doing push-to-talk, we generally feel pretty positive about that, because the transition from push-to-talk to a far-field voice interface is significantly easier. Generally speaking, that customer has gone through the thought process of what it means to make a voice interface a really good experience for the customer, and then the step from push-to-talk to far-field is significantly easier. One thing that we're also seeing, which I think is kind of an interesting development, is a third step, which is having a far-field remote control unit, which is essentially another step towards a built-in far-field, but actually means that you can leave the remote control unit on the, I'm going to use another English colloquialism, the mantelpiece, or beneath the TV, and you can use a far-field voice interface, you know, if you don't want to hold the thing in your hand. So I think there are a few approaches here, but the ultimate destination is a far-field voice interface. I think push-to-talk can be a tactical approach to segment the problem, but the utility of far-field is demonstrably better than using a push-to-talk remote.
So, remote controls, televisions: it makes a lot of sense that that's of interest to you. You're doing a lot of work in that space. What are some other use cases for ambient and far-field voice that you think are going to really drive voice adoption going forward?
Yeah, it's a good question. I think there are a lot of examples, some of them quite kooky. You know, we talked about ShadeCraft as an example of, I would say, a very forward-thinking company in what would otherwise be, one might think, a low-tech space. I think they've thought very carefully about the use case and a very specific market, in that case the hospitality market.
And then you've got to think about this thing from room to room. So I would say the kitchen: in the kitchen, we've obviously got smart speakers. They've really carved out a place in the kitchen, but we're now starting to see... you may have seen the GE voice-enabled cooker hood that has an XMOS far-field microphone in it, which was launched at CES this year. That's a good use case. It's directly above the cooking surfaces, it has the ability to point a camera down onto the cooking surfaces or towards the user, and clearly you can deliver content to that user whilst their eyes and hands are busy, which I think is one of the key qualifiers for the very compelling use cases. So speakers and cooker hoods in the kitchen. I think within something like a pantry, then, perhaps a light switch is the most appropriate. Lampshades, would you believe: you know it's got power, and it has, generally speaking, good line of sight to the entire room. So that's not such a silly idea either, lamps and light shades. And then washing machines: if I'm carrying my laundry to the washing machine, I want to say "open the door", because once again my hands are busy. Those kinds of things are perhaps not the most obvious or the most charismatic, but they have genuine utility, and you can walk through the house and figure out use cases. I'm in my bathroom: well, I kind of like having a quick catch-up on the news in my bathroom. So now we're seeing smart mirrors. In the bedroom there's the alarm clock, and so on and so forth. So I think there are a number of use cases in the home, and then obviously outside of the home, in an office environment, things like air conditioning units are typically out of reach and people are always playing with them. So a voice interface on an air conditioning unit is a smart move. We've talked to some companies about the broader category of human sensing in workspaces, so that we can control not just the air conditioning but also lighting and so on. So, different use cases in different parts of our lives; automotive as well. So I think there are lots and lots of use cases where there's obvious utility, right up until the tipping point where, if anything needs to be interacted with, then everybody expects it to have a voice user interface. I often liken it to when touchscreens first came on the scene. My children were three to five years old, and they'd walk up to television sets in the store and put their grubby fingers on them and swipe, and it was a natural expectation that that was the ubiquitous user interface. And I think the same thing will be true of voice interfaces.
You mentioned a number of use cases. They're all utilitarian, not really conversational, but voice-command oriented. How do you see that developing in terms of conversational versus command?
I guess my underlying premise is that you're not going to need ten of these things in every room, but you are going to want to carry on a conversation with your voice assistant throughout your home. So whilst it's easy to argue that a light switch intrinsically doesn't need to have a natural language interface, if it's the only voice assistant point in a room, then the argument is: well, in order to have a consistent conversation with a voice assistant throughout the home, then yes, it does need a natural language interface.
Okay, right. That makes sense. So let me ask you a question, then. This brings up this other point, and I don't know if you've worked with Sonos or not, but Sonos famously said two years ago that they were going to put Google Assistant and Amazon Alexa in the same device, and also allow you to access Siri. Is that a good idea?
I think yes and no. I think it's a good idea because, if you're thinking purely from the consumer's perspective, the consumer wants to access those, as I said, great companies that deliver great technology and great use cases. But I don't want to have to access those through four different speakers on my kitchen table. So from a user experience perspective, I'd like to have one device, and I'd like to be able to speak to those things. Now, one trend we're observing in the third-party space, and when I say third party I'm talking about the non-Amazons and the non-Googles and so on, but companies that are existing brands that are retrofitting voice interfaces to the categories that they're already selling: those companies want to put their own voice assistant alongside one of either Google, Amazon, or whichever company they want to ally themselves to. And so then you end up with a somewhat schizophrenic device where, if you say "Alexa", then Amazon responds; if you say whatever their own keyword is, then their own AI responds. And so you've got this branding conflict right from the get-go. And I think, you know, if you add more and more different AIs to that, with different keywords steering to different capabilities, it's just going to get worse, and it's just going to result in a poor user experience. So I think at some point in the future, and this may be somewhat heretical, there's going to be an intermediary. There's going to be a digital butler, if you will, or a digital twin, as we've sometimes called it, that sits between you and, you know, Amazon, Google and the thousands of other AIs that are going to be competing for your attention, and is essentially protecting your attention, your attention span, whilst giving you access to all of the benefits of these various different providers.
So the one-voice-assistant concept is interesting. We talked about Mercedes earlier, and MBUX actually does this: it's got Nuance and Houndify on the back end. Not a lot of people have that architecture.
But it reminds me of my interview last November with Adam Cheyer, one of the co-founders of Siri and co-founder of Viv, which was later acquired by Samsung, and they were really adamant about this sort of one-voice-assistant concept, and that people need that, otherwise there's going to be too much. And as I think about it, Facebook did this with Alexa alongside, largely because Facebook's voice interface wasn't robust enough to meet basic consumer expectations. And so Alexa gets in the game. Djingo in France, by Orange, is the same thing. They spent two years and realized that they were just going to be too narrow for consumer expectations, so they put Alexa on there as well. They sort of say, well, at least it's an Alexa device. So, you know, when we think about this, do you really believe that we're going to lead towards this single voice assistant, which is going to be personal, and that's going to negotiate all our interactions with other assistants?
Yeah, I think that's a possible future. I struggle with the idea that the existing players are going to dominate this space on an individual level into the distant future, whether it's five years, ten years, fifteen years. I think this is going to be so important that it's not going to be something that people will cede to a particular player unless they have a very special relationship, and that relationship might even be a business-to-consumer kind of subscription relationship, a bit like we have with our mobile handsets, where, you know, there's a contract, and it's well known by both sides, and it's well trusted. People understand what the business model is and they're comfortable with it. So I think there's the potential that maybe there's already a player in other forms of communication that could take that role. Maybe it's one of the existing vendors that can dominate that space.
But my suspicion is it's going to be another party that's going to sit between the user and all of these competing AIs, protecting their privacy and security and also making sure that any information that the user gets is timely, relevant and not an interruption to their everyday lives.
Do you think it'll be one party that gets dominant national or global market share, that's sort of the personal assistant that people use to connect to assistants, or do you think there's going to be a number of those, and they all have this ability to connect to a lot of back ends?
I think there'll be a number of them. You know, I guess there are a number of classes of company you might pick out that could potentially do this. And certainly, you know, I wouldn't completely eliminate the idea that it might be a Google or an Amazon or one of the other existing businesses in this space too, but you can imagine the telcos doing it. You can imagine other service providers doing it. Other companies with subscription relationships with consumers could step into that space and claim some sort of moral high ground, if you will, that would enable them to be the protectors of consumer data and privacy and so on, and also their attention. Or maybe a new player. I'm tempted to think about how AOL in the nineties was the window on the internet until companies like Mosaic and Netscape came along, democratized access to the internet and created enormous opportunities for other companies to help people access the enormity of information that was there. I think there's a tremendous distance between the current state of voice and where it could go as a fully democratized access to the internet and all of the, you know, the intelligence that resides there.
Well, there you have it: the future of voice assistants succinctly summed up by Mark Lippett. I appreciate you taking so much time today to share your background and tell the XMOS story to the Voicebot audience. How can our listeners learn more about the company, follow up with you, track you on social media? What's the best way for them to stay engaged?
Well, xmos.com is going to be the access point that would lead to the others. I'm on LinkedIn, of course, and more than happy to have people reach out to me online and start a conversation. That'd be great.
Okay. That's great.
One more thing: I just wanted to thank you for your time as well. It was a great conversation; I enjoyed it.
Well, I appreciate everyone spending time and sharing their thoughts with the audience. And, you know, as I've told many people, I learn something every week. It's one of my favorite things. I've done over eighty of these now, so that's eighty consecutive weeks of doing this. But boy, it's illuminating for me. So I definitely appreciate it. And for those of you who want to find these things, we'll do the show notes, we'll do some links for you, and XMOS is x-m-o-s dot com, just in case you're wondering how that translates into the URL world. So thank you very much, Mark Lippett, for spending so much time today. I'm Bret Kinsella. You can find me on Twitter at @bretkinsella. Check out voicebot.ai. Voicebot Says is on Alexa and Google Assistant to get a daily update. We also have a flash briefing. You guys know all of these things. And check out the research: we've got some new research on smart speaker adoption in the US and Australia coming out soon, so you will see the most in-depth analysis of how people are using these, how many they have, and what they think about them. And come back next week; we have another amazing guest lined up. Thanks a lot, Mark. I appreciate you spending some time today.
Thank you.
