Why Microsoft Research could be good for British retail.

James

Thu 15th Nov 2012

On the 8th of November Microsoft Research posted a video on its website featuring Rick Rashid, Microsoft’s Chief Research Officer, talking about speech recognition software at a conference in China. In the video he demonstrates a system built on deep neural networks which recognised his speech in real time, converted it into Chinese text and finally pushed it out the other end as spoken Chinese. Because the system had listened to him for more than an hour, it is even supposed to sound like him, though if you ask me, it sounds like a 1980s Speak & Spell. With this one piece of kit, location becomes irrelevant. Rashid could walk into any city, or down any high street, and if not quite sounding like a native, he could certainly be understood like one. But just how far can we take this?

I expect you might be thinking, well, my smartphone already does this with the likes of Apple’s Siri, but I’m sure you’ve experienced the frustration and disappointment when it gets your speech command wrong. This is where the tech gets interesting. Rashid suggests that current speech recognition software gets it right 75-80% of the time, which in reality means as many as one in four words is incorrect. His software’s hit rate is heading towards 90% accuracy, which means only one in ten words is incorrect. With further development, he suggests that within two years they will be approaching accuracy levels in the high nineties. Within three to five years our smartphones could have the potential to translate any language into your mother tongue with near perfect accuracy (Babel Fish for everyone?). No longer will we need the likes of Rosetta Stone, language schools or the Tricolore books (yes, I know where the hospital is in La Rochelle). I certainly think there will be some cultural, educational and ethical friction, but I’m sure they said that about the calculator too.
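For anyone who wants to check the sums, here is a rough sketch of how an accuracy percentage turns into a "one word in N is wrong" figure. The function name is my own invention, and the 98% used to stand in for "the high nineties" is purely an assumption for illustration, not a figure from Rashid's talk.

```python
def words_per_error(accuracy_percent):
    """Return roughly how many words go by for each word the system gets wrong."""
    error_rate = (100 - accuracy_percent) / 100
    return 1 / error_rate


# 75, 80 and 90 are the figures quoted above; 98 is an assumed
# stand-in for "accuracy levels in the high nineties".
for accuracy in (75, 80, 90, 98):
    n = words_per_error(accuracy)
    print(f"{accuracy}% accurate: about one word in {n:.0f} is wrong")
```

At 75% that works out at one word in four, at 80% one in five, and at 90% one in ten, which is why a jump of ten or fifteen percentage points feels so much bigger in practice than it looks on paper.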

But this is just the start. If you couple accurate translation with augmented reality technology, such as the Google Glass project or Vuzix’s M100 Smart Glasses, it becomes really, really interesting. Such a combination has the potential to translate written words, street signs, menus and product labels in real time, whilst superimposing the translation back onto the physical objects in front of you (think Word Lens for the iPhone on steroids). The mind boggles at the possibilities. In less than five years I think we’ll be able to walk a high street in a foreign country speaking and reading the local language by proxy.

This is all very nice, but what does this mean for retail in the future?

According to Visitbritain.com, in 2011 there were 30.798 million ‘Inbound Tourists’ who visited our fair shores. On average each of those visitors spent £584 on accommodation, food, services and retail goods. I expect many of those tourists walked the high streets of Britain in search of some retail therapy but ended up putting the item back on the shelf because they couldn’t read the label. Maybe they didn’t have the confidence to talk to the shop assistant. Perhaps they couldn’t find the shop, understand the ingredients on the menu or work out the exchange rate. The list goes on. But what if we eliminated the language barrier from each retail touch point? Every label, every description and every price translated and delivered in real time in the visitor’s own language. I’m certain the average basket value for foreign tourists would increase.
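To put a rough number on that, here is a back-of-the-envelope sketch using the two figures quoted above. The total is simply visitors multiplied by average spend, so it may differ slightly from any official total because of rounding. The uplift percentages are hypothetical values I have picked for illustration, not forecasts.

```python
# Figures quoted above (2011).
visitors = 30_798_000
average_spend_gbp = 584

# Implied total inbound spend: visitors x average spend.
total_spend_gbp = visitors * average_spend_gbp


def extra_spend(uplift_percent):
    """Extra spend in pounds if the average spend rose by the given percentage."""
    return total_spend_gbp * uplift_percent / 100


print(f"Implied total spend: £{total_spend_gbp:,}")

# Hypothetical uplifts, purely for illustration.
for uplift in (1, 5):
    print(f"A {uplift}% uplift would be worth about £{extra_spend(uplift):,.0f}")
```

The implied total comes to just under £18 billion, so even a 1% nudge in average spend would be worth roughly £180 million a year across the whole visitor economy.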

We’re on the cusp of becoming a closer and more connected global community where language and cultural barriers are becoming a thing of the past. With Britain the 7th most visited country in the world, the work coming out of Microsoft Research can only be a good thing for British retail in the future.