Bill Gates talks about speech recognition

Lately I have been looking into Windows Speech Recognition and System.Speech. A speech recognition engine has been present since Windows XP, but the version included in Vista is much better and quite powerful. In my view, it is a feature that mainly helps people who have some limitation in using the most common input devices (mouse and keyboard, or even touchscreens), since it requires no physical contact. But even for ordinary users it can improve interaction with the computer. On first use it seems a little complicated, but once you learn the available commands and how it works, it becomes much easier; you can also train the computer to better recognize each user's voice pattern. The best part: APIs are available for interacting with the speech recognition and speech synthesis engines! We can build our own applications that react to the user's voice.
I was browsing the web reading about the topic and came across an interview with Bill Gates. Apparently he has always been quite enthusiastic about speech recognition and synthesis, both inside and outside Microsoft. Here is an excerpt from the interview:

What are some of the areas where you see voice going that people aren’t necessarily thinking about today?
Gates: To me, voice is in the broad realm of natural interface. And natural interface is (the notion of) screens everywhere–screen in your desk, screen in your tables, screen on your walls, no more white boards, touching, which is like Surface, where you can manipulate things. It’s a pen so you can have ink wherever you want. You know, pull up an article, write a little note on it and get it sent off to a friend.

The speech recognition comes into it–all these things about natural interface are coming to the fore, and they are probably the thing that’s most underestimated right now about the digital revolution. People kind of gasp when they see how touch works on Surface, when they touch their iPhone then, "Ooooh, wow," you know, that’s just such a natural thing.

When voice recognition is used in the right way–let’s say you’re in the car and you want to pick somebody to call–that’s improved very dramatically, or speech output, text to speech, these things have gotten very good.

You talked about different natural language interfaces. You know, with multitouch, it seems to have really captured people’s imaginations, both with what you guys have shown with Surface, certainly with the iPhone. Voice seems to be a little slower in terms of speech recognition as a mainstream computer interface.
Gates: Well, that’s fair. Voice recognition is a harder thing. There are certainly tons of people, and I mean millions, who for some reason, the keyboard’s not attractive to them. Either they have repetitive stress injury, or they’re in a work environment where they’re doing something else with their hands, where they’ve taken the time to learn the software and adapt to the software and gone through the training process there. And they love it. They can’t believe other people don’t use it.

For the rest of us, the keyboard has worked so well that we are even getting the keyboard into phones. I think voice search on the phone is one of those applications that would really drive it forward. I mean, why should I have to try and type something in? I’ve got a phone, I’ve got a talk button; so that’s one of the areas we’re betting on.

You guys built a pretty significant voice recognition engine into Vista. It hardly gets talked about. Are you surprised that some of the things you did in Vista aren’t getting more attention?
Gates: Well, when you sell a product to hundreds of millions of users, there are features that millions of users love that you can call an obscure feature because, percentage wise, it’s not very many. You know, Butler Lampson, one of our great researchers who has done great work going all the way back to his days at Xerox, was just sending me mail about how fantastic the improvements in the speech stuff are in Vista and, you know, we’re hard at work on the next version of Windows. We’re going to take this speech stuff even further.

The impression this leaves is that Microsoft will keep investing in this area. Full interview here.

