You still type? How quaint!

Speech-recognition technology has improved dramatically.

I'm not typing this article. I'm dictating it to my iPhone as I walk down to my office in New York City.

Admittedly the iPhone's speech-recognition features went [sic] meant for composing full-length articles for publication. Sorry, that should have been "weren't." Some transcription errors are inevitable, but I'm doing this to make a point. Our mobile devices have gotten surprisingly good at understanding us — probably a lot better than you remember, if you haven't tried talking to your phone in a while.

Speech recognition technology got a lot of hype when Apple released Siri, four years ago this week. But if you're like most iPhone users, you soon just missed the haunted voice assistant as little more than a parlor trick. (Sorry that was supposed to be "dismissed" not "just missed." And "vaunted" not "haunted.") Series frequent misunderstandings — whoops, I mean Siri is a frequent misunderstandings — darn it, I mean the frequent misunderstandings by Siri – gave it more comedic value than practical value.

Believe it or not, despite the voice typos above, that's no longer the case. Not only is Siri a better listener than it used to be, but Apple's Notes and Mail apps have sprouted serviceable dictation features, too. And as much as Apple's speech-recognition capabilities have improved, the ones Google has added to its apps and Android operating system may be even better. On both platforms, typing by voice is now often easier than doing it by touchscreen, especially if you're on the go.

Clearly the technology is not yet perfect. How men's are still problematic, for one thing. I mean homonyms are still problematic. And if you want punctuation marks, you have to speak them out loud.

I'm going to go back to typing on my laptop now, both because I'm sure you and my editor are tired of the typos and because, to be honest, I was starting to feel a little like Joaquin Phoenix in Her, murmuring sweet nothings to my phone as I moseyed down the street.

Still, I wouldn't have dreamed of trying to compose even a brief work-related email on a smartphone by voice just a couple of years ago, let alone a full-length column. Now I do the former regularly. And for some basic tasks, like typing up a grocery list, I almost never use the keypad anymore. Which reminds me of one other obstacle: Talking to your mobile device typically requires an Internet connection.

Speech recognition software's reliance on the cloud is both an inconvenience and the source of its power. You notice that when you dictate something, there's a brief lag before it shows up on the screen. That's because your device is zipping your voice signals to remote servers for processing.

One reason Google's technology has improved so rapidly, explains engineering director Scott Huffman, is that all that incoming voice data gives the company's machine-learning algorithms a lot to work with. And the algorithms have gotten more powerful. "One of the big advances over the last year or two," he says, "has been in using new kinds of machine-learning technology that are scaled to many, many machines. We're now able to apply very large-scale parallel computing to interpret the sounds that you make."

The software's first job is to figure out which sounds are your words, as opposed to ambient noise or the words of people around you. For a nonhuman, that's harder than you might think. Then it has to parse your speech by evaluating not only each sound you make, but also the linguistic context that surrounds it — just as people do subconsciously when they listen to one another.

Sometimes you can actually see the software recalibrating on the fly. Recently I told my Google app, "Remind me to email Ben at 4 o'clock." At first it typed, "Remind me to email Bennett." But when it heard the words "4 o'clock," it realized I had more likely said "Ben at" than "Bennett," and it duly set the proper reminder.

This is exactly the type of computing problem at which Google excels. Its core product, web search, relies on the ability to intuit the intent behind a string of search terms, even if they're misspelled or ambiguously phrased. A search for "bank" will turn up different results based on your location and search history. Similar smarts could soon be applied to speech recognition technology, Huffman said. When you're in Boston, for instance, Google might be more likely to render "red socks" as "Red Sox," especially if it knows you're a baseball fan.

The smoother the technology gets, the less typing we'll do on our phones. Several of my colleagues already use voice functions for a range of applications, from setting alarms to settling a bet at a bar. When you're out with friends, pulling out a phone and typing a query into Google feels antisocial, one colleague said. But asking Google a question out loud and getting a spoken response "just feels like part of the conversation."

And it isn't just the young who are doing it. Several people told me their parents use their phones' voice features the most — because they're the ones who most hate typing.

04/23/14 [Last modified: Wednesday, April 23, 2014 6:20pm]

© 2017 Tampa Bay Times

