Some insights on voice interfaces from Ian Bicking:

Voice interfaces are voice interfaces. They are a way for the user to express their desire, using patterns that might be skeuomorphism of regular voice interactions, or might be specific learned behaviors. It’s not a conversation. You aren’t talking with the computer.

I’ve been speaking with Alexa for quite some time now — we like to talk about the status of the lights in the living room and which is the most suitable time for me to wake up.

Also very true:

I hate how voice interfaces force us to speak without pauses because a pause is treated as the end of the statement. Many systems use a “wakeword” or “keyword spotting” to start the interaction. What if we used keyword spotting to determine the end as well? “Please” might be a good choice. It’s like the Enter key.
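Bicking's idea can be sketched in a few lines. This is a minimal illustration, not any real system's code: it assumes the speech recognizer already hands us a stream of transcribed words, and it uses "please" as the terminator only because that is the word he suggests.

```python
TERMINATORS = {"please"}  # end-of-utterance keywords, acting like the Enter key

def take_utterance(words):
    """Collect words until a terminator keyword is spotted.

    Returns (utterance, consumed): the utterance excludes the terminator
    itself, much as a wakeword is stripped from the start of an interaction.
    A pause in the middle of the stream no longer ends the statement.
    """
    utterance = []
    for i, word in enumerate(words):
        if word.lower().strip(",.!?") in TERMINATORS:
            return utterance, i + 1
        utterance.append(word)
    return utterance, len(words)  # stream ended without a terminator

# The user can pause freely; only the keyword closes the request.
words = "turn off the living room lights please".split()
utterance, consumed = take_utterance(words)
```

With a terminator keyword the endpoint decision becomes explicit and learnable, instead of resting on a silence-duration heuristic the user cannot see.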

For voice to work really well you need a narrow and predictable domain. You need to know what the user might ask and the user needs to know what they can ask. This was the structural problem with Siri – no matter how well the voice recognition part worked, there were still only 20 things that you could ask, yet Apple managed to give people the impression that you could ask anything, so you were bound to ask something that wasn’t on the list and get a computerized shrug.

Conversely, Amazon’s Alexa seems to have done a much better job at communicating what you can and cannot ask. Other narrow domains (hotel rooms, music, maps) also seem to work well, again, because you know what you can ask. You have to pick a field where it doesn’t matter that you can’t scale.

Walt Mossberg:

So why does Siri seem so dumb? Why are its talents so limited? Why does it stumble so often? When was the last time Siri delighted you with a satisfying and surprising answer or action?

For me, at least, and for many people I know, it’s been years. Siri’s huge promise has been shrunk to just making voice calls and sending messages to contacts, and maybe getting the weather, using voice commands. Some users find it a reliable way to set timers, alarms, notes and reminders, or to find restaurants. But many of these tasks could be done with the crude, pre-Siri voice-command features on the iPhone and other phones, albeit in a more clumsy way.

Julian Lepinski:

Apple doesn’t seem to be factoring in the cost of a failed query, which erodes a user’s confidence in the system (and makes them less likely to ask another complex question). Apple’s Siri homepage cites a wide range of questions you can ask Siri, from the weather to sports scores to when the sun sets in Paris. The overall effect is to imply that you can simply ask Siri a question and get an answer, but in practice that almost never works outside of a few narrow domains (sports, weather, math and unit conversions).

The problem Julian describes is real: even if Siri has improved over time at answering certain questions, I wouldn't know, and I never will find out, because after it failed to answer me in the past I simply gave up asking it certain things.

Since it started integrating with third-party apps I've gone back to using it, lightly, a little[1. The most recent thing I did was pay a friend, with Siri, through Monzo], but otherwise Siri doesn't exist for me. Partly because, as Julian writes, by now I've lost hope in many areas and situations, and partly because of connection problems (it continues to be extremely slow).

According to The Information, Apple will open Siri to developers at the next WWDC, releasing an SDK and APIs so that developers can make their apps' content accessible through Siri:

Apple is upping its game in the field of intelligent assistants. After years of internal debate and discussion about how to do so, the company is preparing to open up Siri to apps made by others. And it is working on an Amazon Echo-like device with a speaker and microphone that people can use to turn on music, get news headlines or set a timer.

Apple may also release a device similar to Amazon's Echo.

Also interesting is this article on how Viv works, the virtual assistant built by Siri's creators. It should make developers' lives much easier, avoiding the creation of many "silos" and app-specific commands:

Viv uses a patented [1] exponential self-learning system as opposed to the linear programmed systems currently used by systems like Siri, Echo and Cortana. What this means is that the technology in use by Viv is orders of magnitude more powerful because Viv’s operational software requires just a few lines of seed code to establish the domain [2], ontology [3] and taxonomy [4] to operate on a word or phrase.

In the old paradigm each task or skill in Siri, Echo and Cortana needed to be hard-coded by the developer and siloed into itself, with little connection to the entire custom lexicon of domains custom programmed. This means that these systems are limited in how fast and how large they can scale. Ultimately each silo can connect through related ontologies and taxonomies, but it is highly inefficient. At some point the lexicon of words and phrases will become a very large task to maintain and update. Viv solves this rather large problem with simplicity for both the system and the developer.

Kontra imagines what a device based solely on Siri would look like, a kind of iPhone nano; or rather, a smartwatch.

It's not imminent, for technical reasons: Siri needs an always-on network connection to work (remember that Siri requires the internet to process what you say), plus various ancillary data (the data we hand over to apps and that the iPhone collects in the background) which Siri can use, thanks to the iPhone, to make itself more useful.

The best interface is no interface. It's about objects and tools we interact with that no longer require elaborate, or even minimal, user interfaces to get things done. Like self-opening doors, it's about giving form to objects so that their user interface is hidden in their user experience.

Siri needs the internet to work: every question you ask it is first sent to Apple's data centers, where it is analyzed, and then, again over the internet, your iPhone receives the corresponding answer. This happens in every case and every situation. Even if your request doesn't require the network at all, even if it's simply "set me an alarm for 3 in the afternoon", even if you're dictating a text message to your phone. In all these cases, you need the internet.

This can be rather frustrating if the connection is slow or unavailable. But above all (and this is the question many have asked): how much data does Siri use? The short answer is: rather little. According to Ars Technica, using it 10-15 times amounts to roughly 945KB:

If you own an iPhone 4S and use Siri 11 times a day for a month, always over your carrier's 3G connection, you can expect to use about 20MB in 30 days.
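The two Ars Technica figures are consistent with each other. A quick back-of-the-envelope check, taking the upper end of the range (945KB spread over 15 uses, about 63KB per query) as the assumption:

```python
# Ars Technica's figures: roughly 945KB for 15 Siri queries over 3G
kb_per_query = 945 / 15           # ~63KB per query
queries_per_month = 11 * 30       # 11 queries a day for 30 days
total_mb = kb_per_query * queries_per_month / 1024
# ~20.3MB, matching the quoted "about 20MB in 30 days"
```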