For voice to work really well you need a narrow and predictable domain. You need to know what the user might ask and the user needs to know what they can ask. This was the structural problem with Siri – no matter how well the voice recognition part worked, there were still only 20 things that you could ask. Yet Apple managed to give people the impression that you could ask anything, so you were bound to ask something that wasn’t on the list and get a computerized shrug.
Conversely, Amazon’s Alexa seems to have done a much better job of communicating what you can and cannot ask. Other narrow domains (hotel rooms, music, maps) also seem to work well, again because you know what you can ask. You have to pick a field where it doesn’t matter that you can’t scale.