This past weekend we had a hackathon at work, focused on developing Skills for the (relatively) new Amazon Echo. The purpose of the hackathon was to expose us to a technology we hadn't used before and explore what use cases might exist that we could leverage either internally, or for customers. We had a great time and built some really fun Skills along the way to learning the Alexa Skill development tools. We found there were pros and cons, as with anything, and I personally took away a few key lessons from my short time working with the Echo.
I want to take a moment here to point out that I wasn't paid or encouraged by anyone, anyone at all, to share my experience on this blog. Amazon, to my knowledge, had no part in this hackathon; it was an internal event we did just for fun and exploration. I just want to solidify what I learned by revisiting it in a writeup. So let's talk about the Echo!
When you buy the Amazon Echo, you are buying a very nice piece of hardware that connects you to a service named Alexa, which is where the magic happens. You make your voice request, and then Alexa uses Skills (voice applications) to do something useful or fun in response to your request. First of all, I want to say that Alexa is a lot of fun to work with from a creative product standpoint. Working with Alexa's voice API provides a vast landscape of opportunities to be creative in ways that feel fresh and new. Alexa is also a lot of fun to use from the consumer standpoint, for me at least. I've heard many people say they don't know what they would do with an Echo if they had one, but to someone with that complaint, I'd say the cliche that there's an app for that holds true here.
Alexa has tons of capabilities out of the box, covering everything from checking the weather, to listening to podcasts, news, and music, to telling jokes, to making purchases on Amazon through your account. The system can integrate with home automation tools, and work in concert with apps on your phone to manage shopping lists, to-do lists, and other handy utility features. Additional Skills can be obtained from the community through the Skill Store to cover all kinds of additional fun and useful scenarios. Alexa has lots to offer.
In case this is reading too much like an advertisement, don't worry. There's plenty for Amazon to work out with the Echo before I would ever buy one at full price. My first concern is the price: I find it unjustifiably expensive. You're buying a speaker, with a hardcore microphone array, connected to the web via Wi-Fi; that's really all the hardware you get, and it comes in just shy of $200. It is a very nice enclosure, and the speaker itself is exemplary; but without internet, this thing is a brick. The Echo does no work of its own locally. All services are performed by Alexa on Amazon's side, and piped back to you over the web. I can go to OK Google, or Siri, or even Cortana for a lot of what Alexa can provide, so $180 feels brutal for the sole perk of having your voice assistant always on. A big downside of the Echo, compared with voice services provided by Google, Apple, and even Microsoft, is that the Echo is immobile, whereas your phone is always with you. Adding to the cost, some of the handy features (music providers being a prominent one) require subscription fees; and the buy-by-voice feature, while convenient, can feel a little dangerous since Alexa doesn't recognize which voice is yours. (Buying each other socks without permission became a running gag throughout the hackathon.)
Along those lines, Alexa doesn't always understand voice input clearly, so you sometimes find yourself repeating commands to get her to hear you right. I find Google's voice recognition to be a tad more reliable. Alexa also does poorly if you try to give her a command in a room with other people speaking. She can't differentiate among voices, and this is very frustrating at times. Granted, this is an emerging technology, and natural language and voice recognition are very challenging problems for computers to handle, but from a practical, daily usage standpoint, this is often a frustrating shortcoming.
By far, though, my biggest gripe about working with Alexa is as a developer. Every technology has its idiosyncrasies, but I found learning the Alexa Skill developer tools to be a particularly harrowing experience. First of all, you need to use AWS Lambda to host your Skills. AWS Lambda is a cool idea, but developing Alexa Skills on it is a new kind of challenge, and not in any particularly fulfilling kind of way. The tools for building Skills are a little clunky and the documentation is practically non-existent. The tools do have some nice features, like the ability to test your Skills without needing to go through the time-consuming process of conversing with Alexa repeatedly. However, in order to deploy Alexa Skills, you need to access two separate dashboards: the Skill itself, the Lambda logic, is deployed on AWS Lambda through the AWS console, but you have to separately log into the Amazon Developer Console to access Alexa configurations and tools to define things like user interactions and recognized user phrases. Where is this documented? Great question.
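To make "the Lambda logic" a little more concrete, here is a minimal sketch of what that half of a Skill can look like. It assumes a Python Lambda runtime and a made-up intent called HelloIntent; neither is necessarily what we used at the hackathon. The intent name and the phrases that trigger it would still have to be defined separately in the Amazon Developer Console.

```python
def build_response(speech_text, end_session=True):
    """Wrap plain text in the JSON envelope that Alexa expects back."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Entry point that Lambda calls with the request Alexa sends."""
    request = event.get("request", {})
    request_type = request.get("type")

    # The user opened the Skill without asking for anything specific.
    if request_type == "LaunchRequest":
        return build_response("Welcome. What would you like to do?",
                              end_session=False)

    # The user said something that Alexa matched to a configured intent.
    if request_type == "IntentRequest":
        intent_name = request.get("intent", {}).get("name")
        if intent_name == "HelloIntent":  # hypothetical intent name
            return build_response("Hello from the hackathon.")
        return build_response("Sorry, I don't know that one.")

    # Other request types (for example SessionEndedRequest) land here.
    # Alexa does not speak a reply once the session has already ended.
    return build_response("Goodbye.")
```

Because the handler is just a function that takes a dictionary, you can call it on your own machine with a hand-written event and inspect the result, which at least surfaces typos before anything is uploaded.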
The feedback from the system when something goes wrong is minimal. You'll get messages like "Syntax error." That's it. What kind of syntax error? Where? Your only option is to lint the code yourself and hope that any of your mistakes can be found that way, or be so good at programming that you never make typos or logic errors. There's not much available in terms of debugging help. By its nature, AWS Lambda also faces the limitation of weak session and state management. This is one reason why some of us at the hackathon resorted to using Alexa as a forwarding service, simply to translate voice into API calls to a remote server (EC2 in this case) that hosted more complex service implementations outside of AWS Lambda.
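The forwarding approach can be sketched roughly like this. It is an illustration under assumptions, not the code any of us actually wrote: the backend URL, the "speech" field in the backend's reply, and the helper names are all invented for the example, and Python is assumed as the Lambda runtime.

```python
import json
import urllib.request

# Hypothetical address of the remote server that does the real work.
BACKEND_URL = "https://example.com/api/commands"


def extract_command(event):
    """Reduce an Alexa IntentRequest to an intent name and its slot values."""
    intent = event.get("request", {}).get("intent", {})
    slots = intent.get("slots") or {}
    return {
        "intent": intent.get("name"),
        "slots": {name: slot.get("value") for name, slot in slots.items()},
    }


def post_json(url, payload):
    """POST the payload as JSON and return the decoded JSON reply."""
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def lambda_handler(event, context, send=post_json):
    """Forward the spoken command to the backend and speak its reply.

    `send` is injectable so the handler can be exercised without a network.
    """
    command = extract_command(event)
    try:
        result = send(BACKEND_URL, command)
        # Assumes the backend answers with {"speech": "..."}.
        speech = result.get("speech", "Done.")
    except Exception:
        speech = "Sorry, the backend could not be reached."
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    }
```

The Lambda function stays tiny and stateless in this arrangement, while sessions, state, and anything else complicated live on the server, where ordinary logging and debugging tools are available.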
As mentioned earlier, the documentation for this set of developer tools is sorely lacking. The official Amazon guide points to a blog post from four months ago that is not only incomplete, but also already out of date, with screenshots and instructions that are both flat-out incorrect. It was a headache trial-and-erroring our way through the process of developing our first Alexa Skill and finding third-party tutorials for Skill development. We were able to pull it off, but it was a lot harder than it needed to be. It could have been an hour-long process if the documentation had been halfway complete or at least up to date. I would expect sketchy documentation from open source projects made by volunteers, not from a top tech company, on a service they profit from, that other developers are meant to use as part of a business model. Maybe I expect too much, but fair or not, that is my expectation of a for-profit system provided by a sixty-five billion dollar tech company.
It really doesn't feel like Amazon is bringing their A-game to the Alexa developer community. If these issues cropped up during an open beta or in a staging environment, that would be one thing, but this is a public-facing, monetized platform. One or two of the features are sub-labeled as beta as of this writing, but the features work fine; the process and interface design are just abysmal, and the documentation is being provided by home-use amateurs because Amazon isn't providing it. I personally find that to be a weak effort on Amazon's part.
Overall, I do like the Echo and Alexa, and I would like to have one if the price were more reasonable. I find it fun and useful. I would also be excited at the opportunity to do more development on Alexa Skills now that I know the process, but it was needlessly painful to learn the ropes compared with other platforms.
Thanks for reading!
- Steven Kitzes