Amazon Alexa

Amazon Alexa logo. Source: Amazon Developer Docs 2018d.

Amazon Alexa is a cloud-based voice service that brings natural voice experiences to users. It can be seen as a virtual assistant that interacts with users by voice rather than through the traditional computer interfaces of keyboard, mouse or monitor.

Initially, Alexa was released as the backend support service for Amazon's Echo device, a voice-activated connected speaker. Since then, Alexa has become smarter. It's been interfaced to a number of products beyond Amazon's Echo series of devices. It's moved from powering just a smart speaker to being the central controller of your smart home.

Discussion

  • What are typical uses of Amazon Alexa?
    An introduction to Alexa. Source: Alexa Developers YouTube 2016.

    When Amazon Echo was released in 2014, Alexa could answer basic questions, add items to a shopping cart or play your favourite music album. Today, developers can add new capabilities to Alexa. Capabilities exist for managing chores, telling stories, quizzing, and gaming.

    Alexa can integrate with web services for ordering something, getting information, etc. It can control or query smart devices connected to the cloud. For example, you can ask to see the feed from a remote camera or dim the lights. In fact, Alexa's importance in the smart home is set to grow as it aims to handle complete home automation.

    Some specific examples: the Xbox One console is expected to support Alexa soon so that gamers can issue voice commands. A number of lighting solutions can be controlled via Alexa: Philips Hue LED, Lifx, TP-Link, Osram, Stack LED. In the healthcare space, Alexa has been used to get information on drug side effects and medication schedules, or to gather patient surveys.

  • As a developer, what can I build with Alexa?

    With Alexa, three things are possible:

    • Develop new capabilities for Alexa to cover new use cases. These capabilities are called skills. The Alexa Skills Kit (ASK) is a set of APIs, tools and documentation to help you develop skills.
    • If you wish to create a hands-free voice interface in your own products, use the Alexa Voice Service (AVS). Your products can then leverage Alexa's features. A simple example is to make your Raspberry Pi work with Alexa as explained in a detailed tutorial.
    • Let your customers control their connected devices in smart homes and other gadgets. Devices can reach Alexa via the Internet or via a local hub. Developers need to use the Smart Home Skill API and the Gadgets API.
  • What are the languages supported by Alexa?

    As of August 2018, Alexa's skills can support English, French, German, Italian, Japanese and Spanish. A device (such as Echo) when set to a particular language can access any skill available in that language. As a developer, you can publish your skill in multiple languages to target a wider user base.

    From a programming perspective, if the business logic of your skill is deployed on AWS Lambda, then any programming language that Lambda supports can be used: Node.js, Java, Python, or C#. Alternatively, you can deploy it as a web service on any cloud infrastructure. This means that you can code in any language, as long as the service exchanges requests and responses as JSON via a REST API.
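
    As a rough illustration of the JSON exchange, here is a minimal sketch of a Python handler for AWS Lambda that does not use any SDK. The function name `lambda_handler` and the spoken texts are assumptions made for this example; the envelope fields (`version`, `response`, `outputSpeech`, `shouldEndSession`) follow the custom skill JSON format.

```python
import json


def lambda_handler(event, context):
    """Entry point called by AWS Lambda; `event` is the request JSON parsed into a dict."""
    request_type = event.get("request", {}).get("type", "")

    # Illustrative texts only; a real skill would do its business logic here.
    if request_type == "LaunchRequest":
        text = "Welcome. Ask me for a fact."
    else:
        text = "Sorry, I did not understand that."

    # Response envelope that Alexa converts into speech for the user.
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": True,
        },
    }


if __name__ == "__main__":
    # Simulate the request sent when a user opens the skill.
    sample_event = {"version": "1.0", "request": {"type": "LaunchRequest"}}
    print(json.dumps(lambda_handler(sample_event, None), indent=2))
```

    The same dictionary could be returned as the JSON body of an HTTPS endpoint if the skill is hosted as a web service instead of on Lambda.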

  • What's the architecture of Alexa?
    Alexa reference architecture. Source: Jensen 2017.

    Alexa resides in the cloud. Access is via the Alexa Voice Service API. Alexa receives requests as speech streams. Alexa analyzes the speech and identifies the requested skill. A structured representation of the skill and intent is created.

    Now there are two possible paths. If the skill code is deployed in AWS Lambda, that service will be triggered within AWS. If it's a web service, Alexa calls its REST API over HTTPS. In either case, Alexa will receive the response as text. Optionally, the response may have images. Alexa will convert the text into speech, which gets streamed to the user via their current device. Images will be served where the device has a display.

    A skill is identified by an invocation name and mapped to a backend service. A skill can be designed to do multiple things, each of which is called an intent. Each intent can be identified by one or more words or phrases, each of which is called an utterance. For example, you may create a travel-related skill that has three intents: rent a car, book a flight, or book a hotel.
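
    The travel example above could be handled in the backend roughly as follows. This is only a sketch: the intent names (`RentCarIntent`, `BookFlightIntent`, `BookHotelIntent`), the helper `speak` and the replies are hypothetical, chosen for illustration. It shows how the structured request that Alexa builds from speech is routed to one handler per intent.

```python
from typing import Callable, Dict


def speak(text: str, end_session: bool = True) -> dict:
    """Wrap plain text in the response envelope that Alexa turns into speech."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


# One hypothetical handler per intent of the travel skill.
def rent_car(slots: dict) -> dict:
    return speak("Okay, let's rent a car.")


def book_flight(slots: dict) -> dict:
    return speak("Okay, let's book a flight.")


def book_hotel(slots: dict) -> dict:
    return speak("Okay, let's book a hotel.")


INTENT_HANDLERS: Dict[str, Callable[[dict], dict]] = {
    "RentCarIntent": rent_car,
    "BookFlightIntent": book_flight,
    "BookHotelIntent": book_hotel,
}


def handle_request(event: dict) -> dict:
    """Route the structured request to the handler for the matching intent."""
    request = event.get("request", {})
    if request.get("type") == "LaunchRequest":
        # Keep the session open so the user can say what they want next.
        return speak("Welcome. Do you need a car, a flight or a hotel?", end_session=False)
    if request.get("type") == "IntentRequest":
        intent = request.get("intent", {})
        handler = INTENT_HANDLERS.get(intent.get("name"))
        if handler is not None:
            return handler(intent.get("slots") or {})
    return speak("Sorry, I can't help with that.")


if __name__ == "__main__":
    event = {"request": {"type": "IntentRequest", "intent": {"name": "BookFlightIntent"}}}
    print(handle_request(event)["response"]["outputSpeech"]["text"])
```

    Note that the backend never sees the raw audio or the exact utterance matching; it only receives the intent name and any slot values that Alexa has already resolved.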

  • Could you define some Alexa-specific terminology?
    An illustration of some key Alexa terms. Source: Amazon Alexa 2018b.

    Here are a few essential terms. More are available in the Alexa Skills Kit Glossary:

    • Skills: A capability of Alexa. Alexa comes with built-in skills, such as playing music. Developers can also build new skills.
    • Intents: The main request or action associated with the user's command for a custom skill.
    • Slots: Arguments to an intent that give Alexa more information. These can be required or optional.
    • Utterances: Words users say to convey what they want Alexa to do or in response to a question.
    • Interaction Model: The words and phrases users can say to make a skill do what they want. It determines the requests the skill can handle and the words users say to invoke those requests.
    • Invocation: The act of beginning an interaction with a particular Alexa ability.
    • Wake word: A word that "wakes up" Alexa for a task. Typically, this word is "Alexa".
    • Cards: Elements displayed in a visual interface to describe or enhance voice interaction.
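
    To see how these terms fit together, here is a sketch of an interaction model written as a Python dictionary that mirrors the JSON layout used for custom skills. The invocation name "travel buddy", the intent names, the slot name `city` and the sample utterances are all made up for this example. The two helper functions are hypothetical conveniences for inspecting the model, not part of any Alexa SDK.

```python
import json
import re

# Hypothetical interaction model for a travel skill.
INTERACTION_MODEL = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "travel buddy",  # what users say to start the skill
            "intents": [
                {
                    "name": "BookFlightIntent",
                    # Slot: an argument of the intent, filled from the utterance.
                    "slots": [{"name": "city", "type": "AMAZON.US_CITY"}],
                    # Utterances: sample phrases that map to this intent.
                    "samples": ["book a flight to {city}", "fly me to {city}"],
                },
                {
                    "name": "RentCarIntent",
                    "slots": [],
                    "samples": ["rent a car", "i need a rental car"],
                },
            ],
        }
    }
}


def utterances_by_intent(model: dict) -> dict:
    """Return a mapping of intent name to its sample utterances."""
    intents = model["interactionModel"]["languageModel"]["intents"]
    return {intent["name"]: intent.get("samples", []) for intent in intents}


def undeclared_slots(model: dict) -> list:
    """List (intent, slot) pairs where an utterance uses a slot that is not declared."""
    problems = []
    for intent in model["interactionModel"]["languageModel"]["intents"]:
        declared = {slot["name"] for slot in intent.get("slots", [])}
        for sample in intent.get("samples", []):
            for used in re.findall(r"\{(\w+)\}", sample):
                if used not in declared:
                    problems.append((intent["name"], used))
    return problems


if __name__ == "__main__":
    print(json.dumps(utterances_by_intent(INTERACTION_MODEL), indent=2))
    print("Undeclared slots:", undeclared_slots(INTERACTION_MODEL))
```

    With the default wake word, a user would say something like "Alexa, ask travel buddy to book a flight to Boston": the wake word, the invocation name, an utterance and a slot value in one sentence.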
  • What technologies power Alexa under the hood?
    A banking app chatbot using Lex and Polly but without Amazon Alexa. Source: Amazon Web Services 2018.

    Two fundamental technologies required to make Alexa work are speech-to-text and text-to-speech conversion. To make these work reliably, Alexa employs deep learning algorithms. More specifically, there are two technologies, both of which were opened to developers in November 2016:

    • Amazon Lex: This does the speech-to-text conversion using techniques of Automatic Speech Recognition and Natural Language Understanding. It can be used independently of Alexa to build conversational interfaces (such as chatbots) using a mix of voice and text.
    • Amazon Polly: This does the text-to-speech conversion by synthesizing speech in various languages and accents. In May 2018, Alexa developers got access to eight different voices from Amazon Polly. In fact, we can use Speech Synthesis Markup Language (SSML) to generate natural-sounding speech. One researcher showed how SSML can be used to teach Alexa to speak with a Boston accent.
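
    As a small illustration of SSML, the sketch below builds a marked-up string and wraps it in a skill response. The helper names `build_ssml` and `ssml_response` are hypothetical; the tags used (`speak`, `prosody`, `break`) are standard SSML elements, and the response uses output speech of type `SSML` instead of `PlainText`.

```python
from xml.sax.saxutils import escape


def build_ssml(sentences, pause_ms=500, rate="medium"):
    """Join sentences into one SSML document with a pause between them."""
    # Escape characters such as & and < so the markup stays well-formed.
    parts = [f'<prosody rate="{rate}">{escape(s)}</prosody>' for s in sentences]
    pause = f'<break time="{pause_ms}ms"/>'
    return "<speak>" + pause.join(parts) + "</speak>"


def ssml_response(ssml):
    """Wrap an SSML string in the response envelope of a custom skill."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "SSML", "ssml": ssml},
            "shouldEndSession": True,
        },
    }


if __name__ == "__main__":
    ssml = build_ssml(["Here is your fact.", "Honey never spoils."], pause_ms=300, rate="slow")
    print(ssml)
    print(ssml_response(ssml)["response"]["outputSpeech"]["type"])
```

    Slowing the rate or adding pauses changes pacing only; accent effects like the one mentioned above would need further markup, such as phoneme-level pronunciation hints.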
  • What resources are available to learn about programming for Alexa?

    A good place to start is the Voice Design Guide. This includes a design checklist and a glossary. There are also guides on using the Alexa Skill Builder.

    For hands-on practice, you can start with the 6-step tutorial to build a fact skill. An easier way to get started is to use Alexa Skill Blueprints. These are templates that you can customize to make your own skill. Alexa on GitHub contains open source code.

    There are some useful developer tools. There are also open source frameworks for building voice apps. Two such frameworks are Jovo and Violet. These frameworks are cross-platform; that is, from a single codebase they can target Amazon Alexa, Google Assistant, etc. Though not open source, Alexa Flow is a tool to create, manage and host your custom skills without coding knowledge. An alternative is Storyline, which offers a visual drag-and-drop design interface, previews, deployment and analytics. Many more tools, SDKs and platforms are listed in Wolter's blog.

    Blogs specific to Alexa are useful to get the latest news or learn best practices. From the main Alexa website you can access all Alexa-specific documentation.

Milestones

Nov
2014
First Echo device and what it could do. Source: Bishop 2014.

Amazon Echo becomes available by invite only to select Amazon Prime customers. This is a voice-activated connected speaker that can act on user requests or answer basic questions. It's actually powered in the cloud by a virtual assistant called Alexa.

Jun
2015

To enable third-party integration of Alexa into their connected devices, Alexa Voice Service (AVS) is released. In April 2016, Invoxia becomes the first maker to release Alexa service on a non-Amazon device. Called Triby, it's a kitchen device that can set timers, play music or read sports scores.

Apr
2016

The Alexa Smart Home API is launched. In May 2017, with the Smart Home Skill API, both users and developers can use appliance categories (camera, light, smartlock, etc.) to easily identify device types.

Apr
2017
Example screenshot of Alexa Skill Builder showing utterances. Source: Adapted from Kim 2018a.

Amazon brings out the Alexa Skill Builder, a more intuitive interface to build skills including the interaction model and multi-turn dialogues. Work on this started back in October 2016.

Apr
2018

Amazon releases Alexa Skill Blueprints. These are skill templates that you can customize rather than develop custom skills from scratch.

May
2018

Developers cannot sell Alexa skills but Amazon enables In-Skill Purchasing (ISP). If a skill drives user engagement, Amazon also offers rewards to developers. Amazon releases a preview in which developers can choose any of eight U.S. English voices to narrate skills. This is powered by Amazon Polly.

Aug
2018

Amazon releases the Alexa Auto SDK on GitHub. Automobile makers can integrate Alexa directly into their vehicles. Amazon also announces Quick Start Templates to accelerate skill development for popular skill samples.

References

  1. Alexa Developers YouTube. 2016. "What Is Alexa? An Introduction to Amazon's Voice Service." Alexa Developers, YouTube, September 14. Accessed 2018-08-31.
  2. Amazon Alexa. 2018a. "Glossary." Voice Design Guide. Accessed 2018-08-30.
  3. Amazon Alexa. 2018b. "What Users Say." Voice Design Guide. Accessed 2018-08-30.
  4. Amazon Blueprints. 2018. "Amazon Alexa Skill Blueprints." Accessed 2018-08-31.
  5. Amazon Developer. 2018. "Amazon Alexa." Accessed 2018-08-30.
  6. Amazon Developer Docs. 2018a. "Understand the Different Skill Models." Alexa Skills Kit. Accessed 2018-08-31.
  7. Amazon Developer Docs. 2018b. "Build Skills with the Alexa Skills Kit." Alexa Skills Kit. Accessed 2018-08-31.
  8. Amazon Developer Docs. 2018c. "Develop Skills in Multiple Languages." Alexa Skills Kit. Accessed 2018-08-30.
  9. Amazon Developer Docs. 2018d. "AVS UX Logo and Brand Usage." AVS Documentation. Accessed 2018-08-31.
  10. Amazon Web Services. 2018. "Amazon Lex – Build Conversation Bots." Accessed 2018-08-30.
  11. Bishop, Todd. 2014. "Amazon’s elusive ‘Echo’ intelligent home speaker falls short in first major review." GeekWire, December 02. Accessed 2018-08-31.
  12. Blankenburg, Jeff. 2017. "Define Your Appliance Category for a Better Customer Experience." Alexa Blogs, May 23. Accessed 2018-08-30.
  13. Burns, Matt. 2016. "Amazon Alexa is now available on first device not made by Amazon." TechCrunch, April 28. Accessed 2018-08-31.
  14. CNET. 2016. "Here's what works with the Amazon Echo." CNET, December 22. Accessed 2018-08-31.
  15. Campos, Ivan. 2017. "Teaching Alexa to speak with a Boston accent." Slalom Technology, June 25. Accessed 2018-08-30.
  16. Firstpost. 2016. "Amazon announces three new AI services called Lex, Polly and Rekognition for AWS." December 01. Accessed 2018-08-30.
  17. Forrest, Conner. 2018. "How to become an Alexa developer: A cheat sheet." TechRepublic, May 09. Accessed 2018-08-30.
  18. Foster, Adam. 2018. "Announcing the Alexa Auto Software Development Kit (SDK)." Alexa Blogs, August 09. Accessed 2018-08-31.
  19. Haberkorn, BJ. 2018. "Announcing Quick Start Templates for Popular Skill Samples." Alexa Blogs, August 30. Accessed 2018-08-31.
  20. Isbitski, David. 2015. "Announcing the Alexa Voice Service (AVS)." Alexa Blogs, June 25. Accessed 2018-08-31.
  21. Isbitski, David. 2017. "Announcing New Alexa Skill Builder (Beta), a Tool for Creating Skills." Alexa Blogs, April 18. Accessed 2018-08-31.
  22. Jensen, Jenny. 2017. "Alexa App Development." KRM Associates, September 27. Accessed 2018-08-31.
  23. Kim, Daniel I. 2018a. "Amazon Alexa Skill Builder." Accessed 2018-08-31.
  24. Kim, Woo. 2018b. "New Developer Preview: Use Amazon Polly voices in Alexa skills." Alexa Blogs, May 16. Accessed 2018-08-31.
  25. Lorenzetti, Laura. 2014. "Forget Siri, Amazon now brings you Alexa." Fortune, November 06. Accessed 2018-08-31.
  26. Newman, Jared. 2017. "Amazon's Alexa Is A Real Smart Home Platform Now." Fast Company, September 28. Accessed 2018-08-30.
  27. Warren, Tom. 2018. "The Xbox One will reportedly soon support Alexa and Google Assistant." The Verge, June 03. Accessed 2018-08-31.
  28. Weinberger, Matt. 2017. "How Amazon's Echo went from a smart speaker to the center of your home." Business Insider India, May 24. Accessed 2018-08-30.

Further Reading

  1. Codecademy. 2018. "Introduction to Alexa." Accessed 2018-09-03.
  2. Hwang, Yitaek. 2017. "Build Your First Custom Alexa Skill in 10 Minutes." IoT For All, Medium, December 06. Accessed 2018-08-30.
  3. Prickett, Simon. 2016. "Build an Alexa Skill with Python and AWS Lambda." Modus Create, August 11. Accessed 2018-08-30.
  4. Weinberger, Matt. 2017. "How Amazon's Echo went from a smart speaker to the center of your home." Business Insider India, May 24. Accessed 2018-08-30.
  5. Amazon Developer Docs. 2018a. "Understand the Different Skill Models." Alexa Skills Kit. Accessed 2018-08-31.

Cite As

Devopedia. 2018. "Amazon Alexa." Version 6, December 1. Accessed 2023-11-13. https://devopedia.org/amazon-alexa
Contributed by 2 authors
Last updated on 2018-12-01 09:52:47