Technology Trends/Digital Assistants
Status | Translation
Initial release | September 25, 2019
Latest version | February 14, 2020
Official publication | Digital Assistants.pdf
Digital Assistants, also known as Conversational User Interfaces (CUIs), attempt to bridge the gap between natural spoken and written human language and devices. Traditional Graphical User Interfaces (GUIs) let users navigate an electronic device through buttons, visual cues, and text. CUIs can cut through the potentially complicated steps that a user must go through to accomplish a task, such as retrieving information, getting directions, sending emails, playing media, ordering food, or organizing a calendar.
Business Brief
In today’s market, Conversational UIs are produced in two forms, both of which rely on Artificial Intelligence technologies: voice assistants and chatbots[1]. Voice assistants such as Amazon Alexa, Google Assistant, Cortana, and Siri allow for audio and text-based communication between users and devices, while chatbots are text-based only.
Conversational UIs will continue to progress and are likely to secure a crucial role in Internet of Things (IoT) devices moving forward. Tasks that were traditionally performed by secretarial positions can potentially be done by digital assistants (DAs) that communicate in the background with other applications. These tasks include managing personal information, providing appointment reminders, storing contact information, sending messages, and much more. With their deep technology embedding, varied functional capabilities, and easy-to-use interfaces, DAs can provide useful support within an organization by using Natural Language Understanding (NLU)[4] and Natural Language Processing (NLP)[5] to enable human-like communication with websites, mobile apps, and external devices.
Technology Brief
Conversational UI functions by having the user input natural language, either in audio or written form. The input is then processed by Artificial Intelligence (AI) systems and a response is given. Several voice assistants, including Apple’s Siri, Google Assistant, Samsung’s Bixby[6], and Amazon’s Alexa, use cloud-based technology to process the input speech given by the user. The advantage of this is the ability to construct large databases of audio allowing the voice assistant to process the input faster and predict what the individual will say.
Digital Assistants, also known as chatbots or virtual assistants, function in a similar manner and can be divided into two broad categories: scripted (basic user interaction) and structured (engaging virtual assistants). They may also be integrated with collaboration tools and messaging applications.
Scripted bots[7] can be thought of as providing a guided conversation. Although they are less involved to implement, they also place stricter limits on the level of communication that can be performed. Structured chatbots rely on AI, and more specifically on cloud-based Natural Language Understanding (NLU), to generate machine-actionable data from user input. Unlike their scripted counterparts, these chatbots are more complex and require more effort to implement properly. In return, the end user can be less rigid in the way questions and interactions are phrased.
Scripted bots follow decision trees to answer pre-defined questions. Chatbots with the same capabilities can deliver vastly different customer experiences because of the quality of their underlying decision trees.[8] A bot needs to ask enough questions, at an appropriate level of depth, to give the customer the most accurate answer, while avoiding unneeded questions and never leading a customer in circles without providing an answer. Customer journey maps[9] can assist with the development of the decision tree.
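The decision-tree idea described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Node` class, the `run_script` helper, and the help-desk wording are assumptions made for illustration and do not come from any particular chatbot product.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One step in a scripted conversation: a prompt plus the allowed replies."""
    prompt: str
    # Maps a normalized user reply to the next node; empty for a final answer.
    options: Dict[str, "Node"] = field(default_factory=dict)


def run_script(root: Node, replies: List[str]) -> List[str]:
    """Walk the tree with a list of user replies and return the bot's prompts."""
    transcript = [root.prompt]
    node = root
    for reply in replies:
        if not node.options:  # leaf reached: nothing more to ask
            break
        nxt = node.options.get(reply.strip().lower())
        if nxt is None:
            # A scripted bot cannot interpret free text, so it re-prompts.
            choices = ", ".join(sorted(node.options))
            transcript.append("Please choose one of: " + choices)
            continue
        node = nxt
        transcript.append(node.prompt)
    return transcript


# Hypothetical IT help-desk script (all wording is illustrative).
tree = Node(
    "Is your issue with 'password' or 'network'?",
    {
        "password": Node("Visit the self-service reset page to choose a new password."),
        "network": Node(
            "Are you on 'wifi' or 'wired'?",
            {
                "wifi": Node("Forget the network on your device, then reconnect."),
                "wired": Node("Check that the cable is firmly seated at both ends."),
            },
        ),
    },
)

transcript = run_script(tree, ["Network", "bluetooth", "wifi"])
for line in transcript:
    print(line)
```

The unrecognized reply ("bluetooth") shows the rigidity of the scripted approach: the bot can only repeat the available choices, which is why the depth and wording of the tree matter so much.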
Digital assistants are software agents that perform various tasks and services for their users. DAs such as Samsung’s Bixby[10] use voice commands to control applications and handle requests. This recent development makes it relatively easy for end users to master the software. Digital Assistants also have skills, which are apps that can be acquired through a skill store. This allows users to install the skills that would be most useful to them.
Generally, voice-activated digital assistants operate continuously at a low processing rate until a keyword is heard, which notifies the DA that a request is about to be made. For example, Google Home listens to its surroundings until the words “Ok Google” are said aloud. The digital assistant then translates the user’s subsequent commands into text, which is analyzed by multiple algorithms to execute a task. These algorithms break requests up into key parts, which makes it easy to send emails or messages, or to store records.
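As a rough illustration of the flow just described (wait for a keyword, then break the request into key parts), the sketch below works on text that is assumed to have already been transcribed from speech. The pattern table, intent names, and function names are hypothetical; production assistants use trained language models rather than hand-written regular expressions.

```python
import re
from typing import Dict, Optional

# Wake phrases taken from the examples in the text.
WAKE_WORDS = ("ok google", "hey siri")

# Hypothetical patterns: each maps a request shape to an intent name.
INTENT_PATTERNS = {
    "set_timer": re.compile(
        r"set a timer for (?P<amount>\d+) (?P<unit>seconds?|minutes?|hours?)"
    ),
    "send_message": re.compile(
        r"send a message to (?P<recipient>\w+) saying (?P<body>.+)"
    ),
}


def strip_wake_word(transcript: str) -> Optional[str]:
    """Return the command that follows a wake word, or None if none was heard."""
    text = transcript.lower().strip()
    for wake in WAKE_WORDS:
        if text.startswith(wake):
            return text[len(wake):].lstrip(" ,")
    return None


def parse_command(transcript: str) -> Optional[Dict[str, str]]:
    """Split a transcribed request into an intent and its key parts."""
    command = strip_wake_word(transcript)
    if command is None:
        return None  # no wake word: the assistant stays idle
    for intent, pattern in INTENT_PATTERNS.items():
        match = pattern.search(command)
        if match:
            return {"intent": intent, **match.groupdict()}
    return {"intent": "unknown", "text": command}


print(parse_command("Hey Siri, set a timer for 30 seconds"))
print(parse_command("set a timer for 30 seconds"))
```

The second call returns `None` because no wake word precedes the request, mirroring how the assistant ignores speech that is not addressed to it.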
There are three main types of algorithms used to analyze a request: Natural Language Processing (NLP)[12], Natural Language Understanding (NLU)[13], and Natural Language Generation (NLG)[14]. NLP deals with how to write computer programs that collect and process large amounts of natural language data. This can be a difficult task for languages such as Japanese and Chinese, which use characters that represent words and letters with no spaces between them, making it harder for a computer to tell where one word ends and the next begins. It is easier to identify words in languages like English, since they are almost always separated by a space. Another challenge arises when the same word can serve as a different part of speech (noun, verb, etc.). For example, you can turn on a light (noun), something can be light (adjective, meaning not heavy), and you can light a candle (verb).
Industry Usage
Conversational UIs provide an alternative way for humans to interact with devices. They allow users to interact with and navigate a system, app, or device using conversational input only. Conversational UIs and digital assistants can open applications, write text into an input box, and issue commands to applications (for example, “Hey Siri, set a timer for 30 seconds”). In this sense, a conversational UI can serve as a replacement for a graphical user interface (GUI).
These interfaces allow for better service delivery by providing a newer technology that end users have come to expect.
Some of the simpler customer interactions can be automated to allow human resources to perform more difficult tasks that are hard to automate in the service industry.
Sephora, a popular makeup retailer in the U.S., has a successful bot on Kik. Kik is a mobile messaging application that allows one-on-one chatting, group chats, and an internal web browser.[20] The app also has sub-apps that work within the browser, which encourages users to stay within the app.
Today, the bot engages users with a number of questions about makeup preferences and serves up content and offers relevant to the responses it receives. While it does not sound like a highly sophisticated process, the more consumers engage with the bot over time, the smarter the bot (as well as the brand) gets about consumer preferences and the better it can serve personalized content and offers.
Digital assistants offer an alternative way for users to perform their daily tasks. They have the potential to eliminate a lot of the time individuals spend typing emails, documents, checking for updates and more. The potential amount of time digital assistants can save could have countless benefits for an industry. As digital assistants develop, their capabilities increase rapidly. This means corporations can use the assistants to give instant voice messages, send broadcasts, play audio and more.
Digital assistants such as Amazon’s Alexa are already being implemented in businesses across North America. DAs are being used to notify IT service desks of technology-related issues, begin conference calls[24], locate open meeting rooms[25], operate office lights[26], and even check security camera feeds[27].
Canadian Government Use
The use of Conversational UI can provide several benefits to the Government of Canada. Since Conversational UI increases the quality of interaction between human and device, the GC can benefit from its use in the delivery of Services. For example, if Conversational UI were used to handle basic technical problems encountered by employees in the GC, this would ease the workload of IT support staff allowing them to deal with more complex issues that the Conversational UI cannot handle.
Conversational UIs could also be used on GC websites. This would aid Canadian citizens accessing the websites to quickly obtain information they are seeking through natural language requests with a chatbot. If the chatbot can’t retrieve information or direct the user where to go, the chatbot could connect the user with the relevant department’s contact information or with an appropriate help desk worker. It could also be beneficial when an individual is required to fill out forms or applications to give immediate feedback on whether the data they have entered is valid or needs to be modified.
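The immediate form feedback mentioned above might look something like the following minimal sketch. The field names, messages, and patterns are assumptions made for illustration; in particular, the postal code and email patterns only check the general shape of the input and are not complete validators.

```python
import re
from typing import Dict

# Simplified shape checks only (assumptions for this sketch):
# letter-digit-letter, optional space, digit-letter-digit.
POSTAL_CODE = re.compile(r"[A-Za-z]\d[A-Za-z] ?\d[A-Za-z]\d")
EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate_form(fields: Dict[str, str]) -> Dict[str, str]:
    """Return a message for each field that needs to be corrected."""
    problems: Dict[str, str] = {}
    if not fields.get("name", "").strip():
        problems["name"] = "Please enter your name."
    if not EMAIL.fullmatch(fields.get("email", "").strip()):
        problems["email"] = "That email address does not look complete."
    if not POSTAL_CODE.fullmatch(fields.get("postal_code", "").strip()):
        problems["postal_code"] = "Postal codes look like K1A 0B1."
    return problems


# A chatbot could relay these messages as soon as the user submits an answer.
print(validate_form({"name": "Sam", "email": "sam@", "postal_code": "12345"}))
```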
Digital assistants have a unique capability in that user-generated “skills” or functions can be added by users and businesses to deal with specific tasks. The GC could create its own “skills” to deal with specific scenarios unique to the federal workplace. Doing so could increase productivity by giving workers more time to focus on important tasks, with the related potential to save money since less time would be spent on repetitive tasks.
According to Sarah Turnbull, the first instance of the GC launching a chatbot ran from December 25, 2017 to February 28, 2018 as part of a Public Safety Canada awareness campaign called “Don’t Drive High”[31]. The chatbot was designed as a way to interactively educate people aged 16 to 24 about the risks of driving high, while also giving them a way to find help or find a ride home.
Under the CBSA Assessment and Revenue Management (CARM) initiative[32], the Canada Border Services Agency (CBSA) is currently proposing a chatbot[33] that can help Trade Chain Partners (TCPs) with getting to valuable information faster. The proposed chatbot will help TCPs with understanding CBSA regulations as they relate to questions asked by the TCP. The chatbot will help with the CARM initiative’s stated goal of “modernizing and streamlining the process of importing commercial goods into Canada”.
Implications for Government Agencies
Value Proposition
Shared Services Canada (SSC) could gain value from conversational UI internally, by allowing it to deal with employee technical issues in a self-serve fashion. It can also allow external individuals to gain information quickly about SSC and refine their questions, without having to navigate an information rich website they may be unfamiliar with. Usage data collected from user interactions can help with getting insights on how to improve service delivery to stakeholders, clients and Canadians.
DAs can also increase employee productivity[34] and help them focus on the core tasks of their mandate. DAs and chatbots alike can be used to fill out forms and can prompt users when a section has been filled out incorrectly and submit them on a user’s behalf to save time.[35]
Challenges
The launch of Conversational UIs in the GC presents a few challenges. Using either voice-activated assistants or chatbots designed on platforms like API.ai and Wit.ai means that the GC will be using Google’s or another company’s cloud computing network to process the information submitted to the conversational UI. SSC will need to assess the security and privacy implications this brings. If the Conversational UI were designed without the use of these platforms, it would require a sizable investment to design and maintain, unless an open source solution is adopted.
The challenges with digital assistants are important to note because of their severity. One of the issues with digital assistants is the security threat they can pose. Since the majority of DAs are voice-activated, they are vulnerable to attacks such as the DolphinAttack[36]. The DolphinAttack takes its name from the fact that dolphins can hear frequencies that humans cannot.[37] Cyber criminals are able to transmit commands to assistants such as Siri at a frequency inaudible to the human ear, yet clear to DAs.[38] By doing so, they can use the DAs to visit malicious websites, or pose questions to the DAs about matters that are critical to the operation of the GC.
Background audio recordings and legitimate questions are all sent back to a main database owned by the DA’s operator, and the security of these files is then in the hands of those operators (e.g. Google, Amazon, Microsoft, Samsung). If any of those companies has a security breach, sensitive information from the GC could be accessed and shared with malicious actors. On a smaller scale, this has happened accidentally when an Amazon Echo device misinterpreted a conversation being held in another room and sent the entire audio file to a contact stored in a contact list[39].
Other security flaws with DAs involve the use of “skills”. Skills can be added to a DA from a skills store operated by the DA owner. Some of them are user-submitted, and this can pose problems. Voice squatting[40] exploits the way a DA launches skills. If a malicious user-submitted skill is spelled and pronounced similarly to a legitimate one, the user may accidentally invoke the malicious skill. For example, a command like “Hey Alexa, open Capital One” could also be interpreted as “Hey Alexa, open Capitol Won”, and the command might open a malicious skill.
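To make the voice squatting risk concrete, the sketch below flags skill names that closely resemble an already registered name. It is only an illustration: it compares spelling using Python's standard `difflib` module, whereas a real review process would more likely compare how the names are pronounced. The function names, skill names other than the example from the text, and the 0.75 threshold are all assumptions.

```python
import re
from difflib import SequenceMatcher
from typing import List


def normalize(name: str) -> str:
    """Lowercase a skill name and collapse punctuation and extra spaces."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Spelling similarity between 0.0 and 1.0 (not a phonetic measure)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_lookalikes(candidate: str, registered: List[str],
                    threshold: float = 0.75) -> List[str]:
    """List registered skills that a newly submitted name could be confused with."""
    return [
        name for name in registered
        if normalize(name) != normalize(candidate)
        and similarity(candidate, name) >= threshold
    ]


# "Weather Report" is a made-up skill name used as a contrast case.
registered_skills = ["Capital One", "Weather Report"]
print(find_lookalikes("Capitol Won", registered_skills))
```

A skill store could run a check of this kind at submission time and send near matches to a human reviewer, rather than rejecting them automatically.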
Considerations
The draft of the Responsible Artificial Intelligence in the Government of Canada White Paper[41] from the Treasury Board of Canada Secretariat outlines some very good considerations for institutions within the GC looking to deploy chatbots:
- Chatbot conversations should be introduced with a brief privacy notice that is compliant with the Treasury Board Standard on Privacy and Web Analytics[42]. This notice should provide a link to a page with more information on the information collected in the course of the conversation, including any metadata, for example: time and date, duration, whether the conversation was ended by the user or the agent, whether and when the discussion was escalated to a human, etc.
- Whether the bot is able to provide a professional tone as a representative of the Government of Canada. Machine learning chatbots may learn language that is potentially unprofessional, abusive, or harassing if exposed to sufficient examples. Where possible, institutions should work with vendors to prevent them from learning this behaviour, whether using a keyword blacklist, or other methodology. It is important to be continually monitoring chatbots’ performance in this regard.
- Institutions should be mindful that people in rural or remote locations may encounter latency that will affect their ability to respond to the chatbot’s queries. It's important to ensure that response times from the user are permissive.
- Chatbots need to be accessible. They should use plain language so as to be understood by users with lower levels of education or comfort with Canada’s official languages. It is also important that chatbots be able to be read by screen readers, or are able themselves to communicate vocally, for persons with visual disabilities.
- Users should be provided with a clear escape from the conversation. If a user finds that a chatbot is no longer useful, or is incapable of answering their query, there should be a clear means to transfer the conversation to a human agent (if available), or to send email correspondence. Additionally, if a chatbot has answered a query and the user has ended the session or refrained from answering another question, the chatbot should politely end the conversation.
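Two of the considerations above, the keyword list for unprofessional language and the clear escape to a human agent, can be sketched in a few lines. This is an assumption-laden illustration: the placeholder terms, trigger words, fallback message, and function names are hypothetical, and a production chatbot would need far more robust moderation.

```python
import re
from typing import Iterable

# Hypothetical trigger words that signal the user wants a person.
ESCAPE_WORDS = ("agent", "human", "representative")
FALLBACK = "I'm sorry, I can't help with that. Would you like to speak with an agent?"


def build_blocklist(terms: Iterable[str]) -> "re.Pattern":
    """Compile a whole-word, case-insensitive pattern from blocked terms."""
    term_list = [t for t in terms if t]
    if not term_list:
        raise ValueError("at least one blocked term is required")
    alternatives = "|".join(re.escape(t) for t in term_list)
    # \b keeps the match to whole words, so "heck" does not match "checking".
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


def safe_reply(candidate: str, blocklist: "re.Pattern") -> str:
    """Replace a generated reply with a neutral fallback if it contains a blocked term."""
    return FALLBACK if blocklist.search(candidate) else candidate


def wants_human(user_message: str) -> bool:
    """Detect a request to leave the bot and reach a person."""
    words = re.findall(r"[a-z]+", user_message.lower())
    return any(word in ESCAPE_WORDS for word in words)


blocklist = build_blocklist(["darn", "heck"])  # mild placeholder terms
print(safe_reply("Well, heck!", blocklist))
print(safe_reply("Checking your file now.", blocklist))
print(wants_human("Can I talk to a human?"))
```

Because a fixed keyword list is easy to evade, it works best alongside the continual monitoring that the white paper recommends.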
In accordance with the shift towards adopting open source tools, the GC should consider testing and adopting open source DAs. As with their commercial counterparts, they are continuously evolving thanks to user input and also have the same capabilities, without the risk of information being sent back to a central database. They also support custom “skills” that can potentially be made by the GC to cater to specific problems.
Ultimately, conversational user interfaces are still being developed and refined as more and more companies build their own solutions. This means that over time, the challenges introduced by the technology will become less significant and the solutions will become more useful and intuitive.
Digital assistants are a potentially disruptive technology that will save time but also might cut back on the need for employees whose sole tasks could be completed by a DA. There is currently not much enterprise use for digital assistants when compared to the consumer market. There is a lot of potential for DAs to help in the enterprise sphere, but it has yet to be realized.
References
- ↑ Brownlee, J. (2016, April 4). Conversational Interfaces, Explained. Retrieved from fastcompany.com
- ↑ Reddy, T. (2017, October 17). How chatbots can help reduce customer service costs by 30%. Retrieved from ibm.com: https://www.ibm.com/blogs/watson/2017/10/how-chatbots-reduce-customer-service-costs-by-30-percent/
- ↑ Costa, A. D. (2018, November 8). How-To: Activate Your Windows 10 License via Microsoft Chat Support. Retrieved from groovypost.com
- ↑ Wikipedia. (2019, August 24). Natural-language understanding. Retrieved from en.wikipedia.org: https://en.wikipedia.org/wiki/Natural-language_understanding
- ↑ Wikipedia. (2019, September 15). Natural language processing. Retrieved from en.wikipedia.org: https://en.wikipedia.org/wiki/Natural_language_processing
- ↑ Jansen, M. (2019, March 13). How to use Samsung Bixby: Everything you need to know. Retrieved from digitaltrends.com
- ↑ Onlim. (2019, March 11). How Do Chatbots Work? Retrieved from onlim.com: https://onlim.com/en/how-do-chatbots-work/
- ↑ Steele, I. (2018, February 22). Journey Mapping for Chatbots: How to Create a Chatbot Decision Tree from Scratch. Retrieved from comm100.com: https://www.comm100.com/blog/journey-mapping-chatbot-decision-tree-from-scratch.html
- ↑ Atlassian. (2019, September 20). Customer Journey Mapping. Retrieved from atlassian.com
- ↑ Agence France-Presse. (2017, April 6). Samsung's new personal digital assistant Bixby faces a few tough challenges. Retrieved from scmp.com
- ↑ Moynihan, T. (2016, December 5). Alexa and Google Home Record What You Say. But What Happens to That Data? Retrieved from wired.com: https://www.wired.com/2016/12/alexa-and-google-record-your-voice/
- ↑ Wikipedia. (2019, September 15). Natural language processing. Retrieved from en.wikipedia.org: https://en.wikipedia.org/wiki/Natural_language_processing
- ↑ Wikipedia. (2019, August 24). Natural-language understanding. Retrieved from en.wikipedia.org: https://en.wikipedia.org/wiki/Natural-language_understanding
- ↑ Wikipedia. (2019, September 6). Natural-language generation. Retrieved from en.wikipedia.org: https://en.wikipedia.org/wiki/Natural-language_generation
- ↑ Rouse, M. (2019, September 20). natural language understanding (NLU). Retrieved from searchenterpriseai.techtarget.com: https://searchenterpriseai.techtarget.com/definition/natural-language-understanding-NLU
- ↑ Wikipedia. (2019, September 6). Natural-language generation. Retrieved from en.wikipedia.org: https://en.wikipedia.org/wiki/Natural-language_generation
- ↑ Goldberg, E., Driedger, N., & Kittredge, R. I. (1994, April). Using Natural-Language Processing to Produce Weather Forecasts. Retrieved from dl.acm.org
- ↑ Automated Insights. (2018, January 30). The Ultimate Guide to Natural Language Generation. Retrieved from medium.com
- ↑ Mielke, C. (2016, July 18). Conversational Interfaces: Where Are We Today? Where Are We Heading? Retrieved from smashingmagazine.com
- ↑ webwise.ie. (2019, September 20). Explainer: What is Kik? Retrieved from webwise.ie: https://www.webwise.ie/parents/explainer-what-is-kik/
- ↑ Paulson, K. (2017, March 23). A beginner's guide to designing conversational interfaces. Retrieved from webdesignerdepot.com: https://www.webdesignerdepot.com/2017/03/a-beginners-guide-to-designing-conversational-interfaces/
- ↑ Catanzariti, P. (2017, May 22). How to Build Your Own AI Assistant Using Api.ai. Retrieved from sitepoint.com
- ↑ Rouse, M. (2016, January 29). Google Cloud Platform (GCP). Retrieved from searchcloudcomputing.techtarget.com: https://searchcloudcomputing.techtarget.com/definition/Google-Cloud-Platform
- ↑ Amazon. (2019, August 13). Alexa for Business. Retrieved from docs.aws.amazon.com
- ↑ Perez, S. (2018, October 10). Alexa can now reserve conference rooms. Retrieved from techcrunch.com: https://techcrunch.com/2018/10/10/alexa-can-now-reserve-conference-rooms/
- ↑ Smart Home Focus. (2019, March 9). Alexa turn on the lights. Retrieved from smarthomefocus.com: https://www.smarthomefocus.com/alexa-turn-on-lights/
- ↑ Lamkin, P. (2019, April 17). How to view security camera footage on your Amazon Echo devices. Retrieved from the-ambient.com
- ↑ Sutton, J. (2019, April 9). LivePerson helps McDonald's Canada launch conversational commerce on Google Assistant. Retrieved from newswire.ca: https://www.newswire.ca/news-releases/liveperson-helps-mcdonald-s-canada-launch-conversational-commerce-on-google-assistant-802328181.html
- ↑ AtTask. (2014, October 22). AtTask Study Shows Miscommunication and Distractions Overshadow Work Productivity. Retrieved from prnewswire.com
- ↑ Amazon. (2019, August 13). Alexa for Business. Retrieved from docs.aws.amazon.com
- ↑ Turnbull, S. (2018, April 9). Ottawa used Facebook chatbot for ‘driving high’ campaign. Retrieved from ipolitics.ca: https://ipolitics.ca/2018/04/09/facebook-chatbot-message-about-driving-high-on-pot-a-first-for-feds/
- ↑ Canada Border Services Agency. (2019, September 4). CBSA Assessment and Revenue Management. Retrieved from cbsa-asfc.gc.ca
- ↑ Canadian Society of Customs Brokers. (2019, April 10). CARM Trade Chain Partners (TCP) Consultation Meeting, April 2019. Retrieved from cscb.ca
- ↑ Gibbison, M. (2017, January 11). 7 ways digital assistants and AI will help transform public services. Retrieved from diginomica.com
- ↑ Clifford, C. (2014, November 23). How Much Time Do Your Employees Spend Doing Real Work? The Answer May Surprise You. (Infographic). Retrieved from entrepreneur.com
- ↑ Arntz, P. (2018, July 18). What’s the real value—and danger—of smart assistants? Retrieved from blog.malwarebytes.com
- ↑ Khandelwal, S. (2017, September 7). Hackers Can Silently Control Siri, Alexa & Other Voice Assistants Using Ultrasound. Retrieved from thehackernews.com
- ↑ Khandelwal, S. (2017, September 7). Hackers Can Silently Control Siri, Alexa & Other Voice Assistants Using Ultrasound. Retrieved from thehackernews.com
- ↑ Machkovech, S. (2018, May 24). Amazon confirms that Echo device secretly shared user's private audio. Retrieved from arstechnica.com
- ↑ Umawing, J. (2018, May 30). Researchers discover vulnerabilities in smart assistants’ voice commands. Retrieved from blog.malwarebytes.com: https://blog.malwarebytes.com/cybercrime/2018/05/security-vulnerabilities-smart-assistants/
- ↑ Karlin, M. (2017, October 16). Responsible AI in the Government of Canada. Retrieved from gccollab.ca
- ↑ Treasury Board of Canada Secretariat. (2013, January 31). Standard on Privacy and Web Analytics. Retrieved from tbs-sct.gc.ca: https://www.tbs-sct.gc.ca/pol/doc-eng.aspx?id=26761