This technical report describes the methods undertaken by a US-based Digital Health company (X2AI or X2 for short) to develop an ethical code for startup environments and other organizations delivering emotional Artificial Intelligence (AI) services, especially for mental health support. With growing demand worldwide for scalable, affordable, and accessible health care solutions, the use of AI offers tremendous potential to improve emotional wellbeing. To realize this potential, it is imperative that AI service providers prioritize clear and consistent ethical guidelines that align with global considerations on user safety and privacy. This report offers a template for an ethical code that can be implemented by other emotional AI services and their affiliates. It includes practical guidelines for integrating support from clients, collaborators, and research partners. It also shows how existing ethical systems can inform the development of AI ethics.
Chatbots are among the most widely adopted applications of artificial intelligence, as is the idea of creating a chatbot for therapeutic dialogue. But when combined with today’s greatly advanced Natural Language Processing (NLP) and other modes of AI that make possible far more sensitive communication with human users, emotionally supportive chatbots are anything but retrograde.
X2 is a company that creates customized chatbots (AI coaches) for an array of use-cases, most of which focus on exploring and uplifting emotional wellbeing. This technology is highly scalable, easy to use, available on demand, and swiftly adaptable across languages, cultures, and other important contexts. This means that a supportive AI coach can complement conventional mental health care, and even reach users in times and places where other modes of care cannot. However, to achieve this potential, it is imperative that members of this industry adopt clear and consistent ethical practices for AI pertaining to human emotions. We intend this ethical code as a template for startups and other organizations which provide emotional AI services, especially within the context of mental health support.
X2’s starting place for developing such an ethical code was the realization that for emotional support AI, it is important to consider emerging AI ethics alongside established mental health professional guidelines. We began by asking: “How and where do these ethical approaches converge?”
Establishing an Ethical Code for Emotional AI
The X2 ethical code’s foundation is the overlap observed between the American Psychological Association’s (APA) General Principles for Psychologists and the AI Code principles of the UK House of Lords report (as displayed in Figure 1) [2,3]. The aim in integrating these fields is to produce a robust ethical code that adapts as emotional AI advances. X2 persistently applies these foundational ethical principles to the company’s procedures for creating and maintaining the AI system. This approach can further ongoing discussion of AI ethics by providing a practical example of how an AI ethical code can emerge through detecting and deepening the links between different ethical systems.
Figure 1: X2 Ethical AI Code
This ethical code for emotional AI is designed to protect the wellbeing of all users across the development, delivery, and maintenance of a startup or other digital health organization’s AI services. The template for this code consists of four key components:
- Core ethical principles and sources
- Privacy & security measures
- Team expertise & training
- Research & development
Core Ethical Principles and Sources
X2’s ethical guiding principles arise from the intersection of mental health care and technology ethics: we refuse to pursue or deliver services, whether independently or in collaboration with other organizations, that are likely to cause harm. Aligning with Google’s 2018 AI principles, X2 commits to blocking any use of its products and services in the implementation of weapons or other technologies whose purpose is to cause injury to people. X2 identifies commitment to harm reduction as a guiding principle for the implementation of emotional AI.
The principles recommended by Luxton et al., who comprehensively examine the ethical challenges of applying AI technologies to mental health care, further inform how we monitor and secure user privacy, safety, autonomy, and trust. X2’s most important method for sustaining this intersection of mental health care and technology ethics is seeking expert guidance on an ongoing basis. X2 is further motivated by the White House report on preparing for the future of AI, which calls for organizational strategies that focus on international, industry-led engagement.
To realize this aim, X2 regularly seeks guidance from industry and academic experts, inviting critical, reflective thinking on company practices, both established and developing. We maintain a standing medical ethical board consisting of faculty from academic institutions. The chairwoman of this advisory board also holds an observer seat on X2’s board of directors and can thereby serve as a liaison for the advisory board’s interests. X2 also routinely reaches out to other experts for ethical guidance when implementing new technologies or developing new content, such as expert research partners from Duke, Northwestern, and Stanford Universities. Likewise, when launching a service customized for a particular geographic region, X2 seeks local experts to evaluate and guide efforts at every step. Like any partnering organization, the experts with whom X2 engages must demonstrate a clear commitment to harm reduction. By making this a regular component of developing and maintaining AI services, digital health organizations create more opportunities to link ethical principles to concrete practices.
The following sections describe policies and procedures developed by X2 to support and advance the company’s ethical commitments.
Privacy & Security Measures
Ensuring the rights of users to confidentiality, privacy, and control over how their data is handled is of the utmost importance. Without earning the trust and confidence of users, AI for emotional support will not achieve its potential to help people. In particular, principles of transparency and accountability guide company practices.
Keeping pace with quickly evolving international data privacy standards, including the European Union General Data Protection Regulation (GDPR), is essential. X2 users have the same data privacy rights regardless of location or other demographic considerations. The company fully complies with the standards outlined by the GDPR and maintains these standards for all users worldwide. X2 also offers HIPAA-compliant AI coaching services for integration with existing electronic health record (EHR) systems, and HIPAA-compliant fully encrypted messaging platforms.
In instances where X2 may seek to use de-identified user data solely for research purposes, this is made transparent so that users can decide if they are willing to participate. Notably, any user may delete their data from the system at any time. If users opt in, results are reported in aggregate. For AI coaches customized for research institutions, such as Duke University, Palo Alto University, Rochester Institute of Research, and Nemours Children's Hospital, these institutions and their Institutional Review Boards administer research participation consent.
The responsibility to ensure ethical AI rests with the people creating it. Accordingly, X2 prioritizes ongoing processes for monitoring and assessing decisions and actions. The policy, content, research, and technical leaders meet regularly to evaluate best practices and research priorities, as well as develop new strategies for mitigating potential misuse of AI and user data.
The medical ethical board, which meets semiannually, is the most important source of ongoing external feedback. X2 leadership convenes focused discussions with established mental health and technology scholars and experts to advance the company’s understanding of the complex ethical questions that guide the work. In between these meetings, X2 maintains active discussion with advisory board members and seeks their input when any questions or concerns arise. Each member of the medical ethical board has acted as a content creator or provided guidance that directly contributed to the system development, making their insights especially impactful.
The focus on accountability extends beyond X2 to include partner organizations. Because X2’s services are designed to be highly customizable and widely accessible, the company works with clients to determine which communication platform(s) best serve users, such as SMS (text messaging), Facebook Messenger, Signal, Slack, and more. While the chat interfaces X2 works with include HIPAA- and GDPR-compliant offerings, X2 also prioritizes accessibility through everyday channels available to users worldwide. X2 works closely with clients alongside partner organizations to assess and mitigate any potential risks to users.
Team Expertise & Training
X2’s team expertise and training bolster the ethical commitments at every level of engagement. X2’s staff and advisors include licensed and experienced psychologists (PhD / PsyD / LLP), accredited psychiatrists (MD), and board-licensed professional counselors (LPC). New team members participate in an onboarding curriculum, undergo regular background checks, and join all team members in ongoing discussions concerning AI ethics and security requirements. X2 provides all employees with career development opportunities, such as continuing education courses for mental health professionals and coverage of registration fees for symposiums that enhance product knowledge and expertise, a policy that aligns with recommendations by both the American Psychological Association (APA) and the Association for Computing Machinery (ACM) [2,8]. Following recommendations by the White House stipulating that AI ethics should be clearly linked to practical efforts and outcomes, training at X2 is augmented with technical tools and methods for actively working to prevent unacceptable outcomes. For instance, X2 provides training to team members, research partners, and clients to white-label and customize each unique version of the core chatbot (known as “Tess”) through the company’s proprietary HIPAA-compliant platform. The knowledge and experience of team members ensure that ethics are never compromised by the technology used.
Monitoring & Minimizing Bias
One key benefit of integrating AI into mental health practices is to better provide a safe and non-judgmental space for people to explore sensitive concerns. Many people experience shame and marginalization due to a diagnosis, and go on to describe the consequences of stigma as more burdensome than those of the condition itself. While AI in this context may have the capacity to help alleviate bias and stigma, it is also the case that bias can arise in unanticipated ways within machine learning systems. X2 aims to prevent this by limiting the role of AI, and by carefully maintaining expert human oversight of it. Bias is further reduced by a globally diverse user base, including users across South America and Africa reached through the X2 Foundation, as well as customers in Europe, Asia, and Australia. The system is trained with evidence-based psychological modalities drafted by mental health professionals who are highly informed about bias and stigma. By strategically applying machine learning features to this closely managed content, X2 enables AI to develop appropriately through interactions with users.
X2 also globally monitors the AI system for any indication of absent or inadequate content areas, and for any instances of language or emotion error. By gathering and monitoring de-identified training data from an array of interactions, not only does the AI improve, but the team also learns how to more effectively anticipate and prevent bias and stigma in content creation. To minimize the bias introduced through this process, X2 employs a diverse team of core staff and external content developers.
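The kind of monitoring described above can be illustrated with a minimal sketch. This is hypothetical code, not X2’s actual pipeline: the log fields (`used_fallback`, `emotion_confidence`, `topic`) and the confidence threshold are assumptions, chosen to show how de-identified logs could surface under-served content areas for human reviewers.

```python
from collections import Counter

def flag_gaps(logs, confidence_floor=0.5):
    """Count turns where the coach fell back to a default reply or the
    emotion classifier reported low confidence, grouped by topic."""
    gaps = Counter()
    for turn in logs:
        if turn["used_fallback"] or turn["emotion_confidence"] < confidence_floor:
            gaps[turn["topic"]] += 1
    # Most frequently flagged topics first, for content reviewers.
    return gaps.most_common()

logs = [
    {"topic": "grief", "used_fallback": True, "emotion_confidence": 0.9},
    {"topic": "grief", "used_fallback": False, "emotion_confidence": 0.3},
    {"topic": "sleep", "used_fallback": False, "emotion_confidence": 0.8},
]
print(flag_gaps(logs))  # grief flagged twice; sleep handled adequately
```

A report like this would tell content creators which topics need new or revised expert-written modules.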
Additionally, X2 frequently invites users to provide direct input on services in a number of ways. The AI coaches are designed to request feedback every few conversations, much in the way a human coach or therapist does. Feedback questions are structured to gather both qualitative feedback, such as “Is there anything I can do to better support you?”, and quantitative measures used to evaluate net promoter score and user satisfaction. The team consolidates this feedback into categories that are regularly incorporated into core team discussions. The review process includes analyzing feedback for any indications that users may be experiencing, or be subject to, bias or stigma of any kind within X2 services. Occasionally, this review process overturns anticipated areas of bias or exclusion. For instance, X2 expected that older adult users might be less inclined to engage with, or benefit from, an AI coach, but preliminary findings from a trial with a major senior care provider indicated otherwise (public results forthcoming).
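The net promoter score mentioned above has a standard definition: from 0–10 “how likely are you to recommend this?” ratings, subtract the percentage of detractors (0–6) from the percentage of promoters (9–10). A minimal sketch of that computation (the example ratings are illustrative, not X2 data):

```python
def net_promoter_score(ratings):
    """Compute NPS from 0-10 likelihood-to-recommend ratings.

    Promoters score 9-10, detractors 0-6, passives 7-8.
    Result ranges from -100 to +100.
    """
    if not ratings:
        raise ValueError("no ratings provided")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)

# 5 promoters, 3 passives, 2 detractors out of 10 responses -> NPS of 30
print(net_promoter_score([10, 9, 9, 10, 9, 8, 7, 8, 5, 3]))
```

Tracking this score over releases gives a simple quantitative signal to set alongside the qualitative feedback categories.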
“AI on a Leash”
AI systems require thorough training from humans on how to best support people. X2 believes that the role of AI should be restricted to ensure that human expertise appropriately guides all AI interactions and decisions. While the X2 system might provide insights into the effectiveness of certain wording or emotion identification, any adaptation is carefully reviewed by our experts before it is implemented. In general, any word or sentence that the AI coach responds with is pre-scripted and approved by subject matter experts. X2’s staff of mental health professionals and computer scientists work collaboratively at every step in the design and maintenance of the system.
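The pre-scripted, expert-approved response policy described above can be sketched as a lookup in which the NLP layer only selects from reviewed text and never generates free-form output. The intent labels, scripts, and fallback below are hypothetical illustrations, not X2’s actual content:

```python
# Expert-approved script library: every string here would have been
# written or reviewed by mental health professionals before deployment.
APPROVED_SCRIPTS = {
    "greeting": "Hi, I'm glad you reached out. How are you feeling today?",
    "stress": "That sounds stressful. Would you like to try a short breathing exercise?",
}

# Safe clarifying reply for intents with no reviewed script yet.
FALLBACK = "I want to make sure I understand. Could you tell me a bit more?"

def respond(predicted_intent: str) -> str:
    """Return only approved text; unknown intents get the safe fallback."""
    return APPROVED_SCRIPTS.get(predicted_intent, FALLBACK)

print(respond("stress"))
print(respond("unrecognized_topic"))  # falls back rather than improvising
```

The design choice is the point: machine learning decides *which* approved script fits, while humans retain control over *what* the system can ever say.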
X2 applies this same strategy of keeping “AI on a leash” when creating AI coaches and training them on which emotions, topics, psychological modalities, and interventions are applicable in which contexts. Without exception, X2 works closely with experienced clinical psychologists to create all of the emotional support content. Similarly, to form the personality of the primary English-language AI coach (Tess), the APA’s ethical principles guided the selection of character traits. X2 used this framework to develop the AI coach’s “clinical style”—or in other words, elements such as tone, boundary setting, crisis intervention procedures, and the appropriate amount of “self-disclosure” (for instance, when asking a user to describe a relaxing activity they enjoy, Tess might indicate enjoying surfing the web). X2 worked closely with industry experts to achieve a character template that is highly customizable and responsive to the parameters set.
Ensuring User Inclusivity
One key benefit to the customizability of the X2 system is that existing content can be leveraged for each new iteration, making it easier to incorporate appropriate cultural nuances for users worldwide. For every AI coach that X2 creates, the team begins by identifying user needs and then developing a set of core modules. These modules include an introduction, brief intake, crisis support script, and relevant interventions categorized by psychological modality. These are then reviewed and revised by subject matter experts to improve reliability and applicability with regards to language, quality of support, local resources, and similar factors.
X2 currently offers AI coaching services in Arabic, Dutch, English, Japanese, and Spanish, and plans to expand its scope of service delivery by adding additional language options. X2 considers many factors during the translation process, and continually updates best practices as the range of languages and total user count increases. For example, because Spanish applies grammatical gender, the Spanish language AI coach must request the user’s gender preference in the first conversation. While professional and user feedback indicates this is an appropriate option at this time, X2 continues to work towards providing more inclusive and sensitive support to users whose gender may not be among the options stated.
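One way the grammatical-gender handling described above could be implemented is with per-template inflected forms plus a gender-neutral rewording as the default. This is a hypothetical sketch; the template IDs and Spanish strings are illustrative, not X2’s scripts:

```python
# Each template stores feminine ("f") and masculine ("m") inflections,
# plus a neutral rewording ("n") used when no preference is stated or
# when the user's gender is not among the inflected options.
TEMPLATES = {
    "welcome_back": {
        "f": "¡Bienvenida de nuevo! ¿Cómo te sientes hoy?",
        "m": "¡Bienvenido de nuevo! ¿Cómo te sientes hoy?",
        "n": "¡Hola otra vez! ¿Cómo te sientes hoy?",
    },
}

def render(template_id: str, gender: str = "n") -> str:
    """Select the inflected form matching the user's stated preference,
    falling back to the neutral rewording."""
    forms = TEMPLATES[template_id]
    return forms.get(gender, forms["n"])

print(render("welcome_back", "f"))
print(render("welcome_back"))  # neutral phrasing by default
```

Defaulting to a neutral rewording, rather than a gendered guess, matches the goal of more inclusive support for users whose gender is not among the stated options.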
When considering the role of an AI coach in emotional support, it is important to recognize unique preferences across different cultures and groups of users. X2 has internally analyzed and identified some of these differences based on user-AI interactions and direct user feedback [11,12]. For example, Spanish-speaking users from Latin American countries have shown a tendency to reply with longer descriptions and more variety in words and phrases than English-speaking users from the United States, Australia, and Europe. Likewise, qualitative feedback from adolescent and student users has expressed a preference for a speedy response from the AI coach, while those over age 50 have expressed a preference for a more natural response speed that mimics a person-to-person message exchange.
X2 also evaluates AI coach word choice, tone, and intervention selection to ensure that level of education does not limit user access. For instance, the Monterey County Health Department requested minor script edits featuring more general-use language to improve user connection with the AI coach. X2 analyzed feedback, including drop-off and error rates, to precisely identify which words and phrases to edit. These findings enabled greater inclusivity, and also helped to develop new conversation modules. Community feedback directly identified pressing areas of concern for Monterey County residents at the time, including political anxiety, fear of deportation, and substance abuse.
Research & Development
Strong emphasis on research and development is necessary to provide safe, confidential, and secure emotional support through AI. X2 allocates approximately one-third of time and resources to research efforts focused on evaluating the feasibility and efficacy of the AI system. This aligns with recommendations by the US and UK governments and the Partnership on AI (an international technology industry consortium) for international engagement with industry experts, organizations, and academia to exchange information and collaborate on AI research and development [6,13]. X2’s primary research categories are:
- Feasibility studies to evaluate the use of AI both as a stand-alone source of emotional support and as an adjunct to existing programs and treatment solutions. This includes partnerships with CABHI and Saint Elizabeth Healthcare to demonstrate how Tess is being used to support 9,000+ employees and 10,000+ patients with text-based and voice-enabled emotional support. Furthermore, an IBH Corp. EAP case study revealed that Tess interactions support employees while cutting organizational costs: the 10,000 messages exchanged pre-launch would have required roughly 20,000 minutes of staff time, equivalent to about two months of work by one full-time employee (assuming an 8-hour workday), or approximately $21,667 at $65 per hour. Additional feasibility studies are underway with a major senior care provider and Universidad de Palermo.
- Generalizability and scalability studies to evaluate the use of AI as a source of accessible, affordable, and inclusive emotional support to all people on a global scale. Examples include Duke University’s IRB-approved single-case experimental design pilot study to expand access to perinatal depression treatment in Kenya through Tess, and the Universidad Adventista del Plata RCT, which revealed that Tess interactions significantly reduced symptoms of depression by 28% as measured by the PHQ-9 (p=0.02) and anxiety by 18% as measured by the GAD-7 (p=0.04) [16,12]. Through a grant awarded by the Baycrest Centre for Aging and Brain Health Innovation (CABHI), 20,000 older adults are being given access to Tess through Facebook Messenger or Google Home. Ongoing generalizability and scalability studies include partnerships with a Nigerian Federal Psychiatric Hospital, a group of universities in Singapore, and Erasmus University in the Netherlands.
- Efficacy studies to evaluate the use of AI as a partner to professionals, including specialized emotional support areas/topics or coaching programs, as well as clinical treatment and symptom identification. This includes the randomized controlled trial with Northwestern University, which revealed that Tess interactions significantly reduced symptoms of depression by 13% as measured by the PHQ-9 (p=0.03) and anxiety by 18% as measured by the GAD-7 (p=0.02). Research with Nemours Children's Hospital revealed Tess to be a beneficial adjunct to pediatric care, as adolescent patients experienced positive progress toward their goals 81% of the time and rated interactions as useful 96% of the time. Ongoing and upcoming efficacy studies will be completed through research partners such as Palo Alto University, Stanford University, CHADIS, and the Patient-Centered Outcomes Research Institute (PCORI).
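The staff-time and cost figures from the IBH Corp. feasibility case study above follow from simple arithmetic. This sketch reproduces them, assuming (as the report states) an 8-hour workday and $65 per hour, and inferring 2 minutes of staff time per message from the totals given:

```python
# Figures stated in the case study
messages = 10_000
minutes_per_message = 2            # inferred: 20,000 minutes / 10,000 messages

total_minutes = messages * minutes_per_message   # 20,000 minutes of staff time
hours = total_minutes / 60                       # ~333.3 hours
workdays = hours / 8                             # ~41.7 eight-hour days, ~2 months
cost = hours * 65                                # ~$21,667 at $65/hour

print(f"{total_minutes} min = {workdays:.1f} workdays = ${cost:,.0f}")
```

The "~2 months" framing follows from roughly 21 working days per month.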
These research efforts consistently inform content development. X2’s AI technology is not currently designed to deliver emotional prediction or interpretation with authority, such as diagnosis. Instead, it is designed to respond to users with advice and interactions that produce a helpful effect. As noted above, any content expansion or system development that extends beyond the core team’s expertise is completed in collaboration with industry experts. These experts include clinical, ethical, and technical advisors, content developers, and research collaborators. By guiding content creation, AI personality development, privacy and security policies, research, and more, many subject matter experts have already contributed to the ethical improvement of X2’s services. Those who contributed through review of this paper include:
Medical Ethical Board Contributors:
Breanna Gentile, MS, MA, Palo Alto University
Divya Krishnamoorthy, MD, Child and Adolescent Psychiatry Private Practice
Eric Green, PhD, Duke University
Joy Ippolito, EdD, American Family Insurance Institute for Corporate & Social Impact
Nancy Spangler, PhD, Spangler Associates
Russell Fulmer, PhD, Northwestern University Department of Counseling Faculty
Eduardo Bunge, PhD, Palo Alto University
Elsa Rojas-Ashe, PhD, Stanford University
Efforts to develop ethical standards for AI are not always successful, as the 2019 shutdown of Google’s external advisory council on AI ethics made apparent. First and foremost, Google drew strong critique for its selection of council members. This highlights the advantage of incorporating technology ethics alongside mental health professional ethics in the code. The commitment to harm reduction, to which each of X2’s medical ethical board members is accountable, provides a strong and clear basis for selecting external expert guidance. Google received further criticism for planning insufficient council interaction for a company operating at its scale. While the needs of a startup differ from those of one of the world’s largest companies, this critique nevertheless illustrates why X2’s ethical code incorporates both formal meetings and ongoing discussions between meetings. X2’s code reflects what was discovered in creating it: the importance of linking ethical considerations to concrete practices rooted in sustained interactivity and collaboration within and beyond the company.
Notably, an industry-led ethical code is not a rejection of the possibility of expanded regulatory oversight. Instead, it lays the groundwork to inform future regulation and industry-led efforts alike. Simply put, the stronger the alignment among members of this industry on ethical standards and procedures, the more likely it becomes that AI can provide emotional support for as many people as possible. With this technical report we hope to set the bar high for ethical standards within our industry niche.
The guiding principles and practices described above provide an ethical code template that can assist startups and other organizations in creating effective emotional AI solutions while ensuring the safety and privacy of clients and users. AI coaches offer affordable and scalable on-demand care, and a growing body of research demonstrates their impact and efficacy in helping people to feel better. Industry-wide uptake of a process-driven ethical code will increase the accessibility of these services and facilitate wider access to support worldwide.
(Visit http://x2ai.com to learn more about X2, its product offerings, and its medical ethical board.)
Conflicts of interest: Authors AJ and MR are employees of the company of focus, X2 AI Inc., and therefore have a financial interest in the product’s success.
- Weizenbaum, J. (1966). ELIZA---a computer program for the study of natural language communication between man and machine. Communications of the ACM, 9(1), 36–45. doi: 10.1145/365153.365168
- Ethical principles of psychologists and code of conduct. (2017). American Psychological Association. Retrieved from http://www.apa.org/ethics/code/
- AI in the UK: ready, willing and able? (2018). House of Lords. Retrieved from https://publications.parliament.uk/pa/ld201719/ldselect/ldai/100/100.pdf
- Pichai, S. (2018). AI at Google: our principles. Google. Retrieved from https://blog.google/topics/ai/ai-principles/?utm_source=Benedict%27s+newsletter&utm_campaign=46fee441fc-Benedict%27s+Newsletter_COPY_01&utm_medium=email&utm_term=0_4999ca107f-46fee441fc-70610813
- Luxton, D. D., Anderson, S. L., & Anderson, M. (2016). Ethical Issues and Artificial Intelligence Technologies in Behavioral and Mental Health Care. Artificial Intelligence in Behavioral and Mental Health Care, 255-276. doi:10.1016/b978-0-12-420248-1.00011-8
- Bundy, A. (2016). Preparing for the future of Artificial Intelligence. AI & Society, 32(2), 285-287. doi:10.1007/s00146-016-0685-0
- Wilde, G. D., Delbare, W., & Keulenaer, J. D. (2018). Your GDPR compliance checklist. The GDPR Checklist. Retrieved from https://gdprchecklist.io/
- ACM code of ethics. (2018). Association for Computing Machinery. Retrieved from https://www.acm.org/about-acm/acm-code-of-ethics-and-professional-conduct
- Vigo, D. (2016). The health crisis of mental health stigma. Lancet, 3, 171-78. DOI: https://doi.org/10.1016/S0140-6736(16)00687-5
- Luxton, D. D. (2014). Artificial intelligence in psychological practice: Current and future applications and implications. Professional Psychology: Research and Practice, 45(5), 332-339. http://dx.doi.org/10.1037/a0034559
- Fulmer, R., Joerin, A., Gentile, B., Lakerink, L., & Rauws, M. (2018). Using psychological artificial intelligence (Tess) to relieve symptoms of depression and anxiety: A randomized controlled trial. JMIR Mental Health, 5(4), 64. doi:10.2196/mental.9782. Retrieved from https://www.x2ai.com/outcomes.
- Klos, M. C., Escoredo, M., Bunge, E., Lemos, V. The use of Artificial Intelligence for the reduction of symptoms of depression and anxiety in Argentine university students. (forthcoming).
- Tenets. (2018). Partnership on AI. Retrieved from https://www.partnershiponai.org/tenets/
- Joerin, A., Rauws, M., & Ackerman, M. (2019). Psychological Artificial Intelligence Service, Tess: Delivering On-demand Support to Patients and Their Caregivers: Technical Report. Cureus, 11(1), 3972. doi:10.7759/cureus.3972. Retrieved from https://www.x2ai.com/outcomes.
- Rauws, M., Quick, J., & Spangler, N. (2019). X2 AI Tess: Working with AI Technology Partners. The Journal of Employee Assistance, 49(1). Retrieved from https://www.x2ai.com/eapa
- Green EP, Pearson N, Rajasekharan S, Rauws M, Joerin A, Kwobah E, Musyimi C, Bhat C, Jones RM, Lai Y. (2019). Expanding Access to Depression Treatment in Kenya Through Automated Psychological Support: Protocol for a Single-Case Experimental Design Pilot Study. JMIR Res Protoc, 8(4), 11800. DOI: 10.2196/11800
- Stephens, T. N., Joerin, A., Rauws, M., & Werk, L. N. (2018). Feasibility of Pediatric Obesity & Pre-Diabetes Treatment Support through Tess, the AI Mental Health Chatbot. Translational Behavioral Medicine, doi: 10.1093/tbm/ibz043. Retrieved from https://www.x2ai.com/outcomes.
- Wakefield, J. (2019) Google’s ethics board shut down. BBC News. Retrieved from https://www.bbc.com/news/technology-47825833
- Piper, K. (2019) Exclusive: Google cancels AI ethics board in response to outcry. Vox. Retrieved from https://www.vox.com/future-perfect/2019/4/4/18295933/google-cancels-ai-ethics-board