Organizations must make a concerted effort to address the privacy and security questions raised by artificial intelligence if they are to realize the technology’s full potential.
In part one of this three-part series, I explored how, without adequate visibility into the mechanics of their artificial intelligence (AI) solutions, organizations risk unwittingly embedding unethical biases into their operations.
However, ethical challenges are not the only reason organizations should resist granting total operational autonomy to their increasingly capable AI solutions. Data privacy and security are enduring concerns for nearly every type of organization — healthcare organizations in particular — and haphazard use of AI will only complicate these matters further. In fact, a recent KPMG report found that while 89 percent of healthcare professionals say AI is already creating efficiencies in their systems, 75 percent are concerned that AI could jeopardize the security and privacy of patient data.
Major regulatory bodies are already starting to recognize the importance of keeping AI systems and their associated data private and secure. For instance, the European Commission — which is tasked with overseeing the enforcement of the General Data Protection Regulation (GDPR) — has identified “privacy and data governance” as one of the seven key requirements an AI system must meet to be deemed trustworthy.
Ultimately, though, it is incumbent upon organizations themselves to bolster the privacy and security of their AI solutions as diligently as they root out unethical biases from those solutions. While this continues to be a challenge, it is by no means a prohibitive one, and overcoming it will enable organizations to tap into the full breadth of AI’s potential.
Preserving Data Privacy Rights in the AI Era
In highly regulated industries, extensive process auditing is a core component of compliance. If, when regulators call, an organization’s analytics professionals are unable to provide an explanation of how their AI solution arrived at a decision or why it recommended a specific action, the organization may end up facing financial penalties or other sanctions. In late 2018, this risk proved too great even for the world’s largest asset management firm, BlackRock. The firm discontinued the use of two proven AI solutions — one that forecasted market volume and another that predicted redemption risk — because, as BlackRock Head of Liquidity Research Stefano Pasquali explained, “The senior people want[ed] to see something they [could] understand.” Absent this explainability, they deemed the risk of improper data use to be too high.
The need for transparent, explainable AI solutions is just as great, if not greater, in healthcare. Patients’ privacy rights are protected by a range of HIPAA provisions, and the deployment of AI solutions adds a new wrinkle to longstanding HIPAA compliance best practices. Among other things, HIPAA requires organizations operating in the healthcare space to only grant access to protected health information (PHI) to individuals who need it as part of their job function, encrypt any PHI that will pass through a third-party server, and de-identify PHI so that it cannot be traced back to a specific individual.
To a certain extent, the functional principles underlying AI solutions inherently increase the complexity of complying with these requirements. While most traditional technology solutions are designed as closed systems for both privacy and security reasons, many AI solutions are left as at least partially open systems by design — the core premise of machine learning is that algorithms’ decision-making architectures will become progressively more sophisticated as the algorithms take in steady streams of new, varied input data. Overcoming this challenge often requires an approach that is firmly grounded in partitioning datasets so that personally identifiable information (PII) like names and addresses is kept separate from all other information. This strict de-identification enables organizations to derive value from their AI solutions while still honoring the privacy protections enshrined in HIPAA.
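To make the partitioning idea concrete, here is a minimal sketch of one way it might be implemented. The record fields, the random-pseudonym scheme, and the separate identity store are hypothetical simplifications chosen for brevity, not a reference to any particular de-identification toolkit or HIPAA-certified workflow.

```python
import secrets

# Hypothetical raw patient record mixing PII and clinical data.
raw_record = {
    "name": "Jane Doe",
    "address": "123 Main St, Springfield",
    "date_of_birth": "1984-02-17",
    "diagnosis_code": "E11.9",
    "lab_glucose_mg_dl": 162,
}

PII_FIELDS = {"name", "address", "date_of_birth"}

def partition_record(record, pii_fields=PII_FIELDS):
    """Split a record into an identity-store entry and a de-identified
    analytic record, linked only by a random pseudonym."""
    pseudonym = secrets.token_hex(16)  # random token, not derived from PII
    identity_entry = {k: v for k, v in record.items() if k in pii_fields}
    analytic_record = {k: v for k, v in record.items() if k not in pii_fields}
    analytic_record["pseudonym"] = pseudonym
    return pseudonym, identity_entry, analytic_record

pseudonym, identity_entry, analytic_record = partition_record(raw_record)

# The identity store stays behind much stricter access controls; only the
# de-identified analytic record is ever fed to AI pipelines.
identity_store = {pseudonym: identity_entry}
print(analytic_record)
```

In a design like this, only a small, tightly access-controlled service would ever hold the mapping from pseudonyms back to identities; the models themselves see nothing but de-identified analytic records.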
That said, as AI becomes increasingly powerful, analytics professionals will need to make a concerted effort to stave off accidental algorithmic re-identification. Many AI solutions are designed to identify patterns in massive datasets that would otherwise go unnoticed, and research has already demonstrated that, in the wrong circumstances, such a pattern can amount to the re-linking of PII with supposedly de-identified PHI, in violation of a patient’s privacy rights. An analyst may not be able to deduce an individual’s identity from a spreadsheet with millions of rows of data, but an algorithm might, and organizations will need to put additional precautionary measures in place to prevent their AI solutions from becoming too effective for their own good.
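The risk is easiest to see with quasi-identifiers such as ZIP code, birth year, and sex: no single field is identifying on its own, but the combination can single out one person. The sketch below runs a simple k-anonymity-style uniqueness check over made-up records; the field names, threshold, and data are illustrative assumptions, not drawn from any cited study.

```python
from collections import Counter

# Hypothetical de-identified records: no names, but several quasi-identifiers.
records = [
    {"zip": "02138", "birth_year": 1951, "sex": "F", "diagnosis": "I10"},
    {"zip": "02138", "birth_year": 1951, "sex": "F", "diagnosis": "E11.9"},
    {"zip": "02139", "birth_year": 1984, "sex": "M", "diagnosis": "J45"},
    {"zip": "02140", "birth_year": 1990, "sex": "F", "diagnosis": "M54.5"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")

def at_risk_records(records, k=2):
    """Return records whose quasi-identifier combination appears fewer
    than k times, i.e. records an attacker could plausibly single out."""
    combos = Counter(tuple(r[q] for q in QUASI_IDENTIFIERS) for r in records)
    return [r for r in records
            if combos[tuple(r[q] for q in QUASI_IDENTIFIERS)] < k]

for r in at_risk_records(records):
    print("Re-identification risk:", r)
```

Records whose quasi-identifier combination is unique, or nearly so, are exactly the ones a pattern-hunting algorithm, or an attacker with auxiliary data, could plausibly re-link to a named individual.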
A New Class of Cybersecurity Threats
Beyond reining in algorithms that go rogue and re-identify data, organizations must also take precautions to secure their AI solutions — suffering a breach is no way to ensure patients’ privacy rights remain intact. Because the designs of many AI solutions differ in kind from those of traditional solutions, this requires more than careful adherence to tried-and-tested cybersecurity best practices. Over the course of the last decade, a great deal of research has been done on adversarial attacks — cyberattacks tailored to AI systems’ unique weak spots. There are two types of adversarial attacks of which healthcare organizations should be particularly wary: evasion attacks and poisoning attacks.
An evasion attack consists of a bad actor developing an understanding of how an algorithm works and manipulating the algorithm by feeding it “noise data.” For instance, in 2018, researchers published a paper outlining how they had tricked smart digital assistants like Alexa, Siri, and Cortana into divulging highly sensitive information. After figuring out how these assistants’ natural language processing algorithms functioned, the researchers were able to create audio data that sounded like birds chirping to the human ear but that actually instructed the assistants to execute a series of actions. This type of vulnerability could be exploited in a variety of ways in the healthcare space. For example, many health insurance providers rely on machine learning-driven solutions for billing code processing. As a recent study theorized, a bad actor could leverage an understanding of an insurer’s algorithms to “automate the discovery of code combinations that maximize reimbursement or minimize the probability of claims rejection.”
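The mechanics of an evasion attack can be sketched with a toy linear model: once an attacker knows, or can approximate, a model’s weights, a small perturbation in the direction that most reduces the score is often enough to flip its decision. The claims-screening model, features, weights, and threshold below are all invented for illustration; real attacks target far more complex models, but the principle is the same.

```python
import numpy as np

# Toy linear claims-screening model: score > 0 means the claim gets
# flagged for manual review. Weights and features are purely invented.
weights = np.array([1.2, -0.4, 0.9, 0.3])
bias = -1.0

def flagged(x):
    return float(weights @ x + bias) > 0.0

# A claim the model flags for review.
claim = np.array([1.0, 0.5, 0.9, 0.5])
print("Original claim flagged: ", flagged(claim))          # True (score ~0.96)

# Evasion step (FGSM-style, trivial for a linear model): nudge every
# feature slightly in the direction that lowers the score, i.e. -sign(w).
epsilon = 0.4
evasive_claim = claim - epsilon * np.sign(weights)
print("Perturbed claim flagged:", flagged(evasive_claim))  # False (score ~-0.16)
```

To a human reviewer the perturbed claim looks nearly identical to the original, yet the model’s verdict has flipped; scaled up to real models and real claim data, that is the essence of an evasion attack.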
A poisoning attack has the potential to be even more sinister, and usually involves elements of a traditional cyberattack as well. In this class of attack, a bad actor hacks an organization’s systems and inserts “poison data” into an algorithm’s training data. As a result, the algorithm learns from incorrect or otherwise corrupted data, meaning its foundational decision-making architecture will be inherently flawed. If such an attack were executed against an AI solution designed to diagnose a serious illness based on X-rays or CT scans, a doctor might end up making critical decisions based on false negatives, potentially jeopardizing patients’ lives in the process.
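A minimal way to see the effect is to train the same simple classifier on clean training data and on data in which an attacker has quietly flipped a fraction of the labels. The synthetic screening dataset and nearest-centroid classifier below are deliberately simplistic stand-ins for a real diagnostic pipeline, not a model of any actual attack.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic screening data: class 0 ("healthy") and class 1 ("disease"),
# drawn from two Gaussian clusters. Purely illustrative stand-in data.
n = 200
X_train = np.vstack([rng.normal(0.0, 1.0, size=(n, 2)),
                     rng.normal(3.0, 1.0, size=(n, 2))])
y_train = np.array([0] * n + [1] * n)

X_test = np.vstack([rng.normal(0.0, 1.0, size=(500, 2)),
                    rng.normal(3.0, 1.0, size=(500, 2))])
y_test = np.array([0] * 500 + [1] * 500)

def train_centroids(X, y):
    """Nearest-centroid classifier: store each class's mean feature vector."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict(centroids, X):
    classes = sorted(centroids)
    dists = np.stack([np.linalg.norm(X - centroids[c], axis=1) for c in classes])
    return np.array(classes)[np.argmin(dists, axis=0)]

def miss_rate(model):
    """Fraction of true disease cases the model labels as healthy."""
    preds = predict(model, X_test)
    disease = y_test == 1
    return float((preds[disease] == 0).mean())

# Clean training run.
clean_model = train_centroids(X_train, y_train)
print("False-negative rate, clean training data:   ", miss_rate(clean_model))

# Poisoning attack: the intruder silently flips 40% of the "disease"
# training labels to "healthy" before the model is retrained.
y_poisoned = y_train.copy()
flipped = rng.choice(np.arange(n, 2 * n), size=int(0.4 * n), replace=False)
y_poisoned[flipped] = 0

poisoned_model = train_centroids(X_train, y_poisoned)
print("False-negative rate, poisoned training data:", miss_rate(poisoned_model))
```

Because the poisoned “healthy” centroid is dragged toward genuinely sick cases, the model’s false-negative rate on held-out disease cases typically rises sharply in runs like this, precisely the kind of silent degradation that could translate into missed diagnoses.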
Trust but Verify
Advanced AI solutions are still very much an emerging technology. Consequently, neither analytics stakeholders nor healthcare stakeholders have reached a clear consensus on best practices for ensuring data privacy and security in this new terrain. What is clear, however, is that new safeguards and procedures will need to be developed and implemented in order for organizations to get the most out of their AI solutions.
Encouragingly, governing bodies like the European Commission and the Organisation for Economic Co-operation and Development (OECD) are taking an active role in leading the charge. As the OECD’s guidelines for safe and secure AI systems state:
- AI systems should be robust, secure, and safe throughout their entire lifecycle so that, in conditions of normal use, foreseeable use or misuse, or other adverse conditions, they function appropriately and do not pose unreasonable safety risk.
- AI actors should ensure traceability, including in relation to datasets, processes, and decisions made during the AI system lifecycle, to enable analysis of the AI system’s outcomes and responses to inquiry, appropriate to the context and consistent with the state of the art.
- AI actors should, based on their roles, the context, and their ability to act, apply a systematic risk management approach to each phase of the AI system lifecycle on a continuous basis to address risks related to AI systems, including privacy, digital security, safety, and bias.
By building on these cornerstones, organizations operating in every industry will be able to develop AI solutions that augment their existing capabilities in as-yet-unimaginable ways. As long as analytics professionals take their privacy and security responsibilities seriously, there is no reason for concern. As with any powerful tool, the responsible use of AI should be guided neither by naïve credulity nor by outright distrust, but by a posture of trust but verify.