The Privacy Implications of Artificial Intelligence
- June 30, 2025
- Clayton Rice, K.C.
The relentless collection of personal data is a persistent threat to privacy and freedom in the evolving era of artificial intelligence. The privacy implications of web scraping, social media monitoring and the harvesting of biometric data are far-reaching and undermine the right to personal autonomy. Maintaining control over personal information is the central challenge of the digital world as artificial intelligence becomes more integrated into everyday life, with profound consequences for the future of democratic institutions.
1. Introduction
As artificially intelligent technologies harvest vast troves of personal information from their users, they employ machine learning techniques to identify traits and preferences that are not readily apparent to the users themselves. Generative AI applications enhance this capability by allowing for predictive analytics, which is “a branch of advanced analytics that makes predictions about future outcomes using historical data combined with statistical modeling, data mining techniques and machine learning.” (here) The potential for theft of personal data, exposing users to extortion and identity theft, highlights the “critical necessity to balance innovation with ethical considerations.” (here) There is a constant push-and-pull between innovation and regulation. Meta Platforms, for example, recently launched a standalone AI app with a social media component to compete with OpenAI’s ChatGPT while, in Britain, there are persistent calls for stricter regulation of facial recognition technology.
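As a minimal illustration of what predictive analytics means in practice, consider the sketch below: a simple statistical model is fitted to historical data and used to forecast a future outcome. The example is hypothetical; the monthly purchase figures and the straight-line model are invented for illustration and do not reflect any particular AI product.

```python
# Minimal sketch of predictive analytics: fit a straight-line trend to
# historical data and extrapolate one step ahead. The figures are invented.
monthly_purchases = [12, 15, 14, 18, 21, 24]  # hypothetical user history

n = len(monthly_purchases)
xs = list(range(n))
mean_x = sum(xs) / n
mean_y = sum(monthly_purchases) / n

# Ordinary least-squares fit for y = a + b * x.
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, monthly_purchases))
     / sum((x - mean_x) ** 2 for x in xs))
a = mean_y - b * mean_x

# The "prediction about a future outcome using historical data".
print(f"Predicted purchases next month: {a + b * n:.1f}")
```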
2. What is Artificial Intelligence?
The Massachusetts Institute of Technology, the International Organization for Standardization and the Encyclopedia Britannica all agree on what artificial intelligence is. It is a “catchall term for a set of technologies” that make computers do things that are thought to require intelligence when done by people. (here) At its core, AI refers to a machine or computer system’s ability to perform tasks that would typically require human intelligence. It involves programming systems “to analyze data, learn from experiences, and make smart decisions.” (here) The term is frequently applied to “the project of developing systems endowed with the intellectual processes characteristic of humans, such as the ability to reason, discover meaning, generalize, or learn from past experience.” (here)
The National Aeronautics and Space Administration agrees that AI refers to “computer systems that can perform complex tasks normally done by human reasoning” but asserts there is no single definition because AI tools are capable of a wide range of tasks and outputs. (here) NASA has adopted the definition in s. 238(g) of the U.S. National Defense Authorization Act of 2019, which includes the following:
- any artificial system that performs tasks under varying and unpredictable circumstances without significant human oversight, or that can learn from experience and improve performance when exposed to data sets;
- an artificial system developed in computer software, physical hardware, or other context that solves tasks requiring human-like perception, cognition, planning, learning, communication, or physical action;
- an artificial system designed to think or act like a human, including cognitive architectures and neural networks;
- a set of techniques, including machine learning, that is designed to approximate a cognitive task; and,
- an artificial system designed to act rationally, including an intelligent software agent or embodied robot that achieves goals using perception, planning, reasoning, learning, communicating, decision making, and acting. (here)
Machine learning is a branch of artificial intelligence focused on “enabling computers and machines to imitate the way that humans learn, to perform tasks autonomously, and to improve their performance and accuracy through experience and exposure to more data.” (here) Rather than using pre-programmed instructions to process data, “machine learning uses algorithms that can be trained to identify and adapt to statistical patterns.” (here) The algorithms can learn from datasets such as numbers from bank transactions, images, audio and text from books. Deep learning is a subset of machine learning that uses multilayered neural networks, called deep neural networks, “to simulate the complex decision-making power of the human brain.” (here)
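The following sketch illustrates machine learning in this sense: rather than following hand-coded rules, a small logistic model adjusts its parameters from labeled examples until it captures a statistical pattern. The transaction data and fraud labels are invented for illustration, and the model is far simpler than anything used in production.

```python
import math

# Hypothetical training data: (transaction amount, hour of day) -> fraud label.
# The pattern to be discovered: large late-night transactions tend to be fraud.
examples = [((20.0, 10), 0), ((15.0, 14), 0), ((900.0, 3), 1),
            ((700.0, 2), 1), ((30.0, 9), 0), ((850.0, 4), 1)]

w = [0.0, 0.0]  # one weight per feature
bias = 0.0
rate = 0.01

def predict(x):
    # Scale the features, then map to a probability with the logistic function.
    z = w[0] * x[0] / 1000 + w[1] * x[1] / 24 + bias
    return 1 / (1 + math.exp(-z))

# Gradient descent: repeatedly nudge the parameters toward the observed pattern.
for _ in range(5000):
    for x, label in examples:
        error = predict(x) - label
        w[0] -= rate * error * x[0] / 1000
        w[1] -= rate * error * x[1] / 24
        bias -= rate * error

print(predict((950.0, 3)))   # high probability: fits the learned pattern
print(predict((25.0, 13)))   # low probability
```

Exposed to more labeled examples, the same procedure keeps refining its parameters, which is what the quoted definitions mean by improving performance through experience and exposure to more data.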
3. Privacy Implications
In a white paper titled Rethinking Privacy in the AI Era: Policy Provocations for a Data-Centric World, Jennifer King and Caroline Meinhardt of the Stanford University Institute for Human-Centered Artificial Intelligence analyzed the risks presented by artificial intelligence and offered potential solutions. (here and here) They identified three kinds of risk as personal data is bought, sold and used by AI systems. I will condense the risks as follows.
First, AI systems pose many of the same privacy risks that have arisen during recent decades of “internet commercialization” and “unrestrained data collection”. The difference is one of scale. AI systems are so “data-hungry and intransparent” that it is virtually impossible to use online products or services without submitting to systemic digital surveillance across most facets of life. Second, generative AI tools trained with data scraped from the internet may memorize personal information about people, as well as related data about their family and friends, thus enabling spear-phishing and voice cloning. Third, data such as a resume or photograph posted for one purpose may be repurposed for training AI systems without the knowledge or consent of the person who posted it.
Predictive systems are being used by employers to screen job applicants and decide whom to interview. There have been reported instances, however, where the AI used was biased. Another example is the use of facial recognition technology to identify and arrest individuals suspected of committing a crime. As a result of bias inherent in the data used to train facial recognition algorithms, there have been a number of high-profile wrongful arrests. Here are three recommendations the authors made to mitigate these privacy harms.
- Denormalize data collection by default by shifting away from opt-out to opt-in data collection (a minimal illustration follows this list). Data collectors must facilitate true data minimization through “privacy by default” strategies and adopt technical standards and infrastructure for meaningful consent mechanisms.
- Focus on the AI data supply chain to improve privacy and data protection. Ensuring dataset transparency and accountability across the entire life cycle must be a focus for any regulatory system that addresses data privacy.
- Flip the script on the creation and management of personal data. Policymakers should support the development of new governance mechanisms and technical infrastructure (e.g., data intermediaries and data permissioning infrastructure) to support and automate the exercise of individual data rights and preferences.
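To make the first recommendation concrete, here is a minimal sketch of what a “privacy by default” consent mechanism might look like at the code level: every collection purpose defaults to off, and data is processed only for purposes the user has affirmatively opted into. The class and function names are hypothetical, not drawn from any real system.

```python
from dataclasses import dataclass

# Hypothetical sketch of "privacy by default": every collection purpose
# defaults to False (opt-in), rather than True (opt-out).
@dataclass
class ConsentSettings:
    analytics: bool = False       # off until the user affirmatively opts in
    ad_targeting: bool = False
    model_training: bool = False  # reuse for AI training needs its own consent

def store(event: dict, purpose: str) -> None:
    print(f"stored for {purpose}: {event}")

def collect(event: dict, consent: ConsentSettings, purpose: str) -> None:
    # Data is processed only for purposes the user has opted into.
    if not getattr(consent, purpose, False):
        return  # drop the event: no consent, no collection
    store(event, purpose)

# Usage: the default object grants nothing; each purpose must be enabled.
user_consent = ConsentSettings(analytics=True)
collect({"page": "/home"}, user_consent, "analytics")       # stored
collect({"page": "/home"}, user_consent, "model_training")  # dropped
```

Note the design choice: because consent to repurposing data for model training is a separate field, a resume or photograph collected for one purpose cannot silently flow into AI training, which is the third risk identified above.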
In A roadmap for trust, innovation and protecting the fundamental right to privacy in the digital age (2024-2027), the Privacy Commissioner of Canada highlighted three strategic priorities to guide the office into 2027. The second priority involves addressing and advocating for privacy in this time of fast-moving technological change, “especially in the world of artificial intelligence and generative AI and encouraging privacy protective technological innovations.” (here) The roadmap specifically emphasized that embedding privacy in the design and implementation of technological advancements can result in technologies that are “responsible, trustworthy and privacy protective.”
On May 21, 2024, the European Union initiated a landmark legislative response with the adoption of the Artificial Intelligence Act, hailed as “the first of its kind in the world”. (here and here) The new law came into force on August 1, 2024, and is being rolled out on a staggered timeline with the full rules scheduled to come into force next year. The statute categorizes different types of AI according to risk. Systems presenting only limited risk are subject to very light transparency obligations, while high-risk systems will be authorized but subject to a set of requirements and obligations to gain access to the EU market. AI systems used for “cognitive behavioural manipulation” and “social scoring” will be banned because their risk is deemed unacceptable. The law also prohibits the use of AI for predictive policing based on profiling, and for systems that use biometric data to categorize people according to race, religion or sexual orientation. (here)
4. Conclusion
“Artificial intelligence is the most disruptive technology of the modern era”, wrote Karl Mannheim and Lyric Kaplan in an article titled Artificial Intelligence: Risks to Privacy and Democracy published in the Yale Journal of Law and Technology in 2019. “Recent events illustrate how AI can be ‘weaponized’ to corrupt elections and poison people’s faith in democratic institutions,” they said. (here) The 2016 Brexit referendum in Britain and the presidential election in the United States exemplify the point. The biggest social cost of the burgeoning AI era is the erosion of trust in democratic institutions. Psychographic profiling by Cambridge Analytica is one poignant example. (here) Machines have already been given the power to make life-altering decisions. The Internet of Things, driverless cars and lethal autonomous weapons systems all have the capability to make these decisions without human oversight. Although the intelligence of machines is a potent tool to solve complex problems, and to create new ones, it often does both without transparency or accountability.