
Employees Enter Sensitive Data Into GenAI Prompts Far Too Often

Source: Marcos Alvarado via Alamy Stock Photo

Employees are sharing a wide spectrum of data through generative AI (GenAI) tools, researchers have found, validating many organizations' hesitancy to fully adopt AI practices.

When a user enters data into a prompt for ChatGPT or a similar tool, the information may be ingested into the service's training data set, depending on the provider's retention and training policies, and used to train the next generation of the model. The concern is that the information could be retrieved at a later date via savvy prompts, a vulnerability, or a hack, if proper data security isn't in place for the service.

That's according to researchers at Harmonic, who analyzed thousands of prompts submitted by users to GenAI platforms such as Microsoft Copilot, OpenAI's ChatGPT, Google Gemini, Anthropic's Claude, and Perplexity. In many cases, employee use of these tools was straightforward, such as summarizing a piece of text or editing a blog post. But a subset of requests was far more compromising: in all, 8.5% of the analyzed GenAI prompts included sensitive data.

Customer Data Most Often Leaked to GenAI

The sensitive data that employees are sharing often falls into one of five categories: customer data, employee data, legal and finance, security, and sensitive code, according to Harmonic.

Customer data holds the biggest share of sensitive-data prompts, at 45.77%, according to the researchers. For example, employees submit insurance claims containing customer information into a GenAI platform to save time processing them. Though this may improve efficiency, inputting such private and highly detailed information carries a high risk of exposing customer data, including billing information, authentication credentials, customer profiles, payment transactions, and credit card numbers.

Employee data makes up 27% of sensitive prompts in Harmonic's study, indicating that GenAI tools are increasingly used for internal processes. This could mean performance reviews, hiring decisions, and even decisions regarding yearly bonuses. Other information exposed to potential compromise includes employment records, personally identifiable information (PII), and payroll data.

Legal and finance information is not as frequently exposed, at 14.88%; however, when it is, it can create great corporate risk, according to the researchers. Ironically, when GenAI is used in these fields, it's often for simple tasks such as spell checks, translation, or summarizing legal texts. For such small conveniences, the stakes are incredibly high, putting at risk data such as sales pipeline details, merger and acquisition information, and financial records.

Security information and sensitive code account for the smallest shares of leaked sensitive data, at 6.88% and 5.64%, respectively. Though these two categories trail the others in volume, they are among the fastest growing and most concerning, according to the researchers. Security data entered into GenAI includes penetration test results, network configurations, backup plans, and more, effectively handing bad actors a blueprint for exploiting vulnerabilities. Code entered into these tools could put technology companies at a competitive disadvantage, exposing vulnerabilities and allowing competitors to replicate unique functionality.

Balancing GenAI Cyber-Risk & Reward

If the research shows that GenAI carries such high-risk consequences, should businesses continue to use it? Experts say they might not have a choice.

"Organizations risk losing their competitive edge of if they expose sensitive data," said the researchers in the report. "Yet at the same time, they also risk losing out if they don't adopt GenAI and fall behind."

Stephen Kowski, field chief technology officer (CTO) at SlashNext Email Security+, agrees. "Companies that don’t adopt generative AI risk losing significant competitive advantages in efficiency, productivity, and innovation as the technology continues to reshape business operations," he said in an emailed statement to Dark Reading. "Without GenAI, businesses face higher operational costs and slower decision-making processes, while their competitors leverage AI to automate tasks, gain deeper customer insights, and accelerate product development."

Others, however, disagree that GenAI is necessary, or that an organization needs any artificial intelligence at all.

"Utilizing AI for the sake of using AI is destined to fail," said Kris Bondi, CEO and co-founder of Mimoto, in an emailed statement to Dark Reading. "Even if it gets fully implemented, if it isn't serving an established need, it will lose support when budgets are eventually cut or reappropriated."

Though Kowski believes that not incorporating GenAI is risky, he notes that success without it is still possible.

"Success without AI is still achievable if a company has a compelling value proposition and strong business model, particularly in sectors like engineering, agriculture, healthcare, or local services where non-AI solutions often have greater impact," he said.

For organizations that want to adopt GenAI tools while mitigating the high risks that come with them, the researchers at Harmonic have recommendations. The first is to move beyond "block strategies" and implement effective AI governance: deploy systems that track input into GenAI tools in real time; identify which plans are in use, and ensure employees are working on paid plans rather than plans that use inputted data to train models; gain full visibility over these tools; classify sensitive data; create and enforce workflows; and train employees on the risks and best practices of responsible GenAI use.
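The real-time tracking and sensitive-data classification the researchers describe is essentially a data-loss-prevention (DLP) check applied to prompts before they leave the organization. As a rough illustration of the idea, a minimal sketch might pattern-match outgoing prompt text against a few detector categories; the patterns, function names, and categories below are hypothetical examples, not Harmonic's method, and a production gateway would rely on far broader, tuned detectors (checksums, named-entity recognition, custom classifiers).

```python
import re

# Illustrative detectors only (assumed for this sketch) -- real DLP tooling
# uses much broader, validated pattern sets and ML-based classification.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def screen_prompt(prompt: str) -> dict:
    """Return categories of sensitive data detected in a prompt.

    An empty result means the prompt can be forwarded to the GenAI
    service; otherwise it should be blocked, redacted, or routed
    for human review per the organization's workflow.
    """
    findings = {}
    for category, pattern in SENSITIVE_PATTERNS.items():
        matches = pattern.findall(prompt)
        if matches:
            findings[category] = len(matches)
    return findings

# A claims-processing prompt like the one described in the research
# would be flagged, while a benign summarization request passes through.
blocked = screen_prompt(
    "Summarize the claim for jane.doe@example.com, card 4111 1111 1111 1111"
)
clean = screen_prompt("Summarize this blog post about cloud security")
```

Running such a check at a proxy or browser-extension layer, rather than trusting each employee to self-police, is what lets the governance team enforce workflows without resorting to an outright block of GenAI tools.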
