The aggressive use of large language models (LLMs) across enterprise environments in 2024 presents a new headache for CISOs. LLMs have their own cybersecurity challenges, especially with data leakage. The cloud has its own issues, with cloud platform providers making changes without always informing their tenants. What happens when the cybersecurity issues of both LLMs and the cloud collide? Nothing good.
Multiple cloud LLMs, shadow LLMs increase risk
The biggest issue is when an enterprise hosts multiple LLM iterations on one or more of their cloud environments. No matter what CISOs and CIOs do with LLMs, they will be accepting LLM cloud risks. Whether an enterprise hosts its LLM in the cloud or on-device or on-premises will likely have a negligible impact on their threat landscape. Even if an enterprise is hosting its end locally, the other end of the LLM will almost certainly be in the cloud, especially if that vendor is handling the training. In short, there is going to be extensive cloud exposure with LLMs regardless of the CISOâs decisions.
This is all focused on authorized and licensed LLM versions. Despite enterprise policies and rules, shadow IT absolutely extends to LLMs. Employees and department heads have easy access to public models, including ChatGPT and BingChat/Co-Pilot, whenever they want. They can then use those public models to create images, do analysis, write reports, code, and even make business decisions such as âWhich of these 128 possible locations should we purchase and use to build our next stores?â
When employees and contractors use those public models, especially for analysis, they will be feeding those models internal data. The public models then learn from that data and may leak those sensitive corporate secrets to a rival who asks a similar question.
âMitigating the risk of unauthorized use of LLMs, especially inadvertent or intentional input of proprietary, confidential, or material non-public data into LLMsâ is tricky, says George Chedzhemov, BigID’s cybersecurity strategist. Cloud security platforms can help, he adds, especially for access controls and user authentication, encryption of sensitive data, data loss prevention, and network security. Other tools are available for data discovery and surfacing sensitive information in structured, unstructured, and semi-structured repositories. â
It is impossible to protect data that the organization has lost track of, data that has been over-permissioned, or data that the organization is not even aware exists, so data discovery should be the first step in any data risk remediation strategy, including one that attempts to address AI/LLM risks,â says Chedzhemov.
Brian Levine, an Ernst & Young managing director for cybersecurity and data privacy, points to end usersâbe it employee, contractor, or third-party with privilegesâleveraging shadow LLMs as a massive problem for security and one that can be difficult to control. âIf employees are using their work devices, existing tools can identify when employees visit known unauthorized LLM sites or apps and even block access to such sites,â he says. âBut if employees use unauthorized AI on their own devices, companies have a bigger challenge because it is currently harder to reliably differentiate content generated by AI from user generated content.â
For the moment, enterprises are dependent on security controls within the LLM being licensed, assuming they are not deploying homegrown LLMs written by their own people. âIt is important that the company do appropriate third-party risk management on the AI vendor and product. As the threats to AI evolve, the methods for compensating for those threats will evolve as well,â Levine says. âCurrently, much of the compensating controls must exist within the AI/LLM algorithms themselves or rely on the users and their corporate policies to detect threats.â
Security testing and decision making must now take AI into account
Ideally, security teams need to make sure that AI awareness is baked into every single security decision, especially in an environment where zero trust is being considered. âTraditional EDR, XDR, and MDR tools are primarily designed to detect and respond to security threats on conventional IT infrastructure and endpoints,â says Chedzhemov. This makes them ill-equipped to handle the security challenges posed by cloud-based or on-premises AI applications, including LLMs.
âSecurity testing now must focus on AI-specific vulnerabilities, ensuring data security, and compliance with data protection regulations,â Chedzhemov adds. âFor example, there are additional risks and concerns around prompt hijacking, intentional breaking of alignment, and data leakage. Continuous re-evaluation of AI models is necessary to address drifts or biases.â
Chedzhemov recommends that secure development processes should embed AI security considerations throughout the development lifecycle to foster closer collaboration between AI developers and security teams. âRisk assessments should factor in unique AI-related challenges, such as data leaks and biased outputs,â he says.
Hasty LLM integration into cloud services create attack opportunities
Itamar Golan, the CEO of Prompt Security, points to an intense urgency in businesses these days as a critical concern. That urgency inside many firms developing these models is encouraging all manner of security shortcuts in coding. âThis urgency is pushing aside many security validations, allowing engineers and data scientists to build their GenAI apps sometimes without any limitations. To deliver impressive features as quickly as possible, we see more and more occasions when these LLMs are integrated into internal cloud services like databases, computing resources and more,â Golan said.
âThese integrations, often non-least privileged or not well-configured, create a direct attack vector from the chat interface exposed to the outside world to the crown jewels of the cloud environment. In simple words, we believe that it is a matter of months until we see a major attack executed through the GenAI interface that leads to an account takeover, unauthorized data access, etc. Due to the unstructured nature of natural language and the new frameworks and architecture surrounding a GenAI application, we see that the current security stack won’t be sufficient to protect against this kind of prompt injection attempts.â
Attackers will target LLMs
Yet another LLM fear is that the systems will be extremely attractive targets for attackers. Bob Rudis, the VP for data science at GreyNoise Intelligence, sees those attacks as having a reasonable chance of working. âBoth on-prem and cloud-provisioned GPU/AI-compute nodes will be prime targets for attackers seeking to freeload off these resources, much in the same way that massive CPUs and high-end endpoint GPUs have been harnessed by evildoers to mine cryptocurrency. Attackers will gladly use your unsecured infrastructure to train and run their models. Plus, they’ll likely use this infrastructure to mine data stolen from internal email, SharePoint, and file servers to use in advanced phishing campaigns,â Rudis said. âAttackers will also be quick to figure out which GPU/AI-compute systems organizations rely upon for business-critical functions and work out ways to cripple them in extortion or ransomware campaigns. This might not be in the traditional way, too, given there are many ways to reduce the compute capacity of these environments without disabling them completely.â
A different perspective is offered by Igor Baikalov, the chief scientist at Semperis. He argues that protections must be placed around all manner of sensitive intellectual property in the enterprise. LLMs including generative AI âare just dumb transformers prone to fits of hallucination. If it exposes sensitive data, then the security concerns should be around protecting that sensitive data it’s being trained on and of course securing access to the application itself, like with any SaaS offering,â he says. âWhether it’s deployed on-prem, on-chip, or in the cloud, the same principles apply.â
Go to Source