We recently unveiled a deeper look at Cohesity’s collaboration with Microsoft’s Azure OpenAI to bring organizations even more power around managing, securing, and protecting their data. In this blog post, we take a closer look at how Cohesity is tapping into AI, and generative AI specifically, to unlock an organization’s data and get ahead of security threats.
Generative AI uses algorithms to generate new content (text, images, video, audio, computer code, and more) based on user input. Unlike earlier forms of AI, generative AI can create new content, like news articles, poetry, or cyber threat analyses presented in a conversational UI. One type of generative AI that powers many conversational technologies is the large language model (LLM). LLMs generate novel combinations of text in the form of natural-sounding language, and are the basis of the use case we discuss below.
There are multiple applications for generative AI, and over time, we’ll see growing opportunities for AI to mitigate future threats based on current data, risk profiles, and observed user behavior patterns. Today, we’re going to explore a key use case put forth by Cohesity that accelerates the detection of a cyberattack using backup data.
Anomaly detection with AI
In a blog post about the power of AI-ready data and Cohesity, we discussed how Cohesity helps customers back up their entire data estate and unlock new ways to protect and manage data, in addition to improving cyber resilience with data isolation, threat detection, and data classification. While Cohesity provides deep insights and analytics that improve security posture, this new venture with Azure OpenAI builds on Cohesity’s unique distributed file system so that the same data customers are already securing and managing with us can be fully leveraged with AI models.
One use case that we showcased on a recent Spotlight on Security with Microsoft executives is a preview of Azure OpenAI integrated into the Cohesity platform to streamline anomaly detection, provide human-readable analyses of the threat, and help organizations shorten recovery time. In this possible use case, Cohesity takes our foundation of log and system data, combines it with insights built into our Cohesity DataHawk threat intelligence solution, and uses AI to query all of this data and generate interactive reports for the CISO and practitioners alike.
With the use of Azure OpenAI, we generated an Insights Summary report that found a couple of objects on virtual machines that could potentially be affected by ransomware. In response, there are two user scenarios that we’ll explore. First, how this can be used by a CISO to assess business impact, and second, how practitioners can streamline responses.
First, let’s start with the CISO. The integrated AI uncovered a high number of affected, highly sensitive files (that is, files from the finance department that show anomalous behavior or anomalous changes with a high anomaly confidence rating). A CISO can simply ask, “What files were impacted on the VMs?” The output is an executive-level summary that provides a near real-time, human-readable assessment of the risk profile of these anomalies: the information is highly confidential, it comes from finance, and the risk rating is high. Immediately, the CISO can assess business risk while team members in the SOC get the details they need to stop, mitigate, and respond to the threat.
From there, the practitioner can ask for more details on the impacted systems, and the integrated AI will offer human-readable impact analyses (for example, the system contains 328 Word documents in the finance department’s file share, as well as 254 Microsoft Excel workbooks). And since this draws on comprehensive metadata from Cohesity backup and recovery data, the practitioner can view contextual threat response information, such as “This server is currently running on VMware and can be recovered via Cohesity leveraging instant recovery. This is a critical system that requires 100% uptime for company operations.”
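To make the impact analysis above more concrete, here is a minimal Python sketch of how counts like these could be derived from file metadata before being handed to a language model for phrasing. It is an illustration only, not Cohesity’s implementation: the `FlaggedFile` record, the `FILE_TYPES` mapping, and the `summarize_impact` function are hypothetical names, and the sample paths are made up.

```python
import os
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FlaggedFile:
    """Hypothetical metadata record for a file flagged as anomalous."""
    path: str
    department: str


# Illustrative mapping from file extension to a human-readable label.
FILE_TYPES = {".docx": "Word documents", ".xlsx": "Excel workbooks"}


def summarize_impact(files: list[FlaggedFile]) -> str:
    """Count flagged files by type and list the departments involved."""
    counts = Counter(
        FILE_TYPES.get(os.path.splitext(f.path)[1].lower(), "other files")
        for f in files
    )
    departments = sorted({f.department for f in files})
    # Sort by label so the summary text is stable from run to run.
    parts = [f"{n} {label}" for label, n in sorted(counts.items())]
    return f"Affected: {', '.join(parts)} (departments: {', '.join(departments)})"


if __name__ == "__main__":
    sample = [
        FlaggedFile("/finance/budget.docx", "finance"),
        FlaggedFile("/finance/forecast.docx", "finance"),
        FlaggedFile("/finance/q1.xlsx", "finance"),
        FlaggedFile("/finance/q2.XLSX", "finance"),
    ]
    print(summarize_impact(sample))
```

Computing the counts from metadata first, and asking the model only to phrase them, keeps the numbers in the summary tied to the indexed data rather than to generated text.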
These are just examples, but you can imagine the potential for these conversational analyses to completely change how quickly cross-functional teams can assess risk, align on impact, and kick off an action plan before attackers can further disrupt the business.
Under the hood: unlocking cyber threat insights from AI-ready data
So how does it work? The Cohesity Data Cloud, our modern data security and management platform, is unique in that it is “AI-ready.” It is architected so that data is easily searchable and protected by granular access controls. With global search, you can search across multiple workloads and snapshot histories. This allows AI and large language models to quickly answer critical business questions, and ensures that people see responses only about the data they have access to.
Backup data from Cohesity is indexed and contains the specific metadata that makes it possible to use that data with large language models (LLMs). In the same way that backup data is stored and can be searched for threat analysis, it is also AI-ready, so that when a person asks questions about the data through an LLM or other powerful language AI model (e.g., Azure OpenAI), the model provides human-readable responses. Using authoritative data sources backed up on Cohesity can help ensure more accurate, actionable responses to user or machine queries.
Because Cohesity indexes all backup data, APIs can return context-aware responses quickly and without consuming excessive compute power.
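As a rough illustration of the pattern described in this section, the Python sketch below filters an indexed set of backup metadata by the asker’s roles before any of it is placed into a prompt for a language model. Everything here is an assumption made for illustration: `IndexedObject`, `search`, and `build_prompt` are hypothetical names, the index is a plain in-memory list, and no Cohesity or Azure OpenAI API is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedObject:
    """Hypothetical index entry describing one backed-up object."""
    name: str
    workload: str
    allowed_roles: frozenset[str]
    summary: str


def search(index: list[IndexedObject], keyword: str, user_roles: set[str]) -> list[IndexedObject]:
    """Return entries that match the keyword AND that the user may see."""
    needle = keyword.lower()
    return [
        obj for obj in index
        if obj.allowed_roles & user_roles  # access check happens first
        and (needle in obj.name.lower() or needle in obj.summary.lower())
    ]


def build_prompt(question: str, hits: list[IndexedObject]) -> str:
    """Assemble model input from permitted metadata only."""
    context = "\n".join(f"- {h.name} ({h.workload}): {h.summary}" for h in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    index = [
        IndexedObject("fin-share", "VM", frozenset({"soc", "finance-admin"}),
                      "Finance file share with anomalous changes"),
        IndexedObject("hr-db", "SQL", frozenset({"hr-admin"}),
                      "HR records database with finance links"),
    ]
    hits = search(index, "finance", {"soc"})
    print(build_prompt("What files were impacted on the VMs?", hits))
```

Because the role check runs before the prompt is built, metadata the user cannot access never reaches the model, which is one simple way to keep responses limited to the data a person is entitled to see.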
Security of data assets with Cohesity and Azure OpenAI
We’ve always taken a security-first approach when protecting customer data, and that approach is at the forefront of our burgeoning AI strategy. There are a few key risk-related factors that customers should know when considering the power of Azure OpenAI with Cohesity:
- In the same way that organizations design role-based access controls (RBAC) and consider enterprise privileges, cross-functional alignment on what AI can access will play a pivotal role in safely introducing these initiatives.
- We have granular access controls, so only the right people see responses regarding the data they have access to.
- The data is indexed and searched in place, so it stays put and the attack surface is minimized because no extra copies are required.
We hope that this blog post sheds some light on a topic that has been in a lot of news headlines in recent months, and on a practical application that will allow organizations to safely and securely introduce AI into their cybersecurity strategy using comprehensive, clean, and contextual backup data from Cohesity.
This blog is part of our “Road to Catalyst” series. Check back every week for new data security and AI content, and register today to join us at Cohesity Catalyst, our data security and management virtual summit.