Instant Recovery for unlimited VMs, to any point in time, on a distributed, resilient data store

August 10, 2017

Products, TechTalks

Mark Thomas | Member of Technical Staff

Cohesity DataProtect allows our users to easily protect their VM environments (such as VMware and Hyper-V) using converged data protection and simple policy-based automation. Our distributed scheduler automatically uses all the available nodes on the Cohesity cluster to parallelize backups and maximize efficiency, thereby providing low recovery point objectives (RPOs). Another unique feature of the Cohesity platform is near-instant recovery for an unlimited number of VMs, to any point-in-time backup stored on Cohesity, in support of both recovery workflows and test/dev. Some of the key features that our platform provides in this regard are:

  1. Instantly recover from any backed-up snapshot. We don’t just support instant recovery from the latest backup, but from any point-in-time backup.
  2. Recover an unlimited number of VMs at the same time (for example, all VMs belonging to an application group, or all the VMs from a cluster in a DR scenario).
  3. VM data is recovered on a distributed, high-performance, resilient data store. VM data remains highly available and consistent even when there are hardware, software, or network failures on the Cohesity cluster.

We can instantly recover any VM from any backup because all our full and incremental backups are stored internally on SpanFS as completely hydrated clones. Traditional backup vendors store each incremental backup as a delta from the previous backup. This leads to a long delta chain that needs to be broken periodically. In addition, traditional backup vendors need to consolidate the deltas before a VM can be recovered, which leads to delays and longer RTOs. With Cohesity, since all snapshots are kept fully hydrated (i.e., in the same format as in the original VM environment), there are no long snapshot chains and no additional consolidation is required at the time of recovery, which is how we achieve near-instantaneous RTOs.
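To make the difference concrete, here is a minimal toy sketch in Python. It is not how SpanFS is implemented; the class names `DeltaChainStore` and `HydratedStore` are hypothetical, and a plain dictionary stands in for a VM disk image. It only illustrates why restoring from a delta chain requires replaying every earlier backup, while a fully hydrated snapshot can be served as-is.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Blocks = Dict[int, bytes]  # block index -> block contents (toy disk image)


@dataclass
class DeltaChainStore:
    """Hypothetical store: every backup keeps only the blocks that changed."""
    deltas: List[Blocks] = field(default_factory=list)

    def backup(self, changed: Blocks) -> int:
        self.deltas.append(dict(changed))
        return len(self.deltas) - 1

    def restore(self, point: int) -> Tuple[Blocks, int]:
        """Rebuild the image by replaying deltas 0..point.

        Returns the image and the number of deltas that had to be merged.
        """
        image: Blocks = {}
        for delta in self.deltas[: point + 1]:
            image.update(delta)
        return image, point + 1


@dataclass
class HydratedStore:
    """Hypothetical store: every backup is a complete, directly usable view."""
    snapshots: List[Blocks] = field(default_factory=list)

    def backup(self, changed: Blocks) -> int:
        parent = self.snapshots[-1] if self.snapshots else {}
        # Assumption: copying the block map stands in for a metadata clone.
        # Only references are copied; unchanged block contents are shared.
        snapshot = dict(parent)
        snapshot.update(changed)
        self.snapshots.append(snapshot)
        return len(self.snapshots) - 1

    def restore(self, point: int) -> Tuple[Blocks, int]:
        """Return the snapshot directly; no deltas are merged at restore time."""
        return self.snapshots[point], 0
```

In this sketch the restore cost of the delta chain grows with the position of the backup in the chain, whereas the hydrated store does the merge work at backup time and returns any point in time without consolidation.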

Using a simple shopping-cart-like experience in our UI, or simple REST APIs, our users can instantly bring up and use multiple VMs. The VMs are brought up with the Cohesity DataPlatform serving as the underlying datastore (we also provide options to automatically migrate the VMs back to their original or alternate datastores). VMs are recovered on a fully distributed, high-throughput, and resilient volume on DataPlatform. Since the Cohesity DataPlatform is a fully distributed scale-out system, the VMs end up leveraging the hardware resources (such as CPU, RAM, and SSD) of all the nodes in the cluster. This ensures that even when our users restore multiple VMs at the same time, all of them are instantly available and usable.

Finally, it is worth mentioning that a system should not only have great performance characteristics (such as the ability to instantly restore a VM), but should also make sure that any data it serves remains consistent and available even when there are failures of any kind. This is especially hard in distributed systems, because the probability that some component fails rises quickly as the size of the system increases. The Cohesity DataPlatform scales linearly as nodes are added, and it also makes sure that data is always consistent and available. We do this by using distributed-consensus-based algorithms to make sure that data is always replicated correctly to more than one node before we acknowledge a user write.
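Both points can be illustrated with a short Python sketch. It is a simplification under stated assumptions, not the DataPlatform implementation: node failures are assumed to be independent, the names `p_any_failure`, `Replica`, and `replicated_write` are hypothetical, and the write path shows only the "replicate to more than one node before acknowledging" rule, not a full consensus protocol.

```python
from typing import Dict, List


def p_any_failure(p_node: float, nodes: int) -> float:
    """Probability that at least one of `nodes` nodes fails.

    Assumes each node fails independently with probability `p_node`.
    """
    return 1.0 - (1.0 - p_node) ** nodes


class Replica:
    """Hypothetical storage node that may be unavailable."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy
        self.data: Dict[str, bytes] = {}

    def store(self, key: str, value: bytes) -> bool:
        """Persist the value and report success; fail if the node is down."""
        if not self.healthy:
            return False
        self.data[key] = value
        return True


def replicated_write(replicas: List[Replica], key: str, value: bytes,
                     min_copies: int = 2) -> bool:
    """Acknowledge the write only if at least `min_copies` replicas stored it."""
    stored = sum(1 for replica in replicas if replica.store(key, value))
    return stored >= min_copies


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, "nodes:", round(p_any_failure(0.01, n), 3))
    cluster = [Replica("a"), Replica("b", healthy=False), Replica("c")]
    print("acknowledged:", replicated_write(cluster, "vm-1/block-0", b"data"))
```

With a 1% per-node failure probability, the chance that at least one node is down is about 1% for a single node but roughly 63% for 100 nodes, which is why the acknowledgement is tied to the number of copies actually stored rather than to any single node.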

The combination of all the above capabilities in the Cohesity platform gives our users an easy, quick, and powerful way to restore VMs for many different use cases (such as single-VM recovery, DR scenarios, and test/dev).
