One of the topics people often ask me about is which datacenter topologies VMware SRM supports. The good news is that SRM is flexible enough to support a number of topologies, covering more complex use cases than the typical two-site deployment.
Shared Recovery Site (VC at Remote Sites)
The shared recovery site use case is one of the more commonly seen topologies outside of a two datacenter deployment. This is a use case that has been supported by SRM for multiple releases and provides a good option for customers that are looking to protect virtual machines at remote offices that have their own vCenter instances.
With this use case the customer is able to share the resources used for recovery among several remote offices, which gives some flexibility in how the recovery site is sized. A conservative approach would provide sufficient capacity to fail over all remote sites to the shared recovery site. An alternative approach would be to oversubscribe the recovery site resources on the assumption that you wouldn’t need to recover all the remote sites at once. This second approach can reduce capital expense at the recovery site, but it adds risk if there is a chance that all the remote sites may need to fail over at the same time.
One of the advantages of this topology, with remote vCenters at each site, is that local IT retains full manageability of the infrastructure. Even in the event of a network outage between the central data center and the remote site, the hosts at the remote site remain accessible. In contrast, one of the disadvantages of this approach is that each SRM pair must be at the same major and minor version, and each vCenter server must be a supported version for the SRM server joined to it. This means that upgrades of SRM or vCenter across the set of data centers should be carefully planned and coordinated.
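The trade-off between the two sizing approaches can be put into numbers. The following is a minimal sketch, not an SRM feature: the site names, the memory figures, and both function names are hypothetical, and real sizing would also need to account for CPU, storage, and growth.

```python
# Hypothetical memory demand per remote office, in GB (illustrative values only).
remote_site_demand_gb = {"office-a": 256, "office-b": 384, "office-c": 128}


def conservative_capacity(demand):
    """Capacity needed to fail over every remote site at the same time."""
    return sum(demand.values())


def oversubscribed_capacity(demand, max_concurrent_failovers):
    """Capacity needed if at most `max_concurrent_failovers` sites fail over together.

    Sizes for the worst case within that assumption: the largest sites
    failing over at the same time.
    """
    largest_first = sorted(demand.values(), reverse=True)
    return sum(largest_first[:max_concurrent_failovers])


full = conservative_capacity(remote_site_demand_gb)
reduced = oversubscribed_capacity(remote_site_demand_gb, max_concurrent_failovers=2)
print(full, reduced)  # prints: 768 640
```

The gap between the two figures is the capital saving, and it is also the shortfall you would face if every site did fail over at once.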
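When planning upgrades it can help to check the version rule for every pair before starting. The sketch below only covers the "same major and minor version" part of the rule; the `SrmServer` type, the site names, and the version strings are all hypothetical, and vCenter compatibility still has to be checked against VMware's published interoperability information.

```python
from typing import NamedTuple


class SrmServer(NamedTuple):
    site: str
    version: str  # dotted version string such as "5.8.0" (assumed format)


def major_minor(version):
    """Return the (major, minor) part of a dotted version string."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def pair_versions_match(a, b):
    """True when both SRM servers in a pair share major and minor version."""
    return major_minor(a.version) == major_minor(b.version)


recovery = SrmServer("shared-recovery", "5.8.0")
upgraded_office = SrmServer("office-a", "5.8.1")
lagging_office = SrmServer("office-b", "5.5.1")

print(pair_versions_match(recovery, upgraded_office))  # True
print(pair_versions_match(recovery, lagging_office))   # False
```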
Shared Recovery Site (Central VC)
While the above topology is a common approach to providing shared recovery site capabilities, it is not the only one. If there isn’t a need for each remote site to have its own vCenter instance, then a simple single-pair SRM deployment can be adapted to meet the use case.
For this deployment you would deploy two vCenter instances, one to manage the resources at the recovery site and one to manage the resources at all of the remote sites. An instance of SRM would be deployed for each vCenter and the two paired together. You could then create recovery plans for each remote site while using the single shared vCenter to consolidate management of the remote sites.
This approach has the advantage that upgrades of the vCenter and SRM instances are somewhat simpler than in the previous shared recovery site topology, because there are fewer moving parts. Additionally, you can have different levels of recovery plan in the same SRM deployment. For example, you could define a recovery plan for each remote office and also a recovery plan for each region, grouping the remote offices together. The disadvantage of this approach is that the infrastructure at a remote site can’t be managed through vCenter if that site loses connectivity to the central vCenter.
Additional Multi-Pair Topologies
In addition to the shared recovery site topologies defined above I am often asked about support for other multi-datacenter topologies. Prior to the release of SRM 5.8 VMware supported some additional topologies using the Request for Product Qualification (RPQ) process.
While the shared recovery topology is still the main use case for multi-pair topologies, the technology provided by SRM underneath actually supports more flexible configurations. The SRM documentation has recently been updated to remove some of the previous limitations and allow some more complex topologies to be supported based on the concept of multiple SRM pairs. From the SRM 5.8 Installation and Configuration Guide:
“In addition to the shared recovery site configuration, Site Recovery Manager also allows and supports shared protected site (1:N) and many-to-many (N:N) configurations.”
You can also begin with a standard two-site SRM deployment and add further site pairings later to build up a more complex topology; this is supported.
With the new flexibility, a new set of potential use cases is now supported. For example, a three-site topology with round-robin-style protection would be possible:
- Site A’s workloads being protected to Site B
- Site B’s workloads being protected to Site C
- Site C’s workloads finally being protected back to Site A
Another topology that might be of interest would be to have a traditional SRM pairing between two core data centers while also providing DR protection for remote offices to one (or even both) of the core data centers.
The examples presented above are not intended to be exhaustive. As long as you stay within the configuration maximums (10 or fewer SRM pairings), a wide variety of topologies is possible.
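A planned topology can be sanity-checked against the pairing limit before anything is deployed. This is a minimal sketch that models each protection direction as a (protected site, recovery site) tuple; the function names are hypothetical, and treating the limit of 10 as a count of distinct site pairings across the whole topology is an assumption, so check the configuration maximums for how it is actually counted.

```python
MAX_SRM_PAIRINGS = 10  # limit quoted in the article; how it is counted is an assumption


def site_pairings(protection_flows):
    """Collapse directional flows into distinct site pairings.

    A -> B and B -> A use the same pairing, so each pairing is stored
    as an unordered frozenset of the two site names.
    """
    return {frozenset(flow) for flow in protection_flows}


def within_pairing_limit(protection_flows, limit=MAX_SRM_PAIRINGS):
    """True when the topology needs no more than `limit` pairings."""
    return len(site_pairings(protection_flows)) <= limit


# The three-site round-robin example from the list above.
round_robin = [("A", "B"), ("B", "C"), ("C", "A")]
print(len(site_pairings(round_robin)))    # 3
print(within_pairing_limit(round_robin))  # True
```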
SRM 5.8 Improvements In Managing Multiple SRM Pairs
With the introduction of the web client in SRM 5.8 there have been several improvements in managing more complex topologies with SRM. The first improvement is the increase in supported SRM scale to 5,000 virtual machines, up from 1,000 in previous releases. This makes it easier to consider deploying SRM across multiple pairs of data centers.
One of the big user interface improvements is gained by integrating with the inventory lists of the web client. It is now easy to quickly switch between multiple site pairings.
The lists of protection groups and recovery plans are also consolidated so you can see plans belonging to multiple site pairings at once.
In addition to the consolidated views the various wizards to create new protection groups and recovery plans (for example) let users choose the target SRM pairing during creation.
Finally, the SRM installer has also been enhanced to make it easier to deploy multiple pairs of SRM servers without having to specify command line parameters to the installer. For the initial SRM pairing you would use the default identifier. For subsequent pairings, both SRM instances in the pair should be installed using the same custom Plug-in Identifier, and each pairing should have its own distinct identifier.
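The identifier rule is easy to get wrong when several pairings are installed over time, so it can be worth checking a deployment plan against it. The sketch below is only a planning aid under stated assumptions: the pairing names, the identifier strings (including the literal "default" standing in for the default identifier), and the function name are hypothetical and are not values the installer uses.

```python
def identifier_problems(pairings):
    """Return a list of rule violations for a planned set of SRM pairings.

    `pairings` maps a pairing name to the two Plug-in Identifiers planned
    for the SRM instances on each side of that pairing.
    """
    problems = []
    first_user = {}  # identifier -> name of the first pairing that used it
    for name, (id_a, id_b) in pairings.items():
        if id_a != id_b:
            problems.append(f"{name}: the two SRM instances use different identifiers")
            continue
        if id_a in first_user:
            problems.append(f"{name}: identifier {id_a!r} is already used by {first_user[id_a]}")
        else:
            first_user[id_a] = name
    return problems


plan = {
    "core-a/core-b": ("default", "default"),
    "office-1/core-a": ("office1", "office1"),
    "office-2/core-a": ("office1", "office1"),  # clashes with office-1
    "office-3/core-b": ("office3", "off3"),     # the two sides disagree
}
for problem in identifier_problems(plan):
    print(problem)
```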
Replication Topology Support
The SRM 5.8 Installation and Configuration guide states:
“Site Recovery Manager supports point-to-point replication. Site Recovery Manager does not support replication to multiple targets, even in a multi-site configuration.”
While SRM pairs do allow you to fail over distinct workloads to different recovery sites, SRM doesn’t currently support orchestrating the failover of the same workload to different recovery sites. This means that a given replicated datastore can be managed by only one SRM pairing.
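The same restriction can be expressed as a check over a planned design: no replicated datastore should appear under more than one pairing. This is a hypothetical planning sketch, with made-up datastore and pairing names, and it does not query SRM or any storage array.

```python
from collections import defaultdict


def datastores_with_multiple_pairings(assignments):
    """Return the datastores that are assigned to more than one SRM pairing.

    `assignments` is an iterable of (datastore name, pairing name) tuples.
    """
    pairings_by_datastore = defaultdict(set)
    for datastore, pairing in assignments:
        pairings_by_datastore[datastore].add(pairing)
    return sorted(
        datastore
        for datastore, pairings in pairings_by_datastore.items()
        if len(pairings) > 1
    )


planned = [
    ("ds-finance", "A-B"),
    ("ds-web", "A-C"),
    ("ds-finance", "A-C"),  # same datastore planned for a second target
]
print(datastores_with_multiple_pairings(planned))  # ['ds-finance']
```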
Hopefully this article has outlined some of the newly supported capabilities available to SRM users. Planning multi-site disaster recovery strategies takes careful thought as to what you are looking to achieve but the new capabilities make it even easier to address a variety of recovery requirements.
If you’re interested in learning more about SRM 5.8 then I’d recommend the official documentation, product resource page, the uptime blog, and Eric Shanks’ guide to SRM 5.8.