CloudNet: A Platform for Optimized WAN Migration of Virtual Machines

University of Massachusetts, Technical Report

CloudNet: A Platform for Optimized WAN Migration of Virtual Machines

Timothy Wood, Prashant Shenoy (University of Massachusetts Amherst)
K.K. Ramakrishnan, Jacobus Van der Merwe (AT&T Labs - Research)

Abstract

Cloud computing platforms are growing from clusters of machines within a data center to networks of data centers with resources spread across the globe. Virtual machine migration within the LAN has changed the scale of resource management from allocating resources on a single server to manipulating pools of resources within a data center. We expect WAN migration to likewise transform the scope of provisioning from a single data center to multiple data centers spread across the country or around the world. In this paper we propose a cloud computing platform linked with a VPN-based network infrastructure that provides seamless connectivity between enterprise and data center sites, as well as support for live WAN migration of virtual machines. We describe a set of optimizations that minimize the cost of transferring persistent storage and moving virtual machine memory during migrations over low-bandwidth, high-latency Internet links. Our evaluation on both a local testbed and across two real data centers demonstrates that these improvements can reduce total migration and pause time by over 3%. During simultaneous migrations of four VMs between Texas and Illinois, CloudNet's optimizations reduce memory migration time by 65% and lower bandwidth consumption for the storage and memory transfer by 2GB, a 57% reduction.

1 Introduction

Cloud computing enables both large and small enterprises to better manage their resources: some no longer need to invest in local IT resources and can instead lease cheaper, on-demand resources from providers, while others can use the flexibility of cloud resources to dynamically meet peak demand without having to over-provision in-house resources.
Since cloud platforms typically rely on virtualization, new resources can be added quickly and dynamically, within minutes. From a cloud computing service provider's perspective, server virtualization allows flexible multiplexing of resources among customers without the need to dedicate physical resources individually.

Current commercial solutions present cloud servers as isolated entities with their own IP address space outside the customer's control. This separation of cloud and enterprise resources increases software and configuration complexity when deploying services, and can lead to security concerns since enterprise customers must use IP addresses on the public Internet for their cloud resources. Cloud platforms leave the onus on the customer to securely connect the cloud and enterprise resources and to manage firewall rules.

A more desirable architecture is for storage and compute resources in the cloud to be seamlessly connected to an enterprise's users and applications, acting as if they were secure, local resources within the enterprise LAN. In such a scenario, we envision that an enterprise's IT services will be spread across the corporation's data center as well as dynamically set-up cloud data centers. Enterprises may choose to locate applications in provider cloud data centers for performance reasons, e.g., when the provider cloud is more optimally placed between customer sites than the enterprise's own data center, or they may use the provider cloud to handle overflow from local servers during periods of peak demand. Ideally, these cloud data centers could be located anywhere in the world to take advantage of costs such as energy, infrastructure, and labor, or of workload metrics such as diurnal usage patterns. Further, cloud data centers in certain geographies can be exploited to move data and applications closer to end users.
These challenges increase when placement decisions can change, requiring applications to be dynamically moved between data centers in response to changing costs or workloads. As a consequence, quickly and transparently migrating computing and storage from one data center to another (whether in the enterprise or in the cloud) will be necessary to break the boundaries between geographically separated data centers. WAN migration changes the scale of provisioning from managing servers on a rack to optimizing pools of resources from multiple data centers. It also greatly simplifies deployment into the cloud, allowing an enterprise to seamlessly move a live application from its own infrastructure into a cloud data center without incurring any downtime.

Unfortunately, existing virtual machine migration techniques are designed for the LAN, and are not sufficiently optimized to perform well in the low-bandwidth, high-latency settings that are typical of WAN environments. Research prototypes and commercial products are only beginning to make WAN migration feasible, and their requirements in terms of storage and network configuration, as well as bandwidth and latency, still prevent it from being practical.

We propose a platform called CloudNet to achieve the vision of securely connected enterprise and cloud sites that support dynamic migration of resources. CloudNet uses virtual private networks (VPNs) to provide secure communication channels and to give customers greater control over network provisioning and configuration between their sites and the cloud. CloudNet bridges the local networks of multiple data centers, making WAN-based cloud resources look like local LAN resources and allowing LAN-based protocols to operate seamlessly across these bridged WAN sites, albeit with increased network delay. As a consequence, LAN-based live virtual machine migration techniques [9] operate unmodified over CloudNet, allowing VMs to be moved across WAN sites.
However, such a capability addresses only part of the problem, as LAN-based live migration techniques perform poorly in low-bandwidth, high-latency WAN settings. To address this key challenge, CloudNet incorporates a set of optimizations that significantly improve the performance of VM migration in WAN environments. Further, while traditional live migration methods assume a shared file system is available at both sites, CloudNet allows VM migration across data centers with or without shared storage by also migrating disk data when no shared storage is available.

Our contributions include:

1. The design and implementation of a cloud computing platform that seamlessly connects resources at multiple data center and enterprise sites.
2. A holistic view of WAN migration that handles persistent storage, network connections, and memory state with minimal downtime.
3. Optimizations that minimize the total migration time, application downtime, and the volume of data transferred.
4. An extensive evaluation of how different application types impact migration performance under a variety of network conditions.

Our experiments using a set of realistic applications show CloudNet's optimizations decreasing memory migration and pause time by 3 to 7% in typical link capacity scenarios. We also evaluate application performance during migrations to show that CloudNet's optimizations shorten the window of decreased performance while VM state is transferred, compared to existing techniques.

2 Design Overview

In this section, we present some background and an overview of the CloudNet design, along with the motivation for why WAN migration is essential for managing resources across data centers.

Figure 1: Two VPCs isolate resources within the cloud sites and securely link them to the enterprise networks.
2.1 Seamless, Secure Cloud Connections

Most cloud platforms allow cloud resources to have either private IP addresses that confine them within the cloud, or public IP addresses that allow them to be connected to the enterprise but also potentially expose them to malicious Internet traffic. These cloud platforms rely on user-configured firewalls, such as Amazon EC2's security groups, to ensure that cloud and enterprise resources can be securely connected. These approaches are inadequate for enterprise needs: no effort is made to give the abstraction that cloud resources are seamlessly connected to the enterprise's existing infrastructure, and misconfiguration can easily leave resources unprotected.

To address these transparency and security challenges, CloudNet uses the notion of a Virtual Private Cloud (VPC)¹. A VPC is a combination of cloud computing resources with a VPN infrastructure that gives users the abstraction of a private set of cloud resources that are transparently and securely connected to their own infrastructure. Figure 1 shows a pair of VPCs that span multiple cloud data centers, but present a unified pool of resources to each enterprise.

Seamless network connections: CloudNet uses MPLS-based VPNs to create the abstraction of a private network and address space shared by all VPN endpoints, connecting resources from different sites as if they were on a single network. Since addresses are specific to a VPN, the cloud operator can allow customers to use any IP address ranges they like without worrying about conflicts between cloud customers. Another benefit of MPLS-based VPNs is that the level of abstraction can be made even greater with Virtual Private LAN Services (VPLS), which bridge multiple VPN endpoints onto a single LAN segment. This allows cloud resources to appear indistinguishable from existing IT infrastructure already on the enterprise's own LAN.
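The point that addresses are specific to a VPN can be illustrated with a small sketch. This is not CloudNet's implementation: the `VpnAddressPlan` class and the customer names are hypothetical, and the sketch only shows that overlap needs to be checked within a VPN, never across VPNs.

```python
import ipaddress
from collections import defaultdict


class VpnAddressPlan:
    """Hypothetical bookkeeping of address ranges, scoped per VPN."""

    def __init__(self):
        # Maps a VPN identifier to the networks already claimed inside it.
        self._ranges = defaultdict(list)

    def allocate(self, vpn_id, cidr):
        """Claim a range inside one VPN; other VPNs are never consulted."""
        net = ipaddress.ip_network(cidr)
        for existing in self._ranges[vpn_id]:
            if net.overlaps(existing):
                raise ValueError(f"{cidr} overlaps {existing} in VPN {vpn_id}")
        self._ranges[vpn_id].append(net)
        return net


plan = VpnAddressPlan()
# Two customers pick the same private range; there is no conflict because
# each range lives in its own VPN address space.
a = plan.allocate("customer-a", "10.0.0.0/16")
b = plan.allocate("customer-b", "10.0.0.0/16")
```

Under this model, a conflict is only reported when a single customer tries to claim overlapping ranges inside its own VPN.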
Secure any-to-any communication: VPNs are already used by many large enterprises, and cloud sites can be easily added as new secure endpoints within these existing networks. VPCs use VPNs to provide secure communication channels via the creation of virtually dedicated paths in the provider network. This eliminates the need to configure complex firewall rules between the cloud and the enterprise, as all sites can be connected via a private network inaccessible from the public Internet.

¹ After proposing the virtual private cloud concept in [29], we have since found it also used on a blog post encouraging the use of VPNs and cloud computing [13], and it has subsequently been used for an Amazon product.

2.2 Resource Pools that Span Data Centers

As enterprises increase their reliance on cloud computing for cheap and dynamic access to resources, it has become necessary to manage and optimize resources across multiple data centers. For instance, a single cloud provider may expose the presence of different geographically separate data centers, e.g., as availability regions in EC2 [4], enabling an enterprise to perform cross-geographic placement and optimizations. Similarly, an enterprise may lease resources from different cloud providers, each with their own data center, and perform cross-data center optimizations to ensure availability or to exploit dynamic prices.

Today, jointly managing multiple data centers across the Internet is difficult because the lack of seamless connections between sites isolates resources, and there are only limited mechanisms for moving resources between locations. CloudNet's VPC architecture simplifies cross-data center management, since its use of VPNs enables independent resource pools at each cloud site to be grouped into a single pool of resources transparently connected to the enterprise.
Resources at new cloud data centers can be easily mapped into the VPC, and existing resources can be efficiently moved between enterprise and data center sites. Further, application-level considerations such as workloads or fault tolerance requirements can be used to dynamically decide where to place individual VMs.

2.3 Efficient WAN Migration

In order to dynamically manage and optimize resources across multiple data centers, an enterprise must be able to efficiently perform live migration of applications (and their data) across data centers. Several virtualization platforms support efficient migration of VMs within a local network [9, 2]. By virtue of presenting WAN resources as LAN resources, CloudNet's VPC abstraction allows these live migration mechanisms to function unmodified across data centers separated by a WAN. However, the lower bandwidth and higher latencies of WAN links result in poor performance, as we show in Section 3.3. In fact, VMware's recently announced support for WAN VM migration between nearby data centers requires at least 622 Mbps of bandwidth dedicated to the transfer, and is designed for links with less than 5 msec latency [3]. Despite being interconnected using fat gigabit pipes, data centers will typically be unable to dedicate such high bandwidth to a single application transfer, and enterprises will want the ability to migrate a group of related VMs concurrently.

Further, current live VM migration techniques assume the presence of a shared file system, which enables them to migrate only memory state and avoid moving disk state. A shared file system may not always be available across a WAN, or the performance of the application may suffer if it has to perform I/O over a WAN.

Figure 2: The phases of a migration for non-shared disk, memory, and the network in CloudNet.
Therefore, WAN migration techniques must be able to optionally migrate an application's disk state, in addition to migrating its memory state. Current LAN-based live migration techniques must be optimized for WAN environments before enterprises can fully exploit their benefits for cross-data center resource management, and this is the primary focus of this paper.

3 WAN Migration in CloudNet

Consider an organization that wants to move one or more applications (and possibly their data) from Data Center A to Data Center B. Each application is assumed to run in a VM, and we wish to live migrate those virtual machines across the WAN. CloudNet uses these steps to live migrate each VM:

Step 1: Establish layer-2 connectivity between data centers, if needed.
Step 2: If storage is not shared, transfer the application's disk state.
Step 3: Transfer the memory state of the application to a server in Data Center B, as it continues running without interruption.
Step 4: Once the disk and memory state have been transferred, briefly pause the application for the final transition of memory and processor state to Data Center B. This process must also maintain any active network connections between the application and its clients.

While these steps, illustrated in Figure 2, are well understood in LAN environments, migration over the WAN poses new challenges. The constraints on bandwidth and the high latency found in WAN links make steps 2 and 3 more difficult since they involve large data transfers. The IP address space in step 4 would typically be different when the VM moves between routers at different sites, making it difficult or impossible to seamlessly transfer active network connections. CloudNet avoids this problem by using VPLS VPN technology in step 1, and uses a set of migration optimizations to improve performance in the other steps.
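The four steps above can be sketched as a simple planning function. This is an illustrative outline only, not CloudNet's code: `MigrationPlan`, its two flags, and `migration_phases` are hypothetical names, and each phase is represented by a descriptive string rather than a real operation.

```python
from dataclasses import dataclass


@dataclass
class MigrationPlan:
    """Hypothetical description of the starting conditions for one VM."""
    layer2_connected: bool   # are sites A and B already bridged?
    shared_storage: bool     # can both sites reach the same disk image?


def migration_phases(plan):
    """Return the ordered phases implied by Steps 1-4 for this plan."""
    phases = []
    if not plan.layer2_connected:
        # Step 1 is only needed when the sites are not yet bridged.
        phases.append("establish layer-2 connectivity")
    if not plan.shared_storage:
        # Step 2 is skipped when storage is shared between the sites.
        phases.append("transfer disk state")
    # Step 3 runs while the VM keeps executing at the source.
    phases.append("transfer memory state while VM runs")
    # Step 4 is the brief pause for the final memory and processor state.
    phases.append("pause VM and transfer final memory and processor state")
    phases.append("redirect active network connections")
    return phases


for phase in migration_phases(MigrationPlan(layer2_connected=False,
                                            shared_storage=False)):
    print(phase)
```

With shared storage and an existing layer-2 connection, the plan reduces to the memory transfer, the final pause, and the connection redirect, which matches the LAN case.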
3.1 VPLS-Driven Migration

Bridging sites A and B with a layer-2 connection simplifies network reconfiguration during a migration because it provides the abstraction of a single LAN across Data Centers A and B. While there are several technologies available to create such connections, CloudNet uses VPLS-based VPNs since these are already commonly used by enterprises. In many cases, Data Center B will already be a part of the customer's virtual private cloud, because other VMs owned by the enterprise are already running there. However, if this is the first VM being moved to the site, then a new VPLS endpoint must be created to extend the VPC into the new data center.

Creating a new VPLS endpoint involves configuration changes on the data center router in question. This process can be readily automated via the configuration interfaces of modern routers [2, 1]. Group membership in a VPLS VPN is typically determined during this configuration phase. However, to facilitate more dynamic group changes, CloudNet uses a centralized VPN Controller to adjust which VPLS endpoints are grouped together to form each virtual private cloud. The VPN Controller maintains a ruleset indicating which endpoints should have connectivity; as all route control messages pass through the VPN Controller, it is able to control how the tunnels forming each VPLS are created. This ensures that each customer's resources are isolated within their own VPLS networks, providing CloudNet's virtual private cloud abstraction.

Maintaining Network Connections: Once disk and memory state have been migrated (as discussed in the subsequent sections), CloudNet must ensure that active network connections are redirected to Data Center B. In LAN migration, this is achieved by having the destination host transmit an unsolicited ARP message that causes the local switch to adjust the mapping for the VM's MAC address to its new switch port [9].
Over a WAN, this is not normally a feasible solution because the source and destination are not connected to the same switch. Fortunately, CloudNet's use of VPLS bridges the VLANs at data centers A and B, causing the ARP message to be forwarded over the Internet to update the switch mappings at both sites. This allows open network connections to be seamlessly redirected to the VM's new location.

3.2 Disk State Migration

LAN-based live migration assumes a shared file system for VM disks, eliminating the need to migrate disk state between hosts. As this may not be true in a WAN environment, CloudNet supports either shared disk state or a replicated system that allows storage to be migrated with the VM.

If the enterprise has access to a global SAN, then WAN migration can be achieved by simply granting the VM secure access to the SAN from both data centers. Otherwise, we have a shared-nothing architecture where VM storage must be migrated along with the VM memory state. CloudNet uses a disk replication system that migrates storage in much the same way that memory is transferred during a VM migration.

Figure 3: Low bandwidth Internet links can significantly increase the time required to migrate virtual machines. (Plot of pause time in seconds for SpecJBB, Kernel Compile, and TPC-W.)

Once a VM migration has been planned, the replication system must copy the VM's disk to the remote host, and must continue to synchronize the remote disk with any subsequent writes made at the primary. To reduce the performance impact of this synchronization, CloudNet uses asynchronous replication during this stage. Once the remote disk has been brought to a consistent state, CloudNet switches to a synchronous replication scheme and the live migration of the VM's memory state is initiated. During the VM migration, disk updates are synchronously propagated to the remote disk to ensure consistency when the memory migration finishes and the VM becomes live on the remote host.
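The switch from asynchronous to synchronous replication can be illustrated with a toy in-memory model. This is a sketch under simplifying assumptions, not the replication system CloudNet uses: the `ReplicatedDisk` class is hypothetical, disks are modeled as dictionaries from block number to contents, and the background drain is invoked explicitly.

```python
class ReplicatedDisk:
    """Toy model of a primary disk with one remote replica."""

    def __init__(self, blocks):
        self.primary = dict(blocks)
        self.replica = {}
        self.pending = []      # writes acknowledged locally, not yet remote
        self.mode = "async"

    def initial_copy(self):
        """Bulk-copy the current primary contents to the remote site."""
        self.replica.update(self.primary)

    def write(self, block, data):
        """Apply a write; in sync mode the replica is updated before return."""
        self.primary[block] = data
        if self.mode == "sync":
            self.replica[block] = data
        else:
            self.pending.append((block, data))

    def drain(self):
        """Apply queued writes to the replica in their original order."""
        for block, data in self.pending:
            self.replica[block] = data
        self.pending.clear()

    def switch_to_sync(self):
        """Bring the replica to a consistent state, then replicate in step."""
        self.drain()
        self.mode = "sync"

    def is_consistent(self):
        return self.primary == self.replica


disk = ReplicatedDisk({0: b"boot", 1: b"data"})
disk.initial_copy()
disk.write(1, b"data-v2")          # async: replica lags behind the primary
lagging = not disk.is_consistent()
disk.switch_to_sync()              # replica catches up before memory migration
disk.write(2, b"log")              # sync: both copies change together
```

The asynchronous phase keeps write latency low while the bulk of the disk crosses the WAN; the synchronous phase trades latency for the guarantee that the replica matches the primary at the moment the VM resumes remotely.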
When the migration completes, the new host's disk becomes the primary node in the replication scheme, and the origin's disk is disabled.

3.3 Transferring Memory State

Most VM migration techniques use a pre-copy mechanism to iteratively copy the memory contents of a live VM to the destination machine, with only the modified pages being sent during each iteration [9, 2]. At a certain point, the VM is paused to copy the final memory state. WAN migration can be accomplished by similar means, but the decreased bandwidth can lead to decreased performance, particularly much higher VM downtimes, since the final iteration, during which the VM is paused, can last much longer. CloudNet augments the existing migration code from the virtualization platform with a set of optimizations that improve performance, as described in Section 4.

The amount of time required to transfer a VM's memory depends on its RAM allocation, working set size and write rate, and available bandwidth. These factors impact
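The way RAM allocation, working set size, write rate, and bandwidth interact in a pre-copy scheme can be illustrated with a deliberately simplified model. This is a sketch, not the algorithm used by any particular hypervisor or by CloudNet: the function name, the uniform dirty rate, the round limit, and the stop threshold are all assumptions chosen for illustration.

```python
def simulate_precopy(ram_mb, working_set_mb, dirty_mb_per_s,
                     bandwidth_mb_per_s, max_rounds=30,
                     stop_threshold_mb=1.0):
    """Estimate pause time, total time, and data sent for iterative pre-copy.

    Assumes pages are dirtied at a constant rate and that the amount of
    dirty memory can never exceed the working set size.
    """
    to_send = float(ram_mb)       # the first round sends all of RAM
    total_sent = 0.0
    total_time = 0.0
    for _ in range(max_rounds):
        if to_send <= stop_threshold_mb:
            break                 # little enough left to pause the VM
        round_time = to_send / bandwidth_mb_per_s
        total_sent += to_send
        total_time += round_time
        # Pages dirtied during this round must be resent in the next one.
        to_send = min(dirty_mb_per_s * round_time, working_set_mb)
    # Final iteration: the VM is paused while the remainder is copied.
    pause_time = to_send / bandwidth_mb_per_s
    return {
        "pause_s": pause_time,
        "total_s": total_time + pause_time,
        "sent_mb": total_sent + to_send,
    }


# Hypothetical VM: 1 GB of RAM, 100 MB working set.
lan = simulate_precopy(1024, 100, 10, 100)     # fast link, slow writer
wan = simulate_precopy(1024, 100, 50, 12.5)    # slow link, fast writer
print(lan["pause_s"], wan["pause_s"])
```

In this model, when the write rate exceeds the available bandwidth the dirty set never shrinks below the working set, so the iterations stop converging and the pause time is bounded only by the working set size divided by the bandwidth.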