Virtualization and Cloud Computing: Benefits and E-Discovery Implications
What exactly is virtualization and why is there so much buzz about it these days? Virtualization can occur in many forms, but most people initially think of using it to consolidate servers onto a single hardware platform. Essentially, you can run multiple servers on a single piece of hardware, where each “server” has its own memory “footprint” within the host machine. Servers are usually the first thing firms virtualize when they embark down the virtualization path. There are many other forms of virtualization, such as desktop, network and storage virtualization. Desktop virtualization is used by everyone from the larger firms all the way down to solos.
Virtualization has been around for a long time and was commonly used in the big iron mainframe days. The big mainframes had sections “carved out” that were running different operating systems and applications. This takes advantage of the investment in hardware. It is similar to having multiple people riding on a single train. The more people riding, the more cost effective the operation. It gets very expensive to operate the train if there is only one passenger. By virtualizing servers, we can reduce overall power consumption, cooling requirements and maintenance costs. This very “green” impact has been one of the big drivers towards virtualization.
Cloud computing is also a very popular term these days, and a lot of providers use virtualization as part of their solution. What exactly is cloud computing? Basically, it is using an application or computing service that doesn’t normally reside on your premises. Typically, an application that you access over the Internet is considered cloud computing or software as a service (SaaS). Google Docs is a good example of SaaS and cloud computing. The application resides on the Internet and you access your data with a browser. Another feature of cloud computing is its on-demand nature. You can activate services very quickly with very little setup time.
These new uses of technology bring along some challenges, especially when dealing with the electronic data as part of discovery. We’ll go into some more details about the technical issues for e-discovery, but let’s get through some of the propeller head stuff first explaining how the technology works.
Let’s talk a little bit about some of the terminology you’ll hear when dealing with virtualized environments. There will be multiple independently installed virtual machines (VMs) or guests that may even be running different operating systems. These virtual machines run on the same physical hardware, otherwise known as the host. This can be done by running the VMs on top of a host operating system with virtualization software or by running on top of specialized virtualization software called the hypervisor, which has direct access to the hardware. Microsoft’s implementation of this is called Hyper-V and is a very popular (and free) method to facilitate virtualization. Besides Microsoft, VMware products are widely used in the virtualization market. Another term you may hear is P2V, which is physical to virtual. Essentially, this takes your physical server and converts it into a virtual environment. Another thing to remember is that each virtual machine is unaware of the other VMs running on the host system.
About now, we expect we’ve achieved that classic “deer in the headlights” look from many readers. But please, read on, because virtualization is an incredible asset to law firms large and small.
Why would you even consider virtualizing servers at your law firm? Besides the reduction in energy costs and the savings in physical space, virtualized machines make it very fast to recover from failures or even to provide a high availability environment. Each running machine is really nothing more than a bunch of files that are independent of the hardware. This means that if there is a failure, you can take your backup files and “stand up” another instance of the VM somewhere else. This usage of virtualization is very attractive for disaster recovery purposes. As an example, we utilize a backup device that takes snapshots of the data every 15 minutes. Should one of the servers fail (e.g., the File and Print server), we can virtualize the server on the backup appliance utilizing the latest 15-minute snapshot. This means that we are back in operation very quickly, running a VM in place of the failed server. And we’ve lost no more than 15 minutes’ worth of data. What’s that worth to you? This same process can be followed for virtualized servers too.
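To make the 15-minute figure concrete, here is a minimal sketch of the arithmetic behind it. It assumes snapshots fire on a fixed schedule starting at a known time; the function names and the sample times are ours for illustration and do not come from any particular backup product.

```python
from datetime import datetime, timedelta

# Snapshot interval taken from the example in the text.
SNAPSHOT_INTERVAL = timedelta(minutes=15)


def latest_snapshot(failure_time, first_snapshot, interval=SNAPSHOT_INTERVAL):
    """Return the time of the most recent snapshot at or before the failure."""
    if failure_time < first_snapshot:
        raise ValueError("failure occurred before the first snapshot")
    # Dividing one timedelta by another with // gives a whole number of intervals.
    completed_intervals = (failure_time - first_snapshot) // interval
    return first_snapshot + completed_intervals * interval


def data_loss_window(failure_time, first_snapshot, interval=SNAPSHOT_INTERVAL):
    """Return how much work falls between the last snapshot and the failure."""
    return failure_time - latest_snapshot(failure_time, first_snapshot, interval)


if __name__ == "__main__":
    # Hypothetical times: snapshots start at 9:00, the server fails at 10:52.
    first = datetime(2024, 1, 1, 9, 0)
    failure = datetime(2024, 1, 1, 10, 52)
    print("Restore from:", latest_snapshot(failure, first))
    print("Data at risk:", data_loss_window(failure, first))
```

With these sample times the restore point is 10:45 and seven minutes of work are at risk. Whatever the failure time, the loss window is always shorter than the snapshot interval, which is where the "no more than 15 minutes" promise comes from.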
As a design goal, you want to run your host hardware at around 60% utilization. This maximizes the number of VMs on the host and provides room so that each VM can burst up and use the remaining processing power of the host. So don’t get greedy and try to max out the utilization – you’ll potentially do yourself a great deal of harm going down that road.
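A rough sizing calculation shows how the 60% target plays out. This is only a sketch: the 16-core host and the 1.5-core average load per VM are made-up numbers, and real sizing should also look at memory, disk and network load.

```python
def max_vms(host_capacity, avg_vm_load, target_utilization=0.60):
    """Return how many average VMs fit within the target share of the host."""
    if avg_vm_load <= 0:
        raise ValueError("avg_vm_load must be positive")
    budget = host_capacity * target_utilization
    return int(budget // avg_vm_load)


def utilization(host_capacity, vm_loads):
    """Return the fraction of host capacity consumed by the given VM loads."""
    return sum(vm_loads) / host_capacity


if __name__ == "__main__":
    # Hypothetical host with 16 cores; each VM averages 1.5 cores of load.
    count = max_vms(16, 1.5)
    used = utilization(16, [1.5] * count)
    print(f"{count} VMs, {used:.1%} of the host in use")
```

In this made-up case six VMs fit, using a little over 56% of the host. A seventh would push the host past the target and eat into the room each VM needs to burst.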
Is virtualization only for the large law firms? Not at all. Certainly, large firms were the first to implement virtualized environments, but there are advantages for small law firms as well. You could have one virtualized server to test updates to applications, new applications or even operating system patches. We’ve seen small firms virtualize several servers (E-Mail, File and Print, Domain Controller, Database, etc.) onto a single platform. If you are running Terminal Server, it is a good idea to virtualize that too since other applications may have issues running alongside Terminal Server at the same time. In our own environment, we currently run Terminal Server as a guest VM along with a guest instance of the BlackBerry Enterprise Server Express (free) on the same host.
Another advantage of virtualization is rapid deployment and flexibility. This is not quite the same thing as providing for a disaster backup as we’ve already mentioned. Rapid deployment means that you can take the VM and move it to another host very quickly with little or no impact. Remember that a virtual machine is nothing but a bunch of files so moving it to another host is really nothing more than copying files. Changing the characteristics of the virtual machine is another great advantage. You can adjust memory and hardware availability on the fly. As an example, we just increased RAM for one of our VMs from 2GB to 3GB with just a couple of mouse clicks. There may be limitations in sharing the host peripherals among the VMs depending on the product you are using. As an example, we can’t define any of the USB ports to a virtual machine with the version of Microsoft Hyper-V we have running on one of our hosts. VMware is a lot more flexible and we haven’t had problems sharing any of the host peripherals with the guest VMs.
Servers aren’t the only reason for virtualization. Many lawyers using Macs are very familiar with virtualization. Lots of Mac users are running VMware or Parallels with a copy of Windows (in the virtual machine) to run some software that doesn’t have a Macintosh version. This allows the Mac user to continue to use a Windows-based application until the vendor produces a Mac native version. Virtualization for Macs doesn’t just mean that Windows is a guest system all the time. You can even run Mac OS X as a guest VM on a Windows system.
What are some of the other considerations for virtualization? Just because you can run multiple virtual machines on a single host doesn’t mean that everything is free. There are licensing costs associated with each VM you have running. This means that you’ll still need to pay for the operating system license, the mail server licenses and any other application license cost. Be sure to check the terms of licensing since it is a rapidly changing landscape. Software manufacturers are addressing licensing since virtualization has become very popular. Some provide special terms for licensing in a VM environment where each instance is at a far reduced rate. As an example, your anti-virus provider may offer a per server cost that is much lower than individual pricing if the software was running on separate hardware for each server.
Another consideration is the skillset required to configure and maintain a virtualized environment. Running a VM on a single workstation is pretty straightforward. As an example, installing Parallels on a Mac and then installing Windows in the Parallels VM is a task most attorneys can handle without any trouble. However, sizing and designing a server environment is a lot more complicated. If you are a larger firm, get your IT staff some training in the virtualization hardware/software that you intend to deploy. If you’re a smaller firm, make sure that your IT support folks are certified or trained in the particular products and are not just going through the “read me” file that came with the software.
Virtualization has spread like wildfire recently. Because of its many advantages, it is here to stay and we predict that more and more firms will be implementing it over the next several years. Besides the overall cost savings, it’s a great way to minimize downtime for the firm. When you discuss guaranteed business continuity with a law firm and tell the lawyers that they can ensure that they will never lose more than 15 minutes of data, trust us, you have their rapt attention. So if you haven’t thought about the advantages of virtualization yet, seize the moment and put it high on your to-do list.
By now we’re sure you’re sold on virtualization and many of you (and your clients too) already have virtual environments. But what do you do when there’s litigation? What do you preserve and are there special considerations? You bet. Let’s start with a virtualized environment that you control. As we’ve already mentioned, it is very easy and fast to preserve a VM in its current state by just taking a snapshot. Perhaps that’s all you really need to do for preservation. It’s a simple and cheap process. This assumes that you don’t need a forensic image of the VM. If you do, then things get a lot more complicated, and frankly a forensic image is usually not necessary. If you really do want to forensically preserve the VM environment, then you need to image the entire piece of hardware. This would include the host OS along with all disks in the machine.
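Since a VM is just a set of files, one way to document a preserved copy is to record a cryptographic hash of each file and later confirm that a copy still matches. The sketch below is our own illustration, not a feature of any virtualization product, and it is not a substitute for a forensic image. It assumes the VM is shut down or that you are hashing an exported snapshot, because the files of a running VM change constantly.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large virtual disks do not exhaust memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(vm_dir):
    """Map each file's path (relative to vm_dir) to its SHA-256 hash."""
    root = Path(vm_dir)
    return {
        path.relative_to(root).as_posix(): sha256_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_copy(original_manifest, copy_dir):
    """Return True only if the copy has the same files with the same hashes."""
    return build_manifest(copy_dir) == original_manifest
```

You would call `build_manifest` on the folder holding the preserved VM files at the time of preservation, keep the result with your records, and run `verify_copy` against any later copy to show that nothing was added, removed or altered.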
Electronic discovery in the cloud is a different beast. Many cloud providers use virtualization to achieve efficiencies and keep end-user cost down. As an example, if you buy a cloud “server” from a provider, they will normally give you a VM that is running along with VMs of other companies. That’s where things may get sticky. What if a company is being investigated by the DOJ and they seize the hardware where the VM resides? It’s just your bad luck if your VM is also on the same hardware. Essentially, you are at the mercy of the cloud provider. Will they move your VM to a different piece of hardware before the Feds arrive or are you out of business? This particular situation is something that needs to be addressed as part of the terms of service with the provider.
What ability does the provider have to preserve electronically stored information as part of litigation? What logging and auditing do they provide? How will the ESI be produced? In what format? Will they maintain chain of custody? Are there cross-border or privacy issues? As you can see, there are a lot of issues that can arise when your data is physically out of your control. You should try to address all of these issues before litigation. The bottom line is that you are pretty much stuck with the capabilities of the cloud provider and how they handle the data.
It is critical to understand data location when using cloud services. If data resides in a foreign country, you may not even be able to access it in extreme cases. Different laws may apply and you may need to get legal assistance from someone familiar with the storage country’s laws. Even if the data is completely within the United States, you may be faced with other challenges. Most reputable cloud providers have multiple data centers and replicate data between them. This is a way to provide high reliability and availability. It also means that data may be in multiple places. This may actually increase your litigation exposure because of the different jurisdictional entities.
So how many of you use Dropbox? Isn’t it a clever cloud service? We absolutely love it. Not just because it seems to be the most practical way to get data to and from an iPad, but because of the potential evidence sources. Remember that data is synchronized to each computer where Dropbox is installed. This means that there may be different versions of a document on each computer, assuming that a given computer hasn’t connected to the Internet in some time. We don’t know of many folks that encrypt the data before handing it off to Dropbox, but the potential evidence source is certainly something you should be investigating.
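Once copies of a synchronized document have been collected from several computers, a quick way to see whether they are the same version is to group them by hash. This is a simplified sketch under our own assumptions: the machine names and contents are made up, and the copies are passed in as bytes rather than read from disk.

```python
import hashlib
from collections import defaultdict


def group_versions(copies):
    """Group machines by the content hash of their copy of a document.

    copies maps a machine name to the bytes of that machine's copy.
    Returns a sorted list of machine-name lists, one list per distinct version.
    """
    groups = defaultdict(list)
    for machine, content in copies.items():
        groups[hashlib.sha256(content).hexdigest()].append(machine)
    return sorted(sorted(machines) for machines in groups.values())


if __name__ == "__main__":
    # Hypothetical collection: the laptop has been offline and holds an older draft.
    collected = {
        "office-pc": b"draft v2",
        "laptop": b"draft v1",
        "home-pc": b"draft v2",
    }
    for machines in group_versions(collected):
        print(machines)
```

Here the office and home computers hold identical copies while the laptop holds a different one, which is exactly the kind of stale version worth a closer look.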
Some questions that you can consider are:
- What types of data will you store in the cloud?
- Will you be encrypting the data?
- What will the cloud provider give you in regards to data protection, access, retention, security and logging?
- Where will the data be stored and can you specify the geographic location?
- Who can access the data?
- How will the data be returned and in what format? How long will it take and is there any charge?
- How does the provider deal with metadata and is it preserved?
These are just items to consider as you move to the cloud.
Cloud computing and virtualization are very powerful technologies for businesses today. You can help your clients by preparing them for the e-discovery challenges before they are involved in litigation. Make sure you have properly addressed issues with the service provider and their terms of service. All responsibilities and actions need to be identified in the contract terms. Don’t be afraid to dip your toe into the cloud and virtual world. Just make sure you know where the evils and challenges may reside.
Great article. It’s also important to mention that data storage in the cloud can lead to third-party disclosure risks, particularly in regard to attorney-client privilege (here in the U.S., anyway). As you mention, a clear and complete terms-of-service agreement that addresses these issues is vital.