THE SUCCESSFUL DATA CENTER MIGRATION
WHAT IS A DATA CENTER MIGRATION?
This makes sense. Downtime often equals lost revenue or a decline in customer satisfaction. Everyone knows what happens when they have an application down for any period of time. It isn’t pleasant. Factor in the risk associated with moving multiple applications, network equipment, IP space, etc. and this makes perfect sense. If you’re not comfortable quantifying the impact of the downtime, or know the impact of the downtime will be significant, you’re going to hold off on the migration.
Migrations, even of a single application, aren’t trivial.
Yes, you read that correctly. Of the companies that planned a data center migration, 20% cited the lack of a migration plan as the reason for delaying their migration. Interesting.
Migrations are not an everyday occurrence. You can get right to the brink of a migration, look at the plan, and realize you don’t have the expertise to execute in a timely fashion or without consequence. It would have obviously been ideal to recognize the lack of experience earlier in the process, but that’s not always realistic.
It’s quite interesting when you realize that the least cited issue associated with a delay in a data center migration is cost! Note: You’ll notice that these values add up to 145%. Many of the organizations surveyed in this research selected multiple reasons for delaying their migration. This is important because a delay isn’t about just one thing! All of these pieces are moving at the same time.
A SUCCESSFUL DATA CENTER MIGRATION
DISCOVERY
Discovery is responsible for bringing to light five key points:
1. A CLEAR DEFINITION OF PROJECT OBJECTIVES
Why are you migrating? What do you want to accomplish?
2. A MUTUALLY AGREED UPON SET OF SUCCESS CRITERIA
Simply stated, how will you know this migration was a success? All stakeholders, business and technology, need to be on the same page here. What does success mean for the Storage Team, the Networking Team, the Virtualization team and the business as a whole? It isn’t often that a migration is attempted without an objective. After all, there’s a reason apps and infrastructure should move or a migration wouldn’t have been identified. However, it is VERY often that a migration is attempted without any definition of success. Everyone has to be on the same page with success criteria for each group. It’s a simple step, but it’s critical. Communicate!
3. AN INFRASTRUCTURE ASSESSMENT
This is the most time-consuming element. An infrastructure assessment delivers a complete picture of your environment on a cabinet-by-cabinet and a U-by-U basis. Consider:
• What’s in each cabinet?
• What’s in each U?
• How does everything connect to your network?
• What are the dependencies and relationships, technically and operationally, between each application and infrastructure component?
• If you shut down a collaboration server, does email still work? What about your CRM application while your SAN is being migrated?
To help expedite this process, we’ve created this worksheet. It will help you identify key device and application owners so that everyone who needs to be involved is included in the process.
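The dependency questions above can be answered mechanically once the assessment data exists. The following is a minimal sketch, not a finished tool: it assumes you have recorded, by hand or from the worksheet, which item depends on which. Every name in the map is hypothetical. It walks the map to list everything affected when a single component is shut down.

```python
from collections import deque

# Hypothetical dependency map: each key depends directly on the items in its list.
DEPENDS_ON = {
    "email": ["collaboration-server", "san"],
    "crm": ["san", "database"],
    "database": ["san"],
    "collaboration-server": [],
    "san": [],
}


def impacted_by(component, depends_on):
    """Return every item that directly or indirectly depends on `component`."""
    # Invert the map so we can walk from a component to the things that need it.
    dependents = {}
    for item, requirements in depends_on.items():
        for requirement in requirements:
            dependents.setdefault(requirement, set()).add(item)

    # Breadth-first walk over the inverted map.
    impacted, queue = set(), deque([component])
    while queue:
        current = queue.popleft()
        for item in dependents.get(current, ()):
            if item not in impacted:
                impacted.add(item)
                queue.append(item)
    return sorted(impacted)


print(impacted_by("san", DEPENDS_ON))
print(impacted_by("collaboration-server", DEPENDS_ON))
```

With this sample map, shutting down the SAN touches the CRM, the database and email, while shutting down the collaboration server touches only email. The value is in the map itself: the answer is only as complete as the assessment behind it.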
4. BUDGET REQUIREMENTS
Too often, a migration budget is set based on technical costs alone. In order to fully understand the budget necessary for a successful migration, the business costs need to be taken into account. Retraining, desktop IT resets and similar items are often required but typically don’t make it into the overall budget estimate. The moral of the story here is to be sure that each element of the migration is taken into consideration when it’s time to secure funding. Surprises here (i.e., cost overruns) are almost always the result of not asking enough questions or not involving all stakeholders.
5. YOUR 1-5 YEAR STRATEGY
Some organizations plan in 1-year increments, others in 3-year increments and others in 5-year increments. Your planning window doesn’t matter. Be sure that you have that plan in hand and are using it relative to the migration project. You would be amazed how many people never ask, “Will this work for us in year two and three?” It’s so easy to focus on the now that even short-term future requirements get lost in the shuffle. Don’t let that happen to you. There are details underneath this top-level assessment, things such as cable maps, rack elevations, etc. that can come into play, but this is more than enough to get you started. If you’re interested in a more detailed discussion about these additional elements, please contact us.
THE MOST COMMON DISCOVERY MISTAKE MADE?
Lack of a complete infrastructure assessment. This means a cabinet-by-cabinet and a U-by-U assessment of every device and its associated application(s). The best way to avoid this mistake is to extend the assessment exercise into a complete infrastructure map: everything physical and virtual, the network topology, and so on. Get everything in there, because it is significantly better to come out of Discovery with too much information than with too little.
Planning, the second phase of our migration process, is primarily focused on the physical infrastructure. This is when all of the requirements identified during Discovery are brought into reality. There are four key objectives for this phase:
What is going to need to be acquired/built/deployed and managed—where is it going to go and when is it going to get there? Are you consolidating? When will you be out of room in your new cabinet or cage? Our general safety tip is to have about 15% - 20% overhead at the rack level so there is room to expand. This isn’t a trivial step—but it is one that is well within the realm of daily IT and business operations.
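The overhead guideline above is easy to check per rack during design. This is a rough sketch under stated assumptions: overhead is measured here in rack units only (power and cooling headroom would need their own checks), the 15% default is the low end of the range above, and the rack names and sizes are made up for illustration.

```python
def rack_headroom(total_u, used_u, target_overhead=0.15):
    """Return (free fraction, meets target) for one rack, measured in rack units."""
    if total_u <= 0 or not 0 <= used_u <= total_u:
        raise ValueError("used_u must be between 0 and total_u, and total_u must be positive")
    free_fraction = (total_u - used_u) / total_u
    return free_fraction, free_fraction >= target_overhead


# Hypothetical planned racks: name -> (total U, U planned for use on day one).
planned = {"rack-a": (42, 34), "rack-b": (42, 38)}

for name, (total_u, used_u) in planned.items():
    free, ok = rack_headroom(total_u, used_u)
    print(f"{name}: {free:.0%} free, {'OK' if ok else 'below target'}")
```

In this sample, rack-a keeps about 19% free and passes, while rack-b keeps about 10% free and is flagged before any equipment is ordered.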
Once the design is set, all of the stakeholders, business and technology, need to be brought into the process so they can review it. This is a critical step for one reason: it is the point where you confirm that everyone’s objectives and success criteria are met.
Revisions are necessary to account for and address any objectives and success criteria identified in the Review phase.
4. FINAL REVIEW
You can think of this as the ‘repeat’ portion of a ‘lather rinse repeat’ methodology. Final infrastructure review is that one last session to be sure all of the objectives and criteria are addressed and you’re ready to begin development. The old adage, measure twice - cut once? Apply that here.
THE MOST COMMON PLANNING MISTAKE MADE?
Failing to establish clear leadership. More often than not, a data center migration project lacks a clear leader—someone who is responsible for communicating clearly and definitively across all teams at all stages of the migration process. A successful migration can’t have one voice from one department leading the way. They will, by default, be looking out for their best interest at the expense of others. Choose an impartial party with the authority to demand execution and the communication skills to keep everyone on the same page. This isn’t an easy task, but it’s critical to the success of the project. Sadly, it is overlooked in 9/10 migrations.
The third phase of a successful migration process is Development. During this phase:
1. You build the physical infrastructure that will support the applications and business processes identified in Discovery and refined in Planning.
2. After the build in step 1 is complete, you need to review the specifications of each infrastructure element to be sure that there aren’t any last-minute changes. This is another measure twice, cut once step.
3. You finalize the support processes. What, exactly, does this mean? Each and every party associated with providing support for physical or virtual infrastructure elements, as well as their associated applications, will have sign-off on the proper steps necessary to receive support. Should a question arise, everyone will know exactly where to go and how the support processes will be executed. Don’t underestimate the need to have clearly defined processes and lines of communication for support tickets for each element of physical infrastructure and their associated application(s). All too often we hear, “It’s taken care of. You just submit a ticket to the support desk and it’s routed to the right person.” This is most commonly code for, “The ticket was logged and will hopefully be seen by someone who can resolve the issue (or at least someone who knows someone who can resolve it).” Migration is a good time to verify the support processes and escalations are in place, especially when you consider the brand new infrastructure you’ll be running!
4. Finally, the physical infrastructure constructed during the build will be signed off on by the various owners. Think of this as ownership validation, where the owner of each and every server, switch, router, SAN, etc. is giving the thumbs up on their pieces of the puzzle.
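Steps 3 and 4 both come down to confirming that nothing is left without an owner, an escalation path or a sign-off. One way to keep that honest is a simple gap report. The sketch below is illustrative only: the `Component` fields and the sample inventory are assumptions, and a real record would likely live in your asset or ticketing system rather than in a script.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """One infrastructure element and its support details (hypothetical fields)."""
    name: str
    owner: str = ""
    escalation_contact: str = ""
    signed_off: bool = False


def readiness_gaps(components):
    """Return {component name: [missing items]} for components that are not ready."""
    gaps = {}
    for component in components:
        missing = []
        if not component.owner:
            missing.append("owner")
        if not component.escalation_contact:
            missing.append("escalation contact")
        if not component.signed_off:
            missing.append("owner sign-off")
        if missing:
            gaps[component.name] = missing
    return gaps


# Made-up sample inventory.
inventory = [
    Component("core-router", "network team", "network on-call", True),
    Component("san-01", "storage team", "", True),
    Component("crm-app-server"),
]

for name, missing in readiness_gaps(inventory).items():
    print(f"{name}: missing {', '.join(missing)}")
```

An empty report is the goal before leaving Development. Anything listed is a component where a support ticket could go unanswered.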
THE MOST COMMON DEVELOPMENT MISTAKE MADE?
Upgrading parts of the infrastructure stack without recognizing the interdependencies of these components. There is nothing wrong with upgrading key components of your infrastructure during a migration. New network equipment, for instance, is easily introduced during a migration, as are transitions from physical to virtual. However, when a migration includes partial upgrades, it is common that no one prepares for the trickle-down impact of these changes. Be sure that during Discovery and Planning these upgrades are highlighted, along with their interdependencies, for specific testing during the Validation phase.
The fourth phase of a successful data center migration process is Validation. This is literally the checks and balances to be sure everything you planned is happening. Are the network, compute, storage and security and compliance requirements all met? You’re about to begin moving everything to the new data center location or new infrastructure, so this is the final checkpoint to be sure all the details identified in Discovery and Planning made it through Development.
EXECUTE EQUIPMENT COOL DOWN
Before you move on to the actual Migration, use the Validation phase to execute one final safety check: a full cool down period for critical hardware. Servers, storage appliances, routers, firewalls, switches, etc. may all be running just fine at present. However, when you shut them off, allow them to cool down and then power them back up, do they come back online? Before you get to the point of moving hardware, be sure you know whether or not each device will recover from the power down. This sounds silly, but it is really important. Each and every customer we migrate has an issue with at least one critical hardware component not coming back from a cool down, or not operating properly after a cool down. For example, one customer we migrated had a router that didn’t recover from the power down because changes in firmware caused an issue that was previously unknown. The good news? It was identified during the Validation phase, so we were able to bring in a new router and address the issue quickly. This brief test will save you hours of time, countless headaches and thousands of dollars.
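A first-pass check after the cool down can be scripted. The sketch below is one possible approach, not a description of our own tooling: it assumes each device exposes a TCP management port (SSH or HTTPS, for example) and treats a successful connection as "came back online". The device names and addresses are placeholders from the documentation address range. A successful connection does not prove the device is operating properly, so it complements hands-on verification rather than replacing it.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def cool_down_report(devices, check=is_reachable):
    """Return the names of devices that did not answer after being powered back up."""
    return [name for name, (host, port) in devices.items() if not check(host, port)]


# Placeholder devices: name -> (management address, management port).
devices = {"core-router": ("192.0.2.1", 22), "san-01": ("192.0.2.20", 443)}

# Demonstration with a stand-in check (no network needed): pretend only port 22 answers.
print(cool_down_report(devices, check=lambda host, port: port == 22))
```

Calling `cool_down_report(devices)` without the `check` argument uses the real TCP probe. Any device it returns goes on the list for replacement or repair before moving day.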
THE MOST COMMON VALIDATION MISTAKE MADE?
Failure to engage the business. Validation tends to be a very methodical part of the process—one where minutiae are attended to. This typically means the IT, Security and NetEng teams are heads down, hammering through their checklists. Be sure to pick your head up and include the business in the validation process. It’s likely something has changed in the migration process due to an unforeseen technical change or additional requirement, so make sure all stakeholders understand how these changes may directly impact their day-to-day operations. The business validation should be a very quick step—one that will save you countless hours later in the migration process. 30-60 minutes invested here is going to literally save you days as you get into the migration and management phases.
We’re finally here: Migration. All of the hard work is about to pay off. At the Migration phase, we like to call out four discrete areas of focus:
1. The App Migration Plan. Just as it sounds, this is a specific set of steps to be followed in the migration of each application. The dependencies identified in the Discovery phase comprise the most critical elements of this checklist.
2. The Data Migration Plan. Similarly, the most critical elements of the Data Migration Plan will have come from the dependencies identified in the Discovery phase.
3. The Test Migration. This is exactly that: a test move of applications and data, as well as network configurations. This test is going to show you whether or not everything is ready for prime time, and it’s going to give you a very realistic idea of how long the actual migration will take. A great way to approach a Test Migration is to move backup instead of production infrastructure. It is very common to test one or two production applications as part of this process and the remainder of the applications tested are backups. You should be able to mix and match test migration elements based on the criticality of your applications. We use this application prioritization worksheet to help determine a company-wide understanding of each application.
4. Finally, the Migration. Once the test migration has been performed and everything checks out, it’s time to move!
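Because both migration plans are driven by the dependencies found in Discovery, a dependency map can also suggest a move order. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) to produce an order in which each item comes after the things it depends on. The map is hypothetical, and "dependencies move first" is an assumption: some migrations move tightly coupled items together as a group instead.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical map: each application or service -> the things it depends on.
DEPENDS_ON = {
    "crm": {"database", "san"},
    "email": {"collaboration-server", "san"},
    "database": {"san"},
    "collaboration-server": set(),
    "san": set(),
}


def migration_order(depends_on):
    """Return an order in which every item comes after the things it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


try:
    for step, item in enumerate(migration_order(DEPENDS_ON), start=1):
        print(f"{step}. {item}")
except CycleError as error:
    # A cycle means two or more items depend on each other and must move together.
    print(f"Circular dependency found: {error.args[1]}")
```

Several valid orders usually exist, so treat the output as a starting checklist to adjust by application criticality, not as the only correct sequence.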
THE MOST COMMON MIGRATION MISTAKE MADE?
Failing to set realistic expectations for how long the actual migration will take. If, during the Test Migration, you were noting the actual time it took to move applications and data, you shouldn’t have any surprises when it comes time for the final migration. However, all too often (and by too often I mean almost every time) there have been completely unrealistic expectations set around how long it will take to move any one application and its associated data.
A Migration timeline is a math exercise. Use the Test Migration phase to get your values and then adjust accordingly, because a production migration will be slightly slower than your test migration as even more care and attention to detail are needed.
Let’s look at an example. It will take roughly 11 hours to move 5 TB of data over a 1-Gig link. This is ‘best case’ because it assumes you’re getting full throughput on the network and that your reads and writes are able to occur at the network speed. So, even if you have the old and new infrastructure in the same data center, on the same LAN, best case you’re looking at roughly 11 hours to move 5 TB of data on 1 Gig links. When you push this out to 100 TB of data, even at a 10 Gig link, you’re looking at roughly 22 hours to move that data.
In each and every data center migration we perform, whether it’s physical or virtual, this math becomes the a-ha moment for our customers as it inevitably takes longer than anticipated. Be sure to assign realistic data transfer times to your migration windows. Use the Test Migration process to validate that the applications and data can be moved, as well as the actual time required for the move to occur. Minimizing surprises here will go a long way toward a ‘smooth’ and ‘successful’ migration.
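The transfer-time math above is simple enough to put in a few lines. The sketch below assumes decimal units (1 TB = 10^12 bytes, a 1 Gig link = 10^9 bits per second). Under that assumption the best case works out to roughly 11 hours for 5 TB on a 1 Gig link and roughly 22 hours for 100 TB on a 10 Gig link, in the same range as the figures quoted above; small differences come down to unit conventions. The `efficiency` parameter is our own addition for modeling less-than-ideal throughput, and the 0.7 used in the example is purely illustrative. Your Test Migration should supply the real value.

```python
def transfer_hours(data_tb, link_gbps, efficiency=1.0):
    """Hours needed to move `data_tb` terabytes over a `link_gbps` link.

    Assumes decimal units: 1 TB = 10**12 bytes, 1 Gbps = 10**9 bits per second.
    `efficiency` is the fraction of line rate actually achieved (1.0 = best case).
    """
    if data_tb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("data_tb must be >= 0, link_gbps > 0 and 0 < efficiency <= 1")
    bits_to_move = data_tb * 1e12 * 8
    seconds = bits_to_move / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


print(f"5 TB over 1 Gig, best case:       {transfer_hours(5, 1):.1f} hours")
print(f"100 TB over 10 Gig, best case:    {transfer_hours(100, 10):.1f} hours")
print(f"5 TB over 1 Gig at 70% of rate:   {transfer_hours(5, 1, efficiency=0.7):.1f} hours")
```

Replacing the best-case efficiency with the rate measured during the Test Migration turns this from an optimistic estimate into a defensible migration window.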
Most data center migration plans stop after the Migration. Everyone realizes the migration was a success and it’s time to focus on other projects. This is a major mistake. Once the migration is complete, there must be a clear post-migration transition back to day-to-day operations, including support. Additionally, it’s going to be well worth your time to have proactive monitoring and response in place to watch over the migrated environment and be sure everything is operating as planned.
THE MOST COMMON MANAGEMENT MISTAKE MADE?
“Set it and forget it” mentality. Everyone is so excited to be done, they immediately want to wash their hands of the effort. This doesn’t work. Proper attention to the transition and ongoing management are key to the project’s success. When we’re working on a migration for a customer, we allocate 24/7 resources for proactive support and monitoring for 72-96 hours after a migration. That’s a bit excessive and you can probably get away with 48-72 hours, comfortably. You simply want to be sure you have hands, eyes and ears on everything as it continues to burn in. One final note here—be sure to have the appropriate business resources on call, so they can help address any issues requiring their attention. They’ll want to be completely done with this “IT project” so be sure you keep them engaged in this step.
SCALE
The final phase of a successful data center migration is Scale. Scale focuses on where you’re going now that your new data center and your new infrastructure are in place. In the immortal words of Yogi Berra, “If you don’t know where you are going, you’ll end up someplace else.” These are words to live by.
THE MOST COMMON SCALE MISTAKE MADE?
The “We’re done! That’s it! It’s time to move on.” mentality. It isn’t difficult to set an annual plan, maintain quarterly reviews and develop a simple process for ad-hoc requirements associated with your infrastructure. A data center migration is a perfect time to reset these regular review expectations and get everyone back on the same page. You’ve just invested significant time, energy and money into executing a difficult but critical process. Don’t let the energy and attention to detail drain from the team now that it’s over. You have a brand new canvas from which to work … and the business and IT requirements are not going to stop simply because you’ve migrated to new infrastructure.