Last May, Outbrain migrated one of its data centers from downtown New York to Secaucus, N.J. We found a strong connection between these numbers (1000, 48, 9, 0) and we lived to tell about it.
Moving one of your main data centers, including roughly 1,000 physical servers, is no easy task. It requires extensive planning and preparation, and can directly affect the business if not done right. In order to reduce the impact on our team, we decided to do it in an accelerated timeframe of just under 5 weeks. This is the story of Outbrain’s data center migration from 111 8th, NY to Secaucus, N.J., or in other words: 1000 servers, 48 hours, 9 miles and 0 downtime.
Outbrain is a content discovery platform that helps readers find the most interesting, relevant and trusted content wherever they are. Through Outbrain’s content recommendations across a network of premium publishers, including CNN, ESPN, Le Monde, Fox News, The Guardian, Slate, The Telegraph, New York Post, Times of India and Sky News, brands, publishers and marketers amplify their audience engagement by driving traffic to their content – on their site and around the web. Founded in 2006 and headquartered in New York, the company has 15 offices around the world including Israel, the U.K., Australia, Japan, Singapore and multiple European locations.
Outbrain operates from three data centers in the U.S., one of which is Internap’s Secaucus, N.J. facility, serving more than 72,000 links to content every second.
On April 8th, we toured Internap’s newly-built Secaucus data center and saw how it was coming together. The new facility was impressive and very spacious, so we decided to go for it. The chosen date was May 23rd, Memorial Day weekend, in order to provide us with the advantage of a long weekend with lower traffic on our service.
Once the date was set, the countdown began.
We broke the project into 2 phases:
1. Pre-move, which included applicative preparations and tests, and logistics.
2. The move, which included the physical transfer of equipment, and application setup and sync in the new location.
Applicative preparations and tests
At Outbrain, we have fully adopted Continuous Integration practices, so on any given day we have about 100 different production deployments. It was crucial for us to maintain this ability during the migration, in addition to the overall health of the system once we shut down the servers at the existing 111 8th location. We conducted numerous tests in order to verify that all the redundancy measures we put in place, and what we refer to as our “immune system,” would still be fully functional even after all the services in 111 8th were unavailable to our other functioning data centers. Those tests included scenarios such as controlled disconnection of the network to 111 8th, which simulated complete unavailability.
In addition, we analyzed specific high-risk components and looked for different ways to set those up in the new Secaucus data center in advance.
The iterative process of testing and analyzing required a great deal of collaboration between the different engineering groups within Outbrain (operations, developers, etc.), which was key to the success of the project.
As the famous phrase goes, “God is in the details,” and this type of project included many, many details.
The details started with finalizing the contractual agreement for the new space and making sure it would be ready within the extremely aggressive timeline we had set: power, AC, cage build-out, planning the new space layout, and taking the opportunity to prepare for projected growth. We scheduled frequent meetings with the Internap team, as we all realized that communication is key and every day counts.
Our preparation also involved soliciting bids from different vendors to perform the heavy lifting of actually moving the 1000 servers, labeling every component (with 3 labels each, in case one falls off – redundancy), planning the server allocation into the moving trucks (you do not want all of your redundant servers on the same truck), insurance aspects, booking elevator time and docking space, and many more small details that at the end of the day made a big difference.
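The truck allocation rule (never put all redundant servers on the same truck) can be illustrated with a short sketch. This is not the planning tool used for the move; it is a minimal example that assumes each server is tagged with a redundancy group, and the function name, hostnames and group names are all hypothetical.

```python
from collections import defaultdict


def allocate_to_trucks(servers, num_trucks):
    """Spread the members of each redundancy group across trucks.

    servers: iterable of (hostname, group) pairs (hypothetical inventory format).
    Returns a dict mapping truck index -> list of hostnames.
    """
    if num_trucks < 1:
        raise ValueError("num_trucks must be at least 1")

    # Collect hostnames by redundancy group.
    groups = defaultdict(list)
    for host, group in servers:
        groups[group].append(host)

    trucks = {i: [] for i in range(num_trucks)}
    next_truck = 0
    # Deal hosts out round-robin, one group at a time. Members of a group
    # land on consecutive trucks, so a group no larger than the number of
    # trucks never shares a truck, and the overall load stays balanced.
    for group in sorted(groups):
        for host in sorted(groups[group]):
            trucks[next_truck].append(host)
            next_truck = (next_truck + 1) % num_trucks
    return trucks


# Hypothetical inventory: two database replicas and three web servers.
inventory = [
    ("db1", "db"), ("db2", "db"),
    ("web1", "web"), ("web2", "web"), ("web3", "web"),
]
plan = allocate_to_trucks(inventory, num_trucks=2)
```

A group with more members than there are trucks will necessarily share a truck; in that case the round-robin still spreads the group as evenly as possible.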
The time had come, and on Friday, May 23rd at 4:00pm, we hit the button. An automated shutdown script, which we prepared in advance, managed the shutdown of all services in the desired sequence. By that time, the movers were on site, the trucks were parked in the loading docks, and the cage became quiet – no more static noise, and no more hot aisle to stand in.
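A shutdown "in the desired sequence" is essentially a dependency-ordering problem. The sketch below is not the actual script; it is a minimal illustration that assumes a map from each service to the services it depends on (all service names here are made up), and uses Python's standard graphlib module (Python 3.9+) to compute an order in which every service stops before the services it relies on.

```python
from graphlib import TopologicalSorter


def shutdown_order(depends_on):
    """Return services ordered so that dependents stop before their dependencies.

    depends_on: dict mapping a service name to the set of services it needs
    (a hypothetical format, not taken from the original script).
    """
    # static_order() yields dependencies first, which is a valid startup order.
    startup = list(TopologicalSorter(depends_on).static_order())
    # Shutting down is the mirror image: stop the dependents first.
    return list(reversed(startup))


# Hypothetical services: the web tier needs the API, which needs the database.
services = {
    "web": {"api"},
    "api": {"db"},
    "db": set(),
}
for name in shutdown_order(services):
    print(f"stopping {name}")
```

The un-reversed order from the same graph gives a valid startup sequence, and TopologicalSorter raises a CycleError if the declared dependencies are circular.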
We had split into 2 teams; the first team drove with the first loaded truck to the new Secaucus data center, where the floor was already pre-labeled with the new location of each rack; the second team remained in 111 8th and continued to work on loading the next truck. Both teams worked in parallel to reduce the duration of the physical move.
All in all, it took 5 trucks to complete the move, and after 29 hours, we had all the equipment relocated, racked and cabled in the new site, and we were ready to start our final and potentially most daunting phase – starting up the equipment and services.
The startup process was also done in a controlled manner, to ensure the correct startup sequence. We took advantage of the fact that Outbrain is a global company with offices in Tel-Aviv, so when it was nighttime in NY, Tel-Aviv was in the middle of its business day. We operated a full task force in Tel-Aviv to help us make sure that the services were coming up correctly and that the syncing processes were working well.
As a result of careful planning, advance testing, and more than anything, the commitment of the Outbrain and Internap teams, we began serving real-time content recommendations from our new NJ data center within 48 hours, with 0 downtime or impact to our customers. We still had the opportunity to enjoy a nice trip to the NJ coast on Memorial Day weekend (and some shopping).