Early Saturday morning we will be making a change to our switch configuration at Rackspace. Currently we have four racks of servers at that data center – 62 machines and counting. Our uplinks connect into a firewall/load-balancer on rack #1 and another on rack #2, both of which are then connected to our backend private network via interconnected switches on each of the four racks.
Racks #1 and #2 each have a 24-port gigabit switch (Cisco 2970), and racks #3 and #4 each have a 24-port 10/100 switch with 2 gigabit uplink ports (Cisco 2950). Racks #2, #3 and #4 each connect to rack #1’s 2970 via their gigabit uplinks.
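To make the current layout concrete, here is a rough Python sketch of it. The switch models and port counts come from the description above; the `Switch` class, the `uplinks` mapping, and the helper functions are names made up for illustration, and the sketch assumes each rack uses a single gigabit link into rack #1's switch.

```python
from dataclasses import dataclass


@dataclass
class Switch:
    rack: int
    model: str
    ports: int      # number of access ports on the switch
    gigabit: bool   # True if the access ports are gigabit


# Current rack switches, as described in the post.
switches = {
    1: Switch(1, "Cisco 2970", 24, True),
    2: Switch(2, "Cisco 2970", 24, True),
    3: Switch(3, "Cisco 2950", 24, False),
    4: Switch(4, "Cisco 2950", 24, False),
}

# rack -> rack whose switch it uplinks into.
# Assumption: one link per rack (the post does not say how many are used).
uplinks = {2: 1, 3: 1, 4: 1}


def ports_used_by_uplinks(hub_rack: int) -> int:
    """Count ports on hub_rack's switch taken by other racks' uplinks."""
    return sum(1 for target in uplinks.values() if target == hub_rack)


def ports_left_for_hosts(hub_rack: int) -> int:
    """Ports on hub_rack's switch not consumed by rack-to-rack uplinks."""
    return switches[hub_rack].ports - ports_used_by_uplinks(hub_rack)


if __name__ == "__main__":
    print(ports_used_by_uplinks(1), ports_left_for_hosts(1))
```

Under that one-link-per-rack assumption, three of the 24 ports on rack #1's 2970 are already spent on rack interconnects rather than on servers or the firewall/load-balancer.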
Now here is the problem we are solving this weekend… Every time we add a new rack of servers we have to pull one server out of either rack #1 or rack #2 and move it to the new rack, so that we can free up a gigabit port on the Cisco 2970 for the new rack’s switch to plug into. That’s just a pain. And at the rate we’re growing, rack #1 and rack #2 will eventually become completely empty 🙂
So, we are moving the gigabit switches to a layer above all of the racks, and each rack will now plug into these external switches – creating a pyramid layout that will scale to 48 racks (and beyond with more gigabit switches). After the maintenance, all of the rack switches will be 10/100 and the gigabit switches will be dedicated strictly to rack aggregation and hosting the firewall/load-balancer ports.
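As a back-of-the-envelope check on that number, the sketch below works out how many racks the aggregation layer can take. It assumes the 48 figure corresponds to two 24-port gigabit switches with one link per rack; `max_racks` and its parameters are hypothetical names, and `reserved_ports` is a placeholder for whatever ends up going to the firewall/load-balancers and switch interconnects, which the post does not quantify.

```python
def max_racks(agg_switches: int = 2,
              ports_per_switch: int = 24,
              reserved_ports: int = 0,
              links_per_rack: int = 1) -> int:
    """Estimate how many racks can plug into the aggregation layer.

    reserved_ports is an assumed count of ports set aside for things other
    than rack uplinks (firewall/load-balancer, switch interconnects).
    """
    if links_per_rack < 1:
        raise ValueError("links_per_rack must be at least 1")
    usable = agg_switches * ports_per_switch - reserved_ports
    if usable < 0:
        raise ValueError("more ports reserved than exist")
    return usable // links_per_rack


if __name__ == "__main__":
    print(max_racks())                   # two 24-port switches, nothing reserved
    print(max_racks(reserved_ports=4))   # illustrative reservation only
    print(max_racks(agg_switches=3))     # adding a third gigabit switch
```

The point of the design is that growth now costs a port on a dedicated aggregation switch instead of a server slot in rack #1 or #2, and adding another gigabit switch raises the ceiling.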
We are planning on just a few minutes of downtime for this upgrade, plus some latency while we verify connectivity and fail over traffic from the secondary to the primary firewall. This will happen between 1:00am and 5:00am on Saturday, January 7, as reported on our system status RSS feed.