“The data center is the cloud,” said 40-year IT veteran Dileep Bhandarkar, Distinguished Engineer at Microsoft, during the closing panel session at this year’s Server Design Summit, which was held November 27-28, 2012 in Santa Clara, CA. Whether that will become as famous a tagline as Sun Micro’s “The network is the computer” remains to be seen.
“Disruptive innovation always happens from below with the smart phone as a great example,” Mr. Bhandarkar said. “Cloud servers are different from enterprise servers. Workload optimized servers are preferable to one size fits all,” he added.
What was clear from this very informative conference is that cloud-resident data centers are, and will be, quite different from data centers that reside on customer premises. The local network that interconnects switches and servers in such cloud data centers will also be different, with different protocols employed for switching. The data center in the cloud must also be accessible to mobile users who launch corporate apps from a smart phone or tablet, either in a BYOD (Bring Your Own Device) setting on the corporate WiFi network or in the field over a wireless carrier’s 3G/4G network.
Conference Highlights
Here are some of the key messages and takeaways from the very informative, provocative and stimulating Server Design Summit:
- Public cloud computer server spend will grow from $1.54B in 2010 to $3.56B in 2015 (5-year CAGR of 18.3%) Source: IDC
- Private cloud computer server spend will grow from $2.55 B in 2010 to $5.88 B in 2015 (5-year CAGR of 18.1%) Source: IDC
- Data Center Priorities are changing, as per the figure below from IBM.
- Server Density is increasing: System on a Chip (SOC); less cabling, fewer components; Server Processors as “Neighbors” to local storage, more Cache Memory, use of Flash to complement HDD storage, Low-Power Processors (e.g., ARM, Intel Atom).
- Cloud-resident data centers might use “wimpy” servers that have less processing power than conventional enterprise premises-based compute servers, but that consume significantly less power (energy). They must be highly modular, scalable for build-outs, have very good I/O handling, and have provisions for failure recovery.
- A focus on availability and reliability is of paramount importance for a cloud-resident data center, according to Matt Ferrari, CTO of Hosting.com, who shared a lot of eye-opening information with the audience (see the next few bullet points).
- The current state of disaster recovery (DR) is dismal: 80% of U.S. companies lack a DR plan; 50% of SMBs worldwide have no failure recovery plan; 1 in 4 disaster recovery tests fail; and most companies that experience prolonged data center downtime go out of business.
- Surprisingly, most “disasters” that take down a data center are not caused by catastrophic events (e.g. hurricane, fire, earthquake). Instead, 63% of disasters are “clinical.” Causes include human error, equipment and system failures, malicious activity and social engineering. In those cases, customers and auditors are impatient and intolerant of being out of service.
- Cloud DR must be totally automated (including testing), make use of shared infrastructure for recovery, provide holistic application protection, improve speed and reduce errors.
- Cloud servers are a start-over environment with cloud-specific requirements. The segment is growing from $5.2B to $9.4B in five years (2001-2015), according to IDC.
- New requirements from Cloud Service Providers (CSPs) keep emerging. It’s important to monitor their latest deployments for technical changes and evolving server design. (IDC did not say who these CSPs are or whether they might be telcos.)
- IDC said, “understand what the ODMs are trying to do. If you’re a systems vendor, you may want to take the same approach. If you’re a large enterprise, you have a choice: build, buy, or collaborate.” [An Original Design Manufacturer (ODM) owns and/or designs in-house the products that are branded by another firm that resells those products.]
- In the last 12 years, there’s been a 64X improvement in CPU price/performance, but only a 10X improvement in the speed of Ethernet interfaces on servers/switches (from 1GE to 10GE). The latter is still not that widely deployed on servers, as per the illustration below.
- Server networking trade-offs include: LAN on Motherboard (LOM) vs. a separate card for 1 or 10GE, latency vs. realized throughput, cabling choice, and power dissipation of the total system.
- A “leaf-spine” cluster configuration is likely to be the most popular topology for large cloud data centers, according to Andy Bechtolsheim, founder of Arista Networks (and co-founder of Sun Micro).
- There are several alternatives for building data center switch fabrics, with the “winner” being ECMP and L3-BGP. According to Mr. Bechtolsheim, that’s what the largest cloud data centers use today.
- Andy said there are two ways to build large switch fabrics: Virtual Output Queueing (VOQ) or multi-stage fabrics (a network of switches, if you will). VOQ is superior in every way, but at 2X to 3X the cost per port and power consumption.
- There’s an urgent need for congestion management in cloud data center switches to avoid dropped packets.
- 100GE won’t be a commercial success unless the confusion over the PHY is settled. There are too many alternatives now.
- Current 100GE PHYs can only achieve distances of up to 100 m in most cases (one PHY technology can reach 300 m).
- CMOS Silicon Photonics will be “coming soon” and provide 100G PHYs with a 1 km or greater reach, according to Mr. Bechtolsheim. At that time, we can expect the 100GE market to gain traction within the data center, most likely for interconnecting data center switches (servers will use 1, nX, or 10GE links to switches).
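As a quick sanity check on the growth figures quoted in the highlights above, the short Python sketch below recomputes the compound annual growth rate (CAGR) from the IDC start and end values, and converts the 12-year CPU and Ethernet improvement factors into per-year rates. The function name `cagr` and the variable names are our own, chosen for illustration; only the input numbers come from the conference material.

```python
def cagr(start, end, years):
    """Compound annual growth rate as a fraction (0.18 means 18% per year)."""
    return (end / start) ** (1.0 / years) - 1.0


# IDC server spend figures quoted above, in billions of US dollars (2010 -> 2015).
public_cagr = cagr(1.54, 3.56, 5)
private_cagr = cagr(2.55, 5.88, 5)

# The 12-year improvement factors quoted above, expressed as yearly rates.
# Treating a "64X" or "10X" improvement as growth from 1 to 64 (or 1 to 10)
# is our framing for comparison purposes, not something IDC or the speakers stated.
cpu_yearly = cagr(1, 64, 12)
ethernet_yearly = cagr(1, 10, 12)

print(f"Public cloud server spend CAGR:  {public_cagr:.1%}")
print(f"Private cloud server spend CAGR: {private_cagr:.1%}")
print(f"CPU price/performance per year:  {cpu_yearly:.1%}")
print(f"Ethernet interface speed per year: {ethernet_yearly:.1%}")
```

The recomputed CAGRs land within about a tenth of a percentage point of the 18.3% and 18.1% that IDC quoted, a gap that is presumably due to rounding in the dollar figures. On a per-year basis, the CPU improvement works out to roughly 41% versus roughly 21% for server Ethernet interfaces, which puts a number on how far networking has lagged compute.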
VCs’ Thoughts and Hot Areas within IT
In a VC panel session on November 28th, big technology and market shifts were seen as creating opportunities for IT startups. “It’s one of the most exciting periods to be an entrepreneur in the IT enterprise sector,” according to Chris Rust, a general partner at U.S. Venture Partners (Menlo Park, CA).
“Traditional functional partitions of servers, WANs, SANs and load balancers are being reinvented in real-time with Internet scale data centers, virtualization [and other trends spawning] a chaos that has not happened in a long time, creating unsolved problems and opportunities,” he said.
Former Cisco/Stratacom veteran Kambiz Hooshmand, the founder of incubator Archimedes Labs LLC (Palo Alto, CA), was very optimistic on the U.S. economy and the opportunities for start-ups in the mobile cloud space (haven’t we heard that one before?).
“I think in 2014-15 things will look really bright for the U.S. economy. If you are an entrepreneur that is your time horizon for an exit,” he said. More profoundly, Mr. Hooshmand said several times that because “every layer of the (protocol) stack has to be re-written for the emerging mobile/cloud infrastructure taking shape now, there are and will be more opportunities than ever.”
AW Comment: We don’t agree that the mobile cloud will lead to a bonanza of successful start-ups unless key mobile infrastructure issues are sorted out. Those include: guaranteed performance with enforced SLAs, mobile security (especially to access cloud resident servers and apps), and 3G/4G roaming at a reasonable cost with no degradation in service quality. We haven’t seen any progress in any of these areas over the last two years. Until they are all resolved, we can’t see how an enterprise would buy into the mobile cloud.
Please note that the mobile cloud for enterprise customers has much more demanding requirements than a mobile cloud for consumers, which might not need all the above capabilities. Indeed, we think the mobile consumer cloud is here now (Amazon and Apple are great examples).