Tuesday, October 21, 2008

Gartner on Green Data Center Recommendations

Gartner Research just issued a very telling release on taking a holistic view of energy-efficient data centers, rather than a narrow, point-technology view. Gartner compared the data center to a "living organism" that needs to be treated as a dynamic mechanism. (BTW, I owe a head nod to Dave O's GreenM3 blog for coining the term "the living data center.")

Said Rakesh Kumar, a research vice president at Gartner,
“If ‘greening’ the data centre is the goal, power efficiency is the starting point but not sufficient on its own... ‘Green’ requires an end-to-end, integrated view of the data centre, including the building, energy efficiency, waste management, asset management, capacity management, technology architecture, support services, energy sources and operations.”
“Data centre managers need to think differently about their data centres. Tomorrow’s data centre is moving from being static to becoming a living organism, where modelling and measuring tools will become one of the major elements of its management,” said Mr Kumar. “It will be dynamic and address a variety of technical, financial and environmental demands, and modular to respond quickly to demands for floor space. In addition, it will need to have some degree of flexibility, to run workloads where energy is cheapest and above all be highly-available, with 99.999 per cent availability.”
I like this analysis because it implies a dynamic "utility computing" style data center where workloads can be moved, servers can be repurposed, and capacity is always matched to demand. This is the ideal approach to ensuring constant efficiency.

The release also included six recommendations. Here's the one I like the most:
6. Manage the server efficiencies. Move away from the ‘always on’ mentality and look at powering equipment down
To me, it sounds like technologies like Active Power management are finally getting traction, and that power management is being validated -- especially in environments with highly cyclical workloads. (It was most recently endorsed in a 451 Group report, as well as by a host of vendors.)

Especially with the economy in a tailspin and margins tightening, look for more ideas for increasing the dollar-efficiency of data center assets.

Saturday, October 18, 2008

IT Analysts opening their Kimono

There was a time when IT industry analysts would only provide information, opinion or data for a price. But it seems that in today's Web 2.0 world, they are exposing more of their thoughts in the form of blogs and other "free" information. I suspect this is happening due to downward pressure on price (full subscriptions to analysts and reports can run tens of thousands of dollars a year), plus the realization that they need to "market" their expertise and insights to a broader audience.

Here are some of my favorite complimentary, data center-related analyst resources you can subscribe to:

Forrester: Great set of analysts and topics here. Check out their entire Blog Listings page to find the right IT industry slice for your taste.

Gartner: Gartner has a "blog network" page where it appears they've asked most of their analysts to do individual blogs, many of which are on various IT topics and technologies. And just so you don't think there isn't any overlap between analyst coverage, they also have a *very* complete blog and video covering Cloud Computing and Cisco's possible intentions, too.

IDC: I just discovered the "IDC Exchange" (which I wrote about last week). They recently did a *very* nice multiple-installment piece on cloud computing you have to check out. They've also recently completed an analysis piece on Cisco and their possible cloud computing intentions.

Redmonk: I've known James Governor since my days at Sun. He runs a multi-topic blog called Monkchips, which is insightful with a tinge of wry wit from across the Pond (it's also #3 on the list of Top Analyst Blogs). Michael Cote, another Redmonker, also has a quality blog (People over Process) on IT operations issues and more. (BTW, Michael's blog is #8 on the list.)

Saugatuck Technology: I've been following their reports on SOA and related technologies, but they've been branching out. While not a blog, they email out a very nice complimentary summary of each of their extended reports in the form of "Research Alerts."

Thursday, October 16, 2008

Awesome Blog/Report on Cloud Computing by IDC

Those quant guys at IDC have been at it again. This time, they've posted a really fine report/overview on cloud computing on their "IDC Exchange" Blog page, authored by Frank Gens. It was initially posted in September, but it looks like they've been adding report bits (and great graphics) to it for a while.
Over the coming weeks, we’ll roll out a number of posts on cloud services and cloud computing. While these posts can be read standalone, they can also be viewed as parts of a single, coherent IDC overview of this emerging model and market opportunity. We’ll use this post to create a “virtual table of contents”, adding links to these cloud-related posts as they’re published, allowing you to see how different elements of our cloud outlook fit together, and to easily navigate among them.
Here's the Table of Contents:
BTW, if you want a chuckle, click on the "listen now" button to hear a machine-generated voice read the pages for you. Listen closely and you'll notice that the computer never takes a breath. :)

Tuesday, October 14, 2008

Postcards from the Cloud Summit Executive conference

I'm just now getting back from a full day in Mountain View at the Cloud Summit Executive conference. And if there were themes that summarized the day, they would be "it's all about the business impacts" and "integration of services will be critical."

The conference was sponsored by TechWeb, and hosted/moderated by the experienced M.R. Rangaswami, co-founder of Sand Hill Group. The crowd was around 300 folks, just small enough that you could network w/interesting people during the generous lunch/coffee breaks -- nice schedule design.
And, while some of the vendor presentations were definitely commercials, much of the content in the general sessions was really worth the price.

The day opened with Tom Hogan, SVP from HP. Although the talk was (of course) vendor-centric, he did a really nice job of summing up the challenges IT faces (nearly 85% of budgets going toward "keeping the lights on"), while also identifying the business opportunities that the "cloud" will enable. Most notable were the opportunities it spawns for smaller businesses, including enabling certain departmental-scale projects in larger orgs. The cloud, according to Tom, is just another channel for deploying business services -- not a panacea. Use it as part of your portfolio mix.

Next was a panel moderated by Bruce Richardson of AMR Research on "selling the cloud to Wall Street and Main Street." It included Bryan McGrath from Credit Suisse, Robin Vasan from the Mayfield Fund, and Jeff Koser, business author. Again, the tone was decidedly business-focused, with little discussion of technology (in contrast to SDForum a few weeks earlier). There were some great tidbits within the discussion on how to design a resilient business/model based on using cloud infrastructure -- plus entrepreneurial tips anyone should use (focus on how to attract customers; maintain a super-low cost-of-sales and a very scalable sales channel; keep the price of the product under $50k; ensure a "sticky" product with recurring revenue; etc.)

Later in the afternoon was a very thoughtful presentation/discussion from Vishal Sikka, CTO from SAP. His opening slide:
Where we are: Power, Infrastructure, Operations
Where we want to be: Integration, Integrity, Elasticity
First, he said, businesses are missing "integration," especially between critical applications; cloud computing could make this worse -- though there are initiatives to improve on it. Problem is, they won't happen overnight. On "integrity," he pointed out that data integrity today is in fact fragmented -- and again, the cloud could make this worse before it makes things better. And finally -- the topic near to my heart -- "elasticity." Vishal said emphatically that elasticity (of compute capacity) HAD to be delivered for all apps of all types, and for DBs as well. And he cautioned: infrastructure will be permanently heterogeneous. Plan for it.

The other Achilles heel for the cloud, he said, was compliance. It's all about transparency and control (distinct from security). Although companies have been outsourcing for years, the "cloud" still needs ways to provide logs, tracking, compliance tools, etc. [sounds to me like a business opportunity...]

Later in the afternoon (after a good networking lunch, complete with Red Bull on ice!) was what I thought was the best panel of all: "Understanding enterprise requirements for the cloud". It was moderated by David Berlind at TechWeb, with Art Wittmann of Information Week -- and featured two diametrically-opposed perspectives on IT: Carolyn Lawson, CIO from the California Public Utilities Commission, and Anthony Hill, CIO from Golden Gate University.

First, Art covered the high points of a recent InformationWeek report on cloud computing: 62% of respondents still had no interest in (or not enough info to consider) cloud computing at all. Of those considering clouds, their "likes" included meeting user demand and scale, and avoiding huge capital outlays. But their "fears" were expected: security, control, performance, support, and vendor lock-in.

But then the really great part of the panel began. Carolyn Lawson of the CPUC ran a government IT operation. Her data was highly sensitive; new capital and employees were hard to come by (literally requiring state approval). She faced stiff legal issues around geography (as it related to where data lives), data liability, and data security. And she literally said that "going to a cloud architecture would be like stepping off of a cliff," given her current constraints.

On the other hand, Anthony Hill of Golden Gate University had a completely different set of constraints and drivers. He's outsourced nearly every application the University uses -- to nearly a dozen different SaaS providers -- and keeps incremental user costs nearly at zero. He has only a small staff, and has avoided huge capital and operational budgets, while supporting the strategic needs of the business to provide an "online university" environment. To be sure, he has challenges too: very high vendor risk (what if they go out of business? what happens to my data?); very high switching costs (how do I migrate my data?); and, the fact that today, he's doing ALL of the inter-application integration himself.

But the big take-away from this interaction was clear: for some businesses, the "cloud" is a godsend -- while for others, it will make almost no inroads in the foreseeable future. Conclusion: look at the business needs first, before assuming that technology solves all.

A final note: conferences like this are invaluable for their networking opportunities, and TechWeb had a good mix of content (but please try to reduce the "commercials" in the future), small size, break-outs and long breaks. The audience was qualified, too; lots of CEO and "office of the CTO" titles on badges, etc. My buddy and Yoda-of-the-Blog James Urquhart was there too, as was a generous quantity of VCs who traveled a few miles from Sand Hill Road.

Monday, October 13, 2008

Cloud Computing forever changes consolidation and capacity management

This is an intriguing topic - the relationship between the need to forecast compute capacity (part art and part science today), and the "elasticity" guaranteed by what we're calling "the cloud."

So last week, when Michael Coté (an analyst with RedMonk) wrote about "How cloud computing will change capacity management," I thought it would be a good idea to expand on his observations and dissect the issues and trends -- including my prediction that the value of existing capacity management tools will be overtaken by utility computing technologies.

First, terms: When I talk about the "cloud", I'm usually talking about Infrastructure-as-a-Service (a la Amazon EC2) rather than platform-as-a-service (e.g. Google app engine) or Software-as-a-Service. To me, IaaS represents that "raw" compute capacity on which we could provision any arbitrary compute service, grid, etc. (It's also what I consider the underlying architecture that's been called Utility Computing).

Michael was clear to define two other terms, Capacity Management, and Capacity Planning. Capacity management is the balancing of compute resources against demand (usually with demand data you have), while capacity planning is trying to estimate future required capacity (usually without the demand data you'd like).

Another related issue that has to be addressed is consolidation planning -- essentially "reverse" capacity planning: estimating how to minimize overall in-use capacity while maintaining service and availability levels for virtualized applications.

So how does use of "cloud" (IaaS) impact capacity management/planning, as well as consolidation planning? In my estimation, there are two broad views on this:
  1. If you buy into using the "public" cloud, then all the work you've been doing to estimate capacity and to plan for consolidation doesn't really matter anymore. That's because your capacity has been outsourced to a provider who bills you on an as-used basis. The IaaS "cloud" is elastic, and expands/contracts in relation to demand.
  2. If you instead build an "internal cloud", or essentially architect a utility computing IaaS infrastructure, the story is a little different. You're taking non-infinite resources (your data center) and applying them in a more dynamic fashion. Nonetheless, the way you've been doing capacity management/planning, and even consolidation planning, will change forever.
I'll take #2, above, as an example, because its operation is more transparent. You start with your existing infrastructure (machines, network, storage) and use policy-based provisioning/controls to continuously adjust how it is applied. This approach yields a number of nice properties:
  • Efficiency: You only use the capacity (physical and/or virtual) you need, and only when you need it
  • Continuous consolidation: A corollary to above is that the policy engine can "continuously consolidate" virtualized apps (e.g. it can continually compute and re-adjust consolidated applications for "best-fit" against working resources)
  • Global view: Global available capacity (and global in-use capacity) is always known
  • Prioritization: You can apply policy to prioritize capacity use (e.g. e-commerce apps get priority during the holidays, financial apps get priority at quarter-close)
  • Safety net: You can apply policy to limit specific capacity use (e.g. you're introducing a new application, and you don't know what initial demand will be)
  • Resource use: It enables solutions for "resource contention" (borrowing from Peter to pay Paul); higher-priority applications can temporarily borrow capacity from lower-priority apps.
The net-net of the properties above is the long-term obviation of capacity planning, capacity management, and consolidation-planning tools. (Now take a deep breath)

Yes. Long-term, I would expect existing capacity management tools like PlateSpin PowerRecon, CiRBA's Data Center Intelligence, and VMware's Capacity Planner to be completely obviated by the appropriate internal IaaS architectures. Why? Well, let's say you do clever consolidation planning for your apps. You virtualize them and cram them into many fewer servers. But a few months pass, and the business demand for a few apps changes... so you have to start the planning all over again. Contrast this with an IaaS infrastructure, where you let a computer continuously figure out the "best fit" for your applications. The current concept of "static" resource planning is destined for the history books.
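To make that "continuous best fit" idea a bit more concrete, here's a minimal Python sketch of the kind of re-packing a policy engine might re-run whenever measured demand changes. The app names, demands and host sizes are invented for illustration, and this is just a textbook best-fit heuristic -- not any vendor's actual algorithm:

```python
# Minimal sketch of "continuous consolidation": re-pack virtualized apps onto
# as few hosts as possible with a best-fit-decreasing heuristic.
# All names and numbers are illustrative; assumes each app fits on one host.

def best_fit_pack(app_demands, host_capacity):
    """Place each app (CPU demand) on the host with the least leftover room
    that still fits it; open a new host when nothing fits."""
    hosts = []        # remaining free capacity per host
    placement = {}    # app name -> host index
    for app, demand in sorted(app_demands.items(), key=lambda kv: -kv[1]):
        candidates = [i for i, free in enumerate(hosts) if free >= demand]
        if candidates:
            i = min(candidates, key=lambda i: hosts[i])  # tightest fit
        else:
            hosts.append(host_capacity)                  # power up another host
            i = len(hosts) - 1
        hosts[i] -= demand
        placement[app] = i
    return placement, len(hosts)

# Re-run whenever demand changes, instead of doing a one-time consolidation study.
demand_now = {"web": 6, "erp": 10, "mail": 4, "batch": 8, "bi": 3}  # CPU cores
placement, hosts_needed = best_fit_pack(demand_now, host_capacity=16)
print(placement, hosts_needed)  # 31 cores of demand packed onto 2 16-core hosts
```

The point isn't the particular heuristic; it's that the packing is recomputed continuously by software, so the "plan" never goes stale.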

Oh - and there are some nice side-benefits of allowing policy to govern when and where applications are provisioned in an internal IaaS ("internal cloud") architecture:

1) Simplified capacity additions: If capacity is allocated on a global basis, then the need to plan on a per-application basis is much less important. Raw capacity can be added to a "free pool" of servers, and the governing policy engine allocates it as-needed to individual applications. In fact, the more applications you have, the "smoother" capacity can be allocated, and the more statistical (rather than granular) capacity measurement can become.
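As a toy illustration of that "statistical" point, here's a quick Python sketch -- with completely made-up, uniformly random demand -- showing how the pool's peak-to-mean ratio shrinks as more roughly independent apps share a single free pool:

```python
# Toy illustration of why a shared pool gets "smoother": the peak-to-mean
# ratio of aggregate demand shrinks as more (roughly independent) workloads
# share one free pool. Demand numbers are invented.
import random

random.seed(1)

def peak_to_mean(num_apps, samples=1000):
    """Simulate hourly demand for num_apps apps; compare the pool's observed
    peak against its average."""
    totals = [sum(random.uniform(1, 10) for _ in range(num_apps))
              for _ in range(samples)]
    return max(totals) / (sum(totals) / len(totals))

for n in (1, 10, 100):
    print(n, round(peak_to_mean(n), 2))
# Typically prints roughly 1.8 for a single app, shrinking toward ~1.1-1.2
# for 100 pooled apps -- so headroom can be sized statistically for the pool.
```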

2) Re-defined "consolidation planning": As I said above, the "static" approach to consolidation planning will give way to continuous resource allocation, essentially "continuous consolidation." Instead, you'll simply find yourself looking at used capacity (whether for physical or virtualized apps) and add raw capacity (as in #1) as-needed. The hard work of figuring out "best fit" for consolidation will take place automatically, and dynamically.

3) Re-defined capacity management: Just like #2, rather than using tools to determine "static" capacity needs, you'll get a global perspective on available vs. used raw capacity. You'll simply add raw capacity as needed, and it will be allocated to physical and/or virtual workloads as needed.

4) Re-defined capacity planning for new apps: Instead of the "black art" of figuring out how much capacity (and hardware purchase) to allocate to new apps, you'll use policy. For example, you roll out a new app and use policy to throttle how much capacity it uses. If you under-forecast, you can "open the throttle" and allow more resources to be used -- and if it's a critical app, maybe even dynamically "borrow" resources from elsewhere until you permanently acquire new capacity.
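Here's a hypothetical sketch of what such a throttle could look like in practice. The policy fields and the allocate() helper below are things I made up for illustration -- not a real product API:

```python
# A hypothetical capacity policy for a new app: start with a conservative
# throttle, widen it if real demand outgrows the forecast.
# Policy keys and allocate() are invented for illustration.

policy = {
    "app": "new-order-portal",
    "min_servers": 2,        # always-on floor
    "max_servers": 8,        # initial throttle while demand is unknown
    "borrow_allowed": True,  # may temporarily borrow from lower-priority apps
}

def allocate(requested, policy, free_pool, borrowable):
    """Grant capacity up to the throttle; borrow beyond the free pool only if
    the policy permits it. Returns (granted, borrowed)."""
    granted = min(requested, policy["max_servers"])
    if granted <= free_pool:
        return granted, 0
    if policy["borrow_allowed"]:
        borrowed = min(granted - free_pool, borrowable)
        return free_pool + borrowed, borrowed
    return free_pool, 0

# Demand spikes past the forecast: "open the throttle" by raising max_servers
# instead of re-running a capacity-planning study.
granted, borrowed = allocate(requested=10, policy=policy, free_pool=5, borrowable=4)
print(granted, borrowed)  # 8 granted (capped by the throttle), 3 of them borrowed
```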

5) Attention to application "phase": You'll also realize that the best capital efficiency occurs when you have "out-of-phase" resource demands. For example, most demand for app servers happens during the day, while demand for backup servers happens at night -- so these out-of-phase needs could theoretically share hardware. I would therefore expect administrative duties to shift toward "global load balancing," and toward encouraging non-essential tasks to take place during off-hours -- much the same way Independent System Operators across the country share electric loads.
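A quick back-of-the-envelope example of why that sharing pays off, using two invented hourly demand profiles:

```python
# Two out-of-phase workloads: a daytime app tier and a nighttime backup tier,
# sized separately vs. sized on their combined curve. Profiles are invented.

app_tier = [20 if 8 <= h < 20 else 4 for h in range(24)]         # busy by day
backups  = [18 if (h >= 22 or h < 6) else 2 for h in range(24)]  # busy by night

separate = max(app_tier) + max(backups)                   # each pool sized for its own peak
shared   = max(a + b for a, b in zip(app_tier, backups))  # one pool, combined peak

print(separate, shared)  # 38 vs. 22 servers in this toy example
```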

BTW, if all of this sounds like "vision" and vaporware, it's not. There are firms offering architectures like IaaS, internal clouds and utility computing technologies today that work with your existing equipment. I know one of them pretty well :)

Thursday, October 9, 2008

Cassatt's chief scientist explains, simplifies

What's the sign of a really smart guy? The ability to take a complex topic and simplify it so that even your mom will understand it.

Steve Oberlin, Cassatt's Chief Scientist, has done just that. He's taken a look at how data centers operate, the dynamics that drive them, and how existing technology can help simplify IT management's life: simpler capacity management, service-level management, and overall energy efficiency. It's the thinking behind utility computing, what will drive Infrastructure-as-a-Service (IaaS), and the basis for building the "internal cloud" infrastructures we're all talking about.



Oh. And what's the sign of an extraordinarily smart guy? That he can simplify-down these concepts -and- produce the entire video himself.



Monday, October 6, 2008

Would you buy an Amazon EC2 appliance?

Before you scream "a what?" I'm only posing this as a thought experiment...

But the concept was recently put forth as an illustration at last week's SDForum by an attendee. I kind of thought about it for a few minutes, and realized that the concept isn't as crazy as it first sounds. In fact, it implies major changes for IT are on the way.

First of all, the idea of a SaaS provider or web service provider creating a physical appliance for the enterprise is not new. There's the Google search appliance, but I also expect providers like Salesforce.com to do the same in the near future. (There are some very large enterprises that want to be 100% sure that their critical/sensitive data is resident behind their firewall, and they want to bring the value of their SaaS provider inside.)

So I thought: what would I expect an Amazon EC2/S3 appliance to do? Similar to the way Google's appliance provides internal search, I'd expect an Amazon appliance to create an elastic, resilient set of compute and storage services inside a company, one that could support any and all applications no matter what the user demand. It would also have cost transparency, i.e. I'd know exactly what it costs to operate each CPU (or virtual CPU) on an hourly basis. Same goes for storage.

This approach would have various advantages (plus one small limitation) compared to how IT is operated today. The limitation is that its "elasticity" would be bounded by the poolable compute horsepower within the enterprise. But the advantages would be huge -- who wouldn't like a cost basis of ~$0.10/CPU-hour from their existing resources? Who wouldn't like to shrug off traditional capacity planning? Etc., etc. AND they'd be able to maintain all of their existing compliance and security architectures, since they'd still be using their own behind-the-firewall facilities.
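Just to show what that kind of cost transparency might report, here's some rough, entirely invented arithmetic for an internal cost-per-CPU-hour. Every number below is an assumption, not a benchmark:

```python
# Back-of-the-envelope internal cost per CPU-hour for a pooled environment.
# All inputs are invented assumptions for illustration only.

servers          = 200
cores_per_server = 8
server_capex     = 4000    # $ per server, amortized over 3 years
power_cooling    = 1200    # $ per server per year (power + cooling)
admin_per_year   = 150000  # $ of ops labor spread across the pool
utilization      = 0.55    # fraction of core-hours actually used by apps

annual_cost = servers * (server_capex / 3 + power_cooling) + admin_per_year
core_hours  = servers * cores_per_server * 24 * 365 * utilization
print(round(annual_cost / core_hours, 3), "$ per CPU-hour")  # ~0.085 with these inputs
```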

Does it still sound crazy so far?

NOW, what if Amazon were to take one little extra step? Remember the limitation above -- the what-if-I-run-out-of-compute-resources issue? What if Amazon allowed the appliance user to permit reaching out to Amazon's public EC2/S3? Say you hit peak compute demand. Say you had a large power outage or a series of hardware failures. Say you were rolling out a new app and couldn't accurately forecast demand. This feature would be valuable to you because you'd have practically infinite "overflow" -- and it would be valuable to Amazon, since it would drive incremental business to their public infrastructure.
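Here's a minimal sketch of what that overflow decision might look like. The burst_to_public() call is a placeholder of my own invention, not a real Amazon API:

```python
# Sketch of "overflow" placement: serve demand from the internal pool first,
# and burst the remainder to a public provider only if policy allows it.
# place_workload() and burst_to_public() are invented for illustration.

def place_workload(instances_needed, internal_free, allow_burst=True):
    """Return (internal, public) instance counts for a request."""
    internal = min(instances_needed, internal_free)
    overflow = instances_needed - internal
    public = overflow if (allow_burst and overflow > 0) else 0
    return internal, public

def burst_to_public(count):
    # Placeholder for whatever provisioning call the public provider exposes.
    print("requesting", count, "overflow instances from the public cloud")

internal, public = place_workload(instances_needed=120, internal_free=90)
if public:
    burst_to_public(public)   # 30 instances run outside during the peak
print(internal, public)       # 90 internal, 30 public
```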

To be honest, I have no idea what Amazon is planning. But I DO know that the concept of commercially-available software/hardware to create internal "clouds" is happening today. And not just in the "special case" of VMware's "VDC-OS", but in a more generalized approach.

Companies like Cassatt can -- today -- take an existing compute environment, and transform its operation so that it acts like an EC2 (an "internal cloud"). It responds to demand changes, it works around failures, and it optimizes how resources are pooled. You don't have to virtualize applications if you don't want to; and if you do, you can use whatever VM technology you prefer. It's all managed as an "elastic" pool for you. And metered, too.

To be sure, others are developing similar approaches to transforming how *internal* IT is managed. But if you are one of those who believes in the value of a "cloud" but wouldn't use it, maybe you should think again.

Sound crazy?

Thursday, October 2, 2008

Decades of experience with Clouds: Telcos

While at yesterday's SDForum meeting on cloud computing, a panelist pointed out that we've been living with (a form of) cloud computing for decades. It's called Telephony.

On reflection, the telcos do give us an interesting model for what PaaS *could* be like, and a metaphor for types of cloud services. To wit:
  • As users, we don't know (or care) where the carrier's gear is, or what platform it's based on so long as our calls are connected and our voicemail is available.
  • There isn't technical "lock-in" as we think of it. Your address (phone number) is now portable between carriers, and the cloud "API" is the dial tone plus those DTMF touch-tones.
  • I can "mash-up" applications such as Voicemail from one company, conference calling from another, and FAX transmission from a third.
  • There are even forms of "internal clouds" in this model -- they're called PBXs (private branch exchanges), which are nothing more than "internal" phone switches for your business.
This last point interests me the most - that enterprises have economic and operational needs (maybe even security needs!) to manage their own internal phone systems. But inevitably, workers may have to use the public phone system, too.

Similarly, many enterprises will need to retain 100% control of certain computing processes and will never outsource them to a cloud; they'll certainly be attracted to the economics that external computing resources offer, but will eventually build (or acquire) a similar *internal* capability. Just wait.

Wednesday, October 1, 2008

Postcards from SDForum - Cloud Computing and Beyond

I attended most of today's SDForum "Cloud Computing and Beyond: The Web Grows Up (Finally)" in Santa Clara. Somewhere around 200 professionals from Silicon Valley showed up to hear -- and to debate -- the relative maturity and merits of the thing we're calling the cloud.

The day was led off by James Staten, a friend and former colleague now with Forrester Research, who gave a fantastic keynote on "Is cloud computing the next revolution?" Just getting to a definition of terms and mapping the taxonomy of this emerging market is tricky, but he's tracking this fast-maturing market rather closely. Both web-based services and Software-as-a-Service are becoming the norm; but the industry is also calling the lower-level services (PaaS, IaaS) "cloud" too. So be careful with terms when you enter into a cloud debate.

Another morning keynote (which I unfortunately missed most of) was delivered by Lew Tucker, Sun Microsystems' new CTO of Cloud Computing (and also a friend and former colleague). He's quite a visionary, and went so far as to suggest that computing resources of tomorrow will be brokered/arbitraged based on specializations, costs, etc.

One particularly lively panel was hosted by Chris Primesberger of E-Week, with panelists from Salesforce.com, Intacct, SAP, RingCentral and Google. There was some light discussion about cloud differentiation, interaction, and standard approaches to describing cloud SLAs. Most generally agreed that there would in fact be third-party businesses brokering between providers at some point. The other enlightening discussion focused on capacity planning for the cloud -- what if a user scaled from ten to ten thousand servers in a few days or weeks? Could services like Amazon handle this? In a consistent -- and impressive -- way, the panelists agreed that these sorts of scale issues were "a drop in the bucket" when you consider the vastness of what these large services provide on a daily basis.

What drew the most spontaneous applause was a question asked of the panel (but probably directed at Rajen Sheth of Google) by a member of the audience: essentially, how could we *not* assume there would be service lock-in, when Force.com has one platform model and Google App Engine has another? (A good point elucidated by James Urquhart some time ago.) The Google response focused on "providing the best possible service for customers," but was clearly a dodge. (BTW, the author herein suggests that SaaS and PaaS models will follow the same proprietary/fragmentary model as Linux and Unix did.)

In an afternoon panel led by David Brown of AMR Research, the main question addressed was whether (or to what degree) cloud computing is disruptive. The panel consisted of hardware, software and services vendors from Elastra, Egenera, Joyent and Nirvanix. The panel agreed that there are different types of disruption, depending on where you sit. From an infrastructure-management perspective, internal cloud architectures can be disruptive to IT Ops, since they change how resources are applied and shared, and the fundamentals of capacity planning. Cloud architectures can also be disruptive to traditional forms of hosting and outsourcing, due to their pay-as-you-go approach.

I will say that Jason Hoffman, founder of Joyent, clearly stood out on the panel as a visionary in this field. Keep an eye on this guy. His take on disruption was that if "cloud" just means Infrastructure-as-a-Service, then it's really just another form of hosting, and not very disruptive. But if "clouds" are applied to support business needs using policy (i.e. to dynamically communicate SLAs, geographic compute locations, costs, replication, failover, etc.), then they become very disruptive. IT administration would shift from scripting and fire-fighting to policy development and policy modification.

Finally, I will point out that many more folks showed up who would use clouds and/or broker cloud services than who would actually *make* the clouds (IaaS) in the first place -- again attesting to the point I made earlier this week that it's a lot harder to do, and only really sophisticated vendors will be taking that on.