Plastic brain learning/un-learning

Posts Tagged ‘Cloud Computing’

Puppet in the Clouds

In Cloud Computing, Manageability on March 25, 2009 at 9:52 pm




In Virtualized Data center on February 16, 2009 at 10:59 pm


I recently came across Benjamin Black's blog post on complexity in the context of AWS. He writes:

i now see complexity moving up the stack as merely an effect of complexity budgets. like anything worth knowing, complexity budgets are simple: complexity has a cost, like any other resource, and we can’t expect an infinite budget. 

spending our complexity budget wisely means investing it in the areas where it brings the most benefit (the most leverage, if you must), sometimes immediately, sometimes only once a system grows, and not spending it on things unessential to our goals.

What drives design complexity in the Cloud computing infrastructure space?

Finding the right mix of functional "differentiation" versus "integration" at all levels or tiers of the design (whether hardware or software), together with technology and business constraints, is what drives the "complexity budget".

"Differentiation" at the functional level is pretty well understood, but still evolving: routers, switches, compute and storage nodes remain the basic categories, although the very basis of how these functions are realized is changing (e.g. Cisco is reportedly gearing up to sell servers).

"Integration" of the various infrastructure functions in the data center is always a non-trivial system-integration expense.

Some examples of technology constraints:

  • From the chip level to the system level, energy efficiency improvements come much more slowly than hardware density improvements. Consequently, not being able to consume power in proportion to utilization levels results in sub-optimal cost/pricing structures.
  • The context/environment for virtualization for Cloud providers is really about defining what it means to have a "virtualized data center", Cisco's Unified Computing being one example.
  • Workload characterization & technology mapping
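To make the power-proportionality point above concrete, here is a back-of-the-envelope sketch; the wattage figures and the linear power model are invented purely for illustration, not taken from any real server:

```java
// Back-of-envelope sketch: effective power spent per unit of useful work
// when idle power is a large fraction of peak power. All figures are
// hypothetical, for illustration only.
public class PowerProportionality {
    // Hypothetical server: 200 W at idle, 350 W at full load.
    static final double IDLE_WATTS = 200.0;
    static final double PEAK_WATTS = 350.0;

    // Simple linear power model: draw = idle + utilization * (peak - idle).
    public static double watts(double utilization) {
        return IDLE_WATTS + utilization * (PEAK_WATTS - IDLE_WATTS);
    }

    // Watts burned per unit of delivered capacity; lower is better.
    public static double wattsPerUsefulUnit(double utilization) {
        return watts(utilization) / utilization;
    }
}
```

At 100% utilization this hypothetical server burns 350 W per unit of capacity, but at 20% it burns (200 + 0.2 * 150) / 0.2 = 1150 W per unit, which is why cost/pricing structures end up sub-optimal when utilization is low.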

Business constraints: typically learning curve/market adoption, cost and time-to-market.

The need to produce manageable systems, and the interdependencies that run from the chip to the board, the system, the software and ultimately the data center, require holistic, iterative thinking and design: we have to consider function and interaction at all levels, in the context of the containing environment, i.e. what it means to have a "fully virtualized" data center.

Ironically, Virtualization gets more attention than the "Reconfigurability" needed (in response to virtualization, workload and utilization variations) at all levels: compute, storage, interconnect, power/cooling, perhaps even the topology within the data center.

After all, Cloud providers would want:

  • “proportional power consumption” at the data center level
  • reduced system integration costs with fully integrated (virtualization aware) compute, storage & interconnect stack
  • optimal cloud computing operations, even for non-virtualized environments (let's face it, there are plenty of scenarios that don't need virtualization; by the same token, there are virtualization solutions that enable better utilization but don't require hypervisors)
  • complete automation of managed IT environments

This is all about moving complexity away from IT customers (who adopt the cloud computing model) and into the data centers/cloud providers.

Java Observability, Manageability in the Cloud

In API, Manageability on January 30, 2009 at 10:14 pm

Have you done this fire drill: you have a high-traffic/high-volume web application, and it is sluggish or unstable. Your team is called in to figure out whether the problem is in the application, the middleware stack, the operating system or somewhere else in the deployment configuration.

This is where Observability comes in: being able to dynamically probe resource usage (across all levels of the infrastructure/application stack) at a granular level, control probing overhead, and associate actions or triggers with those probes at all levels. At the system level, DTrace is a good example, on Solaris & Mac OS X. There is even a Java API for DTrace. Of course, there are other profiling libraries that offer low-overhead, extremely fine-grained probes and aggregation capabilities (e.g. JETM).
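Setting DTrace and JETM specifics aside, the core idea of a low-overhead probe that can be toggled at runtime and aggregates its measurements can be sketched in plain Java. The class and method names below are illustrative, not taken from any of the libraries mentioned:

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal probe sketch: a named measurement point that can be switched
// on and off at runtime and aggregates elapsed-time samples.
public class Probe {
    private final String name;
    private volatile boolean enabled = true;   // runtime on/off switch
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();

    public Probe(String name) { this.name = name; }

    public void setEnabled(boolean on) { enabled = on; }

    // Record one sample; near-zero overhead when the probe is disabled.
    public void record(long elapsedNanos) {
        if (!enabled) return;
        count.incrementAndGet();
        totalNanos.addAndGet(elapsedNanos);
    }

    public long count() { return count.get(); }

    // Mean elapsed time across recorded samples, in microseconds.
    public double meanMicros() {
        long n = count.get();
        return n == 0 ? 0.0 : (totalNanos.get() / (double) n) / 1_000.0;
    }

    public String name() { return name; }
}
```

A caller would capture `System.nanoTime()` before and after the measured section and pass the difference to `record`; real observability libraries layer aggregation windows, percentiles and trigger/action hooks on top of this basic shape.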

Manageability is another important consideration: being able to control and manage applications via standard management systems. This involves (hopefully) consistent instrumentation mechanisms, and standard isolation mechanisms between the IT resources being managed and external management systems. JMX is an example of one such standard. Other options, such as a JMX-to-SNMP bridge or MIBs compiled to MXBeans, are methods of integrating managed resources into higher-level (management) frameworks.
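As a minimal illustration of the JMX style of instrumentation, the sketch below registers a bean with the platform MBeanServer. The `RequestStats` bean and its attributes are made up for this example, but the `*MBean` interface naming convention and the registration calls are standard JMX:

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Standard JMX pattern: an interface named <Class>MBean declares the
// management attributes/operations exposed to external tools.
interface RequestStatsMBean {
    long getRequestCount();
    void reset();
}

class RequestStats implements RequestStatsMBean {
    private final AtomicLong requestCount = new AtomicLong();
    public void hit() { requestCount.incrementAndGet(); }
    public long getRequestCount() { return requestCount.get(); }
    public void reset() { requestCount.set(0); }
}

public class JmxExample {
    // The ObjectName (domain and key properties) is whatever fits your app.
    static final String NAME = "com.example:type=RequestStats";

    public static RequestStats registerStats() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        RequestStats stats = new RequestStats();
        server.registerMBean(stats, new ObjectName(NAME));
        return stats;
    }

    // Read the attribute back through the MBeanServer, the same way an
    // external management console would.
    public static long readCount() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        return (Long) server.getAttribute(new ObjectName(NAME), "RequestCount");
    }
}
```

Once registered, any JMX console (jconsole, or an SNMP bridge as mentioned above) can read `RequestCount` and invoke `reset` remotely.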

Flexible binding of computing infrastructure to workloads is a key value proposition of the Cloud computing model. Cloud computing providers like Amazon serve the needs of typical web workloads by providing access to their dynamic infrastructure.

The problem is workload diversity. Vendors like Amazon and Google essentially offer distributed computing workload platforms. Imagine tracing application performance issues on a distributed platform!

So, why are Observability and Manageability more important in the Cloud?

Because consumption of a fixed resource like CPU-hours is not optimal when you are on an "elastic"/dynamic infrastructure. You'd want to pay for your exact utilization levels.

Because you want monitoring arrangements that work as seamlessly as they do on a single system, but on top of an "elastic"/dynamic infrastructure.

There are many other reasons, of course: 

Being able to manage the lifecycle of applications in an automated manner, in a distributed computing environment, is perhaps the biggest use case for Manageability in the Cloud.

Metering, billing, performance monitoring, sizing and capacity planning are some examples of activities in the Cloud computing model that leverage the same underlying Observability principles (instrumentation, dynamic probes, support for aggregating metrics, etc.).

Let's look at a couple of examples of what Observability and Manageability capabilities can enable in the Cloud computing context:

  • Customers can pay at a more granular level of throughput (requests or transactions processed per unit of time) for a given latency (and other SLA terms), so your chargeback model makes sense in the context of your business activities.
  • Enable better business opportunities, in a cost-effective manner. E.g. Mashery focuses metering/instrumentation at the API level, and the proposition in this context is: open up your APIs and meter them to give yourself an automated "business development filtering" mechanism. This is attractive if you're in the right business, even on the "long tail", i.e. customer traffic/volume is not high, but at least you have an ecosystem (hundreds of developers or partners) to support that long tail without burning through your bank account. See my earlier post on RESTful business for more context; these considerations are even more valid in the Cloud computing paradigm.
  • Ensuring DoS-style attacks don't give you a heart attack (because your elastic cloud racked up a huge bill).
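The throughput-for-a-given-latency chargeback idea in the first bullet reduces to simple arithmetic once requests and SLA compliance are measured. The rates and the discount rule below are invented purely for illustration, not any provider's real pricing:

```java
// Hypothetical chargeback sketch: bill per processed request, with a
// discount applied to requests served outside the agreed latency SLA.
// All rates are illustrative.
public class Chargeback {
    static final double CENTS_PER_1000_REQUESTS = 5.0;
    static final double SLA_MISS_DISCOUNT = 0.5;  // 50% off out-of-SLA requests

    public static double costCents(long requestsInSla, long requestsOutOfSla) {
        double inSla = requestsInSla / 1000.0 * CENTS_PER_1000_REQUESTS;
        double outOfSla = requestsOutOfSla / 1000.0
                * CENTS_PER_1000_REQUESTS * (1.0 - SLA_MISS_DISCOUNT);
        return inSla + outOfSla;
    }
}
```

For instance, 100,000 in-SLA requests plus 10,000 out-of-SLA requests would bill as 500 + 25 = 525 cents under these made-up rates; the point is that the billing input is measured throughput and latency, not a flat CPU-hour figure.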

Observability and Manageability are key "infrastructure" capabilities in the Cloud computing model that enable features and value propositions such as the ones discussed above. These are not new ideas, but adaptations of time-tested ideas to a new computing paradigm (predominantly distributed computing over adaptive/dynamic infrastructure, though other variations will crop up).

Dealing with Web 2.0, Cloud computing trends….

In Systems thinking on December 7, 2008 at 8:55 pm

My favorite oxymoronic concept/phrase is "sustainable competitive advantage": it is all about getting to a point where you have sufficient lead time over competitors on all aspects of 'competitive advantage', such as product or technology supremacy, customer intimacy or operational efficiency. You need constant learning and unlearning to recognize when the rules of the game change, when the frame of reference shifts, and when to refocus. The Programmable Web, Web 2.0, the "Participation age", *aaS (everything as a Service): all of these are just another inflection point in how the technology game is evolving. But first, let's talk about how you might lose competitive advantage. You lose competitive advantage due to:

  • Imitation by competitors: Amazon's AWS, Google Apps and Joyent are some of the forerunners in the Cloud computing offerings space. While first-mover advantages can be significant in the presence of strong network effects, the concepts behind the Cloud Computing business model are certainly no barrier to entry. So, expect responses from other big players such as Microsoft, HP, IBM and Sun. Interestingly, Sun's offerings competed on a similar business model, but addressed a much narrower segment of workloads; then Web 2.0 came along. Companies with large installed bases in various customer segments such as Enterprises, Web 2.0/startups, Channels (resellers, OEMs) and Developers are sure to come up with unique value propositions for their primary revenue base, and build on their existing networks first. "Best practices" is the euphemism for imitation; not that it is bad, but remember it is an equalizer. Now, in the context of Cloud computing, ITIL/ITSM as a best practice will begin to be redefined.
  • Denial or inertia of incumbents: ironically, the likelihood of a delayed response to technological shifts increases with a company's level of success with previously dominant technologies. Google and Amazon seem to have a cost structure advantage, and have exploited Web 2.0 standards to roll out utility-scale computing. Sunk costs and current cost structures impose quite a lot of inertia on the big players (IBM, Sun, etc.) while they formulate response strategies.
  • Exploiting your strengths to the point where they turn out to be your weakness (sub-optimization): what has worked in the past may not work in the future. For example, the network effects that drove the adoption of the Microsoft PC platform are not as relevant in the Web 2.0, Open Source/Open Standards world. Similarly, the high-performance server market is not as attractive to customers as before in the Web 2.0 world, because open source innovations enable highly scalable, utility-scale deployments using low-end hardware and open source software. Think about what MogileFS, Hadoop, BigTable and AWS do to hardware vendors: you care less about OS innovations such as ZFS or DTrace when you have to worry about large-scale deployments. Application-level fault tolerance tends to take away a big chunk of the value proposition of these technologies.
  • Change in the rules of the game: railroads, telegraph/telephone networks and automotive manufacturing all went through this, and now it is the IT industry's turn. The first generation of these industries worried about manufacturing issues (e.g. once Ford Model Ts started rolling out en masse, success in dealing with production problems created a new problem/game: dealing with growth, customer segments, distribution, etc.). After the Mainframe, Client-server and PC eras, we are moving back to a ubiquitous, participatory age of computing, with the need for utility-scale compute and storage capacity. Cloud computing seems to address the business model concerns in this area in a simple, scalable manner.
  • Change in the very context/frame of reference: the nature of computing, as well as how you look at computing in general, is changing (i.e. the frame of reference that is common or "reality", and the methods of inquiry). All enterprises have to deal with a social network of some sort at the edges. The Long Tail, the Power Law and the idea of offering APIs for a programmable web of services around your products form the new frame of reference. You just don't "sell products" anymore. This is where terms like "Platform as a Service", "Infrastructure as a Service" and "Software as a Service" come into play; whether you like it or not, companies will have to be "operators" of an "IT utility" at some level. Today we tend to look at "internal" versus "customer-facing" applications as adjuncts to product strategy. We are moving into an era of information-bonded social networks.

Google's promise of more eco-friendly data centers, Microsoft's Generation 4 data center design, and how these companies craft their business models around Cloud computing at all levels, will determine how the IT industry plays the next round of this game.

RESTful Business…

In API, Business models on November 26, 2008 at 5:42 pm

One of the key drivers behind Cloud Computing, of course, is the standardization of web APIs around the REST programming model. Whether you are a technology provider, content provider or service provider, there is always room for a strategy to monetize, deepen customer relationships or foster a stronger community via REST APIs. So, based on your target market, chances are you can always unearth business models that monetize your APIs with appropriate billing and API usage provisioning capabilities.

Let's look at a few examples:

1. NetFlix: by opening up the APIs to their movie titles and subscriber queues, they enable richer user experiences outside of the Netflix web site (e.g. I'm sure there are iPhone apps for Netflix in the works), enable new partners, and generally all sorts of niche players (e.g. online Bond movie communities building apps around their interests). It's all about moving "down" the Long Tail and getting communities to build things that NetFlix can't build alone. It's all about faster innovation, driving more "positive" feedback into their user base.

2. BestBuy: is all about getting their catalog/e-store to wherever you hang out on the web. Talk about extending your customer reach: literally, extending your sales channels. Again, it is all about growing the ecosystem of customers and partners by enabling off-site experiences.

Exposing APIs, metering the usage of the infrastructure supporting those APIs, catering to the "Long Tail" by enabling crowd-sourcing: all of these facilitate new business models, or extend existing business models, in a highly leveraged manner. You are literally extending the reach of your business development activities, and with appropriate tracking/measurements in place (e.g. which of your API use cases are popular, which are growing like crazy) you have a mechanism to weed out the non-starters and surface viable business models or extensions to current ones.
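Surfacing which API use cases are popular presupposes per-partner, per-endpoint usage counters somewhere in the request path. A bare-bones sketch follows; the class name, key format and example endpoints are all made up for illustration:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Per-API-key, per-endpoint usage counter: the raw data behind metering,
// billing, and spotting which API use cases are growing.
public class ApiUsageMeter {
    private final Map<String, AtomicLong> counts = new ConcurrentHashMap<>();

    // Called once per handled request, e.g. from a request filter.
    public void record(String apiKey, String endpoint) {
        counts.computeIfAbsent(apiKey + "|" + endpoint,
                k -> new AtomicLong()).incrementAndGet();
    }

    public long count(String apiKey, String endpoint) {
        AtomicLong c = counts.get(apiKey + "|" + endpoint);
        return c == null ? 0 : c.get();
    }
}
```

In practice the counts would be flushed periodically to whatever billing/analytics store you use; the in-memory map is just the simplest shape of the idea.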

Of course, it takes a lot of discipline and focus to manage these changes (e.g. watch out for cannibalizing existing channels or incentive structures). Corporations, however big, can't think of all the great ideas themselves; it just makes sense to enable communities and accept the wisdom of "crowd sourcing". After all, isn't this the age of the programmable web?