
Revisiting EC2 Instance IDs

February 4th, 2010  |  Published in Analysis  |  8 Comments

Back in September, I published the Anatomy of an EC2 Resource ID where I pointed out some curious patterns in EC2’s ID scheme and proposed a method of “decoding” these patterns to reveal an underlying serial number. In that post I was careful to write that “while the patterns are indisputable, there remain unknowns and quirks that remind us that such “black box” observation has its limits”.

This week, the black box became a little bit whiter.

Sören Bleikertz, a computer science student writing his Master’s thesis on EC2 security, poked into the Xen hypervisor used by EC2 and made some observations regarding EC2’s underlying architecture. Among his findings on the storage and networking configurations, Sören pointed out that each instance was given a unique name (the “Xen domain”) such as dom_32504936 and that this seemed to behave like a serial number, growing from day to day. Sound familiar yet?

Well, it turns out that the number in this Xen domain name is none other than the underlying serial number uncovered in my previous research! This revelation gives us an important conclusion: the decoding method was accurate. The serial number exists and, based on everyone’s input, we even got the formula right.

With Sören’s technique at hand we can now uncover the constants needed for all EC2 regions. Previously we only had enough data for us-east-1, which thanks to RightScale enjoyed a 3-year history; for the other regions we could not extract the constants. Surprisingly, it turns out that the constants are in fact identical for all regions. What threw us off the scent is that, unlike us-east-1, which very likely started its serial numbers from zero, the other regions did not. For example, the serial numbers for the 3-month-old us-west-1 region are already in the range of 752 million. Those for eu-west-1 are in the 500 million range. We can safely assume that hundreds of millions of instances have not in fact been spun up. What makes more sense is that each region was assigned a different starting point in order to ensure globally unique instance IDs.

An additional finding of Sören’s is that the image file for the root disk points to a filename on the VM host such as /mnt/instance_image_store_3/262768. It turns out that the number at the end of this path is, again, simply the decoded AMI ID. For example, we can re-encode 262768 to yield ami-19a34270, which is Alestic’s Ubuntu Karmic Base image. Similar to instance IDs, the underlying image ID also seems to have different ranges in each AWS region.
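To make the re-encoding step concrete, here is a minimal sketch that applies the same XOR steps as the instance ID script later in this post to the image number. The function name and its prefix argument are my own illustration, not anything from AWS; the constants are the ones used in that script.

```ruby
# Encode a decoded numeric ID into EC2's "prefix-xxxxxxxx" form.
# encode_resource_id is a hypothetical helper name; the XOR constants
# are the same ones used in the instance ID script in this post.
def encode_resource_id(prefix, numeric_id)
  c1   = (numeric_id >> 24) & 0xFF  # top byte
  c2   = (numeric_id >> 16) & 0xFF  # second byte
  c3   = numeric_id & 0xFFFF        # low 16 bits
  c3_1 = (numeric_id >> 8) & 0xFF   # high byte of the low 16 bits
  c3_2 = numeric_id & 0xFF          # low byte of the low 16 bits

  d1 = c1 ^ c3_2 ^ 0x69
  d2 = c2 ^ c3_1 ^ 0x40 ^ 0xe5
  d3 = c3 ^ 0x4000

  format("%s-%02x%02x%04x", prefix, d1, d2, d3)
end

puts encode_resource_id("ami", 262_768)  # prints ami-19a34270
```

The fact that the image number from the file path lands exactly on a known AMI ID is what suggests the same scheme is used for both resource types.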

As a bonus of Sören’s discoveries and the connection to the IDs, it’s now possible to infer your instance ID (and image ID) locally, without even consulting the EC2 instance metadata. Why someone would prefer this to the metadata service is a good question, but it’s a fun exercise nonetheless. Here’s a Ruby script that does just that:

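#!/usr/bin/ruby
# Infers the EC2 instance ID from the Xen domain name (needs the xenstore tools).

# Probe xenstore for the first domain ID that answers: that is this VM's own domain.
$stderr.puts("Detecting VM domain ID (may take a few moments)")
dom_id = (1..65535).find do |i|
        system("xenstore-ls /local/domain/#{i} > /dev/null 2>&1")
end
abort("Could not detect a VM domain ID (is xen-utils installed?)") if dom_id.nil?

$stderr.puts("VM domain ID is #{dom_id}")

# The name looks like dom_32900610; strip the trailing newline of the command output.
dom_name = `xenstore-read /local/domain/#{dom_id}/name`.strip
$stderr.puts("VM domain name is #{dom_name}")

# Split the serial number into its bytes.
numeric_id = dom_name.split("_").last.to_i
c1 = numeric_id >> 24
c2 = (numeric_id >> 16) & 0xFF
c3 = numeric_id & 0xFFFF
c3_1 = (numeric_id >> 8) & 0xFF
c3_2 = numeric_id & 0xFF

# Apply the XOR constants to obtain the encoded instance ID fields.
d1 = c1 ^ c3_2 ^ 0x69
d2 = c2 ^ c3_1 ^ 0x40 ^ 0xe5
d3 = c3 ^ 0x4000

instance_id = sprintf("i-%02x%02x%04x", d1, d2, d3)
puts(instance_id)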

This requires xen-utils to be installed on the machine (on Ubuntu, run apt-get install xen-utils-3.3). Here’s an example run:

# ./get_instance_id.rb
Detecting VM domain ID (may take a few moments)
VM domain ID is 1423
VM domain name is dom_32900610
i-6a554602
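Going the other way, from a resource ID back to the serial number, is just the same XOR steps undone. Here is a minimal sketch of that inverse; the function name is hypothetical, and the constants are the ones from the script above.

```ruby
# Decode an EC2 resource ID such as "i-6a554602" back to its serial number.
# decode_resource_id is a hypothetical helper name; it simply inverts
# the XOR steps used by the script in this post.
def decode_resource_id(resource_id)
  hex = resource_id.split("-").last
  raise ArgumentError, "expected 8 hex digits" unless hex =~ /\A[0-9a-f]{8}\z/i

  value = hex.to_i(16)
  d1 = value >> 24
  d2 = (value >> 16) & 0xFF
  d3 = value & 0xFFFF

  c3 = d3 ^ 0x4000                   # low 16 bits of the serial number
  c1 = d1 ^ (c3 & 0xFF) ^ 0x69       # top byte
  c2 = d2 ^ (c3 >> 8) ^ 0x40 ^ 0xe5  # second byte

  (c1 << 24) | (c2 << 16) | c3
end

puts decode_resource_id("i-6a554602")  # prints 32900610
```

The result matches the number in the domain name from the example run, dom_32900610.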

Thanks once more to Sören for the great detective work.

State of the Cloud – February 2010

February 2nd, 2010  |  Published in State of the Cloud

Welcome to the eighth update for the State of the Cloud series. In case you are just joining – this monthly report measures the adoption of leading cloud providers amongst public-facing websites. The data set is based on the top 500,000 sites as measured by Quantcast. As always I’d like to point out the caveats of this method which were laid out in the first post in the series.

This month UK-based FlexiScale joins the report. FlexiScale has been on the radar for a while, but due to the US-oriented nature of Quantcast’s data it was so under-represented that it was practically invisible. Since then, FlexiScale’s footprint has expanded enough to make it reasonable to include it in the report. It’s worth mentioning that FlexiScale is probably still under-represented, so its standings should be taken with a pinch of salt.

Snapshot for February 2010

This month I’m plotting the numbers both in the usual bar chart and also in pie chart format. The pie chart really visualizes that there are two leagues here. It’s increasingly difficult to see anyone from the minor league crossing over to the major league, which holds a whopping 93% of the total cloud-hosted sites found. Amazon EC2 hosts a shade under 50% of these sites while Rackspace Cloud Servers hosts 43%. Joyent, GoGrid, OpSource and FlexiScale together comprise just 7% of all cloud-hosted sites found.

I should remind readers that these percentages are not of the full 500,000 sites surveyed. In fact, all the cloud providers together still host a meager 1% of the sample.


Trends

Here are the results as observed over the past months:

Tune in next month for further updates!

State of the Cloud – January 2010

January 2nd, 2010  |  Published in State of the Cloud  |  6 Comments

A new year, a new decade… and also a new month. What better way to welcome the new year than our regular report on the cloud computing industry? Will 2010 be the year of the cloud? Many are expecting a major shift of the traditional hosting industry, large and small, into the cloud space this year. We’ll be watching.

Snapshot for January 2010

The standings continue to be stable. Has Amazon won or are we in for surprises during 2010?

Trends

The cloud marches forward! The past two months have been slower than usual (4-5% overall growth). On the one hand the cloud may have come in useful to handle the holiday traffic spikes, but on the other hand who wants to rock the boat at the most important time of the year, and just before your staff head out for vacation?

Comparing Amazon EC2 Regions

December 16th, 2009  |  Published in Analysis  |  8 Comments

Two weeks ago, Amazon announced the launch of a new N. California region, us-west-1. This is now the third region in Amazon’s portfolio, following the N. Virginia-based us-east-1 region and the Ireland-based eu-west-1. I got curious to see how fast the new region was being adopted. Luckily, with the Anatomy of an Amazon EC2 Resource ID formula at hand, we can zero in on this data quite easily.

To the results!

First, a plot of the instance counts for each region rising over the course of the day and a half sampled.

Plot of Amazon EC2 Instance Launch Count

Next, a more direct comparison of the count of instances launched per day.

EC2 instance launch counts per day per region

Conclusions?

  • us-west-1 is a great success. Just two weeks after its official launch its level of activity is 73% of eu-west-1, which has been around for a whole year!
  • us-east-1 continues to maintain a giant (and unsurprising) lead. We can also observe that its numbers are in the same range as the measurements back in September.

It will be interesting to see when (in my opinion it’s “when” not “if”) us-west-1 surpasses eu-west-1. It will probably not take too long. What is more intriguing is to try and guess the relative sizes, say, 1 year from now. Will us-east-1 retain its lead? While it enjoys its status as the default for AWS operations, the new region will definitely put up a fight due to its proximity to the California-centered tech industry.

One segment, applications with Salesforce integration, was up to now severely limited due to Salesforce’s datacenter being located in California and Amazon’s in Virginia. Many chose simply not to go for AWS. The new region now provides this segment with low-latency communications between EC2 and Salesforce. This segment alone may turn out to be a significant factor in the growth of us-west-1.

I’ll continue to follow from time to time. AWS are planning some new Asia regions in 2010, first Singapore and then Japan, so there will definitely be lots of data to keep us busy!

Happy Holidays!

Rackspace Cloud Usage Analysis

December 10th, 2009  |  Published in Analysis  |  8 Comments

We’ve seen Amazon EC2’s usage. We’ve also seen GoGrid’s usage. In this post, we’ll take a peek at how many servers are being spun up by users of Rackspace Cloud Servers. In the State of the Cloud series Rackspace seems to be a close second to Amazon. Will the usage data provide confirmation of this? Let’s find out!

Quite like GoGrid, Rackspace Cloud Servers’ systems make the task of measuring usage quite straightforward. Server IDs are serial numbers and there are no hoops to jump through in order to make the calculations. I set about collecting samples over a period of approximately two weeks. Here’s what I found:

Rackspace Cloud Servers Usage

Over a span of roughly two weeks, 7241 servers were provisioned by Rackspace Cloud users, an average of 488 servers/day.

Of the three providers surveyed, the Rackspace Cloud retains its position in second place. While among the top-ranked public websites Rackspace comes in at a close second, in terms of servers provisioned the gap is tremendous. The Rackspace Cloud is ahead of GoGrid’s 181 servers/day but still a hundred times smaller than Amazon’s whopping 50,000 servers/day.

As I speculated regarding GoGrid, some of the difference might be explained by the more elastic nature of a typical Amazon deployment and the ecosystem of tools available for Amazon. I do suspect that Rackspace may be gaining traction at least among the sporadic development and testing use cases – simply due to their low entry-level pricing. I myself have found that if in need of a quick server with minimal demands then Rackspace’s $0.015/hour definitely beats Amazon’s $0.085/hour.

State of the Cloud – December 2009

December 5th, 2009  |  Published in State of the Cloud  |  3 Comments

December is already upon us! Following a small delay due to the days spent at the IGT2009 cloud conference, here are this month’s statistics on cloud usage among the web’s top sites.  The first post in the series describes methodology, data sets and caveats.

Snapshot for December 2009

Top 500k Sites by Cloud Provider

Little seems to change in this chart from month to month, with Amazon firmly in the lead. The devil is in the details though. This month’s surprises are in the relative growth seen for each provider in the data set.

Monthly Growth

Monthly Growth Nov-Dec 2009

For the first time, Amazon EC2 falls strongly behind in month-to-month growth. All providers but OpSource exhibited significantly more movement. While OpSource’s sharp fall seems like big news, it probably really isn’t: since OpSource is small in our data set, its sampling error is higher.

Trends

Top 500k Sites by Cloud Provider - Trends

Total cloud usage among the top 500k sites grew by 5.3% this month, continuing the upwards trend we’ve been seeing for the past five months.  Even if this month’s growth seems smaller than previously, let’s compare it to non-cloud providers. A quick count of four leading traditional web hosters – GoDaddy, The Planet, Dreamhost and Bluehost – indicates their combined share, while over 10x larger than the cloud providers above, grew by just 1.7% this month.

Further numbers can be found in Rackspace’s Q3 Financial Reports, published last month. Rackspace is one of the few companies in the field to break out its cloud vs. traditional hosting revenues, providing the industry a rare insight. The results? Rackspace’s Q309 revenues for cloud were $15.3M, compared to $147.1M for managed hosting. Let’s dig deeper and calculate the quarterly growth: the Rackspace Cloud grew 17.5%, compared to 5.8% growth in managed hosting.

It’s not surprising that the relative size and growth we can see in Rackspace’s results are reminiscent of what we see in our own data. In the relatively saturated web hosting market, cloud computing still has much ground to cover, but it’s gaining significant momentum.

Analysis of GoGrid Cloud Usage

November 12th, 2009  |  Published in Analysis  |  5 Comments

In previous posts, we’ve gone to great lengths to make educated guesses regarding Amazon EC2 usage. Amazon, however, are not alone in the cloud space. It makes sense that we do the same for other providers to get an idea of their comparative levels of usage.

In today’s post, we’ll be discussing such an analysis of GoGrid.

GoGrid’s technology seems to make the process almost trivial – the server ID that is assigned to each and every server provisioned appears to be a straightforward serial number. Hence, this time we need not find patterns or XOR bytes, and can instead get down to business. As always, I will warn that the numbers are circumstantial and only GoGrid knows if they are the real thing.

Chart of GoGrid Servers Provisioned

In total, during a time span of just over 13 days this research witnessed 2413 servers provisioned on GoGrid. On average, that comes to approximately 181 servers launched per day.
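Since the IDs are serial, the rate calculation boils down to dividing the difference between two observed IDs by the time between the observations. Here is a minimal sketch of that arithmetic. The Sample struct, the function name, and the two sample values are hypothetical; the dates and IDs were chosen only so that they span 2413 servers over 13 days and 8 hours, and are not the actual measurements.

```ruby
# One observation: when a server was provisioned and which serial ID it got.
# Sample and servers_per_day are hypothetical names used for illustration.
Sample = Struct.new(:time, :server_id)

# Estimate servers provisioned per day from the earliest and latest samples.
def servers_per_day(samples)
  sorted = samples.sort_by(&:time)
  first = sorted.first
  last = sorted.last
  elapsed_days = (last.time - first.time) / 86_400.0  # Time difference is in seconds
  raise ArgumentError, "samples must span a non-zero interval" unless elapsed_days > 0

  (last.server_id - first.server_id) / elapsed_days
end

# Made-up samples: 2413 IDs apart, 13 days and 8 hours apart.
samples = [
  Sample.new(Time.utc(2009, 10, 28, 12, 0, 0), 100_000),
  Sample.new(Time.utc(2009, 11, 10, 20, 0, 0), 102_413)
]

puts format("%.0f servers/day", servers_per_day(samples))  # prints 181 servers/day
```

Only the first and last samples affect the average; the intermediate ones are useful mainly for plotting and for spotting gaps in the series.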

How does this compare to Amazon EC2? We calculated 50,000 servers/day for EC2, so the difference is significant. To put it another way – by this count, if GoGrid were the size of Texas then EC2 would be the Pacific Ocean. Let’s try to dig a little deeper though: this measurement is more a reflection of a provider’s level of elasticity than of its absolute size. With a thriving ecosystem of tools and services such as Elastic Load Balancing and Elastic MapReduce it’s not surprising to find the use of EC2 servers is more dynamic.

On a personal note, I must say I was put off by the time it takes to launch a server on GoGrid (sometimes up to 8 minutes) as compared to what I was used to from EC2. For me, that means that if I need a quick server just to run something for a minute or two, there would be no question: I would pick EC2.

Coming up next time – a similar analysis of Rackspace Cloud Servers.

State of the Cloud – November 2009

November 2nd, 2009  |  Published in State of the Cloud  |  4 Comments

It’s that time of month again, which means a new State of the Cloud post is upon us! For new readers, State of the Cloud is a regular report on the adoption of cloud infrastructures, comparing the share held by each provider. The first post in the series describes methodology, data sets and caveats.

Old or New?

During my recent presentation at the IGT, I was asked whether there is any evidence to back up the belief that newcomers are embracing cloud infrastructure more than established companies. Before our regular monthly numbers, let’s take a shot at that question. The technique we’ll use is simple: take registration dates for domains in the State of the Cloud data set, and compare those of cloud-hosted sites to those of the general population. (The analysis was performed with a random sampling of each group.)

Frequency of year of domain registration per hosting type

The difference is easily recognizable to the naked eye. While both groups have representation across the board, cloud-hosted sites tend to be much newer. Calculating the median of each group, the overall median year of registration in our data set is 2003, while that for cloud-hosted sites is 2005. These findings won’t come as news to anyone, but it’s great to let the numbers tell the story of the cloud’s early adopters.

And now for our regular programming -

Snapshot for November 2009

Top sites by cloud provider

Amazon leaps ahead, this month attaining a 35% lead on runner up Rackspace Cloud Servers. I’d like to note that due to an update of Quantcast’s top million sites (the input data set), this month’s results may be “bumpier” than usual. In the grand scheme of things, these regular updates will contribute to greater accuracy.

Monthly Growth

Cloud provider growth

GoGrid got the biggest boost this month, even though in absolute numbers there is a lot of ground to be covered. In the race for pole position, Amazon greatly outpaced Rackspace. We might infer that the updated data set’s positive effect on GoGrid and Amazon EC2 reflects well on them: it is an indication of their true strength amongst the top of the crop of Internet sites, as compared to the other providers.

Trends

Cloud Provider Trends

Regardless of the individual providers’ standings, the overall growth of the cloud providers surveyed over the past 3 months (as indicated by the black line) is an incredible 33% – from 3170 hits in our data set 3 months ago to 4217 this month.

Presentation: Measuring the Clouds

October 21st, 2009  |  Published in General  |  3 Comments

Following up on the cloud research I’ve been conducting and publishing here, yesterday I presented the topic at an IGT workshop. There was a lot of great discussion on the findings as well as ideas for new angles and fresh approaches to looking at the data. Thanks to IGT’s Avner Algom for hosting the session!

I’ve SlideShare’d the presentation below for my readers’ enjoyment. If anyone is interested in discussing feel free to reach out.

IGT is holding its primary cloud computing event of the year, IGT2009 – World Summit of Cloud Computing, on December 2-3. A ton of folks from the industry are attending and I’ll definitely be there. I look forward to meeting fellow cloud-ers.

Measuring The Clouds

Amazon Usage Estimates and Updates

October 7th, 2009  |  Published in Analysis  |  1 Comment

My recent Anatomy of an Amazon EC2 Resource ID post and the usage statistics it implied caused quite a stir in the cloud computing community. It’s been an exciting couple of weeks to see the discussion taking place. Needless to say, Amazon continues to keep quiet on the real numbers. In the past weeks new data has come to light, both by further analysis using the ID technique and from fresh sources. I’d like to share a couple of the more intriguing developments with my readers:

  • RightScale decided to apply the findings to the mountain of EC2 data they have – a few years’ worth. Firstly, this solved a few of the remaining puzzles in the ID formula, on the Series ID and Superseries ID (I’ve updated the original post to reflect this). Moreover, the wider perspective led them to estimate that the total number of instances launched is actually a whopping 15.5 million. RightScale’s full findings can be found here.
  • Randy Bias published an excellent post, Amazon’s EC2 Generating 220M+ Annually, based on “actual verified EC2 numbers plus some guesses and a rough model of its current annual usage”. Bias’s sources tell him Amazon has approximately 40,000 servers running (note that’s servers, not instances).

So what is the bottom line? CIO Magazine’s Bernard Golden described it well in his coverage of my research, concluding that “it’s hard to look at these numbers and not conclude that something big is going on, and not just in “toy” applications.”

Something big indeed.


    About Guy Rosen
    Guy Rosen is an entrepreneur in the cloud computing space. This blog shares his cloud market research, commentary and tips & tricks.

  • © 2010 Guy Rosen