<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jack of all Clouds :: Guy Rosen on Cloud Computing &#187; Analysis</title>
	<atom:link href="http://www.jackofallclouds.com/category/analysis/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.jackofallclouds.com</link>
	<description>Cloud Computing analysis and commentary from Guy Rosen</description>
	<lastBuildDate>Thu, 01 Jul 2010 15:24:52 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Revisiting EC2 Instance IDs</title>
		<link>http://www.jackofallclouds.com/2010/02/revisiting-ec2-instance-ids/</link>
		<comments>http://www.jackofallclouds.com/2010/02/revisiting-ec2-instance-ids/#comments</comments>
		<pubDate>Wed, 03 Feb 2010 22:29:58 +0000</pubDate>
		<dc:creator>Guy Rosen</dc:creator>
				<category><![CDATA[Analysis]]></category>

		<guid isPermaLink="false">http://www.jackofallclouds.com/?p=585</guid>
		<description><![CDATA[Back in September, I published the Anatomy of an EC2 Resource ID where I pointed out some curious patterns in EC2&#8217;s ID scheme and proposed a method of &#8220;decoding&#8221; these patterns to reveal an underlying serial number. In that post I was careful to write that &#8220;while the patterns are indisputable, there remain unknowns and [...]]]></description>
			<content:encoded><![CDATA[<p>Back in September, I published the <a href="http://www.jackofallclouds.com/2009/09/anatomy-of-an-amazon-ec2-resource-id/">Anatomy of an EC2 Resource ID</a> where I pointed out some curious patterns in EC2&#8217;s ID scheme and proposed a method of &#8220;decoding&#8221; these patterns to reveal an underlying serial number. In that post I was careful to write that <i>&#8220;while the patterns are indisputable, there remain unknowns and quirks that remind us that such “black box” observation has its limits&#8221;</i>.</p>
<p>This week, the black box became a little bit whiter.</p>
<p><a href="http://openfoo.org/">Sören Bleikertz</a>, a computer science student writing his Master&#8217;s thesis on EC2 security, poked into the Xen hypervisor used by EC2 and made some observations regarding <a href="http://openfoo.org/blog/amazon_ec2_underlying_architecture.html">EC2&#8217;s underlying architecture</a>. Among his findings on the storage and networking configurations, Sören pointed out that each instance was given a unique name (the &#8220;Xen domain&#8221;) such as <code>dom_32504936</code> and that this seemed to behave like a serial number, growing from day to day. Sound familiar yet?</p>
<p>Well, it turns out that this Xen domain is none other than the underlying instance ID uncovered in my previous research! This revelation gives us an important conclusion: the decoding method was accurate. The serial number exists and based on everyone&#8217;s input we even got the formula right.</p>
<p>With Sören&#8217;s technique at hand we can now uncover the constants needed for all EC2 regions. Previously we had enough data to extract the constants only for us-east-1, which <a href="http://blog.rightscale.com/2009/10/05/amazon-usage-estimates/">thanks to RightScale</a> enjoyed a 3-year history. Surprisingly, it turns out that the constants are in fact identical for all regions. What threw us off the scent is that, unlike us-east-1, which very likely started its serial numbers from zero, the other regions did not. For example, the serial numbers for the 3-month-old us-west-1 region are already in the range of 752 million. Those for eu-west-1 are in the 500 million range. We can safely assume that hundreds of millions of instances have not in fact been spun up. What makes more sense is that each region was assigned a different starting point in order to ensure globally unique instance IDs.</p>
<p>An additional finding of Sören&#8217;s is that the image file for the root disk points to a filename on the VM host such as <code>/mnt/instance_image_store_3/262768</code>. It turns out that the number at the end of this filename is, again, simply the AMI ID &#8211; decoded. For example, we can re-encode 262768 to yield ami-19a34270, which is Alestic&#8217;s Ubuntu Karmic Base image. Similar to instance IDs, the underlying image ID also seems to have different ranges in each AWS region.</p>
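<p>The re-encoding step can be sketched in a few lines of Ruby. This is only an illustration that mirrors the XOR constants used in the script in this post; the function name <code>encode_resource_id</code> and its <code>prefix</code> parameter are assumptions made for the example, not anything Amazon publishes.</p>

```ruby
# Re-encode an underlying serial number into an EC2-style resource ID.
# The constants are the ones used by the script in this post; the function
# name and the prefix parameter are illustrative assumptions.
def encode_resource_id(prefix, serial)
  c1   = (serial >> 24) & 0xFF   # normalized superseries
  c2   = (serial >> 16) & 0xFF   # normalized series
  c3   = serial & 0xFFFF         # normalized inner ID
  c3_1 = (serial >> 8) & 0xFF    # high byte of the inner ID
  c3_2 = serial & 0xFF           # low byte of the inner ID

  d1 = c1 ^ c3_2 ^ 0x69          # superseries marker
  d2 = c2 ^ c3_1 ^ 0x40 ^ 0xe5   # series marker
  d3 = c3 ^ 0x4000               # inner ID as embedded in the resource ID

  format("%s-%02x%02x%04x", prefix, d1, d2, d3)
end

puts encode_resource_id("ami", 262768)
```

<p>Run against the image number from the filename above, this prints ami-19a34270, matching the AMI named in the text.</p>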
<p>As a bonus of Sören&#8217;s discoveries and the connection to the IDs, it&#8217;s now possible to infer your instance ID (and image ID) locally, without even consulting the EC2 instance metadata. Why someone would prefer this to the metadata service is a good question, but it&#8217;s a fun exercise nonetheless. Here&#8217;s a Ruby script that does just that:</p>
<pre style="margin-left: 20px; background: #eeeeee; padding: 3px; border: 1px solid black; line-height: 100%">
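#!/usr/bin/ruby
# Infers this machine's EC2 instance ID from its Xen domain name.
$stderr.puts("Detecting VM domain ID (may take a few moments)")

# Probe xenstore until we hit the one domain ID visible from this guest.
dom_id = nil
(1..65535).each do |i|
        if system("xenstore-ls /local/domain/#{i} > /dev/null 2>&#038;1")
                dom_id = i
                break
        end
end
abort("Could not detect the VM domain ID (is xen-utils installed?)") if dom_id.nil?

$stderr.puts("VM domain ID is #{dom_id}")

# The domain name looks like dom_32900610; drop the trailing newline.
dom_name = `xenstore-read /local/domain/#{dom_id}/name`.strip
$stderr.puts("VM domain name is #{dom_name}")

# Split the serial number into its normalized components.
numeric_id = dom_name.split("_").last.to_i
c1 = numeric_id >> 24                   # superseries
c2 = (numeric_id >> 16) &#038; 0xFF      # series
c3 = numeric_id &#038; 0xFFFF            # inner ID
c3_1 = (numeric_id >> 8) &#038; 0xFF     # high byte of the inner ID
c3_2 = numeric_id &#038; 0xFF            # low byte of the inner ID

# Re-apply the XOR constants to obtain the fields embedded in the ID.
d1 = c1 ^ c3_2 ^ 0x69
d2 = c2 ^ c3_1 ^ 0x40 ^ 0xe5
d3 = c3 ^ 0x4000

instance_id = sprintf("i-%02x%02x%04x", d1, d2, d3)
puts(instance_id)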
</pre>
<p>This requires xen-utils to be installed on the machine (on Ubuntu, run <code>apt-get install xen-utils-3.3</code>). Here&#8217;s an example run:</p>
<pre style="margin-left: 20px; background: #eeeeee; padding: 3px; border: 1px solid black; line-height: 100%">
# <b>./get_instance_id.rb</b>
Detecting VM domain ID (may take a few moments)
VM domain ID is 1423
VM domain name is dom_32900610
i-6a554602
</pre>
<p>Thanks once more to Sören for the great detective work.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jackofallclouds.com/2010/02/revisiting-ec2-instance-ids/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Comparing Amazon EC2 Regions</title>
		<link>http://www.jackofallclouds.com/2009/12/comparing-amazon-ec2-regions/</link>
		<comments>http://www.jackofallclouds.com/2009/12/comparing-amazon-ec2-regions/#comments</comments>
		<pubDate>Wed, 16 Dec 2009 18:17:39 +0000</pubDate>
		<dc:creator>Guy Rosen</dc:creator>
				<category><![CDATA[Analysis]]></category>

		<guid isPermaLink="false">http://www.jackofallclouds.com/?p=528</guid>
		<description><![CDATA[Two weeks ago, Amazon announced the launch of a new N. California region, us-west-1. This is now the third region in Amazon&#8217;s portfolio, following the N. Virginia-based us-east-1 region and the Ireland-based eu-west-1. I got curious to see how fast the new region was being adopted. Luckily, with the Anatomy of an Amazon EC2 Resource [...]]]></description>
			<content:encoded><![CDATA[<p>Two weeks ago, Amazon <a href="http://aws.typepad.com/aws/2009/12/expanding-the-aws-footprint.html">announced the launch of a new N. California region</a>, <i>us-west-1</i>. This is now the third region in Amazon&#8217;s portfolio, following the N. Virginia-based <i>us-east-1</i> region and the Ireland-based <i>eu-west-1</i>. I got curious to see how fast the new region was being adopted. Luckily, with the <a href="http://www.jackofallclouds.com/2009/09/anatomy-of-an-amazon-ec2-resource-id/">Anatomy of an Amazon EC2 Resource ID</a> formula at hand, we can home in on this data quite easily.</p>
<p>To the results!</p>
<p>First, a plot of the instance counts for each region rising over the course of the day and a half sampled.</p>
<div style="text-align:center; margin-bottom: 20px">
<img class="aligncenter size-full wp-image-529" title="Plot of Amazon EC2 Instance Launch Count" src="http://www.jackofallclouds.com/wp-content/uploads/2009/12/regional_instance_count_plot.png" alt="Plot of Amazon EC2 Instance Launch Count" width="585" height="341" />
</div>
<p>Next, a more direct comparison of the count of instances launched per day.</p>
<div style="text-align:center; margin-bottom: 15px">
<img src="http://www.jackofallclouds.com/wp-content/uploads/2009/12/regional_instance_count_24hr.png" alt="EC2 instance launch counts per day per region" title="EC2 instance launch counts per day per region" width="569" height="305" class="aligncenter size-full wp-image-532" />
</div>
<p>Conclusions?</p>
<ul>
<li><i>us-west-1</i> is a great success. Just two weeks after its official launch its level of activity is 73% of <i>eu-west-1</i>, which has been around for a whole year!</li>
<li><i>us-east-1</i> continues to maintain a giant (and unsurprising) lead. We can also observe that its numbers are in the same range as the measurements back in September.</li>
</ul>
<p>It will be interesting to see when (in my opinion it&#8217;s &#8220;when&#8221; not &#8220;if&#8221;) <i>us-west-1</i> surpasses <i>eu-west-1</i>. It will probably not take too long. What is more intriguing is to try and guess the relative sizes, say, 1 year from now. Will <i>us-east-1</i> retain its lead? While it enjoys its status as the default for AWS operations, the new region will definitely put up a fight due to its proximity to the California-centered tech industry.</p>
<p>One segment, applications with Salesforce integration, was until now severely limited because Salesforce&#8217;s datacenter is located in California and Amazon&#8217;s in Virginia. Many chose simply not to go for AWS. The new region now provides this segment with low-latency communications between EC2 and Salesforce. This segment alone may turn out to be a significant factor in the growth of <i>us-west-1</i>.</p>
<p>I&#8217;ll continue to follow from time to time. AWS are planning some new Asia regions in 2010, first Singapore and then Japan, so there will definitely be lots of data to keep us busy!</p>
<p>Happy Holidays!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jackofallclouds.com/2009/12/comparing-amazon-ec2-regions/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Rackspace Cloud Usage Analysis</title>
		<link>http://www.jackofallclouds.com/2009/12/rackspace-cloud-usage-analysis/</link>
		<comments>http://www.jackofallclouds.com/2009/12/rackspace-cloud-usage-analysis/#comments</comments>
		<pubDate>Wed, 09 Dec 2009 22:36:36 +0000</pubDate>
		<dc:creator>Guy Rosen</dc:creator>
				<category><![CDATA[Analysis]]></category>

		<guid isPermaLink="false">http://www.jackofallclouds.com/?p=517</guid>
		<description><![CDATA[We&#8217;ve seen Amazon EC2&#8217;s usage. We&#8217;ve also seen GoGrid&#8217;s usage. In this post, we&#8217;ll take a peek at how many servers are being spun up by users of Rackspace Cloud Servers. In the State of the Cloud series Rackspace seems to be a close second to Amazon. Will the usage data provide confirmation of this? [...]]]></description>
			<content:encoded><![CDATA[<p>We&#8217;ve seen <a href="http://www.jackofallclouds.com/2009/09/anatomy-of-an-amazon-ec2-resource-id/">Amazon EC2&#8217;s usage</a>. We&#8217;ve also seen <a href="http://www.jackofallclouds.com/2009/11/gogrid-cloud-usage/">GoGrid&#8217;s usage</a>. In this post, we&#8217;ll take a peek at how many servers are being spun up by users of Rackspace Cloud Servers. In the <a href="http://www.jackofallclouds.com/category/state-of-the-cloud/">State of the Cloud</a> series Rackspace seems to be a close second to Amazon. Will the usage data provide confirmation of this? Let&#8217;s find out!</p>
<p>Much like GoGrid, Rackspace Cloud Servers&#8217; systems make the task of measuring usage straightforward. Server IDs are serial numbers and there are no hoops to jump through in order to make the calculations. I set about collecting samples over a period of approximately two weeks. Here&#8217;s what I found:</p>
<div style="text-align: center; margin-bottom: 10px"><img class="aligncenter size-full wp-image-518" title="Rackspace Cloud Servers Usage" src="http://www.jackofallclouds.com/wp-content/uploads/2009/12/rackspace_usage.png" alt="Rackspace Cloud Servers Usage" width="543" height="356" /></div>
<p>Over the sampling period of approximately two weeks, <strong>7241 servers</strong> were provisioned by Rackspace Cloud users. On average, that comes to <strong>488 servers/day</strong>.</p>
<p>Of the three providers surveyed, the Rackspace Cloud retains its position in second place. While among the top-ranked public websites Rackspace comes in at a close second, in terms of servers provisioned the gap is tremendous. The Rackspace Cloud is ahead of GoGrid&#8217;s 181 servers/day but still a hundred times smaller than Amazon&#8217;s whopping 50,000 servers/day.</p>
<p>As I speculated regarding GoGrid, some of the difference might be explained by the more elastic nature of a typical Amazon deployment and the ecosystem of tools available for Amazon. I do suspect that Rackspace may be gaining traction at least among the sporadic development and testing use cases &#8211; simply due to their low entry-level pricing. I myself have found that if in need of a quick server with minimal demands then Rackspace&#8217;s $0.015/hour definitely beats Amazon&#8217;s $0.085/hour.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jackofallclouds.com/2009/12/rackspace-cloud-usage-analysis/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Analysis of GoGrid Cloud Usage</title>
		<link>http://www.jackofallclouds.com/2009/11/gogrid-cloud-usage/</link>
		<comments>http://www.jackofallclouds.com/2009/11/gogrid-cloud-usage/#comments</comments>
		<pubDate>Thu, 12 Nov 2009 15:32:40 +0000</pubDate>
		<dc:creator>Guy Rosen</dc:creator>
				<category><![CDATA[Analysis]]></category>

		<guid isPermaLink="false">http://www.jackofallclouds.com/?p=490</guid>
		<description><![CDATA[In previous posts, we&#8217;ve gone to great lengths to make educated guesses regarding Amazon EC2 usage. Amazon, however, are not alone in the cloud space. It makes sense that we do the same for other providers to get an idea of their comparative levels of usage.
In today&#8217;s post, we&#8217;ll be discussing such an analysis of [...]]]></description>
			<content:encoded><![CDATA[<p>In previous posts, we&#8217;ve gone to great lengths to make educated guesses regarding <a href="http://www.jackofallclouds.com/2009/09/anatomy-of-an-amazon-ec2-resource-id/">Amazon EC2 usage</a>. Amazon, however, are not alone in the cloud space. It makes sense that we do the same for other providers to get an idea of their comparative levels of usage.</p>
<p>In today&#8217;s post, we&#8217;ll be discussing such an analysis of <a href="http://www.gogrid.com/">GoGrid</a>.</p>
<p>GoGrid&#8217;s technology seems to make the process almost trivial &#8211; the server ID that is assigned to each and every server provisioned appears to be a straightforward serial number. Hence, this time we need not find patterns or XOR bytes, and can instead get down to business. As always, I will warn that the numbers are circumstantial and only GoGrid knows if they are the real thing.</p>
<div style="text-align: center; margin-bottom: 15px">
<img title="Chart of GoGrid Servers Provisioned" src="http://www.jackofallclouds.com/wp-content/uploads/2009/11/gogrid_server_count.png" alt="Chart of GoGrid Servers Provisioned" width="554" height="395" />
</div>
<p>In total, during a time span of just over 13 days this research witnessed <strong>2413 servers</strong> provisioned on GoGrid. On average, that comes to approximately <strong>181 servers launched per day</strong>.</p>
<p>How does this compare to Amazon EC2? We calculated 50,000 servers/day for EC2, so the difference is significant. To put it another way &#8211; by this count, if GoGrid were the size of Texas then EC2 would be the Pacific Ocean. Let&#8217;s try to dig a little deeper though: this measurement is more a reflection of a provider&#8217;s level of elasticity than of its absolute size. With a thriving ecosystem of tools and services such as Elastic Load Balancing and Elastic MapReduce it&#8217;s not surprising to find the use of EC2 servers is more dynamic.</p>
<p>On a personal note, I must say I was put off by the time it takes to launch a server on GoGrid (sometimes up to 8 minutes) as compared to what I was used to from EC2. For me, that means that if I need a quick server just to run something for a minute or two, EC2 is the clear choice.</p>
<p>Coming up next time &#8211; a similar analysis of Rackspace Cloud Servers.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jackofallclouds.com/2009/11/gogrid-cloud-usage/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Amazon Usage Estimates and Updates</title>
		<link>http://www.jackofallclouds.com/2009/10/amazon-usage-estimates-and-updates/</link>
		<comments>http://www.jackofallclouds.com/2009/10/amazon-usage-estimates-and-updates/#comments</comments>
		<pubDate>Wed, 07 Oct 2009 13:01:43 +0000</pubDate>
		<dc:creator>Guy Rosen</dc:creator>
				<category><![CDATA[Analysis]]></category>

		<guid isPermaLink="false">http://www.jackofallclouds.com/?p=344</guid>
		<description><![CDATA[My recent Anatomy of an Amazon EC2 Resource ID post and the usage statistics it implied caused quite a stir in the cloud computing community. It&#8217;s been an exciting couple of weeks to see the discussion taking place. Needless to say, Amazon continues to keep quiet on the real numbers. In the past weeks new [...]]]></description>
			<content:encoded><![CDATA[<p>My recent <a href="http://www.jackofallclouds.com/2009/09/anatomy-of-an-amazon-ec2-resource-id/">Anatomy of an Amazon EC2 Resource ID</a> post and the usage statistics it implied caused quite a stir in the cloud computing community. It&#8217;s been an exciting couple of weeks to see the discussion taking place. Needless to say, Amazon continues to keep quiet on the real numbers. In the past weeks new data has come to light, both by further analysis using the ID technique and from fresh sources. I&#8217;d like to share a couple of the more intriguing developments with my readers:</p>
<ul>
<li> <a href="http://www.rightscale.com/">RightScale</a> decided to <a href="http://blog.rightscale.com/2009/10/05/amazon-usage-estimates/">apply the findings</a> to the mountain of EC2 data they have &#8211; a few years&#8217; worth. Firstly, this solved a few of the remaining puzzles in the ID formula, on the Series ID and Superseries ID (I&#8217;ve updated the original post to reflect this). Moreover, the wider perspective led them to estimate that the total number of instances launched is actually a whopping 15.5 million. RightScale&#8217;s full findings can be found <a href="http://blog.rightscale.com/2009/10/05/amazon-usage-estimates/">here</a>.
<div style="text-align: center; margin-top: 15px; margin-bottom: 15px"><a href="http://blog.rightscale.com/2009/10/05/amazon-usage-estimates/"><img src="http://rightscale.files.wordpress.com/2009/10/ec2-instances2.png?w=599&amp;h=396" alt="" width="599" height="396" border="0"/></a></div>
</li>
<li><a href="http://twitter.com/randybias">Randy Bias</a> published an excellent post stating that <a href="http://cloudscaling.com/blog/cloud-computing/amazons-ec2-generating-220m-annually">Amazon&#8217;s EC2 Generating 220M+ Annually</a> &#8211; based on &#8220;actual verified EC2 numbers plus some guesses and a rough model of its current annual usage&#8221;. Bias&#8217;s sources tell him Amazon has approximately 40,000 servers running (note that&#8217;s <i>servers</i> not <i>instances</i>).</li>
</ul>
<p>So what is the bottom line? CIO Magazine&#8217;s <a href="http://twitter.com/bernardgolden">Bernard Golden</a> described it well in his <a href="http://www.cio.com/article/503570/Inside_Amazon_s_Cloud_Just_How_Many_Customer_Projects_">coverage of my research</a>, concluding that &#8220;it&#8217;s hard to look at these numbers and not conclude that something big is going on, and not just in &#8220;toy&#8221; applications.&#8221;</p>
<p>Something big indeed.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jackofallclouds.com/2009/10/amazon-usage-estimates-and-updates/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Anatomy of an Amazon EC2 Resource ID</title>
		<link>http://www.jackofallclouds.com/2009/09/anatomy-of-an-amazon-ec2-resource-id/</link>
		<comments>http://www.jackofallclouds.com/2009/09/anatomy-of-an-amazon-ec2-resource-id/#comments</comments>
		<pubDate>Mon, 21 Sep 2009 10:36:07 +0000</pubDate>
		<dc:creator>Guy Rosen</dc:creator>
				<category><![CDATA[Analysis]]></category>

		<guid isPermaLink="false">http://www.jackofallclouds.com/?p=229</guid>
		<description><![CDATA[New technique enables observation of EC2 usage, uncovers stunning data
Each time you allocate a resource using EC2 &#8211; an instance, a volume or a snapshot &#8211; you receive a unique identifier. This is the EC2 resource ID. Have you ever wondered what this ID represents? Well, I did. After noticing similarities between the IDs of [...]]]></description>
			<content:encoded><![CDATA[<h3><em>New technique enables observation of EC2 usage, uncovers stunning data</em></h3>
<p>Each time you allocate a resource using EC2 &#8211; an instance, a volume or a snapshot &#8211; you receive a unique identifier. This is the EC2 resource ID. Have you ever wondered what this ID represents? Well, I did. After noticing similarities between the IDs of resources requested in close succession, I started digging.</p>
<p>The outcome of this digging is a definition of the components that make up an EC2 resource ID. The marvel is that this definition allows us to externally count the number of resources provisioned within a certain time frame &#8211; enabling us for the first time to observe EC2&#8217;s usage patterns. For example, we can count how many instances are launched on a certain day, in a given EC2 region.</p>
<p>Before continuing, I&#8217;d like to emphasize that these findings are circumstantial. While the patterns are indisputable, there remain unknowns and quirks that remind us that such &#8220;black box&#8221; observation has its limits. Note also that we can estimate how many new resources are created but not how many are already active, how many were later deleted, etc. The total number of servers running on EC2 remains a mystery.</p>
<h3>Results</h3>
<p>In one 24-hour period measured in September 2009, the estimation indicated the following volume of usage on Amazon EC2&#8217;s <code>us-east-1</code> region:</p>
<ul>
<li>50,242 instances requested</li>
<li>12,840 EBS volumes requested</li>
<li>30,925 EBS snapshots requested</li>
<li>41,121 reservations requested
<ul>
<li style="font-size: 11px; line-height: 1; margin-left: 20px">Disambiguation: a reservation in this context is an atomic launch of one or more instances. This does not imply a reserved instance. For example, if you launch 1 instance, you get 1 instance ID and 1 reservation ID; if you launch 2 instances in one command, you get 2 instance IDs and still 1 reservation ID.</li>
</ul>
</li>
</ul>
<p>These numbers are impressive, to say the least. Even more impressive is a small hint, lurking between the numbers, that implies that just over the past month Amazon crossed a significant threshold (see below for more details):</p>
<p><span style="margin-left: 20px">8.4 million EC2 instances launched (since EC2&#8217;s debut).</span></p>
<div style="text-align: center; margin-bottom: 15px"><img class="aligncenter size-full wp-image-233" title="EC2 Resource Usage" src="http://www.jackofallclouds.com/wp-content/uploads/2009/09/ec2_resource_chart.png" alt="EC2 Resource Usage" width="590" height="371" /></div>
<p><strong>UPDATE (Oct 7th 2009): </strong>RightScale <a href="http://blog.rightscale.com/2009/10/05/amazon-usage-estimates/">applied the findings</a> to the two years&#8217; worth of data they have in their systems. Based on that data, they estimate the number of instances launched is actually 15.5 million! They also plotted the numbers over two years &#8211; worth checking out.</p>
<h3>Anatomy of a Resource ID</h3>
<p>So how were the numbers above calculated? To find out, let&#8217;s decompose an EC2 resource ID. After comparing hundreds of IDs, this opaque identifier turned out to be a little more transparent than you&#8217;d expect.</p>
<div style="text-align: center; margin-bottom: 6px"><img class="aligncenter size-full wp-image-239" title="EC2 Resource ID" src="http://www.jackofallclouds.com/wp-content/uploads/2009/09/ec2_resource_id.png" alt="EC2 Resource ID" width="448" height="139" /></div>
<h4><strong>Type</strong></h4>
<p>The most trivial of the fields, the type is one of the following values, depending on the resource type:</p>
<ul style="margin-left: 50px">
<li><strong>i</strong> &#8211; instance</li>
<li><strong>r</strong> &#8211; reservation</li>
<li><strong>vol</strong> &#8211; EBS volume</li>
<li><strong>snap</strong> &#8211; EBS snapshot</li>
<li><strong>ami</strong> &#8211; Amazon machine image</li>
<li><strong>aki</strong> &#8211; Amazon kernel image</li>
<li><strong>ari</strong> &#8211; Amazon ramdisk image</li>
</ul>
<h4><strong>Inner ID</strong></h4>
<p>The Inner ID is a 16-bit counter of resources allocated. Each time a resource is requested, the Inner ID increments by one. For instance and reservation IDs, it increments by two (i.e., these Inner IDs are always even). Instead of counting from 0-FFFF as you&#8217;d expect, the Inner ID uses the following cycle:</p>
<ul style="margin-left: 50px">
<li>4000-7FFF</li>
<li>0000-3FFF</li>
<li>C000-FFFF</li>
<li>8000-BFFF</li>
</ul>
<p>(This cycle can be easily normalized by XORing with 4000.) When the Inner ID has exhausted its space, a new series begins (see below) and the cycle restarts.</p>
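<p>To see why XORing with 4000 straightens the cycle out, here is a small Ruby check. It is a sketch only: the helper name <code>normalize_inner_id</code> is made up for illustration, and the inputs are simply the first and last value of each of the four ranges listed above.</p>

```ruby
# Normalize an Inner ID so the observed cycle becomes a plain 0000-FFFF count.
# The helper name is an illustrative assumption.
def normalize_inner_id(inner_id)
  inner_id ^ 0x4000
end

# First and last value of each range, in the order the cycle visits them.
observed = [0x4000, 0x7FFF, 0x0000, 0x3FFF, 0xC000, 0xFFFF, 0x8000, 0xBFFF]
normalized = observed.map { |v| normalize_inner_id(v) }

puts normalized.map { |v| format("%04X", v) }.join(" ")
```

<p>The normalized values come out in ascending order (0000 3FFF 4000 7FFF 8000 BFFF C000 FFFF), so after the XOR the Inner ID behaves like an ordinary 16-bit counter.</p>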
<h4><strong>Series Marker</strong></h4>
<p>For a given resource type, there is one active 8-bit Series ID. This Series ID, however, is not embedded directly into the resource ID. Instead, it is XORed to the leftmost 8 bits of the Inner ID. The result, which I call the Series Marker, is embedded in the ID to the left of the Inner ID.</p>
<p>For example, on the resource ID above the Series ID would be <strong>e5 = </strong>a7 XOR 42.</p>
<p>Series IDs usually decrement by one each time the Inner ID completes a cycle. I say &#8220;usually&#8221; because while this is the most common behavior, from time to time Series IDs seem to jump around in a pattern which is yet to be explained.</p>
<p><strong>UPDATE (Oct 7th 2009): </strong><a href="http://blog.rightscale.com/2009/10/05/amazon-usage-estimates/">RightScale</a> contributed the missing piece: to normalize a series ID, XOR with E5 &#8211; this irons out the &#8220;jumps&#8221; I noticed perfectly.</p>
<h4><strong>Superseries Marker</strong></h4>
<p>For a given resource type, there is one active 8-bit Superseries ID. Like the Series ID, the Superseries ID is not embedded directly into the resource ID. Instead, it is XORed to the rightmost 8 bits of the Inner ID. The result &#8211; the Superseries Marker &#8211; is the leftmost byte of the resource ID.</p>
<p>For example, on the resource ID above the Superseries ID would be <strong>69 = </strong>31 XOR 58.</p>
<p>The Superseries ID changes so rarely that originally I had assumed it was some kind of checksum. This would have been odd as it limits the total available IDs to 2<sup>24</sup> = 16.8 million. Until very recently, the Superseries ID for all resource types &#8211; instances, images, volumes, snapshots, etc. &#8211; was 69 in the us-east-1 region (for eu-west-1 the Superseries ID is 74). These days, new instances use the Superseries ID 68. This subtle change, unnoticed by the industry, may hint at an astonishing achievement: 8.4 million instances launched since EC2&#8217;s debut! (Instance IDs are even, so 8.4M = 16.8M / 2.)</p>
<p><strong>UPDATE (Oct 7th 2009): </strong><a href="http://blog.rightscale.com/2009/10/05/amazon-usage-estimates/">RightScale</a> suggested normalizing the Superseries ID by XORing with 69. With this technique, the Superseries ID for us-east-1 was 0, and the recent change incremented it to 1.</p>
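<p>Putting the three fields together, the decoding can be sketched in Ruby as follows. Treat it as an illustration under stated assumptions: the function name <code>decode_resource_id</code> is hypothetical, the hex string 31a74258 is pieced together from the two worked examples above (markers 31 and a7, Inner ID bytes 42 and 58), and the way the normalized fields are combined into one number (superseries as the high byte, then series, then Inner ID) is an assumption rather than something Amazon documents.</p>

```ruby
# Decode the hex part of a resource ID into its components.
# Function name and the layout of the combined serial are assumptions.
def decode_resource_id(hex)
  value = hex.to_i(16)
  superseries_marker = (value >> 24) & 0xFF
  series_marker      = (value >> 16) & 0xFF
  inner_id           = value & 0xFFFF

  # Undo the XOR against the Inner ID bytes to recover the raw IDs.
  superseries_id = superseries_marker ^ (inner_id & 0xFF)  # rightmost 8 bits
  series_id      = series_marker ^ (inner_id >> 8)         # leftmost 8 bits

  {
    superseries_id: superseries_id,
    series_id: series_id,
    # Normalize each field (XOR with 69, E5 and 4000) and combine them.
    serial: ((superseries_id ^ 0x69) << 24) |
            ((series_id ^ 0xE5) << 16) |
            (inner_id ^ 0x4000)
  }
end

p decode_resource_id("31a74258")
```

<p>For this input the Superseries ID is 69 hex and the Series ID is E5 hex, as in the examples above, and the combined serial works out to 600.</p>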
<h4><strong>Regions</strong></h4>
<p>Note that since each EC2 region is a completely separate system, the IDs in each region are independent of each other.</p>
<h3>Counting Resources</h3>
<p>Now that we have an idea of what an ID represents, how do we use that knowledge to estimate the number of resources provisioned by EC2 in a given time frame? The process is quite straightforward, and can be applied to time frames ranging from minutes up to weeks, months and years.</p>
<p>During the 24-hour period measured, one resource of each type was requested from EC2 every hour. In practice this means an instance was launched, an EBS volume was created and an EBS snapshot taken. The IDs that EC2 assigned to these resources were recorded, along with the time of their creation (as indicated in the timestamp returned from EC2 itself). Finally, the resources were released (instance terminated, volume and snapshot deleted) in order to minimize expenses. This process was repeated every hour, which seems to be frequent enough so as not to miss any series rollovers.</p>
<p>The results &#8211; IDs and timestamps &#8211; were then analyzed using a combination of scripts and Excel spreadsheets. The Superseries, Series and Inner IDs were extracted from the resource IDs. Finally, the IDs were normalized and combined to yield a single number &#8211; a number that represents the continuum of resource IDs.</p>
<p>With this number, it&#8217;s plain sailing to measure or plot how many resources EC2 provisioned between any two samples.</p>
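<p>As a rough illustration of that last step, the Ruby sketch below turns two normalized serial numbers into a count and a daily rate. Everything in it is hypothetical: the function names, the sample serials and the one-hour gap are made up for the example. The <code>step</code> of 2 reflects the observation above that instance and reservation IDs increment by two.</p>

```ruby
# Estimate how many resources were provisioned between two samples.
# serial_a and serial_b are normalized serial numbers; step is 2 for
# instances and reservations and 1 for other types (see the Inner ID section).
def resources_between(serial_a, serial_b, step = 1)
  (serial_b - serial_a) / step
end

# Scale a count observed over the given number of seconds to a 24-hour rate.
def per_day(count, seconds)
  count * 86_400.0 / seconds
end

# Hypothetical normalized serials sampled one hour apart (made-up values).
first_sample  = 1_000_000
second_sample = 1_004_000

launched = resources_between(first_sample, second_sample, 2)
puts "#{launched} instances in the hour, about #{per_day(launched, 3600).round} per day"
```

<p>Summing these differences across consecutive hourly samples gives the daily totals; plotting the serial against its timestamp gives the usage curve.</p>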
<h3>Summary</h3>
<p>The analysis, measurements and description above are based purely on observation. I cannot make any guarantees as to the accuracy of the technique.  Even with confidence regarding the analysis of an ID, whether or not we can use that to infer overall usage is open to debate. In theory, Amazon could be allocating resources internally for various purposes. Is this performed on a scale large enough to throw the figures off course? Only time (and Amazon) will tell.</p>
<p>Final word: if you have any insights, corrections or additions to this research &#8211; please feel free to jump in the conversation or email me. I&#8217;ll be sure to give credit in updates or future posts.</p>
<p><strong>Thanks</strong> to <a href="http://www.alestic.com/">Eric Hammond</a>, <a href="http://natishalom.typepad.com/">Nati Shalom</a>, and Avner Algom and Peter Weinstein of the <a href="http://www.grid.org.il">IGT</a> for reading drafts of this post.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jackofallclouds.com/2009/09/anatomy-of-an-amazon-ec2-resource-id/feed/</wfw:commentRss>
		<slash:comments>50</slash:comments>
		</item>
	</channel>
</rss>
