New technique enables observation of EC2 usage, uncovers stunning data
Each time you allocate a resource using EC2 – an instance, a volume or a snapshot – you receive a unique identifier. This is the EC2 resource ID. Have you ever wondered what this ID represents? Well, I did. After noticing similarities between the IDs of resources requested in close succession, I started digging.
The outcome of this digging is a definition of the components that make up an EC2 resource ID. The marvel is that this definition allows us to externally count the number of resources provisioned within a certain time frame – enabling us for the first time to observe EC2’s usage patterns. For example, we can count how many instances are launched on a certain day, in a given EC2 region.
Before continuing, I’d like to emphasize that these findings are circumstantial. While the patterns are indisputable, there remain unknowns and quirks that remind us that such “black box” observation has its limits. Note also that we can estimate how many new resources are created but not how many are already active, how many were later deleted, etc. The total number of servers running on EC2 remains a mystery.
In one 24-hour period measured in September 2009, the estimation indicated the following volume of usage on Amazon EC2’s
- 50,242 instances requested
- 12,840 EBS volumes requested
- 30,925 EBS snapshots requested
- 41,121 reservations requested
  - Disambiguation: a reservation in this context is an atomic launch of one or more instances. This does not imply a reserved instance. For example, if you launch 1 instance, you get 1 instance ID and 1 reservation ID; if you launch 2 instances in one command, you get 2 instance IDs and still 1 reservation ID.
These numbers are impressive, to say the least. Even more impressive is a small hint, lurking between the numbers, that implies that just over the past month Amazon crossed a significant threshold (see below for more details):
8.4 million EC2 instances launched (since EC2’s debut).
UPDATE (Oct 7th 2009): RightScale applied the findings to the two years’ worth of data they have in their systems. Based on that data, they estimate the number of instances launched is actually 15.5 million! They also plotted the numbers over two years – worth checking out.
Anatomy of a Resource ID
So how were the numbers above calculated? To find out, let’s decompose an EC2 resource ID. After comparing hundreds of IDs, this opaque identifier turned out to be a little more transparent than you’d expect.
The most trivial of the fields, the type, is one of the following values:
- i – instance
- r – reservation
- vol – EBS volume
- snap – EBS snapshot
- ami – Amazon machine image
- aki – Amazon kernel image
- ari – Amazon ramdisk image
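As a minimal sketch of how the type field can be separated from the rest of an ID, the snippet below splits an ID on its hyphen. The `<prefix>-<8 hex digits>` layout and the sample ID `i-31a74258` are assumptions on my part (the sample is assembled from the byte values used in the worked examples in this post), and `split_resource_id` is a hypothetical helper name.

```python
import string

# Prefixes taken from the list above. The "<prefix>-<8 hex digits>" layout
# is an assumption about how these IDs are written.
RESOURCE_TYPES = {
    "i": "instance",
    "r": "reservation",
    "vol": "EBS volume",
    "snap": "EBS snapshot",
    "ami": "Amazon machine image",
    "aki": "Amazon kernel image",
    "ari": "Amazon ramdisk image",
}


def split_resource_id(resource_id):
    """Return (type name, 32-bit integer) for an ID such as 'i-31a74258'."""
    prefix, sep, hex_part = resource_id.partition("-")
    if not sep or prefix not in RESOURCE_TYPES:
        raise ValueError(f"unrecognized resource ID: {resource_id!r}")
    if len(hex_part) != 8 or any(c not in string.hexdigits for c in hex_part):
        raise ValueError(f"expected 8 hex digits in {resource_id!r}")
    return RESOURCE_TYPES[prefix], int(hex_part, 16)


# Hypothetical sample ID, not a recorded one.
kind, value = split_resource_id("i-31a74258")
print(kind, f"{value:08x}")  # instance 31a74258
```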
The Inner ID is a 16-bit counter of resources allocated. Each time a resource is requested, the Inner ID increments by one. For instance and reservation IDs, it increments by two (i.e., these Inner IDs are always even). Instead of counting from 0-FFFF as you’d expect, the Inner ID follows a different cycle, one that can be easily normalized by XORing with 4000.
When the Inner ID has exhausted its space, a new series begins (see below) and the cycle restarts.
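The normalization step can be sketched in a couple of lines. The function name is hypothetical, and the last part rests on an assumption: if XORing with 4000 restores plain counting order, then the raw counter would walk through its four 16K blocks in the order printed below.

```python
INNER_ID_MASK = 0x4000  # normalization constant quoted in the text


def normalize_inner_id(raw_inner):
    """Map a raw 16-bit Inner ID to its position within the cycle."""
    if not 0 <= raw_inner <= 0xFFFF:
        raise ValueError("Inner ID must fit in 16 bits")
    return raw_inner ^ INNER_ID_MASK


# Assumption: normalized values count 0000..FFFF in order, so the raw
# block starts are the normalized block starts XORed with the mask.
raw_block_starts = [n ^ INNER_ID_MASK for n in (0x0000, 0x4000, 0x8000, 0xC000)]

print(f"{normalize_inner_id(0x4258):04x}")        # 0258
print([f"{b:04x}" for b in raw_block_starts])     # ['4000', '0000', 'c000', '8000']
```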
For a given resource type, there is one active 8-bit Series ID. This Series ID, however, is not embedded directly into the resource ID. Instead, it is XORed with the leftmost 8 bits of the Inner ID. The result, which I call the Series Marker, is embedded in the ID to the left of the Inner ID.
For example, when the two bytes involved are a7 and 42, the Series ID would be e5 = a7 XOR 42.
Series IDs usually decrement by one each time the Inner ID completes a cycle. I say “usually” because while this is the most common behavior, from time to time Series IDs seem to jump around in a pattern which is yet to be explained.
UPDATE (Oct 7th 2009): RightScale contributed the missing piece: to normalize a series ID, XOR with E5 – this irons out the “jumps” I noticed perfectly.
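Here is a minimal sketch of recovering and normalizing the Series ID from the 32-bit numeric part of an ID. It assumes the byte layout described above (Series Marker immediately to the left of the 16-bit Inner ID); the function names and the sample value `0x31a74258` are hypothetical.

```python
SERIES_MASK = 0xE5  # normalization constant contributed by RightScale


def series_id(id_value):
    """Recover the Series ID from the 32-bit numeric part of a resource ID."""
    series_marker = (id_value >> 16) & 0xFF  # byte left of the Inner ID
    inner_high = (id_value >> 8) & 0xFF      # leftmost 8 bits of the Inner ID
    return series_marker ^ inner_high


def normalized_series(id_value):
    """Series ID after XORing with the RightScale constant."""
    return series_id(id_value) ^ SERIES_MASK


# Hypothetical sample value built from the bytes a7 and 42 used above.
print(f"{series_id(0x31A74258):02x}")    # e5
print(normalized_series(0x31A74258))     # 0
```

If the normalized value simply counts upward (an assumption, not something I have verified), the raw Series IDs would run e5, e4, e7, e6 and so on, which would fit the "usually decrements by one" behavior.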
For a given resource type, there is one active 8-bit Superseries ID. Like the Series ID, the Superseries ID is not embedded directly into the resource ID. Instead, it is XORed with the rightmost 8 bits of the Inner ID. The result – the Superseries Marker – is the leftmost byte of the resource ID.
For example, when the two bytes involved are 31 and 58, the Superseries ID would be 69 = 31 XOR 58.
The Superseries ID changes so rarely that originally I had assumed it was some kind of checksum. This would have been odd as it limits the total available IDs to 2^24 = 16.8 million. Until very recently, the Superseries ID for all resource types – instances, images, volumes, snapshots, etc. – was 69 in the us-east-1 region (for eu-west-1 the Superseries ID is 74). These days, new instances use the Superseries ID 68. This subtle change, unnoticed by the industry, may hint at an astonishing achievement: 8.4 million instances launched since EC2’s debut! (Instance IDs are even so 8.4M = 16.8M / 2.)
UPDATE (Oct 7th 2009): RightScale suggested normalizing the Superseries ID by XORing with 69. With this technique, the Superseries ID for us-east-1 was 0, and the recent change incremented it to 1.
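The same idea applies to the Superseries ID, sketched below together with the arithmetic behind the 8.4 million figure. The function names and both sample values are hypothetical; the second value is constructed so that its Superseries ID comes out as 68.

```python
SUPERSERIES_MASK = 0x69  # normalization constant suggested by RightScale


def superseries_id(id_value):
    """Recover the Superseries ID from the 32-bit numeric part of a resource ID."""
    marker = (id_value >> 24) & 0xFF  # leftmost byte of the ID
    inner_low = id_value & 0xFF       # rightmost 8 bits of the Inner ID
    return marker ^ inner_low


def normalized_superseries(id_value):
    """Superseries ID after XORing with the RightScale constant."""
    return superseries_id(id_value) ^ SUPERSERIES_MASK


# 8 bits of Series ID plus 16 bits of Inner ID give 24 bits per superseries.
IDS_PER_SUPERSERIES = 2 ** 24
# Instance IDs are always even, so only half of those values are instances.
INSTANCES_PER_SUPERSERIES = IDS_PER_SUPERSERIES // 2

print(f"{superseries_id(0x31A74258):02x}")  # 69
print(normalized_superseries(0x30A74258))   # 1
print(INSTANCES_PER_SUPERSERIES)            # 8388608
```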
Note that since each EC2 region is a completely separate system, the IDs in each region are independent of each other.
Now that we have an idea of what an ID represents, how do we use that knowledge to estimate the number of resources provisioned by EC2 in a given time frame? The process is quite straightforward, and can be applied to time frames ranging from minutes up to weeks, months and years.
During the 24-hour period measured, one resource of each type was requested from EC2 every hour. In practice this means an instance was launched, an EBS volume was created and an EBS snapshot taken. The IDs that EC2 assigned to these resources were recorded, along with the time of their creation (as indicated in the timestamp returned from EC2 itself). Finally, the resources were released (instance terminated, volume and snapshot deleted) in order to minimize expenses. This process repeated every hour, which seems to be frequent enough so as not to miss any series rollovers.
The results – IDs and timestamps – were then analyzed using a combination of scripts and Excel spreadsheets. The Superseries, Series and Inner IDs were extracted from the resource IDs. Finally, the IDs were normalized and combined to yield a single number – a number that represents the continuum of resource IDs.
With this number, it’s plain sailing to measure or plot how many resources EC2 provisioned between any two samples.
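To make the last two steps concrete, here is a minimal sketch of combining the normalized fields into one counter and measuring the distance between two samples. Treat it as an illustration under stated assumptions rather than the exact scripts I used: I am assuming the combined number is built as Superseries, then Series, then Inner ID from most to least significant, and that the normalized Series and Superseries count upward. The function names and both sample IDs are hypothetical.

```python
def resource_counter(resource_id):
    """Turn an ID such as 'i-31a74258' into a single normalized counter."""
    _prefix, _sep, hex_part = resource_id.partition("-")
    value = int(hex_part, 16)
    inner = value & 0xFFFF
    series = ((value >> 16) & 0xFF) ^ (inner >> 8)         # undo Series Marker
    superseries = ((value >> 24) & 0xFF) ^ (inner & 0xFF)  # undo Superseries Marker
    # Normalize each field, then stack them (assumed significance order).
    return ((superseries ^ 0x69) << 24) | ((series ^ 0xE5) << 16) | (inner ^ 0x4000)


def resources_between(first_id, second_id):
    """Estimate how many resources were allocated between two samples."""
    prefix = first_id.partition("-")[0]
    step = 2 if prefix in ("i", "r") else 1  # instance/reservation IDs are even
    return (resource_counter(second_id) - resource_counter(first_id)) // step


# Hypothetical samples that differ by exactly one full Inner ID cycle.
print(resources_between("i-31a74258", "i-31a64258"))      # 32768
print(resources_between("vol-31a74258", "vol-31a64258"))  # 65536
```

Both samples must come from the same resource type and the same region, since each type and each region keeps its own counter.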
The analysis, measurements and description above are based purely on observation. I cannot make any guarantees as to the accuracy of the technique. Even with confidence regarding the analysis of an ID, whether or not we can use that to infer overall usage is open to debate. In theory, Amazon could be allocating resources internally for various purposes. Is this performed on a scale large enough to throw the figures off course? Only time (and Amazon) will tell.
Final word: if you have any insights, corrections or additions to this research – please feel free to jump into the conversation or email me. I’ll be sure to give credit in updates or future posts.