Revisiting EC2 Instance IDs
February 4th, 2010 | Published in Analysis | 11 Comments
Back in September, I published Anatomy of an EC2 Resource ID, where I pointed out some curious patterns in EC2’s ID scheme and proposed a method of “decoding” these patterns to reveal an underlying serial number. In that post I was careful to write that “while the patterns are indisputable, there remain unknowns and quirks that remind us that such ‘black box’ observation has its limits.”
This week, the black box became a little bit whiter.
Sören Bleikertz, a computer science student writing his Master’s thesis on EC2 security, poked into the Xen hypervisor used by EC2 and made some observations regarding EC2’s underlying architecture. Among his findings on the storage and networking configurations, Sören pointed out that each instance is given a unique name (the “Xen domain”) such as dom_32504936, and that the number in it seems to behave like a serial number, growing from day to day. Sound familiar yet?
Well, it turns out that the number in this Xen domain name is none other than the underlying instance ID uncovered in my previous research! This revelation gives us an important conclusion: the decoding method was accurate. The serial number exists, and based on everyone’s input we even got the formula right.
With Sören’s technique at hand we can now uncover the constants needed for all EC2 regions. Until now we had enough data to extract the constants only for us-east-1, which thanks to RightScale enjoyed a 3-year history. Surprisingly, it turns out that the constants are in fact identical for all regions. What threw us off the scent is that, unlike us-east-1, which very likely started its serial numbers from zero, the other regions did not. For example, the serial numbers for the 3-month-old us-west-1 region are already in the range of 752 million, and those for eu-west-1 are in the 500 million range. We can safely assume that hundreds of millions of instances have not in fact been spun up in those regions. What makes more sense is that each region was assigned a different starting point in order to ensure globally unique instance IDs.
An additional finding of Sören’s is that the image file for the root disk points to a path on the VM host such as /mnt/instance_image_store_3/262768. It turns out that the number at the end of this path is, again, simply the AMI ID – decoded. For example, we can re-encode 262768 to yield ami-19a34270, which is Alestic’s Ubuntu Karmic Base image. Similar to instance IDs, the underlying image ID also seems to have different ranges in each AWS region.
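To make the re-encoding step concrete, here is a minimal sketch that applies the same XOR constants used by the script in this post to an arbitrary serial number. The helper name encode_id and its prefix argument are my own illustration, not anything EC2 exposes, and the sketch assumes the serial fits in 32 bits.

```ruby
# Hypothetical helper (illustration only): scrambles a 32-bit serial number
# into an EC2-style ID using the XOR constants from the script in this post.
def encode_id(prefix, serial)
  c1   = (serial >> 24) & 0xFF   # most significant byte
  c2   = (serial >> 16) & 0xFF   # second byte
  c3   = serial & 0xFFFF         # low 16 bits
  c3_1 = (serial >> 8) & 0xFF    # high byte of the low 16 bits
  c3_2 = serial & 0xFF           # low byte of the low 16 bits

  d1 = c1 ^ c3_2 ^ 0x69
  d2 = c2 ^ c3_1 ^ 0x40 ^ 0xe5
  d3 = c3 ^ 0x4000

  format("%s-%02x%02x%04x", prefix, d1, d2, d3)
end

# The image number from the path above.
puts encode_id("ami", 262768)  # prints ami-19a34270
```

The prefix is passed in separately because, as far as these two examples show, image IDs and instance IDs share the same scrambling and differ only in the "ami-" or "i-" prefix.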
As a bonus of Sören’s discoveries and the connection to the IDs, it’s now possible to infer your instance ID (and image ID) locally, without even consulting the EC2 instance metadata service. Why someone would prefer this to the metadata service is a good question, but it’s a fun exercise nonetheless. Here’s a Ruby script that does just that:
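#!/usr/bin/ruby
# Infers the EC2 instance ID locally from the Xen domain name in xenstore.
$stderr.puts("Detecting VM domain ID (may take a few moments)")
# Probe domain IDs until we find the first xenstore path this guest can read.
dom_id = nil
(1..65535).each do |i|
  if system("xenstore-ls /local/domain/#{i} > /dev/null 2>&1")
    dom_id = i
    break
  end
end
abort("Could not detect the VM domain ID (is xen-utils installed?)") if dom_id.nil?
$stderr.puts("VM domain ID is #{dom_id}")
# The domain name looks like dom_32900610; strip the trailing newline.
dom_name = `xenstore-read /local/domain/#{dom_id}/name`.strip
$stderr.puts("VM domain name is #{dom_name}")
# Split the serial number into its bytes.
numeric_id = dom_name.split("_").last.to_i
c1 = numeric_id >> 24
c2 = (numeric_id >> 16) & 0xFF
c3 = numeric_id & 0xFFFF
c3_1 = (numeric_id >> 8) & 0xFF
c3_2 = numeric_id & 0xFF
# Apply the XOR constants to obtain the public instance ID.
d1 = c1 ^ c3_2 ^ 0x69
d2 = c2 ^ c3_1 ^ 0x40 ^ 0xe5
d3 = c3 ^ 0x4000
instance_id = sprintf("i-%02x%02x%04x", d1, d2, d3)
puts(instance_id)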
This requires xen-utils to be installed on the machine (on Ubuntu, run apt-get install xen-utils-3.3). Here’s an example run:
# ./get_instance_id.rb
Detecting VM domain ID (may take a few moments)
VM domain ID is 1423
VM domain name is dom_32900610
i-6a554602
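Going the other way, from a public ID back to the serial number, is just the same XOR steps applied in reverse. The sketch below is an illustration under the same constants as the script above; decode_id is a hypothetical name, and the input is assumed to be a prefix followed by exactly 8 hex digits.

```ruby
# Hypothetical helper (illustration only): recovers the underlying serial
# number from an EC2-style ID by undoing the XOR scrambling.
def decode_id(resource_id)
  hex = resource_id.split("-").last
  raise ArgumentError, "expected 8 hex digits" unless hex =~ /\A\h{8}\z/

  d1 = hex[0, 2].to_i(16)
  d2 = hex[2, 2].to_i(16)
  d3 = hex[4, 4].to_i(16)

  c3   = d3 ^ 0x4000        # the low 16 bits are only XORed with a constant
  c3_1 = c3 >> 8
  c3_2 = c3 & 0xFF
  c1   = d1 ^ c3_2 ^ 0x69   # XOR is its own inverse
  c2   = d2 ^ c3_1 ^ 0x40 ^ 0xe5

  (c1 << 24) | (c2 << 16) | c3
end

# The instance ID from the example run.
puts decode_id("i-6a554602")  # prints 32900610
```

Because the low 16 bits are recovered first, the two upper bytes can then be unscrambled using them, which is why the order of the steps is the reverse of the encoding.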
Thanks once more to Sören for the great detective work.

February 4th, 2010 at 7:52 am (#)
Very cool, thanks for the update. We had noticed that reported instance launch counts in west and eu did not match our observations at all. Do we now need to do a “hunt” for the lowest instance id in those regions?
February 4th, 2010 at 10:51 am (#)
@Thorsten – indeed! Are you able to report the earliest instance IDs you saw in the other regions? I suspect we’ll find that the base is a nice round number.
Initially I ventured that they may start at 0x10000000 and 0x20000000, but those bases yielded too many instances started.
February 4th, 2010 at 11:05 am (#)
Even easier – we can estimate the starting point from the AMI IDs. Based on this, underlying eu-west-1 IDs start at 500000000 (decimal) and us-west-1 IDs at 750000000.
That yields a total of 3,415,862 instances started all-time on eu-west-1 and 1,277,975 instances started all-time at us-west-1.
February 4th, 2010 at 5:54 pm (#)
We didn’t support EU or US-WEST on the day of release, so the earliest IDs we have are a few days later. I need to do some digging but in EU one of the earliest we see is i-3edded4a and in us-west i-bb8ea3fe (sorry, I don’t have the conversion calculator handy). I’m sure others can come up with earlier instance IDs…
February 4th, 2010 at 11:45 pm (#)
Thorsten – perfect! Those are pretty close to the 500000000 and 750000000 mark (give or take a few hundred thousand), so it lines up pretty well with what the AMIs teach us.
February 4th, 2010 at 11:55 pm (#)
Several months ago, I wrote something nearly identical in bash. I’ll share it with you so that you can skip the Ruby requirement of the one provided above… Note that mine confirms the UUID as it would be possible to have permission to multiple trees in the xen-store. You likely wouldn’t see that on a public cloud, but it might be done on private deployments…
http://gist.github.com/292068
September 22nd, 2011 at 10:30 pm (#)
There’s a simpler command to get domid:
xenstore-read domid