
Revisiting EC2 Instance IDs

February 4th, 2010  |  Published in Analysis  |  11 Comments

Back in September, I published “Anatomy of an EC2 Resource ID”, in which I pointed out some curious patterns in EC2’s ID scheme and proposed a method of “decoding” these patterns to reveal an underlying serial number. In that post I was careful to write that “while the patterns are indisputable, there remain unknowns and quirks that remind us that such ‘black box’ observation has its limits”.

This week, the black box became a little bit whiter.

Sören Bleikertz, a computer science student writing his Master’s thesis on EC2 security, poked into the Xen hypervisor used by EC2 and made some observations regarding EC2’s underlying architecture. Among his findings on the storage and networking configurations, Sören pointed out that each instance is given a unique name (the “Xen domain”) such as dom_32504936, and that this name seems to behave like a serial number, growing from day to day. Sound familiar yet?

Well, it turns out that this Xen domain is none other than the underlying instance ID uncovered in my previous research! This revelation gives us an important conclusion: the decoding method was accurate. The serial number exists and, based on everyone’s input, we even got the formula right.

With Sören’s technique at hand, we can now uncover the constants needed for all EC2 regions. Until now we only had enough data to extract them for us-east-1, which enjoyed a 3-year history thanks to RightScale. Surprisingly, it turns out that the constants are in fact identical for all regions. What threw us off the scent is that, unlike us-east-1, which very likely started its serial numbers from zero, the other regions did not. For example, the serial numbers for the 3-month-old us-west-1 region are already in the range of 752 million. Those for eu-west-1 are in the 500 million range. We can safely assume that hundreds of millions of instances have not in fact been spun up. What makes more sense is that each region was assigned a different starting point in order to ensure globally unique instance IDs.
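To make the arithmetic concrete, here is a rough sketch of how a region's launch count could be estimated from an observed serial number. The regional bases are an assumption: they are the round decimal figures suggested in the comments below, not confirmed values, and the names ASSUMED_REGION_BASE and estimated_launches are made up for this illustration.

```ruby
# Hypothetical regional starting points, taken from the round decimal
# figures suggested in the comments. These are estimates, not confirmed.
ASSUMED_REGION_BASE = {
  "us-east-1" => 0,
  "eu-west-1" => 500_000_000,
  "us-west-1" => 750_000_000
}.freeze

# Estimate how many instances a region has launched so far, given a
# recently observed serial (the numeric part of the Xen domain name).
def estimated_launches(region, serial)
  base = ASSUMED_REGION_BASE.fetch(region) # raises KeyError for unknown regions
  serial - base
end

puts estimated_launches("us-west-1", 752_000_000)
```

The estimate is only as good as the assumed base, so it is best read as a ballpark figure.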

An additional finding of Sören’s is that the image file for the root disk points to a filename on the VM host such as /mnt/instance_image_store_3/262768. It turns out that the number at the end of this path is, again, simply the AMI ID, decoded. For example, we can re-encode 262768 to yield ami-19a34270, which is Alestic’s Ubuntu Karmic Base image. As with instance IDs, the underlying image ID also seems to have different ranges in each AWS region.
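The re-encoding can be sketched as a small Ruby function. It uses the same XOR constants as the instance ID script later in this post; the function name encode_ec2_id is made up for this example, and the sketch assumes the serial fits in 32 bits.

```ruby
# Re-encode an underlying serial number into its public EC2 ID.
# The XOR constants are the ones used by the script in this post.
def encode_ec2_id(prefix, numeric_id)
  # Split the 32-bit serial into its component bytes.
  c1   = (numeric_id >> 24) & 0xFF
  c2   = (numeric_id >> 16) & 0xFF
  c3   = numeric_id & 0xFFFF
  c3_1 = (numeric_id >> 8) & 0xFF
  c3_2 = numeric_id & 0xFF

  # Scramble each part with the fixed constants.
  d1 = c1 ^ c3_2 ^ 0x69
  d2 = c2 ^ c3_1 ^ 0x40 ^ 0xe5
  d3 = c3 ^ 0x4000

  format("%s-%02x%02x%04x", prefix, d1, d2, d3)
end

puts encode_ec2_id("ami", 262768) # prints ami-19a34270
```

Passing "i" as the prefix gives the instance ID form instead.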

As a bonus of Sören’s discoveries and the connection to the IDs, it’s now possible to infer your instance ID (and image ID) locally, without even consulting the EC2 instance metadata service. Why someone would prefer this to querying the metadata is a good question, but it’s a fun exercise nonetheless. Here’s a Ruby script that does just that:

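#!/usr/bin/ruby
# Infers the EC2 instance ID from the Xen domain name.
$stderr.puts("Detecting VM domain ID (may take a few moments)")
dom_id = nil
# Probe domain IDs until xenstore-ls succeeds for one of them.
(1..65535).each do |i|
  if system("xenstore-ls /local/domain/#{i} > /dev/null 2>&1")
    dom_id = i
    break
  end
end
abort("Could not detect the VM domain ID") if dom_id.nil?

$stderr.puts("VM domain ID is #{dom_id}")

# Drop the trailing newline from the command output.
dom_name = `xenstore-read /local/domain/#{dom_id}/name`.strip
$stderr.puts("VM domain name is #{dom_name}")

# The name looks like dom_32900610; the number is the underlying serial.
numeric_id = dom_name.split("_").last.to_i
# Split the serial into its component bytes.
c1 = numeric_id >> 24
c2 = (numeric_id >> 16) & 0xFF
c3 = numeric_id & 0xFFFF
c3_1 = (numeric_id >> 8) & 0xFF
c3_2 = numeric_id & 0xFF

# Apply the XOR constants to obtain the public instance ID.
d1 = c1 ^ c3_2 ^ 0x69
d2 = c2 ^ c3_1 ^ 0x40 ^ 0xe5
d3 = c3 ^ 0x4000

instance_id = sprintf("i-%02x%02x%04x", d1, d2, d3)
puts(instance_id)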

This requires xen-utils to be installed on the machine (on Ubuntu, run apt-get install xen-utils-3.3). Here’s an example run:

# ./get_instance_id.rb
Detecting VM domain ID (may take a few moments)
VM domain ID is 1423
VM domain name is dom_32900610
i-6a554602
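Going the other way is just as simple, since XOR is its own inverse. The following is a minimal sketch that recovers the underlying serial from a public ID, assuming the same constants as the script above; the function name decode_ec2_id is made up for this example.

```ruby
# Recover the underlying serial number from a public EC2 ID such as
# i-6a554602 or ami-19a34270, by reversing the XOR steps of the script.
def decode_ec2_id(id)
  hex = id.split("-").last
  raise ArgumentError, "expected 8 hex digits in #{id}" unless hex =~ /\A\h{8}\z/

  d1 = hex[0, 2].to_i(16)
  d2 = hex[2, 2].to_i(16)
  d3 = hex[4, 4].to_i(16)

  # The low 16 bits are only XORed with a constant, so recover them first.
  c3   = d3 ^ 0x4000
  c3_1 = (c3 >> 8) & 0xFF
  c3_2 = c3 & 0xFF

  # The two high bytes depend on the low bytes recovered above.
  c1 = d1 ^ c3_2 ^ 0x69
  c2 = d2 ^ c3_1 ^ 0x40 ^ 0xe5

  (c1 << 24) | (c2 << 16) | c3
end

puts decode_ec2_id("i-6a554602") # prints 32900610
```

The result matches the dom_32900610 domain name from the example run.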

Thanks once more to Sören for the great detective work.


Responses

  1. On Amazon EC2’s Underlying Architecture « すでにそこにある雲 says:

    February 4th, 2010 at 5:13 am (#)

    [...] Revisiting EC2 Instance IDs :: Jack of all Clouds :: Guy Rosen on Cloud Computing [...]

  2. Thorsten, CTO RightScale says:

    February 4th, 2010 at 7:52 am (#)

    Very cool, thanks for the update. We had noticed that reported instance launch counts in west and eu did not match our observations at all. Do we now need to do a “hunt” for the lowest instance id in those regions?

  3. Guy Rosen says:

    February 4th, 2010 at 10:51 am (#)

    @Thorsten – indeed! Are you able to report the earliest instance IDs you saw in the other regions? I suspect we’ll find that the base is a nice round number.

    Initially I ventured that they may start at 0x10000000 and 0x20000000, but those bases yielded too many instances started.

  4. Guy Rosen says:

    February 4th, 2010 at 11:05 am (#)

    Even easier – we can estimate the starting point from the AMI IDs. Based on this, underlying eu-west-1 IDs start at 500000000 (decimal) and us-west-1 IDs at 750000000.

    That yields a total of 3,415,862 instances started all-time on eu-west-1 and 1,277,975 instances started all-time at us-west-1.

  5. Thorsten, CTO RightScale says:

    February 4th, 2010 at 5:54 pm (#)

    We didn’t support EU or US-WEST on the day of release, so the earliest IDs we have are a few days later. I need to do some digging but in EU one of the earliest we see is i-3edded4a and in us-west i-bb8ea3fe (sorry, I don’t have the conversion calculator handy). I’m sure others can come up with earlier instance IDs…

  6. Revisiting EC2 Instance IDs | uncompiled.com says:

    February 4th, 2010 at 10:04 pm (#)

    [...] Source [...]

  7. Guy Rosen says:

    February 4th, 2010 at 11:45 pm (#)

    Thorsten – perfect! Those are pretty close to the 500000000 and 750000000 mark (give or take a few hundred thousand), so it lines up pretty well with what the AMIs teach us.

  8. Eric Windisch says:

    February 4th, 2010 at 11:55 pm (#)

    Several months ago, I wrote something nearly identical in bash. I’ll share it with you so that you can skip the Ruby requirement of the one provided above… Note that mine confirms the UUID as it would be possible to have permission to multiple trees in the xen-store. You likely wouldn’t see that on a public cloud, but it might be done on private deployments…

    http://gist.github.com/292068

  9. Mehr Amazon EC2 Interna | Server in den Wolken says:

    April 27th, 2010 at 10:30 pm (#)

    [...] Revisiting EC2 Instance IDs [...]

  10. Viktors Rotanovs says:

    September 22nd, 2011 at 10:30 pm (#)

    There’s a simpler command to get domid:

    xenstore-read domid

  11. cloudpack Night #2 B) « すでにそこにある雲 says:

    March 23rd, 2012 at 5:54 pm (#)

    [...] 2010-02-04 : Revisiting EC2 Instance IDs :: Jack of all Clouds :: Guy Rosen on Cloud Computing [...]

About Guy Rosen
Guy Rosen is Co-Founder & CEO of Onavo by day, and a cloud computing blogger by night. This blog shares his cloud market research and commentary.


© 2013 Guy Rosen