the domain of aewens — posts tagged "statistics"

IPv32

December 05, 2018 — aewens

Some would look at the question "how many addresses would IPv32 have" and not take it seriously. On a normal day I would not either, but it was probably the mixture of absurdity of the question along with the catalyst of boredom that led me to the rabbit hole of IPv32.

Before I begin, I would like to warn you now that if you do not enjoy mathematics, this may not be for you. Also, for any statistician who may happen upon this: I am only a hobbyist mathematician, and I am well aware that extrapolating from only two data points can in no way provide reliable results, but I did it anyway. Now, with all that out of the way, let us begin!

First, to determine how many addresses would be in IPv32, it would help to know how many addresses are in the existing IP schemes, IPv4 and IPv6. For IPv4 we have 2^32 or approximately 4.3 billion addresses, and for IPv6 we have 2^128 or about 3.4 x 10^38 addresses (for those less math-inclined, that is roughly a 3 followed by 38 zeroes, aka, a lot). After staring at these numbers for a few minutes and a bit of trial and error, I came across a nifty equation that derives the same address space from the version number:

f(x) = 2^(2^(x + 1))

So for IPv4, since it's the 4th version, the function would be f(4), which is 2^(2^(4 + 1)) or 2^32. The same works for IPv6 using f(6), which is 2^(2^(6 + 1)) or 2^128. With this in mind, getting IPv32 should just be a matter of plugging 32 into my function, f(32), and I get 2^(2^(32 + 1)) or 2^(2^33) or 2^8,589,934,592. In case you misread it, that's not about 8 billion, that is 2 to the power of about 8 billion. To get some idea of exactly how big this number is, I started by plugging "2^2^33" into my calculator, and it informed me the answer was infinity. Well, that is clearly not correct, so I tried plugging the number into Wolfram|Alpha since I recalled they could do fancy stuff with other mathy things I had tried in the past. Still no love, it just sat and hung there for a while until my impatience got the best of me.
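As an aside, the pattern is easy to sanity-check in a few lines of Python. This is only a sketch, and the function name address_bits is made up for illustration. It works with the exponent (the address length in bits) rather than the address count, since 2^(2^33) is exactly the number that made the calculator give up.

```python
def address_bits(version: int) -> int:
    """Address length in bits under the post's pattern: 2^(version + 1)."""
    return 2 ** (version + 1)


for version in (4, 6, 32):
    bits = address_bits(version)
    print(f"IPv{version}: {bits:,} bits, so 2^{bits:,} addresses")

# The small cases are cheap enough to expand in full.
assert 2 ** address_bits(4) == 4_294_967_296
```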

Without any other options, I decided to make this problem easier for a computer to tackle using algebra. What I really want here is the power of 10 this number corresponds to, so I have a rough idea of how many zeroes I am dealing with (the order of magnitude). For those who remember algebra, you can use my favorite functions, logarithms, to extract the exponent from a number, and since I want the number in base 10 (the counting system where you go from 0 to 9 before you start adding other digits), the equation I started with is:

ln(2^2^33) / ln(10)

This works because of a fancy property of logarithms. The equation:

y = ln(a^x) / ln(b)

takes a^x and tells you the exponent y that gets you the same number as b^y. From our original equation, I can then simplify the function using another property of logarithms:

ln(a^x) = x * ln(a)

Making our new equation:

(2^33) * (ln(2) / ln(10))

While very similar to the previous equation, which would not run on any calculator I threw it at, this one works and comes out to approximately 2,585,827,973. So the size of the IPv32 address space is around 10^2,585,827,973, or a 1 with over 2.5 billion zeros behind it.
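For anyone who wants to check that figure without hunting for a cooperative calculator, here is the same simplification as a small Python sketch (the variable names are made up for illustration). It never builds 2^(2^33); it only multiplies the exponent by log10(2).

```python
import math

bits = 2 ** 33                        # exponent of 2 in the IPv32 address space
power_of_ten = bits * math.log10(2)   # same as bits * ln(2) / ln(10)

print(f"2^(2^33) is about 10^{power_of_ten:,.2f}")

# A number 10^p has floor(p) + 1 decimal digits.
digits = math.floor(power_of_ten) + 1
print(f"That is a number with {digits:,} decimal digits")
```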

If that number seems unimaginably large, that is because it is. Within the observable universe it is estimated that there are around 10^82 atoms (if you take the upper-bound estimate). Using IPv32, you could assign every atom in the known universe an IP address and still not have made a dent in the number of addresses at its disposal.

So now that I have the answer, I should be done here, right? Well, not exactly. Earlier, I said this was a rabbit hole, and that is because while solving this problem I started to have other questions. For starters, if I wanted to store even just one IPv32 address, how much storage space would be needed? An IPv32 address consists of 8,589,934,592 bits or 1,073,741,824 bytes (just divide by 8, since 8 bits make a byte). Since each of the remaining units of data is 1024 times the previous one, after a bit of division I came to the delightful discovery that an IPv32 address is exactly 1 gigabyte (1GB):

8589934592 / 8 / 1024 / 1024 / 1024 = 1

Additionally, like IPv4 and IPv6, there would probably be a textual representation of the IPv32 address so you can supply it to ping and see if your machine can communicate with that address. This led me to wonder what an IPv32 address would look like. Using the same encoding as IPv6, base 16, I can find its encoded length using:

(2^33) * (ln(2) / ln(16)) = 2,147,483,648

So it would take a little over 2 billion characters to encode IPv32 in base 16, but I think I can do better. Another common encoding format available today is base64 encoding which utilizes all upper and lower case letters, numbers, and a few symbols as opposed to 0-9 and a-f in base 16. This equation would be similar:

(2^33) * (ln(2) / ln(64))

This comes out to a bit over 1,431,655,765, so I'll round up to 1,431,655,766 because I do not want to try to represent part of a character. Going from base 16 to base 64 has so far shrunk the encoding by 1/3 or 33%, which is a good start, but I still feel I can do better. The other existing character set that includes even more characters than base64 is Unicode (usually encoded as UTF-8); that's right, I'm getting the emojis involved. Right now, Unicode is capable of encoding 1,112,064 (17 * 2^16 - 2048) characters, which to me (whether I am right or wrong) looks like Unicode is base 1,112,064, making my equation:

(2^33) * (ln(2) / ln(1112064))

Which is roughly 427,683,174, making this about 30% of base64 or 20% of base 16 (i.e. an 80% reduction from the encoding used by IPv6). So I'll tentatively say that IPv32 will be represented as a bit under half a billion Unicode characters.
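The three encoding lengths can be recomputed side by side with a short Python sketch. The helper name encoded_length is hypothetical, and treating Unicode as a 1,112,064-symbol alphabet is the same loose assumption made above, not how UTF-8 actually stores text. Power-of-two alphabets use integer division so the result stays exact; anything else falls back to the logarithm.

```python
import math

ADDRESS_BITS = 2 ** 33  # length of one IPv32 address in bits


def encoded_length(alphabet_size: int, bits: int = ADDRESS_BITS) -> int:
    """Characters needed to write `bits` bits using `alphabet_size` symbols."""
    if alphabet_size < 2:
        raise ValueError("an alphabet needs at least two symbols")
    if alphabet_size & (alphabet_size - 1) == 0:
        # Power-of-two alphabets carry a whole number of bits per character,
        # so integer ceiling division avoids any floating-point rounding.
        bits_per_char = alphabet_size.bit_length() - 1
        return -(-bits // bits_per_char)
    # General case: each character carries log2(alphabet_size) bits.
    return math.ceil(bits / math.log2(alphabet_size))


lengths = {
    "base 16": encoded_length(16),
    "base64": encoded_length(64),
    "Unicode as base 1,112,064": encoded_length(1_112_064),
}
for name, length in lengths.items():
    share = length / lengths["base 16"]
    print(f"{name}: {length:,} characters ({share:.0%} of base 16)")
```

Rounding up inside the helper matches the choice above not to represent part of a character.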

While interesting, I initially thought that this would be a waste, since if I wanted to host a website using IPv32 it would probably make more sense to just buy the IPv32 address that is the binary representation of my website rather than hosting it... Or would it? While most websites right now are not 1GB, the average size of websites has been steadily growing over the years (in fact, at a rate of about 573 kilobytes or 573KB per year). Right now, the average frequently visited website is approximately 3 megabytes or 3MB, so I can graph the expected size of websites over time using the function:

f(x) = 1024 * (573x + 3072)

In this function "x" is the years from today, "573x" is our rate of change, "3072" is 3MB in kilobytes (3 * 1024), and the "1024" scales the value to bytes since the 573 and 3072 are in kilobytes. This equation plots a linear graph that steadily climbs over time, in comparison to the growth of the IP address space (which earlier I got using 2^(2^(x + 1)) for a given version of IP), which is exponential. So IP addresses will definitely at some point be bigger than the average website, but exactly how far in the future would this be expected to happen?
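As a quick aside before chasing that question, the website function is simple enough to tabulate. The sketch below is just the function above written in Python, with a made-up name (website_bytes) and a handful of arbitrary sample years.

```python
def website_bytes(years_from_now: float) -> float:
    """Average website size in bytes: 3 MB today, growing 573 KB per year."""
    return 1024 * (573 * years_from_now + 3072)


for years in (0, 10, 50, 100):
    size_mb = website_bytes(years) / 1024 ** 2
    print(f"+{years:>3} years: {size_mb:6.1f} MB")
```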

There's obviously no way to truly answer this question, but given the data I have I can definitely make an educated guess! To begin answering this question, I first need to know how often a new version of IP is released so I can relate it to the function I have for the growth of websites. From a quick search I found that IPv4 was first deployed in 1983, and IPv6 was drafted in 1998 and became a standard in 2017. While I would like to go with 1998 for IPv6 because I have personally seen IPv6 used in the wild before 2017, the difference between 1998 and 1983 is 15 years, meaning we should have gotten a new IP version in 2013 (1998 + 15), which definitely did not happen. So for the sake of pretending I have a model that can predict real data, I am going to go against my personal beliefs and use 2017, making the time gap between two IP versions 34 years, or 17 years for every one IP version. Going backwards, that would theoretically place IPv0 at 1930, which will be important in a moment. With this, I now have all the information I need to construct my function for plotting the growth of IP address size over the years:

f(x) = (2^(x / 17)) / 8

The "x" here is the years passed; the "/ 8" converts bits to bytes, the "/ 17" normalizes the data so that the exponent increments by 1 after every new version of IP comes out, and the "2^" is the same base used in 2^32 for IPv4 and 2^128 for IPv6. However, I now need to adjust the website equation from earlier to start at 1930, since that's where the IP equation is technically starting. Since I don't have web page data from 1930 (for quite a few reasons), I'm just going to cheat and use the metric of websites in 2012 being about 1MB and use the 573KB yearly growth rate to emulate the size of 1930 websites:

2012 - 1930 = 82, 82 years

573 * 82 = 46986, 46,986 kilobytes

1024 - 46986 = -45962, -45,962 kilobytes for webpages in 1930

f(x) = 1024 * (573x - 45962)

Yes, this means the average website in 1930 was about -45MB (with a negative). However, that kind of makes sense if you do not think about it, which is what I chose to do here.

Now, I did not try to solve for the intersection of these two functions on paper because I could just throw them into Desmos to quickly get the intersection and visually compare the growth of the two. That intersection point occurs at about (526.4, 2.62 * 10^8), which means that about 526 years after IPv0 came out (1930), in 2456, both an IP address and the average website are about 250MB. However, since "real" IP versions are released every 34 years, that would place 2456 between the release of IPv30 and IPv32, making IPv32, in the year 2474 (1930 + 17 * 32), the first IP version where buying the IP address that is the binary representation of your website is the space-conservative option.

tags: ipv32, math, statistics, web