On Numbers: Part the First

In which we learn to count.

Okay peeps, let's talk about numbers. Specifically, integers and floats. As programmers, these are the kinds of numbers we're interested in. For the record, an integer (int) looks like this:

1

And a float looks like this:

1.0

In the real world, 1 == 1.0, but a computer doesn't see the two as the same thing, so let's take a peek at why. But before we can understand, we need to learn how to count.

Here are some numbers; you may be familiar with them.

1 2 3 4 5 6 7 8 9 10 11

For illustrative purposes, let's rewrite them.

0001 0002 0003 0004 0005 0006 0007 0008 0009 0010 0011

They're still the same numbers; we've just prepended zeros to pad them out to 4 digits. Now let's start a little thought experiment. Imagine, for a second, that we had never discovered the number 9. We, as a species, just looked at our hands, immediately relegated thumbs to second class citizens, and decided numbers should go from 1 to 8. Let's see what that would be like.

0001 0002 0003 0004 0005 0006 0007 0008 ...

If you're not bored yet, you may be a little confused as to what comes next. Remember, we've stricken the number 9 from existence. But never fear, we can still _represent_ the concept of 9 things (even though we don't have a numeral for it), in exactly the same way that we can represent 10 things even though we don't have a single numeral for 10. Et voilà:

... 0005 0006 0007 0008 0010 0011

"But you skipped 9! That's just 10. That doesn't work. Cats would have a mysterious extra life that we couldn't quantify, and three squared would be a non-existent number! Nena's seminal work about red balloons would be, at the very least, incredibly awkward." Well, yes and no. 

Think back to the simplest form of counting you know: tally marks. A little line for _one_ item, and then 4 lines and a strikethrough for 5. For simplicity, we'll represent a collection of 5 with the plus symbol. So, counting to six:

|   ||   |||   ||||   +   +|

And so 13 is ++|||, and 20 is ++++. Now, 25 is +++++, but that's getting kind of messy, so let's pretend 25 is represented by an X, and look, now we have a very simple version of Roman numerals.
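If you'd rather let a computer scratch the marks, here is a minimal Python sketch of the plus-and-stroke notation (before we introduce the X). The function name `tally` is made up for this illustration.

```python
def tally(n):
    """Write n as groups of five ('+') followed by single strokes ('|')."""
    fives, ones = divmod(n, 5)  # how many full groups of 5, and what is left over
    return "+" * fives + "|" * ones


for n in (6, 13, 20, 25):
    print(n, tally(n))
```

Running it should reproduce the examples above: 13 comes out as ++||| and 25 as the messy +++++.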

Let's throw another monkey and the requisite wrench into the works, and say that 0 is represented by a -. And further, to keep things clean, we'll divide everything into neat columns. All the single tallies are grouped together, the pluses go together, the Xs go together, and so on. So some numbers, say, 4, 12, 36, 44, 51:

4:     -    -    ||||
12:    -    ++   ||
36:    X    ++   |
44:    X    +++  ||||
51:    XX   -    |
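The column layout can be sketched in Python too. This is only an illustration: the name `columns` is hypothetical, and it assumes we stick to three columns (X, +, |), so it only covers 0 through 124.

```python
def columns(n):
    """Split n into X (25s), + (5s) and | (1s) columns, using '-' for an empty column."""
    if not 0 <= n < 125:
        raise ValueError("three columns only cover 0 through 124")
    xs, rest = divmod(n, 25)        # how many 25s fit
    pluses, ones = divmod(rest, 5)  # then how many 5s, and the leftover 1s
    pairs = (("X", xs), ("+", pluses), ("|", ones))
    return tuple(symbol * count if count else "-" for symbol, count in pairs)


for n in (4, 12, 36, 44, 51):
    print(f"{n}:", *columns(n))
```

The printed rows should line up with the table above, dashes included.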

We could keep going, but despite our best efforts, things are getting kind of hairy. What if we wanted to represent the number 125? We could do XXXXX, but if we continue in our pattern of simplification, the correct thing to do would be to create a new symbol to represent 125, maybe _. But we're running out of straight lines on our keyboard and this is becoming a mess. Like we said before, it also looks like Roman numerals, but those suck! Yeah! Down with Romans! Romani eunt domus!

Ahem. Back to the matter at hand. How can we improve our counting system? 

Let's do something tricky. In each column, we can only use 4 tallies before moving to the column on the left and leaving a - in the current one. We're going to cheat, and bring back our modern Arabic numerals, 0 through 4. Instead of drawing individual tallies, we'll count the number of marks we made, and just put our numeral down:

4:     0 0 4
12:    0 2 2
36:    1 2 1
44:    1 3 4
51:    2 0 1

KABLAMMO. That's the sound of your mind being blown. This illustrates the relation between places, numerals, and actual numbers. Let's look at the number 5. In our Zebulon Numeral system, it looks like this:  -  +  -. Translated to Arabic numerals: 0 1 0 -> 010 -> 10. WHAT. THE. EFF.
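For the skeptics, the "count the marks in each column" trick is just repeated division with remainder. Here's a rough Python sketch; `to_digits` and its `width` padding parameter are names invented for this example, not anything standard.

```python
def to_digits(n, base, width=3):
    """Return the digits of n in the given base, most significant first,
    padded with leading zeros up to `width` columns."""
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)  # remainder = marks left in this column
        digits.append(remainder)        # n = what carries over to the left
    digits.extend([0] * (width - len(digits)))  # pad the empty columns with 0
    return digits[::-1]


for n in (4, 12, 36, 44, 51, 5):
    print(f"{n}:", *to_digits(n, base=5))
```

With base=5 this should print the same rows as the table, plus 0 1 0 for the number 5.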

So now counting, on a basis of four tallies and a zero, looks like this: 1 2 3 4 10 11 12 ... and so forth. So the number '10' does not actually mean ten things. It means that, in a system where we have N numerals and a zero, we are representing exactly N + 1 items. It is essentially a tally mark in the next column over. And so here, where our basis of counting is four tallies and a zero, 10 really means 'five'.

In our previous example, where we hate our thumbs, 10 means 'nine'.
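You don't have to take my word for it. Python's built-in int() accepts a base as its second argument, so we can ask it what '10' means in each of our systems. The little helper name `how_many` is just something made up for this sketch.

```python
def how_many(numeral, base):
    """Interpret a string of digits in the given base and return the actual count."""
    return int(numeral, base)


for base, name in ((5, "five"), (9, "nine"), (10, "ten")):
    print(f"'10' in base {base} means {how_many('10', base)} ({name}) things")
```

Same two symbols every time, a different number of things depending on how many numerals the system has.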

FINE, you say, BUT WHAT IF YOU HAD 16 FINGERS? Well, just like we made up symbols to represent numerals bigger than ||||, we can make up more symbols. We could just do ... 8 9 | + X. But instead of a contrived example, I'll just reveal that modern convention uses the letters A-F to represent ten through fifteen.

1 2 3 4 5 6 7 8 9 A B C D E F whatcomesnextquickyouknowitalreadytoolateI'mjustgonnatellyou 10

Look at that. I've just, in an incredibly long and convoluted manner, taught you hexadecimal. Or, in English, sixatennish. Maybe 'base 16' is better. And we've already seen base 5 and base 9 counting systems along the way.
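As a quick sanity check, Python's built-in format() with the "X" format code writes a number in uppercase hexadecimal, so a short sketch can count from one to sixteen for us. The variable name `numerals` is just for this example.

```python
# Count from 1 to 16, writing each number in base 16.
numerals = [format(n, "X") for n in range(1, 17)]
print(" ".join(numerals))

# And going the other way: what do 'F' and '10' mean in base 16?
print(int("F", 16), int("10", 16))
```

The first line should end in ... D E F 10, and the second should confirm that F is fifteen and 10 is sixteen.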

So. Let's take this all the way in the other direction. Let's say we're a little bit slow, and we only ever learned two numerals: '0' and '1'. But somehow we're still smart enough to count to numbers greater than one. What does that look like?

1 10 11 100 101 110 111 1000 ...

Recognize that? That there is binary, sonny. I remember back in ought six, well, we didn't have the numeral six back then, so it was ought one one ought...
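To watch the columns roll over one tally at a time, here is a hedged little odometer sketch in Python. The names `increment` and `count` are hypothetical, and since it prints each column as a plain digit it only makes sense for bases up to ten.

```python
def increment(digits, base=2):
    """Add one tally to the rightmost column, carrying left whenever a column fills up."""
    digits = list(digits)  # work on a copy
    i = len(digits) - 1
    while i >= 0:
        digits[i] += 1
        if digits[i] < base:
            return digits
        digits[i] = 0      # column is full: leave a zero and carry to the left
        i -= 1
    return [1] + digits    # ran out of columns: open a new one on the left


def count(limit, base=2):
    """Return the numerals for 0 through limit, written in the given base."""
    digits = [0]
    numerals = []
    for _ in range(limit + 1):
        numerals.append("".join(str(d) for d in digits))
        digits = increment(digits, base)
    return numerals


print(" ".join(count(8)))
```

Swap in base=5 or base=9 and the same loop gives you the thumbless and tally-mark systems from earlier.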

Tune in next time when we link binary to hexawhatsitall, then look at what that means for integers on computers.

1 response
(this is Zach from Rob's lab)

This explanation is much better than the high school one where they just define binary numbers as 2^(n) + 2^(n-1) + ... + 2^1 + 2^0.

Have you shown this to anyone who doesn't know anything about the topic? I really like it, but I also knew where you were heading from the beginning. I really liked the foreign key post (which I was less familiar with, aside from knowing what a foreign key was) as well, but Rob said that he had a little trouble following it.

My guess is that it stems from your use of so many analogies. I always think like this, so I love this kind of explanation, but I could see someone becoming totally lost if they can't make the connection early on.