Bits is Bits

So I have to help with a Drive By about math, which involves treading through territory covered last year by the ultra-smart Elie Zananiri.

His take on the subject, along with excellent notes and Processing examples, remains available on the world wide web.

I’d like to take a slightly different tack and give a little background on how computers think about numbers, where abstract numbers end and representational data begins, and how to manipulate this threshold to your advantage.

Binary

This is kind of a basic place to start, but it can be central to preserving your sanity when working on more complicated problems. Everything on your computer can be reduced to bits.

“Bit” is short for “binary digit”; it’s the atom of computation. Alone, a bit can represent only two possible numbers, because there are only two possible states for a single bit: 0 or 1.

To represent larger numbers, we have to stick bits together into longer sequences. Usually, the smallest sequence we’ll work with is 8 bits, or one byte.

Sketch illustrating calculation of the value of a single byte
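
Since the sketch itself doesn't come through in text, here is a rough plain-Java version of the same calculation: each bit contributes its place value (128, 64, 32, ... 1) if it is set. The class and method names (ByteValue, valueOf) are made up for illustration.

```java
class ByteValue {
  // Sum the place values of a bit string, most significant bit first.
  static int valueOf(String bits) {
    int value = 0;
    for (int i = 0; i < bits.length(); i++) {
      int bit = bits.charAt(i) - '0';      // '0' or '1' becomes 0 or 1
      int place = bits.length() - 1 - i;   // the leftmost bit has the largest place
      value += bit << place;               // bit * 2^place
    }
    return value;
  }

  public static void main(String[] args) {
    System.out.println(valueOf("00000101")); // 4 + 1 = 5
    System.out.println(valueOf("11111111")); // 128 + 64 + ... + 1 = 255
  }
}
```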

Signed vs. Unsigned

You may have noticed the signed / unsigned keywords stuck before variable type names. These indicate whether the first bit in the sequence counts toward the number’s value (unsigned) or instead determines whether the number is positive or negative (signed). A 0 in the first bit of a signed value means the number is zero or positive; a 1 means it’s negative.

Sketch illustrating representational limits of a single byte
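
To make the difference concrete, here is a small plain-Java sketch that reads the same eight bits both ways. It leans on the fact that Java's byte type is signed and stored in two's complement, so a leading 1 flips the value negative rather than just tacking on a minus sign. The helper names (asUnsigned, asSigned) are hypothetical.

```java
class SignedByte {
  // Read eight bits as an unsigned value (0 to 255).
  static int asUnsigned(String bits) {
    return Integer.parseInt(bits, 2);
  }

  // Read the same eight bits as a signed value (-128 to 127).
  // The cast keeps only the low 8 bits and reinterprets them as two's complement.
  static int asSigned(String bits) {
    return (byte) Integer.parseInt(bits, 2);
  }

  public static void main(String[] args) {
    System.out.println(asUnsigned("11111111") + " vs " + asSigned("11111111")); // 255 vs -1
    System.out.println(asUnsigned("10000000") + " vs " + asSigned("10000000")); // 128 vs -128
    System.out.println(asUnsigned("01111111") + " vs " + asSigned("01111111")); // 127 vs 127
  }
}
```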

But beware, an int isn’t always an int: the number of bytes that make up a particular piece of data is contingent on the particular implementation of the programming language you’re using. In Processing, for example, the integer is 32 bits of data, or 4 bytes. It’s also signed, which means it can represent values from -2,147,483,648 to 2,147,483,647. In the Arduino environment, however, an int is just 16 bits, or two bytes (on the classic AVR-based boards, at least), which means you’re limited to representing values between -32,768 and 32,767. A quick way to find the largest value you can represent is the formula 2^(bit length) - 1 for unsigned types, or 2^(bit length - 1) - 1 for signed types. The number of distinct values is 2^(bit length) either way.
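
As a sanity check on those limits, here is a minimal plain-Java sketch that computes them from a bit length. The method names are invented for this example, and because it does its math in a long it is only meant for bit lengths up to 32.

```java
class IntRange {
  // Largest value of an unsigned type: 2^bits - 1.
  static long maxUnsigned(int bits) {
    return (1L << bits) - 1;
  }

  // Largest value of a signed type: 2^(bits - 1) - 1.
  static long maxSigned(int bits) {
    return (1L << (bits - 1)) - 1;
  }

  // Smallest value of a signed type: -2^(bits - 1).
  static long minSigned(int bits) {
    return -(1L << (bits - 1));
  }

  public static void main(String[] args) {
    System.out.println(minSigned(16) + " to " + maxSigned(16)); // -32768 to 32767
    System.out.println(minSigned(32) + " to " + maxSigned(32)); // -2147483648 to 2147483647
    System.out.println(maxUnsigned(8));                         // 255
  }
}
```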

In Processing, the binary() and unbinary() functions are extremely handy for seeing the bits behind the data, and vice versa.
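
For a feel of that round trip in plain Java, here is a rough sketch. The toBits and fromBits helpers are hypothetical stand-ins, not Processing functions, and their output formatting is not guaranteed to match binary() exactly.

```java
class BitStrings {
  // Write the lowest `digits` bits of a value as a string of 0s and 1s.
  static String toBits(int value, int digits) {
    StringBuilder sb = new StringBuilder();
    for (int place = digits - 1; place >= 0; place--) {
      sb.append((value >> place) & 1); // pick out one bit at a time, left to right
    }
    return sb.toString();
  }

  // Parse a string of 0s and 1s back into a number (base 2).
  static int fromBits(String bits) {
    return Integer.parseInt(bits, 2);
  }

  public static void main(String[] args) {
    String bits = toBits(5, 8);
    System.out.println(bits);           // 00000101
    System.out.println(fromBits(bits)); // 5
  }
}
```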

Low resolution bitmap illustration of a spray can transcribed to a binary grid

If everything on your computer is just bits, then how do we decide what’s an image, what’s a sound file, what’s a text file, etc.?

Basically, these determinations are just oppressive dictums sent down from on-high by your operating system and a cabal of applications. You don’t have to subscribe to these narrow notions… nothing is stopping you from listening to images or reinterpreting music into text.

Cue Processing Demos:

bits_is_bits_cam_to_sound.pde
bits_is_bits_sound_to_image.pde
bits_is_bits_text_to_sound.pde
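
The demo sketches aren't reproduced here, so what follows is not their code, just a minimal plain-Java sketch of the underlying idea: take the bytes of some text and read them once as pixel brightness and once as audio samples. The scaling choices (0 to 255 gray, unsigned 8-bit samples centered on 128) are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;

class Reinterpret {
  // Treat each byte of the text as an 8-bit brightness (0 = black, 255 = white).
  static int[] textToGray(String text) {
    byte[] raw = text.getBytes(StandardCharsets.US_ASCII);
    int[] gray = new int[raw.length];
    for (int i = 0; i < raw.length; i++) {
      gray[i] = raw[i] & 0xFF; // mask so the byte reads as unsigned
    }
    return gray;
  }

  // Treat the same bytes as unsigned 8-bit audio samples, scaled to -1.0 .. 1.0.
  static float[] textToSamples(String text) {
    byte[] raw = text.getBytes(StandardCharsets.US_ASCII);
    float[] samples = new float[raw.length];
    for (int i = 0; i < raw.length; i++) {
      samples[i] = ((raw[i] & 0xFF) - 128) / 128f; // 128 is silence
    }
    return samples;
  }

  public static void main(String[] args) {
    for (int g : textToGray("Bits")) System.out.print(g + " ");      // 66 105 116 115
    System.out.println();
    for (float s : textToSamples("Bits")) System.out.print(s + " ");
    System.out.println();
  }
}
```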

Really Big Numbers

In Processing, the largest documented integer type is a “long”, which is 64 bits of signed data, giving you a maximum of 9,223,372,036,854,775,807. Lame. So what happens when you need to work with bigger numbers?

An iPhone calculator displaying the word "error" in its digits field
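
Java (and so Processing) doesn't show you an error the way the calculator does. As a small sketch of what happens instead: ordinary long addition silently wraps around to the most negative value, while Math.addExact throws if you ask it to check. The helper names here are made up.

```java
class LongOverflow {
  // Plain addition: wraps around silently when the result doesn't fit in 64 bits.
  static long addWrapping(long a, long b) {
    return a + b;
  }

  // Checked addition: reports whether the true sum would not fit in a long.
  static boolean overflows(long a, long b) {
    try {
      Math.addExact(a, b);
      return false;
    } catch (ArithmeticException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(addWrapping(Long.MAX_VALUE, 1)); // -9223372036854775808
    System.out.println(overflows(Long.MAX_VALUE, 1));   // true
    System.out.println(overflows(1, 1));                // false
  }
}
```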

The way around these limitations is to use something called arbitrary precision arithmetic. Arbitrary precision arithmetic basically uses byte arrays of indeterminate (and conceptually infinite) length to allow you to do math with incredibly large numbers. These calculations are going to be much, much slower than using your programming language’s native data types, but when you need to work with really big numbers it’s a handy thing to have around.
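
To take some of the mystery out of it, here is a toy version of the idea in plain Java: grade-school addition with a carry, one digit at a time, over inputs of any length. It is a simplification for illustration only. It works on decimal digit strings rather than byte arrays and handles only non-negative numbers; real libraries use much larger binary chunks and cleverer algorithms.

```java
class DigitAdder {
  // Add two non-negative decimal numbers given as digit strings of any length.
  static String add(String a, String b) {
    StringBuilder result = new StringBuilder();
    int i = a.length() - 1;
    int j = b.length() - 1;
    int carry = 0;
    // Walk both numbers from the rightmost digit, carrying as you would on paper.
    while (i >= 0 || j >= 0 || carry > 0) {
      int sum = carry;
      if (i >= 0) sum += a.charAt(i--) - '0';
      if (j >= 0) sum += b.charAt(j--) - '0';
      result.append(sum % 10); // digit for this column
      carry = sum / 10;        // carry into the next column
    }
    return result.reverse().toString();
  }

  public static void main(String[] args) {
    System.out.println(add("999", "1")); // 1000
    // Twice the largest long: 18446744073709551614
    System.out.println(add("9223372036854775807", "9223372036854775807"));
  }
}
```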

The gold standard open-source option is the GNU MP Bignum Library. There’s also a big number library built into Java (and therefore Processing); you just need to explicitly import it and deal with its weird syntax. If you need one for ActionScript 3, talk to me. (Links pending!)

Here’s a quick example of breaking the 9 quintillion barrier in Processing:

import java.math.BigInteger;

void setup() {
  // Both operands are 2^63 - 1, the largest value a long can hold
  BigInteger a = new BigInteger("9223372036854775807");
  BigInteger b = new BigInteger("9223372036854775807");

  BigInteger sum = a.add(b);

  // Decimal representation
  println(sum);

  // Binary representation
  println(sum.toString(2));
}

So why bother with this? If you can manipulate bit strings of arbitrary length, you can start to do math with pieces of data large enough to have representational value.

Cue another Processing demo: bits_is_bits_every_image.pde
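
That demo isn't reproduced here, and the following is a guess at the spirit of it rather than its actual code. The sketch treats a tiny 1-bit image (pixels written row by row as 0s and 1s) as one big number, so that adding 1 steps to the "next" image. With BigInteger the same trick works however many pixels there are. All names are hypothetical.

```java
import java.math.BigInteger;

class EveryImage {
  // Interpret the pixel bits as one non-negative number.
  static BigInteger toNumber(String pixels) {
    return new BigInteger(pixels, 2);
  }

  // Turn a number back into a pixel string, zero-padded to `count` pixels.
  static String toPixels(BigInteger n, int count) {
    String bits = n.toString(2);
    StringBuilder sb = new StringBuilder();
    for (int k = bits.length(); k < count; k++) {
      sb.append('0');
    }
    return sb.append(bits).toString();
  }

  // Step to the next image by adding one, wrapping around after the all-on image.
  static String next(String pixels) {
    BigInteger total = BigInteger.ONE.shiftLeft(pixels.length()); // 2^(pixel count) images
    BigInteger n = toNumber(pixels).add(BigInteger.ONE).mod(total);
    return toPixels(n, pixels.length());
  }

  public static void main(String[] args) {
    System.out.println(next("0000")); // 0001
    System.out.println(next("0111")); // 1000
    System.out.println(next("1111")); // 0000
  }
}
```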

Gratuitous Tips

You can do basic math in the Mac’s Spotlight search bar; command-space brings it up quickly.
The built-in Mac calculator has a programmer mode that shows you different representations of the same number. Pull up the calculator and press command-3 to switch into programmer mode.
Wolfram Alpha is super handy for doing arbitrary-precision calculations, balancing equations, graphing stuff, etc.
Hex Fiend is a great, free hex editor for the Mac that makes it easy to pry into the numbers behind any file on your computer.