Friday 19 August 2011

What is information?

So, apparently we live in the information age. But I have never found a satisfactory answer to the question: What is information? I think it is fine to not know at this time. If you went back in time to the iron age (for example) and asked the people there, "what is iron?" they would not hesitate in showing you lots of examples of iron and how they manipulate it. Now we feel we have a better understanding of iron than the people did in the iron age. We know about chemistry and stuff and how atomic properties determine the macroscopic characteristics of iron.

Similarly if you ask about information now, people are very quick to start talking about binary data and the internet and stuff. But there is a lot more to it than that.

\begin{interlude}But first, an example rant on the topic of information in the modern world: Advances in data technology have devalued music. That is not to say that modern music is any less good than it has been in the past. But rather, that as a currency it is worth less. Record companies made shedloads promoting their artists and we all bought their records and the artists enjoyed wealth and fame. But now we can (in principle) all enjoy their music for very little money, since we don't have to produce physical discs any more. I can just re-arrange a few gazillion particles inside my ATX tower (or my phone) and be listening to a hardcore trance mash-up before you can say, "Amy Winehouse". No self-respecting group of greedy music industry CEOs wants you to stop giving them money (they wear cowboy hats you know) so we had all that kerfuffle about piracy and stuff. But surely it is an unsustainable situation. We can't support the cocaine habits of all these people when information (binary data) is so cheap and easy to copy and transport. I suspect in the long run it will be a good thing for music; it will become artistic again, and have less of a mass-produced factory feel to it.\end{interlude}

If you look in a dictionary to find out what information is, you will quickly find there are two separate concepts which share the same word, just to keep things interesting. One is the binary data we all know and love, but there is the unrelated concept of knowledge and meaning. So you can say we live in an age of manipulating binary data, where we can store it and move it in great quantities, but I think we are still struggling to work out what information is. That is a deep philosophical problem.

I wanted to share a link to a factoid I acquired at some stage: the amount of data transmitted over the internet is about 1 exabyte a month. However, when looking for said factoid I immediately ran into the problem that has been getting me riled up and that motivated this post. And that is that you can't measure some things in bytes. Instead of finding what I was looking for, I found a paragraph about how much space it would take to store all the words ever spoken by people, ever. Estimates range from 5 exabytes to 42 zettabytes, depending on whether you store it as text or as a digitised sound recording. But what are you actually storing? If you record it as text then you are losing a lot of hesitations and inflections which surely contribute to the intended meaning. And even if you record all the sounds, you lose gestures and expressions.

Sure, you can take all the words ever spoken, and digitize them somehow so that it takes up lots of bytes. You can even undo the process and recover large parts of the intended meaning. But I think it is impossible to do that without losing some of the intended meaning. And if the process is not completely reversible, what have you got stored in your bytes?

Check out this link, it makes me sad. These are the kind of people who tell you how many bytes it takes to store a person. I really don't see how you can do that when we are still struggling to understand what a person is, with unresolved questions like the mind-body dichotomy. I for one believe that I am not simply the product of electrical impulses in my brain – that I do not exist inside my cranium.

Digital information seems to be stored in specific locations. By this I mean that if you opened up your phone or your hard drive and looked at it hard enough, it would be possible to say "this bit of information is stored in this physical location". I expect there are technical reasons why the preceding sentences aren't entirely true, but I am sure the premise is sound. On the other hand, I know all the words to "De Colores" and I strongly believe that if you were to dissect my brain you could not say that the first instance of the word "colours" was contained in any specific location. Moreover, I believe it would be possible to remove any individual portion of my brain without affecting my memory in the slightest. However, like all the best theories, this is completely untestable. You could never be sure you hadn't just removed a part of my brain that has nothing to do with memory.

Even if the things I know are somehow contained in my head (which I am prepared to accept is not the case), then I feel it is likely that each quantum of knowledge is equally distributed over a wide region.

Is information even quantizable? Computer says, "no". It seems nobody has thought to ask this question before. We try to quantize everything we can (my favourite is the phonon) and the world is making lots of money out of binary data, so why stop to ask this question?

The concept of steganography is quite interesting and not completely unrelated to what I am trying to say in my post. It is the "art and science" of hiding one message inside another. People get all mathematical about it, looking for redundant bytes inside a file, working out how much extra information you can hide in there (measured in bytes of course) and worrying about the statistical likelihood of different patterns occurring and stuff. Which is great, I like mathematics and shit.
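For the curious, here is a rough sketch of that mathematical flavour of steganography. It is not anybody's official method, just the textbook "least significant bit" trick: the last bit of each byte in a cover file is treated as redundant and overwritten with one bit of the secret message. The function names `hide` and `reveal`, and the made-up cover data, are my own assumptions for illustration.

```python
def hide(cover: bytes, message: bytes) -> bytes:
    """Overwrite the least significant bit of each cover byte with one message bit."""
    # Unpack the message into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small: it holds only len(cover) // 8 bytes")
    stego = bytearray(cover)
    for position, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[position] = (stego[position] & 0xFE) | bit
    return bytes(stego)


def reveal(stego: bytes, length: int) -> bytes:
    """Read back `length` bytes from the least significant bits of `stego`."""
    message = bytearray()
    for index in range(length):
        value = 0
        for carrier in stego[index * 8:(index + 1) * 8]:
            value = (value << 1) | (carrier & 1)
        message.append(value)
    return bytes(message)


# Hypothetical cover data standing in for, say, raw image pixels.
cover = bytes(range(256))
secret = b"hi"
stego = hide(cover, secret)
assert reveal(stego, len(secret)) == secret
```

Each cover byte changes by at most 1, and the capacity comes out as one hidden byte per eight cover bytes, which is exactly the sort of thing people like to measure.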

But surely there is another way to do it. Surely it is possible for somebody to say something, but given the correct context and background knowledge it can mean something quite different. I can't deny that the first type of steganography appears in my blog, but I think there is more of this second type.

Check out this topical story. It seems everybody is at it, although 8 billion-to-one is surely an over-estimate. Like how many words (or paragraphs) start with a K? A quick glance through my post reveals zero. And people always do rubbish maths when producing statistics for popular consumption. Sure, the chance of 7 random letters spelling that particular word is 1 in 26^7 (roughly 8 billion), but what about all the other combinations of 7 letters which could be considered "meaningful"?
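The arithmetic behind the 8 billion-to-one figure is easy to check, and so is my complaint about it. The sketch below assumes letters are drawn uniformly from a 26-letter alphabet (which real writing certainly isn't), and the count of "meaningful" seven-letter strings is a number I have made up purely to show how quickly the odds shrink.

```python
ALPHABET_SIZE = 26
WORD_LENGTH = 7

# Number of distinct 7-letter strings: 26^7.
total_strings = ALPHABET_SIZE ** WORD_LENGTH
print(total_strings)  # 8031810176, i.e. about 8 billion


def odds_one_in(meaningful_strings: int) -> float:
    """Return N such that the chance of hitting any meaningful string is 1 in N."""
    return total_strings / meaningful_strings


# One specific word: the headline figure.
print(odds_one_in(1))

# HYPOTHETICAL count of strings a reader might accept as "meaningful".
print(odds_one_in(10_000))
```

With ten thousand acceptable strings the odds drop from about 8 billion-to-one to about 800 thousand-to-one, and that is before you account for how many places in a document you are allowed to go looking.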

Final thoughts

Perhaps this image sums up the point I am trying to make, and like my favourite book (and potentially my blog) can be interpreted on several levels. Is it a sequence of 71552 bits of binary data? Maybe it is a picture of an American themed race-track? Maybe it is the numeral zero? Can we attach meanings to any of these interpretations, and how many bytes of information are you actually gaining when you look at it? I like to think that should the right person receive a tiny (in terms of bytes) message then they could attach to it vast amounts of information.

