John Sokol's Blog: September 2010

Tuesday, September 28, 2010

Data Storage in Permutations

Much of this is very similar to computing odds and statistics, but from the viewpoint of information theory rather than probability. In game theory and odds we are concerned with the frequency of certain outcomes, but here we want to know how much information is in an outcome, how much data it takes to store these outcomes, and, in reverse, how much data can be stored in them.

How much data in a Pair of Dice?

Look at the structure of a six-sided die: how much data is in each roll? The outcome of a roll is a single number between 1 and 6.

In binary:
   2 Bits is a number from 0 to 3.
   3 Bits is a number from 0 to 7.

Add one to make these 1-4 and 1-8. A 1-6 outcome is more than 2 bits of data but less than 3. Stored on its own, there is no good way to deal with this; most people would just round up to 3 bits.

Using the logarithm function we can determine how many bits are needed to store all the possible outcomes.

Bits = log(6) / log(2); // This is equivalent to log(Base 2)
Bits = 2.58496

2.58496 bits falls between 2 and 3, as expected, but it is not a whole number.

I found I could describe this as fractional bits. In reality there is no way to subdivide an on/off, 1-or-0 bit, so when actually storing data we are forced to round up. But we can add up these fractional bits across several values to fill whole bits.

A pair of dice has 36 possible outcomes, numbered 1 to 36 by ((Die1 - 1) * 6) + Die2.
If we take the number of bits per die and multiply by 2 we get 5.17 bits.

For 3 rolls we get a 1 to 216 outcome, which is 7.754 bits. This rounds up to 8 bits, so we have already saved 1 bit compared with rounding each roll up to 3 bits and storing 3 x 3 = 9 bits.

This is just what a type of compression called arithmetic coding does.

Another way of thinking about it: treat each die roll as producing one digit of a base 6 number, and then convert the result to base 2.

Reversing this to store data in dice rolls, a single roll has to round down to 2 bits. But with 3 rolls we gain a bit and have 7 bits instead of 6.
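The base 6 view can be sketched in a few lines of Python. This is only an illustration, not code from the original post; the names pack_rolls, unpack_rolls and bits_needed are made up for the example.

```python
def pack_rolls(rolls):
    """Treat each die roll (1-6) as one base-6 digit and return a single integer."""
    value = 0
    for roll in rolls:
        if not 1 <= roll <= 6:
            raise ValueError("each roll must be between 1 and 6")
        value = value * 6 + (roll - 1)
    return value


def unpack_rolls(value, count):
    """Reverse pack_rolls: recover `count` die rolls from the packed integer."""
    rolls = []
    for _ in range(count):
        value, digit = divmod(value, 6)
        rolls.append(digit + 1)
    return rolls[::-1]


def bits_needed(count):
    """Whole bits needed to store `count` rolls packed together."""
    return (6 ** count - 1).bit_length()


rolls = [3, 6, 1]
packed = pack_rolls(rolls)
print(packed, unpack_rolls(packed, 3), bits_needed(3))  # 102 [3, 6, 1] 8
```

Packing the three rolls together needs 8 bits, where storing each roll separately at 3 bits would need 9.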

Logarithm and exponent math
Taking the log base 2 of a number extracts the exponent that 2 must be raised to in order to give that number.
Dividing log(n) by log(2) converts the log function to base 2.

log(16) / log(2) = 4 bits
log(32) / log(2) = 5 bits.

For powers of 2 this is very straightforward.

Bits = log(6) / log(2); // This is equivalent to log(Base 2)
Bits = 2.58496

This is equivalent to saying that 2 to the power 2.58496 equals 6. But raising something to a non-whole-number power is difficult to do directly.

This can be done using the exponent function.

    exp( log(N) ) = N
    log( exp(N) ) = N

Logarithm and exponent are inverse functions, so each undoes the other.

X^Y can be done using exp(Y * log(X) )
   ( In many languages, such as bc, ^ means to the power of )

So 2^2.58496 can be done using exp(2.58496 * log(2) )

Most log and exponent functions are natural (base e) by default, meaning
    Log (2.71828182845904523536) = 1
and
    Exp(1) = 2.71828182845904523536

Which is why we need to convert these to base 10 or base 2 to be useful for this application. This is done by multiplying or dividing by log(2) in these equations.

Arithmetic in the log domain also works differently.

X * Y = exp( log(X) + log(Y) )
Adding logarithms is equivalent to multiplying the regular numbers.

X / Y = exp( log(X) - log(Y) )
Subtracting logarithms is equivalent to dividing the regular numbers.

X^Y = exp( log(X) * Y )
Multiplying a logarithm raises the regular number to a power.

X^(1/Y) = exp( log(X) / Y )
Dividing a logarithm takes the Yth root of the regular number.
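These identities are easy to check numerically. Here is a small Python sketch using the standard math module, where log and exp are natural by default; the values of x and y are arbitrary picks for the example.

```python
import math

x, y = 6.0, 3.0

# Log base 2 through the natural log, as in log(6) / log(2).
bits = math.log(x) / math.log(2)

# Power, product, quotient and Y-th root, all done through exp() and log().
power = math.exp(y * math.log(x))
product = math.exp(math.log(x) + math.log(y))
quotient = math.exp(math.log(x) - math.log(y))
root = math.exp(math.log(x) / y)

print(round(bits, 5))                     # 2.58496
print(math.isclose(power, x ** y))        # True
print(math.isclose(product, x * y))       # True
print(math.isclose(quotient, x / y))      # True
print(math.isclose(root, x ** (1 / y)))   # True
```

The comparisons use math.isclose because floating point results from exp and log are only accurate to a rounding error.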

How much data can a deck of poker cards hold?

There are 54 cards in a deck.
The cards are A,2,3,4,5,6,7,8,9,10,J,Q,K: 13 ranks in all, times 4 suits (Clubs, Spades, Hearts, Diamonds).
13 * 4 = 52
There are also 2 Jokers giving 54 cards total.
The two jokers are sometimes indistinguishable from one another but for this example we will assume we can differentiate between them.

Our first card can be any 1 of 54.
Our second card can be any 1 of 53. And so on.

This gives us the equation:
54! = 54 x 53 x 52 x 51 x 50 x 49 x 48 x . . . x 2 x 1

The Factorial operation is notated as n! and gives the number of ways in which n objects can be permuted.

For example, 3! = 6. This comes from 3 x 2 x 1, since the six possible permutations of {1,2,3} are:

{1,2,3}, {2,3,1}, {3,1,2}, {1,3,2}, {2,1,3}, {3,2,1}.

For our deck this is called factorial 54, written as “54!”, and it gives the number of permutations (orderings) that a deck of cards can be shuffled into.

I used the Unix bc program to compute this. According to its online manual, bc is “An arbitrary precision calculator language”; it is very convenient when dealing with big numbers.

Here is the program to compute 54!:

/* multiply c by every integer from 54 down to 1 */
c = 1
for (a = 54; a >= 1; a--) {
    c = c * a
}
c   /* print the result */

The result is:

54! = 230843697339241380472092742683027581083278564571807941132288000000000000

It is 72 digits long in base 10, and takes 238 bits to store in base 2.

log( 54! ) / log(2) = 237.06381108042942967244  Base 2

log( 54! ) / log(10) = 71.36331802162852843476  Base 10
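The same figures can be reproduced in Python, whose integers are arbitrary precision like bc's. This is just a cross-check sketch using the standard math module.

```python
import math

deck = math.factorial(54)            # exact integer value of 54!

digits = len(str(deck))              # decimal digits needed to write 54!
bits_to_store = deck.bit_length()    # whole bits needed to write 54! in binary
capacity_bits = math.log2(deck)      # fractional bits of information in one shuffle
capacity_digits = math.log10(deck)   # the same quantity in base 10

print(digits, bits_to_store)
print(round(capacity_bits, 4), round(capacity_digits, 4))
```

Note the difference between the 238 bits needed to write the number 54! itself and the 237 whole bits that can safely be stored in one ordering of the deck.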

With 237 bits of data we can store English text in 5-bit characters (as in Baudot code).
Using this we could store an uncompressed 47-character message in a deck of cards.
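One way to actually put a message into a deck is to treat the message as one big integer and convert it to a card ordering with the factorial number system. The sketch below is an assumption about how this could be done, not the method from the original post. The function names are made up, cards are simply numbered 0 to 53, and for simplicity the message uses 8-bit bytes rather than 5-bit characters.

```python
import math


def number_to_order(value, n=54):
    """Map an integer in [0, n!) to one ordering of cards numbered 0..n-1."""
    if not 0 <= value < math.factorial(n):
        raise ValueError("value does not fit in a deck of this size")
    remaining = list(range(n))
    order = []
    for position in range(n, 0, -1):
        # Each step picks one of the `position` cards that are still unused.
        index, value = divmod(value, math.factorial(position - 1))
        order.append(remaining.pop(index))
    return order


def order_to_number(order):
    """Reverse number_to_order: recover the integer from the card ordering."""
    remaining = sorted(order)
    value = 0
    for position, card in zip(range(len(order), 0, -1), order):
        index = remaining.index(card)
        remaining.pop(index)
        value += index * math.factorial(position - 1)
    return value


message = b"hidden in a deck"   # 16 bytes = 128 bits, well under 237 bits
secret = int.from_bytes(message, "big")
deck = number_to_order(secret)
recovered = order_to_number(deck).to_bytes(len(message), "big")
print(deck)
print(recovered)
```

The receiver needs to know the message length and the agreed numbering of the cards to read the message back.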

We could hide a 224-bit encryption key (the size of four 56-bit DES keys) for decoding data stored elsewhere on a CD or floppy disk and carried along with our deck of cards.

If we increase the number of decks we can store more data. For instance, we can mix a deck of red-backed cards and a deck of blue-backed cards together for 108 cards. This gives more than the expected doubling of data: there is a 2.44 times increase.

log2(54!) * 2 = 474 bits

log2((54 * 2)!) = 578 bits

Decks   Cards   Bits      Bits per card
1       54      237.06    4.390
2       108     578.42    5.355
3       162     960.33    5.928
4       216     1368.64   6.336

What is happening is that as the number of cards increases, the amount of data per card increases.

With 4 decks, 216 cards, we see 6.336 bits per card; this is equivalent to a 1-in-80 selection per card. To phrase it another way, it is a 216-digit base 80 number. We must keep in mind, though, that we also need 216 uniquely identifiable cards for this.

4 decks give 1368 bits, which is 171 bytes, or 273 characters stored with 5 bits per letter.
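The deck figures can be recomputed without building the huge factorials, by using the log-gamma function: lgamma(n + 1) is the natural log of n!. This is a sketch; the helper name deck_bits is made up for the example.

```python
import math


def deck_bits(cards):
    """log2(cards!) computed through lgamma, so the factorial is never built."""
    return math.lgamma(cards + 1) / math.log(2)


for decks in range(1, 5):
    cards = 54 * decks
    bits = deck_bits(cards)
    print(f"{decks}  {cards:4d}  {bits:8.2f}  {bits / cards:.3f}")

# Ratio of two mixed decks to one deck, the "more than doubling" effect.
print(round(deck_bits(108) / deck_bits(54), 2))
```

Because the result comes from floating point math, the last digit may differ slightly from a table computed with bc.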

Monday, September 27, 2010

Conscious Machines

With Conscious Machines, we are really talking about consciousness or more specifically machine consciousness.

This is a very complex philosophical topic, since it's not clear what being conscious even really means.

Consciousness is often described as a level of self-awareness, subjectivity, sentience, sapience and perception of one's surroundings.

Like intelligence, consciousness has come to mean something uniquely human, something that animals or machines cannot have, or at least should not have; otherwise we'd have to come up with a new definition of the word to exclude the animal or machine.

With intelligence, challenges were set, but when machines reached that level, people just kept moving the bar higher and higher. Finally, with the defeat of chess grandmaster Garry Kasparov by Deep Blue, it's hard to argue that machines aren't intelligent.

But this is an old argument; before that it was speech and language. Again, it was discovered that animals have language, and machines are easily able to master speech these days, although we decided that we didn't like hearing from them. "Your door is ajar" was cool at first, but soon became very irritating.

We humans arrogantly prefer to think of ourselves as divine and endowed with special God-given uniqueness and abilities that only we can have.
The thought of having a machine that could be considered our equal can be a truly terrifying concept.

To most people, predicting machines that are capable of competing with us is on par with predicting the end of the world.
The day a machine becomes superior is the day we cease to be the dominant species on this planet. From that point forward we will only be allowed to live through their benevolence; this could take many forms, including humans ending up like some sort of house pets.

Fortunately the dates in these predictions of impending doom and the end of human superiority keep passing by uneventfully and predictions keep getting pushed back.

In Ray Kurzweil's “The Age of Spiritual Machines”, published in 1999, he estimates about 20 years until the creation of conscious machines. I think this is way off the mark. In his book, Kurzweil fails to take into account several very important things and makes a number of wrong assumptions.

The Memory Bottleneck with current technologies.

Everyone knows Moore’s law, with CPU performance doubling every 18 months (1.5 years). This is an increase of 66% per year. What few people realize is that memory performance is only increasing 11% per year. It takes about 7 years to double memory throughput.

So memory capacity is increasing while its speed is increasing only 1/6 as much. As things progress, memory speed is actually falling behind. The boot-time memory checks on PCs are becoming painfully slow as the amount of RAM increases.

In applications with growing storage demands, brute force scanning of memory is actually becoming slower and slower in proportion to the amount of computing power available.

To rephrase this, it is taking more time to search through every byte stored in a computer as RAM sizes increase. So even if you can hold more, and on a byte-per-byte basis memory access is faster, in proportion to the overall size of memory in a modern computer it is taking more time to search through.

They can make bigger libraries, but not faster librarians.

Wrong Assumption: the Brain is about computational performance.

Now consider the brain. It only runs at 100 Hz, and holds 10^12 neurons.

Assuming one logic decision per neuron firing time this gives us 10^14 logic decisions per second.

CPUs (in 2004) are at 3 * 10^9 (3 GHz) logic decisions per second.

So following Moore's Law, in about 18 Years we will have machines at the same level in terms of logical operations per second.

But this misses the whole point about how the Brain really works.

Even from common sense experience it’s obvious the brain is a terrible computation engine. Competing with the brain's computational abilities is pointless. Even a meager computer from 1980 can easily beat the best humans at computation.

So what is the brain really good at?

The brain is also terrible at brute force memorization and retrieval of plain raw information.

The thing it is good at is being a self-learning feedback control system, also known as a servo.

It's also very good at pattern recognition and information retrieval of interrelated data.

On one hand a $1 calculator can outperform a human at computation, and even a 20-year-old Apple II is far better at rote data storage and retrieval. On the other hand, when it comes to navigation and control, even the largest of computers are humbled when compared to a simple insect, and can’t even come close to “understanding and deriving meaning from information”.

So from a mechanical perspective, what the brain is really good at is effectively searching through its memory many times per second, cycling through memories to access more related memories, making comparisons and weighing probabilities.

I believe the brain is not about computation but memory speed.

It's all about loose associative memory and pattern recognition. An input pattern comes in and needs to be identified quickly to produce a response. Total accuracy is not critical. Approximation is close enough.

What is the brain's memory capacity?

In Nov 2001 I asked Don Knuth this question at one of his “Stump the Professor!” lectures at Xerox PARC. Well, he was stumped; even he didn't really have any answers, saying it is more a medical / biology problem. Having had free run of the Stanford campus and asking the best academics I could find, I realized they have even less of a clue than the computer / information science geeks.

Current science doesn't even know how the brain stores information, or how much the brain can store. I have heard many theories, from the brain using quantum effects to it storing information in some other dimension outside of normal space-time. I think these are very unlikely, and best of all, no beyond-known-science explanation is needed to understand how the brain can store so much and retrieve it as fast as it does.

Keep in mind the brain is a control system and pattern matching engine. That's what it does, and it has had billions of years of evolution to master it.

So a minimum estimate might be 1 bit per neuron. This gives 10 GB, although that estimate seems very small and unlikely considering how much redundancy of information is known to be stored within the brain.

Another estimate might be 1 bit per dendrite. That would be 10,000 bits (dendrites) per neuron, giving 100 TB. This is certainly a better estimate but I suspect the real number is far greater.

If data was stored using permutations then its storage would be far greater. If data was stored in interconnection patterns its capacity would increase exponentially according to Bell numbers.
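To get a feel for how fast Bell numbers grow, here is a small Python sketch that builds them with the Bell triangle and prints how many bits each one represents. The Bell number B(n) counts the ways to partition n items into groups. How exactly that would map onto neural interconnections is left open here, and the function name bell_numbers is made up for the example.

```python
import math


def bell_numbers(count):
    """Return the Bell numbers B(0)..B(count-1), built with the Bell triangle."""
    numbers = []
    row = [1]
    for _ in range(count):
        numbers.append(row[0])
        # Each new row starts with the last entry of the previous row,
        # then every entry is its left neighbour plus the entry above that neighbour.
        next_row = [row[-1]]
        for value in row:
            next_row.append(next_row[-1] + value)
        row = next_row
    return numbers


bells = bell_numbers(51)
for n in (5, 10, 20, 50):
    print(f"B({n}) holds about {math.log2(bells[n]):.1f} bits")
```

The bits per item keep rising as n grows, which is the same effect seen with the card decks.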

In examining this I realized it would be possible to have from 10 to 1000 times more storage than a bit per dendrite. It's also in line with what we know about the biology, where the brain grows and breaks interconnections when learning.

This would explain many things, from its extremely high memory performance given such low clock rates (100 Hz) to its ability to have so much redundancy that it can tolerate massive damage and for the most part degrade gracefully.

How does this compare with computers?

Let's assume the best case. PCs are limited to 4 GB already (sorry, this was originally written in 2004); we will be at 10 GB in no time at all, 7 years. To reach 100 TB of RAM is about 20 years out, at Moore's 66% per year.

What is the brain's memory throughput?

The brain can access all of its memory as much as 100 times per second.

A real brain can do many things a digital system cannot, such as return fuzzy analog values or access all of its memory at once.

If we use the estimates of the brain's memory capacity of 10 GB to 100 TB, this gives us a memory throughput from 1 terabyte per second to 10 petabytes per second.

Right now in 2004 we have an 833 MHz front side bus (FSB) that is 32 bits wide.

This memory throughput only increases at 11% a year (much slower than Moore's law), so it takes about 7 years to double in speed. For us to go from our current PCs at 8.33 x 10^8 bytes per second to our lower estimate of 10^10 is a 12x increase and would take 25 years. To reach the higher estimate of 10^13 would take 90 years. And if interconnects do store data then:

10^14 in 114 years

10^15 in 136 years

10^16 in 157 years

10^17 in 180 years

10^18 in 200 years
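Projections like these come from compound growth: the number of years is log(target / start) / log(1 + rate). Here is a Python sketch that takes the post's figures (8.33 x 10^8 bytes per second in 2004, growing 11% a year) as given assumptions. The function name years_to_reach is made up, and the results land within a year or two of the list above rather than matching it exactly.

```python
import math


def years_to_reach(start, target, annual_growth):
    """Years of compound growth needed to go from `start` to `target`."""
    return math.log(target / start) / math.log(1 + annual_growth)


START = 8.33e8   # bytes per second, the 2004 figure used in the post
GROWTH = 0.11    # 11% per year, the post's memory throughput growth rate

print(f"doubling time: {years_to_reach(1, 2, GROWTH):.1f} years")
for exponent in range(13, 19):
    years = years_to_reach(START, 10.0 ** exponent, GROWTH)
    print(f"10^{exponent}: about {years:.0f} years")
```

Changing GROWTH shows how sensitive the dates are: a few percent either way moves the far end of the list by decades.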

So it’s safe to say that memory throughput of the human brain will exceed the best of our computer technology for at least 25 years assuming 1 bit per dendrite.

But if we assume its capacities are based on interconnection then we will be well into the next century; somewhere from 2090 to 2150 is my estimate. At that point we would have only just matched the intelligence of our minimum-wage employee… We would still need to send such a potentially intelligent computer to school, since it would have to learn in a very similar manner to a human.

In the end, creating a true sentient human-like intelligence would require all of the energy and human interaction that raising an infant into adulthood would. And each unique entity would also need to go through those same unique learning experiences to become a unique individual, and the outcome would be hit or miss, much like it is with humans now. (We don’t all go to Harvard.) The one real advantage it would possess is that it could continue to grow and survive well beyond the life span of a single human. So I don’t see such an artificial intelligence as being competition to humans for at least a hundred years. I would like to think at that time the major mission of mankind would be the exploration of space, and a HAL-like computer operating interstellar spacecraft would be ideal.

-----

I wrote this back in 2004, but never published it. I have had a lot of thoughts since, and need to publish my ideas on data storage with permutations.

I suspect we will start seeing many Turing-test-passing machines soon, but they will be lacking sentience. I am sure they will make excellent bill collectors and salespeople, but probably will not make great programmers or innovators.

I have also learned much about the nature of evolution, and that machines are evolving with us. It's really a matter of economics driving things. With the first Luddite moment around 1811, when mechanized looms started replacing human weavers, it was a straight-out matter of economics. When the machines became cheaper than having humans do the work, we were pushed out and had to find other things to do. As we approach the 200th anniversary of the Luddites we find that workers have been pushed out of one job after another. I recall when robots replaced auto workers and when spreadsheets replaced accountants; now the web is replacing print media. But we are in a symbiotic relationship with our technology, as without humans the looms have no customers to produce for. Our whole society oscillates back and forth with each step of progress, and in the process kills off many companies and makes a mess of people's lives. It seems to be the inevitable nature of things as they evolve. The alternative, stagnation, is something I would prefer not to even think about, as it tends to be unpleasant for all. At some point I may write more on that subject, as history has many examples, the collapse of Rome for one.

So I am sure that as time progresses, and with luck, we will continue our oscillations, with machines taking over more and more, forcing humans to be more intellectual and creative.

We are in a symbiotic relationship with machines and in the end, it's the combination of the two that will win out. I have no doubt all possible combinations and hybridizations will be tried, but evolution will make the final decision. In the end we will end up with an almost unimaginable diversity carrying on the struggle.

Sunday, September 26, 2010

Word of the day "bricoleur"

I didn't realize until today that there is a word that really fits how I think and work.

I am a bricoleur.

As in 'She advocates the "bricoleur style" of programming as a valid and underexamined alternative to what she describes as the conventional structured "planner" approach. In this style of coding, the programmer works without an exhaustive preliminary specification, opting instead for a step-by-step growth and re-evaluation process. - Sherry Turkle '

From :
http://en.wikipedia.org/wiki/Bricolage#Information_technology

Until now I have never had a good way to properly define, explain or defend my way of coding, which time and time again has proven to be faster and more effective than what most programming teams can do.

But its definition also extends to explain the invention and design work I do, and my whole approach to life in general! One that has always seemed to be very unique but effective.

"He is a tinkerer, a bricoleur, Dionysian rather than Apollonian."
—post.thing.net - A lean, mean, media machine.

Definitions of Bricoleur on the Web:

Bricolage, is a term used in several disciplines, among them the visual arts and literature, to refer to the construction or creation of a work from a diverse range of things which happen to be available, or a work created by such a process. ...
en.wikipedia.org/wiki/Bricoleur
A person who creates a bricolage; A person, such as a writer, artist, etc, who creates using a diverse range of materials
en.wiktionary.org/wiki/bricoleur
French term meaning “handy-man” or “jack-of-all-trades,” now implying someone who continually invents his or her own strategies for comprehending reality. Marcel Broodthaers has been so described. See bricolage.
thebookman.wordpress.com/2008/03/01/postmodern-terms-absence-to-curtain-wall/

BRICOLAGE: French term meaning “puttering around” or “doing odd jobs.” Claude Lévi-Strauss (see structuralism) gave the term a more precise anthropological sense in books like The Savage Mind (1966) by stipulating that it refer to, among other things, a kind of shamanic spontaneous creativity (see shaman) accompanied by a willingness to make do with whatever is at hand, rather than fuss over technical expertise. The ostensible purpose of this activity is to make sense of the world in a non-scientific, non-abstract mode of knowledge by designing analogies between the social formation and the order of nature. As such, the term embraces any number of things, from what was once called anti-art to the punk movement’s reinvention of utilitarian objects as fashion vocabulary (see, for example, Dick Hebdige’s Subculture [1979]).

See also:
* who's the bricoleur?
* Are you a bricoleur?
* Sherry Turkle
* http://www.bricoleurbanism.org/
* http://www.answers.com/topic/bricolage
* http://www.wordnik.com/words/bricoleur/examples

Converting QIF to TEXT

My bank only exports my account history in several proprietary formats that require buying commercial software to view. In reality I just want something quick and dirty to turn it into a simple text file that I can grep through, say to search for all my trips to fast food or list out the checks cashed.

One format is QIF, and I did a quick and dirty little sed script to turn that into a simple list in plain text, one line per entry.
See: Quicken Interchange Format (QIF)

So I can turn this:

^
D04/26/2010
T-11.81
PMCDONALD'S F21823 GOLETA CA
MWithdrawal Debit Card Debit Card
^

(which really shows up as one never-ending mess on a single huge line in Notepad)

into this:

04/26/2010 -11.81 MCDONALD'S F21823 GOLETA CA Withdrawal Debit Card Debit Card

Below is a little shell script that uses the Unix command sed to clean this up.

qif2text

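#!/bin/sh
# qif2text: flatten each QIF record onto one line of plain text.
# Drop "!" header lines, mark each "^" record separator, strip the
# one-letter field code, join everything, then split at the markers.
sed \
    -e '/^!/d' \
    -e 's/\^/ NEWLINE/' \
    -e 's/^.//' "$1" | \
sed -e ':a;N;$!ba;s/\n/ /g' | \
sed -e 's/NEWLINE/\n/g'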

Once you have that you can quickly use grep to filter out specific entries and then sum them with awk.

./qif2text  myaccounthist.qif | grep Draft | awk '{ SUM += $2} END { print SUM }'
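For anyone without sed handy, roughly the same flattening can be sketched in Python. This is not the script from the post, just an equivalent under the same assumptions: one field per line, a one-letter code at the start of each field, "!" header lines, and "^" ending each record. The function name qif_to_lines and the sample header line are made up for the example.

```python
def qif_to_lines(text):
    """Collapse each QIF record into one line of space-separated field values."""
    lines = []
    fields = []
    for raw in text.splitlines():
        raw = raw.strip()
        if not raw or raw.startswith("!"):
            continue                     # skip blanks and header lines such as !Type:Bank
        if raw.startswith("^"):          # "^" marks the end of a record
            if fields:
                lines.append(" ".join(fields))
            fields = []
        else:
            fields.append(raw[1:])       # drop the one-letter field code (D, T, P, M, ...)
    if fields:                           # last record without a closing "^"
        lines.append(" ".join(fields))
    return lines


sample = """!Type:Bank
D04/26/2010
T-11.81
PMCDONALD'S F21823 GOLETA CA
MWithdrawal Debit Card Debit Card
^
"""
for line in qif_to_lines(sample):
    print(line)
```

Because it splits on any line ending, this version also copes with files that look like one huge line in Notepad.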

Friday, September 03, 2010

The perfect altitude for a Clark orbit

Quote:
Looking at those figures, I realized that I needed to round off the actual planned final orbital altitude to the nearest hundred. That was as close as most people, and in particular the news media, would ever remember. So I selected 22,300 miles as the figure for the planned altitude, and used that.
[...]
But the perfect altitude for a Clark orbit, it turns out, is 22,238 statute miles above mean sea level.
[...]
That meant I should have rounded off geosynchronous altitude as 22,200 miles, the closest hundred. Using 22,300 miles was a mistake.

By the time I recognized my error, everyone was using the 22,300 mile figure, even engineers and others who were experts in orbital mechanics.
[...]
It's wrong. And it's all my fault.

Rings of Earth

I just realized that after we build a space elevator, we will eventually build an artificial ring around the Earth. We've already started with our geosynchronous satellites. Starting as clumps of tethered satellites, it will grow massive and slowly get populated. It will eventually have its own nervous system and be reactive to stresses and forces. Initially it will be power stations and computer server farms, because of the better access to solar power. It will quickly get populated by robots and human maintenance workers that live there permanently. Eventually it will be like a shipping port connecting incoming and outgoing passengers and cargo. In enough time it may become fully populated.
Space elevators will be its loading docks, but instead of connecting the land and sea, they will connect the land and space. With docks you have to move material using small ships, which is where we currently are in space. If we extend this ocean analogy, most of our ships are literally good for one trip only, because once we get to the beach the surf tears them apart. So we're not even to the point of good canoes.

V Cast SONG ID

Trying out the "V Cast SONG ID" app on the new Droid X. Been sitting here with the internet and radio seeing how good it is for like the past 2 hours, while I am babysitting some shell scripts. It got all the sorta ASCAP rock / rap / dance correct. Even country western and jazz. It even got ambient electronica that is little more than just a slowly undulating hum. But Beethoven's 5th it couldn't get, even in its most recognizable Dun Dundun Duh! *:-O Wow, now that's got to mean something.