The Genius Scientist-Player: How the Father of Information Theory Gamed Life | 5Y View

5Y Capital · April 19, 2024

Curiosity drove his exploration.

We're constantly swept along by torrents of information, yet perhaps we've never stopped to consider what information actually is. This very question once captivated a man named Shannon. In a letter to Vannevar Bush, he wrote, "I have been working on some fundamental properties of general systems for the transmission of information, including telephony, radio, television, telegraphy, etc. Almost all communication systems can be reduced to the following general form..." He went on to found modern information theory and is remembered as its father. Concepts we take for granted today, such as "information entropy" and the "bit," all came from him.

And he did all of this not for practical utility, but simply for fun. He admitted frankly, "I've been guided by curiosity my whole life, and utility has never been my main goal..." Curiosity drove his explorations. This article introduces the life of this legendary figure, and I hope it inspires you :)

The following article is republished from "The Intellectual" (知识分子)

He described himself as an apolitical atheist. As a child, he played with inventions; as an adult, he played with mathematics; in old age, he played with stocks — whatever was fun, he played with. He roamed across all of science and technology. He didn't care whether something had practical value, whether a theory had academic significance, whether his work contributed to his company, or whether a problem was important. He only cared if it was fun or not.

His name isn't exactly a household word, but he is revered as the father of information theory. He is Claude Shannon.

Edison's Distant Relative

Claude Shannon (1916–2001) was born and raised in a small town in central Michigan. Interestingly, his father gave him the exact same name[1]. The elder Claude Shannon was descended from early European immigrants, had built a successful business, and served as a local probate judge. Shannon's mother was the daughter of German immigrants — a rare college-educated woman at the time — who worked as a language teacher and later became a middle school principal. Shannon also had an older sister, who was given the same name as their mother. The Shannon family was well-regarded in their small town: both parents were respected for their professions and character. Shannon's sister, six years his senior, was an obedient child and excellent student, the kind of person we'd now call a "straight-A student." Shannon, meanwhile, was exceptionally frail — skin and bones, with sharp, angular features. He spoke little, yet was extraordinarily intelligent. In that small town, he always looked like the kind of young man who'd just been bullied and beaten up, the sort who aroused sympathy.

What young Shannon loved most was electrical and electronic machinery. He idolized Thomas Edison. Later, he discovered that Edison was actually a distant relative, and he took immense pride in this.

Besides the great inventor Edison, another source of pride was his grandfather David Shannon, a farmer and inventor with considerable mechanical talent who made a series of improvements to early washing machines and held U.S. Patent No. 407,130.

It seems inventive talent runs in the family. Young Shannon had a heart that loved invention, treating tinkering with machinery as play. He frequently worked on model airplanes at home and built radio-controlled model boats. Beyond his extraordinary love of play, his childhood was also remarkably lonely. The town where he lived was sparsely populated — amid Michigan's vast farmland, a few streets and shops appeared here and there, and his nearest friend lived half a mile away. So Shannon put his gift for manipulating objects to work, building his own radio station to communicate with friends. His experience accumulated steadily, and by high school graduation, he was already a skilled inventor. His creations included a simple elevator, a backyard go-cart, and a telegraph system that transmitted encrypted messages through barbed wire fences, among others.

Figure 1: Young Shannon

Besides inventing small machines, Shannon also displayed astonishing mathematical talent. Legend has it that at age 8, he could solve his sister's advanced math homework. In 1934, at 17, Shannon published his first academic paper in The American Mathematical Monthly. He solved a difficult mathematical problem, making his mathematical abilities known to the public for the first time. The combination of inventive skill and mathematical abstraction — these two qualities made Shannon what he was and enabled him to benefit humanity.

"The Most Famous Master's Thesis"

In 1932, 16-year-old Shannon entered the University of Michigan, where he moved between two fields he loved and found fun: mathematics and electrical engineering. A few years later, at 20, he earned bachelor's degrees in both. As graduation approached, a notice on the bulletin board caught his interest. It announced that Vannevar Bush, then vice president and dean of engineering at MIT, was offering a graduate position: the successful candidate would pursue a master's degree while also operating and managing a differential analyzer for Professor Bush. This was reportedly a strange machine weighing hundreds of tons, composed of rotating bearings and gears, hailed in the press as a "mechanical brain" or "thinking machine": it could perform advanced mathematics and solve equations that would take humans months to crack!

This was the period after World War I, when the world's economic center had shifted from Europe to America. Bell Labs, General Electric, and MIT became the three great centers of American electrical engineering application. As mechanical and electrical technology matured, many fields had urgent computing needs. Ubiquitous differential equations, for instance, rarely had analytical solutions, and the tedious, massive calculations were a headache. So Professor Bush designed the "differential analyzer" to tackle such problems. In today's terms, this machine was analog rather than digital — its wheel-and-disc integrators were physical simulations of differential equations. Regardless, the machine was quite useful for its time, and students and faculty flocked to it, lining up to solve engineering problems. Bush couldn't keep up with demand, so he urgently needed an assistant to operate, study, and improve the machine.

For the soon-to-graduate Shannon, this was a gift from heaven, an opportunity devoutly to be wished. And for Bush, gaining a versatile talent like Shannon who understood both machinery and electronics was exactly what he needed. Thus Shannon smoothly entered MIT for advanced studies, and in those days, a new sight appeared on campus! Shannon's reputation for "play" spread far and wide. People said: on this campus, if you saw a young man with deep-set eyes and a lean face, riding a strange unicycle flying past you — that was almost certainly Shannon.

Figure 2: Shannon with MIT's Differential Analyzer (MIT Museum)

Shannon quickly became deeply fascinated by this big toy. Those endlessly rotating analog components, spread across a large platform machine — when plugged in and started, the whole room buzzed and clattered. This noise that people disliked delighted the inventor.

The differential analyzer was built by Bush and Hazen between 1928 and 1931. The two engineers created it to solve practical problems in applied mathematics and physics — namely, solving differential equations. It was essentially a complex analog computer composed of drive shafts, gears, and discs. In dynamic systems of engineering and physics, if two physical quantities had a certain mathematical relationship, such as differentiation or integration, that mathematical calculation could be performed through the actual operation of the dynamic system. This is called analog computation, and the system is an analog computer.

Figure 3: Analog Integrator

(https://sinclairtarget.com/differential-analyzer/)

The most important components of Bush's MIT differential analyzer were six integrators. The history of mechanical integrators dates back to the 19th century — they could simulate integration operations using two physical quantities in a dynamic system that had an integral functional relationship. A mechanical integrator is essentially a special kind of summing machine, as shown in Figure 3. Each integrator corresponded to a first-order differential equation. Bush discovered a method of connecting integrators in series to solve higher-order differential equations. Thus, his differential analyzer with six integrators could solve differential equations up to sixth order.
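The idea of connecting integrators in series can be sketched numerically. The snippet below is only an illustration, not a model of Bush's hardware: the function name `run_integrator_chain`, the step size, and the test equation y'' = -y are all assumptions chosen for the example, and each "integrator" is a plain running sum (forward Euler) rather than a wheel-and-disc mechanism.

```python
import math

def run_integrator_chain(highest_derivative, initial_values, t_end, dt=1e-4):
    """Solve an n-th order ODE with n integrators connected in series.

    state[k] holds the k-th derivative of y, like the output shaft of one
    integrator. highest_derivative(t, state) supplies y^(n), the input to
    the first integrator in the chain.
    """
    state = list(initial_values)
    steps = int(round(t_end / dt))
    for step in range(steps):
        t = step * dt
        # Each integrator accumulates the next-higher derivative.
        inputs = state[1:] + [highest_derivative(t, state)]
        state = [value + dt * rate for value, rate in zip(state, inputs)]
    return state

# Second-order example (hypothetical): y'' = -y, y(0) = 0, y'(0) = 1,
# whose exact solution is y = sin(t).
y, dy = run_integrator_chain(lambda t, s: -s[0], [0.0, 1.0], t_end=math.pi / 2)
print(f"y(pi/2) ~ {y:.4f}, y'(pi/2) ~ {dy:.4f}")
```

Two accumulators are enough for this second-order equation, and the results should land close to sin(π/2) = 1 and cos(π/2) = 0. Passing six initial values would chain six accumulators, mirroring the sixth-order limit described above.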

Shannon loved machinery, so of course he was interested in the differential analyzer. Bush supplied many components, hoping Shannon could improve the machine toward greater generality and automation. Shannon planned to use hundreds of relays to achieve this. He watched those seemingly silent relays that occasionally clicked — and found them fascinating. Morse code flashed through Shannon's mind; the relays resembled the switches used to send Morse code. Shannon then recalled a symbolic logic course he'd taken not long before, which included Boolean algebra.

Boolean algebra is named after the self-taught British mathematician Boole, who introduced this algebraic logic system in a pamphlet published in 1847. People found it interesting, but it seemed to have no practical use. American logician Charles Sanders Peirce had proposed that Boolean calculations could be performed through electrical switches, but this too received little response.

Switches are circuit devices; Boolean logic is mathematical operation. The two seem completely unrelated. Yet Shannon felt a sense of familiarity, sensing these two had something in common. He thought: there might be something worth exploring here!

Then, in the summer of 1937, Shannon interned at Bell Labs in New York. The nagging ideas about switches and Boolean logic never left his mind.

Thinking further, Shannon realized that it didn't actually matter whether a relay's two states were called "open" and "closed" or "yes" and "no." What mattered was that relays, once connected together, could express the "AND," "OR," and "NOT" of logic. For example, two switches in series form an "AND"; two in parallel form an "OR." This is, in fact, computation. In other words, circuits can perform logical operations!
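The correspondence between wiring and logic can be written out as a small truth table. This is a minimal sketch in which `True` stands for a closed (conducting) switch. The helper names are hypothetical, and modeling "NOT" as a normally-closed relay contact is an assumption added for illustration.

```python
from itertools import product

def series(a, b):
    """Two switches in series conduct only if both are closed: AND."""
    return a and b

def parallel(a, b):
    """Two switches in parallel conduct if either is closed: OR."""
    return a or b

def normally_closed(a):
    """Assumed model of NOT: a contact that conducts while its relay is off."""
    return not a

# Print the truth table for every combination of two switches.
for a, b in product([False, True], repeat=2):
    print(f"a={a!s:5} b={b!s:5} series={series(a, b)!s:5} parallel={parallel(a, b)!s:5}")
```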

Every concept in Boolean algebra had a corresponding physical representation in circuits. People who operated circuit relays knew these circuit principles, but they hadn't abstracted what they were doing into a mathematical model of Boolean algebra. Thus, Boolean logic could make the leap from symbols to circuits. Once logical symbols were defined, the behavior of complex circuits composed of relays could be represented as algebraic equations with multiple binary variables. For example, the following equation:

x'y'z + x'yz + xy'z + xyz' + xyz

could represent a circuit of three relays.
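To see why the algebra is useful for optimizing circuits, the expression can be checked by brute force. The sketch below assumes the usual reading of the notation (x' means NOT x, juxtaposition means AND, + means OR). The simplified form z + xy is worked out here for illustration and is not quoted from the thesis.

```python
from itertools import product

def original(x, y, z):
    """x'y'z + x'yz + xy'z + xyz' + xyz, written term by term."""
    return ((not x and not y and z) or
            (not x and y and z) or
            (x and not y and z) or
            (x and y and not z) or
            (x and y and z))

def simplified(x, y, z):
    """Candidate reduction: z + xy (z in parallel with x and y in series)."""
    return z or (x and y)

# Compare the two circuits on all eight input combinations.
equivalent = all(
    original(*bits) == simplified(*bits)
    for bits in product([False, True], repeat=3)
)
print("same behavior on all inputs:", equivalent)
```

The original form needs fifteen contacts and the reduced form needs three, yet both conduct for exactly the same inputs. This is the kind of saving the algebraic treatment makes visible.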

This 21-year-old youth found this idea fascinating. He was thrilled and excited that he saw something in switch boxes and relays that others hadn't. In autumn 1937, Shannon presented his master's thesis, A Symbolic Analysis of Relay and Switching Circuits, to a committee in Washington, D.C., and published it in a journal the following year.

In this paper, Shannon analyzed the analogy between telephone switching circuits and Boolean algebra, using Boolean algebra to analyze and optimize switching circuits. Shannon's master's thesis had epoch-making significance. It laid the theoretical foundation for digital circuits. Thus, some scientists have evaluated this paper as "the most important master's thesis ever written," saying it transformed circuit design "from art to science." Shannon turned circuit design technique into a science!

Because of this paper, Shannon won the Alfred Noble Prize, a major award given by American engineering societies, which brought him excellent opportunities to make his mark in the electronics industry. Yet he seemed to disappear from this field.

In the summer of 1939, Shannon arrived at Cold Spring Harbor, home to America's top genetics laboratory. He had changed course, crossing over from electrical engineering to biology and attempting to apply his "algebraic" theory to the emerging field of genetics. He was playing with genes now! The following year, he completed his doctoral dissertation, An Algebra for Theoretical Genetics, which used linear algebra to describe the probabilities of different inherited traits and employed a peculiar symbolic algebra to predict how traits pass from generation to generation.

Figure 4: Shannon's doctoral dissertation | Source: dspace.mit.edu

At that time, the DNA double helix structure had not yet been discovered, nor had the genetic code. Shannon was entirely conceiving genetic mechanisms in his own mind, attempting to describe genetic phenomena with algebraic theoretical models.

In 1940, with this dissertation, Shannon earned his Ph.D. in mathematics from MIT and received a postdoctoral opportunity at the Institute for Advanced Study at Princeton. There, Shannon could work freely across disciplines. He had opportunities to discuss his ideas with mathematicians such as Hermann Weyl and John von Neumann, and occasionally encountered Einstein and Gödel[2].

Computation, Cryptography, Information

Turing, four years Shannon's senior and celebrated as the father of computing, is far better known to the public. But in fact, the two men's thoughts and experiences had much in common: both were interested in machine computation, both studied cryptography, both had applied mathematics to biology, both liked to think about artificial intelligence. Sadly, one was in Britain and the other in America; they rarely met in their lifetimes. What strikes one is their later fates: Turing was persecuted for his sexuality and died young; Shannon "played" into his eighties, passing away from dementia.

In early 1943, during World War II, the two prodigies met in the cafeteria at Bell Labs in America and hit it off immediately. Turing had been sent to America to work on encrypting transatlantic telephone conversations so that enemies couldn't eavesdrop on Allied intelligence; his visits took him to Bell Labs in New York. Shannon had transferred from the Institute for Advanced Study at Princeton to Bell Labs, where he was working on how to encrypt the communication line from Washington to London. Though we now know both were studying cryptography at the time, their projects were closely guarded state secrets of their two nations, so neither knew what the other was actually working on. Their cafeteria chats didn't touch on decryption techniques. Still, they had many topics and ideas in common. In those days, they chatted about "thinking machines," that is, about the limits of an ideal computer. In that era, no one yet knew what computers would actually look like, so their discussions were mostly based on mathematics and logic. Neither imagined that each would independently open up an entirely new field in the history of science.

Turing approached from a more mathematical perspective, believing ideal computers should be purely logical deductive devices. Shannon, the passionate inventor, thought more broadly — he believed computers would become social tools, even capable of processing non-logical things like music.

So Turing showed Shannon his 1936 paper, the one that defined what we now call the "universal Turing machine." This made a deep impression on Shannon, because many of its ideas coincided with and complemented his own. Turing's work sparked sudden inspiration in Shannon — he discovered that some seemingly completely different things actually shared a common essence.

Turing was also greatly excited during his exchanges with Shannon about the "universal Turing machine." Their voices involuntarily grew louder and louder during discussions, even attracting attention from others dining around them. From an electrical engineer's perspective, Shannon saw the practical value of the Turing machine. This thrilled Turing and also pushed him beyond pure mathematics to reconsider the significance of the Turing machine. Could thinking machines really become reality through circuits? Turing found this fascinating! So before leaving America, he bought an introductory circuits book and took it on the ship back to Britain, reading it voraciously.

Turing's cryptographic work mainly involved cracking the ENIGMA cipher invented by the Germans, which was widely used by German forces including submarines roaming Atlantic supply lines. At the time, both the British and French believed ENIGMA was unbreakable. Turing led roughly 200 elite personnel in cryptanalysis, mastering a complete set of methods to break this cipher, thereby understanding German movements and seizing the initiative in the war, making outstanding contributions to the Allied defeat of Germany.

Shannon's cryptographic work was closely related to his later communication theory. Shannon himself said his wartime insights into communication theory and cryptography developed simultaneously — "they were so closely intertwined that you couldn't separate them."

In fact, Shannon had vague ideas even earlier. In early 1939, in a letter to Bush, he wrote that besides the differential analyzer, he was working on a problem he considered more important: "I have been working on some fundamental properties of general systems for the transmission of information, including telephony, radio, television, telegraphy, etc. Almost all communication systems can be reduced to the following general form: information goes from sender to receiver, involving three 'time functions': the initial information to be transmitted f_i(t), the intermediate signal f(t), and the final output f_f(t)." This was Shannon's conception of the communication process, shown in Figure 5a.

Figure 5: Information Theory

Shannon recognized that real systems also contained noise, and he tried to prove theorems to mathematically describe communication systems. At Bell Labs, a series of works on information secrecy and concealment techniques in digital communication enabled Shannon to establish modern information theory. In 1948, as a summary report of this work, Shannon published the paper "A Mathematical Theory of Communication"[3]; in 1949, he published "Communication in the Presence of Noise." These articles expounded the basic content of information theory.

I consider information theory one of the most beautiful theories in science. Where does its beauty lie? In its universality, in its simplicity, and most valuably: in how simplicity and profundity coexist[4].

The two simple diagrams in Figure 5 nearly summarize all of information theory. Figure 5a describes the information transmission process, which applies not only to communication but also to computers, genetics, biology, and physical and chemical processes. In fact, information and its transmission process are everywhere; thus, information theory can be extended to almost all fields.

How to describe the information transmission process in mathematical language? For this, Shannon defined the basic concept of information, shown in Figure 5b.

What is information? The concept of information is both abstract and changeable. Information is neither matter nor energy. To give a precise definition to this thing that everyone understands yet cannot grasp or touch — this is no easy task. It is no exaggeration to say: thank goodness for Shannon! As an engineer who had played with all kinds of technologies, Shannon deeply understood the essence of "information"; as a mathematician skilled at abstraction, Shannon knew he must first give information a quantitative description — only after quantification could there be theory! Matter and energy are both measurable; how to measure information?

Shannon borrowed a term from thermodynamics: entropy. Perhaps in academic debates about information theory, John von Neumann and Norbert Wiener both inspired Shannon to some degree. But regardless, it was Shannon who ultimately proposed the concept and expression of "information entropy" (Figure 5b). In the information entropy formula, Shannon with genius and ingenuity connected information to probability, revealing the microscopic essence of information as "a measure of uncertainty."

Anyone who has used a computer is familiar with the word "bit," but you may not know that the bit is the unit of information introduced in Shannon's theory (Shannon credited the word itself to his colleague John Tukey). Since computation is itself a form of information transmission, the bit naturally became an important measure of computer processing and storage capacity as well.

Let's take "linguistic information" as a simple example to understand the formula in Figure 5b. When people speak a sentence, the information is a "string," such as the Chinese sentence for "I am Elon Musk," a string of five characters. Information entropy H is obtained by summing, over all symbols x_i that can appear in the string, the probability p(x_i) multiplied by the logarithm of that probability, and then taking the negative: H = -Σ p(x_i) log p(x_i). The minus sign makes H non-negative, since the logarithm of a probability is never positive. The information content of each symbol x_i depends on the probability (uncertainty) with which that symbol usually occurs: the rarer the symbol, the more information it carries. Thus, Shannon's formula built a bridge between information and uncertainty or disorder; this is the deep connection between information and the natural world.
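The entropy calculation can be tried on short strings. The sketch below makes one simplifying assumption: it estimates each symbol's probability from its frequency within the message itself, whereas the article refers to how often a symbol occurs in general usage. The function name `shannon_entropy` is hypothetical, and the sum is written with log2(1/p), which equals the negative of p times log2(p), so the result comes out in bits per symbol.

```python
import math
from collections import Counter

def shannon_entropy(message):
    """Entropy in bits per symbol, with probabilities taken from `message`."""
    counts = Counter(message)
    total = len(message)
    # p * log2(1/p) is the same as -p * log2(p).
    return sum((n / total) * math.log2(total / n) for n in counts.values())

for text in ["aaaa", "aabb", "abcd"]:
    print(text, shannon_entropy(text))
```

A string that repeats one symbol carries no uncertainty and scores 0 bits. Four equally likely symbols give 2 bits each, which is exactly the number of yes/no questions needed to identify one of them.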

A Legendary Figure Fades from View

The hottest science news these past two years has undoubtedly been "AI" — artificial intelligence. Yet few know that Shannon was also one of the AI pioneers. In a paper published around 1950, Programming a Computer for Playing Chess, Shannon described how to make a computer play chess. This is considered one of the earliest published articles on computer chess and using computers to solve game problems. In it, he proposed basic strategies for limiting the number of possibilities to consider in chess games. Shannon gave the complexity of chess as roughly on the order of 10^120 (known as the "Shannon number"), though in the paper he presented a more intelligent algorithm that could greatly simplify the calculation. In 1997, algorithms evolved from this paper ran in Deep Blue and successfully defeated Garry Kasparov.
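The size of the Shannon number follows from a short piece of arithmetic. The figures below are the ones usually quoted from Shannon's estimate (about 30 legal moves in a typical position and about 40 move pairs in a typical game). They are treated here as assumptions, and the constant and function names are hypothetical.

```python
MOVES_PER_POSITION = 30    # assumed typical number of legal moves
MOVE_PAIRS_PER_GAME = 40   # assumed typical game length (one White move + one Black move)

# One move by each side gives about 30 * 30 = 900 continuations,
# which the estimate rounds up to 10^3.
continuations_per_pair = MOVES_PER_POSITION ** 2
rounded_per_pair = 10 ** 3

# Forty such pairs multiply out to (10^3)^40 = 10^120 possible games.
shannon_number = rounded_per_pair ** MOVE_PAIRS_PER_GAME

def positions_to_examine(plies, branching=MOVES_PER_POSITION):
    """Leaf positions in a search that tries every move `plies` half-moves deep."""
    return branching ** plies

print(continuations_per_pair, shannon_number == 10 ** 120, positions_to_examine(4))
```

Even four half-moves of exhaustive look-ahead already means hundreds of thousands of positions, which is why limiting the number of possibilities considered matters so much.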

Most of Shannon's academic achievements were completed before the 1960s. Then, he stopped "playing" with academia and started playing with other things.

Figure 6: Shannon the "Player" (Image from internet)

"What's your secret to a carefree life?" an interviewer asked Shannon near the end of his life. Shannon replied: "I've been guided by curiosity my whole life, and utility has never been my main goal..." To summarize how Shannon could play so carefreely: first and foremost, he cast aside the entanglements of worldly fame and fortune, with curiosity driving his exploration of nature. Throughout his life, Shannon played games, invented games, tinkered with small machines and gadgets, maintaining that childlike heart even in adulthood.

In 1951, Shannon published the paper "Presentation of a Maze Solving Machine," about a mechanical mouse he built named Theseus. The maze configuration was flexible and could be rearranged and modified at will. The mechanical mouse moved through a 25-square maze, finding its way out through repeated trial and error. After the mouse passed through the maze once, if placed somewhere it had been before, it could go directly to the goal based on previous experience. If placed in an unfamiliar area, it would fall back on searching by trial and error until it succeeded. After success, it would add the new knowledge to its memory and learn new behaviors. Doesn't this resemble today's AI learning machines?
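The search-then-remember behavior can be sketched in a few lines. This is a loose illustration under stated assumptions, not a reconstruction of Shannon's relay circuitry: the maze is a 5×5 grid, walls are pairs of adjacent squares, the search is a plain depth-first trial and error, and all names (`learn_maze`, `replay`, `MOVES`) are hypothetical.

```python
MOVES = {"N": (0, -1), "E": (1, 0), "S": (0, 1), "W": (-1, 0)}

def step(cell, direction):
    dx, dy = MOVES[direction]
    return (cell[0] + dx, cell[1] + dy)

def learn_maze(walls, start, goal, size=5):
    """Search by trial and error; return {cell: direction} along the route found."""
    memory, visited = {}, set()

    def explore(cell):
        if cell == goal:
            return True
        visited.add(cell)
        for direction in MOVES:
            nxt = step(cell, direction)
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            blocked = frozenset((cell, nxt)) in walls
            if inside and not blocked and nxt not in visited and explore(nxt):
                memory[cell] = direction  # remember the exit that worked
                return True
        return False  # dead end: back up and try another direction

    explore(start)
    return memory

def replay(memory, start, goal):
    """Follow remembered directions from `start` straight to the goal."""
    path = [start]
    while path[-1] != goal:
        path.append(step(path[-1], memory[path[-1]]))
    return path

walls = {frozenset({(0, 0), (1, 0)})}  # one partition between two squares
memory = learn_maze(walls, start=(0, 0), goal=(4, 4))
path = replay(memory, (0, 0), (4, 4))
print(path)
```

Starting `replay` from any square on the learned route goes directly to the goal. A square that is not in `memory` would need a fresh call to `learn_maze`, which corresponds to the mouse searching again when placed in unfamiliar territory.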

Shannon was an avid unicycle enthusiast. He liked designing and building all kinds of bizarre unicycles. He invented a Roman numeral computer called THROBAC and a juggling machine; he built a device that could solve Rubik's Cube; he co-invented the first wearable computer with Edward Thorp to improve odds at roulette. He designed a "mind-reading" machine that, by observing and analyzing an opponent's past choices in a simple penny-matching game, could quite accurately guess the opponent's next move.

Although Shannon remained an MIT professor, his obsession with playing at invention meant he later stopped publishing papers and attending professional conferences, gradually fading from public view. A very dramatic moment occurred at the 1985 International Symposium on Information Theory in England: Shannon made a surprise appearance. Many attendees didn't even know he was still alive. This modest, shy celebrity, once recognized, was surrounded by fans. Finally, he reluctantly took the stage at the banquet, introduced by the host as "one of the greatest scientific giants of our age." After the applause died down, Shannon blurted out: "This is ridiculous!" Then he reached into his pocket and, like a magician, pulled out three balls — and actually started juggling on the spot.

Thousands upon thousands of people can juggle, and the world has countless inventors. Yet only Shannon could play his way into the mysteries of nature. Who else could play with cryptography and arrive at information theory? Who else would ponder AI while playing chess? Who could practice juggling and propose a "juggling theorem"? What unicyclist would insist on converting unicycle motion into equations? No one else, only Shannon.

References:

[1] A Mind at Play: How Claude Shannon Invented the Information Age, by Jimmy Soni and Rob Goodman, translated by Ye Yang, CITIC Press Group, February 2019

[2] The Information: A History, a Theory, a Flood, by James Gleick, translated by Gao Bo, Posts & Telecom Press, December 2013

[3] Shannon C E. A mathematical theory of communication[J]. ACM SIGMOBILE Mobile Computing and Communications Review, 2001, 5(1): 3-55.

[4] ScienceNet — "Entropy" — Information World Also Shows Its Strength — Zhang Tianrong's Blog: https://blog.sciencenet.cn/home.php?mod=space&uid=677221&do=blog&quickforward=1&id=987651
