In the second scenario I do my two flips with a normal coin - heads on one side, tails on the other. We can communicate the result using binary code: 0 for heads, 1 for tails. There are four possible messages - 00, 11, 01, 10 - and each requires two bits of information. So, what’s the point? In the first scenario you had complete certainty about the contents of the message, and it took zero bits to transmit it. In the second you had a 1-in-4 chance of guessing the right answer - 25% certainty - and the message needed two bits of information to resolve that ambiguity. More generally, the less you know about what the message will say, the more information it takes to convey.

Shannon was the first person to make this relationship mathematically precise. He captured it in a formula that calculates the minimum number of bits - a threshold later called the Shannon entropy - required to communicate a message. He also showed that if a sender uses fewer bits than the minimum, the message will inevitably get distorted.
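To make the bit-counting concrete: Shannon’s formula says a source whose outcomes have probabilities p₁, p₂, … has entropy H = -Σ pᵢ log₂(pᵢ), the minimum average number of bits (or yes-or-no questions) per outcome. Here is a minimal sketch, not from the article, that applies the formula to the two coin scenarios:

```python
from math import log2

def shannon_entropy(probs):
    """Minimum average number of bits (yes-or-no questions) per outcome."""
    return sum(-p * log2(p) for p in probs if p > 0)

# First scenario: a two-headed coin. Heads is certain, so a flip carries 0 bits.
print(shannon_entropy([1.0]))            # 0.0

# Second scenario: a fair coin. One bit per flip, so the two-flip
# message (00, 01, 10 or 11) needs 2 bits.
print(2 * shannon_entropy([0.5, 0.5]))   # 2.0
print(shannon_entropy([0.25] * 4))       # 2.0 as well, i.e. log2(4)
```

A certain outcome contributes nothing (p = 1 gives log₂ 1 = 0), while maximal uncertainty over four equally likely messages costs the full two bits.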
“He had this great intuition that information is maximized when you’re most surprised about learning about something,” said Tara Javidi, an information theorist at the University of California, San Diego.

The term “entropy” is borrowed from physics, where entropy is a measure of disorder. A cloud has higher entropy than an ice cube, since a cloud allows for many more ways to arrange water molecules than a cube’s crystalline structure does. In an analogous way, a random message has a high Shannon entropy - there are so many possibilities for how its information can be arranged - whereas one that obeys a strict pattern has low entropy. There are also formal similarities in the way that entropy is calculated in both physics and information theory. In physics, the formula for entropy involves taking a logarithm of possible physical states. In information theory, it’s the logarithm of possible event outcomes.

The logarithmic formula for Shannon entropy belies the simplicity of what it captures - because another way to think about Shannon entropy is as the number of yes-or-no questions needed, on average, to ascertain the content of a message.

For instance, imagine two weather stations, one in San Diego, the other in St. Louis. Each wants to send the seven-day forecast for its city to the other. San Diego is almost always sunny, meaning you have high confidence about what the forecast will say. St. Louis is more uncertain - the chance of a sunny day is closer to 50-50. How many yes-or-no questions would it take to transmit each seven-day forecast? For San Diego, a profitable first question might be: Are all seven days of the forecast sunny? If the answer is yes (and there’s a decent chance it will be), you’ve determined the entire forecast in a single question. For St. Louis you almost have to work your way through the forecast one day at a time: Is the first day sunny? What about the second? The more certainty there is around the content of a message, the fewer yes-or-no questions you’ll need, on average, to determine it.

To take another example, consider two versions of an alphabet game. In the first, I’ve selected a letter at random from the English alphabet and I want you to guess it. If you use the best possible guessing strategy, it will take you on average 4.7 questions to get it. (A useful first question would be, “Is the letter in the first half of the alphabet?”) In the second version of the game, instead of guessing the value of random letters, you’re trying to guess letters in actual English words. Now you can tailor your guessing to take advantage of the fact that some letters appear more often than others (“Is it a vowel?”) and that knowing the value of one letter helps you guess the value of the next (q is almost always followed by u). Shannon calculated that the entropy of the English language is 2.62 bits per letter (or 2.62 yes-or-no questions), far less than the 4.7 you’d need if each letter appeared randomly. Put another way, patterns reduce uncertainty, which makes it possible to communicate a lot using relatively little information.

Note that in examples such as these, you can ask better or worse questions. Shannon entropy sets an inviolable floor: It’s the absolute minimum number of bits, or yes-or-no questions, needed to convey a message. “Shannon showed there is something like the speed of light, a fundamental limit,” said Javidi.
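The figures quoted in these examples drop out of the same formula. A letter drawn uniformly from 26 possibilities carries log₂ 26 ≈ 4.7 bits, which is why the best halving strategy averages about 4.7 questions, and a nearly certain forecast carries far fewer bits per day than a 50-50 one. A small sketch of both checks follows; the 90% sunny-day probability for San Diego is an assumption chosen for illustration, since the article gives no figure:

```python
from math import log2

def shannon_entropy(probs):
    """Minimum average number of yes-or-no questions (bits) per outcome."""
    return sum(-p * log2(p) for p in probs if p > 0)

# A letter drawn uniformly from the 26-letter alphabet:
print(log2(26))                     # 4.70... - the 4.7 questions quoted above

# One forecast day, assuming San Diego is sunny 90% of the time:
print(shannon_entropy([0.9, 0.1]))  # ~0.47 bits per day
# St. Louis at 50-50 costs a full bit per day:
print(shannon_entropy([0.5, 0.5]))  # 1.0 bit per day
# Seven independent days: roughly 3.3 bits vs. 7 bits.
print(7 * shannon_entropy([0.9, 0.1]), 7 * shannon_entropy([0.5, 0.5]))
```

Shannon’s 2.62 bits per letter is lower still because it folds in the statistical patterns of real English text; it is an empirical estimate, not something the uniform model above can reproduce.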