There are [3nd $~$\log_2(3) \approx 1.585$~$] bits to a Trit. Why is it that particular value? Consider the GalCom thought experiment, in which you make money by sending information about the stock market across the celestial reaches. Let's say that Company X is about to publish a big earnings report, and that you want to send a message to your stock trader in the Vega star system that says one of three things: "Performance was in the top 1/3 of what I expected," "Performance was in the middle 1/3 of what I expected," or "Performance was in the bottom 1/3 of what I expected." You obviously don't want to send whole sentences; it suffices to transmit a single Trit of information.
That is, if you had control of a communication channel that could hold one of three signals (say, a wire that could be positively charged, negatively charged, or uncharged) which was connected to your computer on Vega, then you could program the computer to recognize a positive charge as "top 1/3 performance", a neutral charge as "middle 1/3 performance", and a negative charge as "bottom 1/3 performance." You could then program it to buy, hold, or sell respectively, and make your money.
Unfortunately, you don't control a wire connected to your computer on Vega. The computer on Vega is in a different star system, and the only way to communicate with it is through GalCom, the galactic communications network. And GalCom doesn't sell trits (which let you transmit one of three messages) it sells bits (which let you transmit one of two messages).
So, how many bits on GalCom will it cost you to send your trit?
(You're invited to pause and find some method for encoding a trit using bits.)
You can easily transmit a trit via two bits. If you reserve two bits on GalCom, you get to control two of the pulses of light on the sender, by choosing wether they are present or absent. This affects whether the receiver writes down 0 or 1, and those two binary digits will be sent to your machine on Vega. Thus, if you purchase two bits on GalCom, you can send one of four messages to your machine on Vega: 00, 01, 10, or 11. This is more than enough to encode a trit: You can program the machine such that 00 causes it to sell, 01 causes it to hold, and 10 causes it to buy. Then you have one code left over, namely 11, that you'll never send.
To encode a bit using trits, you could purchase two bits and then make use of only 3 of the 4 available codes.
However, this is inefficient: You purchased one code (11) that you are never using. Is there any way to repurpose this code to make some of your money back?
(You are invited to pause and find some method of encoding a trit using two bits such that, sometimes, you're able to sell one bit back to GalCom and recover some of your losses.)
Here's one thing you could do: Program your machine on Vega such that it recognizes both 10 and 11 as "top 1/3 performance."
Then, by the time you've transmitted a 1, the machine on Vega already knows to start purchasing stock: It doesn't need the next bit. This means that it doesn't matter (to you) whether you send 10 or 11; these are two different ways of representing the same message. Thus, if Company X has performance in the top 1/3 of your expectations, you can sell one bit back to GalCom: Send 10 if the next bit they were going to send was a 0, and 11 if the next bit they were going to send was a 1. You've just shaved one bit off of the next message that was going to come through GalCom!
(The above exposition implies that you need to know the first bit of the next message, and that your computer on Vega needs to be set up to forward the second bit to where it needs to go. In practice, this sort of thing would be standard operating procedure on GalCom, which could use [ prefix free codes] to avoid the need for seeing/forwarding bits of following messages.)
If you encode your trit this way, then you reserve 2 bits in advance, and 1/3 of the time you get to sell one back. Thus, if you send lots and lots of trits using this encoding, you only pay $~$2 - \frac{1}{3} \approx 1.67$~$ galcoins per trit on average (assuming, [assume_independent_equally_likely_messages as we should], that each message occurs with equal frequency).
Can we do better? Yes. (TODO)