Looking at Human Instability, Trust, from Blockchain's Perspective

While most people associate the birth of blockchain with Bitcoin in 2008, the actual technology was invented nearly two decades earlier.

The origin of blockchain is a story of 3 distinct eras:

1. The theoretical invention in the early 90s

   The concept of a cryptographically secured chain of blocks was essentially invented in 1991 by two research scientists, Stuart Haber and W. Scott Stornetta. They wanted to create a system where digital document timestamps could not be tampered with. They needed a way to prove that a specific document existed at a specific time, much like a digital notary. So they designed a system where documents were cryptographically hashed (turned into a fixed-length string of characters that acts as a fingerprint). Crucially, the hash of each new document included the hash of the previous document. This created a “chain” where changing one old record would break the link to every subsequent record, which is the definition of a blockchain.

2. The structural optimization shortly after

   In 1992, Haber and Stornetta, along with Dave Bayer, upgraded their system by incorporating Merkle Trees (invented by Ralph Merkle in 1979). Before this, each “block” could only hold one document. Merkle Trees allowed them to bundle thousands of document hashes into a single block, making the chain efficient enough to handle large amounts of data. Bitcoin uses essentially the same structure today.

3. The practical application by Satoshi Nakamoto
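The Merkle Tree bundling from the second era can be made concrete with a small sketch. The code below folds a batch of document hashes into one root hash, so a single value commits to the whole batch. It is only an illustration: the function name `merkle_root`, the choice of SHA-256, and the rule of duplicating the last node on odd-sized levels (a convention Bitcoin also uses) are assumptions made here, not details taken from the 1992 paper.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Return the SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Fold a non-empty list of leaf hashes into a single root hash."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last node on odd levels
        # Hash each adjacent pair to build the next level up.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


documents = [b"patent", b"manuscript", b"contract"]
leaves = [sha256(doc) for doc in documents]
root = merkle_root(leaves)

# Changing any single document changes the root.
tampered = [sha256(b"patent (edited)")] + leaves[1:]
print(root.hex())
print(merkle_root(tampered) != root)  # True
```

However many documents go in, only the 32-byte root has to be placed in the chain.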

How to Time-Stamp a Digital Document (1991)

As we said earlier, Stuart Haber and W. Scott Stornetta’s work provides the direct architectural blueprint for what we now call the “blockchain.” We are going to expand on the specific pieces of literature that established the technology, starting with their first breakthrough paper.

The Breakthrough
This is arguably the most important paper in the history of blockchain. It tackled the problem of certifying when a digital document was created without relying on a centralized party (like a government or notary) to keep the records. Its answer was to link documents together using cryptography: the “hash” (digital fingerprint) of a new document would contain the hash of the previous document. This created a tamper-evident chain. If one tried to change a document from the past, its hash would change, which would break the link to the next document, and so on, alerting everyone that the record had been tampered with. This is the definition of a blockchain, and it originated in 1991.
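A minimal sketch of this linking idea, assuming SHA-256 as the hash function. The helper names `link_hash` and `build_chain` and the all-zero `GENESIS` value are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import hashlib

GENESIS = b"\x00" * 32  # assumption: placeholder "previous hash" for the first record


def link_hash(prev_hash: bytes, document: bytes) -> bytes:
    """Hash a document together with the hash of the record before it."""
    return hashlib.sha256(prev_hash + document).digest()


def build_chain(documents):
    """Return one link hash per document, each covering its predecessor."""
    hashes, prev = [], GENESIS
    for doc in documents:
        prev = link_hash(prev, doc)
        hashes.append(prev)
    return hashes


docs = [b"record from 2019", b"record from 2020", b"record from 2021"]
original = build_chain(docs)

# Forge the oldest record and rebuild: every later link changes as well.
docs[0] = b"record from 2019 (forged)"
forged = build_chain(docs)
changed = [a != b for a, b in zip(original, forged)]
print(changed)  # [True, True, True]
```

Because each hash covers its predecessor, a forger cannot touch one old record without also replacing every link that came after it.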

Background - Need for Digital Trust

Blockchain emerged for a reason: not as a tech-savvy show-off, but to address a problem of trust.

Suppose you and I are two scientists working independently on the same hard problem, and both of us happen to come up with the same solution. This is going to be unfortunate for one of us, because only one can patent it and get global recognition.

In an ideal world, let’s say you write up your solution in a paper whose publication date is 2024, and I do the same except that I am a little late: my paper is submitted in 2025 instead. Who gets the reward is not going to be a question.

But if I play a trick by somehow tampering with the record, wiping out the year 2025 and re-stamping it with “2023”, then the truth would be completely compromised.

This is an example showing the importance of certifying the date a document was created or last modified. It is crucial to verify the date a scientist first put in writing a patentable idea, in order to establish its precedence over competing claims.

In the pre-digital age, one solution for time-stamping a scientific idea involves daily notations of one’s work in a lab notebook. The dated entries are entered one after another in the notebook, with no pages left blank. The sequentially numbered, sewn-in pages of the notebook make it difficult to tamper with the record without leaving telltale signs. If the notebook is then stamped on a regular basis by a notary public or reviewed and signed by a company manager, the validity of the claim is further enhanced. If the precedence of the inventor’s ideas is later challenged, both the physical evidence of the notebook and the established procedure serve to substantiate the inventor’s claims of having had the ideas on or before a given date.

An example of a time-stamped lab notebook of the kind used in the pre-digital age

“But wait,” you might ask, “I don’t think this works even in a pre-digital age, because I can produce a fake time-stamp in the first place, right?” You are right that a lab notebook is not a perfect system, but in the pre-digital world, the “Lab Notebook” method relied on physical friction and external reputation to make forgery extremely difficult, even if not impossible. Here is why the “fake timestamp” attack was harder to pull off in 1990 than you might think, and why the transition to digital made this paper necessary.

We are talking about “sequentially numbered, sewn-in pages”. If we want to insert a fake invention from 2025 into a page dated 2020, we have a physical problem, because we cannot rip out the page (it is sewn in). We cannot insert a new page (the numbers won’t match, e.g., jumping from page 50 to 52). We would have to forge the entire notebook from scratch, rewriting years of work to fit the new fake entry in the middle. This is high-effort “Proof of Work.”

“Well, that doesn’t sound too bad, because I can still produce a fake time-stamp anyway, right?” you might object. This is where the Trusted Third Party (the Notary) comes into play. We emphasized that the notebook is “stamped on a regular basis by a notary public or… company manager.” You write your idea. You walk to the manager’s office. The manager looks at the calendar, writes the date, and signs over your entry. Simply put, you don’t write the time-stamp yourself; only the “officials” of the system do. We would call them the central authority. (If you bribe the notary to backdate the signature, the system breaks, but that is a legal issue beyond the scope of this post.)

As we can see, the key here is the truthfulness of the timestamp. Luckily, the pre-digital age rested on 2 assumptions that made traditional time-stamping generally safe:

1. Paper documents can be examined for signs of tampering.
2. A notary acts as the central, contemporaneous timestamp issuer.

In a digital world, however, these 2 assumptions collapse completely. Unlike paper documents, digital documents are trivially easy to tamper with without leaving a trace. A new time-stamping technique for digital documents must be invented, and it must satisfy the following 2 requirements:

1. Instead of putting a timestamp on physical paper, we “timestamp” the document content itself.
2. It must be impossible to stamp a document with a time and date different from the actual one.

A Naive Start

The revolutionary paper starts its narrative here by first proposing a “naive” setup: whenever a client has a document to be time-stamped, they transmit the document to a time-stamping service (TSS). The service records the date and time the document was received and retains a copy of the document for safe-keeping. If the integrity of the client’s document is ever challenged, it can be compared to the copy stored by the TSS. If they are identical, this is evidence that the document has not been tampered with after the date contained in the TSS records.

Note that this approach is considered “naive” because it relies on a centralized, trusted third party, which is the very problem Haber and Stornetta were trying to solve with their more advanced, blockchain-like solutions later in the paper. But for now, let’s still assume the TSS is trusted for just a little while, because the paper next introduces a couple of concepts that form the foundation of the final solution.

Although the naive solution does literally time-stamp a digital document, it has the following defects:

1. Privacy: The TSS and any network eavesdropper could see every detail of the document, including those meant to be private.
2. Scaling: The time and storage needed to transmit every full document to the TSS and keep it there quickly become impractical.
3. Network Security: The document could be corrupted or tampered with while in transit.
4. Trust: As stated above, nothing prevents the TSS from colluding with a client. This is the fundamental problem the paper ultimately sets out to address.

Looks like the real headache is the 4th defect. So how about tackling the first 3 to make the whole problem easier to deal with? That’s exactly what Haber and Stornetta decided to do. They temporarily assumed the TSS is fully trusted and introduced hashing: instead of transmitting the document, the client sends the TSS a fixed-length hash of it. This knocks away the privacy and scaling issues.

One more to go: the network security issue. This is where a concept from computer networking comes in handy: the digital signature, which uniquely identifies the signer (here, the TSS). When the TSS receives the hash value, it appends the date and time, then signs this compound document and sends it to the client. By checking the signature, the client is assured that the TSS actually did process the request, that the hash was correctly received, and that the correct time is included. This takes care of the problem of present and future incompetence on the part of the TSS, and completely eliminates the need for the TSS to even store records. With the signature in place, the only way to fake a time-stamp is to collude with the TSS, which brings up the last piece of the puzzle: trust in the authority.
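The hash-then-sign exchange can be sketched in a few lines. One loud caveat: Python’s standard library has no public-key signatures, so this sketch substitutes an HMAC computed with a secret held by the TSS. That is a stand-in, not what the paper describes; a real service would use an asymmetric signature so that clients can verify a stamp without knowing any secret. The names `TSS_KEY`, `client_request`, `tss_stamp`, and `check_stamp` are hypothetical.

```python
import hashlib
import hmac
from datetime import datetime, timezone

TSS_KEY = b"demo-key-held-by-the-tss"  # hypothetical secret, for illustration only


def client_request(document: bytes) -> bytes:
    """Client side: send only the fixed-length hash, never the document."""
    return hashlib.sha256(document).digest()


def tss_stamp(doc_hash: bytes, now: datetime):
    """TSS side: append the time to the hash and 'sign' the pair (HMAC stand-in)."""
    time_str = now.isoformat()
    tag = hmac.new(TSS_KEY, doc_hash + time_str.encode(), hashlib.sha256).digest()
    return time_str, tag


def check_stamp(document: bytes, time_str: str, tag: bytes) -> bool:
    """Recompute the tag for this document and time, then compare."""
    message = hashlib.sha256(document).digest() + time_str.encode()
    expected = hmac.new(TSS_KEY, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


doc = b"my patentable idea"
received_at = datetime(2024, 5, 1, tzinfo=timezone.utc)
time_str, tag = tss_stamp(client_request(doc), received_at)

print(check_stamp(doc, time_str, tag))                     # True
print(check_stamp(doc + b"!", time_str, tag))              # False: document changed
print(check_stamp(doc, "2023-05-01T00:00:00+00:00", tag))  # False: time changed
```

Note that the TSS never sees the document and stores nothing. The stamp still depends entirely on the TSS choosing to record the true time.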

Tackling Trust Issue Step 1 - Linking

To prevent a fundamental misunderstanding, we must first be very specific about our problem setup: in the “Linking” scheme, we have random and unrelated people sending their own distinct files to the TSS. For example, we have 3 clients, A, B, and C:

- Client A is an inventor in New York time-stamping a patent (Document A).
- Client B is a novelist in London time-stamping a book manuscript (Document B).
- Client C is a bank in Tokyo time-stamping a contract (Document C).

Clients A, B, and C are timestamping different documents. They don’t know each other, and they don’t care about each other’s documents.

If Client A wants to timestamp Document A, this is what’s going to happen between Client A and TSS:

1. Client A hashes Document A and sends the TSS $(y_n, \text{ID}_n)$, where
   - $n$ is the current sequence number, i.e. Client A is the $n$-th client and Document A is the $n$-th document
   - $y_n$ is the hash of Document A
   - $\text{ID}_n$ is the ID of Client A
2. The TSS receives the time-stamping request and, either honestly or deceptively, dates Document A with a time $t_n$.

   WARNING
   $t_n$ is NOT the timestamp

3. The TSS puts everything inside a sequentially numbered timestamp certificate of the form $c_n = (n, t_n, \text{ID}_n, y_n; L_n)$, where $L_n = (t_{n-1}, \text{ID}_{n-1}, y_{n-1}, H(L_{n-1}))$ is the linking information and $H(L_{n-1})$ is the hash of the linking information of request $n-1$. This effectively makes every certificate recursively depend on all of the timestamps before it.

   WARNING
   Certificate $c_n$ is NOT the timestamp

4. The TSS signs the certificate ($s = \sigma(c_n)$) and sends the signed certificate back to Client A.
5. When the next timestamp request $(n + 1)$ (say, from Client B) is sent to the TSS, the TSS repeats steps 2 ~ 4 and sends a follow-up message containing the ID of Client B, $\text{ID}_{n+1}$, to Client A, who now has the complete timestamp defined by $(s, \text{ID}_{n+1})$.

   WARNING
   Timestamp is $(s, \text{ID}_{n+1})$, NOT $t_n$ or $c_n$
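The steps above can be sketched as a toy TSS that builds $c_n$ and $L_n$ exactly as defined. This is a simplified illustration under stated assumptions: the signature $\sigma$ is omitted (the certificate itself stands in for $s$), the first request has no linking information, and the names `Certificate`, `LinkingTSS`, and `follows` are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple

# L_n = (t_{n-1}, ID_{n-1}, y_{n-1}, H(L_{n-1}))
Link = Tuple[str, str, str, str]


def H(link: Optional[Link]) -> str:
    """Hash linking information (assumption: hash its text representation)."""
    return hashlib.sha256(repr(link).encode()).hexdigest()


@dataclass(frozen=True)
class Certificate:
    n: int                 # sequence number
    t: str                 # time t_n chosen by the TSS
    client_id: str         # ID_n
    y: str                 # y_n, the document hash
    link: Optional[Link]   # L_n, None for the very first request (assumption)


class LinkingTSS:
    def __init__(self) -> None:
        self.certs = []

    def stamp(self, y: str, client_id: str, t: str) -> Certificate:
        """Steps 2 and 3: date the request and link it to the previous one."""
        prev = self.certs[-1] if self.certs else None
        link = (prev.t, prev.client_id, prev.y, H(prev.link)) if prev else None
        cert = Certificate(len(self.certs) + 1, t, client_id, y, link)
        self.certs.append(cert)
        return cert


def follows(prev: Certificate, nxt: Certificate) -> bool:
    """Check that nxt's linking information really describes prev."""
    return nxt.link == (prev.t, prev.client_id, prev.y, H(prev.link))


tss = LinkingTSS()
y_a = hashlib.sha256(b"Document A").hexdigest()
y_b = hashlib.sha256(b"Document B").hexdigest()
c_a = tss.stamp(y_a, "A", "2025-01-01T09:00")
c_b = tss.stamp(y_b, "B", "2025-01-01T09:05")

# Step 5: Client A's complete timestamp is (s, ID_{n+1}); c_a stands in for s.
timestamp_a = (c_a, c_b.client_id)

backdated = Certificate(c_a.n, "2020-01-01T09:00", c_a.client_id, c_a.y, c_a.link)
print(follows(c_a, c_b))        # True
print(follows(backdated, c_b))  # False: B's certificate records A's real time
```

The last two lines show why a certificate cannot be edited in isolation: Client B’s certificate carries a copy of Client A’s time, ID, and hash.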

The genius of the Haber-Stornetta “Linking” solution is that it forces these strangers to unknowingly become “witnesses” for each other. Imagine a busy notary’s office where everyone stands in a single line.

1. Client A gets their document stamped.
2. Client B walks up next. The Notary says: “Before I stamp your book manuscript, I am going to write ‘I just finished helping Client A’ on the bottom of your page.”
3. Client C walks up next. The Notary says: “Before I stamp your contract, I am going to write ‘I just finished helping Client B’ on the bottom of your page.”

Now, let’s look at a collusion scenario. Suppose Client A wants to lie. She goes to the Notary (TSS) today (2025) and says: “Here is $1 million. Please give me a backdated stamp saying I wrote my patent in 2020.”

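Even if the Notary agrees, the fraud does not hold up. Clients B and C have already been stamped, and their receipts record who stood in front of them in line. Slipping a 2020 stamp for Document A into the chain changes the links that every later receipt depends on, so those receipts would have to be reissued as well. Client A would have to find Client B (and Client C, and Client D…) and bribe all of them to destroy their old receipts and accept new fake ones. In practice, the fraud is impossible.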

Essentially, the system uses the continuous stream of the world’s data to lock every individual document into a permanent, unchangeable timeline. We don’t need to trust the Notary. We just need to trust that the Notary cannot bribe the entire world to change their old receipts.

Tackling Trust Issue Step 2 - Making the World Distributed

The flaw of “linking”, however, is that the chain is serial. Because A is linked to B, who is linked to C, a sufficiently rich attacker theoretically only needs to bribe the specific people in that specific chain. If the TSS has low traffic (e.g., only one client per day), the “chain” might be short and easy to bribe. The paper itself also acknowledges this: “The only possible spoof is to prepare a fake chain… long enough to exhaust the most suspicious challenger.”

To address this issue, Haber and Stornetta dropped their final bombshell, Distributed Trust: $k$ random clients are pulled out of the crowd specifically to act as independent notaries.

The Distributed solution is stronger because the witnesses are chosen by a hash function (which is effectively random), so an attacker cannot predict who they will be. If I want to backdate a document to 2020 using the Distributed method:

1. I create the document.
2. I hash it.
3. The math says: “Okay, for this hash, the witnesses were User #405, User #922, and User #11 in 2020.”
4. I now have to find those specific three people and bribe them to lie about 2020.

The Catch: I couldn’t have known who to bribe until I created the document. And if I change the document even slightly, the math selects a totally different set of three people.
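A small sketch of how a document hash can pick the witnesses. The function `select_witnesses`, the 1,000-entry directory, and the use of Python’s `random.Random` as the generator are all assumptions for illustration. In particular, `random.Random` is not cryptographically secure, so it only demonstrates that the selection is a deterministic function of the hash.

```python
import hashlib
import random


def select_witnesses(document: bytes, directory, k: int):
    """Derive k distinct witness IDs from the document hash."""
    y = hashlib.sha256(document).digest()
    # Assumption: random.Random is NOT cryptographically secure; it is used
    # here only to show that the selection depends on nothing but y.
    rng = random.Random(int.from_bytes(y, "big"))
    return rng.sample(directory, k)


directory = [f"User #{i}" for i in range(1000)]  # hypothetical user list

first = select_witnesses(b"my patent", directory, 3)
again = select_witnesses(b"my patent", directory, 3)
edited = select_witnesses(b"my patent.", directory, 3)

print(first == again)  # True: same document, same witnesses
print(first)
print(edited)          # a one-character edit almost certainly picks other users
```

The attacker therefore learns whom to bribe only after the document is fixed, and any edit to the document reshuffles the list.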

Randomness as Valid Substitute for Authority

Let’s summarize what this paper presented:

- Naive Solution: Trusts one party (the TSS). Fails if the TSS is corrupt.
- Linking Solution: Trusts the sequence of users. Fails if you can bribe the chain (rewrite history).
- Distributed Solution: Trusts the Law of Large Numbers. Fails only if you can corrupt the majority of the entire population instantly.

The idea of “Distributed Trust” solved a centuries-old problem: how do we establish objective truth without appealing to a higher authority? The question is especially pressing on social media. “All news is fake” is an exaggeration of a real point: social truth is not binary (True/False based on what the media reports); it is asymptotic (becoming more and more certain as more witnesses sign).

More importantly, however, this paper would be seen as the birth of “Algorithmic Trust” - the idea that code can function as a social institution. In the pre-digital age, trust was “Human-to-Human.” In the “Naive Solution,” trust was “Human-to-Institution.” This paper revolutionized trust to become “Human-to-Math” by replacing Moral Character with Game Theory. We don’t trust the witnesses because they are “good people”; we trust them because the randomness makes it impossible for the cheater to bribe them in time. This leads to a modern concept called “Trustlessness”. This is a misnomer; we don’t remove trust, we relocate it from fallible humans to immutable mathematics.

Trust is unstable because human nature is unpredictable. Absolute trust must be enforced by deterministic Game Theory.

Interlude: The Downside of an Innovation Launch

Although Distributed Trust was presented as the ultimate “pure” trustless solution in the 1991 paper, it had a massive practical flaw that made it almost impossible to build in the early 90s.

The “Distributed Trust” solution relied on selecting $k$ random witnesses.

$G(y) = \{\text{ID}_{\text{Bob}}, \text{ID}_{\text{Charlie}}, \text{ID}_{\text{Dave}}\}$

This sounds great on paper, but it raises three problems:

- Who is in the directory? To select “random users,” we need a global list of all users. Who keeps this list? If a central company (TSS) keeps the list, we are back to trusting a central authority.
- The “Fake Crowd” Attack: An attacker could register 1,000,000 fake users. If those fake accounts outnumber the real ones, an algorithm that picks 3 random witnesses is statistically likely to pick 3 of the attacker’s bots.
- Availability: Even if we pick honest witnesses, what if they are asleep? What if their computer is off? The system halts until they sign.
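To put rough numbers on the “Fake Crowd” attack, the sketch below computes the chance that all $k$ witnesses drawn at random (without replacement) are attacker-controlled. The honest population sizes are made-up values for illustration; only the 1,000,000 fake users and $k = 3$ come from the example above.

```python
from math import comb


def p_all_fake(honest: int, fake: int, k: int) -> float:
    """Probability that k witnesses drawn without replacement are all fake."""
    total = honest + fake
    return comb(fake, k) / comb(total, k)


FAKE_USERS = 1_000_000
for honest in (10_000, 1_000_000, 100_000_000):  # hypothetical honest populations
    print(f"{honest:>11,} honest users -> {p_all_fake(honest, FAKE_USERS, 3):.6f}")
```

The attack only works when fake accounts dominate the directory: with as many honest users as fake ones the chance is already about 1 in 8, and it shrinks quickly as the honest population grows. Since registering accounts is cheap, the attacker controls that ratio, which is why an open directory is the weak point.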

We are talking about a Global ID System that didn’t exist (and still effectively doesn’t).

On the other hand, the linking scheme only had a flaw that was “merely” an engineering problem. Having realized that creating a single, shared global timeline (the linking scheme) was more valuable than collecting individual signatures, the authors chose to double down on the linking scheme for practical implementation. The only problem with the “Global Timeline” was that it was too slow (linear), and Haber and Stornetta’s next paper set out to address exactly this issue.

Improving the Efficiency and Reliability of Digital Time-Stamping (1992)

We have to remember the limitation of the 1991 paper. In the 1991 system, the records formed a linked list:

$Doc_A \rightarrow Doc_B \rightarrow Doc_C \rightarrow Doc_D$

If the Time-Stamping Service (TSS) receives 10,000 requests in a minute, it has to perform 10,000 separate linking operations. Worse, to verify the 10,000th document, we might need to trace the chain back through all 9,999 previous steps. It was too slow and too heavy to work globally.
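The linear cost is easy to see in a toy model. The sketch below assumes a verifier who trusts nothing and recomputes the chain from its very first link; the helper names and the `doc-i` placeholder contents are hypothetical.

```python
import hashlib


def build_links(n: int):
    """Build n links where each hash covers the previous hash (one per request)."""
    links, prev = [], b""
    for i in range(n):
        prev = hashlib.sha256(prev + f"doc-{i}".encode()).digest()
        links.append(prev)
    return links


def steps_to_verify(links, index: int) -> int:
    """Recompute the chain from the start up to `index`, counting hash calls."""
    prev, steps = b"", 0
    for i in range(index + 1):
        prev = hashlib.sha256(prev + f"doc-{i}".encode()).digest()
        steps += 1
    assert prev == links[index]  # the recomputed link must match the stored one
    return steps


links = build_links(10_000)
print(steps_to_verify(links, 9_999))  # 10000: the document itself plus the 9,999 before it
```

The work to check a document grows in direct proportion to its position in the chain.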

(To be continued…)