Disco-ball Earth

Too bad we can’t convert the infrared getting trapped by the greenhouse effect back into ultraviolet that can escape. Or… can we?

[This is another in my occasional series of half-baked ideas for saving the world. If you can actually make this idea work, it is all yours, and a grateful planet will thank you.]

As you probably know, the climate crisis is due to the greenhouse effect, in which the Earth absorbs more energy from the sun than it is able to shed back into space, causing the planet gradually to grow warmer and warmer. This is a change from the past, when the Earth’s “energy budget” — the amounts of arriving and departing energy — was more or less in balance.

The problem is that a lot of solar radiation reaching the Earth is in the form of ultraviolet light, which is easily able to pass through the atmosphere and reach the surface, where it heats things up. Hot things emit infrared light, and if enough of that can escape back into space to offset the incoming ultraviolet, all is well.

But carbon in the atmosphere blocks infrared from escaping — while doing nothing to reduce the amount of ultraviolet getting in. All substances absorb some wavelengths of light and not others; that’s simply “color.” (Ultraviolet and infrared are just colors our eyes can’t perceive.) When it comes to the gases in the atmosphere, the color hand we’ve been dealt is: let in UV, trap IR. It seems unfair, but chemistry doesn’t care about your feelings.

The politics of our age make it doubtful we can rebalance the energy budget by meaningfully reducing the amount of carbon in the air in a useful timeframe. Too bad we can’t convert the infrared getting trapped back into ultraviolet that can escape, as an alternative.

Or… can we?

Much of what the sun heats up is ocean — naturally, since that’s most of the Earth’s surface. The warming of the oceans is associated with more-intense storms, acidification and coral bleaching, imperiling the Gulf Stream, and a host of other ills.

Most of the warming of the oceans is confined to the upper few hundred feet of depth. Below that in most places is a thermocline — an abrupt temperature drop, with much cooler water below, mostly isolated from the warming effects above.

The thermoelectric effect is a physical phenomenon that can convert a difference in temperature into an electric current.

Putting all of the above together, here’s the idea: build a buoy that floats on the ocean. Beneath the buoy, a long wire extends down past the thermocline. The difference between the surface temperature and the temperature at depth creates a current in the wire — small, but continuous. The current is used to power an ultraviolet laser in the buoy, aimed at the sky. It shines weakly, but continuously, steadily drawing heat from the ocean and beaming it into space.

With enough of these simple, inexpensive units built and deployed, we should be able to offset the greenhouse effect. Doing the math on how many that would be is left as an exercise for the reader. Undoubtedly it would take thousands, perhaps millions, of UV-laser buoys floating in oceans all around the world.

One thing is for sure, though: if this solution works, saving the world wouldn’t be the only cool thing about it. An alien looking at the Earth from space, with eyes that can perceive ultraviolet rays, would see a spinning celestial disco ball.

Yegging him on

It is a good day when Steve Yegge has a new rant to read.

Yegge is a veteran software engineer whose career runs strangely parallel to mine. We overlapped for a short time at Amazon in the early 2000s, and a few years later at Google. More recently we both worked for companies enabling mobile payments in Asia. We’re both opinionated bloggers (each of whom has name-dropped the other), we’re both Emacs partisans, and we’re both anguished by how Google’s technical superiority is matched by utter cluelessness in product design and marketing.

Where Yegge outshines me by far is in his entertaining, informative, impassioned, and dead-on-accurate rants. His most famous one is probably his Platforms Rant, which was meant to be Google-internal only but made headlines when it was posted publicly by mistake. In that one he implored Google to invest more effort into making its products, which were increasingly “walled gardens” with inflexible feature sets dictated by competitors, into platforms that would allow others to build onto them, the way Amazon was doing. This rant came in the early days of Google+, when many of us within Google were expressing concern over its product design and the lack of any useful APIs that would allow an open ecosystem to develop around it. Ironically, his rant was a Google+ post, and it was the product design, in part, that led to its being misposted publicly. Also ironically, Google+ is now dead—arguably from the very causes Yegge and I and others identified back then—taking his Platforms Rant post with it. (However, it’s preserved in other forms around the net; just google [yegge platform rant].)

In his latest rant he again improves on one of my own frequent refrains: that Google keeps giving you shiny new things and then keeps yanking them away. Like me, he’s a user of Google Cloud Platform products; like me, he is increasingly frustrated by how often those products require you to rewrite your own code to adapt to Google’s changes; and like me, he is considering abandoning Google Cloud Platform for this reason, in favor of the more stable (if less technically excellent) Amazon Web Services platform.

Dear Google Cloud: Your Deprecation Policy is Killing You

RMS, titanic

One afternoon in 1996, as I worked with my partners at our software startup, the phone rang. I answered it, and a voice on the other end said, “Richard Stallman?”

This was disorienting. Richard Stallman was the legendary technologist who had created the Free Software Foundation, dedicated to freedom from corporate and government control for those who program computers and those who use them. He founded the GNU project, dedicated to creating an alternative to the Unix operating system unencumbered by patents and copyrights. He was famously ensconced in an office at MIT, not a house in a northern California suburb doubling as office space for our startup. Why would someone call us looking for him, there?

Or did the caller think I was Stallman??

The moment was even more baffling because I was then at work (as a side project) on a book about Stallman’s other great creation, Emacs, the text editor beloved by programmers. So it wasn’t as though there was no connection between me and Stallman. But he wasn’t involved in my writing project; he had merely invented the thing it was about. That was a pretty slender thread. How do you get from that to expecting to find the great man himself in our humble headquarters?

Three years earlier I did work briefly with Stallman, after a fashion. The GNU project was releasing a new file-compression tool called gzip. Stallman wanted files compressed by gzip to have names ending with “.z”. In an e-mail debate with him, I argued that this would make them too easy to confuse with files created by “compress,” a predecessor to gzip, which used a “.Z” filename suffix. The distinction between uppercase “.Z” and lowercase “.z” would be lost if those files were ever stored on, or passed along by, an MS-DOS computer, which permitted only monocase filenames. Stallman, in his typical mulish way, refused to allow any consideration of how Microsoft software behaves to influence what the GNU project should do. But I was insistent, not least because I believed that the potential for confusion would harm the reputation of the GNU project, and I wanted GNU to succeed. I was on Stallman’s side! I was joined in my opinion by a couple of others on that thread. In the end Stallman relented, and as a result gzip used (and still uses) the filename suffix “.gz”.

This was a rare concession from a man whose primary goal with the Free Software Foundation was the repudiation, on principle, of the entire edifice of intellectual property law. The creation of actually useful software was only ever secondary to that goal.1 To the extent that Microsoft owed its existence to intellectual-property plunder, Stallman would have seen it as a moral obligation not to allow it to affect the design of GNU gzip.

Stallman was never one to allow pragmatism to overcome principle, an outlook that extended far beyond his professional pursuits and into all aspects of his public persona, with results often off-putting and occasionally problematic. In principle, why should anyone object to an impromptu solo folk dance in the middle of a fancy restaurant (as recounted in Steven Levy’s recent Wired article)? No one should, of course — in principle. In practice, most of us would agree there are good reasons to keep your spontaneous folk-dancing inhibitions in place. But Stallman is not most of us. In principle, it’s merely being intellectually honest to engage in a little devil’s-advocate hypothesizing on the Jeffrey Epstein scandal, and how Stallman’s colleague Marvin Minsky might have been involved. In practice, for a prominent public figure — one with authority over others — to do so at this moment, and in that way, betrays at best a cluelessness that’s just this side of criminal. It’s what forced Stallman to resign recently from the organization he’s led for over three decades.

But in 1996, when the phone rang at my startup, Stallman was, to me and my colleagues, simply a legendary hero hacker and fighter against oppression. When I said, “Hello?” and the voice on the other end said, “Richard Stallman?” the effect on me wouldn’t have been too different if it had said, “Batman?”

I stammered something along the lines of, sorry, this is Zanshin, in California; Richard Stallman works at the Massachusetts Institute of Technology. The voice said, “No, this is Richard Stallman.” What I had taken for a question mark was really a period. (Or possibly an exclamation point.)

In principle, it makes perfect sense to shorten, “Hello, this is Richard Stallman” to “Richard Stallman.” Those four other syllables seem superfluous; might as well save the effort it takes to utter them. In practice, of course, it is decidedly odd when placing a phone call simply to declare your own identity and expect your intention to be understood, especially when you leave off anything like, “May I speak to Bob Glickstein please?”

Stallman was calling me, it turns out, because of the book I was writing. He wanted to know if I would consent to giving the book away for free. (A few years later Stallman would put the same pressure on his biographer, Sam Williams, as recounted in the Salon.com review of Williams’ book.) I said that I was not unsympathetic to his request — after all, Emacs, the topic of my book and the output of many programmer-hours of labor, was distributed for free by the FSF. But how could I consent, when my publisher had production and marketing costs to recover? What about the value of all the time I had invested, couldn’t I reasonably expect some compensation for that, especially since I was not yet drawing any salary from my startup? I additionally thought, but did not say out loud, that unlike Stallman himself I had not earned a MacArthur genius grant to fund my writing and programming whims.

Stallman had no answer for the questions I posed, other than to reiterate a few times his certainty that the book should by rights be free. We ended our call, and (as it turned out) our professional association, at a stalemate on this topic.

As with the gzip episode, I was nominally on Stallman’s side. I would have given serious consideration to his request if he could have compromised somehow, or if he could have spoken about the prospects for earning revenue from a product even when it’s given away for free, or, hell, if he could simply have articulated some understanding of or sympathy for the objections I raised. But he was doctrinaire. The principle was the one and only consideration for him.

The paradox of Richard Stallman is that this single-mindedness made him remarkable and allowed him to achieve remarkable things; but his disregard for pragmatism in favor of an insistence on principle cost him the goal of freely distributing my book, on this occasion — and, on another occasion twenty-odd years later, also cost him his career.

  1. Ironically it’s that secondary goal at which the FSF has been more successful by far (despite the many who have rallied to Stallman’s anti-copyright banner — myself included, with varying degrees of conviction over the years). Intellectual property law is as constraining to individuals and organizations as ever. But you and I and everyone we know and, not to put too fine a point on it, our entire modern information economy, depend daily on infrastructural software created by the FSF.

I am not a shady Internet cigarette affiliate

If you’ve ever read a post on this blog and wondered why occurrences of the word “cigarette” were linked to a shady (now defunct) online cigarette store called Cigazilla, the answer is this blog was hacked. It happened some time between 2012 (when the most recent Wayback Machine snapshots show no Cigazilla links) and 2015 (when the earliest backup I still have shows Cigazilla links embedded in the text). I suspect some unpatched WordPress exploit, or an exploit in one of the few WordPress plugins I use, allowed this to happen.

Please let me know if you see anything else on the site that looks like it doesn’t belong.

Tredd


Recently published: my side project, Tredd, “Trustless Escrow for Digital Data,” a proposal for the secure exchange of data for payment online.

What’s your position on GPS?


[Cross-posted at https://medium.com/@bob.glickstein/whats-your-position-on-gps-bc98a5dff6db.]

How does GPS work?

If you’re like most people, you think it works something like this: there are satellites in orbit around the Earth. Your phone or other GPS device sends a signal to the nearest one of the satellites. Some math happens and the satellite responds with your location.

This is wrong. It’s wrong for reasons that should be obvious. Despite that, everyone believes some version of this, as near as I can tell.

This came up during the current high school mock trial season, in which my son is a mock prosecutor. In a mock trial season, all the schools in California study the same fictional case, and then the prosecution of one school meets the defense from another school in a “scrimmage” conducted like a jury trial. This year’s fictional case is a murder, and the trial begins with a defense motion to suppress some evidence: namely, GPS location data from the defendant’s car (which shows the defendant apparently stalking the victim in the days before the murder). The question argued by the kids, and that the judge must decide, is this: if a person’s car is continually transmitting its location to a third-party service provider (think Google) and the police search that third party’s records, does this infringe upon the person’s Fourth Amendment rights protecting against unreasonable searches?

I’ve sat in on several practices and scrimmages. The discussion of this motion centers on something called the Third Party Doctrine, which says that if you voluntarily give your information to a third party, you cannot reasonably expect that information to remain private, and the government can obtain that information without violating your Fourth Amendment rights. So what’s “voluntary” and what’s “giving” and what’s a “third party”? Drilling into these questions is where the universal misunderstanding of GPS often comes up. If your GPS device is already giving its location to a satellite (the debate goes), how is that different from giving it to a company that provides driving directions?

I’ve heard this now from the students arguing the case, and from their teacher, and from the volunteer attorneys coaching the team, and even from the Superior Court judge who presided over their first tournament meet yesterday. It’s disturbing not only because of the technological illiteracy it reveals, but also because it shows how accepting we’ve become of the idea that our private data is simply out of our control.

In fact a GPS device never sends anything to the satellites in orbit. The satellites are broadcast-only, like a radio station, which has no idea when you tune into it, or a clock tower, which shows the time whether or not anyone asks for it. They are artificial stars that are always “visible” to the devices that know how to see them.

Each satellite continually broadcasts its own position in space, plus the current time according to its super-accurate atomic clock. Your GPS device receives this signal from several different satellites at once. Because of the speed-of-light delay, the signals from different satellites take different amounts of time to reach you. So though the satellite-A signal might say “it’s six o’clock and 33.227 seconds,” the satellite-B signal reaching you at the same instant might say “it’s six o’clock and 33.221 seconds.” Satellite B’s signal was sent earlier and so has been traveling longer, which tells your GPS device that you’re closer to satellite A than to satellite B and by how much.1 With a couple more satellites’ signals it’s possible for your device to triangulate its position on Earth with high accuracy.
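For the curious, here is the arithmetic behind that comparison as a few lines of Python. This is only a sketch of the two-satellite example above: the function name is made up for illustration, and a real receiver must also correct for its own imprecise clock, which is one reason it needs more than two satellites. The key point is that when two signals arrive at the same instant, the one stamped with the later time left its satellite more recently and so covered less distance.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre
METRES_PER_MILE = 1609.344            # exact, international mile


def path_difference_miles(timestamp_a_s: float, timestamp_b_s: float) -> float:
    """Extra distance, in miles, covered by satellite B's signal compared
    with satellite A's, for two signals that arrive at the same instant.

    A positive result means satellite A is the closer one.
    """
    # B's signal was stamped earlier, so it has been in flight this much longer.
    extra_travel_time_s = timestamp_a_s - timestamp_b_s
    return extra_travel_time_s * SPEED_OF_LIGHT_M_PER_S / METRES_PER_MILE


if __name__ == "__main__":
    miles = path_difference_miles(33.227, 33.221)
    closer = "A" if miles > 0 else "B"
    print(f"About {abs(miles):,.0f} miles closer to satellite {closer}")
```

Run as a script, this reports a difference of roughly 1,118 miles in favor of satellite A.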

Why do people mistakenly believe that GPS satellites answer location queries from devices on Earth? In large part because of the way our smartphones work. They depend heavily on outsourcing work to computing resources in “the cloud,” continually sending requests and receiving responses, and we’ve grown accustomed to things working this way.

Why should it be obvious that, in the case of GPS, this is wrong? For one thing, our personal electronics have not worked this way for very long. We’ve forgotten that, before smartphones, standalone GPS receivers were sold as exactly that: receivers. Back then (just a decade or so ago) I don’t think anyone believed GPS devices ever sent signals anywhere, or in any other way leaked information about our whereabouts. With a court order, the police could seize your GPS receiver and inspect its memory of where it had been, but that information lived nowhere else, and it was largely outside anyone’s imagination that it even could.

Another reason this should be obvious: your smartphone is small. It has a small little battery and a small little antenna inside. They’re strong enough to send signals to the nearest wifi station, which is usually located within a few dozen feet, or the nearest cell tower, which is within a few dozen miles, but not to GPS satellites, which are over twelve thousand miles away.

A final reason this should be obvious: there are very many GPS devices making very many location queries every minute of the day. Responding to that many requests in a centralized location would take massive computing resources, the kind that Google and Amazon and Facebook have built multiple gigantic data centers to handle. We can’t put gigantic data centers in space. The stuff we can put in space has to run on solar power and be as light and simple as possible. It has to require no maintenance.

Now, to be fair, when you use a service like Google Maps to get driving directions, you do send your location to Google, which is then able to compute the best route for wherever it is you’re going. So the misconception isn’t total. But the location you send to Google came in the first place from old-fashioned GPS triangulation that, in itself, never needs to send anything anywhere. (Note that you can use Google Maps in “offline mode,” where maps are downloaded to your device before you start your trip, and while you’re en route, Google’s servers never get involved. Your device has everything it needs to show you your location and the route you should take. Not so long ago this was how all GPS devices worked!)

What does it say that so many of us believe the wrong thing about how GPS works, and are happy to use it anyway? It suggests to me one of two things: either we’re inattentive to encroachments on our privacy, the basis of our liberty; or we are attentive, we just put a low price on that privacy, trading it away for the convenience our smartphones offer. I’m not sure which is worse. I am sure that earlier generations would not have been nearly so willing to use technology that they understood so poorly.

  1. In this example, you’re 0.006 light-seconds closer to satellite A, which is about 1,118 miles.

Requiem for Warhol

Once, in a Warhol team brainstorming meeting, I had a pretty good idea.

Warhol was the team responsible for the YouTube video editor, which was sometimes described as “iMovie in the cloud.” You could assemble new videos out of pieces of old ones, apply various special effects, add titles and transitions and so on, all in your web browser. It was pretty sweet.

There were some basic features we knew we needed to add to the editor. “Undo,” for example. Fast “scrubbing” through clips, and easy clip splitting and merging. Audio “ducking” and “pre-lap” and “post-lap.”

But in this meeting we were brainstorming ideas that could distinguish us from tools like iMovie, rather than merely achieve parity with them. What’s something that a YouTube-based video editor could do better than others?

To me the answer was clear: it could use YouTube’s unfathomably vast collection of videos as a stock-footage library, allowing users to create mashups from among billions of source clips.

There was one problem: nearly all those billions of videos had been uploaded under the terms of the standard YouTube license, which prohibited third parties from using videos in novel ways (ways that the original uploader might not approve of, after all). True, there was an option to upload your video under a Creative Commons license that did allow reuse by others. In fact I had personally worked on adding that option. But that option was not well-known, and hadn’t existed for long, and it required uploaders proactively to choose it, so only a tiny fraction of the videos on YouTube were licensed that way. The overwhelming majority were legally unavailable to would-be masher-uppers.

My idea for fixing this was called “reactive licensing.” You could create a video in the editor using whatever clips you wanted, pulled from all over YouTube no matter how they were licensed, but you couldn’t publish your video until getting approval from the clips’ owners. You’d click a “request approval” button and we’d send a message asking those owners to review your video project. They could respond with “Approve,” “Reject,” “Ignore,” “Block,” etc. If you got all the needed approvals, your edited video would become publicly playable.

Reactive Licensing generated some excitement. Here was something that YouTube, and only YouTube, was perfectly suited for. I whipped together a working prototype and we were just about to staff the project when the Legal department quashed it. Turns out a prospective use of someone’s video in a mashup—even one visible to no one but the creator and the owner—still violates the terms of service.1

Some time much later, Legal pushed through a change to the standard YouTube license (for unrelated reasons), and now Reactive Licensing became feasible! A couple of us on the Warhol team got excited again, and I started gearing up a development effort. But things had changed since I’d first conceived of Reactive Licensing. For one thing, both management chains—engineering and product—had been entirely replaced, from my boss all the way up to and including the CEO of Google. There were a couple of departmental reorgs thrown in to boot. For another, Google had become fixated on mobile computing, determined not to miss the boat on that trend as it felt it had with social networking. Everything that wasn’t a mobile app or couldn’t be turned into one became a red-headed stepchild, and the video editor was fatally desktop-bound. Finally, the creator and chief evangelist of the Warhol project had left to go work at Facebook. With his leadership, YouTube had harbored an institutional belief in the importance of balancing video-watching features with features for video creators and curators. Now, despite my efforts to keep it alive, that belief seemed to have departed along with him and the other managers who had supported it. The priorities that came down to the Warhol team now amounted to building toy apps that barely qualified as video-creation tools, such as the Vine workalike, or the thing for adding fun “stickers” to a video. (“Wow!”)

By that point my days at YouTube were numbered. This stuff simply wasn’t interesting—not to me, nor (I was sure) to our users. There were many interesting things we could have been doing, and that we knew our users wanted, but my strenuous efforts to make any of those happen were all denied.

My days at YouTube had seemed numbered once before, years earlier, after a frankly undistinguished tenure on two other teams that held little interest for me.

Back in those days, it was Google’s policy not to hire engineers for any specific role, but to hire “generalists” who they felt could learn whatever they needed to know for wherever Google most needed them. I knew this when they hired me, but I still expected they’d put me on their new Android team (because I’d just finished 5+ years at Andy Rubin’s previous smartphone startup, Danger) or on their Gmail team (because I’d spent most of the preceding two decades as an e-mail technologist). I was surprised and a little disappointed when they put me at YouTube instead. I had no particular interest in or knowledge of streaming video. But more than that: YouTube was and is designed to keep you in a passive, semi-addicted state of couch potatohood, for which I was philosophically misaligned. I wanted to produce tools people could use. I wanted to empower the little guy and disintermediate the gatekeepers. Working on e-mail all those years, I’d been able to tell myself I was improving the world by making it easier for people to communicate with each other. Helping YouTube reach a milestone like a billion hours of watched video per day failed to move me.

On the other hand, Google was the Cadillac of software engineering jobs, and in those days it was still doing pretty well at living up to its “don’t be evil” motto. That, and the proximity of the YouTube office—half an hour closer to home than the main Google campus—was enough to energize me for a while… but only for a while.

If I hadn’t learned of the Warhol project, or if I’d been unable to transfer onto that team, my time at Google would have been over after two mostly forgettable years instead of seven mostly exciting ones. I hadn’t dreamed it was possible to build a working video editor in a web browser, but once I knew it was, I was hooked on the idea of delivering an ever more powerful creative tool to aspiring moviemakers who lacked the fancy computers and software they would otherwise need. To me it was the early days of desktop publishing all over again, but for video. Here at last was a niche at YouTube that wasn’t about driving increased “watch time.” It was about nurturing artistic expression.

We had big plans. We had working prototypes of a variety of special effects. We would build “wizards” that could make suggestions about shot sequences and pacing. We would give guidance on composition and color. We would commission educational materials from professional filmmakers. It would be “film school in a box.”

But even at its height, the Warhol project never quite got the resources or the marketing it needed, and certainly not enough executive leadership. Only seldom did we get to add one of the essential missing features we needed (like “undo”), to say nothing of the ones on our blue-sky wishlist. The rest of the time we were diverted onto other corporate priorities, such as specialized video-editing support for the short-lived Life In a Day tie-in, or addressing some complex copyright issue, or fixing bugs and performance problems.

Still, the YouTube video editor was well-loved and well-used by a small, dedicated group of users in the know. I myself relied on it while my kids were growing up for sharing well-edited videos of them to the families back east. But given its declining importance to YouTube’s management, it was just a matter of time before they killed it, like so many other beloved but neglected projects at Google. And now that inevitable day has come: the YouTube video editor will be discontinued on September 20th.


  1. For you copyright nerds: This was due to “synchronization rights,” an aspect of copyright that prohibited us from combining two videos in a way that could be construed as synchronizing one to the other. The design of the Warhol service was such that the edited video was created on our servers, and the result streamed to the user’s computer. If we could have arranged for the actual edits to happen on the user’s computer—ironically, the way iMovie works—we would have sidestepped the sync-rights issue. While not impossible, that would have been a cumbersome experience that defeated the purpose of a cloud-based video editor.

    Sync rights doomed another feature I’d hoped to create: “serendipitous multicam.” I was at a school play at my kids’ elementary school when I realized that nearly all the parents were shooting the same video. Several of them would later upload their videos to YouTube. If it weren’t for sync rights, YouTube could identify clusters of videos all recording the same event (using Content ID, the same audiovisual matching system used for detecting illicit uploads of copyrighted material), arrange them on a common timeline, and present them as different “camera angles” in a video-editing project, allowing everyone to stitch together their own best-possible movie from them.

IT don’t come easy


Currently in the news, as I write this: the personal data of nearly all American voters was accidentally leaked by Deep Root Analytics, a conservative marketing firm employed by the Republican party, specializing in targeting political ads.

It is only the latest (and largest) in a seemingly endless stream of stories about accidental leaks of sensitive data online.

It isn’t that computer security is hard – at least, not compared with other kinds of engineering challenges, such as building a bridge that won’t fall down. Paradoxically, the problem is that programming is so easy.

Never mind that programmers often get called “wizard” and “genius” – that’s only occasionally true. The fact is that most programming is dead easy. Indeed the ease of creating working software is the very reason for the technology revolution we’ve been living through these past few decades. Remember when HTML was new and suddenly everyone and their dog had a web page because it was so easy? Programming is like that, but for making machines do useful work.

Not all programming is easy of course. Some of it is quite tricky – like computer security. But because so much of the rest of programming is so easy, most software engineers never develop the habits of rigor and precision that in other fields are simply the price of admission. The result: incompletely tested code full of exploits, best practices not followed and oftentimes not even known about, and your personal data and mine secured by the digital equivalent of Barney Fife.

Behind the magic of the blockchain

This entry is part 2 of 2 in the series The magic of the blockchain

[Cross-posted at blog.chain.com.]

This is part two in a series. In part one, we learned that the big idea behind blockchains is this:

I don’t give you digital data as payment. I give the rest of the world a signed statement saying I paid you.

In this article we’ll take a closer look at just how this is done. That is, we’ll look at how:

  • I give the rest of the world
  • A signed statement
  • Saying I paid you

Let’s take these one at a time, in reverse order.

Step 3: …Saying I paid you

Suppose I want to pay you ten dollars on a blockchain. To “say” that I paid you, I have to construct a message called a transaction that combines information about what I’m paying with where I’m sending it.

what “ten dollars”
where “to you”

The ten dollars is called the input to the transaction. Where it’s going is called the output. Ultimately this message will be incorporated into a blockchain, which we learned last time is a ledger – a record of transactions – that is immutable, distributed, and cryptographically secure. More about this below.

Of course I have to have ten dollars before I can pay it to you. It has to come from somewhere. So the input needs to be some earlier transaction saying that someone paid me ten dollars.1 This means each transaction has to have some sort of unique name, or number, or other identifier, so later transactions can refer back to earlier ones.

transaction-id “unique identifier for this transaction”
input “unique identifier for some earlier ten-dollar transaction”
output “you”

What if the only earlier transaction I have is one where I received twelve dollars? Since I only want to send you ten, and since I have to use up all of the earlier transaction (for reasons that will become clear), my new transaction must send you your ten dollars and must also send me two dollars as change. This means that a transaction must be able to have multiple outputs.

transaction-id “unique identifier for this transaction”
input “unique identifier for some earlier $12 transaction”
output1 “$10 to you”
output2 “$2 to me”

Now that we’ve decided transactions can have multiple outputs, it’s necessary to say which output of an earlier transaction you’re using as the input.

transaction-id “unique identifier for this transaction”
input “unique identifier for output1 of some earlier $12 transaction”
output1 “$10 to you”
output2 “$2 to me”

And what if I don’t have a single $10 or $12 transaction to draw on, but I do have a $5 one and a $7 one? Let’s let transactions have multiple inputs as well as multiple outputs.

transaction-id “unique identifier for this transaction”
input1 “unique identifier for output1 of some earlier $5 transaction”
input2 “unique identifier for output1 of some earlier $7 transaction”
output1 “$10 to you”
output2 “$2 to me”
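
For readers who think in code, the shape these examples have been building up to can be pictured as a small data structure. This is only a sketch: the class and field names (OutputRef, Output, Transaction) are my own inventions for illustration, not the format of any real blockchain, and amounts are whole dollars to keep things simple.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OutputRef:
    """Points at one output of an earlier transaction (hypothetical name)."""
    transaction_id: str   # identifier of the earlier transaction
    output_number: int    # which of its outputs is being used


@dataclass(frozen=True)
class Output:
    amount: int           # whole dollars, an assumption for simplicity
    recipient: str        # "you", "me", ...


@dataclass(frozen=True)
class Transaction:
    transaction_id: str
    inputs: List[OutputRef]
    outputs: List[Output]


# The example above: a $5 input and a $7 input become $10 to you and $2 to me.
payment = Transaction(
    transaction_id="unique identifier for this transaction",
    inputs=[
        OutputRef("identifier of some earlier $5 transaction", 1),
        OutputRef("identifier of some earlier $7 transaction", 1),
    ],
    outputs=[Output(10, "you"), Output(2, "me")],
)

total_out = sum(output.amount for output in payment.outputs)
print(total_out)  # 12
```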

Let’s now focus on those unique transaction identifiers. How should they be chosen so that:

  • Distinct transactions have distinct identifiers, and
  • Anyone in the world can construct his or her own transaction, and
  • No one needs to coordinate with anyone else, or with any central authority, in order to construct a transaction?

The main problem is to prevent “collisions” – two different transactions having the same identifier. If you and I both construct a transaction at the same time, on opposite sides of the world, and don’t coordinate with each other or anyone else, what’s to stop us from accidentally choosing MYCOOLTRANSACTION17 as the identifier for both transactions?

Blockchains solve this problem using a technique called hashing. This is a process that transforms a message of any length, such as the transactions we’re constructing, into a single number of a predetermined size, called a hash. There are several different recipes for computing the hash of a message; they have names like MD5 and SHA1. But good hashing recipes all have the same goals:

  • Given a message, it must be easy to compute the hash (well, easy for a computer);
  • Given only the hash, it must be close to impossible to come up with a message that produces it (even for a computer!);
  • Two identical messages always produce the same hash;
  • Even a tiny difference between two messages must produce wildly different hashes.

The ease of going from message to hash, and the difficulty of going from hash to message, makes this a so-called one-way function, an idea that will be important a little later on.
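
Two of these goals are easy to see for yourself. The sketch below uses SHA-256 from Python's standard hashlib module; that recipe is my choice for the demonstration, picked because it produces 32-byte hashes, and the messages are made up.

```python
import hashlib


def hash_hex(message: str) -> str:
    """Return the SHA-256 hash of a message as 64 hexadecimal digits."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


a = hash_hex("$10 to you")
b = hash_hex("$10 to you")    # identical message
c = hash_hex("$10 to you!")   # one character added

print(a == b)   # True: identical messages, identical hashes
print(a == c)   # False: a tiny change gives a different hash
print(len(a))   # 64 hex digits, i.e. 32 bytes

# Count how many of the 256 bits differ between the two hashes.
differing_bits = bin(int(a, 16) ^ int(c, 16)).count("1")
print(differing_bits, "of 256 bits differ")
```

The other two goals (easy forwards, close to impossible backwards) are properties of the recipe itself and can't be shown in a few lines.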

Now, when squashing a long message down to a number of a predetermined size, it’s unavoidable that different messages will collide – i.e., produce the same hash. But if the predetermined size is big enough – 32 bytes, say – and if the recipe is very good at scattering hashes evenly throughout all 2^(32×8) possible values (that’s 100 quadrillion-quadrillion-quadrillion-quadrillion-quadrillion, give or take a few quadrillion-quadrillion-quadrillion-quadrillion-quadrillions), then the odds of a collision are so low as to be effectively impossible.2
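
To put a rough number on "effectively impossible," here is a back-of-the-envelope calculation. It assumes a perfectly even hashing recipe and uses the usual birthday-problem approximation (the chance of any collision among n messages is about n² divided by twice the number of possible hashes). The figure of a billion billion transactions is one I picked arbitrarily.

```python
# Number of distinct values a 32-byte hash can take.
possible_hashes = 2 ** (32 * 8)
print(len(str(possible_hashes)))  # 78 digits long

# Hypothetical workload: a billion billion transactions.
transactions = 10 ** 18

# Birthday-problem approximation, valid while the result is tiny.
collision_chance = transactions ** 2 / (2 * possible_hashes)
print(collision_chance)  # on the order of 1e-42
```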

So when you and I construct our transactions, we don’t choose identifiers at all. Instead, we compute identifiers that are nothing more or less than a hash of each transaction’s contents.

input1
  • transaction hash of some earlier $5 transaction
  • output1
input2
  • transaction hash of some earlier $7 transaction
  • output1
output1 $10 to you
output2 $2 to me
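
Here is a minimal sketch of computing an identifier from a transaction's contents. The JSON encoding, the field names, and the choice of SHA-256 are all assumptions of mine for illustration; real blockchains define their own exact byte formats. What matters is that the same contents always encode to the same bytes.

```python
import hashlib
import json


def transaction_id(contents: dict) -> str:
    """Hash a transaction's contents to get its identifier."""
    # sort_keys and fixed separators make the encoding repeatable.
    encoded = json.dumps(contents, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


tx = {
    "inputs": [
        {"transaction": "hash of some earlier $5 transaction", "output": 1},
        {"transaction": "hash of some earlier $7 transaction", "output": 1},
    ],
    "outputs": [
        {"amount": 10, "to": "you"},
        {"amount": 2, "to": "me"},
    ],
}
tx_id = transaction_id(tx)

# A copy with one amount altered gets a different identifier.
altered = json.loads(json.dumps(tx))
altered["outputs"][0]["amount"] = 11

print(transaction_id(tx) == tx_id)       # True
print(transaction_id(altered) == tx_id)  # False
```

No coordination is needed: anyone, anywhere, who builds the same transaction computes the same identifier, and anyone who builds a different one gets a different identifier.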

When you are deciding whether to accept this transaction as payment for something, you can consult the complete history of transactions on the blockchain to make sure that the inputs of this transaction really do exist, and that they haven’t already been spent in some other transaction. Later on, when you want to spend this money you’re now receiving, someone else will look at this transaction to make sure you own it.

Using a transaction’s hash as its unique identifier also explains why one must consume all of a transaction’s output at the same time (as when, in an earlier example above, I had to consume a $12 transaction output and return $2 to myself as change). If I could consume only part of an old transaction, that would alter the amount available from that old transaction. Altering the transaction would change its hash, which cannot be allowed if hashes are permanent, unchanging unique identifiers for transactions. Once published on a blockchain, a transaction can never change; it can only be referenced by newer transactions.
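
The bookkeeping described in the last two paragraphs can be sketched as a table of unspent outputs. Everything here is hypothetical: the function name, the placeholder hashes, and the rule that inputs and outputs must add up exactly (the text above doesn't mention fees, so I've assumed there are none).

```python
from typing import Dict, List, Tuple

OutputRef = Tuple[str, int]  # (earlier transaction's hash, output number)


def apply_transaction(unspent: Dict[OutputRef, int],
                      inputs: List[OutputRef],
                      output_amounts: List[int],
                      tx_hash: str) -> bool:
    """Spend each input whole and record the new outputs.

    Returns False, changing nothing, if the transaction is invalid.
    """
    if len(set(inputs)) != len(inputs):
        return False  # the same output is listed twice
    if any(ref not in unspent for ref in inputs):
        return False  # an input doesn't exist or was already spent
    if sum(unspent[ref] for ref in inputs) != sum(output_amounts):
        return False  # inputs must be used up entirely (no fees assumed)
    for ref in inputs:
        del unspent[ref]  # the old records stay as they are; they just stop being spendable
    for number, amount in enumerate(output_amounts, start=1):
        unspent[(tx_hash, number)] = amount
    return True


unspent = {("hash-of-$5-tx", 1): 5, ("hash-of-$7-tx", 1): 7}
my_inputs = [("hash-of-$5-tx", 1), ("hash-of-$7-tx", 1)]

first = apply_transaction(unspent, my_inputs, [10, 2], "hash-of-new-tx")
second = apply_transaction(unspent, my_inputs, [12], "hash-of-another-tx")
print(first, second)  # True False
```

The second attempt fails because the $5 and $7 outputs were used up by the first.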

Step 2: …A signed statement…

Remember that this transaction, like all others on a blockchain, is a message that’s going to everyone in the world. My earlier $5 and $7 transactions, the source of the funds I’m paying to you, are sitting out there on the blockchain for everyone to see, like all unspent transaction outputs, just waiting to be used. What prevents someone else from using them in a payment of their own?

This is where the “to you” and “to me” parts of the transaction outputs come into play. I need to be able to write “to you” in such a way that no one but you can construct a new transaction claiming that $10.

This is done using so-called public-private keypairs. You choose a very (very, very) large random number and keep it secret. This is your “private key.” This number can be transformed with some fancy arithmetic into another number, the “public key,” that you publish for everyone to see. The fancy arithmetic is a one-way function akin to hashing, so no one with only your public key can figure out your private key.

Public-private keypairs have some amazing superpowers. One of them is that you can digitally sign a message so that everyone in the world can be sure it’s you signing it. You do this by combining your private key in a particular way with the message you’re signing (or, more typically, a hash of the message you’re signing). The resulting “signature” has some special properties:

  • It was created using another one-way function, so no one looking at just the signature can discover either your private key or the message you’ve signed;
  • There remains a mathematical relationship between the signature and your public key, so if someone has that and the message you signed, they can verify that the signature is genuine. Even without knowing your private key, they can be sure the signature was made from it, and from that particular message and no other. (So no one can take your valid signature from one transaction and stick it on another one in the hope that it’ll be valid there – it won’t.)
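
To make signing and verifying concrete, here is a toy using "textbook RSA." That is a technique I've swapped in because it fits in a few lines of standard-library Python. It is not what any particular blockchain uses, and the key is made from two well-known primes that are far too small to be secure. Treat it as an illustration of the idea only.

```python
import hashlib

# Toy keypair built from two known Mersenne primes (insecure: demo only).
P = 2 ** 61 - 1
Q = 2 ** 89 - 1
N = P * Q          # modulus, published along with the public exponent
PUBLIC_KEY = 65537
PRIVATE_KEY = pow(PUBLIC_KEY, -1, (P - 1) * (Q - 1))  # keep this secret


def digest(message: str) -> int:
    """Hash the message and reduce it to a number smaller than N."""
    raw = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(raw, "big") % N


def sign(message: str, private_key: int) -> int:
    """Combine the private key with a hash of the message."""
    return pow(digest(message), private_key, N)


def verify(message: str, signature: int, public_key: int) -> bool:
    """Check a signature using only the public key and the message."""
    return pow(signature, public_key, N) == digest(message)


signature = sign("pay $10 to you", PRIVATE_KEY)
genuine = verify("pay $10 to you", signature, PUBLIC_KEY)
transplanted = verify("pay $1000 to you", signature, PUBLIC_KEY)
print(genuine, transplanted)  # True False
```

The second check is the "stick it on another transaction" trick from the list above, and it fails because the signature is tied to the hash of the original message.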

So to make sure that only you can access the $10 I’m paying you, I secure the output of my transaction by attaching your public key. I also secure the $2 in change that I’m paying to myself by attaching my public key.

input1
  • transaction hash of some earlier $5 transaction
  • output1
input2
  • transaction hash of some earlier $7 transaction
  • output1
output1
  • $10
  • to your public key
output2
  • $2
  • to my public key

In order to redeem one transaction’s output for use as the input to another transaction, the payee supplies a digital signature made from the new transaction’s hash and his or her private key. My transaction paying you $10 redeems $5 and $7 from two earlier transactions, which were paid to my public key, so I redeem them like so:

input1
  • transaction hash of some earlier $5 transaction
  • output1
  • signature made from this transaction’s hash and my private key
input2
  • transaction hash of some earlier $7 transaction
  • output1
  • signature made from this transaction’s hash and my private key
output1
  • $10
  • to your public key
output2
  • $2
  • to my public key

Anyone can look at this transaction and verify that my signature on the inputs matches the public key attached to the earlier transactions’ outputs. As long as I’ve kept my private key secret, no one else can produce a valid signature that matches both this transaction and my public key.

The balance of money that I own on the blockchain is simply the sum of all unspent transaction outputs that have my public key attached.
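
As a sketch of that sum, with made-up placeholder keys and hashes (and an extra $3 output I've invented so the total isn't trivial):

```python
from typing import Dict, Tuple

# (transaction hash, output number) -> (amount, public key it was paid to)
unspent_outputs: Dict[Tuple[str, int], Tuple[int, str]] = {
    ("hash-of-new-tx", 1): (10, "your-public-key"),
    ("hash-of-new-tx", 2): (2, "my-public-key"),
    ("hash-of-older-tx", 1): (3, "my-public-key"),  # hypothetical extra output
}


def balance(public_key: str) -> int:
    """Add up every unspent output that has this public key attached."""
    return sum(amount
               for amount, owner in unspent_outputs.values()
               if owner == public_key)


print(balance("my-public-key"))    # 5
print(balance("your-public-key"))  # 10
```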

Step 1: I give the rest of the world…

These transactions must be distributed to be useful, meaning that everyone in the world has, or can get, the data they need to validate transactions.3 If I create a transaction sending you $10, in principle you’ll need the entire history of earlier transactions leading up to that one in order to validate it (i.e., to believe that you’re really receiving $10), including all the unrelated transactions in the system to ensure I haven’t spent that same $10 somewhere else. When you want to spend the $10 I send you, your payee will need the same thing.4

It’s easy to imagine a system in which each new transaction is broadcast to all blockchain participants that are somehow subscribed to new-transaction notices. But the reality of network delays means that different subscribers will receive these notices in different orders. (Transactions that originate closer on the network will arrive sooner, in general, than transactions that need more “hops” to get to you.) The system only works if everyone has a consistent view of the transaction history: if I see A, then B, and you see B, then A, we might disagree about the validity of C, and a distributed ledger (or any ledger, really) can’t work if there’s disagreement about a transaction’s validity. Here’s why: if I were dishonest,5 I might try to exploit network delays to spend the same $10 twice, to two different people, each of whom might believe (thanks to differences in ordering) that theirs is the valid $10 and the other is the invalid double-spend. No one would be willing to accept either person’s (purported) $10 as payment for anything, and confidence in the whole scheme goes out the window.
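
Here is that disagreement in miniature. Two hypothetical subscribers each follow the same simple rule, "the first spend of an output that I see is the valid one," but receive my two conflicting transactions in opposite orders. The payees' names are invented for the example.

```python
from typing import List, Set, Tuple

Spend = Tuple[str, str]  # (output being spent, payee)


def accepted_payees(spends_in_arrival_order: List[Spend]) -> List[str]:
    """Accept the first spend of each output; reject later ones as double-spends."""
    already_spent: Set[str] = set()
    winners: List[str] = []
    for output, payee in spends_in_arrival_order:
        if output not in already_spent:
            already_spent.add(output)
            winners.append(payee)
    return winners


to_alice = ("my $10 output", "Alice")
to_bob = ("my $10 output", "Bob")

# Same two transactions, different arrival order at each subscriber.
view_near_alice = accepted_payees([to_alice, to_bob])
view_near_bob = accepted_payees([to_bob, to_alice])

print(view_near_alice)  # ['Alice']
print(view_near_bob)    # ['Bob']
```

Both subscribers followed the rules, yet they now disagree about who owns the $10.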

What’s needed is some authority that everyone can trust to put a stamp on the official correct ordering of transactions; and once the order is set, to publish the sequence for all to see. The published sequence could, in principle, consist of a list of individual, timestamped transactions, digitally signed by the timestamping authority; but if there are more than just a few transactions each second, the processing and communication overhead of this approach is prohibitive. For efficiency, it’s better to group transactions into blocks, certifying and publishing a block containing many transactions every so often, with each block linked to the block before it (by including the earlier block’s unchangeable hash, in the same way transactions refer to other transactions by their hashes) in an ever-lengthening blockchain.
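
The linking can be sketched in a few lines. The block layout, the JSON encoding, and the use of SHA-256 are assumptions of mine for illustration; a real block would also carry the authority's certification, which I've left out.

```python
import hashlib
import json
from typing import List


def block_hash(block: dict) -> str:
    """Hash a block's entire contents, including its link to the previous block."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def add_block(chain: List[dict], transactions: List[str]) -> None:
    """Append a block that records the hash of the block before it."""
    previous = block_hash(chain[-1]) if chain else ""
    chain.append({"previous_hash": previous, "transactions": transactions})


def chain_is_intact(chain: List[dict]) -> bool:
    """Check that every block still matches the hash stored in its successor."""
    return all(chain[i]["previous_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))


chain: List[dict] = []
add_block(chain, ["tx-a", "tx-b"])
add_block(chain, ["tx-c"])
add_block(chain, ["tx-d", "tx-e"])
intact_before = chain_is_intact(chain)

# Tamper with the oldest block: the link stored in the next block no longer matches.
chain[0]["transactions"][0] = "tx-forged"
intact_after = chain_is_intact(chain)

print(intact_before, intact_after)  # True False
```

Rewriting history would mean recomputing every later block as well, which is exactly what the published, certified sequence is there to prevent.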

Whom to trust for generating blocks in the chain? That depends on how a particular blockchain is going to be used. If it’s for managing an anti-authoritarian global cryptocurrency, the answer is “no one.” If it’s for managing the loyalty-reward points of a national coffee-shop chain, the answer is probably the corporate parent of the coffee shops. Other use cases require in-between levels of trust.

There are techniques for concentrating trust or spreading it around to match different use cases. The just-trust-headquarters case is easy, of course: everyone sends their proposed transactions there, and listens for the blocks that occasionally emerge, confirming their transactions. The trust-no-one case has everyone broadcasting their proposed transactions to as many others as they can, and everyone racing to collect them up and be the one that produces the next valid block in exchange for some small reward (a process called “mining,” designed so no one person or group can control the contents of the blockchain). The in-between case of trusting a group of independent authorities can require that, if one of that group proposes a block, all or a majority of the others must endorse it by adding their digital signatures.
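
The in-between case boils down to counting endorsements. In this sketch the five authorities and their names are hypothetical, and an "endorsement" is just a name in a set; in a real system each one would be a digital signature that gets verified first.

```python
from typing import Set

# Hypothetical group of independent authorities.
AUTHORITIES = {"bank-a", "bank-b", "bank-c", "bank-d", "bank-e"}


def block_is_endorsed(proposer: str,
                      endorsers: Set[str],
                      require_all: bool = False) -> bool:
    """Decide whether enough of the other authorities endorsed a proposed block."""
    if proposer not in AUTHORITIES:
        return False
    others = AUTHORITIES - {proposer}
    valid = endorsers & others          # ignore outsiders and the proposer
    if require_all:
        return valid == others
    return 2 * len(valid) > len(others)  # strict majority of the others


majority_ok = block_is_endorsed("bank-a", {"bank-b", "bank-c", "bank-d"})
too_few = block_is_endorsed("bank-a", {"bank-b", "mallory"})
not_unanimous = block_is_endorsed("bank-a", {"bank-b", "bank-c", "bank-d"},
                                  require_all=True)
print(majority_ok, too_few, not_unanimous)  # True False False
```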

In most cases, the simple existence of a transaction in a block of the blockchain is the transfer of money: final and authoritative, with no further steps required before the recipient can spend what they’ve just received – by adding a transaction of their own.

Sounds great but

Transferring money (or other kinds of value) on a blockchain is as fast and easy as handing someone cash – easier, since you don’t have to be in the same place to do it.

But cash isn’t the right answer for every type of transaction. Sometimes you need a delay, and sometimes you need to cancel or reclaim your payment. And what about this everyone-can-see-every-transaction business? Do you really want to give everyone in the world the ability to look at your whole purchase history?6

There are ways to preserve privacy on a blockchain, as well as ways to delay payment until a certain time elapses or other conditions are met, and even ways to eliminate “counterparty risk” (the risk that you pay for something and then don’t get what you paid for), but I’ve gone on long enough for now and discussion of those will have to wait until part three.

[My thanks to my Chain colleagues Adam Ludwin, Nadia Ali, and Zarya Faraj for their input on early drafts of this article.]
  1. And that transaction had to have a source too, and so on, and so on. Where do the dollars on a blockchain ultimately come from? It’s a good question with a complicated answer that we won’t get to in this article. The short version is that participants can “buy in” to a blockchain in the same way one converts dollars to chips in order to play at a casino (among other options).
  2. Many newcomers to hashing worry about the difference between “effectively impossible” and “actually impossible” and waste a lot of energy in a vain attempt to eliminate even the tiny remaining possibility of a hash collision. But that’s only because our ape brains are bad at understanding really, really, really tiny possibilities. When it’s likelier that your blockchain system will be disrupted by simultaneous drunken-rhinoceros stampedes at multiple datacenters than by even one hash collision, your efforts are better directed elsewhere (like putting up rhino fencing).
  3. Who is “everyone in the world”? It would be more accurate to say “everyone participating in a particular blockchain.” A blockchain managing consumer dollars, as in the examples in this article, would necessarily be global, and “everyone in the world” would literally mean everyone in the world. Other blockchains managing other kinds of asset might confine participants to particular companies’ customers, or particular traders, investors, or institutions.
  4. If you’re thinking that’s a tremendous data requirement, you’re not wrong. In a future article we’ll discuss clever ways to mitigate this and even make it fast.
  5. I’m not. But if I were, that’s just what I would say.
  6. Millennials: this is a rhetorical question. The answer is “no.”

The magic of the blockchain

This entry is part 1 of 2 in the series The magic of the blockchain

[Cross-posted at blog.chain.com.]

You may have heard that the world of finance is getting excited about the potential of the blockchain (Economist, Financial Times, Forbes) and wondered:

What is the blockchain? What problem does it solve?

The blockchain is the technology behind the digital currency Bitcoin, but it has wider applicability. It is a collection of mathematical, recordkeeping, and communication procedures that makes it possible to trade digital assets securely.

Why is that a big deal?

Think of how useful it has been to digitize all kinds of information over the past generation or two.1 Digital information can be transmitted from place to place at lightning speed (literally), stored indefinitely, duplicated endlessly, and analyzed, processed, and transformed automatically, all without any loss of fidelity. This was all flatly impossible until quite recently. When it became possible, it didn’t just make things faster and more efficient. It enabled the creation of entirely new ways to produce and consume information that never existed before, and new industries built on top of them. Think Twitter, YouTube, Uber.

But money hasn’t been digitized – and has therefore been left out of all the dramatic innovation that has happened elsewhere in the economy – because digital information can be duplicated endlessly, which is at odds with the key feature of money: namely, that once you trade it away, you no longer have it. Think about it: without that feature, money would be useless.

If you have something valuable to sell, and I want to pay you with some digital data that I call “money,” what’s to stop me from keeping a copy of that data and then spending it again with someone else?

The blockchain, that’s what.

That’s impossible

You may now be thinking, “There’s no way to prevent the copying of digital data,” and you’d be right. Even so-called copy-protected data, such as a movie on DVD, doesn’t work on the principle of actually preventing copying. (It works by scrambling the data and refusing to descramble it unless the playback conditions are kosher. You can copy the scrambled data as many times as you like.)

And yet the blockchain does manage to prevent “double-spending.” You might now expect to hear an explanation of how it does so in terms of prime numbers, one-way functions, asymmetric encryption, and other arcana. But those are merely the implementation details, which we’ll save for another article. The main idea is this:

I don’t give you digital data as payment. I give the rest of the world a signed statement saying I paid you.

This is a fundamental and surprising insight into the nature of money: the token of exchange doesn’t matter as much as that everyone agrees an exchange took place. When everyone agrees on that, then I can’t double-spend that token, even if I’ve made a copy of it, because whoever I try to spend it with will know that token is no longer mine to spend. And they’ll know that you can spend it… and you’ll know that they know it.

The money at the bottom of the sea

Here it’s worth taking a little digression into the story of the Yapese and their Rai stones.

The Yapese live on Yap, an island in Micronesia in the western Pacific. You may have heard of the giant stone discs that they use as a traditional form of money. Hewn out of limestone rocks on Palau, some 200 miles away, and standing on edge, they tower over their owners, who sometimes have to stand on tiptoe just to peer through the holes drilled in their centers.

These coins weigh thousands of pounds each. They can’t be kept in a coin purse or even stored indoors, so they are propped up for display in public places. When it is time to spend one, the coin never moves – that would be too difficult, and might damage the coin (or the mover!). Instead, news of the transfer filters out to the Yapese, who maintain an oral history of the ownership of each coin. This shared “ledger” of trades ensures that only the current owner of a coin can spend it, no matter where it’s physically situated.

In fact, a rai stone being transported from Palau to Yap by outrigger canoe once famously sank to the bottom of the sea in a storm. When the sailors got home without their cargo, the Yapese did not doubt the fact of its existence, and since its location didn’t matter, they proceeded to trade it just like their other giant coins.

Imagine that an earthquake strikes the island of Yap. No one is hurt, fortunately, but all the stone discs are dislodged and they all roll downhill into the sea. No problem – the rai economy could still continue! Now imagine that, instead of an earthquake, collective amnesia strikes the Yapese. No one can remember who owns what! In that case the rai economy is destroyed and actual economic value is irretrievably lost. This illustrates that, in a very real sense, the record of trades is the money.

That kinda makes sense

Right?

Think about depositing money in the bank. You go to the bank and hand the teller some cash. Does the teller put the cash in a box with your name on it? No. Some of it goes into a vault, mixed with everyone else’s money. Some of it is put to work in the form of loans. In what sense is your money still in the bank? In the sense that the bank maintains a record of what it owes you if you ever come asking for it.

(To keep the bank honest, you also maintain your own records – deposit receipts, a checkbook register, etc. Occasionally your records and the bank’s may disagree. We’ll come back to this idea.)

Don’t we already have secure digital asset trading?

In a word, no.

The problem is that there are multiple recordkeeping systems that have to be reconciled with one another. When you swipe your debit card at a gas station (say), you initiate a series of steps in which you, your bank, the gas station, the gas station’s bank, and the card-processing network all have to make updates to their records. For efficiency, those updates are usually batched together with others, and they happen at different times for different participants in this transaction. The updates get transmitted between and among the participants, and those transmissions produce acknowledgements that also get transmitted. Each party has to incorporate the others’ details into its own recordkeeping, and if everything doesn’t agree, there may need to be some sort of dispute-resolution step, unless the cumulative error is small enough that it’s not worth it and someone just eats the loss.

All of this transaction clearing and settlement is comparatively slow and expensive and happens long after you drive away from the pump. The gas station has some “counterparty risk”: it has let you have its gasoline without being sure that it will get your money. (But that risk is small compared to the value of letting customers pay this way, which is why the gas station accepts it.)

This is all because no one involved – not you, not the banks, not the gas station or the card network – can be quite sure at any given moment where the money is,2 only that if they follow these procedures, it usually ends up in the right place. Each entity therefore does its own recordkeeping as a check and balance on the others – just the same way that you keep all your deposit receipts (you do, don’t you?) in case your bank ever shows the wrong balance on your account.

How does the blockchain help?

The blockchain is a ledger that is immutable, distributed, and cryptographically secure.

  • Ledger means that it’s a historical record of trades;
  • Immutable means that once a trade is added to the ledger, it is permanent and unchangeable;
  • Distributed means that everyone gets a copy of it (and keeps getting updates as they happen); and
  • Cryptographically secure means that everyone can trust what’s in it.

If the parties in the gas-station example were all on the blockchain, what would be the steps by which the gas station gets paid?

  1. You add a transaction to the blockchain stating that some funds that you control (because in an earlier transaction, someone else transferred them to you) now belong to the gas station.

That’s all! When you commit to the idea that the record of trades is the money, there is no separate clearing or settlement step needed. The trade is its own settlement. As soon as you add that transaction to the blockchain, you lose control over those funds and the gas station gets control over them. The gas station can now add its own transaction to the blockchain transferring those funds to someone else – and you can’t.3
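
Here is "the record of trades is the money" as a toy. The ledger is just a list of transfers, and whoever most recently received some funds controls them. The names and the single-list design are mine, and the sketch leaves out everything that makes a real blockchain trustworthy (signatures, distribution, immutability).

```python
from typing import List, Optional, Tuple

Transfer = Tuple[str, str, str]  # (funds, from, to)

# An earlier transaction: someone else transferred these funds to you.
ledger: List[Transfer] = [("funds-1", "someone else", "you")]


def controller(funds: str) -> Optional[str]:
    """Whoever most recently received the funds controls them."""
    recipients = [to for name, _, to in ledger if name == funds]
    return recipients[-1] if recipients else None


def transfer(funds: str, sender: str, recipient: str) -> bool:
    """Append a transfer to the ledger, but only if the sender controls the funds."""
    if controller(funds) != sender:
        return False
    ledger.append((funds, sender, recipient))
    return True


paid = transfer("funds-1", "you", "gas station")
paid_again = transfer("funds-1", "you", "someone new")
print(paid, paid_again)       # True False
print(controller("funds-1"))  # gas station
```

There is no separate settlement step anywhere in that code: appending the record is the payment.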

Would you like to know more?

In the original Bitcoin blockchain, there is one type of asset – bitcoin – and a predefined way in which new bitcoins can be “minted.” It is possible to generalize the idea of the blockchain, however, so that it can encompass many different kinds of asset (dollars, airline miles, corporate securities, loyalty reward points) with differing rules for issuing units of those assets onto the network. The next article in this series will take a closer look at the mechanisms behind the blockchain (including explaining why it’s called a “blockchain”) and describe some reasons and ways to alter the Bitcoin blockchain to make it suitable for other uses.

  1. I like to think of that scene in All The President’s Men when Woodward and Bernstein have to thumb through thousands of Library of Congress call slips one by one by one, hour after hour after hour. Today a few tap-tap-taps at a computer terminal are all that’s needed.
  2. To say nothing of what the money is – which, as we’ve seen, is the record of who has paid what to whom. In this example (and in the economy at large) that record is a kaleidoscopic agglomeration of many differing and overlapping records, some of which lag behind others, some of which will never agree. It’s no wonder people are confused about money.
  3. Of course it isn’t quite that simple. To achieve the cryptographic security that allows everyone to trust the contents of the blockchain, it takes a little time for the transaction to propagate across the network and for other participants on the network to certify it.