Ibid bugfixes

The current version of my backup tool ibid is 52. Since the last time I wrote about it, I’ve added a new option, --check-names, for when you can’t rely on device/inode pairs (e.g., when moving your files onto a new filesystem); and fixed some important bugs, including ones that created unnecessary duplicate backups of files and unnecessary “update” records.

Download ibid here.

What a decade and a quarter can do

Halloween, 1998: I take a weekend trip to San Diego with my girlfriend, Andrea, and a few other friends. Money’s tight, in part because I haven’t drawn a salary from my struggling startup company for over two years, but that’s OK: our friends, most of whom are graduate students, are all broke too, so we crash in the living room of a couple we’ve come to see. We visit the San Diego Zoo, where I meet and feed a baby giraffe. My friend Paul captures the event (and much else from the weekend) on an amazing new device: an SLR camera body that has been partially hollowed out and fitted with a digital sensor and a small LCD display on the back (since the prism is gone and the viewfinder no longer works). It’s borrowed from the university where he works, which custom-built it for about ten thousand dollars. He shares the resulting digital photos with the rest of us by putting them on his department’s web server, but only temporarily because they take up so much disk space that he has to delete them after a few weeks.

1.25 decades later: Andrea is now my wife. We visit San Diego again. Thanks in part to income from my startup company, we have the means to stay in a hotel, and not only to visit the San Diego Zoo but to spring for their Safari Park’s “Roar and Snore” overnight camping experience. For his part, Paul is now an Academy-Award-winning computer-graphics researcher. Digital cameras on a par with his custom-built experimental rig from 1998 can now be had for around a hundred bucks and are so ubiquitous that they’ve all but killed the consumer film business. Disk space, likewise, is cheap enough that the company I work for has made a lucrative business out of giving away essentially unlimited amounts of it for free. I meet and feed not one giraffe, but two…

…and two amazing people who didn’t even exist 1.25 decades ago feed some giraffes too.

Quitting time

On this date fifteen years ago, several employees of NCD Software, formerly Z-Code, resigned simultaneously. I was one of them.

Two years earlier, Z-Code’s founder, Dan, sold out to Network Computing Devices over the objections of most of his staff. NCD, whose line of business had no discernable overlap with Z-Code’s, proceeded to drive Z-Code and itself right into the ground. Dan was the first casualty, lasting only a few months after the merger. NCD’s CEO and top VP, informally known as “the Bill and Judy show,” followed not long after. A lot of clueless mismanagement ensued. The energy of our once terrific engineering team dissipated before our eyes. We tried to turn things around, to make our bosses understand (for instance) that you can’t just tell an e-mail software team to make their e-mail suite into one of those newfangled web browsers that the new CEO had heard so much about, or that if you don’t pay your salespeople a commission for selling the company’s software, they won’t sell the company’s software.

Each time management did something boneheaded, we convened a session of “The Alarmists’ Club,” which met at lunch over beers and tried to think of ways to effect change at NCD. After enough of those proved fruitless, our discussions turned to how we could do things better ourselves. And so some time early in 1996 we sought the advice of a Silicon Valley lawyer about how to leave NCD en masse with minimal legal repercussions. The bulk of the advice was to put off discussion of any new venture until after the separation was complete; and to be aware that NCD was liable to use veiled threats, emotional pleas, and vague promises in an attempt to get us not to leave.

On 14 February 1996, NCD did all these things. We had prepared our terse resignation letters, offering two weeks’ notice, and delivered them in the morning. Within a couple of hours, Mike Dolan, one of the bigwigs from NCD headquarters in Mountain View, made the trip to the Z-Code offices in Novato to meet with us individually.

I was not yet 30, and when Dolan, an industry veteran, leaned on me in our one-on-one meeting I was definitely cowed. But my co-resigners and I had coached one another on how to withstand exactly the sort of combined intimidation and guilt trip that I was now getting, and so I stuck to my guns, kept the pointless justifications to a minimum, and refrained from blame or recrimination.

We maintained our solidarity, and because NCD declined our offer of two weeks’ notice, that was our last day there. We left feeling victorious, though what exactly we had won was never clear, and our sense of triumph was tempered by having effectively sandbagged our erstwhile coworkers.

After enjoying a few days of freedom it was time to start planning our new enterprise. But that’s another story…

SIEVE

In a recent e-mail exchange with my friend Kurt, we were discussing the problem of orbital space junk and the difficulty of cleaning it up. It’s a subject we’ve batted around on and off for many years, wondering about a workable and economical solution but never managing to find one. It’s been in the news more lately, as the crisis has grown more acute and inventors have trotted out different proposals, each more outlandish than the last.

In the middle of this exchange, after years of coming up with nothing, I suddenly invented my own solution, an idea I now offer publicly as the second in my occasional save-the-world series. It’s called SIEVE: Scanning, Illuminating, Even Vaporizing Engines.

It involves deploying into low earth orbit thousands of semi-autonomous robots. Each SIEVE unit is small and light and costs no more than a few hundred dollars of off-the-shelf components. Specifically, these components:

  • A solar panel for power;
  • Gyros for orientation;
  • A radio for coordination with other SIEVE units;
  • A camera;
  • A simple computer;
  • A Mylar mirror; and
  • A small rocket engine.

Each unit, when in sunlight, is in one of three modes: Scanning, Illuminating, or Vaporizing.

In Illuminating mode, the unit orients itself so that the mirror reflects sunlight through a given volume of space.

In Scanning mode, the unit trains its camera on a region of space that other nearby units are Illuminating and searches for debris.

In Vaporizing mode, numerous units all aim their mirrors to shine sunlight on a piece of debris, one previously identified by Scanning and Illuminating units and whose orbital trajectory has been plotted. Focusing enough sunlight on the debris for a long enough time should heat it to the point of vaporizing. If the debris can be fully vaporized, great; it should be harmless in that form. If it can’t, it might still expel enough vapor to slow its orbit (a la the laser broom idea) to the point where it falls back into the atmosphere.
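
As a rough illustration of the aiming step, here is a minimal Python sketch of how a unit might orient a flat mirror so that sunlight bounces toward a tracked piece of debris. It relies only on the law of reflection: the mirror’s normal must bisect the direction to the sun and the direction to the target. The function names, the coordinate convention, and the premise that each unit knows its own position and the debris position are all assumptions made for this example, not part of any worked-out SIEVE design.

```python
import math


def unit(v):
    """Return the vector v scaled to length 1."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0:
        raise ValueError("zero-length vector has no direction")
    return tuple(c / length for c in v)


def mirror_normal(mirror_pos, to_sun, target_pos):
    """Normal a flat mirror needs so that sunlight reflects toward target_pos.

    to_sun points from the mirror toward the sun. Raises ValueError if the
    target lies exactly opposite the sun, where the bisector is undefined.
    """
    s = unit(to_sun)
    t = unit(tuple(p - m for p, m in zip(target_pos, mirror_pos)))
    return unit(tuple(a + b for a, b in zip(s, t)))


def reflect(ray, normal):
    """Reflect a ray direction off a surface with the given unit normal."""
    d = sum(r * n for r, n in zip(ray, normal))
    return tuple(r - 2 * d * n for r, n in zip(ray, normal))


# Hypothetical geometry: mirror at the origin, sun along +x, debris along +y.
normal = mirror_normal((0, 0, 0), (1, 0, 0), (0, 5, 0))
incoming = (-1, 0, 0)                 # sunlight travels from the sun to the mirror
outgoing = reflect(incoming, normal)  # points (approximately) along +y
```

With many units each solving this for the same debris position, the reflected beams converge on one spot, which is the whole idea of Vaporizing mode.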

The rocket engine is only needed twice: once to insert the unit into a distinct orbit when initially deployed, and once to deorbit the unit at the end of its service life.

Care will have to be taken that the SIEVE robots do not themselves become hazards to space navigation. And that they don’t go into Michael Crichton mode, become sentient, and decide the Earth is a gigantic ball of debris.

Elbows deep

Last week I replaced my six-year-old home server (which serves this website among many other functions) with a newer, faster, quieter computer. Transferring all the data and functions was a considerable effort in system administration. For the record, here are the steps I had to take.

  1. Download Fedora 12 install-CD image.
  2. Burn Fedora 12 install CD.
  3. Shut down sendmail and Apache.
  4. Dump MySQL database contents.
  5. Dump PostgreSQL database contents.
  6. Bring up new computer with temporary hostname.
  7. Install Fedora 12 on new computer.
  8. Create user accounts.
  9. Copy all data from old computer to new, under /old tree.
  10. Shut down old computer (permanently).
  11. Take over old computer’s hostname and IP address.
  12. Restore firewall config from /old.
  13. Restore DNS config from /old, bring up DNS.
  14. Restore sshd config from /old, bring up sshd.
  15. Restore Maildir trees from /old.
  16. Restore IMAP server config from /old, bring up IMAP server.
  17. Restore sendmail config from /old, bring up sendmail.
  18. Restore WordPress environment from /old.
  19. Bring up MySQL, restore contents from MySQL dump.
  20. Bring up PostgreSQL, restore contents from PostgreSQL dump.
  21. Restore Apache config from /old, bring up Apache.
  22. Restore Mailman environment from /old, bring up Mailman.
  23. Bring up apcupsd.
  24. Add printer.
  25. Set up network printing.
  26. Set up NFS.
  27. Resume backups.

Naturally not everything went according to plan. So in addition to the steps above I also had to solve:

  • Why all of my domains but one could be resolved;
  • Why the firewall was getting reset at startup;
  • Why inbound mail was not flowing;
  • Why the Ethernet interface had the wrong parameters at startup;
  • Why the monitor would not go into power-save mode;
  • How to get the Flash plugin running under x86_64;
  • Why the DVD-RW drive wasn’t visible some of the time.

Throughout all this, I frequently had to pause to locate and install needed software packages and Perl modules that weren’t part of the default Fedora setup. For good measure I also had to replace an external hard drive that was about to fail. (Thanks for the warning, Palimpsest!)

Happily all these things are now done, except that the monitor issue is a bona fide bug in the xorg video driver (duly filed) that someone else will have to deal with. Until then I just have to remember to switch the monitor off when I walk away.

This may all sound like deep wizardry, but it doesn’t feel like it to me. After a lifetime spent coping and communing with these sometimes-cantankerous machines, I find it’s just busywork. Then I think of the number of other people in the world who could do all of this single-handedly and I become impressed with myself.

Don’t dis “don’t be evil”

Dear Steve Jobs,

We have some Apple products in our household. Also, I’m an employee of Google.

“Don’t be evil” is not bullshit. I and a lot of my colleagues work there precisely because of that mantra, and many of us are prepared to pack up and leave if we ever discover Google straying meaningfully from it. Gratifyingly, opportunities arise often in which to apply “don’t be evil” to a business or engineering decision, and a culture of vigorous and principled internal debate helps to ensure we choose correctly. Not all cases are black and white, of course (though some are), and it’s possible to err, but on the whole we do pretty well, non-evil-wise, especially compared to, well, every other publicly traded technology company.

In short, I take your remark as a personal insult, not to mention a telling comment on your own sense of right and wrong and, by extension, that of your company. I would welcome a sincere retraction, failing which I will have to reconsider continuing to be an Apple patron.

Thanks,
– Bob

Ibid 2009

The last time I wrote about my backup tool, ibid, was three years ago (here; earlier post here), but I’ve continued making refinements to it. Then it was at version 24; now it’s at version 47 (download it here). Here are the changes since then, minus the uninteresting ones:

  1. Add --maxfiles.
  2. Don’t use Storable for the complete record structure; apparently a stringified form gets constructed in memory before it’s written to the file, which is disastrous for very large records. Use a custom streaming serialization method instead. Also, detect and reject unknown record versions.
  3. Another major rewrite. This one does away with the old runtime data structures based on big, inefficient Perl hashes, replacing them with strings containing compact “pack”ed values. This change yields enormous runtime memory savings, which matters after a few hundred sessions and many tens of thousands of files have accumulated in your fileset record. Also fixed a few documentation bugs and eliminated some dead code.
  4. Rename options for greater consistency: --limit (-l) is now --maxbytes (-b); --files (-f) is now --maxfiles (-f).
  5. Report when one or the other limit (bytes or files) is reached.
  6. Add --prune-sessions option.
  7. Add new --single-file-size-limit option; renamed --maxbytes and --maxfiles to --session-size-limit and --session-files-limit, respectively. Switched from &foo() function-calling syntax to foo().
  8. Add a new history-entry type: zero-length (“empty”) files. These are recorded in the session record but not copied to the archive, to save overhead.
  9. Include <dev:ino> in --dump output when --verbose is supplied.
  10. Support a device-map file, $HOME/.ibid/.devmap, for tracking a filesystem when its device number changes.
  11. Document the .devmap file; add --trim-report; support “xz” compression of session files; support optional callbacks in foreach_name_history().
  12. Add another level of depth to the “target” path for each new power of 10 in the session number. So session 7311 is rooted at TARGET/FILESET/7000/300/7311, and session 29582 is rooted at TARGET/FILESET/20000/9000/500/29582. Path elements that would start with a 0 are omitted; e.g., session 4006 is rooted at TARGET/FILESET/4000/4006, not at TARGET/FILESET/4000/000/4006.
  13. Document --trim-report.
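
The path scheme in item 12 can be sketched in a few lines. This is a Python illustration written to match the three examples given there, not ibid’s actual Perl code. The function name session_path is made up, and two details are inferred from the examples rather than stated: the nesting stops at the hundreds digit, and session numbers below 100 get no intermediate directories at all.

```python
def session_path(target, fileset, session):
    """Build the root directory for a session number, per the examples above."""
    digits = str(session)
    parts = [target, fileset]
    # One directory level per digit, from the most significant digit down to
    # the hundreds place. A zero digit would produce an element starting with
    # 0, so it is skipped.
    for i, digit in enumerate(digits[:-2]):
        if digit != "0":
            parts.append(digit + "0" * (len(digits) - i - 1))
    parts.append(digits)
    return "/".join(parts)


print(session_path("TARGET", "FILESET", 7311))   # TARGET/FILESET/7000/300/7311
print(session_path("TARGET", "FILESET", 29582))  # TARGET/FILESET/20000/9000/500/29582
print(session_path("TARGET", "FILESET", 4006))   # TARGET/FILESET/4000/4006
```

The point of the nesting is to keep any single directory from accumulating thousands of entries as the session count grows.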

There’s still no home page for ibid, but at least now all ibid-related posts on my blog are grouped under the tag ibid.

Right move made

Before the iPhone and the Blackberry was the Sidekick, a.k.a. the Hiptop, the first mass-market smartphone and, for a while, the coolest gadget you could hope to get. Famously, and awesomely, the Hiptop’s spring-loaded screen swiveled open like a switchblade at the flick of a finger to reveal a thumb-typing keyboard underneath, one on which the industry still hasn’t managed to improve. Your Hiptop data was stored “in the cloud” before that term was even coined. If your Hiptop ever got lost or stolen or damaged, you’d just go to your friendly cell phone store, buy (or otherwise obtain) a new one, and presto, there’d be all your e-mail, your address book, your photos, your notes, and your list of AIM contacts.

The Hiptop and its cloud-like service were designed by Danger, the company I joined late in 2002 just as the very first Hiptop went on the market. I worked on the e-mail part of the back-end service, and eventually came to “own” it. It was a surprisingly complex software system and, like much of the Danger Service, required continual attention simply to keep up with rising demand as Danger’s success grew and more and more Sidekicks came online.

Early in 2005, the Danger Service fell behind in that arms race. Each phone sought to maintain a constant connection to the back end (the better to receive timely e-mail and IM notices), and one day we dropped a bunch of connections. I forget why; possibly something banal like a garden-variety mistake during a routine software upgrade. The affected phones naturally tried reconnecting to the service almost immediately. But establishing a new connection placed a momentary extra load on the service as e-mail backlogs, etc., were synchronized between the device and the cloud, and unbeknownst to anyone, we had crossed the threshold beyond which the service could no longer tolerate many phones reconnecting at once. The wave of reconnections overloaded the back end and more connections got dropped, which created a new, bigger reconnection wave and a worse overload, and so on and so on. The problem snowballed until effectively all Hiptop users were dead in the water. It was four full days before we were able to complete a painstaking analysis of exactly where the bottlenecks were and use that knowledge to coax the phones back online. It was the great Danger outage of 2005 and veterans of it got commemorative coffee mugs.


The graphs depict the normally docile fluctuations of the Danger Service becoming chaotic
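
The feedback loop behind the outage can be shown with a toy model. The Python sketch below is purely illustrative: the function name, the tick-based timing, and every number in it are invented for the example and have nothing to do with the real Danger Service. It assumes that if the reconnection wave fits within the service’s capacity, everyone gets back on, and that otherwise the overload knocks off a fixed fraction of the phones still connected.

```python
import math


def simulate_reconnect_storm(total, initial_drop, capacity,
                             knock_off_fraction=0.5, ticks=10):
    """Return the number of disconnected phones after each tick.

    All parameters are hypothetical. capacity is the largest reconnection
    wave the back end can absorb in a single tick.
    """
    disconnected = initial_drop
    history = [disconnected]
    for _ in range(ticks):
        if disconnected <= capacity:
            # The wave fits: every waiting phone reconnects.
            disconnected = 0
        else:
            # Overload: nobody gets back on, and some connected phones drop.
            connected = total - disconnected
            disconnected += math.ceil(connected * knock_off_fraction)
        history.append(disconnected)
    return history


# Below the threshold the service recovers in a single tick...
print(simulate_reconnect_storm(total=1000, initial_drop=100, capacity=150))
# ...above it, each wave is bigger than the last until everyone is offline.
print(simulate_reconnect_storm(total=1000, initial_drop=200, capacity=150))
```

The unsettling part, in the model as in real life, is that nothing looks wrong until the first drop large enough to cross the threshold.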

The outage was a near-death experience for Danger, but the application of heroism and expertise (if I say so myself, having played my own small part) saved it, prolonging Danger’s life long enough to reach the cherished milestone of all startups: a liquidity event, this one in the form of purchase by Microsoft for half a billion in cash, whereupon I promptly quit (for reasons I’ve discussed at by-now-tiresome length).

Was that ever the right move. More than a week ago, another big Sidekick outage began, and even the separation of twenty-odd miles and 18 months couldn’t stop me feeling pangs of sympathy for the frantic exertions I knew were underway at the remnants of my old company. As the outage drew out day after day after day I shook my head in sad amazement. Danger’s new owners had clearly been neglecting the scalability issues we’d known and warned about for years. Today the stunning news broke that they don’t expect to be able to restore their users’ data, ever.

It is safe to say that Danger is dead. The cutting-edge startup, once synonymous with must-have technology and B-list celebrities, working for whom I once described as making me feel “like a rock star,” will now forever be known as the hapless perpetrator of a monumental fuck-up.

It’s too bad that this event is likely to mar the reputation of cloud computing in general, since I’m fairly confident the breathtaking thoroughness of this failure is due to idiosyncratic details in Danger’s service design that do not apply at a company like, say, Google — in whose cloud my new phone’s data seems perfectly secure. Meanwhile, in the next room, my poor wife sits with her old Sidekick, clicking through her address book entries one by one, transcribing by hand the names and numbers on the tiny screen onto page after page of notebook paper.

Score one for the engineers

I’ve been asked about the reason for my low opinion of Microsoft. It isn’t just me of course — a lot of technologists regard Microsoft that way. Here’s an anecdote that illustrates why.


The year is 1993. No one’s ever heard of the World Wide Web. Few people have even heard of e-mail. Too often, when I explain my role at the e-mail software startup Z-Code to friends and relatives, I also have to explain what e-mail is in the first place.

Those who do know about e-mail in 1993, if transported to 2009, would not recognize what we call e-mail now. To them, e-mail looks like this:

It’s all plain, unadorned text rendered blockily on monochrome character terminals. For the most part, variable-width, anti-aliased fonts are years in the future. Boldface and italic text exist only in the imagination of the reader of a message that uses ad hoc markup like *this* and _this_. Forget about embedded graphics and advanced layout.

However, in 1993 something has just been invented that will catapult e-mail into the future: the MIME standard, which permits multimedia attachments, rich text markup, and plenty more. Almost no one has MIME-aware e-mail software yet. Meanwhile, at Z-Code, we’re busy adding MIME capabilities to our product, Z-Mail. The capabilities are primitive: for instance, if we detect that a message includes an image attachment, we’ll launch a separate image-viewing program so you can see the image. (Actually rendering the image inline comes much later for everyone.)

The Z-Mail user is able to choose an auto-display option for certain attachment types. If you have this option selected and receive a message with an image attachment, your image-viewing program pops up, displaying the attachment, as soon as you open the message. (Without the auto-display option set, you explicitly choose whether or not to launch the viewer each time you encounter an image attachment.)

There comes a time when the marketing guy at Z-Code asks if we can add automatic launching of Postscript attachments, too. In 1993, Postscript is the dominant format for exchanging printable documents. (Today it’s PDF.) Turns out that a lot of potential Z-Mail users are technically unsavvy business types who exchange Postscript files often, jumping through tedious hoops to attach them, detach them, and print them out. Automatically popping up a window that renders a Postscript attachment right on the screen would be pure magic to them, changing them from potential Z-Mail users into actual Z-Mail users.

But there is a problem. Postscript files differ from image, sound, and other document files in one important respect: whereas those latter types of file contain static, inert data, requiring special programs to render them, Postscript files are themselves full-fledged computer programs. The Postscript renderer is just a language interpreter — like a computer within the computer, running the program described by the Postscript document.

Virtually every Postscript program — that is, document — is completely innocuous: place such-and-such text on the page here, draw some lines there, shade this region, and so on. But it’s perfectly conceivable that a malicious Postscript document — that is, program — can act as a computer virus, or worm, causing the computer to access or alter files, or use the network or CPU in mischievous ways without the user’s knowledge or approval.

So launching the Postscript interpreter with an unknown document is risky at any time. Doing so automatically — as the default setting, no less, which is what the marketing guy wanted — is foolhardy. (The reason it’s generally safe to send Postscript documents to Postscript printers — which include their own Postscript interpreters — is that unlike computers, printers do not have access to resources, like your files, that can be seriously abused.)

We, the Z-Code engineers, explain the situation and the danger. The marketing guy dismisses the possibility of a Postscript-based attack as wildly unlikely. He’s right, but we point out that adding the feature he’s asking for would make such an attack more likely, as word spreads among the bad guys that Z-Mail (a relatively widely deployed e-mail system in its time and therefore a tempting hacking target) is auto-launching Postscript attachments. Marketing Guy argues that the upside of adding the feature is potentially enormous. We say that one spam campaign containing viral Postscript attachments could cripple the computers of Z-Mail users and only Z-Mail users, a potential PR catastrophe. Marketing Guy says that our users don’t know or care about that possibility and neither should we. We say it’s our job to protect our users from their own ignorance.

The issue gets bumped up to Dan, our president, who is clearly leaning toward the marketing guy’s enormous potential upside. But after we vigorously argue the technical drawbacks of the plan and our responsibility to keep our users safe in spite of themselves, he goes with the suggestions from Engineering: do add a Postscript-launching option but turn it off by default, and educate users about the danger when they go to turn it on.


This is a run-of-the-mill example of the kind of tension that exists between Marketing and Engineering in all software companies. Issues like this arose from time to time at Z-Code, and sometimes Engineering carried the day, and sometimes Marketing did. It was a good balance: it took Marketing’s outlandish promises to keep Engineering moving forward, and it took Engineering’s insight and pragmatism to keep the product safe and reliable.

As an industry insider, my impression of Microsoft is that Marketing wins all the arguments, with all that that implies for the safety and reliability of their software.

Kernel of truth

Yesterday I received a link — buried at the end of a long chain of e-mail forwards all saying variants of, “Wow, this is scary!” — to a video that depicted groups of people using stray microwaves from their cell phones to cause popcorn kernels to pop.

In the video, three or four phones are placed on a table in a ring around a few popcorn kernels, and then someone dials each of the phones to cause them to ring. Within a couple of seconds the kernels start to pop.

The sender who forwarded it to me asked if I knew whether this was for real.

My reply asked the sender to consider the power available to a cell phone versus the power available to a microwave oven. A microwave oven draws 15 amps of current from the household mains, producing hundreds of watts of focused microwave energy with the specific purpose of heating up food placed in the target area, and even so, it takes at least 30 seconds of exposure for the first kernel to pop. A cell phone, by contrast, houses a battery rated in milliamp-hours, with a typical one holding 1500 milliamp-hours of charge. This means it can supply 1 milliamp for 1500 hours, or 1500 milliamps (1.5 amps) for one hour, and so on. If a cell phone tried to draw 15 amps from such a battery, then (apart from the phone melting) the battery would be depleted in 1/10 of an hour, or six minutes. Clearly cell phones do not draw 15 amps, but even if they did, they wouldn’t convert nearly as much of that energy into microwaves as microwave ovens do; and even if they did that, the microwave energy wouldn’t be focused the way it is in a microwave oven. Yet the video depicts the first kernel popping within about three seconds.
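
For anyone who wants to check the arithmetic, here it is as a few lines of Python. The 1500 milliamp-hour and 15 amp figures come from the paragraph above. The two voltages are my own assumptions (roughly 3.7 volts for a lithium-ion phone battery, 120 volts for US household mains), added because amps alone understate the gap: the oven’s 15 amps arrive at a far higher voltage than anything a phone battery supplies.

```python
def runtime_hours(capacity_mah, draw_ma):
    """Hours a battery lasts at a constant current draw."""
    return capacity_mah / draw_ma


def energy_wh(capacity_mah, volts):
    """Energy stored in a battery, in watt-hours."""
    return capacity_mah / 1000 * volts


BATTERY_MAH = 1500    # from the text
OVEN_AMPS = 15        # from the text
BATTERY_VOLTS = 3.7   # assumption: typical lithium-ion cell
MAINS_VOLTS = 120     # assumption: US household mains

minutes_at_15_amps = runtime_hours(BATTERY_MAH, OVEN_AMPS * 1000) * 60
battery_wh = energy_wh(BATTERY_MAH, BATTERY_VOLTS)
oven_watts = MAINS_VOLTS * OVEN_AMPS
oven_seconds = battery_wh / oven_watts * 3600

print(f"Battery empty after {minutes_at_15_amps:.0f} minutes at 15 amps")
print(f"Battery holds about {battery_wh:.2f} watt-hours")
print(f"The oven draws that much energy in about {oven_seconds:.0f} seconds")
```

Under those assumptions, the entire battery holds only as much energy as the oven pulls from the wall in about eleven seconds, and the oven still needs half a minute to pop its first kernel.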

If that much microwave energy really were reaching the popcorn kernels, we’d also be seeing the other effects of powerful microwaves on objects in the immediate vicinity. For example, we’d see sparks and electrical arcing from metal objects, including the cell phones themselves. But we don’t.

Finally, why would a ringing cell phone cause popcorn to pop? To ring, a cell phone merely has to receive an “incoming-call” signal from a cell tower. The phone doesn’t begin transmitting any appreciable amount of power until after you answer it and begin speaking!

At this writing, the video in question has a supposed 11,031,929 views. When I think of all the people across the Internet who are now arranging their cell phones in rings around a handful of popcorn kernels, I despair.