20200714

2.11BSD Original Tapes Recreation

In Search of 2.11BSD, as released

Almost all of the BSD releases have been well preserved. If you want to find 1BSD, or 2BSD, or 4.3-TAHOE BSD, you can find them online with little fuss. However, if you search for 2.11BSD, you'll find it easily enough, but it won't be the original. You'll find either the latest patched version (2.11BSD pl 469) or one of the earlier versions (pl 430 is popular). You can even find the RetroBSD project, which used 2.11BSD as a starting point to create systems for tiny MIPS-based PIC controllers. You'll find every single patch that's been issued for the system.

[Image: Great promotional image of a PDP-11. Looks like an LA30 DECwriter ...]

What you will not find, however, are the original 2.11BSD release tapes. You won't find the original sources. With some digging, you can find 2.11BSD pl 195. This was released about 30 months after the original, and is the oldest one that's known to exist. The reason is that the original 2.11BSD tapes were distributed by USENIX. They charged a large fee for the tapes, and so not too many people bought them. This was before Caldera released the ancient Unixes under a permissive license, so the bulk of the fee went to AT&T. Its cost made it a low-volume item. Plus, there were patches all the time, so the master tapes were respun from time to time. The originals weren't preserved, alas, because storage was expensive and by the early 1990s the PDP-11s were a bit of a fringe machine, except in certain niches with long procurement times...

But wait, you said we have all the patches, and patch -R is super easy to use. Just use that to go backwards, right?

Well, no. The patches aren't all context diffs. Instead, they include instructions like "remove these files, then extract this uuencoded, compressed tarball" or other information-destroying instructions. So, the information is lost, maybe for good. We can't get there.
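
To make the difference concrete, here is a minimal Python sketch of why a real diff can be undone while a "replace this file" instruction cannot. It uses the standard library's difflib rather than patch(1), and the file contents and the apply_replacement helper are invented for illustration; nothing here is taken from an actual 2.11BSD patch.

```python
import difflib

# Hypothetical file contents, invented for illustration only.
old = ["int x;\n", "x = 1;\n", "return x;\n"]
new = ["int x;\n", "x = 2;\n", "return x;\n"]

# A diff records both the removed and the added lines...
delta = list(difflib.ndiff(old, new))

# ...so either side can be rebuilt from the delta alone.
recovered_old = list(difflib.restore(delta, 1))
recovered_new = list(difflib.restore(delta, 2))


def apply_replacement(_current, replacement):
    """Model a 'remove the file, then unpack this one' instruction."""
    # The previous contents are discarded; nothing about them survives.
    return list(replacement)


after = apply_replacement(old, new)
print(recovered_old == old)  # can the diff be run backwards?
print(after == new)          # the replacement only carries the new contents
```

The delta carries enough to walk in either direction, which is what patch -R relies on. The replacement step hands back only the new contents, so the old ones have to come from somewhere else.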

Or can we? If we look at it in a vacuum, it sure sounds hopeless. Information destroyed, you said. However, while it's true information is destroyed in many of the patches, it's only one copy that's destroyed. We have other sources of information. The 2.11BSD release is part of a series of releases in the 2BSD family, so we have 2.10.1BSD, the prior release. That's been preserved. We know from the release notes that significant influxes of code came from 4.3BSD. There's also a Usenet newsgroup called comp.bugs.2bsd that posted patches. It's known that these patches wound up in 2.11BSD (also, all the patches to 2.11BSD were posted there by the original authors until Usenet went away).

The Project

So, that brings us to my 2.11BSD pl 0 restoration project. The goal of the project is to create two main artifacts. First, it would be cool to have a git repo that has all the 2.11BSD patch points in it. Second, it would be really cool to have a near copy of the 2.11BSD release tapes. This project aims to create these artifacts in a reproducible way. When completed, anybody should be able to take the existing artifacts we have and the scripts from the project (including all the hints needed to get the data from other projects, as well as a few hand-crafted patches which produce results consistent with all known info about these files), and recreate both.

Status

I've worked my way through the 195 patches, undoing them. Many of them are simple patches, packed in an annoyingly eclectic variety of ways. Some, however, destroy information and require research to untangle. I've done the best I can and have made it back to patch level 0 sources (almost: there are one or two lingering issues that need to be tracked down on relatively unimportant files). I've created a script to create a tape to load into my 2.11BSD pl 195 system to build, in a chroot, a 2.11BSD pl 0 system.

There's a script that I've built that builds everything at the pl 0 level (twice). There's noise in the release notes of at least some of these releases that there were reproducibility issues. The build is currently past the initial bootstrap phase. I can build all the libraries, but automation is needed.
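
One way to check whether two builds really match is to hash every file in both output trees and list the paths that differ. The following Python sketch does that under the assumption that each build pass lands in its own directory; the function names and that layout are hypothetical, and the project's own scripts may well do this differently.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_builds(first, second):
    """Return the sorted relative paths that differ between two build trees."""
    a, b = tree_digests(first), tree_digests(second)
    # A path counts as different if it is missing on one side
    # or if its contents hash differently.
    return sorted(path for path in a.keys() | b.keys() if a.get(path) != b.get(path))
```

Calling compare_builds on the two output directories returns an empty list when every file is byte-for-byte identical. Any path it does return is a candidate for a reproducibility problem, such as an embedded timestamp.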

Following Along

There are two ways to follow along. One is to follow me on Twitter. My handle is @bsdimp. Or you can look at my GitHub project. I've written up the status there (though it's a couple of weeks out of date) and you can find the start of a paper (though it's even more out of date, it has more background). I update at least once a week, but sometimes more as I have time.
