Merkle trees and build systems
In traditional build tools like Make, targets and dependencies are always files. Imagine if you could specify an entire tree (directory) as a dependency: You could exhaustively specify a "build root" filesystem containing the toolchain used for building some target as a dependency of that target. Similarly, a rule that creates that build root would have the tree as its target. Using Merkle trees as first-class citizens in a build system gives great flexibility and many optimization opportunities. In this article I'll explore this idea using OSTree, Ninja, and Python.
OSTree
OSTree is like Git, but for storing entire filesystem images such as a complete Linux system. OSTree stores more metadata about its files than Git does: ownership, complete permissions (Git only remembers whether or not a file is executable), and extended attributes ("xattrs"). Like Git, it doesn't store timestamps. OSTree is used by Flatpak, rpm-ostree from Project Atomic/CoreOS, and GNOME Continuous, which is where OSTree was born.
My company has been using OSTree to build and roll-out software updates to Linux-based devices for the last four years. OSTree provides deployment tools for distributing images to different machines, deploying or rolling back an image atomically, managing changes to /etc, and so on, but in this article I'll focus on using OSTree for its data model.
Like Git, OSTree stores files in a "Content Addressable Store", which means that you can retrieve the contents of a file if you know the checksum of those contents. OSTree uses SHA-256, but I will use "SHA" and "checksum" interchangeably. This store or "repository" is a directory in the filesystem (for example "ostree/") where each file tracked by OSTree (a "blob" in Git terminology) is stored under ostree/objects/ as a file whose filename is the SHA of its contents. This is something of a simplification because file ownership, permissions, and xattrs are also reflected in the checksum.
A "tree" (directory) is stored as a file that contains a list of files and sub-trees, and their SHAs. The filename of this file, just like for blobs, is the SHA of its contents. This way the entire tree, including its sub-trees and their sub-trees, and the contents of each of the files within, can be uniquely identified by a single SHA. This data structure is called a Merkle tree.
You can have different versions of a tree, like Git commits or Git branches, or completely separate trees, but any common files are stored only once in the OSTree repository (for example, if file2.txt and d/file3.txt are identical, they are stored only once in ostree/objects/). Like Git, OSTree has "commit" and "checkout" operations.
OSTree "refs" (short for "references"), similar to Git refs, are how OSTree implements branches and tags. A ref is a metadata file in the OSTree repository: Its filename is whatever you want it to be, such as the branch or tag name, and its content is a single SHA pointing at a tree. The connection to the tree is indirect as it is really the SHA of a "commit" which in turn points at a tree, but in this article I'll ignore commits as they aren't directly relevant.
OSTree + Ninja
Ninja is a build system similar to Make. I covered Ninja for LWN three years ago. Unlike Make, Ninja doesn't support wildcards or loops, so you're supposed to write a "configure" script to generate a Ninja file that specifies each build target explicitly. At my company, the internal build system is a 3,000-line Python file, plus several dozen YAML files with packaging instructions for various components; when run, this Python script generates a 90,000-line Ninja file.
In Ninja (like Make) build targets and inputs are files. OSTree refs are also files. The build system, then, creates a different ref for each build step, and the ref itself is the target (output) of that build rule. For example, the target of a generated Ninja rule might be the file "build/ostree/refs/heads/xyz", where "build/ostree" is the OSTree repository in the "build" output directory, and "xyz" is the ref name.
Here's a concrete example from the build system, where it builds a rootfs for the Linux devices:
    rootfs = ostree_combine([
        l4t_kernel(),
        bionic_userspace(),
        package("a"),
        package("b"),
        container("c"),
    ])
    phony("rootfs", rootfs)
    default(rootfs, ...)
Each of l4t_kernel(), bionic_userspace(), package(), and container() is a Python function that creates an OSTree tree (perhaps by downloading and unpacking a tarball, or by running an upstream makefile, the details don't matter right now), then creates an OSTree ref pointing at this tree, and returns the ref (which is a Python string generated internally, perhaps "build/ostree/refs/heads/package/a").
ostree_combine() is a Python function that takes any number of OSTree refs, each of which points at a tree; it combines them together into a single tree, creates another ref pointing at this tree, and returns the ref (this time it might be called "build/ostree/refs/heads/ostree_combine/c85e333f577b").
But I lie: these functions don't do any of this at all. What they actually do is write a Ninja rule that, when invoked via ninja, will carry out those steps.
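As a rough sketch of that pattern (the function name is from the article, but the body and the "ostree_combine" Ninja rule name are illustrative assumptions, not the production code), ostree_combine() might emit something like this:

    import hashlib

    def ostree_combine(input_refs, ninja_file):
        # Derive a stable output ref name from the input ref names.
        digest = hashlib.sha256(" ".join(sorted(input_refs)).encode()).hexdigest()[:12]
        out_ref = "build/ostree/refs/heads/ostree_combine/%s" % digest
        # Emit a Ninja build statement: the ref file is the target, the input
        # refs are the dependencies.  The "ostree_combine" rule is declared
        # once elsewhere with the actual command to run.
        ninja_file.write("build %s: ostree_combine %s\n" % (out_ref, " ".join(input_refs)))
        return out_ref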
When you run ninja in an incremental build, if any of the input refs changes — remember, a ref contains a SHA pointing at a tree, so if any file in the tree changes then the ref's content will change — then Ninja will know that the rootfs target is out of date and the ostree_combine rule needs to be re-run.
Crucially, you never need to write out any of these ref filenames explicitly in the build script; you pass them around like any normal Python variable, and other functions can take them and record them as dependencies in their own Ninja rules.
In the example above we also create a Ninja phony rule to create a top-level target name that is convenient to type at the ninja command line, and we add it to Ninja's list of default targets.
Benefits
The ability to specify entire trees as build dependencies or targets means that the same mechanism can be used for specifying coarse-grained dependencies (such as third-party packages that are being integrated into the rootfs) as for fine-grained dependencies (individual files). One obvious benefit is that toolchains and build environments can be managed explicitly: The rule to compile something can take a "build root" rootfs as one of its dependencies, and chroot into it to run the compilation command (some literature calls this "hermetic builds"). This build rootfs can itself be created by other rules.
Less obvious, but possibly the best thing about this approach, is the ability to pass intermediate build outputs around as variables. We saw an example in the Python snippet earlier, where package("a") returns a target name that we passed into another rule, ostree_combine(). This means you don't have to come up with a name for every single intermediate artifact; you can generate them automatically. The composability leads to concise and readable build scripts: the example above is not at all contrived; it is very similar to the production build script. By making this easy to express, it is easier to exploit opportunities to cache or parallelize steps in the build.
To provide a (somewhat contrived) example: Generating the ldconfig cache only depends on the contents of a few directories like /lib and /usr/lib; similarly, the mandb cache only depends on the contents of /usr/share/man.
Traditionally, these operations are run in series. But a build system that can define a dependency that is a subtree of a previous target could specify separate rules for these operations, run them in parallel, and then merge the results back into a final rootfs tree. In this example, even if the rootfs SHA changes, it's possible that the /usr/share/man subtree hasn't changed, so there's no need to re-run mandb. Note that the steps that generate the caches operate on file data (contents), while extracting the subtrees and merging the results back operate only on OSTree metadata.
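Here is a hedged sketch of how the mandb half of that might look in the build script. ostree_subtree() is a hypothetical helper for extracting a subtree as its own ref (only ostree_combine() and ostree_mod() appear later in this article), and the exact arguments are illustrative:

    # Extract just the subtree that mandb depends on (an OSTree-metadata-only step).
    man_subtree = ostree_subtree(rootfs, "/usr/share/man")

    # This rule is re-run only when the /usr/share/man subtree's SHA changes,
    # even if other parts of the rootfs have changed.
    man_cache = ostree_mod(
        input_tree=ostree_combine([build_root, man_subtree]),
        command="mandb",
        chroot=True,
        capture_output_subdir="/var/cache/man")

    # Merge the generated cache back into the final rootfs (metadata-only again).
    final_rootfs = ostree_combine([rootfs, man_cache])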
[Update] You can imagine there are applications of this to "cloud" build systems that farm out the execution of individual build steps to remote build servers: To run this mandb rule, a remote server doesn't need to download the entire rootfs, only /usr/share/man. Merging the output from this build step back into the rootfs can be done by operating solely on Merkle tree metadata.
OSTree's tooling also provides a few compelling benefits that are worth pointing out. You get these simply by exporting (committing) the final build artifacts to an OSTree repository; they don't require the close OSTree integration throughout the intermediate build steps that I have been describing:
Visibility of changes: My company's continuous integration (CI) system runs ostree diff on each pull request, so that the developers can see exactly which files changed in the output rootfs. This is a wonderful tool for gaining confidence in the correctness of the incremental builds.
Fast incremental deployment: OSTree provides tools for deploying a tree to a remote device. This is used to deploy changes to devices in the field ("over the air" software updates), but this same production software-update process is fast enough for interactive development (an incremental build + deploy + reboot in under a minute).
Implementation details
Our Python script has various functions for getting files, tarballs, Git snapshots, and apt packages into OSTree. A tree can consist of as little as a single file, and refs are cheap.
There are also various functions to manipulate trees, such as ostree_combine() in the example above, but also ostree_ln(), ostree_mkdir(), and ostree_mv(). These are fast because they operate directly on OSTree metadata; they don't need to do ostree checkout to manipulate the trees. Note that a ref can point to any tree, it doesn't have to be rooted at the "/" of your final image.
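For example, combining trees without checking anything out maps fairly directly onto ostree commit's --tree=ref= option, which layers existing trees on top of one another. Here is a sketch of roughly what such a rule's command could boil down to; the production implementation may well differ:

    import subprocess

    def combine_trees(repo, input_refs, output_branch):
        # Layer the input trees into a new commit; only OSTree metadata is
        # touched, no file contents are copied or checked out.
        cmd = ["ostree", "--repo=" + repo, "commit",
               "--branch=" + output_branch, "-s", "ostree_combine"]
        for ref in input_refs:
            cmd.append("--tree=ref=" + ref)
        subprocess.run(cmd, check=True)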
To run a command, such as a compilation, there is ostree_mod(), which modifies a tree by running a given command. It will check out the specified tree, optionally chroot into it, run the specified command, and create a new tree from the output. For example:
    ostree_mod(
        input_tree=ostree_combine([build_root, src]),
        command="make -C /src install DESTDIR=/dest",
        chroot=True,
        capture_output_subdir="/dest")
This uses fakeroot and bubblewrap to sandbox the command so that it can't access anything outside of the input tree. Bubblewrap is a tool born from Red Hat's Project Atomic, and used by Flatpak among others, that allows unprivileged users to create secure sandboxes. Here bubblewrap is not used for security, but as a convenient way of ensuring correct, "hermetic" builds. Our version of fakeroot is heavily patched so that the build command sees the file permissions that are stored by OSTree; this allows us to run the build as an unprivileged user but still modify root-owned files.
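A minimal sketch of the kind of bubblewrap invocation involved, assuming a tree already checked out to checkout_dir (the patched fakeroot and the OverlayFS protection described below are omitted):

    import subprocess

    def run_in_tree(checkout_dir, command):
        # Run the build command with the checked-out tree as "/" and no access
        # to the host filesystem or network.
        subprocess.run([
            "bwrap",
            "--bind", checkout_dir, "/",
            "--proc", "/proc",
            "--dev", "/dev",
            "--unshare-all",
            "--die-with-parent",
            "/bin/sh", "-c", command,
        ], check=True)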
OSTree's "bare" repository format is used, which means that the checkout operation only needs to create hard-links to the relevant files inside the repository; this needs to be fast because every build rule that calls ostree_mod() involves an ostree checkout and an ostree commit. OverlayFS is used to ensure that the OSTree repository is not modified by accident via those hard-links. This patch for bubblewrap is needed to support OverlayFS; the patch probably isn't upstreamable because it requires additional capabilities, which is at odds with the bubblewrap project's security goals. There are also several OSTree patches, some of which are merged and some not (yet).
apt2ostree
apt2ostree is a tool that has been extracted from our build system. It builds a Debian/Ubuntu rootfs from a list of .deb packages — much like debootstrap or multistrap. Unlike those tools, the output is an OSTree tree rather than a normal directory. It is faster, parallelized, and incremental. It also records package versions in a "lockfile" for reproducible builds.
From a list of .deb package names, apt2ostree downloads and unpacks each package into its own OSTree tree, then it combines these into a single tree (so far this is equivalent to debootstrap's "stage 1"). It then checks out the tree, runs dpkg --configure -a within a chroot ("stage 2"), and commits the result to OSTree.
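A heavily simplified sketch of what "stage 2" boils down to; the real apt2ostree also constructs the dpkg database and runs inside the fakeroot/bubblewrap sandbox described earlier, so the plain chroot here is an illustrative assumption:

    import subprocess

    def configure_stage(repo, combined_ref, output_branch, workdir):
        # Check out the combined tree (cheap hard-links in a "bare" repo) ...
        subprocess.run(["ostree", "--repo=" + repo, "checkout",
                        combined_ref, workdir], check=True)
        # ... run the package maintainer scripts inside it ...
        subprocess.run(["chroot", workdir, "dpkg", "--configure", "-a"], check=True)
        # ... and commit the configured tree back into the OSTree repository.
        subprocess.run(["ostree", "--repo=" + repo, "commit",
                        "--branch=" + output_branch, "--tree=dir=" + workdir], check=True)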
From a list of packages, apt2ostree performs dependency resolution (via aptly) and generates a "lockfile" that contains the complete list of all packages, their versions, and their SHAs. This lockfile can be committed to Git. Builds from the lockfile are functionally reproducible.
"Stage 1" of apt2ostree is fast for several reasons. It only downloads and extracts any given package once; if it is used in multiple images it doesn't need to be extracted again. This saves disk space too, because the contents of the packages are committed to OSTree so they will share disk space with the built images. Downloading and extracting is done in a separate Ninja rule per package; this allows parallelism (it can be downloading one package at the same time as compiling a second image, or performing other build tasks within the larger build system, all thanks to Ninja) and incremental builds (there is no need to repeat work if the package version hasn't changed). Combining the contents of the packages is fast because it only touches OSTree metadata.
apt2ostree only has a single user as far as I know (my company's build system). See the README file in the apt2ostree repository on GitHub for more information. I don't necessarily expect anyone to use it, but it serves as a good self-contained example of the techniques described in this article.
Conclusions and acknowledgments
We have found that OSTree and Ninja work very well together, thanks to a neat hack: Using a "ref" (a file in the OSTree metadata directory) as the target or dependency of a Ninja rule, to track changes to an entire tree. But most important, I think, is the idea of trees as first-class citizens in a build system. For researchers, OSTree and Ninja provide an easy way to explore these ideas. For production, we have also found OSTree and Ninja to work fantastically well for our use case: system integrators building container images and rootfs images for embedded Linux devices.
Most of these ideas (the good ones, at least) are from my colleague William Manley, who also did most of the implementation of the build system. I merely wrote it up.
Merkle trees and build systems
Posted May 28, 2020 14:37 UTC (Thu) by ScienceMan (subscriber, #122508) [Link]
"Spack models the dependencies of packages as a directed acyclic graph (DAG). The spack find -d command shows the tree representation of that graph. We can also use the spack graph command to view the entire DAG as a graph.
[Example]
$ spack graph hdf5+hl+mpi ^mpich
o hdf5
|\
| o mpich
| |\
| | |\
| | | |\
| | o | | libxml2
| |/| | |
|/|/| | |
| | |\ \ \
o | | | | | zlib
/ / / / /
| o | | | xz
| / / /
| | o | libpciaccess
| |/| |
|/| | |
| | |\ \
| | o | | util-macros
| | / /
| | | o findutils
| | |/|
| | | |\
| | | | |\
| | | | | |\
| | | o | | | texinfo
| | | | | o | automake
| | | | |/| |
| | | |/| | |
| | | | | |/
| | | | | o autoconf
| | | | |/|
| | | |/|/
| | | o | perl
| | | o | gdbm
| | | o | readline
| | | o | ncurses
| |_|/ /
|/| | |
o | | | pkgconf
/ / /
| o | libtool
| |/
| o m4
| o libsigsegv
|
o libiconv
"
Link:
https://spack-tutorial.readthedocs.io/en/latest/tutorial_...
Merkle trees and build systems
Posted May 28, 2020 16:31 UTC (Thu) by nim-nim (guest, #34454) [Link]
There’s no other reliable way to do things: you need the directed property because in the end the computer will need to compute a sequential execution plan, and you need the acyclic property to avoid looping indefinitely in the middle of this plan.
Component systems that break this (for example, when a parent introduces a dependency on its children by testing them in its unit tests) are not automate-able. They need someone to manually construct a giant pile of poo (vendor) so the computer does not need to resolve the giant poo dependency graph to recreate it from scratch.
Merkle trees and build systems
Posted Jun 6, 2020 5:03 UTC (Sat) by rgh (guest, #13511) [Link]
> a giant pile of poo
To use the technical term!
Merkle trees and build systems
Posted May 29, 2020 8:31 UTC (Fri) by drothlis (guest, #89727) [Link]
The novel thing in this article is representing each individual build artifact (be it a single executable, a package, or an entire rootfs tree) as its own Merkle tree. Each of these trees represents the files & directories in a single build artifact, not dependencies between artifacts. Dependencies are still handled by the underlying build system (Ninja, in this case).
Merkle trees and build systems
Posted May 30, 2020 13:20 UTC (Sat) by nim-nim (guest, #34454) [Link]
The result certainly looks convenient but if it was more than an ostree transposition of how things were already done you would not have an apt2ostree in the middle of the article.
Merkle trees and build systems
Posted May 31, 2020 19:53 UTC (Sun) by MrWim (subscriber, #47432) [Link]
This seems like a strange comment to make on an article specifically describing the advantages of making such a transposition.
The only difference between storing source-code in git vs storing it as source tarballs is storage and performance characteristics. But: it's exactly those differences that make git so much more useful than source tarballs. You interact with it and think about it differently once certain operations are cheap.
At the risk of restating things already stated in the article:
- Deployment is now incremental and cheap - you only need to transfer the blobs that have changed, not the whole tarball. The effect is you don't need a separate deployment mechanism during dev vs prod.
- You can store many many different versions of your built images in a reasonable amount of space - so you just stop worrying about what to keep and what the deletion policy should be
- Your intermediate build steps now share storage with your final rootfses, so you don't need to worry about reducing the number of build steps or intermediate artefacts.
- The cost of comparing two trees is now super cheap, so becomes a natural operation to do. Interested in how a particular change has affected your installed image? It's fine we run a diff on the whole tree for every PR. This is particularly useful for changes to the build system itself.
- Maybe you're performing some transformation on your tree that only depends on one subdirectory? That's fine: extracting a subdir is cheap and because it's a merkle tree you know if you need to rerun your build steps, because you can find the SHA of the subtree cheaply.
- Composing a new tree out of subtrees or partial trees is super cheap, so you do so wherever convenient.
You could implement all of the above with tarballs, but it would be so impractical that you wouldn't. With Merkle trees it's natural. Whether it's innovative or not is irrelevant.
Merkle trees and build systems
Posted Jun 5, 2020 10:03 UTC (Fri) by nim-nim (guest, #34454) [Link]
And I strongly suspect that a lot of the parts where you would find existing systems inefficient, are inefficient because rpm and apt systems have to deal with the real world, where code maintenance and ownership is distributed, and you do not have a single dev entity owning the whole codebase that can do whatever it wants at all stages of the build in its own custom (ostree) sets.
From this POV the article (IMHO) mistakes the convenience of a single unified BSD-style build tree with the convenience of ostree itself. Unified build trees *are* definitely more convenient; they just do not scale to the messiness of real life dev organization structures.
Anyway, I did write that the result looked convenient, so no criticism of the ostree implementation on my part, just reacting to people that implied ostree invented hot water.
Merkle trees and build systems
Posted May 28, 2020 15:42 UTC (Thu) by dezgeg (subscriber, #92243) [Link]
What I did not quite understand from this article is how this system deals with changes to the build rules, e.g. if a package d is added to the rootfs example or some option to ./configure is added to some dependency. How does it know that the resulting ninja target of ostree_combine() now needs to be rebuilt? Or is the name of the target somehow dependent on the input parameters of the ostree_combine() call (i.e. same as Nix)?
Merkle trees and build systems
Posted May 28, 2020 17:29 UTC (Thu) by atai (subscriber, #10977) [Link]
Merkle trees and build systems
Posted May 28, 2020 21:55 UTC (Thu) by civodul (guest, #58311) [Link]
Merkle trees and build systems
Posted May 28, 2020 21:16 UTC (Thu) by MrWim (subscriber, #47432) [Link]
> is the name of the target somehow dependent on the input parameters of the ostree_combine() call (i.e. same as Nix)?
Yes, that's right. The names of (almost) all the targets are auto-generated based on the inputs and the command to execute, although even if they weren't, ninja would take care of that for us because it treats the command to execute as one of the inputs for the purposes of dirty detection.
Merkle trees and build systems
Posted May 28, 2020 21:29 UTC (Thu) by drothlis (guest, #89727) [Link]
Or you could change the target name, as you suggest, though you'd need to GC the ostree repo in your build folder periodically.
Merkle trees and build systems
Posted May 29, 2020 6:29 UTC (Fri) by jeeger (subscriber, #104979) [Link]
Merkle trees and build systems
Posted May 29, 2020 10:34 UTC (Fri) by drothlis (guest, #89727) [Link]
I can't speak about Nix with any authority because I haven't used it, but here are some thoughts:
Nix's Merkle tree structure, as described by dezgeg's comment, is capturing the *dependencies* of a package, rather than the *contents* of a single package.
In Nix, as far as I can tell, there's no sharing/deduplication of individual files across different packages or different versions of the same package. Section 7.5 of the Nix PhD thesis (which was provided by civodul in another comment) talks about this problem and provides some workarounds for making deployments more network-efficient —such as calculating binary diffs— but this problem simply doesn't exist if you store the contents as Merkle trees.
It also means that you can't share/reuse any work done by intermediate build steps *within* a package -- with Nix the granularity is at the level of each package.
According to the Build Systems à la Carte paper, Nix resolves transitive dependencies, so it only stores the hashes of the terminal inputs, ignoring intermediate dependencies (search for "Deep Constructive Traces" in the paper). For "cloud" build systems this means that you don't have to wait until intermediate targets are built before deciding if you need to build the target (because it may already be in the build farm's cache). The downside is that you can't have early cutoff: say you've added a comment to a source file; if the compiled ".o" is unchanged, early cutoff means you can stop there because the final build artifact is going to be the same.
Chapter 6 of the Nix thesis talks about making the build outputs "content addressable" (where the checksum describes the build target's contents instead of the build command + dependencies) but that's described as experimental. I don't know if it was ever implemented in Nix. Even if it was, it doesn't use a Merkle tree (as described in the thesis); it serialises the entire directory including all its files' contents (like tar) and then calculates a checksum of this.
P.S. I hope it doesn't sound like I'm dissing Nix. It has many advantages, not least of which: It's an open-source tool that actually exists and anyone can use. My article is about sharing a new technique/idea.
---
Another build system I wanted to mention is BuildStream -- but again, I don't know enough about it, and I ran out of word count & time. I'm unsure of the relationship between BuildStream, BuildGrid, and BuildBox; and they have gone through several architectural changes. These projects work together to provide "cloud" capabilities: farming work out to remote build servers, and providing a cloud-based cache of build artifacts. As far as we can tell they have something like this article's "ostree_mod" that works remotely. They used to use OSTree but now they have developed their own buildbox-casd component. Their reasons for migrating from OSTree, based on a conversation at the London Build Meetup in October 2019:
- They want it to interoperate with HTTP-based CAS ("content addressable store") implementations like Bazel's one (a simple HTTP PUT/GET protocol that can be backed by S3, whatever Google's equivalent is, or any plain old HTTP server).
- There's no ostree push. You have to use ostree pull via SSH reverse proxying, etc.
- They've dropped the GC, instead they expire objects in an LRU fashion. OSTree has the guarantee that if you have a ref, you'll have the contents of that tree. This means you need GC to be able to expire objects without breaking these guarantees. Instead whenever they pull a tree they touch all the files in it, and then expire them in an LRU fashion. They were seeing GC pauses of 24hr with OSTree.
Merkle trees and build systems
Posted May 29, 2020 19:08 UTC (Fri) by walters (subscriber, #7396) [Link]
That said, there are a few tricks one can use, such as having multiple repositories, and then one could implement GC by pulling recently-used refs from one into a new repo (which is really just hardlinking, so pretty cheap), then delete the old repo and move it into place. We could probably add this as a primitive into OSTree itself - it'd make GC cost closer to O(data preserved) and not O(data). Could also amortize by having multiple repos that have different subsets of refs and prune them at different points, re-importing whatever canonical data as needed.
(There's a hugely interesting sub-topic here around whether OSTree is a *cache* of something like a .deb/rpm or whether it's canonical, i.e. your build system outputs it)
Merkle trees and build systems
Posted May 30, 2020 13:26 UTC (Sat) by nim-nim (guest, #34454) [Link]
Merkle trees and build systems
Posted May 31, 2020 20:01 UTC (Sun) by MrWim (subscriber, #47432) [Link]
You're referring to maintainer scripts and the like? With apt2ostree we store both the data and the metadata in separate trees. When we come to combine them together we try to reconstruct a dpkg database that would result from all these packages being unpacked into a chroot.
I think of it being a bit like map-reduce, where you design the reduce step to be as cheap as possible, and the map can be expensive if you like because it is deterministic, cacheable, and parallelizable.
Merkle trees and build systems
Posted Jun 5, 2020 10:09 UTC (Fri) by nim-nim (guest, #34454) [Link]
Merkle trees and build systems
Posted May 31, 2020 20:24 UTC (Sun) by MrWim (subscriber, #47432) [Link]
That said, there are a few tricks one can use, such as having multiple repositories, and then one could implement GC by pulling recently-used refs from one into a new repo (which is really just hardlinking, so pretty cheap), then delete the old repo and move it into place. We could probably add this as a primitive into OSTree itself - it'd make GC cost closer to O(data preserved) and not O(data). Could also amortize by having multiple repos that have different subsets of refs and prune them at different points, re-importing whatever canonical data as needed.
I've been thinking about this and I'm having difficulty seeing how it can be cheaper (at least theoretically).
GC with making new repo
- Walk the trees finding objects to keep - O(metadata preserved) syscalls, O(objects preserved) operations
- New hard-link for each one - O(data preserved) syscalls
- List all objects in old repo deleting each one - O(total objects) syscalls, O(total objects) operations
Theoretical performance of GC with current system
- Walk the trees finding objects to keep - O(metadata preserved) syscalls, O(objects preserved) memory, O(objects preserved) operations
- List all objects in old repo deleting each one not preserved - O(total objects) syscalls, O(total objects) operations
It seems to me that the latter is the same as the former, but you're using the hardlinks in the filesystem to mark an object, while in the former you could be using a hashmap in memory. Unless there's some way of deleting a directory recursively that is cheaper than iterating over all the files, I can't see how the second option could be cheaper?
I've not personally had a problem with ostree gc, but I've found in the past that one of the causes of ostree performance problems is that the GFile interface makes it difficult to reason about what OS-level operations will occur when you make a method call. For example, when I was implementing #1643 I found that it was much faster to work with the GVariants directly than the OstreeRepoFile* interface.
It's not quite done yet, but I've been working on something that might make working with the GVariants directly somewhat more convenient: https://github.com/wmanley/gvariant-rs/blob/master/examples/ostree-ls/src/main.rs
Merkle trees and build systems
Posted May 30, 2020 17:14 UTC (Sat) by Ericson2314 (guest, #139248) [Link]
What's new is that we're now doing it! The company I work for is working on IPFS and Nix, as described in https://discourse.nixos.org/t/obsidian-systems-is-excited... . The underlying changes should make it easy to support other hashing schemes like OSTree's.
I'm really glad to see you all are working on similar things---the shift in perspective from rules on files to rules on subtrees is huge and I hope to see it emerge and spread in as many ways as possible. Hopefully everyone can modularize and we'll have more interop / drop-in replacements and independent composition of networking and storage methods.
Happy to answer any questions about Nix / would love to compare notes on these sorts of things.
Merkle trees and build systems
Posted May 28, 2020 17:18 UTC (Thu) by estansvik (subscriber, #127963) [Link]
Merkle trees and build systems
Posted May 28, 2020 21:20 UTC (Thu) by MrWim (subscriber, #47432) [Link]
It's both really. ldconfig creates the symlinks in /lib and /usr/lib, so it needs to be that too.
Merkle trees and build systems
Posted May 29, 2020 14:36 UTC (Fri) by estansvik (subscriber, #127963) [Link]
Merkle trees and build systems
Posted May 30, 2020 0:31 UTC (Sat) by guillemj (subscriber, #49706) [Link]
While many of the hardcoded assumptions might appear to work, perhaps better now that we have been improving the pseudo-essential package set when it comes to installation bootstrapping (see https://wiki.debian.org/Teams/Dpkg/Spec/InstallBootstrap), these cannot be generally applied to any package in distributions like Debian (or Ubuntu). Things like not running the preinst, or the bare passwd databases, or only handling a hardcoded list of control files, etc. In any case I guess apt2ostree could already further benefit from some of the things that we have been working on with the installation bootstrap improvements.
The general concept looks nice though. :)
Merkle trees and build systems
Posted May 31, 2020 20:44 UTC (Sun) by MrWim (subscriber, #47432) [Link]
To get the apt2ostree build rules right, it needs to know exactly what the inputs and outputs are. I found this difficult to work out with dpkg and apt, so I ended up copying the multistrap approach and doing the unpacking and creating the dpkg database myself. In particular, with apt I was worried about how the local system configuration of apt would affect the creation of these chroots.
Not running preinst has worked out ok for us so far. I guess because we're never doing `apt upgrade`, we always build a new tree from scratch.
> While many of the hardcoded assumptions might appear to work, perhaps better now that we have been improving the pseudo-essential package set when it comes to installation bootstrapping (see https://wiki.debian.org/Teams/Dpkg/Spec/InstallBootstrap)
This seems like really good news. Thanks for sharing.
> these cannot be generally applied to any package in distributions like Debian (or Ubuntu). Things like not running the preinst, or the bare passwd databases, or only handling a hardcoded list of control files, etc. In any case I guess apt2ostree could already further benefit from some of the things that we have been working on with the installation bootstrap improvements.
I guess it's worked for us because we're not trying to build a box-of-parts OS like Debian; we're building an embedded product, so we know that the thing that we test is the same as what our customers will be using, whereas with Debian you don't know what combination of packages your users will be using.
On the subject of apt2ostree - I think the key innovation in it is apt lockfiles. I talk a bit about it in the README: https://github.com/stb-tester/apt2ostree#lockfiles . This allows us to do all the apt bits, and store the result in source control, then all the dpkg bits that happen during the build should be relatively deterministic. It also means that an apt upgrade is explicit and visible in the source repo for the rootfs image.
Merkle trees and build systems
Posted Jun 1, 2020 4:57 UTC (Mon) by pabs (subscriber, #43278) [Link]
Merkle trees and build systems
Posted Jun 1, 2020 5:01 UTC (Mon) by pabs (subscriber, #43278) [Link]
Some folks are also thinking about applying things like ostree or squashfs overlays in order to provide upgradable Debian "appliances".
Some folks are also looking at converting Debian binary packages to Debian "apps" using AppImage/Flatpak.
Merkle trees and build systems
Posted May 31, 2020 22:41 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link]
I'm using Docker (just like pretty much everybody these days) and I'm really disgusted by Dockerfiles. It would be nice to replace them with something better. It's already fairly easy to do by simply tar-ing the target image and importing it, but this loses the "layer" structure of Dockerfiles and negates all the caching advantages. It looks like OSTree can be a perfect fit there.
Merkle trees and build systems
Posted Jun 6, 2020 5:58 UTC (Sat) by rgh (guest, #13511) [Link]
Merkle trees and build systems
Posted Jun 6, 2020 6:08 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]
For Docker caching to properly work, you basically need to do content-based addressing for its layers. I'm actually looking at OSTree and it seems eminently doable, I might actually take a stab at it.
Merkle trees and build systems
Posted Jun 7, 2020 12:10 UTC (Sun) by mathstuf (subscriber, #69389) [Link]
Merkle trees and build systems
Posted Jun 8, 2020 3:38 UTC (Mon) by pabs (subscriber, #43278) [Link]
Sadly the restic storage design misses out splitting directories into chunks of filenames, which means that there is some inefficiency around directories with many files in them.
I wonder when git is going to adopt the file chunking stuff.
Merkle trees and build systems
Posted Jun 8, 2020 4:07 UTC (Mon) by pabs (subscriber, #43278) [Link]
Merkle trees and build systems
Posted Jun 8, 2020 12:19 UTC (Mon) by mathstuf (subscriber, #69389) [Link]
Merkle trees and build systems
Posted Jun 8, 2020 12:30 UTC (Mon) by pabs (subscriber, #43278) [Link]
Merkle trees and build systems
Posted Jun 8, 2020 3:44 UTC (Mon) by pabs (subscriber, #43278) [Link]
Merkle trees and build systems
Posted Jun 17, 2020 1:02 UTC (Wed) by cyphar (subscriber, #110703) [Link]
We are currently going through a more formalised specification process to hopefully get a properly specified version of the scheme I outlined in my talk. While the final scheme might not be the same as the one I outlined (which should be unsurprising given I hacked it together pretty last-minute), the general design should be similar. Unfortunately it will certainly be some time before we can point to production users of such a system.
Merkle trees and build systems
Posted Jun 17, 2020 1:54 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]
But will Docker (or Moby or whatever they'll be called in a week) implement them?
Merkle trees and build systems
Posted Jun 17, 2020 5:28 UTC (Wed) by cyphar (subscriber, #110703) [Link]
Merkle trees and build systems
Posted Jun 17, 2020 11:36 UTC (Wed) by pabs (subscriber, #43278) [Link]
Merkle trees and build systems
Posted Jun 9, 2020 3:16 UTC (Tue) by bergwolf (guest, #55931) [Link]
Could you elaborate a bit why you dislike Dockerfiles?
Merkle trees and build systems
Posted Jun 9, 2020 3:27 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]
Just take a typical Dockerfile from GitHub: https://github.com/wurstmeister/kafka-docker/blob/master/... - this is a random example found using their code search function.
You can see that it does: "apk add --no-cache bash curl jq docker" - basically installs the most recent available version of packages, without any notion of "lockfiles".
apt2ostree vs rpm-ostree
Posted Jul 20, 2020 13:36 UTC (Mon) by fencekicker (guest, #140266) [Link]
We build a custom distro based on CentOS atomic host, and discovered that some things won't work in %pre/%post scripts - e.g. if you want to copy some files around, that won't work; I think 'systemctl enable <service>' works, and I'm not sure about adding users or groups. Obviously, if you want to support multiple platforms, the package scripts won't help, because they will not run on the real target.
The way I see it, rpm-ostree is a very interesting project, but it breaks quite a few expectations that came with creating RPM packages, so it feels rather kludgy in this respect. I don't think people are eager to rewrite their RPMs to handle the rpm-ostree workflow.