Show HN: Building better base images

github.com

37 points by akrylov 2 days ago

This project addresses the inefficiencies of traditional Dockerfile-based container builds: storage bloat from duplicate dependencies pulled in by repeated apt-get install commands, network waste from redundant package downloads across different images, and slow iteration cycles that require rebuilding all previous steps. Instead, it uses debootstrap to build minimal base images from scratch that include only the required components, and then derives specialized variants (Java, Kafka, etc.) from these common foundations. The result is significantly leaner images, faster builds, and more efficient resource utilization than standard Docker layer stacking.
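
Concretely, the underlying debootstrap flow looks roughly like this (a sketch; the suite, paths, and package are illustrative, not taken from the project's mkimage.sh):

  # build a minimal Debian rootfs on the host (suite/mirror are examples)
  sudo debootstrap --variant=minbase bookworm rootfs http://deb.debian.org/debian
  # customize it in a chroot jail with the regular package manager
  sudo chroot rootfs sh -c 'apt-get update && apt-get install -y --no-install-recommends ca-certificates'
  # pack the tree into a single-layer image
  sudo tar -C rootfs -c . | docker import - mybase:bookworm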

mubou 2 days ago

I'm not really understanding what this does specifically. It looks like it creates the filesystem on the host machine using chroot and then tarballs it?

Is there an advantage to that over combining layers and using cache mounts to avoid those redundant downloads?
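
(For reference, the cache-mount pattern looks roughly like this with BuildKit; the base image and package are just an example:)

  # syntax=docker/dockerfile:1
  FROM debian:bookworm-slim
  # share the apt cache across builds instead of re-downloading packages;
  # note the official images ship a docker-clean config that empties the
  # cache, which may need disabling for this to help
  RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
      --mount=type=cache,target=/var/lib/apt,sharing=locked \
      apt-get update && apt-get install -y --no-install-recommends curl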

A side-by-side comparison of dive's output would be helpful (https://github.com/wagoodman/dive).

Also can you clarify what you mean by "requiring full rebuilds of all previous steps"?

  • akrylov 2 days ago

    It’s basically just a fancy bash script (mkimage.sh) and Makefiles for calling the scripts with different sets of parameters. The process is the exact same one used to create base docker images: chroot plus a package manager (apt or yum) installing packages into a chroot jail. That is how the ubi9 or debian slim base images are made. With this tool you can extend the process to install dependencies, run security checks, and sign the image all in one go. It’s easy to extend, so you can create base images for Kafka with different Java distributions, for example, which is very useful for testing and performance tuning.

    Imagine you work at a large org and you want to control all the images used for CI/CD workers. Instead of scattering that across different Dockerfiles and scripts (Java, NodeJS, Python, etc.) you can just use a single tool. At least, that's why I built it in the first place.
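
    To make that concrete, a hypothetical invocation (the target and variable names here are made up, not taken from the repo) might look like:

      # one tool, parameterized per variant
      make base SUITE=bookworm
      make kafka JAVA_DIST=temurin-21
      make kafka JAVA_DIST=corretto-21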

  • mrbluecoat 2 days ago

    I'm similarly curious why not just use Alpine or Void rootfs if container size is important?

    • akrylov 2 days ago

      For the same reason hyperscalers build and maintain their own distros and base images: to have complete control over the supply chain.

mathfailure 2 days ago

If the idea was to merge different layers - why not do something like this instead?

  FROM your_image AS initial

  FROM scratch
  COPY --from=initial / /

  • gkfasdfasdf 2 days ago

    I guess one advantage of the author's approach is that any apt-get installs etc. done while building the initial image can reuse the host package cache.
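
    For example (hypothetical, not necessarily how the project wires it up), the build could bind-mount the host's apt cache into the chroot:

      # downloaded .debs land in, and are served from, the host cache
      sudo mount --bind /var/cache/apt/archives rootfs/var/cache/apt/archives
      sudo chroot rootfs apt-get install -y openjdk-17-jre-headless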

anotherhue 2 days ago

If it tickles your fancy, may I also suggest trying Nix to build docker images?

Personally I've soured on the Dockerfile approach as it feels like we're just shuffling bytes around rather than composing something.

https://nix.dev/tutorials/nixos/building-and-running-docker-...
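
A minimal sketch of that approach (dockerTools.buildLayeredImage is the actual nixpkgs helper; the name and contents here are just an example). With an image.nix like:

  { pkgs ? import <nixpkgs> {} }:
  pkgs.dockerTools.buildLayeredImage {
    # store paths become their own layers, which caches well
    name = "hello-layered";
    contents = [ pkgs.hello ];
  }

you can build and load it with:

  nix-build image.nix && docker load < result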

  • numbsafari 2 days ago

    I have completely soured on Dockerfiles. I view them as anathema.

    The supposed "caching" of layers really doesn't work in practice unless you add a bunch of other infrastructure and third-party tooling to your build process. Getting truly incremental and reproducible layers into your build process is non-trivial, and the Dockerfile approach fails to take advantage of that work once you've done it.

    • onedognight 2 days ago

      You need to start with the right base. Here’s a container-first 100%-reproducible from-scratch base to build on.

      [0] https://stagex.tools/

  • sepositus 2 days ago

    A surprising downside to Nix containers is that a majority of packages are not optimized for containers. For example, try adding a dependency on `git` and see how big the container grows. Granted, the good packages (like git) allow customization, but it requires really digging into the code. Some packages just straight up ship with a ton of bloat, and the only thing you can do is basically fork and maintain it yourself.

    • max-privatevoid 2 days ago

      It's a problem with nixpkgs. It would be cool to have an Alpine-like alternative package set focused on minimal package size.

      • pxc 2 days ago

        There is, isn't there? That's what `pkgsStatic` in Nixpkgs is: statically compiled packages with small closures, built with musl, just like Alpine.

    • Rucadi 2 days ago

      You could try to statically link them if the package supports it; pkgsStatic does this by building against musl:

        nix build github:NixOS/nixpkgs#pkgsStatic.git

      which returns the package as:

        $ ls -lah git
        -r-xr-xr-x 1 rucadi rucadi 5.1M Jan 1 1970 git
        $ ldd git
        not a dynamic executable

      So you don't really need to grow the container.
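
      And dropping that into an otherwise empty image is trivial, e.g. (a sketch):

        FROM scratch
        COPY git /usr/bin/git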

      • sepositus 2 days ago

        Yeah, it's a problem on a per-package basis. My point isn't how to solve the git problem, but that the experience can vary wildly depending on the package. It can be surprising and often comes at the expense of time spent navigating the insanity that is nixpkgs :)

  • akrylov 2 days ago

    Nix is cool, but with Nix one needs to know Nix. Personally, I prefer just using scripting languages. LLMs made code cheaper, but debugging became expensive.