Somebody asked me to write up something about buildinfo files for Alpine, which would be the next step for reproducible builds.
For reproducible builds we want to be able to repeat the build for a given package and get a package that is bit-for-bit identical to the original. It is often essential that an identical build environment is used (for example, a compiler from today might compile the binary differently than a compiler from 2021).
To solve this, distros started generating buildinfo files (either embedding them in the package or serving them separately) that contain a description of the build environment, which can be used to recreate it. The specific format is distro specific and you might need to come up with your own, but you are welcome to borrow heavily from the Arch Linux one: https://www.archlinux.org/pacman/BUILDINFO.5.html
Arch embeds the buildinfo files into the final binary package. Debian currently serves them separately but is considering embedding them too. Unless you need to save every possible kB, I would recommend embedding them.
Without being super familiar with the Alpine build system, I would suggest the following fields:
A simple file format version so we can update this in the future
The name of the package that was built
The version of the package that was built
The architecture the package was built for (this could be something like 'any' for packages that aren't architecture specific).
If you only distribute packages that are built by your CI after the package was committed (from my experience as a maintainer this is the case in Alpine), you can save yourself a few headaches and simply record the aports commit that you're using to build the package.
This allows you to canonically identify the APKBUILD that was used to build the package.
Theoretically this would also allow you to skip builddate= and installed=: you could derive builddate= from the commit datetime, and the set of installed packages by inspecting the APKBUILD files of the dependencies described in our APKBUILD. This has an interesting advantage, because you could get away with not having buildinfo files at all (by rebuilding all packages for a release). Implementing this is probably quite advanced, though, and it wouldn't work for edge. Nix and Guix have similar properties, but none of the more traditional distros are currently doing something like this.
Some compilers embed the build directory path in the binaries they produce, so we might have to normalize this. Preferably compilers wouldn't do this in the first place, but it is a rather common problem and hard to fix everywhere.
The packages we build for Arch are built in containers and the build dir is always going to be /build, but we record this anyway.
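One possible way to normalize the build directory, besides always building in the same path, is the -ffile-prefix-map option that recent GCC and Clang releases accept. The sketch below only assembles that flag. The helper names are hypothetical, /build is taken from the Arch example above, and whether abuild should set such a flag is an open question rather than something decided here.

```python
def prefix_map_flag(builddir, canonical="/build"):
    """Return a compiler flag that rewrites `builddir` to `canonical`
    in paths the compiler embeds (debug info, __FILE__, ...)."""
    return f"-ffile-prefix-map={builddir}={canonical}"


def normalized_cflags(cflags, builddir, canonical="/build"):
    """Append the prefix-map flag to an existing CFLAGS string."""
    return f"{cflags} {prefix_map_flag(builddir, canonical)}".strip()


if __name__ == "__main__":
    # Hypothetical local checkout path, for illustration only.
    print(normalized_cflags("-Os", "/home/user/aports/main/foo/src"))
```

Recording the real build directory in the buildinfo file is still useful as a fallback for build systems that ignore such flags.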
Some build systems use the current time/date during the build. Hardcoding this to e.g. 1970-01-01 sometimes causes issues, so there's something called SOURCE_DATE_EPOCH (abuild itself already supports this when generating tar files) to set a canonical build time. Arch currently needs this field because the canonical build time is simply "whenever the package was first built" and rebuilders read this value from the buildinfo file. Debian doesn't need this because you can deterministically read the build date from debian/changelog.
You may not need this field if you are able to deterministically derive a datetime from e.g. git with commit=.
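To illustrate why a canonical build time matters, here is a minimal sketch of how a tool might honor SOURCE_DATE_EPOCH when writing a tar archive. This is not how abuild does it; the function names are hypothetical, and the choices to sort entries and zero out ownership are assumptions about what a reproducible archive needs.

```python
import io
import os
import tarfile


def source_date_epoch(default=0):
    """Return SOURCE_DATE_EPOCH as an int, or `default` if it is unset."""
    value = os.environ.get("SOURCE_DATE_EPOCH")
    return int(value) if value else default


def deterministic_tar(files, mtime):
    """Build an uncompressed tar from a {name: bytes} mapping.

    Entries are sorted by name, and every entry gets the same mtime,
    mode and owner, so the output depends only on names and contents.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = mtime
            info.mode = 0o644
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


if __name__ == "__main__":
    archive = deterministic_tar({"hello.txt": b"hello\n"}, source_date_epoch())
    print(len(archive), "bytes")
```

A rebuilder would set SOURCE_DATE_EPOCH to the value recorded in the buildinfo file (or derived from the commit) before running the build, so that both builds stamp the same time.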
Certain abuild options that might've been passed to the build. You may or may not need this (you know the ins and outs of abuild better than me); the default /etc/abuild.conf can be restored from the packages listed in installed= as described below.
Arch also has buildenv= for a very similar purpose. You probably know best what these options should look like for Alpine.
The list of all installed packages, including specific versions. A rebuilder backend would use this list to set up an identical Alpine chroot. Arch Linux uses containers to ensure the packages are always built in a clean environment with no modifications to e.g. config files.
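Putting the fields above together, a buildinfo file could be a plain list of key=value lines, with keys such as installed= repeated once per package. The sketch below writes and parses such a file. Only commit=, builddate= and installed= are named in this write-up; the other key names (format, pkgname, pkgver, arch, builddir, options), the sample values and the placeholder commit hash are assumptions made for illustration, not an existing Alpine format.

```python
def render_buildinfo(fields):
    """Render (key, value) pairs as key=value lines.

    A list value is written as one line per item, so repeated keys
    like installed= keep their order.
    """
    lines = []
    for key, value in fields:
        values = value if isinstance(value, list) else [value]
        for item in values:
            lines.append(f"{key}={item}")
    return "\n".join(lines) + "\n"


def parse_buildinfo(text):
    """Parse key=value lines into {key: [values, ...]}."""
    result = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        result.setdefault(key.strip(), []).append(value.strip())
    return result


if __name__ == "__main__":
    example = [
        ("format", "1"),                        # hypothetical key
        ("pkgname", "hello"),                   # hypothetical key
        ("pkgver", "1.0-r0"),                   # hypothetical key
        ("arch", "x86_64"),                     # hypothetical key
        ("commit", "0123456789abcdef"),         # placeholder, not a real commit
        ("builddir", "/build"),                 # hypothetical key
        ("builddate", "1600000000"),
        ("options", "!check"),                  # hypothetical key
        ("installed", ["musl-1.2.2-r0", "busybox-1.33.0-r0"]),  # sample values
    ]
    print(render_buildinfo(example), end="")
```

A rebuilder would read the installed= entries from the parsed result and install exactly those versions into a fresh chroot before starting the build.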