Evaluate using Profile-Guided Optimization (PGO) and Post Link Optimization (PLO) for building packages
Hello.
I am currently investigating the effects of PGO and PLO on different kinds of software; all my current results are available at https://github.com/zamazan4ik/awesome-pgo. According to these results, enabling PGO and PLO can improve overall performance in many cases. Since Alpine packages often run in resource-constrained environments (such as weak CPUs), I think trying to reduce CPU usage for Alpine packages would be a good idea.
PGO is already a well-known technique. The PGO performance results I know of are collected at https://github.com/zamazan4ik/awesome-pgo#pgo-showcases . Several OS distros have already enabled PGO for some packages such as GCC, Rustc, Chromium, and Firefox (which packages depends on the distro, of course).
I think we can try to expand PGO usage across Alpine packages. E.g. PGO can be enabled for Alpine Clang (https://gitlab.alpinelinux.org/alpine/aports/-/blob/master/main/clang17/APKBUILD?ref_type=heads) since Clang already supports building with PGO + BOLT (https://github.com/llvm/llvm-project/blob/main/clang/cmake/caches/BOLT.cmake).
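
To make the discussion more concrete, here is a minimal sketch of the usual instrumentation-based PGO flow with Clang: build instrumented, run a training workload, merge the raw profiles, rebuild with the profile. It only prints the commands (a dry run). The function name, file names, and profile directory are hypothetical placeholders, and a real APKBUILD would need a package-specific training workload.

```python
import shlex


def pgo_build_plan(source, binary, profile_dir="pgo-profiles", cc="clang"):
    """Return the four PGO stages as argv lists (nothing is executed).

    All paths and names here are illustrative assumptions, not Alpine conventions.
    """
    profdata = f"{profile_dir}/merged.profdata"
    instrumented = f"{binary}.instrumented"
    return [
        # 1. Build an instrumented binary that writes raw profiles into profile_dir.
        [cc, "-O2", f"-fprofile-generate={profile_dir}", source, "-o", instrumented],
        # 2. Run a representative training workload with the instrumented binary.
        [f"./{instrumented}"],
        # 3. Merge the raw profiles into a single indexed profile.
        ["llvm-profdata", "merge", f"-output={profdata}", profile_dir],
        # 4. Rebuild the final binary using the merged profile.
        [cc, "-O2", f"-fprofile-use={profdata}", source, "-o", binary],
    ]


if __name__ == "__main__":
    for cmd in pgo_build_plan("app.c", "app"):
        print(shlex.join(cmd))
```

The quality of the result depends mostly on step 2: the training workload should resemble what users of the package actually run.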
Regarding Post Link Optimization (PLO), right now there are two main tools - LLVM BOLT and Google Propeller.
According to the Facebook research paper (https://research.facebook.com/publications/bolt-a-practical-binary-optimizer-for-data-centers-and-beyond/), LLVM BOLT (https://github.com/llvm/llvm-project/blob/main/bolt/README.md) helps achieve better performance for various packages such as compilers and interpreters. I think it would be a good idea to enable LLVM BOLT for some packages to deliver faster binaries to users. I suggest BOLT rather than Propeller since Propeller is less stable right now.
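
For reference, a rough sketch of the sampling-based BOLT flow described in the BOLT README: record a profile with `perf`, convert it with `perf2bolt`, then rewrite the binary with `llvm-bolt`. As above, this is a dry run that only prints commands; the function name and file names are hypothetical. It assumes the binary was linked with relocations preserved (`--emit-relocs`) and that the build machine supports branch sampling (LBR), which may not hold on every builder.

```python
import shlex


def bolt_plan(binary, workload_args=(), profile="perf.fdata"):
    """Return the profile / convert / optimize commands as argv lists (nothing is executed).

    Assumes `binary` was linked with --emit-relocs; file names are illustrative.
    """
    return [
        # 1. Sample the binary under a representative workload, recording branch data.
        ["perf", "record", "-e", "cycles:u", "-j", "any,u", "-o", "perf.data",
         "--", binary, *workload_args],
        # 2. Convert the perf samples into BOLT's profile format.
        ["perf2bolt", "-p", "perf.data", "-o", profile, binary],
        # 3. Rewrite the binary with a profile-driven code layout.
        ["llvm-bolt", binary, "-o", f"{binary}.bolt", f"-data={profile}",
         "-reorder-blocks=ext-tsp", "-reorder-functions=hfsort",
         "-split-functions", "-split-all-cold", "-dyno-stats"],
    ]


if __name__ == "__main__":
    for cmd in bolt_plan("./app", ["--bench"]):
        print(shlex.join(cmd))
```

Since BOLT works on the already linked binary, it can be applied on top of a PGO build as an extra packaging step.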
Here are some examples of how LLVM BOLT is already integrated into other projects:
- Rustc: https://github.com/rust-lang/rust/pull/116352
- CPython: https://github.com/python/cpython/pull/95908
- Pyston:
- Clang: https://github.com/llvm/llvm-project/blob/main/clang/cmake/caches/BOLT.cmake
So at least for the projects above, the effects of LLVM BOLT have been tested and some preparation has already been done upstream, which should make it easier to enable BOLT for these packages.
For some other projects, work on integrating LLVM BOLT into the build scripts is still ongoing:
- Chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=1163978
- Firefox: https://bugzilla.mozilla.org/show_bug.cgi?id=1789087
- The same for Propeller (an LLVM BOLT alternative): https://bugzilla.mozilla.org/show_bug.cgi?id=1509314
- NodeJS: https://github.com/nodejs/node/issues/50379
- LDC: https://github.com/ldc-developers/ldc/issues/4228
- GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112492
More LLVM BOLT performance results for these and other projects can be found here:
- Rustc:
- CPython: https://github.com/python/cpython/pull/95908
- YDB: https://github.com/ydb-platform/ydb/issues/140
- Clang:
- LDC: https://github.com/ldc-developers/ldc/issues/4228#issuecomment-1334499428
- NodeJS: https://aaupov.github.io/blog/2020/10/08/bolt-nodejs
- Chromium: https://aaupov.github.io/blog/2022/11/12/bolt-chromium
- MySQL, MongoDB, memcached, Verilator: https://people.ucsc.edu/~hlitz/papers/ocolos.pdf
I have not created an issue per project (like "Enable BOLT for Clang", "Enable PGO for GCC", etc.) since I think we first need to discuss the approach. If we agree on enabling BOLT or expanding PGO usage for some packages, we can then create additional issues (and use this issue as a meta issue).