Automated Patching for Unreproducible Builds

Abstract

Software reproducibility plays an essential role in establishing trust between source code and built artifacts: independent users compile the same source and compare the outputs. Although testing for unreproducible builds can be automated, fixing them remains challenging. We consider localization granularity and the utilization of historical knowledge to be the two most significant challenges. To tackle them, we propose RepFix, a novel approach that combines tracing-based fine-grained localization with history-based patch generation.

First, to address localization granularity, we adopt system-level dynamic tracing to capture both system call traces and user-space function call information. By integrating kernel probes and user-space probes, we can determine the exact location of each executed build command. Second, to address historical knowledge utilization, we design a similarity-based mechanism that retrieves relevant patches, and we generate new patches by applying the edit operations of the retrieved ones. Drawing on the abundant patches accumulated by the reproducible builds practice, RepFix can generate patches that fix unreproducible builds automatically.
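The retrieval idea can be illustrated with a minimal sketch. It is not RepFix's implementation: the similarity measure here is the standard library's `difflib.SequenceMatcher`, swapped in for whatever metric RepFix actually uses, and the `history` records, their field names, and the `retrieve_patches` helper are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (a stand-in metric, not RepFix's)."""
    return SequenceMatcher(None, a, b).ratio()


def retrieve_patches(command: str, history: list, top_k: int = 1) -> list:
    """Return the top_k historical records whose build command is most similar."""
    ranked = sorted(
        history,
        key=lambda record: similarity(command, record["command"]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical history of previously fixed build commands.
history = [
    {"command": "gzip -9 doc/manual.txt", "fix": "add -n to gzip"},
    {"command": "date +%Y-%m-%d > build_stamp", "fix": "honor SOURCE_DATE_EPOCH"},
]

# A newly localized command is matched against the history.
best = retrieve_patches("gzip -9 docs/changelog.txt", history)[0]
print(best["fix"])
```

In this toy setting the new `gzip` command retrieves the earlier `gzip` fix, whose edit operation could then be adapted to the new command.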

To evaluate the usefulness of RepFix, we conduct extensive experiments over a dataset of 116 real-world packages. RepFix successfully fixes the unreproducible build issues of 62 packages. Moreover, we apply RepFix to Arch Linux packages and successfully fix four of them. Two patches have been accepted by the repository, and for one package the patch has been pushed to and accepted by its upstream repository, so the fix can benefit other downstream repositories.

Framework Overview

Supporting Data

This part is under construction

Dataset

Instructions for the package (tested under Debian bullseye)

  1. Download the repfix-package tarball and extract it
  2. apt-get install pbuilder python3-pip sudo
  3. pip3 install bashlex python-Levenshtein strsimpy zss dill
  4. mkdir /data -p
  5. cd /path/to/repfix-package
  6. mv repfix/strip_dep.py source traces-validate store-validate /data/
  7. mv bullseye-lite.tgz /var/cache/pbuilder
  8. cd /path/to/repfix
  9. sed -i "s#/path/to/#$(pwd)/#g" build_twice.py
  10. If not using root, add $USER to the sudoers file, preferably with NOPASSWD
  11. ./make_patch.py -b ../bugid_map -k ../debian-patches -P apngopt
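
Before running the last step, it can help to confirm that the Python dependencies from step 3 are importable. The following is a small, optional sketch and not part of the RepFix package; the `missing_modules` helper is hypothetical, and the import names are assumptions based on the packages' usual module names (python-Levenshtein is imported as `Levenshtein`).

```python
import importlib.util

# Import names assumed for the packages installed in step 3
# (python-Levenshtein is imported as "Levenshtein").
REQUIRED_MODULES = ["bashlex", "Levenshtein", "strsimpy", "zss", "dill"]


def missing_modules(names):
    """Return the names in `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing Python modules:", ", ".join(missing))
    else:
        print("All required Python modules are available.")
```

If any module is reported missing, rerun the pip3 command from step 3 with the same Python interpreter that will run make_patch.py.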

Patch Information

The patches we submitted to Arch Linux are listed below. All of the bug reports we submitted have been assigned, and two of the patches have been accepted.

Package     Version     Patch   Status
when        1.1.40-2    patch   Accepted
pythia8     8.3.03-3    patch   Accepted
zssh        1.5c-12     patch   Assigned
dd_rescue   1.99.11-1   patch   Assigned