fuzz: Improve flexibility of the fuzz binary

tgross35 · tgross35 · commit 78c94bc40ea4 · 2026-04-12T01:28:26.000-04:00
There are a number of changes here that aren't easy to split up:

* Accept rounding mode as an input, and check status flags as an output.
* Catch C++ exceptions (if enabled) in C++ rather than relying on
  `catch_unwind` in Rust.
* A `DecodeError` type is introduced for propagating parse failures,
  i.e. when the fuzzer gives us an invalid binary. This means
  `catch_unwind` can be eliminated and panics can be reserved for
  failures we want the fuzzer to detect.
* Rename "hard" to "host" since host floats can be softfloat.
* Ignore NaN mismatches with reasons, rather than fixing up the outputs.
* Similarly ignore NaN mismatches when the input is signaling and there
  is a F1->F2->F1 roundtrip with F1 and F2 as the same float, since LLVM
  may do nothing here.
* Add a CLI option to check a single file like the fuzzer does.
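
As a rough illustration of the `DecodeError` idea, here is a minimal sketch of
decoding that reports bad input through `Result` instead of panicking. The
`DecodeError` variants, the `decode` function and the three-byte layout are
hypothetical and are not taken from this commit; only the rounding-mode byte
values mirror the C++ `Round` enum.

```rust
/// Hypothetical stand-in for the `DecodeError` type; the variants are assumptions.
#[derive(Debug, PartialEq)]
pub enum DecodeError {
    /// The input ended before all fields could be read.
    TooShort { needed: usize, got: usize },
    /// The rounding-mode byte is not one of the five known values.
    BadRound(u8),
}

/// Rounding modes, using the same byte values as the C++ `Round` enum.
#[derive(Copy, Clone, Debug, PartialEq)]
pub enum Round {
    NearestTiesToEven = 0,
    TowardZero = 1,
    TowardPositive = 2,
    TowardNegative = 3,
    NearestTiesToAway = 4,
}

impl Round {
    pub fn from_u8(tag: u8) -> Result<Self, DecodeError> {
        Ok(match tag {
            0 => Self::NearestTiesToEven,
            1 => Self::TowardZero,
            2 => Self::TowardPositive,
            3 => Self::TowardNegative,
            4 => Self::NearestTiesToAway,
            _ => return Err(DecodeError::BadRound(tag)),
        })
    }
}

/// Decode a hypothetical input: one rounding-mode byte followed by a
/// little-endian 16-bit operand. Invalid input is an `Err`, never a panic.
pub fn decode(data: &[u8]) -> Result<(Round, u16), DecodeError> {
    if data.len() < 3 {
        return Err(DecodeError::TooShort { needed: 3, got: data.len() });
    }
    let round = Round::from_u8(data[0])?;
    let bits = u16::from_le_bytes([data[1], data[2]]);
    Ok((round, bits))
}
```

With this shape, a caller can skip inputs that return `Err` and treat any
panic as a genuine finding for the fuzzer.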

This introduces some code duplication since the brute force tests are
mostly untouched, and those use some of the same logic. A future commit
will clean this up.

There are some failures here that may be resolved in future LLVM
commits. Examples:

    fuzz/out-foo5/default/crashes/id:000027,sig:06,src:000855,time:629681,execs:17464051,op:havoc,rep:1:
      f16.MulAdd(0x23ff /* 0.015617 */, 0x17ff /* 0.0019522 */, 0x80ff /* -1.5199E-5 */, NearestTiesToEven)
       => 0x0101 /* 1.5318E-5 */ Status(UNDERFLOW | INEXACT) (Rust / rustc_apfloat)
       => 0x0100 /* 1.5259E-5 */ Status(UNDERFLOW | INEXACT) (C++ / llvm::APFloat) <- !!! MISMATCH !!!

LLVM has the incorrect result here.

    fuzz/sync/fuzzer02/crashes/id:000021,sig:06,src:001760,time:533420,execs:9318705,op:havoc,rep:1:
      brainf16.MulAdd(0xbe01 /* -0.126 */, 0x007f /* 1.166E-38 */, 0x0010 /* 1.469E-39 */, NearestTiesToAway)
       => 0x0000 /* 0 */ Status(UNDERFLOW | INEXACT) (Rust / rustc_apfloat)
       => 0x0001 /* 9.184E-41 */ Status(UNDERFLOW | INEXACT) (C++ / llvm::APFloat) <- !!! MISMATCH !!!

Similarly, LLVM is not correct.
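
The two MulAdd cases above can be checked by hand with exact integer
arithmetic. The sketch below is an independent calculation, not code from this
commit; the helper names are made up for illustration. Both results are
subnormal and non-negative, so the rounded count of subnormal steps is also the
expected bit pattern.

```rust
/// Round `num / 2^shift` to the nearest integer (`num` is non-negative and
/// `shift >= 1`). Ties go away from zero if `ties_away`, otherwise to even.
fn round_nearest(num: u64, shift: u32, ties_away: bool) -> u64 {
    let quotient = num >> shift;
    let remainder = num & ((1u64 << shift) - 1);
    let half = 1u64 << (shift - 1);
    let round_up =
        remainder > half || (remainder == half && (ties_away || (quotient & 1) == 1));
    quotient + round_up as u64
}

/// f16.MulAdd(0x23ff, 0x17ff, 0x80ff, NearestTiesToEven)
pub fn f16_muladd_bits() -> u64 {
    // 0x23ff = 2047 * 2^-17, 0x17ff = 2047 * 2^-20, 0x80ff = -255 * 2^-24.
    let product: i64 = 2047 * 2047; // in units of 2^-37
    let addend: i64 = -(255 << 13); // -255 * 2^-24, in units of 2^-37
    let exact = product + addend; // 2101249, positive
    // f16 subnormals are multiples of 2^-24, so 13 bits are rounded off.
    round_nearest(exact as u64, 13, false)
}

/// brainf16.MulAdd(0xbe01, 0x007f, 0x0010, NearestTiesToAway)
pub fn bf16_muladd_bits() -> u64 {
    // 0xbe01 = -129 * 2^-10, 0x007f = 127 * 2^-133, 0x0010 = 16 * 2^-133.
    let product: i64 = -(129 * 127); // in units of 2^-143
    let addend: i64 = 16 << 10; // 16 * 2^-133, in units of 2^-143
    let exact = product + addend; // 1, positive
    // bf16 subnormals are multiples of 2^-133, so 10 bits are rounded off.
    round_nearest(exact as u64, 10, true)
}
```

The f16 sum is 256 + 4097/8192 subnormal steps, just above the halfway point,
so it rounds to 0x0101. The bf16 sum is 1/1024 of a step, so it rounds to
0x0000. Both agree with the Rust output.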

    fuzz/out-06/default/crashes/id:000000,sig:06,src:000060,time:1970,execs:39760,op:havoc,rep:2:
      f32.FToSingleToF(0xff800080 /* NaN */, NearestTiesToEven)
       => 0xffc00080 /* NaN */ Status(0x0) (Rust / rustc_apfloat)
       => 0xff800080 /* NaN */ Status(0x0) (C++ / llvm::APFloat) <- !!! MISMATCH !!!
       => 0xff800080 /* NaN */ Status(0x0) (native host floats) <- !!! MISMATCH (ignored, both are propagated inputs) !!!

Here Rust quiets the input sNaN. This seems like correct behavior since
conversions usually quiet a NaN, but LLVM and the host floats appear to
treat the roundtrip as a no-op since the source and destination formats
are the same.
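
For reference, the difference between the two outputs is exactly the f32 quiet
bit. A minimal sketch with hypothetical helper names, operating on raw bit
patterns only:

```rust
const F32_EXP_MASK: u32 = 0x7f80_0000;
const F32_MANT_MASK: u32 = 0x007f_ffff;
/// Most significant mantissa bit; set for quiet NaNs, clear for signaling ones.
const F32_QUIET_BIT: u32 = 0x0040_0000;

/// True if `bits` encodes a signaling NaN in IEEE binary32.
pub fn is_signaling_nan(bits: u32) -> bool {
    (bits & F32_EXP_MASK) == F32_EXP_MASK
        && (bits & F32_MANT_MASK) != 0
        && (bits & F32_QUIET_BIT) == 0
}

/// Quiet a NaN by setting the quiet bit; sign and payload are preserved.
pub fn quiet(bits: u32) -> u32 {
    bits | F32_QUIET_BIT
}
```

Applied to the input above, `quiet(0xff800080)` gives `0xffc00080`, the Rust
result, while the C++ and host results are the unmodified input.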

    fuzz/sync/fuzzer02/crashes/id:000016,sig:06,src:000190,time:255651,execs:2135435,op:havoc,rep:2:
      f8e4m3fn.Add(0xfd /* -416 */, 0xe8 /* -64 */, TowardNegative)
       => 0xfe /* -448 */ Status(INEXACT) (Rust / rustc_apfloat)
       => 0xff /* NaN */ Status(OVERFLOW | INEXACT) (C++ / llvm::APFloat) <- !!! MISMATCH (ignored, f8e4m3fn may be broken) !!!

Various cases for f8e4m3fn fail, including the below that crashes in
LLVM APFloat:

    target/fuzz-out/fuzzer07/crashes/id:000000,sig:06,src:000629,time:584123,execs:2204007,op:havoc,rep:1:
      f8e4m3fn.MulAdd(0x9d /* -0.102 */, 0x0a /* 0.0195 */, 0x01 /* 0.00195 */, NearestTiesToEven)
       => 0x80 /* -0 */ Status(UNDERFLOW | INEXACT) (Rust / rustc_apfloat)

In a future commit, these can all be added to the corpus.
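
To help read the f8e4m3fn cases above, here is a small decoder sketch. It
assumes the usual layout for this format (1 sign bit, 4 exponent bits with
bias 7, 3 mantissa bits, no infinities, NaN only when all exponent and mantissa
bits are set). The function name is hypothetical and this is not code from the
commit.

```rust
/// Decode an f8e4m3fn bit pattern to `f32` (every finite value is exactly
/// representable in `f32`).
pub fn decode_f8e4m3fn(bits: u8) -> f32 {
    // The only NaN encodings are 0x7f and 0xff; there are no infinities.
    if bits & 0x7f == 0x7f {
        return f32::NAN;
    }
    let sign = if bits & 0x80 != 0 { -1.0 } else { 1.0 };
    let exp = ((bits >> 3) & 0x0f) as i32;
    let mant = (bits & 0x07) as f32;
    let magnitude = if exp == 0 {
        // Subnormal: no implicit leading one, fixed exponent of -6.
        (mant / 8.0) * 2f32.powi(-6)
    } else {
        (1.0 + mant / 8.0) * 2f32.powi(exp - 7)
    };
    sign * magnitude
}
```

For the `Add` case, the inputs decode to -416 and -64, so the exact sum is
-480, beyond the largest finite magnitude of 448 (`0xfe`). That is why the two
implementations disagree on whether this saturates or overflows to NaN.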
diff --git a/README.md b/README.md
@@ -18,7 +18,7 @@ allocating to handle the arbitrary precision needed for conversions to/from deci
 
 However, that port had a fatal flaw: it was added to the `rust-lang/rust` repository
 without its unique licensing status (as a port of a C++ library with its own license)
-being properly tracked, communicated, taken into account, etc.  
+being properly tracked, communicated, taken into account, etc.
 The end result was years of limbo, mostly chronicled in the Rust issue
 [`rust-lang/rust#55993`](https://github.com/rust-lang/rust/issues/55993), in which
 the in-tree port couldn't really receive proper updated or even maintenance, due
@@ -28,39 +28,39 @@ due to its unclear status.
 
 This repository (`rust-lang/rustc_apfloat`) is the result of a 2022 plan on
 [the relevant Zulip topic](https://rust-lang.zulipchat.com/#narrow/stream/231349-t-core.2Flicensing/topic/apfloat), fully put into motion during 2023:
-* the `git` history of the in-tree `compiler/rustc_apfloat` library was extracted  
+* the `git` history of the in-tree `compiler/rustc_apfloat` library was extracted
   (see the separate [`rustc_apfloat-git-history-extraction`](https://github.com/LykenSol/rustc_apfloat-git-history-extraction) repository for more details)
 * only commits that were *both* necessary *and* had clear copyright status, were kept
-* any missing functionality or bug fixes, would have to be either be re-contributed,  
+* any missing functionality or bug fixes, would have to be either be re-contributed,
   or rebuilt from the ground up (mostly the latter ended up being done, see below)
 
 Most changes since the original port had been aesthetic (e.g. spell-checking, `rustfmt`),
 so little was lost in the process.
 
 Starting from that much smaller "trusted" base:
-* everything could use LLVM's new (since 2019) license, "`Apache-2.0 WITH LLVM-exception`"  
+* everything could use LLVM's new (since 2019) license, "`Apache-2.0 WITH LLVM-exception`"
   (see the ["Licensing"](#licensing) section below and/or [LICENSE-DETAILS.md](./LICENSE-DETAILS.md) for more details)
 * new facilities were built (benchmarks, and [a fuzzer comparing Rust/C++/hardware](#fuzzing))
 * excessive testing was performed (via a combination of fuzzing and bruteforce search)
 * latent bugs were discovered (e.g. LLVM issues
 [#63895](https://github.com/llvm/llvm-project/issues/63895) and
 [#63938](https://github.com/llvm/llvm-project/issues/63938))
-* the port has been forwarded in time, to include upstream (`llvm/llvm-project`) changes   
+* the port has been forwarded in time, to include upstream (`llvm/llvm-project`) changes
   to `llvm::APFloat` over the years (since 2017), removing the need for selective backports
 
 ## Versioning
 
-As this is, for the time being, a "living port", tracking upstream (`llvm/llvm-project`)  
+As this is, for the time being, a "living port", tracking upstream (`llvm/llvm-project`)
 `llvm::APFloat` changes, the `rustc_apfloat` crate will have versions of the form:
 
 ```
 0.X.Y+llvm-ZZZZZZZZZZZZ
 ```
-* `X` is always bumped after semver-incompatible API changes,  
+* `X` is always bumped after semver-incompatible API changes,
   or when updating the upstream (`llvm/llvm-project`) commit the port is based on
 * `Y` is only bumped when other parts of the version don't need to be (e.g. for bug fixes)
-* `+llvm-ZZZZZZZZZZZZ` is ["version metadata"](https://doc.rust-lang.org/cargo/reference/resolver.html#version-metadata) (which Cargo itself ignores),  
-  and `ZZZZZZZZZZZZ` always holds the first 12 hexadecimal digits of  
+* `+llvm-ZZZZZZZZZZZZ` is ["version metadata"](https://doc.rust-lang.org/cargo/reference/resolver.html#version-metadata) (which Cargo itself ignores),
+  and `ZZZZZZZZZZZZ` always holds the first 12 hexadecimal digits of
   the upstream (`llvm/llvm-project`) `git` commit hash the port is based on
 
 
@@ -84,7 +84,7 @@ involves an automated build of the original C++ `llvm::APFloat` code with `clang
 Rust code), and has been prototyped and tested on Linux (and is unlikely to work
 on other platforms, or even some Linux distros, though it mostly assumes UNIX).
 
-Example usage:  
+Example usage:
 <sub>(**TODO**: maybe move this to `fuzz/README.md` and/or expand on it)</sub>
 
 ```sh
@@ -103,10 +103,15 @@ cargo afl fuzz -i fuzz/in-foo -o fuzz/out-foo target/release/rustc_apfloat-fuzz
 ```
 
 To visualize the fuzzing testcases, you can use the `decode` subcommand:
+
 ```sh
 cargo run -p rustc_apfloat-fuzz decode fuzz/out-foo/default/crashes/*
 ```
-(this will work even while `cargo afl fuzz`, i.e. AFL++, is running)
+
+Note that `cargo run` and `cargo afl build` conflict, so if running the fuzzer
+and then decoding with the same debug/release setting, this will always trigger
+a rebuild. In these cases, launching the binary directly to call `decode` can
+avoid the extra builds.
 
 ## Licensing
 
diff --git a/etc/fuzz-parallel.sh b/etc/fuzz-parallel.sh
@@ -0,0 +1,82 @@
+#!/bin/bash
+
+# Launch the fuzzer with multiple parallel jobs. Requires tmux.
+#
+# Taken from: <https://github.com/rust-fuzz/afl.rs/issues/132#issuecomment-997827086>
+
+set -euxo pipefail
+
+# Detect cores
+all_cores="$(nproc)"
+used_cores="$((all_cores - 2))"
+in_dir="target/fuzz-in"
+sync_dir="target/fuzz-out"
+tmux_window=afl
+
+if [[ "$used_cores" -lt 2 ]]; then
+    echo "Error: used_cores < 2"
+    exit 1
+fi
+
+function print_usage() {
+    set +x
+    echo "Usage: $0 [-o DIR] [-j PROCS]"
+    echo ""
+    echo "Options:"
+    echo "  -o DIR      Output directory"
+    echo "  -j PROCS    Number of parallel jobs to use (default: $used_cores)"
+    echo "  -h,--help   Print this help and exit"
+    set -x
+}
+
+# Parse arguments
+while [[ "$#" -gt 0 ]]; do
+    case $1 in
+        -j) used_cores="$2"; shift ;;
+        -o) sync_dir="$2"; shift ;;
+        -h|--help) print_usage; exit 0 ;;
+    esac
+    shift
+done
+
+echo "Using $used_cores out of $all_cores cores"
+
+
+# Make sure we have at least one input file
+mkdir -p "$in_dir"
+echo > "$in_dir/empty"
+
+# Start main node
+tmux new -d -s "afl01" -n $tmux_window \
+    "cargo afl fuzz -i $in_dir -o $sync_dir -M fuzzer01 target/release/rustc_apfloat-fuzz"
+echo "Spawned main instance afl01"
+
+# Start secondary instances
+for i in $(seq -f "%02.0f" 2 "$used_cores"); do
+    tmux new -d -s "afl$i" -n $tmux_window \
+        cargo afl fuzz -i $in_dir -o "$sync_dir" -S "fuzzer$i" target/release/rustc_apfloat-fuzz
+    echo "Spawned secondary instance afl$i"
+done
+
+set +x
+
+# Show status output
+echo ""
+echo "Tmux sessions:"
+tmux ls | grep afl
+echo ""
+echo "Tmux cheatsheet (shell):"
+echo "  Attach:"
+echo "    tmux attach -t afl01"
+echo "  Kill all sessions:"
+echo "    tmux kill-server"
+echo ""
+echo "Tmux cheatsheet (inside tmux):"
+echo "  List sessions:"
+echo "    Ctrl-b s"
+echo "  Switch to next session:"
+echo "    Ctrl-b )"
+echo "  Switch to prev session:"
+echo "    Ctrl-b ("
+echo "  Detach:"
+echo "    Ctrl-b d"
diff --git a/fuzz/Cargo.toml b/fuzz/Cargo.toml
@@ -14,4 +14,7 @@ rustc_apfloat = { path = ".." }
 unexpected_cfgs = { level = "warn", check-cfg = [
     # Set by the fuzzer
     'cfg(fuzzing)',
+    # Unstable configs
+    'cfg(target_has_reliable_f16)',
+    'cfg(target_has_reliable_f128)',
 ] }
diff --git a/fuzz/build.rs b/fuzz/build.rs
@@ -6,6 +6,7 @@ use std::process::Command;
 // NB: Any new symbols exported from the C++ source file need to be listed here,
 // everything else will get pruned.
 const CXX_EXPORTED_SYMBOLS: &[&str] = &[
+    "check_error",
     "cxx_apf_eval_op_brainf16",
     "cxx_apf_eval_op_ieee16",
     "cxx_apf_eval_op_ieee32",
diff --git a/fuzz/src/apf_fuzz.cpp b/fuzz/src/apf_fuzz.cpp
@@ -1,6 +1,7 @@
 
 #include <array>
 #include <cstdint>
+#include <string.h>
 #include <stdint.h>
 #include <stdio.h>
 #include "llvm/ADT/APFloat.h"
@@ -43,11 +44,11 @@ enum class OpCode: uint8_t {
 
 /** Similarly, rounding mode is passed as a u8 */
 enum class Round: uint8_t {
-  NearestTiesToEven = 0,
-  TowardZero        = 1,
-  TowardPositive    = 2,
-  TowardNegative    = 3,
-  NearestTiesToAway = 4,
+    NearestTiesToEven = 0,
+    TowardZero        = 1,
+    TowardPositive    = 2,
+    TowardNegative    = 3,
+    NearestTiesToAway = 4,
 };
 
 /* LLVM uses the following values:
@@ -57,8 +58,23 @@ enum class Round: uint8_t {
  * opOverflow  = 0x04,
  * opUnderflow = 0x08,
  * opInexact   = 0x10
+ *
+ * We also use -1 to indicate an exception.
  */
-using StatusFlags = unsigned;
+using StatusFlags = int32_t;
+
+/* Utilities for making C++ exceptions FFI-safe */
+thread_local const char *AP_ERROR = NULL;
+
+StatusFlags handle_std_exception(const std::exception& e) {
+    AP_ERROR = strdup(e.what());
+    return -1;
+}
+
+StatusFlags handle_unknown_exception() {
+    AP_ERROR = "Unknown exception";
+    return -1;
+}
 
 /** Common operations for a given semantics are grouped in this class */
 template<APFloat::Semantics S, typename U, const unsigned BITS = sizeof(U) * 8>
@@ -96,8 +112,7 @@ class FloatEval {
         return APFloat(getSemantics(), APInt(BITS, words));
     }
 
-    /** Evaluate a dynamically specified operation with the given configuration */
-    static StatusFlags eval(OpCode op, Round round, UInt ai, UInt bi,
+    static StatusFlags eval_impl(OpCode op, Round round, UInt ai, UInt bi,
                             UInt ci, UInt &out)
     {
         APFloat a = bits_to_apf(ai);
@@ -154,7 +169,7 @@ class FloatEval {
                 status = a.fusedMultiplyAdd(b, c, rm);
                 break;
             /* FIXME: the below operations could be incorrect and are discarding a
-               status, and (though unlikely) could have mistkes that cancel. It would
+               status, and (though unlikely) could have mistakes that cancel. It would
                be better to make `out` a u128 and only do a single conversion. */
             case OpCode::FToI128ToF:
                 i = APSInt(128, false);
@@ -171,7 +186,7 @@ class FloatEval {
                 a.convert(sem, rm, &cvt_exact);
                 break;
             case OpCode::FToDoubleToF:
-                a.convert(APFloat::IEEEsingle(), rm, &cvt_exact);
+                a.convert(APFloat::IEEEdouble(), rm, &cvt_exact);
                 a.convert(sem, rm, &cvt_exact);
                 break;
             default:
@@ -183,6 +198,24 @@ class FloatEval {
         return (StatusFlags)status;
     }
 
+    /** Evaluate a dynamically specified operation with the given configuration */
+    /* FIXME(perf): since some of these may allocate, we should have an init
+     * function to prealloc, then pass a pointer to this call. */
+    static StatusFlags eval(OpCode op, Round round, UInt ai, UInt bi,
+                            UInt ci, UInt &out) {
+#if __cpp_exceptions
+        try {
+            return eval_impl(op, round, ai, bi, ci, out);
+        } catch (const std::exception& e) {
+            return handle_std_exception(e);
+        } catch(...) {
+            return handle_unknown_exception();
+        }
+#else
+        return eval_impl(op, round, ai, bi, ci, out);
+#endif
+    }
+
     static const fltSemantics &getSemantics() {
         return APFloat::EnumToSemantics(S);
     }
@@ -206,8 +239,13 @@ class EvalX87F80: public FloatEval<APFloat::Semantics::S_x87DoubleExtended, uint
         return Ty::eval(op, round, ai, bi, ci, out); \
     }
 
+/* NB: Every symbol defined here also needs to be in the list in build.rs, otherwise they
+ * will get pruned during optimization. */
 extern "C" {
-    /* NB: Every symbol defined here also needs to be in the list in build.rs */
+    const char *check_error() {
+        return AP_ERROR;
+    }
+
     MAKE_EXTERN(EvalBrainF16, cxx_apf_eval_op_brainf16);
     MAKE_EXTERN(EvalIeee16, cxx_apf_eval_op_ieee16);
     MAKE_EXTERN(EvalIeee32, cxx_apf_eval_op_ieee32);
diff --git a/fuzz/src/apf_fuzz.rs b/fuzz/src/apf_fuzz.rs
@@ -73,6 +73,70 @@ impl<T> FuzzOp<T> {
     }
 }
 
+/// A testable operation, which can be encoded as a byte.
+#[derive(Copy, Clone, Debug, PartialEq)]
+pub enum Op {
+    Neg = 0,
+    Add = 1,
+    Sub = 2,
+    Mul = 3,
+    Div = 4,
+    Rem = 5,
+    MulAdd = 6,
+    FToI128ToF = 7,
+    FToU128ToF = 8,
+    FToSingleToF = 9,
+    FToDoubleToF = 10,
+}
+
+impl Op {
+    pub fn from_u8(tag: u8) -> Option<Self> {
+        let v = match tag {
+            x if x == Self::Neg.to_u8() => Self::Neg,
+            x if x == Self::Add.to_u8() => Self::Add,
+            x if x == Self::Sub.to_u8() => Self::Sub,
+            x if x == Self::Mul.to_u8() => Self::Mul,
+            x if x == Self::Div.to_u8() => Self::Div,
+            x if x == Self::Rem.to_u8() => Self::Rem,
+            x if x == Self::MulAdd.to_u8() => Self::MulAdd,
+            x if x == Self::FToI128ToF.to_u8() => Self::FToI128ToF,
+            x if x == Self::FToU128ToF.to_u8() => Self::FToU128ToF,
+            x if x == Self::FToSingleToF.to_u8() => Self::FToSingleToF,
+            x if x == Self::FToDoubleToF.to_u8() => Self::FToDoubleToF,
+            _ => return None,
+        };
+        Some(v)
+    }
+
+    pub fn to_u8(self) -> u8 {
+        self as u8
+    }
+
+    pub fn airity(self) -> Arity {
+        match self {
+            Op::Neg => Arity::Unary,
+            Op::Add => Arity::Binary,
+            Op::Sub => Arity::Binary,
+            Op::Mul => Arity::Binary,
+            Op::Div => Arity::Binary,
+            Op::Rem => Arity::Binary,
+            Op::MulAdd => Arity::Ternary,
+            Op::FToI128ToF => Arity::Unary,
+            Op::FToU128ToF => Arity::Unary,
+            Op::FToSingleToF => Arity::Unary,
+            Op::FToDoubleToF => Arity::Unary,
+        }
+    }
+}
+
+/// Number of inputs to an operation.
+#[derive(Copy, Clone, Debug)]
+pub enum Arity {
+    Unary = 1,
+    Binary = 2,
+    Ternary = 3,
+}
+
 impl<HF> FuzzOp<HF>
 where
     HF: num_traits::Float
diff --git a/fuzz/src/main.rs b/fuzz/src/main.rs