I thought it'd be nice to collect all of the random things I've found and worked around
along the way as little articles ("funkystuff"), so here's the first one.
Compiling glibc with -mfpu=neon-vfpv4 (with gcc 12.3.0) causes the testsuite to fail 8 tests:
math/test-{double,float32x,float64,ldouble}-{j0,log}.
(Compiling with -mfpu=neon (=VFPv3) passes.)
The "j0" tests only return "failed" (which is not very helpful), but test-double-log
gave a relatively helpful summary of the problem (I shortened it a bit):
testing double (without inline functions)
Failure: Test: log (0xe.a0288c3cb5ecp-4)
Result:
is: -8.9814025035627298e-02 -0x1.6fe0d4c400978p-4
should be: -8.9814025035627312e-02 -0x1.6fe0d4c400979p-4
Maximal error of `log'
is : 1 ulp
accepted: 0 ulp
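To make the "1 ulp" in that summary a bit more concrete, here is a small Python sketch (not part of the glibc testsuite, just my own check) that takes the two hex values from the output above and confirms they are direct neighbours among the doubles. The variable names are made up for illustration, and it assumes Python 3.9 or newer for `math.ulp` and `math.nextafter`.

```python
import math

# Both hex values are copied from the test summary above.
got = float.fromhex("-0x1.6fe0d4c400978p-4")       # the "is" line
expected = float.fromhex("-0x1.6fe0d4c400979p-4")  # the "should be" line

# Size of one unit in the last place at this magnitude.
ulp = math.ulp(expected)

# The error expressed in ulps.
error_in_ulps = abs(got - expected) / ulp

# Stepping one double from "got" towards -infinity should land on "expected".
are_neighbours = math.nextafter(got, -math.inf) == expected

print(error_in_ulps, are_neighbours)
```

So the result is off by exactly the last bit of the mantissa, which is what the testsuite refuses to accept for `log`.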
Since the only real addition in VFPv4 (compared to VFPv3) seems to be Vector Fused Multiply Accumulate, whose description says "The instruction does not round the result of the multiply before the accumulation.", I think this is a rounding difference: the fused instruction rounds once where a separate multiply and add would round twice, so the last bit can come out differently from what the tests expect. At least that's not supposed to be an actual issue (a difference of 1 ULP in functions like these, eh).
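Here is a rough sketch of how that "no rounding after the multiply" behaviour can change a result. It is not what glibc's `log` does internally, and it doesn't use the actual VFPv4 instruction: the fused version is emulated in software with exact fractions, and the function names and input values are hypothetical ones I picked so the effect is easy to see.

```python
from fractions import Fraction

def unfused_multiply_add(a, b, c):
    """Multiply, round to double, then add and round again (two roundings)."""
    product = a * b
    return product + c

def fused_multiply_add(a, b, c):
    """Emulated fused multiply-add: exact a*b + c, rounded to double once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

# Inputs chosen so that a*b = 1 - 2**-60, which does not fit in a double.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

unfused = unfused_multiply_add(a, b, c)  # product rounds to 1.0, so this is 0.0
fused = fused_multiply_add(a, b, c)      # keeps the tiny -2**-60 term

print(unfused, fused)
```

This example is an extreme case where the whole result hangs on the bit that gets rounded away. In something like `log` the effect would more plausibly be a last-bit difference somewhere in the middle of the calculation, which fits the 1 ulp failure above.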
Anyways, I'm compiling glibc with -mfpu=neon (so -lm is nice and exact), but leaving -mfpu=neon-vfpv4 enabled for the rest of the system.