hardfloat: implement float32/64 square root
author Emilio G. Cota <cota@braap.org>
Sat, 17 Mar 2018 04:30:40 +0000 (00:30 -0400)
committer Alex Bennée <alex.bennee@linaro.org>
Mon, 17 Dec 2018 08:25:25 +0000 (08:25 +0000)
commit f131bae8a7b7ed1928cc94c69df291db609c316a
tree 10c20d7f1abec552934fd03c05450f62dc9232da
parent ccf770ba7396c240ca8a1564740083742dd04c08
hardfloat: implement float32/64 square root

Performance results for fp-bench:

Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
  sqrt-single: 42.30 MFlops
  sqrt-double: 22.97 MFlops
- after:
  sqrt-single: 311.42 MFlops
  sqrt-double: 311.08 MFlops

Here USE_FP makes a large difference for f64, with throughput
going from ~200 MFlops to ~300 MFlops.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
fpu/softfloat.c