Math Functions

Source Locations

Implementation Requirements / Goals

  • The highest priority is to be as accurate as possible, according to the C and IEEE 754 standards. By default, we aim to be correctly rounded for all rounding modes. Computations are performed, and final results produced, under the current rounding mode of the floating-point environment.

    • To test for correctness, we compare the outputs with other correctly rounded multiple-precision math libraries such as the GNU MPFR library or the CORE-MATH library.

  • Our next requirement is that the outputs are consistent across all platforms. Note that this requirement is satisfied automatically if the implementation is correctly rounded.

  • Our last requirement for the implementations is to have good and predictable performance:

    • The average performance should be comparable to other libc implementations.

    • The worst case performance should be within 10X-20X of the average.

    • Platform-specific implementations or instructions can be added whenever it makes sense and provides a significant performance boost.

  • For other use cases that have strict requirements on the code size, memory footprint, or latency, such as embedded systems, we will aim to be as accurate as possible within the memory or latency budgets, and consistent across all platforms.

Add a new math function to LLVM libc

Implementation Status

Basic Operations

Per-type status columns, in order: <Func_f> (float), <Func> (double), <Func_l> (long double), <Func_f16> (float16), <Func_f128> (float128). Entries are listed in that order; columns with no entry are omitted.

<Func>              Per-type entries  C23 Definition Section  C23 Error Handling Section
------------------  ----------------  ----------------------  --------------------------
ceil                                  7.12.9.1                F.10.6.1
canonicalize                          7.12.11.7               F.10.8.7
copysign                              7.12.11.1               F.10.8.1
dadd                N/A, N/A, N/A, *  7.12.14.1               F.10.11
ddiv                N/A, N/A, N/A, *  7.12.14.4               F.10.11
dfma                N/A, N/A, N/A, *  7.12.14.5               F.10.11
dmul                N/A, N/A, N/A, *  7.12.14.3               F.10.11
dsub                N/A, N/A, N/A, *  7.12.14.2               F.10.11
f16add              *, *, *, N/A      7.12.14.1               F.10.11
f16div              *, *, *, N/A      7.12.14.4               F.10.11
f16fma              *, *, *, N/A      7.12.14.5               F.10.11
f16mul              *, *, *, N/A      7.12.14.3               F.10.11
f16sub              *, *, *, N/A      7.12.14.2               F.10.11
fabs                                  7.12.7.3                F.10.4.3
fadd                N/A, N/A          7.12.14.1               F.10.11
fdim                                  7.12.12.1               F.10.9.1
fdiv                N/A, N/A, *       7.12.14.4               F.10.11
ffma                N/A, N/A, *       7.12.14.5               F.10.11
floor                                 7.12.9.2                F.10.6.2
fmax                                  7.12.12.2               F.10.9.2
fmaximum                              7.12.12.4               F.10.9.4
fmaximum_mag                          7.12.12.6               F.10.9.4
fmaximum_mag_num                      7.12.12.10              F.10.9.5
fmaximum_num                          7.12.12.8               F.10.9.5
fmin                                  7.12.12.3               F.10.9.3
fminimum                              7.12.12.5               F.10.9.4
fminimum_mag                          7.12.12.7               F.10.9.4
fminimum_mag_num                      7.12.12.11              F.10.9.5
fminimum_num                          7.12.12.9               F.10.9.5
fmod                                  7.12.10.1               F.10.7.1
fmul                N/A, N/A, *       7.12.14.3               F.10.11
frexp                                 7.12.6.7                F.10.3.7
fromfp                                7.12.9.10               F.10.6.10
fromfpx                               7.12.9.11               F.10.6.11
fsub                N/A, N/A, *       7.12.14.2               F.10.11
getpayload                            F.10.13.1               N/A
ilogb                                 7.12.6.8                F.10.3.8
iscanonical                           7.12.3.2                N/A
issignaling                           7.12.3.8                N/A
ldexp                                 7.12.6.9                F.10.3.9
llogb                                 7.12.6.10               F.10.3.10
llrint                                7.12.9.5                F.10.6.5
llround                               7.12.9.7                F.10.6.7
logb                                  7.12.6.17               F.10.3.17
lrint                                 7.12.9.5                F.10.6.5
lround                                7.12.9.7                F.10.6.7
modf                                  7.12.6.18               F.10.3.18
nan                                   7.12.11.2               F.10.8.2
nearbyint                             7.12.9.3                F.10.6.3
nextafter                             7.12.11.3               F.10.8.3
nextdown                              7.12.11.6               F.10.8.6
nexttoward          N/A               7.12.11.4               F.10.8.4
nextup                                7.12.11.5               F.10.8.5
remainder                             7.12.10.2               F.10.7.2
remquo                                7.12.10.3               F.10.7.3
rint                                  7.12.9.4                F.10.6.4
round                                 7.12.9.6                F.10.6.6
roundeven                             7.12.9.8                F.10.6.8
scalbln                               7.12.6.19               F.10.3.19
scalbn                                7.12.6.19               F.10.3.19
setpayload                            F.10.13.2               N/A
setpayloadsig                         F.10.13.3               N/A
totalorder                            F.10.12.1               N/A
totalordermag                         F.10.12.2               N/A
trunc                                 7.12.9.9                F.10.6.9
ufromfp                               7.12.9.10               F.10.6.10
ufromfpx                              7.12.9.11               F.10.6.11

Higher Math Functions

Per-type status columns, in order: <Func_f> (float), <Func> (double), <Func_l> (long double), <Func_f16> (float16), <Func_f128> (float128). Entries are listed in that order; columns with no entry are omitted.

<Func>              Per-type entries  C23 Definition Section  C23 Error Handling Section
------------------  ----------------  ----------------------  --------------------------
acos                                  7.12.4.1                F.10.1.1
acosh                                 7.12.5.1                F.10.2.1
acospi                                7.12.4.8                F.10.1.8
asin                                  7.12.4.2                F.10.1.2
asinh                                 7.12.5.2                F.10.2.2
asinpi                                7.12.4.9                F.10.1.9
atan                                  7.12.4.3                F.10.1.3
atan2               1 ULP             7.12.4.4                F.10.1.4
atan2pi                               7.12.4.11               F.10.1.11
atanh                                 7.12.5.3                F.10.2.3
atanpi                                7.12.4.10               F.10.1.10
cbrt                                  7.12.7.1                F.10.4.1
compoundn                             7.12.7.2                F.10.4.2
cos                                   7.12.4.5                F.10.1.5
cosh                                  7.12.5.4                F.10.2.4
cospi                                 7.12.4.12               F.10.1.12
dsqrt               N/A, N/A, N/A, *  7.12.14.6               F.10.11
erf                                   7.12.8.1                F.10.5.1
erfc                                  7.12.8.2                F.10.5.2
exp                                   7.12.6.1                F.10.3.1
exp10                                 7.12.6.2                F.10.3.2
exp10m1                               7.12.6.3                F.10.3.3
exp2                                  7.12.6.4                F.10.3.4
exp2m1                                7.12.6.5                F.10.3.5
expm1                                 7.12.6.6                F.10.3.6
fma                                   7.12.13.1               F.10.10.1
f16sqrt             *, *, *, N/A      7.12.14.6               F.10.11
fsqrt               N/A, N/A, *       7.12.14.6               F.10.11
hypot                                 7.12.7.4                F.10.4.4
lgamma                                7.12.8.3                F.10.5.3
log                                   7.12.6.11               F.10.3.11
log10                                 7.12.6.12               F.10.3.12
log10p1                               7.12.6.13               F.10.3.13
log1p                                 7.12.6.14               F.10.3.14
log2                                  7.12.6.15               F.10.3.15
log2p1                                7.12.6.16               F.10.3.16
logp1                                 7.12.6.14               F.10.3.14
pow                 1 ULP             7.12.7.5                F.10.4.5
powi*
pown                                  7.12.7.6                F.10.4.6
powr                                  7.12.7.7                F.10.4.7
rootn                                 7.12.7.8                F.10.4.8
rsqrt                                 7.12.7.9                F.10.4.9
sin                                   7.12.4.6                F.10.1.6
sincos
sinh                                  7.12.5.5                F.10.2.5
sinpi                                 7.12.4.13               F.10.1.13
sqrt                                  7.12.7.10               F.10.4.10
tan                                   7.12.4.7                F.10.1.7
tanh                                  7.12.5.6                F.10.2.6
tanpi                                 7.12.4.14               F.10.1.14
tgamma                                7.12.8.4                F.10.5.4

Legends:

  • ✓: correctly rounded for all 4 rounding modes.

  • CR: correctly rounded for the default rounding mode (round-to-nearest, ties-to-even).

  • x ULPs: largest errors recorded.

  • N/A: Not defined in the standard or will not be added.

  • *: LLVM libc extension.
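As an illustration of how an "x ULPs" figure can be estimated, the sketch below counts how many representable `float` values lie between a computed result and a reference result. This is a common approximation of the error in ULPs when the reference is itself correctly rounded; it is not necessarily how the figures in the tables above were obtained. The helper names `ordered_bits` and `ulp_distance` are hypothetical, and the code assumes IEEE 754 binary32 `float` and finite, non-NaN inputs.

```c
#include <stdint.h>
#include <string.h>

/* Map a float's bit pattern onto a signed integer line on which adjacent
   representable floats differ by exactly 1 (assumes IEEE 754 binary32). */
static int64_t ordered_bits(float x) {
  uint32_t u;
  memcpy(&u, &x, sizeof u);            /* well-defined type punning */
  if (u & 0x80000000u)                 /* negative: count down from zero */
    return -(int64_t)(u & 0x7fffffffu);
  return (int64_t)u;                   /* +0.0 and -0.0 both map to 0 */
}

/* Hypothetical helper: number of representable floats separating a and b.
   Inputs are assumed to be finite; NaNs are not handled. */
int64_t ulp_distance(float a, float b) {
  int64_t d = ordered_bits(a) - ordered_bits(b);
  return d < 0 ? -d : d;
}
```

For example, `ulp_distance(computed, reference)` returns 0 when the two results are identical and 1 when they are neighbouring floats, which is the situation a "1 ULP" entry describes under this approximation.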

Performance

  • Simple performance tests are located in: libc/test/src/math/performance_testing.

  • We also use the perf tool from the CORE-MATH project. The performance results from CORE-MATH's perf tool are reported in the table below, using the system library (such as the GNU C library on Linux) as the reference. The fmod performance results were obtained with "performance_testing".

<Func>    Reciprocal throughput (clk)   Latency (clk)                 Testing ranges                   Testing configuration
          LLVM libc  Reference (glibc)  LLVM libc  Reference (glibc)                                   CPU          OS                       Compiler      Special flags
--------  ---------  -----------------  ---------  -----------------  -------------------------------  -----------  -----------------------  ------------  -------------
acosf     24         29                 62         77                 \([-1, 1]\)                      Ryzen 1700   Ubuntu 22.04 LTS x86_64  Clang 14.0.0  FMA
acoshf    18         26                 73         74                 \([1, 21]\)                      Ryzen 1700   Ubuntu 22.04 LTS x86_64  Clang 14.0.0  FMA
asinf     23         27                 62         62                 \([-1, 1]\)                      Ryzen 1700   Ubuntu 22.04 LTS x86_64  Clang 14.0.0  FMA
asinhf    21         39                 77         91                 \([-10, 10]\)                    Ryzen 1700   Ubuntu 22.04 LTS x86_64  Clang 14.0.0  FMA
atanf     27         29                 79         68                 \([-10, 10]\)                    Ryzen 1700   Ubuntu 22.04 LTS x86_64  Clang 14.0.0  FMA
atanhf    18         66                 68         133                \([-1, 1]\)                      Ryzen 1700   Ubuntu 22.04 LTS x86_64  Clang 14.0.0  FMA
cosf      13         32                 53         59                 \([0, 2\pi]\)                    Ryzen 1700   Ubuntu 20.04 LTS x86_64  Clang 12.0.0  FMA
coshf     14         20                 50         48                 \([-10, 10]\)                    Ryzen 1700   Ubuntu 22.04 LTS x86_64  Clang 14.0.0  FMA
expf      9          7                  44         38                 \([-10, 10]\)                    Ryzen 1700   Ubuntu 20.04 LTS x86_64  Clang 12.0.0  FMA
exp10f    10         8                  40         38                 \([-10, 10]\)                    Ryzen 1700   Ubuntu 22.04 LTS x86_64  Clang 14.0.0  FMA
exp2f     9          6                  35         31                 \([-10, 10]\)                    Ryzen 1700   Ubuntu 22.04 LTS x86_64  Clang 14.0.0  FMA
expm1f    9          44                 42         121                \([-10, 10]\)                    Ryzen 1700   Ubuntu 20.04 LTS x86_64  Clang 12.0.0  FMA
fmodf     73         263                                              [MIN_NORMAL, MAX_NORMAL]         i5 mobile    Ubuntu 20.04 LTS x86_64  Clang 12.0.0
          9          11                                               [0, MAX_SUBNORMAL]               i5 mobile    Ubuntu 20.04 LTS x86_64  Clang 12.0.0
fmod      595        3297                                             [MIN_NORMAL, MAX_NORMAL]         i5 mobile    Ubuntu 20.04 LTS x86_64  Clang 12.0.0
          14         13                                               [0, MAX_SUBNORMAL]               i5 mobile    Ubuntu 20.04 LTS x86_64  Clang 12.0.0
hypotf    25         15                 64         49                 \([-10, 10] \times [-10, 10]\)   Ryzen 1700   Ubuntu 20.04 LTS x86_64  Clang 12.0.0
logf      12         10                 56         46                 \([e^{-1}, e]\)                  Ryzen 1700   Ubuntu 20.04 LTS x86_64  Clang 12.0.0  FMA
log10f    9          17                 35         48                 \([e^{-1}, e]\)                  Ryzen 5900X  Ubuntu 22.04 LTS x86_64  Clang 15.0.6  FMA
log1pf    16         33                 61         97                 \([e^{-0.5} - 1, e^{0.5} - 1]\)  Ryzen 1700   Ubuntu 20.04 LTS x86_64  Clang 12.0.0  FMA
log2f     13         10                 57         46                 \([e^{-1}, e]\)                  Ryzen 1700   Ubuntu 20.04 LTS x86_64  Clang 12.0.0  FMA
sinf      12         25                 51         57                 \([-\pi, \pi]\)                  Ryzen 1700   Ubuntu 20.04 LTS x86_64  Clang 12.0.0  FMA
sincosf   19         30                 57         68                 \([-\pi, \pi]\)                  Ryzen 1700   Ubuntu 20.04 LTS x86_64  Clang 12.0.0  FMA
sinhf     13         63                 48         137                \([-10, 10]\)                    Ryzen 1700   Ubuntu 22.04 LTS x86_64  Clang 14.0.0  FMA
tanf      16         50                 61         107                \([-\pi, \pi]\)                  Ryzen 1700   Ubuntu 22.04 LTS x86_64  Clang 14.0.0  FMA
tanhf     13         55                 57         123                \([-10, 10]\)                    Ryzen 1700   Ubuntu 22.04 LTS x86_64  Clang 14.0.0  FMA

Algorithms + Implementation Details

Fixed-point Arithmetics

References