math.h

Source Locations

The main source is located at: libc/src/math.
The tests are located at: libc/test/src/math.
The floating point utilities are located at: libc/src/__support/FPUtil.

Implementation Requirements / Goals

The highest priority is to be as accurate as possible, according to the C and IEEE 754 standards. By default, we will aim to be correctly rounded for all rounding modes. The current rounding mode of the floating point environment is used to perform computations and produce the final results.
- To test for correctness, we compare the outputs with other correctly rounded multiple-precision math libraries such as the GNU MPFR library or the CORE-MATH library.
Our next requirement is that the outputs are consistent across all platforms. Note that the consistency requirement is satisfied automatically if the implementation is correctly rounded.
Our last requirement for the implementations is to have good and predictable performance:
- The average performance should be comparable to other libc implementations.
- The worst case performance should be within 10X-20X of the average.
- Platform-specific implementations or instructions can be added wherever they make sense and provide a significant performance boost.
For other use cases that have strict requirements on the code size, memory footprint, or latency, such as embedded systems, we will aim to be as accurate as possible within the memory or latency budgets, and consistent across all platforms.

Add a new math function to LLVM libc

To add a new math function, follow the steps at: libc/src/math/docs/add_math_function.md.

Implementation Status

To check math functions enabled for Linux:
To check math functions enabled for Windows:
- windows-x86_64
- windows-aarch64 - to be added
To check math functions enabled for macOS:
- darwin-x86_64
- darwin-aarch64
To check math functions enabled for GPU:
- gpu-entrypoints
To check math functions enabled for embedded systems:
- baremetal-aarch32
- baremetal-riscv32 - to be added

Basic Operations

<Func>	<Func_f> (float)	<Func> (double)	<Func_l> (long double)	<Func_f16> (float16)	<Func_f128> (float128)	<Func_bf16> (bfloat16)	C23 Definition Section	C23 Error Handling Section
ceil	✅	✅	✅	✅	✅	✅	7.12.9.1	F.10.6.1
canonicalize	✅	✅	✅	✅	✅	✅	7.12.11.7	F.10.8.7
copysign	✅	✅	✅	✅	✅	✅	7.12.11.1	F.10.8.1
dadd	N/A	N/A	✅	N/A	✅*	N/A	7.12.14.1	F.10.11
ddiv	N/A	N/A	✅	N/A	✅*	N/A	7.12.14.4	F.10.11
dfma	N/A	N/A	✅	N/A	✅*	N/A	7.12.14.5	F.10.11
dmul	N/A	N/A	✅	N/A	✅*	N/A	7.12.14.3	F.10.11
dsub	N/A	N/A	✅	N/A	✅*	N/A	7.12.14.2	F.10.11
f16add	✅*	✅*	✅*	N/A	✅	N/A	7.12.14.1	F.10.11
f16div	✅*	✅*	✅*	N/A	✅	N/A	7.12.14.4	F.10.11
f16fma	✅*	✅*	✅*	N/A	✅	N/A	7.12.14.5	F.10.11
f16mul	✅*	✅*	✅*	N/A	✅	N/A	7.12.14.3	F.10.11
f16sub	✅*	✅*	✅*	N/A	✅	N/A	7.12.14.2	F.10.11
bf16add	✅*	✅*	✅*	N/A	✅	N/A	7.12.14.1	F.10.11
bf16div	✅*	✅*	✅*	N/A	✅	N/A	7.12.14.4	F.10.11
bf16fma	✅*	✅*	✅*	N/A	✅	N/A	7.12.14.5	F.10.11
bf16mul	✅*	✅*	✅*	N/A	✅	N/A	7.12.14.3	F.10.11
bf16sub	✅*	✅*	✅*	N/A	✅	N/A	7.12.14.2	F.10.11
fabs	✅	✅	✅	✅	✅	✅	7.12.7.3	F.10.4.3
fadd	N/A	✅	✅	N/A	✅	N/A	7.12.14.1	F.10.11
fdim	✅	✅	✅	✅	✅	✅	7.12.12.1	F.10.9.1
fdiv	N/A	✅	✅	N/A	✅*	N/A	7.12.14.4	F.10.11
ffma	N/A	✅	✅	N/A	✅*	N/A	7.12.14.5	F.10.11
floor	✅	✅	✅	✅	✅	✅	7.12.9.2	F.10.6.2
fmax	✅	✅	✅	✅	✅	✅	7.12.12.2	F.10.9.2
fmaximum	✅	✅	✅	✅	✅	✅	7.12.12.4	F.10.9.4
fmaximum_mag	✅	✅	✅	✅	✅	✅	7.12.12.6	F.10.9.4
fmaximum_mag_num	✅	✅	✅	✅	✅	✅	7.12.12.10	F.10.9.5
fmaximum_num	✅	✅	✅	✅	✅	✅	7.12.12.8	F.10.9.5
fmin	✅	✅	✅	✅	✅	✅	7.12.12.3	F.10.9.3
fminimum	✅	✅	✅	✅	✅	✅	7.12.12.5	F.10.9.4
fminimum_mag	✅	✅	✅	✅	✅	✅	7.12.12.7	F.10.9.4
fminimum_mag_num	✅	✅	✅	✅	✅	✅	7.12.12.11	F.10.9.5
fminimum_num	✅	✅	✅	✅	✅	✅	7.12.12.9	F.10.9.5
fmod	✅	✅	✅	✅	✅	✅	7.12.10.1	F.10.7.1
fmul	N/A	✅	✅	N/A	✅*	N/A	7.12.14.3	F.10.11
frexp	✅	✅	✅	✅	✅	✅	7.12.6.7	F.10.3.7
fromfp	✅	✅	✅	✅	✅	✅	7.12.9.10	F.10.6.10
fromfpx	✅	✅	✅	✅	✅	✅	7.12.9.11	F.10.6.11
fsub	N/A	✅	✅	N/A	✅*	N/A	7.12.14.2	F.10.11
getpayload	✅	✅	✅	✅	✅	✅	F.10.13.1	N/A
ilogb	✅	✅	✅	✅	✅	✅	7.12.6.8	F.10.3.8
iscanonical	✅	✅	✅	✅	✅	✅	7.12.3.2	N/A
issignaling	✅	✅	✅	✅	✅	✅	7.12.3.8	N/A
ldexp	✅	✅	✅	✅	✅	✅	7.12.6.9	F.10.3.9
llogb	✅	✅	✅	✅	✅	✅	7.12.6.10	F.10.3.10
llrint	✅	✅	✅	✅	✅	✅	7.12.9.5	F.10.6.5
llround	✅	✅	✅	✅	✅	✅	7.12.9.7	F.10.6.7
logb	✅	✅	✅	✅	✅	✅	7.12.6.17	F.10.3.17
lrint	✅	✅	✅	✅	✅	✅	7.12.9.5	F.10.6.5
lround	✅	✅	✅	✅	✅	✅	7.12.9.7	F.10.6.7
modf	✅	✅	✅	✅	✅	✅	7.12.6.18	F.10.3.18
nan	✅	✅	✅	✅	✅	✅	7.12.11.2	F.10.8.2
nearbyint	✅	✅	✅	✅	✅	✅	7.12.9.3	F.10.6.3
nextafter	✅	✅	✅	✅	✅	✅	7.12.11.3	F.10.8.3
nextdown	✅	✅	✅	✅	✅	✅	7.12.11.6	F.10.8.6
nexttoward	✅	✅	✅	✅	N/A	✅	7.12.11.4	F.10.8.4
nextup	✅	✅	✅	✅	✅	✅	7.12.11.5	F.10.8.5
remainder	✅	✅	✅	✅	✅	✅	7.12.10.2	F.10.7.2
remquo	✅	✅	✅	✅	✅	✅	7.12.10.3	F.10.7.3
rint	✅	✅	✅	✅	✅	✅	7.12.9.4	F.10.6.4
round	✅	✅	✅	✅	✅	✅	7.12.9.6	F.10.6.6
roundeven	✅	✅	✅	✅	✅	✅	7.12.9.8	F.10.6.8
scalbln	✅	✅	✅	✅	✅	✅	7.12.6.19	F.10.3.19
scalbn	✅	✅	✅	✅	✅	✅	7.12.6.19	F.10.3.19
setpayload	✅	✅	✅	✅	✅	✅	F.10.13.2	N/A
setpayloadsig	✅	✅	✅	✅	✅	✅	F.10.13.3	N/A
totalorder	✅	✅	✅	✅	✅	✅	F.10.12.1	N/A
totalordermag	✅	✅	✅	✅	✅	✅	F.10.12.2	N/A
trunc	✅	✅	✅	✅	✅	✅	7.12.9.9	F.10.6.9
ufromfp	✅	✅	✅	✅	✅	✅	7.12.9.10	F.10.6.10
ufromfpx	✅	✅	✅	✅	✅	✅	7.12.9.11	F.10.6.11

Higher Math Functions

<Func>	<Func_f> (float)	<Func> (double)	<Func_l> (long double)	<Func_f16> (float16)	<Func_f128> (float128)	<Func_bf16> (bfloat16)	C23 Definition Section	C23 Error Handling Section
acos	✅	✅		✅			7.12.4.1	F.10.1.1
acosh	✅			✅			7.12.5.1	F.10.2.1
acospi				✅			7.12.4.8	F.10.1.8
asin	✅	✅		✅			7.12.4.2	F.10.1.2
asinh	✅			✅			7.12.5.2	F.10.2.2
asinpi				✅			7.12.4.9	F.10.1.9
atan	✅	1 ULP		✅			7.12.4.3	F.10.1.3
atan2	✅	1 ULP			1 ULP		7.12.4.4	F.10.1.4
atan2pi							7.12.4.11	F.10.1.11
atanh	✅			✅			7.12.5.3	F.10.2.3
atanpi				✅			7.12.4.10	F.10.1.10
cbrt	✅	✅					7.12.7.1	F.10.4.1
compoundn							7.12.7.2	F.10.4.2
cos	✅	✅		✅			7.12.4.5	F.10.1.5
cosh	✅			✅			7.12.5.4	F.10.2.4
cospi	✅			✅			7.12.4.12	F.10.1.12
dsqrt	N/A	N/A	✅	N/A	✅*		7.12.14.6	F.10.11
erf	✅						7.12.8.1	F.10.5.1
erfc							7.12.8.2	F.10.5.2
exp	✅	✅		✅			7.12.6.1	F.10.3.1
exp10	✅	✅		✅			7.12.6.2	F.10.3.2
exp10m1	✅			✅			7.12.6.3	F.10.3.3
exp2	✅	✅		✅			7.12.6.4	F.10.3.4
exp2m1	✅			✅			7.12.6.5	F.10.3.5
expm1	✅	✅		✅			7.12.6.6	F.10.3.6
fma	✅	✅		✅			7.12.13.1	F.10.10.1
f16sqrt	✅*	✅*	✅*	N/A	✅		7.12.14.6	F.10.11
fsqrt	N/A	✅	✅	N/A	✅*		7.12.14.6	F.10.11
hypot	✅	✅		✅			7.12.7.4	F.10.4.4
lgamma							7.12.8.3	F.10.5.3
log	✅	✅		✅		✅ ?	7.12.6.11	F.10.3.11
log10	✅	✅		✅			7.12.6.12	F.10.3.12
log10p1							7.12.6.13	F.10.3.13
log1p	✅	✅					7.12.6.14	F.10.3.14
log2	✅	✅		✅			7.12.6.15	F.10.3.15
log2p1							7.12.6.16	F.10.3.16
logp1							7.12.6.14	F.10.3.14
pow	✅	1 ULP					7.12.7.5	F.10.4.5
powi*
pown							7.12.7.6	F.10.4.6
powr							7.12.7.7	F.10.4.7
rootn							7.12.7.8	F.10.4.8
rsqrt	✅			✅			7.12.7.9	F.10.4.9
sin	✅	✅		✅			7.12.4.6	F.10.1.6
sincos	✅	✅
sinh	✅			✅			7.12.5.5	F.10.2.5
sinpi	✅			✅			7.12.4.13	F.10.1.13
sqrt	✅	✅	✅	✅	✅	✅	7.12.7.10	F.10.4.10
tan	✅	✅		✅			7.12.4.7	F.10.1.7
tanh	✅			✅			7.12.5.6	F.10.2.6
tanpi	✅			✅			7.12.4.14	F.10.1.14
tgamma							7.12.8.4	F.10.5.4

Legend:

✅ : correctly rounded for all 4 rounding modes.
CR: correctly rounded for the default rounding mode (round-to-nearest, ties-to-even).
x ULP: the largest error recorded, in units in the last place (ULP).
N/A: not defined in the standard, or will not be added.
*: LLVM libc extension.
?: because the float16 logb function and the bfloat16 log function would both be named logbf16, the latter is implemented as log_bf16.

GPU Conformance

Conformance tests are located at: offload/unittests/Conformance.
The math functions for GPUs are compiled with the following optimization options: LIBC_MATH_SKIP_ACCURATE_PASS, LIBC_MATH_INTERMEDIATE_COMP_IN_FLOAT, LIBC_MATH_SMALL_TABLES, LIBC_MATH_NO_ERRNO, and LIBC_MATH_NO_EXCEPT.
The conformance test results for higher math functions on GPUs are reported in the table below. The results show the maximum observed ULP distance when comparing a given GPU implementation against the corresponding correctly rounded implementation from LLVM libc, which is computed on the host CPU and serves as the reference. For comparison purposes, results for CUDA Math and HIP Math against the same reference are also included.
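
As a rough illustration of what a ULP distance measures, the following C sketch counts how many representable single-precision values separate a computed result from a reference result. This is one common definition and only an assumption about the metric; the conformance suite may define the distance differently. The function names ulp_distance_f32 and within_ulp_tolerance are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Map a float to a signed integer that increases monotonically with the
 * float's value, so that adjacent floats map to adjacent integers.
 * +0.0f and -0.0f both map to 0. */
static int64_t ordered_key(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    int64_t magnitude = (int64_t)(bits & 0x7FFFFFFFu);
    return (bits >> 31) ? -magnitude : magnitude;
}

/* Number of representable floats between a and b.
 * Two NaNs count as distance 0; a NaN against a non-NaN as UINT64_MAX. */
uint64_t ulp_distance_f32(float a, float b) {
    const int a_nan = (a != a), b_nan = (b != b);
    if (a_nan || b_nan)
        return (a_nan && b_nan) ? 0 : UINT64_MAX;
    const int64_t ka = ordered_key(a), kb = ordered_key(b);
    return (uint64_t)(ka > kb ? ka - kb : kb - ka);
}

/* Returns 1 if `got` is within `tolerance` ULPs of `reference`, else 0. */
int within_ulp_tolerance(float got, float reference, uint64_t tolerance) {
    return ulp_distance_f32(got, reference) <= tolerance;
}
```

Under this definition, `1.0f` and the next float above it (`1.0f + 0x1p-23f`) are 1 ULP apart, and a result flagged as FAILED in the table is one whose maximum distance exceeds the listed tolerance.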

Function	Test Method	ULP Tolerance	Max ULP Distance: LLVM libc (AMDGPU)	Max ULP Distance: LLVM libc (CUDA)	Max ULP Distance: CUDA Math (CUDA)	Max ULP Distance: HIP Math (AMDGPU)
acos	Randomized	4	6 (FAILED)	6 (FAILED)	1	1
acosf	Exhaustive	4	1	1	1	1
acosf16	Exhaustive	2	1	1		1
acoshf	Exhaustive	4	1	1	2	1
acoshf16	Exhaustive	2	0	0		0
acospif16	Exhaustive	2	0	0
asin	Randomized	4	6 (FAILED)	6 (FAILED)	2	1
asinf	Exhaustive	4	1	1	1	3
asinf16	Exhaustive	2	0	0		2
asinhf	Exhaustive	4	1	1	2	1
asinhf16	Exhaustive	2	1	1		1
atanf	Exhaustive	5	0	0	1	2
atanf16	Exhaustive	2	1	1		1
atan2f	Randomized	6	1	1	2	3
atanhf	Exhaustive	5	0	0	3	1
atanhf16	Exhaustive	2	0	0		1
cbrt	Randomized	2	1	1	1	1
cbrtf	Exhaustive	2	0	0	1	1
cos	Randomized	4	1	1	2	1
cosf	Exhaustive	4	1	1	2	2
cosf16	Exhaustive	2	1	1		1
coshf	Exhaustive	4	0	0	2	1
coshf16	Exhaustive	2	1	0		1
cospif	Exhaustive	4	0	0	1	1
cospif16	Exhaustive	2	0	0
erff	Exhaustive	16	0	0	1	2
exp	Randomized	3	1	1	1	1
expf	Exhaustive	3	0	0	2	1
expf16	Exhaustive	2	1	1		1
exp10	Randomized	3	1	1	1	1
exp10f	Exhaustive	3	0	0	2	1
exp10f16	Exhaustive	2	1	1		1
exp2	Randomized	3	1	1	1	1
exp2f	Exhaustive	3	1	1	2	1
exp2f16	Exhaustive	2	1	1		0
expm1	Randomized	3	0	0	1	2
expm1f	Exhaustive	3	1	1	1	1
expm1f16	Exhaustive	2	1	1		1
hypot	Randomized	4	0	0	2	1
hypotf	Randomized	4	0	0	1	2
hypotf16	Exhaustive	2	0	0
log	Randomized	3	1	1	1	1
logf	Exhaustive	3	1	1	1	2
logf16	Exhaustive	2	1	1		1
log10	Randomized	3	1	1	1	1
log10f	Exhaustive	3	1	1	2	2
log10f16	Exhaustive	2	1	1		1
log1p	Randomized	2	1	1	1	1
log1pf	Exhaustive	2	1	1	1	1
log2	Randomized	3	1	1	1	1
log2f	Exhaustive	3	0	0	1	1
log2f16	Exhaustive	2	1	1		0
powf (integer exp.)	Randomized	16	0	0	2	1
powf (real exp.)	Randomized	16	0	0	2	1
sin	Randomized	4	1	1	1	1
sinf	Exhaustive	4	1	1	1	2
sinf16	Exhaustive	2	1	1		1
sincos (cos part)	Randomized	4	1	1	2	1
sincos (sin part)	Randomized	4	1	1	1	1
sincosf (cos part)	Exhaustive	4	1	1	2	2
sincosf (sin part)	Exhaustive	4	1	1	1	2
sinhf	Exhaustive	4	1	1	3	1
sinhf16	Exhaustive	2	1	1		1
sinpif	Exhaustive	4	0	0	1	1
sinpif16	Exhaustive	2	0	0
tan	Randomized	5	2	2	2	1
tanf	Exhaustive	5	0	0	3	2
tanf16	Exhaustive	2	1	1		2
tanhf	Exhaustive	5	0	0	2	1
tanhf16	Exhaustive	2	0	0		1
tanpif	Exhaustive	6	0	0
tanpif16	Exhaustive	2	1	1

Notes:

Exhaustive tests check every representable point in the input space. This method is used for half-precision functions and single-precision univariate functions.
Randomized tests check a large, deterministic subset of the input space, typically using 2³² samples. This method is used for functions with larger input spaces, such as single-precision bivariate and double-precision functions.
ULP tolerances are based on The Khronos Group, The OpenCL C Specification v3.0.19, Sec. 7.4, Khronos Registry [July 10, 2025].
The AMD GPU used for testing is AMD Radeon RX 6950 XT.
The NVIDIA GPU used for testing is NVIDIA RTX 4000 SFF Ada Generation.
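
The difference between the exhaustive and randomized methods described in these notes can be sketched in C as follows. This is a simplified illustration, not the conformance suite's code: all names are hypothetical, the sketch counts bit-exact mismatches instead of tracking a maximum ULP distance, and the choice of SplitMix64 as the fixed-seed generator is an assumption made only to keep the sample sequence deterministic.

```c
#include <stdint.h>
#include <string.h>

typedef float (*unary_f32)(float);
typedef double (*unary_f64)(double);

static float f32_from_bits(uint32_t bits) {
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

static double f64_from_bits(uint64_t bits) {
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Results agree if both are NaN or if their bit patterns are identical. */
static int same_f32(float a, float b) {
    if (a != a || b != b) return (a != a) && (b != b);
    return memcmp(&a, &b, sizeof a) == 0;
}

static int same_f64(double a, double b) {
    if (a != a || b != b) return (a != a) && (b != b);
    return memcmp(&a, &b, sizeof a) == 0;
}

/* Exhaustive: visit every float bit pattern in [first, last], inclusive.
 * Passing 0 and UINT32_MAX covers the whole single-precision input space. */
uint64_t count_mismatches_exhaustive(unary_f32 impl, unary_f32 ref,
                                     uint32_t first, uint32_t last) {
    uint64_t mismatches = 0;
    /* A 64-bit counter avoids wraparound when last == UINT32_MAX. */
    for (uint64_t bits = first; bits <= last; ++bits) {
        const float x = f32_from_bits((uint32_t)bits);
        if (!same_f32(impl(x), ref(x))) ++mismatches;
    }
    return mismatches;
}

/* SplitMix64: a small generator whose output depends only on the seed. */
static uint64_t splitmix64(uint64_t *state) {
    uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Randomized: test `samples` double bit patterns drawn from a fixed seed,
 * so every run visits exactly the same inputs. */
uint64_t count_mismatches_randomized(unary_f64 impl, unary_f64 ref,
                                     uint64_t seed, uint64_t samples) {
    uint64_t mismatches = 0, state = seed;
    for (uint64_t i = 0; i < samples; ++i) {
        const double x = f64_from_bits(splitmix64(&state));
        if (!same_f64(impl(x), ref(x))) ++mismatches;
    }
    return mismatches;
}

/* Toy functions used only to exercise the harness. */
float halve_by_mul(float x) { return x * 0.5f; }
float halve_by_div(float x) { return x / 2.0f; }
float halve_buggy(float x) { return x == 1.0f ? 0.0f : x * 0.5f; }
double halve_by_mul64(double x) { return x * 0.5; }
double halve_by_div64(double x) { return x / 2.0; }
```

For example, comparing `halve_buggy` against `halve_by_mul` over a bit-pattern range that contains `1.0f` reports exactly one mismatch, a single bad input that a randomized run could easily miss, which is why exhaustive testing is preferred whenever the input space is small enough.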

Performance

Simple performance tests are located at: libc/test/src/math/performance_testing.
We also use the perf tool from the CORE-MATH project. The results from the CORE-MATH perf tool are reported in the table below, using the system library (such as the GNU C library on Linux) as the reference. The fmod and fmodf results were obtained with performance_testing.

<Func>	Reciprocal throughput (clk): LLVM libc	Reciprocal throughput (clk): Reference (glibc)	Latency (clk): LLVM libc	Latency (clk): Reference (glibc)	Testing ranges	CPU	OS	Compiler	Special flags
acosf	24	29	62	77	\([-1, 1]\)	Ryzen 1700	Ubuntu 22.04 LTS x86_64	Clang 14.0.0	FMA
acoshf	18	26	73	74	\([1, 21]\)	Ryzen 1700	Ubuntu 22.04 LTS x86_64	Clang 14.0.0	FMA
asinf	23	27	62	62	\([-1, 1]\)	Ryzen 1700	Ubuntu 22.04 LTS x86_64	Clang 14.0.0	FMA
asinhf	21	39	77	91	\([-10, 10]\)	Ryzen 1700	Ubuntu 22.04 LTS x86_64	Clang 14.0.0	FMA
atanf	27	29	79	68	\([-10, 10]\)	Ryzen 1700	Ubuntu 22.04 LTS x86_64	Clang 14.0.0	FMA
atanhf	18	66	68	133	\([-1, 1]\)	Ryzen 1700	Ubuntu 22.04 LTS x86_64	Clang 14.0.0	FMA
cosf	13	32	53	59	\([0, 2\pi]\)	Ryzen 1700	Ubuntu 20.04 LTS x86_64	Clang 12.0.0	FMA
coshf	14	20	50	48	\([-10, 10]\)	Ryzen 1700	Ubuntu 22.04 LTS x86_64	Clang 14.0.0	FMA
expf	9	7	44	38	\([-10, 10]\)	Ryzen 1700	Ubuntu 20.04 LTS x86_64	Clang 12.0.0	FMA
exp10f	10	8	40	38	\([-10, 10]\)	Ryzen 1700	Ubuntu 22.04 LTS x86_64	Clang 14.0.0	FMA
exp2f	9	6	35	31	\([-10, 10]\)	Ryzen 1700	Ubuntu 22.04 LTS x86_64	Clang 14.0.0	FMA
expm1f	9	44	42	121	\([-10, 10]\)	Ryzen 1700	Ubuntu 20.04 LTS x86_64	Clang 12.0.0	FMA
fmodf	73	263			[MIN_NORMAL, MAX_NORMAL]	i5 mobile	Ubuntu 20.04 LTS x86_64	Clang 12.0.0
fmodf	9	11			[0, MAX_SUBNORMAL]	i5 mobile	Ubuntu 20.04 LTS x86_64	Clang 12.0.0
fmod	595	3297			[MIN_NORMAL, MAX_NORMAL]	i5 mobile	Ubuntu 20.04 LTS x86_64	Clang 12.0.0
fmod	14	13			[0, MAX_SUBNORMAL]	i5 mobile	Ubuntu 20.04 LTS x86_64	Clang 12.0.0
hypotf	25	15	64	49	\([-10, 10] \times [-10, 10]\)	Ryzen 1700	Ubuntu 20.04 LTS x86_64	Clang 12.0.0
logf	12	10	56	46	\([e^{-1}, e]\)	Ryzen 1700	Ubuntu 20.04 LTS x86_64	Clang 12.0.0	FMA
log10f	9	17	35	48	\([e^{-1}, e]\)	Ryzen 5900X	Ubuntu 22.04 LTS x86_64	Clang 15.0.6	FMA
log1pf	16	33	61	97	\([e^{-0.5} - 1, e^{0.5} - 1]\)	Ryzen 1700	Ubuntu 20.04 LTS x86_64	Clang 12.0.0	FMA
log2f	13	10	57	46	\([e^{-1}, e]\)	Ryzen 1700	Ubuntu 20.04 LTS x86_64	Clang 12.0.0	FMA
sinf	12	25	51	57	\([-\pi, \pi]\)	Ryzen 1700	Ubuntu 20.04 LTS x86_64	Clang 12.0.0	FMA
sincosf	19	30	57	68	\([-\pi, \pi]\)	Ryzen 1700	Ubuntu 20.04 LTS x86_64	Clang 12.0.0	FMA
sinhf	13	63	48	137	\([-10, 10]\)	Ryzen 1700	Ubuntu 22.04 LTS x86_64	Clang 14.0.0	FMA
tanf	16	50	61	107	\([-\pi, \pi]\)	Ryzen 1700	Ubuntu 22.04 LTS x86_64	Clang 14.0.0	FMA
tanhf	13	55	57	123	\([-10, 10]\)	Ryzen 1700	Ubuntu 22.04 LTS x86_64	Clang 14.0.0	FMA
