Building libs for GPUs#
Building the GPU C library#
This document presents recipes for building the LLVM C library targeting a GPU
architecture. The GPU build uses the same cross build
support as the other targets. However, the GPU target has the restriction that
it must be built with an up-to-date clang compiler, because the
GPU target relies on several compiler extensions to target GPU architectures.
The LLVM C library currently supports two GPU targets:
nvptx64-nvidia-cuda for NVIDIA GPUs and amdgcn-amd-amdhsa for AMD GPUs.
Targeting these architectures is done through clang’s cross-compiling
support using the --target=<triple> flag. The following sections will
describe how to build the GPU support specifically.
Once you have finished building, refer to Using libc for GPUs to get started with the newly built C library.
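As an illustration of that cross-compiling support, a single translation unit can be compiled for a GPU triple directly with the --target flag. The file name kernel.c and the gfx90a architecture below are hypothetical example values, not part of the build recipes that follow:

```shell
# Illustration only: compile one file directly for the AMD GPU triple.
# kernel.c and gfx90a are example values; substitute your own source
# file and GPU architecture.
clang --target=amdgcn-amd-amdhsa -mcpu=gfx90a -flto -c kernel.c -o kernel.o
```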
Bootstrap Build#
The recommended way to build the GPU libc is using a Bootstrap Build (see Build Concepts for an overview of build modes). This approach first builds the compiler (clang) and tools using the host compiler, and then automatically uses that newly-built compiler to build the C library for the GPU targets in a single CMake invocation. This is ideal for building for multiple GPU vendors at once.
Set the environment variables for the build:
export BUILD_DIR=build
export INSTALL_PREFIX=install
export BUILD_TYPE=Release
Configure the project:
cmake -G Ninja -S llvm -B $BUILD_DIR \
-DLLVM_ENABLE_PROJECTS="clang;lld" \
-DLLVM_ENABLE_RUNTIMES="openmp;offload" \
-DCMAKE_BUILD_TYPE=$BUILD_TYPE \
-DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX \
-DRUNTIMES_nvptx64-nvidia-cuda_LLVM_ENABLE_RUNTIMES=libc \
-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES=libc \
-DLLVM_RUNTIME_TARGETS="default;amdgcn-amd-amdhsa;nvptx64-nvidia-cuda"
Build and install the libraries:
ninja -C $BUILD_DIR install
We need clang to build the GPU C library and lld to link AMDGPU
executables, so we enable them in LLVM_ENABLE_PROJECTS. We add openmp to
LLVM_ENABLE_RUNTIMES so it is built for the default target and provides
OpenMP support. We then set RUNTIMES_<triple>_LLVM_ENABLE_RUNTIMES to enable
libc for the GPU targets. LLVM_RUNTIME_TARGETS sets the targets to build;
in this case we want the default (host) target and the GPU targets.
Note that if libc were included in LLVM_ENABLE_RUNTIMES it would build
targeting the default host environment as well. Alternatively, you can point
your build towards the libc/cmake/caches/gpu.cmake cache file with -C.
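A minimal sketch of that cache-file approach, run from the llvm-project root, might look like the following. The exact set of additional options you need alongside the cache file is an assumption and may vary with your setup:

```shell
# Sketch: configure via the GPU cache file instead of listing the
# runtime options by hand. Assumes the llvm-project source layout.
cmake -G Ninja -S llvm -B build \
  -C libc/cmake/caches/gpu.cmake \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_INSTALL_PREFIX=install
ninja -C build install
```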
Two-stage Cross-compiler Build#
Alternatively, you can use a manual Two-stage Cross-compiler Build (see Build Concepts). This separates the build into two distinct CMake invocations: first to build the host compiler and tools (Stage 1), and then to build the target library using those tools (Stage 2). This provides more direct control over the configuration and is useful if you only need to build for a single target or want to avoid the complexity of the combined bootstrap build.
First, build the host compiler. Set up the variables for the host build:
export HOST_BUILD_DIR=build-libc-tools
export HOST_C_COMPILER=clang
export HOST_CXX_COMPILER=clang++
Configure the host build:
cmake -G Ninja -S llvm -B $HOST_BUILD_DIR \
-DLLVM_ENABLE_PROJECTS="clang" \
-DCMAKE_C_COMPILER=$HOST_C_COMPILER \
-DCMAKE_CXX_COMPILER=$HOST_CXX_COMPILER \
-DLLVM_LIBC_FULL_BUILD=ON \
-DCMAKE_BUILD_TYPE=Release
Build the host tools:
ninja -C $HOST_BUILD_DIR
Once this has finished, use the newly built compiler to build the C library for the GPU. Select your target architecture (amdgcn-amd-amdhsa or nvptx64-nvidia-cuda).
Set up the variables for the target build:
export TARGET_TRIPLE=amdgcn-amd-amdhsa # or nvptx64-nvidia-cuda
export TARGET_BUILD_DIR=build
export TARGET_C_COMPILER=$HOST_BUILD_DIR/bin/clang
export TARGET_CXX_COMPILER=$HOST_BUILD_DIR/bin/clang++
Configure the target build:
cmake -G Ninja -S runtimes -B $TARGET_BUILD_DIR \
-DLLVM_ENABLE_RUNTIMES=libc \
-DCMAKE_C_COMPILER=$TARGET_C_COMPILER \
-DCMAKE_CXX_COMPILER=$TARGET_CXX_COMPILER \
-DLLVM_LIBC_FULL_BUILD=ON \
-DLLVM_DEFAULT_TARGET_TRIPLE=$TARGET_TRIPLE \
-DCMAKE_BUILD_TYPE=Release
Build and install the target library:
ninja -C $TARGET_BUILD_DIR install
The above steps will result in a build targeting one of the supported GPU architectures. Building for multiple targets requires separate CMake invocations.
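For example, building for the NVIDIA target as well would repeat the second stage with a different triple and a fresh build directory (the build-nvptx directory name is just an example; the compiler variables are the ones set above):

```shell
# Repeat the target stage for the NVIDIA triple in a separate
# build directory. build-nvptx is an example name.
export TARGET_TRIPLE=nvptx64-nvidia-cuda
export TARGET_BUILD_DIR=build-nvptx
cmake -G Ninja -S runtimes -B $TARGET_BUILD_DIR \
  -DLLVM_ENABLE_RUNTIMES=libc \
  -DCMAKE_C_COMPILER=$TARGET_C_COMPILER \
  -DCMAKE_CXX_COMPILER=$TARGET_CXX_COMPILER \
  -DLLVM_LIBC_FULL_BUILD=ON \
  -DLLVM_DEFAULT_TARGET_TRIPLE=$TARGET_TRIPLE \
  -DCMAKE_BUILD_TYPE=Release
ninja -C $TARGET_BUILD_DIR install
```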
Build overview#
Once installed, the GPU build will create several files used for different targets. This section will briefly describe their purpose.
- include/<target-triple>
The include directory where all of the generated headers for the target will go. These definitions are strictly for the GPU when being targeted directly.
- lib/clang/<llvm-major-version>/include/llvm-libc-wrappers/llvm-libc-decls
These are wrapper headers created for offloading languages like CUDA, HIP, or OpenMP. They contain functions supported in the GPU libc along with attributes and metadata that declare them on the target device and make them compatible with the host headers.
- lib/<target-triple>/libc.a
The main C library static archive containing LLVM-IR targeting the given GPU. It can be linked directly or inspected depending on the target support.
- lib/<target-triple>/libm.a
The C library static archive providing implementations of the standard math functions.
- lib/<target-triple>/libc.bc
An alternate form of the library provided as a single LLVM-IR bitcode blob. This can be used similarly to NVIDIA’s or AMD’s device libraries.
- lib/<target-triple>/libm.bc
An alternate form of the library provided as a single LLVM-IR bitcode blob containing the standard math functions.
- lib/<target-triple>/crt1.o
An LLVM-IR file containing the startup code used to call the main function on the GPU. This is used similarly to the standard C library startup object.
- bin/amdhsa-loader
A binary utility used to launch executables compiled targeting the AMD GPU. This will be included if the build system found the hsa-runtime64 library, either in /opt/rocm or the current CMake installation directory. It is required to build the GPU tests. See Using libc for GPUs for more information.
- bin/nvptx-loader
A binary utility used to launch executables compiled targeting the NVIDIA GPU. This will be included if the build system found the CUDA driver API. It is required for building tests.
- include/llvm-libc-rpc-server.h
A header file containing definitions that can be used to interface with the RPC server.
- lib/libllvmlibc_rpc_server.a
The static library containing the implementation of the RPC server. This can be used to enable host services for anyone looking to interface with the RPC client.
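As a rough usage sketch, the installed artifacts above can be combined to produce and run a GPU executable. Paths are relative to the install prefix, and main.c and the gfx90a architecture are hypothetical example values:

```shell
# Sketch: link a C program against the installed GPU libc and run it
# with the loader utility. main.c and gfx90a are assumptions.
clang --target=amdgcn-amd-amdhsa -mcpu=gfx90a -flto main.c \
  lib/amdgcn-amd-amdhsa/crt1.o -lc -o main
amdhsa-loader main
```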
CMake options#
This section briefly lists a few of the CMake variables that specifically
control the GPU build of the C library. These options can be passed individually
to each target using -DRUNTIMES_<target>_<variable>=<value> when using a
standard runtime build.
- LLVM_LIBC_FULL_BUILD:BOOL
This flag controls whether or not the libc build will generate its own headers. This must always be on when targeting the GPU.
- LIBC_GPU_BUILD:BOOL
Shorthand for enabling GPU support. Equivalent to enabling support for both the AMDGPU and NVPTX builds of libc.
- LIBC_GPU_TEST_ARCHITECTURE:STRING
Sets the architecture to build the GPU tests for, such as gfx90a or sm_80 for AMD and NVIDIA GPUs respectively. The default behavior is to detect the system's GPU architecture using the native option. If this option is not set and a GPU was not detected, the tests will not be built.
- LIBC_GPU_TEST_JOBS:STRING
Sets the number of threads used to run GPU tests. The GPU test suite will commonly run out of resources if this is not constrained, so it is recommended to keep it low. The default value is a single thread.
- CMAKE_CROSSCOMPILING_EMULATOR:STRING
Overrides the default loader used for running GPU tests. This is set automatically to llvm-gpu-loader for GPU runtime targets when building via the runtimes build.
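For instance, a sketch of passing two of these options to a single target in a runtimes build; gfx90a is an example architecture value, and the elided arguments stand in for the rest of the configure invocation shown earlier:

```shell
# Fragment: per-target options appended to a runtimes configure line.
# gfx90a is an example value.
cmake ... \
  -DRUNTIMES_amdgcn-amd-amdhsa_LIBC_GPU_TEST_ARCHITECTURE=gfx90a \
  -DRUNTIMES_amdgcn-amd-amdhsa_LIBC_GPU_TEST_JOBS=1
```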