The libc code style

Naming style

For the most part, the libc project follows the general coding standards of the LLVM project. It differs from those standards only in naming style. The differences are as follows:

  1. Non-const variables - This includes function arguments, struct and class data members, non-const globals and local variables. They all use the snake_case style.

  2. const and constexpr variables - They use the capitalized SNAKE_CASE irrespective of whether they are local or global.

  3. Functions and methods - They use the snake_case style like the non-const variables.

  4. Internal type names - These are types which are internal to the libc implementation. They use the CapitalizedCamelCase style.

  5. Public names - These are the names as prescribed by the standards and will follow the style as prescribed by the standards.
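The naming rules above can be seen together in a short sketch. Every name in it (the demo namespace, ByteBuffer, MAX_BUFFER_SIZE, push_back and so on) is hypothetical and chosen only to illustrate the rules; none of it comes from the libc sources.

```cpp
#include <cstddef>

namespace demo {

// Rule 2: const and constexpr variables use capitalized SNAKE_CASE.
constexpr std::size_t MAX_BUFFER_SIZE = 64;

// Rule 4: internal type names use CapitalizedCamelCase.
class ByteBuffer {
  // Rule 1: non-const data members use snake_case.
  char storage[MAX_BUFFER_SIZE] = {};
  std::size_t write_index = 0;

public:
  // Rule 3: methods use snake_case, and so do their arguments (rule 1).
  bool push_back(char new_byte) {
    if (write_index == MAX_BUFFER_SIZE)
      return false; // Buffer is full.
    storage[write_index++] = new_byte;
    return true;
  }

  std::size_t size() const { return write_index; }
};

} // namespace demo
```

Public names such as strlen or size_t are not affected by these rules, since their spelling is fixed by the relevant standard.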

Macro style

We define two kinds of macros:

  1. Build defined macros are generated by CMake or Bazel and are passed down to the compiler with the -D command line flag. They start with the LIBC_COPT_ prefix. They are used to tune the behavior of the libc. They either denote an action or define a constant.

  2. Code defined macros are defined within the src/__support/macros folder. They all start with the LIBC_ prefix.

    • src/__support/macros/properties/ - Build related properties like target architecture or enabled CPU features defined by introspecting compiler defined preprocessor definitions.

      • architectures.h - Target architecture properties. e.g., LIBC_TARGET_ARCH_IS_ARM.

      • compiler.h - Host compiler properties. e.g., LIBC_COMPILER_IS_CLANG.

      • cpu_features.h - Target cpu feature availability. e.g., LIBC_TARGET_CPU_HAS_AVX2.

      • types.h - Type properties and availability. e.g., LIBC_TYPES_HAS_FLOAT128.

      • os.h - Target os properties. e.g., LIBC_TARGET_OS_IS_LINUX.

    • src/__support/macros/config.h - Important compiler and platform features. Such macros can be used to produce portable code by parameterizing compilation based on the presence or lack of a given feature. e.g., LIBC_HAS_FEATURE

    • src/__support/macros/attributes.h - Attributes for functions, types, and variables. e.g., LIBC_UNUSED

    • src/__support/macros/optimization.h - Portable macros for performance optimization. e.g., LIBC_LIKELY, LIBC_LOOP_NOUNROLL
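As a rough illustration of how the two kinds of macros can work together, the sketch below uses stand-in macros with a DEMO_ prefix. DEMO_COPT_MAX_DIGITS plays the role of a build defined macro (normally passed with -D), while DEMO_COMPILER_IS_CLANG, DEMO_COMPILER_IS_GCC, DEMO_LIKELY and DEMO_UNLIKELY play the role of code defined macros. All of these names and their definitions are assumptions made for the example; they are not the definitions found in src/__support/macros.

```cpp
// Build defined style: normally supplied as -DDEMO_COPT_MAX_DIGITS=<n>.
// The fallback value here is an assumption made for this sketch.
#ifndef DEMO_COPT_MAX_DIGITS
#define DEMO_COPT_MAX_DIGITS 8
#endif

// Code defined property, derived from compiler defined preprocessor macros.
#if defined(__clang__)
#define DEMO_COMPILER_IS_CLANG
#elif defined(__GNUC__)
#define DEMO_COMPILER_IS_GCC
#endif

// Code defined optimization hints, built on top of the property above.
#if defined(DEMO_COMPILER_IS_CLANG) || defined(DEMO_COMPILER_IS_GCC)
#define DEMO_LIKELY(x) __builtin_expect(!!(x), 1)
#define DEMO_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define DEMO_LIKELY(x) (x)
#define DEMO_UNLIKELY(x) (x)
#endif

// Parses at most DEMO_COPT_MAX_DIGITS leading decimal digits of text.
// Returns -1 if text does not start with a digit.
inline long parse_prefix(const char *text) {
  if (DEMO_UNLIKELY(text[0] < '0' || text[0] > '9'))
    return -1;
  long value = 0;
  for (int i = 0;
       i < DEMO_COPT_MAX_DIGITS && text[i] >= '0' && text[i] <= '9'; ++i)
    value = value * 10 + (text[i] - '0');
  return value;
}
```

The point of the layering is that runtime code such as parse_prefix never tests compiler specific macros directly; it only uses the prefixed wrappers.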

Inline functions and variables defined in header files

When defining functions and variables inline in header files, we follow certain rules:

  1. Functions should not be given file-static linkage. Class static methods, however, may be defined inline.

  2. Instead of using the inline keyword, functions should be tagged with the LIBC_INLINE macro and variables should be tagged with the LIBC_INLINE_VAR macro defined in src/__support/macros/attributes.h. For example:

    LIBC_INLINE_VAR constexpr bool foo = true;
    
    LIBC_INLINE ReturnType function_defined_inline(ArgType arg) {
      ...
    }
    
  3. The LIBC_INLINE tag should also be added to functions which have definitions that are implicitly inline. Examples of such functions are class methods (static and non-static) defined inline and constexpr functions.
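The three rules can be sketched in one header-style snippet. DEMO_INLINE and DEMO_INLINE_VAR below are stand-ins for LIBC_INLINE and LIBC_INLINE_VAR, and defining them as plain inline is an assumption made to keep the example self-contained. The other names (DEFAULT_SHIFT, scale, Counter, square) are hypothetical as well.

```cpp
// Stand-ins for LIBC_INLINE and LIBC_INLINE_VAR (assumed definitions).
#define DEMO_INLINE inline
#define DEMO_INLINE_VAR inline

namespace demo {

// Rule 2: variables defined in a header are tagged with the variable macro.
DEMO_INLINE_VAR constexpr int DEFAULT_SHIFT = 3;

// Rules 1 and 2: a free function defined in a header is tagged with the
// function macro and is not declared static.
DEMO_INLINE int scale(int value) { return value << DEFAULT_SHIFT; }

class Counter {
  int count = 0;

public:
  // Rule 3: methods defined in the class body are implicitly inline, but
  // they still carry the tag.
  DEMO_INLINE void increment() { ++count; }
  DEMO_INLINE int get() const { return count; }
  DEMO_INLINE static constexpr int initial() { return 0; }
};

// Rule 3: constexpr functions are implicitly inline and are tagged too.
DEMO_INLINE constexpr int square(int x) { return x * x; }

} // namespace demo
```

Tagging even the implicitly inline definitions keeps every inline definition in the headers easy to find with a plain text search.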

Setting errno from runtime code

Many libc functions set errno to indicate an error condition. If LLVM’s libc is being used as the only libc, then the errno from LLVM’s libc is affected. If LLVM’s libc is being used in the Overlay Mode, then the errno from the system libc is affected. When a libc function, which can potentially affect the errno, is called from a unit test, we do not want the global errno (as in, the errno of the process thread running the unit test) to be affected. If the global errno is affected, then the operation of the unit test infrastructure itself can be affected. To avoid perturbing the unit test infrastructure around the setting of errno, the following rules are to be followed:

  1. A special macro named libc_errno defined in src/errno/libc_errno.h should be used when setting errno from libc runtime code. For example, code to set errno to EINVAL should be:

    libc_errno = EINVAL;
    
  2. errno should be set just before returning from the implementation of the public function. It should not be set from within helper functions. Helper functions should use idiomatic C++ constructs like cpp::optional and ErrorOr to return error values.

  3. The header file src/errno/libc_errno.h is shipped as part of the target corresponding to the errno entrypoint libc.src.errno.errno. We do not in general allow dependencies between entrypoints. The errno entrypoint is the only exception: other entrypoints should explicitly depend on it if they set errno to indicate error conditions.
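Rule 2 can be sketched as follows. To stay self-contained, the example uses std::optional in place of cpp::optional and the plain errno macro in place of libc_errno; both substitutions are assumptions made for illustration, and the functions parse_digit and digit_value are hypothetical.

```cpp
#include <cerrno>
#include <optional>

namespace demo {

// Helper function: reports failure through its return type and never
// touches errno.
inline std::optional<int> parse_digit(char c) {
  if (c < '0' || c > '9')
    return std::nullopt;
  return c - '0';
}

// Stand-in for the implementation of a public function: the only place
// where errno is set, immediately before returning.
inline int digit_value(char c) {
  std::optional<int> result = parse_digit(c);
  if (!result) {
    errno = EINVAL; // In libc runtime code: libc_errno = EINVAL;
    return -1;
  }
  return *result;
}

} // namespace demo
```

Keeping errno out of the helper means the helper can be reused and tested without any global state being modified.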

Assertions in libc runtime code

Libc developers are encouraged to use assertions freely in the libc runtime code. However, assertions should be written with the LIBC_ASSERT macro defined in src/__support/libc_assert.h rather than with the standard assert macro. Internally, all LIBC_ASSERT does is print the assertion expression and exit; it does not implement the semantics of the standard assert macro. Hence, it can be used from anywhere in the libc runtime code without causing recursive calls or chicken-and-egg situations.
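To make the "print the expression and exit" behavior concrete, here is a minimal sketch of a macro in that spirit. DEMO_ASSERT is hypothetical and is not the definition of LIBC_ASSERT; in particular, it uses std::fputs and std::abort from the hosted C++ library, which real libc runtime code could not rely on.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in: prints the failed expression and terminates.
#define DEMO_ASSERT(COND)                                                      \
  do {                                                                         \
    if (!(COND)) {                                                             \
      std::fputs("Assertion failed: " #COND "\n", stderr);                     \
      std::abort();                                                            \
    }                                                                          \
  } while (false)

// Returns values[index] after checking the preconditions.
inline int checked_index(const int *values, int size, int index) {
  DEMO_ASSERT(values != nullptr);
  DEMO_ASSERT(index >= 0 && index < size);
  return values[index];
}
```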

Allocations in the libc runtime code

Some libc functions allocate memory. For example, the strdup function allocates new memory into which the input string is duplicated. Allocations are typically done by calling a function from the malloc family of functions. Such functions can fail and return an error value to indicate allocation failure. To conform to standards, the libc should handle allocation failures gracefully and surface the error conditions to the user code as appropriate. Since LLVM’s libc is implemented in C++, we want allocations and deallocations to employ C++ operators new and delete as they implicitly invoke constructors and destructors respectively. However, if we use the default new and delete operators, the libc will end up depending on the C++ runtime. To avoid such a dependence, and to handle allocation failures gracefully, we use special new and delete operators defined in src/__support/CPP/new.h. Allocations and deallocations using these operators employ a pattern like this:

#include "src/__support/CPP/new.h"

...

  LIBC_NAMESPACE::AllocChecker ac;
  auto *obj = new (ac) Type(...);
  if (!ac) {
    // handle allocator failure.
  }
  ...
  delete obj;

The only exception to using the above pattern is if allocating using the realloc function is of value. In such cases, prefer to use only the malloc family of functions for allocations and deallocations. Allocation failures will still need to be handled gracefully. Further, keep in mind that these functions do not call the constructors and destructors of the allocated/deallocated objects. So, use these functions carefully and only when it is absolutely clear that constructor and destructor invocation is not required.
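For readers curious how a checker-based operator new can report failure without exceptions, the following is a minimal sketch of the idea. DemoAllocChecker, Point and sum_on_heap are hypothetical, and this is not the implementation in src/__support/CPP/new.h. The sketch relies on a standard C++ rule: when a noexcept allocation function returns a null pointer, the new-expression skips the constructor call.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

namespace demo {

// Hypothetical checker: records whether the last allocation succeeded.
class DemoAllocChecker {
  bool success = false;

public:
  void record(bool ok) { success = ok; }
  explicit operator bool() const { return success; }
};

struct Point {
  int x;
  int y;
  Point(int x_value, int y_value) : x(x_value), y(y_value) {}
};

} // namespace demo

// noexcept allocation function: returns nullptr on failure instead of
// throwing, and records the outcome in the checker.
void *operator new(std::size_t size, demo::DemoAllocChecker &ac) noexcept {
  void *mem = std::malloc(size);
  ac.record(mem != nullptr);
  return mem;
}

// Matching deallocation function, used only if a constructor throws.
void operator delete(void *mem, demo::DemoAllocChecker &) noexcept {
  std::free(mem);
}

namespace demo {

// Returns x + y computed through a heap-allocated Point, or -1 if the
// allocation failed.
inline int sum_on_heap(int x, int y) {
  DemoAllocChecker ac;
  Point *point = new (ac) Point(x, y);
  if (!ac)
    return -1; // Handle allocation failure.
  int sum = point->x + point->y;
  // This sketch does not replace the global operator delete, so it destroys
  // and frees the object explicitly instead of writing `delete point`.
  point->~Point();
  std::free(point);
  return sum;
}

} // namespace demo
```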

Warnings in sources

We expect contributions to be free of warnings from the minimum supported compiler versions (and newer).

Header Inclusion Policy

Because llvm-libc supports both Overlay Mode and Fullbuild Mode, care must be taken when #include’ing certain headers.

The include/ directory contains the public-facing headers that users must consume in fullbuild mode. Types defined there have ABI implications, because their definitions may differ from those of the underlying system in overlay mode. These headers are therefore NEVER appropriate to include in libc/src/ without preprocessor guards for LLVM_LIBC_FULL_BUILD.

Consider the case where an implementation in libc/src/ may wish to refer to a sigset_t, what header should be included? <signal.h>, <spawn.h>, <sys/select.h>?

None of the above. Instead, code under src/ should #include "hdr/types/sigset_t.h" which contains preprocessor guards on LLVM_LIBC_FULL_BUILD to either include the public type (fullbuild mode) or the underlying system header (overlay mode).

Implementations in libc/src/ should NOT #include headers using <> or "include/*", except from within these “proxy” headers, which first check for LLVM_LIBC_FULL_BUILD.

These “proxy” headers are similarly used when referring to preprocessor defines. Code under libc/src/ should #include a proxy header from hdr/, which contains a guard on LLVM_LIBC_FULL_BUILD to either include our header from libc/include/ (fullbuild) or the corresponding underlying system header (overlay).
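The shape of such a proxy header can be sketched as below. The example uses time_t rather than sigset_t only because <time.h> is available on every hosted platform. The header name, the demo_time_t alias and the fullbuild stand-in typedef are all assumptions made so that the sketch compiles outside the libc tree; a real proxy header would instead include the libc's own public definition in the fullbuild branch and expose the type under its standard name.

```cpp
// Contents of a hypothetical proxy header, e.g. "demo_hdr/time_t.h".
#ifndef DEMO_HDR_TIME_T_H
#define DEMO_HDR_TIME_T_H

#ifdef LLVM_LIBC_FULL_BUILD
// Fullbuild mode: the libc's own public definition of the type would be
// included here. A stand-in typedef keeps this sketch compilable.
typedef long long demo_time_t;
#else
// Overlay mode: defer to the underlying system header.
#include <time.h>
typedef time_t demo_time_t;
#endif

#endif // DEMO_HDR_TIME_T_H

// Code under src/ includes only the proxy header and then uses the type,
// without knowing which branch provided it.
inline demo_time_t add_seconds(demo_time_t start, int seconds) {
  return start + seconds;
}
```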

Policy on Assembly sources

Coding in high level languages such as C++ provides benefits relative to low level languages like Assembly, such as:

  • Improved safety

  • Compile time diagnostics

  • Instrumentation

    • Code coverage

    • Profile collection

  • Sanitization

  • Automatic generation of debug info

While it’s not impossible to have Assembly code that correctly provides all of the above, we do not wish to maintain such Assembly sources in llvm-libc.

That said, there are a few functions provided by llvm-libc that are impossible to reliably implement in C++ for all compilers supported for building llvm-libc.

We do use inline or out-of-line Assembly in an intentionally minimal set of places; typically places where the stack or individual register state must be manipulated very carefully for correctness, or instances where a specific instruction sequence does not have a corresponding compiler builtin function today.

Contributions adding functions implemented purely in Assembly for performance are not welcome.

Contributors should strive to stick with C++ for as long as it remains reasonable to do so. Ideally, bugs should be filed against compiler vendors, and links to those bug reports should appear in commit messages or comments that seek to add Assembly to llvm-libc.

Patches containing any amount of Assembly ideally should be approved by 2 maintainers. llvm-libc maintainers reserve the right to reject Assembly contributions that they feel could be better maintained if rewritten in C++, and to revisit this policy in the future.

LIBC_NAMESPACE_DECL

llvm-libc provides a macro LIBC_NAMESPACE that names the namespace containing the internal implementations of libc functions and globals. This macro should only be used as an identifier for accessing symbols within that namespace (like LIBC_NAMESPACE::cpp::max). Any declaration or definition of internal symbols should instead open the namespace with LIBC_NAMESPACE_DECL, which declares LIBC_NAMESPACE with hidden visibility.

Example usage:

#include "src/__support/macros/config.h"  // The macro is defined here.

namespace LIBC_NAMESPACE_DECL {

void new_function() {
  ...
}

}  // LIBC_NAMESPACE_DECL

Having hidden visibility on the namespace ensures that extern declarations in a given translation unit (TU) have known visibility and never generate GOT indirections. The attribute guarantees this independently of global compile options and build systems.
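One way such a pair of macros could be set up is sketched below, using a visibility attribute on the namespace. DEMO_NAMESPACE, DEMO_NAMESPACE_DECL, demo_libc and the two functions are hypothetical names; the definitions are an assumption for illustration and require a compiler that accepts attributes on namespaces (C++17 with GCC or Clang).

```cpp
// Hypothetical stand-ins for LIBC_NAMESPACE and LIBC_NAMESPACE_DECL.
#define DEMO_NAMESPACE demo_libc

#if defined(__GNUC__) || defined(__clang__)
// Opening the namespace through this macro gives everything declared inside
// it hidden visibility, regardless of flags such as -fvisibility.
#define DEMO_NAMESPACE_DECL [[gnu::visibility("hidden")]] DEMO_NAMESPACE
#else
#define DEMO_NAMESPACE_DECL DEMO_NAMESPACE
#endif

// Declaring or defining symbols: use the _DECL form.
namespace DEMO_NAMESPACE_DECL {

int add_one(int value) { return value + 1; }

} // namespace DEMO_NAMESPACE_DECL

// Accessing symbols: use the plain namespace macro as a qualifier.
inline int call_add_one(int value) { return DEMO_NAMESPACE::add_one(value); }
```

The split between the two macros means the attribute appears only where a namespace is opened, while qualified names such as DEMO_NAMESPACE::add_one remain ordinary identifiers.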