Developing with the LLVM compiler infrastructure

During the course we will use LLVM, a well-known open-source compiler and toolchain. It is distributed under the Apache License 2.0 with LLVM Exceptions. Due to its popularity, various LLVM programs and libraries are packaged for many operating systems, including Debian GNU/Linux, Arch Linux, and FreeBSD. We could therefore install LLVM from the operating system repository, but a packaged build would make it hard to modify the source code later, so we will build it from source instead.

Setting up the integrated development environment

We will use Visual Studio Code as the integrated development environment. However, any development environment for C++ is acceptable, including Qt Creator, CLion, CodeLite, NetBeans, and Eclipse.

Note

The commands below assume that a Unix-like operating system is used, which includes Linux, FreeBSD, macOS, illumos, and many others, but not Windows. To get a Unix-like environment on Windows 10 and newer, it is recommended to use the Windows Subsystem for Linux (WSL), Windows Terminal, and the Visual Studio Code Remote - WSL extension.

First, install the C/C++ Extension Pack, which will install C/C++ and CMake extensions. More details about these extensions can be found in the C/C++ for Visual Studio Code guide.

Building the LLVM compiler infrastructure from source

Hereafter we will more or less follow the directions of Getting started with the LLVM System from the Getting Started/Tutorials section.

It is possible to download the LLVM source code from its releases page. We'll be using the latest patch release from the latest series that is available. At the time of the start of the course, this is release 16.0.3. We'll download the source code from the LLVM 16.0.3 release on GitHub:

$ curl -OL https://github.com/llvm/llvm-project/archive/refs/tags/llvmorg-16.0.3.tar.gz

This is the complete source code archive for all tools and libraries. The same page also provides the binaries as well as the separate source code archives for the tools and libraries produced by the LLVM sub-projects:

LLVM core libraries,
Clang compiler and its tools,
compiler-rt runtime library,
Flang compiler,
libclc OpenCL library,
libcxx C++ standard library and its application binary interface,
lld linker,
lldb debugger,
OpenMP library for Clang and Flang,
Polly high-level loop and data-locality optimizations infrastructure, and
test suite.

Although all of these tools are interesting in their own way, most of them will not be used here. Of the sub-projects, we will use Clang to demonstrate the compilation process.

We'll be following Building LLVM with CMake from the User Guides section of the LLVM documentation. Now it's time to unpack the source code tarball:

$ tar xzf llvmorg-16.0.3.tar.gz
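The download and unpack commands differ between releases only in the version number, so it can help to keep that number in one place. The following is a minimal sketch, assuming that other release tags follow the same naming pattern as 16.0.3; the variable names (version, tarball, url, srcdir) are chosen here for illustration and are not part of LLVM or its documentation. It only prints the commands and downloads nothing.

```shell
# Release used in the text; change this one value to target another release.
version=16.0.3

# Names derived from the version, following the pattern of the commands above
# (an assumption for releases other than 16.0.3).
tarball="llvmorg-$version.tar.gz"
url="https://github.com/llvm/llvm-project/archive/refs/tags/$tarball"
srcdir="llvm-project-llvmorg-$version"

# Print the commands instead of running them, so nothing is downloaded here.
echo "curl -OL $url"
echo "tar xzf $tarball"
echo "cd $srcdir"
```

The last printed line names the directory that the archive unpacks into.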

LLVM, Clang, and related projects use CMake for building. Most notably, the LLVM build does not support building in the source tree, so a separate build directory is required. First, enter the unpacked source directory:

$ cd llvm-project-llvmorg-16.0.3

If Visual Studio Code is used for the development, this is the project directory that should be opened in it. Afterwards, the integrated terminal can be used for running the commands. Next, create the build directory and enter it:

$ mkdir builddir
$ cd builddir

There are many CMake and LLVM-related variables that can be specified at build time. We'll use only three of them, two CMake and one LLVM-related, specifically:

-DCMAKE_BUILD_TYPE=Release sets the build mode to release (optimized, without debug information) rather than debug, which results in smaller file size of the built binaries
-DBUILD_SHARED_LIBS=ON enables dynamic linking of libraries, which significantly reduces memory requirements for building and results in smaller file size of the built binaries
-DLLVM_ENABLE_PROJECTS=clang enables building of Clang alongside LLVM

$ cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DLLVM_ENABLE_PROJECTS=clang ../llvm
$ make -j 2
$ make -j 2 check
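CMake records the configured values in the file CMakeCache.txt in the build directory, with entries of the form NAME:TYPE=VALUE. To confirm which values were recorded for the three variables above, the cache can be searched. This is a small sketch; the function name show_build_options is made up for this example.

```shell
# Print the three variables from a CMake cache file (default: ./CMakeCache.txt).
# show_build_options is a hypothetical helper name, not an LLVM or CMake tool.
show_build_options() {
    cache="${1:-CMakeCache.txt}"
    if [ -f "$cache" ]; then
        # Cache entries have the form NAME:TYPE=VALUE.
        grep -E '^(CMAKE_BUILD_TYPE|BUILD_SHARED_LIBS|LLVM_ENABLE_PROJECTS):' "$cache"
    else
        echo "$cache not found; run cmake in this directory first" >&2
        return 1
    fi
}

# Meant to be run from builddir after the cmake step.
show_build_options || true
```

If a value differs from what was intended, rerunning the cmake command with the corrected -D option updates the cache.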

Assignment

Find out what the latest released version of LLVM is, download it instead of the one used above, and build it.

If you have many CPU cores, you can increase the number of parallel compile jobs by setting the -j parameter of the make command to a number larger than 2, for example the number of cores. This will make make make (!) the code faster, ideally several times faster.
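The value for -j can also be computed rather than typed in. The sketch below tries a few common tools for counting CPU cores, since which of them is installed depends on the operating system, and falls back to 2 if none is found. The variable name njobs is chosen here for illustration.

```shell
# Try common ways of counting online CPU cores: nproc (GNU coreutils),
# getconf (Linux, macOS), sysctl (BSD, macOS). Fall back to 2, the value
# used in the text, if none of them is available.
njobs=$(nproc 2>/dev/null \
    || getconf _NPROCESSORS_ONLN 2>/dev/null \
    || sysctl -n hw.ncpu 2>/dev/null \
    || echo 2)

# Print the resulting command; prefix it with "time" to measure the build.
echo "make -j $njobs"
```

Keep in mind that each parallel compile job also needs memory, so on machines with little RAM a smaller number may work better.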

Assignment

Find out how many CPU cores you have and check if increasing the number of jobs speeds up the build process.

Alternatively, LLVM can also be obtained from GitHub using Git. In that case, the branch release/16.x should be used. The rest of the process is pretty similar:

$ git clone https://github.com/llvm/llvm-project.git
$ cd llvm-project
$ git checkout release/16.x
$ mkdir builddir
$ cd builddir
$ cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DLLVM_ENABLE_PROJECTS=clang ../llvm
$ make -j 2
$ make -j 2 check

Assignment

Find out what the latest release branch of LLVM is, check out that branch instead of the one used above, and build LLVM.

The overview of the LLVM architecture

While LLVM is building, let's take a look at the LLVM architecture. Chris Lattner, the main author of LLVM, wrote the LLVM chapter of the book The Architecture of Open Source Applications. To follow the code described in the chapter, open the following files in the llvm-project-llvmorg-16.0.3/llvm directory:

include/llvm/Analysis/InstructionSimplify.h
lib/Analysis/InstructionSimplify.cpp
include/llvm/Pass.h
lib/Transforms/Hello/Hello.cpp
include/llvm/ADT/Triple.h
lib/Target/X86/X86InstrArithmetic.td
lib/Target/AMDGPU/AMDGPUInstrInfo.td
test/CodeGen/X86/add.ll
test/CodeGen/AMDGPU/llvm.log10.ll
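Files in the LLVM source tree are sometimes moved or removed between releases, so with a different version some of the paths above may not exist. The following sketch reports which of the listed files are present, assuming the tarball was unpacked into llvm-project-llvmorg-16.0.3 as in the build steps; the function name check_llvm_files is made up for this example.

```shell
# Report which of the files listed above exist under the given llvm directory.
# check_llvm_files is a hypothetical helper name, not part of LLVM.
check_llvm_files() {
    llvm_dir="$1"
    for f in \
        include/llvm/Analysis/InstructionSimplify.h \
        lib/Analysis/InstructionSimplify.cpp \
        include/llvm/Pass.h \
        lib/Transforms/Hello/Hello.cpp \
        include/llvm/ADT/Triple.h \
        lib/Target/X86/X86InstrArithmetic.td \
        lib/Target/AMDGPU/AMDGPUInstrInfo.td \
        test/CodeGen/X86/add.ll \
        test/CodeGen/AMDGPU/llvm.log10.ll
    do
        if [ -f "$llvm_dir/$f" ]; then
            echo "found   $f"
        else
            echo "missing $f"
        fi
    done
}

# Directory name assumed from the tarball unpacked earlier; run this from
# the directory that contains it.
check_llvm_files llvm-project-llvmorg-16.0.3/llvm
```

For a file reported as missing, searching the source tree by file name (for example with find) is a reasonable next step.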

Author: Vedran Miletić