Linking and Loading


Linking, loading, header files, and libraries: what are they all about? This post describes how these concepts interrelate for C/C++ code. Since the Arduino language is really C++, this material applies in that context as well. In fact, this post aims to explain these concepts specifically as they relate to Arduino code development.

Functions

Wait, how did functions get in here? Functions are blocks of code that solve one of the earliest problems of software development: how to reuse code without having to copy and paste it into each project that could use it. There are three interrelated things to do with a function: declare it, define it, and reference it.

declaration
The declaration gives the “signature” of the function: its name, return type, and the number of parameters to pass to it, along with the types of those parameters. A function definition may serve double duty as the function’s declaration, but the situation that matters for linking and loading is when the declaration is put in a separate file, which is brought into the compilation by including it at the top of the code file (at the “head of the code file”) that uses it. These are called “header files,” and by convention their names end in .h.
definition
The definition gives the signature of the function but also includes, inside curly braces following the signature, the statements that make up the body of the function, that is, the statements that are executed when the function is referenced.
reference
References are the pieces of code that “invoke” or “call” a function—that cause the statements in the function body to be executed. When the compiler encounters a function reference, the reference must agree with a known function signature in order to be compiled.
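
As a minimal sketch of the three roles, the following C++ fragment puts a declaration, a definition, and a reference in one file. The names addTwo and sumOfThree are hypothetical, chosen only for illustration; in a real project the declaration would typically live in a header file.

```cpp
// Declaration: gives the signature only (name, return type, parameter types).
// In a real project this line would usually sit in a header file.
int addTwo(int a, int b);

// Reference: each call to addTwo() below is a reference. The compiler checks
// the calls against the declaration above; it does not need the definition yet.
int sumOfThree(int x, int y, int z) {
    return addTwo(addTwo(x, y), z);
}

// Definition: the signature again, followed by the body in curly braces.
int addTwo(int a, int b) {
    return a + b;
}
```

Note that sumOfThree() compiles even though the body of addTwo() appears later in the file: the declaration alone is enough for the compiler to generate the call.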

Loading

The above description of things related to functions deals with source code: what the programmer has to deal with when preparing a project to be compiled. Now we jump to what has to happen to load compiled code into memory for execution.

The machine language code generated by the compiler has to be loaded into the computer’s memory from one or more disk files. All the compiled code for the program has to be loaded into memory: the code for the functions you defined (setup(), loop(), and possibly others); the code for the functions your code references; and some special code that provides the “runtime environment” for your code. (For Arduino, this is a function named main() that calls your setup() function once and then calls your loop() function over and over again.) Each function definition gets assigned its own location in memory, and each function call has to be filled in with the address that identifies the corresponding definition’s location in memory.
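
The relationship between the runtime and your setup() and loop() functions can be sketched roughly as follows. This is a desktop simulation, not the actual Arduino source: runSketch() is a hypothetical stand-in for the runtime's main(), and the two counters exist only so the calls can be observed. The real runtime loops forever; this version stops after a fixed number of passes so that it can terminate.

```cpp
// Counters so the effect of each call can be observed on a desktop machine.
int setupCalls = 0;
int loopCalls = 0;

// User-defined functions, as they would appear in a .ino file.
void setup() { ++setupCalls; }
void loop()  { ++loopCalls; }

// Hypothetical stand-in for the runtime's main(): calls setup() once, then
// calls loop() repeatedly. Unlike the real runtime, it stops after
// `iterations` passes.
void runSketch(int iterations) {
    setup();
    for (int i = 0; i < iterations; ++i) {
        loop();
    }
}
```

Calling runSketch(3) would run setup() once and loop() three times.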

Libraries

For this discussion, a library is a file that contains a set of function definitions that are somehow related, such as to control NeoPixels on an Adafruit board. In Arduino, these libraries are compiled as part of the same process that compiles your .ino file(s) because the machine language to be generated depends on what Arduino board you are using. For C/C++ applications running on a regular computer, these library files are typically precompiled (to save time building an application) because the processor is known ahead of time.

Linking

Linking is the process of deciding what code files to pass on to the loading process. The machine code for your “user defined” definitions of setup() and loop() will come from a file produced by the compiler when it translated your .ino file(s) into machine language. The Arduino runtime code will come from a file that is available inside the package of files that come with the Arduino application. In addition there may be library files that contain definitions for functions that a project uses beyond those that are part of the runtime or user-defined functions.

When the compiler processes your code (and header files), it leaves information in the machine language file that tells where the function references are located within the file and what the functions are that they refer to. It’s the linker’s job to look at this information and then find the other files that contain the needed function definitions, and to link the files together so that the machine language function references connect to the corresponding machine language function definitions.
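
To make the matching step concrete, here is a toy model in C++. It is only an illustration of the idea, not how any real linker is implemented: the ObjectFile struct and the linkFiles() function are hypothetical, and a "file" here is just a list of the function names it defines and a list of the names it references.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Toy model of a compiled file: the functions it defines and the names of
// the functions it references.
struct ObjectFile {
    std::string name;
    std::vector<std::string> defines;
    std::vector<std::string> references;
};

// Toy "link" step: record which file defines each function, then check that
// every reference has a matching definition. Returns a map from function
// name to the name of the file that defines it.
std::map<std::string, std::string> linkFiles(const std::vector<ObjectFile>& files) {
    std::map<std::string, std::string> definedIn;
    for (const ObjectFile& f : files) {
        for (const std::string& fn : f.defines) {
            // emplace() reports false if the name was already present.
            if (!definedIn.emplace(fn, f.name).second) {
                throw std::runtime_error("multiple definition of " + fn);
            }
        }
    }
    for (const ObjectFile& f : files) {
        for (const std::string& fn : f.references) {
            if (definedIn.count(fn) == 0) {
                throw std::runtime_error("undefined reference to " + fn);
            }
        }
    }
    return definedIn;
}
```

The two exceptions correspond to the two classic kinds of linker complaint: a function that is defined more than once, and a function that is referenced but defined nowhere.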

The linker leaves to the loader the task of filling in the actual memory locations used by functions.

Header Files

We end with a slightly arcane and somewhat pedantic statement of the role of header files in an application. People often say that header files tell what libraries to use. That’s incomplete, but not incorrect. People also say that header files are the libraries that a project uses, and that is (pedantically) incorrect.

A more accurate statement would be that a header file provides the code for function signatures that allow a compiler to generate code for function references in your code. The machine instructions for those function references then have to be “fixed up” (a technical term) by the linker and loader for the program to be able to run.

Header files contain other information besides function signatures. They may also define named constants and class information that are used for working with a library.
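
For illustration, a small header and its matching library source might look something like the sketch below. Everything here is hypothetical (the file names blinker.h and blinker.cpp, the constant DEFAULT_BLINK_MS, the function blinkPeriodFor(), and the class Blinker); the two files are shown in one block only to keep the example short.

```cpp
// ---- blinker.h (hypothetical header file) ----
#ifndef BLINKER_H
#define BLINKER_H

// A named constant that code using the library can refer to.
const int DEFAULT_BLINK_MS = 500;

// A function declaration: signature only, no body.
int blinkPeriodFor(int blinksPerSecond);

// Class information: the members are declared here and defined elsewhere.
class Blinker {
public:
    explicit Blinker(int pin);
    int pin() const;
private:
    int pin_;
};

#endif  // BLINKER_H

// ---- blinker.cpp (hypothetical library source, compiled separately) ----
// In a real project this file would start with: #include "blinker.h"
int blinkPeriodFor(int blinksPerSecond) {
    // Fall back to the default period when the rate is zero or negative.
    return blinksPerSecond > 0 ? 1000 / blinksPerSecond : DEFAULT_BLINK_MS;
}

Blinker::Blinker(int pin) : pin_(pin) {}
int Blinker::pin() const { return pin_; }
```

Code that includes only the header can be compiled, but it cannot run until the linker connects its references to the definitions in the library source.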

