So… I’m doing software development for microcontrollers at work with a commercial toolchain. I also evaluated the arm-none-eabi-gcc toolchain in the past (briefly).

One thing that continues to confuse me is the different compiler and linker options used in various online examples, tutorials and library Makefiles.

What I especially wanted to know is: What happens before main() with each combination of options? Do the variables get initialized (i.e. does the .data section get copied from flash to RAM) and is the .bss section properly zeroed?

TL;DR: See the “In short” paragraph at the end of the summary.

Some libraries, e.g. the libopencm3 project as well as ARM’s CMSIS, contain explicit startup code that does these tasks.

This post uses the latest version of the GNU embedded toolchain for ARM. At the time of writing, that’s 8‑2019‑q3‑update.

This post will take a look at:

  • using -nostartfiles vs. not using it
  • using nano.specs vs. nosys.specs vs. not using any specs file

I start with a simple main() function that should suffice to analyze what I want. This example won’t run on an actual Cortex-M3 microcontroller, since it lacks the vector table. I might (read: probably not ;)) look into this in a later post.

I’m gonna use two different variants, one main() function that exits/returns and one that doesn’t. Note the latter is the usual case when programming microcontrollers.

For the sake of simplicity, let’s do it like this:

static int zerobss;
static int globalinit = 42;

int main() {
    int localvar = 21;
#ifdef DOEXIT
    return 0;
#else
    while(1);
#endif
}

To test the effect of various options, I wrote the following bash script which loops over all combinations:

#!/bin/bash
# make.sh

export PATH=/opt/gcc-arm-none-eabi-8-2019-q3-update/bin/:$PATH

COMMONFLAGS="-mcpu=cortex-m3 -mthumb "
FILES=main.c
STARTOPT=("" -nostartfiles)
SPECS=("" "-specs=nosys.specs" "-specs=nano.specs")
EXIT=("" "-DDOEXIT")
rm -f *.elf

for f in $FILES; do
    for sf in "${STARTOPT[@]}"; do
        for sp in "${SPECS[@]}"; do
            for ex in "${EXIT[@]}"; do
                # bash substitutions, remove leading -specs= and trailing .specs
                specname=${sp:7}
                specname=${specname%.specs}
                outfile=`basename $f .c`${sf}-${specname}-${ex:2}.elf
                arm-none-eabi-gcc $ex $COMMONFLAGS $sf $sp -o $outfile $f 2>/dev/null && echo "Compiling $outfile succeeded" || echo "Compiling $outfile failed"
            done
        done

    done
done

arm-none-eabi-size *.elf

I silenced the error output, otherwise it gets too noisy.

First, the following combinations fail:

Compiling main--.elf failed
Compiling main--DOEXIT.elf failed
Compiling main-nano-.elf failed
Compiling main-nano-DOEXIT.elf failed

In these cases, linking fails with the following errors (output shortened):

x/arm-none-eabi/bin/ld: x/arm-none-eabi/lib/thumb/v7-m/nofp/libc.a(lib_a-exit.o): in function `exit':
exit.c:(.text.exit+0x16): undefined reference to `_exit'
x/arm-none-eabi/bin/ld: x/arm-none-eabi/lib/thumb/v7-m/nofp/libc_nano.a(lib_a-exit.o): in function `exit':
exit.c:(.text.exit+0x1a): undefined reference to `_exit'

That is, when neither -nostartfiles nor -specs=nosys.specs is specified, the function _exit gets referenced without being defined.

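In the -nostartfiles cases, the linker emits a warning, since the entry symbol is not defined (the nosys.specs builds without -nostartfiles do define _start, as the symbol table further down shows):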

cannot find entry symbol _start; defaulting to 0000000000008000

So, let’s look at what we’ve got (sorted by the arm-none-eabi-size output):

   text	   data	    bss	    dec	    hex	filename
     12	      4	      4	     20	     14	main-nostartfiles--.elf
     12	      4	      4	     20	     14	main-nostartfiles-nano-.elf
     12	      4	      4	     20	     14	main-nostartfiles-nosys-.elf
     22	      4	      4	     30	     1e	main-nostartfiles--DOEXIT.elf
     22	      4	      4	     30	     1e	main-nostartfiles-nano-DOEXIT.elf
     22	      4	      4	     30	     1e	main-nostartfiles-nosys-DOEXIT.elf
    932	   1096	     68	   2096	    830	main-nosys-.elf
    944	   1096	     68	   2108	    83c	main-nosys-DOEXIT.elf

Looks interesting. Just using nosys.specs seems to add a lot of stuff.

Let’s look at the disassembly of the return vs. loop code. Since the specs file makes no difference when -nostartfiles is given, I use the first file of each group:

$ arm-none-eabi-objdump -S -d main-nostartfiles--.elf main-nostartfiles--DOEXIT.elf

main-nostartfiles--.elf:     file format elf32-littlearm


Disassembly of section .text:

00008000 <main>:
static int zerobss;
static int globalinit = 42;

int main() {
    8000:	b480      	push	{r7}
    8002:	b083      	sub	sp, #12
    8004:	af00      	add	r7, sp, #0
	int localvar = 21;
    8006:	2315      	movs	r3, #21
    8008:	607b      	str	r3, [r7, #4]
#ifdef DOEXIT
	return 0;
#else
	while(1);
    800a:	e7fe      	b.n	800a <main+0xa>

main-nostartfiles--DOEXIT.elf:     file format elf32-littlearm


Disassembly of section .text:

00008000 <main>:
static int zerobss;
static int globalinit = 42;

int main() {
    8000:	b480      	push	{r7}
    8002:	b083      	sub	sp, #12
    8004:	af00      	add	r7, sp, #0
	int localvar = 21;
    8006:	2315      	movs	r3, #21
    8008:	607b      	str	r3, [r7, #4]
#ifdef DOEXIT
	return 0;
    800a:	2300      	movs	r3, #0
#else
	while(1);
#endif
}
    800c:	4618      	mov	r0, r3
    800e:	370c      	adds	r7, #12
    8010:	46bd      	mov	sp, r7
    8012:	bc80      	pop	{r7}
    8014:	4770      	bx	lr

No real surprises here. However: there is no code initializing the .bss section, nor any code copying the .data section from flash memory to RAM (as we could already have guessed from the code size). In these cases, writing that code manually would indeed be required.
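What such hand-written startup code has to do can be sketched in a few lines of C. This is only a minimal sketch under assumptions, not what any particular library ships: the function name init_data_and_bss is hypothetical, and so are the linker symbol names in the closing comment (_sidata, _sdata, _edata, _sbss, _ebss). They would have to match whatever your own linker script exports.

```c
#include <stdint.h>

/*
 * Copy the initial values of .data from their load address (flash) to their
 * run address (RAM), then clear .bss. All pointers are assumed to be
 * word-aligned; the *_end pointers point one word past the last element.
 */
void init_data_and_bss(const uint32_t *data_load,
                       uint32_t *data_start, uint32_t *data_end,
                       uint32_t *bss_start, uint32_t *bss_end)
{
    /* .data: word-wise copy from flash to RAM */
    for (uint32_t *dst = data_start; dst < data_end; ) {
        *dst++ = *data_load++;
    }

    /* .bss: fill with zeros */
    for (uint32_t *dst = bss_start; dst < bss_end; ) {
        *dst++ = 0;
    }
}

/*
 * On the target, a reset handler could call it roughly like this. The symbol
 * names are assumptions and must be provided by the linker script:
 *
 *   extern uint32_t _sidata, _sdata, _edata, _sbss, _ebss;
 *
 *   void Reset_Handler(void)
 *   {
 *       init_data_and_bss(&_sidata, &_sdata, &_edata, &_sbss, &_ebss);
 *       main();
 *       for (;;) { }   // main() should not return on a microcontroller
 *   }
 */
```

Passing the region boundaries as parameters keeps the routine independent of any specific linker script, and it also means the logic can be tried out on a PC with plain arrays standing in for flash and RAM.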

So, what does nosys.specs add? Let’s just look at the symbol table of the looping example:

$ arm-none-eabi-objdump -t  main-nosys-.elf 

main-nosys-.elf:     file format elf32-littlearm

SYMBOL TABLE:
00008000 l    d  .init	00000000 .init
0000800c l    d  .text	00000000 .text
00008388 l    d  .fini	00000000 .fini
00008394 l    d  .rodata	00000000 .rodata
00008398 l    d  .ARM.exidx	00000000 .ARM.exidx
000083a0 l    d  .eh_frame	00000000 .eh_frame
000183a4 l    d  .init_array	00000000 .init_array
000183ac l    d  .fini_array	00000000 .fini_array
000183b0 l    d  .data	00000000 .data
000187ec l    d  .bss	00000000 .bss
00000000 l    d  .comment	00000000 .comment
00000000 l    d  .debug_aranges	00000000 .debug_aranges
00000000 l    d  .debug_info	00000000 .debug_info
00000000 l    d  .debug_abbrev	00000000 .debug_abbrev
00000000 l    d  .debug_line	00000000 .debug_line
00000000 l    d  .debug_frame	00000000 .debug_frame
00000000 l    d  .debug_str	00000000 .debug_str
00000000 l    d  .ARM.attributes	00000000 .ARM.attributes
00000000 l    df *ABS*	00000000 /opt/gcc-arm-none-eabi-8-2019-q3-update/bin/../lib/gcc/arm-none-eabi/8.3.1/thumb/v7-m/nofp/crti.o
00000000 l    df *ABS*	00000000 /opt/gcc-arm-none-eabi-8-2019-q3-update/bin/../lib/gcc/arm-none-eabi/8.3.1/thumb/v7-m/nofp/crtn.o
00000000 l    df *ABS*	00000000 exit.c
00000000 l    df *ABS*	00000000 __call_atexit.c
0000802c l     F .text	00000014 register_fini
00000000 l    df *ABS*	00000000 crtstuff.c
000083a0 l     O .eh_frame	00000000 
00008040 l     F .text	00000000 __do_global_dtors_aux
000187ec l       .bss	00000001 completed.8885
000183ac l     O .fini_array	00000000 __do_global_dtors_aux_fini_array_entry
00008064 l     F .text	00000000 frame_dummy
000187f0 l       .bss	00000018 object.8890
000183a8 l     O .init_array	00000000 __frame_dummy_init_array_entry
00000000 l    df *ABS*	00000000 /opt/gcc-arm-none-eabi-8-2019-q3-update/bin/../lib/gcc/arm-none-eabi/8.3.1/../../../../arm-none-eabi/lib/thumb/v7-m/nofp/crt0.o
00000000 l    df *ABS*	00000000 main.c
00018808 l       .bss	00000004 zerobss
000183b4 l     O .data	00000004 globalinit
00000000 l    df *ABS*	00000000 impure.c
000183c0 l     O .data	00000428 impure_data
00000000 l    df *ABS*	00000000 init.c
00000000 l    df *ABS*	00000000 memset.c
00000000 l    df *ABS*	00000000 atexit.c
00000000 l    df *ABS*	00000000 fini.c
00000000 l    df *ABS*	00000000 lock.c
00000000 l    df *ABS*	00000000 __atexit.c
00000000 l    df *ABS*	00000000 _exit.c
00000000 l    df *ABS*	00000000 crtstuff.c
000083a0 l     O .eh_frame	00000000 __FRAME_END__
00000000 l    df *ABS*	00000000 
000183b0 l       .fini_array	00000000 __fini_array_end
000183ac l       .fini_array	00000000 __fini_array_start
000183ac l       .init_array	00000000 __init_array_end
000183a4 l       .init_array	00000000 __preinit_array_end
000183a4 l       .init_array	00000000 __init_array_start
000183a4 l       .init_array	00000000 __preinit_array_start
0001880c g     O .bss	00000001 __lock___atexit_recursive_mutex
00018810 g     O .bss	00000001 __lock___arc4random_mutex
000187e8 g     O .data	00000004 __atexit_recursive_mutex
000082e0 g     F .text	00000002 __retarget_lock_close
00018830 g       .bss	00000000 _bss_end__
000187ec g       .bss	00000000 __bss_start__
000183b0 g     O .data	00000000 .hidden __dso_handle
00018814 g     O .bss	00000001 __lock___env_recursive_mutex
00018818 g     O .bss	00000001 __lock___sinit_recursive_mutex
00008394 g     O .rodata	00000004 _global_impure_ptr
00008100 g     F .text	00000048 __libc_init_array
00008080 g     F .text	00000000 _mainCRTStartup
00008000 g     F .init	00000000 _init
000082a4 g     F .text	00000034 __libc_fini_array
0001881c g     O .bss	00000001 __lock___malloc_recursive_mutex
000082fc g     F .text	00000002 __retarget_lock_release_recursive
000082f4 g     F .text	00000004 __retarget_lock_try_acquire_recursive
00018830 g       .bss	00000000 __bss_end__
000081e8 g     F .text	000000b0 __call_exitprocs
00008080 g     F .text	00000000 _start
000082f0 g     F .text	00000004 __retarget_lock_try_acquire
00008300 g     F .text	00000084 __register_exitproc
000082e4 g     F .text	00000002 __retarget_lock_close_recursive
000082ec g     F .text	00000002 __retarget_lock_acquire_recursive
000187ec g       .bss	00000000 __bss_start
00008148 g     F .text	000000a0 memset
000080f4 g     F .text	0000000c main
000082dc g     F .text	00000002 __retarget_lock_init_recursive
00018830 g       .bss	00000000 __end__
000082d8 g     F .text	00000002 __retarget_lock_init
00008388 g     F .fini	00000000 _fini
00008298 g     F .text	0000000c atexit
000183b8 g     O .data	00000004 _impure_ptr
000187ec g       .data	00000000 _edata
00018830 g       .bss	00000000 _end
00018820 g     O .bss	00000001 __lock___at_quick_exit_mutex
0000800c g     F .text	00000020 exit
000082e8 g     F .text	00000002 __retarget_lock_acquire
000082f8 g     F .text	00000002 __retarget_lock_release
00008384 g     F .text	00000002 _exit
00018824 g     O .bss	00000001 __lock___dd_hash_mutex
00018828 g     O .bss	00000001 __lock___tz_mutex
00080000 g       .comment	00000000 _stack
000183b0 g       .data	00000000 __data_start
0001882c g     O .bss	00000001 __lock___sfp_recursive_mutex

In the list, we see _mainCRTStartup. This is the startup code responsible for the tasks we want performed.

We also see the _exit function defined (it is referenced by exit(), as the linker errors above show, and exit() in turn is pulled in by the startup code). The disassembly holds no surprises; it’s a simple infinite loop:

$ arm-none-eabi-objdump --disassemble=_exit  main-nosys-.elf 

main-nosys-.elf:     file format elf32-littlearm


Disassembly of section .init:

Disassembly of section .text:

00008384 <_exit>:
    8384:	e7fe      	b.n	8384 <_exit>

Disassembly of section .fini:

I’ll spare you further disassembly output, since it’s pretty long.

Summary

So, a quick recap:

  • Not using -nostartfiles means linking fails because _exit is not defined (except when using nosys.specs, which contains a definition of that function). While this might sound contradictory at first, it happens because the function is referenced via the startup code. You could simply define the function yourself. If it’s the same as with the commercial Keil compiler, the signature should be void _exit(int). Hint: Look for “retargeting”.
  • When using nosys.specs, you can omit the -nostartfiles option, meaning .bss gets initialized and .data gets copied from flash to RAM. Note that the program will still not run on a microcontroller, as mentioned above, because you need the vector table for that. This also requires a custom linker script, which in turn has to properly define the symbols the startup code expects.
    Hint: using -Wl,--verbose with gcc enables verbose linker output, which also shows the default linker script.
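Defining _exit yourself, as suggested in the first bullet, could look like the sketch below. It mirrors the infinite loop that nosys.specs provides. The preprocessor guard and the BARE_METAL_TARGET macro are my own additions (assumptions, not toolchain conventions) so that the stub is only built for the bare-metal target and does not replace the real _exit on a PC.

```c
/*
 * arm-none-eabi-gcc defines __arm__, and hosted Unix-like compilers define
 * __unix__. BARE_METAL_TARGET is a hypothetical helper macro for this sketch.
 */
#if defined(__arm__) && !defined(__unix__)
#define BARE_METAL_TARGET 1
#else
#define BARE_METAL_TARGET 0
#endif

#if BARE_METAL_TARGET
/* Called by exit(); there is nowhere to return to, so just hang here. */
void _exit(int status)
{
    (void)status;   /* the exit code has no consumer on bare metal */
    for (;;) {
    }
}
#endif
```

This should satisfy the undefined reference from the failing builds above. Once more of the C library gets pulled in (printf, malloc, ...), further stubs will most likely be needed.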

Some Stack Overflow answers also mention the option -ffreestanding. However, I found it makes no difference for my example.

In short: Yes, you need to write the startup code that initializes .bss and copies .data from flash to RAM manually when doing bare-metal programming.
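Whichever startup code ends up in the image, a small runtime check can confirm that it did its job. The sketch below reuses the two variables from the example. The function name startup_state_ok is hypothetical, and the variables are declared volatile here (unlike in the original example) so the compiler cannot evaluate the comparisons at build time.

```c
/* Same variables as in the example, but volatile so the reads really happen. */
static volatile int zerobss;           /* lives in .bss, must read as 0   */
static volatile int globalinit = 42;   /* lives in .data, must read as 42 */

/* Returns 1 if .bss and .data look correctly initialized, 0 otherwise. */
int startup_state_ok(void)
{
    return zerobss == 0 && globalinit == 42;
}
```

The idea is to call this at the very top of main(), before anything writes to the two variables, and to stop in a loop (or toggle a pin) if it returns 0. A passing check on the target is a hint rather than a proof, since RAM may contain the expected values by coincidence.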

Update 2019-07-13: Adding links

Also, supplying -nostdlib has the same effect as -nostartfiles for this particular example.

Regarding this topic, please also read “From zero to main(): Bare metal C” on interrupt.memfault.com (archive.org link in case the website goes down).

Another source that really made things “click” for me is “Bare-metal C programming on ARM”. Although the book handles the Cortex-A series, the section about startup code is really worth a read. The book is also available as PDF (again, archive.org link).