July 26, 2021

The Newlib Embedded C Standard Library And How To Use It

When writing code for a new hardware platform, the last thing you want to do is bother with the minutiae of I/O routines, string handling and other similarly tedious details that have nothing to do with the actual project. On bigger systems, this is where the C standard library would traditionally come into play.

For small embedded platforms like microcontrollers, resources are often tight enough that a full-blown stdlib won’t fit, which is why Newlib exists: to bring the portability benefits of a standard library to microcontrollers.

Whether you use C, C++ or MicroPython to program an MCU, Newlib is likely there under the hood. Yet how exactly does it integrate with the hardware, and how are system calls (syscalls) for e.g. file and input/output handling implemented?

A Stubby Toolkit

The C standard library provides a number of headers that cover the available functionality. With each revision of the C standard, new headers are added that cover additional features. Of the original headers, the most commonly used include:

  • <stdio.h>
  • <string.h>
  • <stdlib.h>
  • <math.h>
  • <time.h>

Here one can already surmise that each of these headers differs in how complicated the underlying code is to port to a new platform, especially in the case of an MCU platform with no operating system (OS). Without an OS, there is no obvious way to provide easy access to certain functionality like standard text input and output, or a system clock and calendar. This leads us to the many stub functions in Newlib.

In the case of <string.h> we’re pretty safe, as C-style strings and their operations essentially concern memory operations, something for which no special syscalls are required. This is very different from <stdio.h>, which contains functionality related to file access and operations, as well as input and output via standard input and standard output.

Without some underlying code that connects the libc implementation to e.g. a terminal or storage medium, nothing can happen for these I/O features as there’s no sensible default action for e.g. printf() or fopen(). If we do wish to use printf() or other text output functions, the Newlib documentation tells us that we need to implement a global function int _write(int handle, char* data, int size).

As the name ‘stub’ implies, the Newlib library comes with its own stub implementations that do nothing, so what does one do to make printf() write to somewhere sensible? The most important thing to realize here is that this is completely implementation dependent and what makes sense depends on the specific project. Often in an embedded application formatted text output functions will be used for outputting debug and similar information, in which case outputting to a USART makes perfect sense, for example.

In my Nodate framework, the approach chosen was to allow the code on start-up to pick a specific USART peripheral to send output to, as we can see in its implementation of the stub function in the IO module:

bool stdout_active = false;
USART_devices IO::usart;
int _write(int handle, char* data, int size) {
	if (!stdout_active) { return 0; }
	
	int count = size;
	while (count-- > 0) {
		USART::sendUart(IO::usart, *data);
		data++;
	}
	
	return size;
}

A character array is provided along with its length, which we then transfer in this case to the active USART. As a USART transfers single bytes, the provided array is transferred one byte at a time.

As the target USART can change per platform, this is made configurable, allowing the developer to set the target output device once on start-up as well as dynamically during runtime.

bool IO::setStdOutTarget(USART_devices device) {
	IO::usart = device;	
	stdout_active = true;
	
	return true;
}

Of importance with these stub implementations is that they rely on C-style linkage to find overrides. Since in languages like C++ name mangling is applied by default, be sure to apply an extern "C" { } block around either the full implementation, or a forward declaration of the stub implementation.
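The linkage point can be illustrated with a minimal, host-runnable sketch. The uart_send_byte() helper and the uart_capture buffer are hypothetical stand-ins for a real USART driver call and are not part of Nodate or Newlib; only the _write() signature is taken from the text above. Note that on a desktop system this is just an ordinary function, and printf() only reaches it when the project is linked against Newlib.

```cpp
#include <string>

// Hypothetical stand-in for a byte-wide transmit routine such as a USART
// driver call. Here it appends to a buffer so the sketch runs on a host.
static std::string uart_capture;
static void uart_send_byte(char c) { uart_capture.push_back(c); }

// Forward declaration with C linkage, so the symbol is emitted as plain
// "_write" instead of a mangled C++ name that the library cannot find.
extern "C" int _write(int handle, char* data, int size);

// The definition inherits the C linkage from the declaration above.
int _write(int handle, char* data, int size) {
    (void) handle;  // This sketch sends every handle to the same sink.
    for (int i = 0; i < size; ++i) {
        uart_send_byte(data[i]);
    }
    return size;  // Report all bytes as written.
}
```

Wrapping the whole definition in an extern "C" { } block would work equally well; the forward declaration form is shown here because it leaves the implementation itself untouched.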

A Matter of Timing

In order for time-related functionality to work as defined in <time.h>, there needs to be an underlying time base or at least counter from which we can obtain this information. The use of a systick counter which contains the number of milliseconds since boot is not enough to cover e.g. time(), which requires the number of seconds since the Unix epoch.
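To make the gap between a boot-relative counter and epoch time concrete, here is a small sketch of turning a calendar timestamp, as a clock with date support might provide it, into seconds since the Unix epoch. The function names are made up for this illustration, the day-count arithmetic follows the well-known "days from civil" approach described by Howard Hinnant rather than anything in Newlib or Nodate, and the input is assumed to be in UTC.

```cpp
#include <cstdint>

// Days between 1970-01-01 and the given (proleptic Gregorian) date.
// The year is shifted so that it starts in March, which pushes the
// leap day to the end of the year and simplifies the month arithmetic.
int64_t days_from_civil(int64_t year, unsigned month, unsigned day) {
    year -= (month <= 2) ? 1 : 0;
    const int64_t era = (year >= 0 ? year : year - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(year - era * 400);  // [0, 399]
    const unsigned doy =
        (153 * (month > 2 ? month - 3 : month + 9) + 2) / 5 + day - 1;  // [0, 365]
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;  // [0, 146096]
    return era * 146097 + static_cast<int64_t>(doe) - 719468;
}

// Seconds since the Unix epoch for a UTC calendar timestamp.
int64_t unix_seconds(int64_t year, unsigned month, unsigned day,
                     unsigned hour, unsigned minute, unsigned second) {
    return days_from_civil(year, month, day) * 86400
         + static_cast<int64_t>(hour) * 3600
         + static_cast<int64_t>(minute) * 60
         + static_cast<int64_t>(second);
}
```

A millisecond systick counter has no way to supply the year, month and day inputs, which is why some calendar-aware time source is needed.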

A possible implementation of the underlying int _times(struct tms* buf) syscall would use the system’s real-time clock (RTC). Strictly speaking, in Newlib _times() is what backs clock(), while time() itself is routed through _gettimeofday(), but the same RTC can serve as the time source for both. Compared to a systick, the RTC has the major benefit that it can be left running in low-power mode, allowing for accurate timing results even when the system is regularly put into sleep mode or even turned off completely.

In Nodate, this functionality is implemented in clock.cpp for STM32, which enables the RTC if it hasn’t been started already:

int _times(struct tms* buf) {
#if defined RTC_TR_SU
	if (!rtc_pwr_enabled) {
		if (!Rtc::enableRTC()) { return -1; }
		rtc_pwr_enabled = true;
	}
	
	// Fill tms struct from RTC registers.
	// struct tms {
	//		clock_t tms_utime;  /* user time */
	//		clock_t tms_stime;  /* system time */
	//		clock_t tms_cutime; /* user time of children */
	//		clock_t tms_cstime; /* system time of children */
	//	};
	uint32_t tTR = RTC->TR;
	uint32_t ticks = (uint8_t) RTC_Bcd2ToByte(tTR & (RTC_TR_ST | RTC_TR_SU));
	ticks = ticks * SystemCoreClock;
	buf->tms_utime = ticks;
	buf->tms_stime = ticks;
	buf->tms_cutime = ticks;
	buf->tms_cstime = ticks;
	
	return ticks; // Return clock ticks.
#else
	// No usable RTC peripheral exists. Return -1.
	return -1;
#endif 
}

On STM32 Cortex-M-based MCUs (except STM32F1), the RTC’s registers contain the time count in BCD (Binary-Coded Decimal) format, which requires it to be converted to binary code to be compatible with any code using the _times() functionality:

uint8_t RTC_Bcd2ToByte(uint8_t Value) {
	uint32_t tmp = 0U;
	tmp = ((uint8_t)(Value & (uint8_t)0xF0) >> (uint8_t)0x4) * 10;
	return (tmp + (Value & (uint8_t)0x0F));
}
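As a worked example of the BCD conversion, the following sketch decodes a complete time register value into seconds since midnight. The field positions used here (seconds in bits 0 to 6, minutes in bits 8 to 14, hours in bits 16 to 21) are an assumption based on the RTC_TR layout of e.g. the STM32F4 family and should be checked against the reference manual for the target MCU. The function names are hypothetical, and the register value is passed in as a plain integer so that the code runs on a host.

```cpp
#include <cstdint>

// Convert one two-digit BCD value (tens in the high nibble, units in the
// low nibble) to its binary equivalent, e.g. 0x59 becomes 59.
constexpr uint8_t bcd_to_byte(uint8_t value) {
    return static_cast<uint8_t>(((value >> 4) * 10) + (value & 0x0F));
}

// Decode a BCD time register value into seconds since midnight.
// Assumed layout: seconds in bits 0-6, minutes in 8-14, hours in 16-21.
constexpr uint32_t rtc_tr_to_seconds(uint32_t tr) {
    return bcd_to_byte(static_cast<uint8_t>((tr >> 16) & 0x3F)) * 3600U
         + bcd_to_byte(static_cast<uint8_t>((tr >> 8) & 0x7F)) * 60U
         + bcd_to_byte(static_cast<uint8_t>(tr & 0x7F));
}
```

With this layout a register value of 0x123456 reads as 12:34:56, which works out to 12 * 3600 + 34 * 60 + 56 = 45296 seconds.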

Newlib for Microcontrollers

There are technically two versions of Newlib: one is the regular, full-fat library, the other is the low-fat, nano version, which was created by ARM explicitly for Cortex-M MCUs in 2013. A major disadvantage of the regular Newlib is that it takes up a fair amount of space. Especially on smaller MCUs with limited flash storage and SRAM, this means that even a simple ‘Hello World’ compiled against it may be too big to fit.

When compiling with GCC for an MCU platform such as STM32 or SAM, the compiler can be instructed to link against Newlib-nano by adding --specs=nano.specs to the linker command. This specs file essentially ensures that the project is linked against the Newlib-nano library and uses the appropriate header files.

As noted in the linked ARM article, the size difference between regular Newlib and the nano version is quite dramatic. For a project targeting a low-end Cortex-M0 MCU, such as the STM32F030F4 with a grand total of 16 kB of flash and 4 kB of SRAM, using regular Newlib is impossible, as the resulting firmware image will fill up the flash and then some. With Newlib-nano, the basic demonstration projects provided with Nodate (e.g. Blinky, Pushy) are about 2 kB in size and thus fit comfortably in flash and RAM.

Here one can accomplish further space savings by swapping the default printf() implementation that comes with Newlib for an optimized one, such as mpaland’s printf implementation. This implementation has been used together with Nodate to get full printf() support even on these small Cortex-M0 MCUs.

Keep it Pure

When developing for more resource-restricted microcontrollers, literally every byte counts. As most of those MCUs are single-core systems, they do not need the multi-threading support that would be convenient on multi-core systems (e.g. the STM32H7 family), and with it the reentrant code that this support relies on. Whether reentrant code is being linked in can easily be observed by inspecting the map file for a project after building it.

When one sees entries such as impure_data and similar symbol entries, often contained in lib_a-impure.o or similar, reentrant code is being linked in, which can cost kilobytes of space in the worst case. Often this code is linked in because of certain functionality that is being used in the project’s code, but can also be from e.g. the atexit() handler. An explanation of this reentrancy feature can be found in the Newlib documentation.

Analyzing the map file directly or using a tool such as MapViewer (Windows-only) can help with tracking down those dependencies. One suggestion is to add the flag -fno-use-cxa-atexit to the GCC compile flags, so that it doesn’t use the reentrant version of the exit handler.

Wrapping Up

All of this obviously covers merely the basics of what it takes to integrate and use Newlib. There are a lot more syscall stubs that weren’t covered here, and the file handling API is worthy of an article by itself. The dynamics of Newlib also change when moving from a bare metal system, as covered in this article, to one in which an OS is present.

Process management is another topic covered by the getpid(), fork() and other stub functions. While it does seem somewhat convoluted to have to implement one’s own code here even for basic functionality like printf(), it also highlights the strength of this method, in that it is extremely flexible and can fairly easily be adapted to any platform. This is why Newlib works largely unchanged from resource-limited single-core Cortex-M MCUs all the way up to large multi-core systems including game consoles.

Books image: “Public Libraries in Wales / Llyfrgelloedd Cyhoeddus yng Nghymru” by National Assembly For Wales / Cynulliad Cymru is licensed under CC BY 2.0