Chapter 20: Microcontrollers
A microcontroller unit (MCU) integrates a CPU, program memory, data memory, and a rich set of peripherals on a single chip. From the 8-bit ATmega328P on the Arduino Uno to 32-bit ARM Cortex-M7 devices, MCUs are the brains of most embedded systems: motor controllers, IoT sensors, industrial PLCs, and medical devices. This chapter covers architecture, peripherals, interrupt-driven programming, and real-time operating systems.
Microcontroller Block Diagram
20.1 Harvard vs Von Neumann Architecture
Harvard Architecture
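Separate address spaces and buses for program (instruction) memory and data memory. The CPU can fetch the next instruction while reading or writing data, so instruction fetch overlaps with data access instead of competing for one bus. Used in most MCUs: AVR and PIC are classic Harvard designs, while ARM Cortex-M3/M4/M7 cores use separate instruction and data buses over a single address map (a modified Harvard arrangement).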
- + Simultaneous instruction fetch and data access
- + Fixed instruction width simplifies pipelining
- − Two separate memory bus interfaces
- − Cannot execute code from RAM (typically)
Von Neumann Architecture
A single unified memory space shared by instructions and data over one bus. Simple to implement; the bottleneck is the shared bus (the "Von Neumann bottleneck"). Used in x86/x64 PCs, ARM Cortex-A application processors.
- + Flexible: code and data share address space
- + Self-modifying code / JIT compilation possible
- − Bus contention between instruction fetch and data
- − Cache hierarchy needed to hide latency
20.2 Memory: Flash, SRAM, EEPROM
Flash
Non-volatile, stores program code. Erased in sectors (512 B to 128 KB). Write endurance is roughly 10 000 to 100 000 cycles. Arduino Uno: 32 KB; STM32F4: up to 2 MB. Wear levelling is needed for frequent writes.
SRAM
Volatile, fast random access, stores stack, heap, and global variables. Arduino Uno: 2 KB; STM32F4: 192 KB. Stack overflow is a common MCU bug, so always monitor free stack space.
EEPROM
Non-volatile, byte-level erasable, stores configuration data (calibration, device ID). Typically 256 B to 4 KB on an MCU; slow write (~3 ms/byte). Many modern MCUs emulate EEPROM in Flash using wear-levelling libraries.
20.3 Peripherals
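GPIO: General Purpose I/O
Digital input or output pins, individually configured. Open-drain mode allows wired-AND connections, where any device can pull the shared line low (sometimes described as active-low wired-OR). Pull-up/pull-down resistors are built in and enabled in software on most parts. Typical sink/source current: 8 to 25 mA per pin.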
UART: Universal Asynchronous Receiver/Transmitter
Asynchronous serial: start bit, 8 data bits, optional parity, 1 or 2 stop bits. Common baud rates: 9600, 115200, 1 Mbps. There is no clock line, so the baud rate must match on both ends. Used for debug consoles, GPS modules, GSM modems.
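The framing described above fixes how many bits each character costs on the wire, which in turn sets the usable throughput and how closely the two baud rates must agree. The following is a minimal sketch of that arithmetic; the function names are illustrative, and the mismatch limit assumes an idealised receiver that samples each bit once at its centre (real UARTs oversample and usually need a tighter tolerance).

```python
def uart_frame_bits(data_bits=8, parity=False, stop_bits=1):
    """Bits on the wire per character: start + data + optional parity + stop."""
    return 1 + data_bits + (1 if parity else 0) + stop_bits


def uart_bytes_per_second(baud, data_bits=8, parity=False, stop_bits=1):
    """Peak payload rate when frames are sent back to back."""
    return baud / uart_frame_bits(data_bits, parity, stop_bits)


def max_clock_mismatch(data_bits=8, parity=False):
    """Idealised limit on the combined TX/RX baud-rate error (as a fraction).

    Assumption: the receiver resynchronises on the start edge and samples
    each bit at its centre, so the first stop bit is sampled (N - 0.5) bit
    times later and the accumulated drift must stay below half a bit.
    """
    bits_to_stop = 1 + data_bits + (1 if parity else 0) + 1
    return 0.5 / (bits_to_stop - 0.5)


print(uart_frame_bits())               # 10 bits for an 8N1 frame
print(uart_bytes_per_second(115200))   # 11520.0 bytes per second
print(round(max_clock_mismatch() * 100, 2), "% combined mismatch (ideal)")
```

At 115200 baud an 8N1 link therefore carries at most 11 520 payload bytes per second, because two of every ten bits are framing overhead.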
SPI: Serial Peripheral Interface
Synchronous: SCLK, MOSI, MISO, CS#. Full-duplex. Speeds up to 50 Mbps. Multiple slaves via separate CS lines. Used for SD cards, displays, DACs, ADCs, flash memory.
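Full-duplex operation follows from the way SPI is usually built: master and slave each hold a shift register, and every clock moves one bit in each direction. The sketch below simulates one 8-bit exchange under assumed settings (MSB first, one bit per clock); `spi_transfer` is a hypothetical helper for illustration, not a driver API.

```python
def spi_transfer(master_byte, slave_byte):
    """Simulate one 8-clock SPI exchange, MSB first.

    Returns (byte received by the master, byte received by the slave).
    """
    master = master_byte & 0xFF
    slave = slave_byte & 0xFF
    for _ in range(8):
        mosi = (master >> 7) & 1                 # master drives its MSB on MOSI
        miso = (slave >> 7) & 1                  # slave drives its MSB on MISO
        master = ((master << 1) & 0xFF) | miso   # master shifts MISO in
        slave = ((slave << 1) & 0xFF) | mosi     # slave shifts MOSI in
    return master, slave


rx_master, rx_slave = spi_transfer(0xA5, 0x3C)
print(hex(rx_master), hex(rx_slave))  # 0x3c 0xa5
```

After eight clocks the two registers have swapped contents, which is why a master must send a dummy byte whenever it only wants to read.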
I2C: Inter-Integrated Circuit
Two-wire (SDA, SCL) multi-master/multi-slave bus. 7-bit addressing gives 128 addresses, of which 112 remain for devices once the reserved ranges are excluded. 100 kHz standard mode, 400 kHz fast mode, 1 MHz fast-mode plus. Requires pull-up resistors on both lines. Used for sensors (IMU, pressure, humidity), RTCs, EEPROMs.
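On the wire, a 7-bit address is sent in the upper seven bits of the first byte, with the read/write flag in bit 0. The sketch below builds that byte and counts the addresses left after excluding the two reserved groups defined by the I2C specification (0000xxx and 1111xxx). The device address 0x68 is only an example value, and the helper names are illustrative.

```python
def i2c_address_byte(addr7, read):
    """First byte of an I2C transaction: 7-bit address, then R/W in bit 0."""
    if not 0 <= addr7 <= 0x7F:
        raise ValueError("7-bit address must be in the range 0x00..0x7F")
    return (addr7 << 1) | (1 if read else 0)


def is_reserved(addr7):
    """True for the reserved groups 0000xxx and 1111xxx."""
    return (addr7 >> 3) in (0b0000, 0b1111)


usable = [a for a in range(128) if not is_reserved(a)]

print(hex(i2c_address_byte(0x68, read=False)))  # 0xd0 (write)
print(hex(i2c_address_byte(0x68, read=True)))   # 0xd1 (read)
print(len(usable), hex(usable[0]), hex(usable[-1]))  # 112 0x8 0x77
```

This is why datasheets sometimes quote an "8-bit address" such as 0xD0: it is the same 7-bit address 0x68 shifted left by one.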
ADC: Analog-to-Digital Converter
Samples an analog voltage and produces a digital code. Resolution ranges from 10-bit (Arduino Uno) to 16-bit (some STM32 parts). Most MCU ADCs use a successive-approximation register (SAR) architecture. Nyquist: sample rate ≥ 2× signal bandwidth. An anti-aliasing filter is required before the ADC input.
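The relationship between input voltage, reference voltage, and output code can be sketched for an ideal N-bit converter, together with the frequency at which an under-sampled tone reappears. This is an idealised model (no offset, gain, or noise errors), and the 5 V reference and the function names are assumptions chosen for illustration.

```python
def adc_lsb(v_ref, bits):
    """Voltage step represented by one code (one LSB)."""
    return v_ref / (2 ** bits)


def adc_code(v_in, v_ref, bits):
    """Ideal conversion: floor(v_in / v_ref * 2^bits), clamped to the code range."""
    code = int(v_in / v_ref * (2 ** bits))
    return max(0, min(code, 2 ** bits - 1))


def adc_voltage(code, v_ref, bits):
    """Voltage at the lower edge of the given code."""
    return code * v_ref / (2 ** bits)


def alias_frequency(f_signal, f_sample):
    """Apparent frequency of a tone after sampling at f_sample."""
    return abs(f_signal - f_sample * round(f_signal / f_sample))


print(adc_lsb(5.0, 10))             # 0.0048828125 V per code
print(adc_code(2.5, 5.0, 10))       # 512
print(adc_voltage(512, 5.0, 10))    # 2.5
print(alias_frequency(900, 1000))   # 100: a 900 Hz tone looks like 100 Hz
```

The last line shows why the anti-aliasing filter matters: once a 900 Hz component has been sampled at 1 kHz, it cannot be told apart from a genuine 100 Hz signal.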
PWM: Pulse Width Modulation
Timer-generated digital square wave with variable duty cycle D = t_on / T. Used to control motor speed, LED brightness, servo position. Effective analog voltage: V_avg = D × V_supply. Resolution: 8 to 16 bits.
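The duty-cycle formulas above translate directly into timer settings. The sketch below assumes a simple up-counting timer whose period is a whole number of timer clocks; the 16 MHz clock and 62.5 kHz PWM frequency are example values, and the function names are illustrative rather than taken from any vendor library.

```python
import math


def pwm_average_voltage(duty, v_supply):
    """V_avg = D * V_supply for an ideal (fully filtered) PWM output."""
    return duty * v_supply


def pwm_compare_value(duty, period_counts):
    """Timer compare value that gives the requested duty cycle."""
    return round(duty * period_counts)


def pwm_resolution_bits(f_timer_hz, f_pwm_hz):
    """Effective resolution: log2 of the number of timer counts per period."""
    return math.log2(f_timer_hz / f_pwm_hz)


print(pwm_average_voltage(0.25, 5.0))          # 1.25 V
print(pwm_compare_value(0.25, 256))            # 64 counts high out of 256
print(pwm_resolution_bits(16_000_000, 62_500)) # 8.0 bits
```

Resolution and PWM frequency trade off against each other: with a fixed timer clock, doubling the PWM frequency halves the counts per period and costs one bit.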
20.4 Interrupt Handling
An interrupt suspends the main program, saves CPU state (registers + PC on stack), executes an Interrupt Service Routine (ISR), then restores state and resumes. The Nested Vectored Interrupt Controller (NVIC) in ARM Cortex-M supports up to 240 external interrupts with 8 to 256 priority levels and hardware tail-chaining for back-to-back ISRs.
ISR design rules:
- Keep ISRs short: do minimal work (set a flag, push to a queue) and process in the main loop.
- Declare variables shared between ISR and main code as volatile to prevent compiler optimisation from caching them in a register.
- Disable interrupts around multi-byte variable reads in main code (cli()/sei() on AVR; __disable_irq()/__enable_irq() on ARM).
- Never call blocking functions (delay, print, malloc) inside an ISR.
- Latency: ARM Cortex-M0 = 16 cycles; Cortex-M4 = 12 cycles from interrupt assertion to first ISR instruction.
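The rule about multi-byte reads is easiest to see with a concrete "torn read". The sketch below is a host-side simulation, not MCU code: it models a 16-bit counter stored as two bytes on an 8-bit CPU, with an ISR that increments it. The class and function names are hypothetical.

```python
class Counter16:
    """A 16-bit value held as two separate bytes, as on an 8-bit MCU."""

    def __init__(self, value):
        self.lo = value & 0xFF
        self.hi = (value >> 8) & 0xFF

    def isr_increment(self):
        """What the ISR does: increment the full 16-bit value."""
        value = (((self.hi << 8) | self.lo) + 1) & 0xFFFF
        self.lo = value & 0xFF
        self.hi = (value >> 8) & 0xFF


def read_unprotected(counter, interrupt_between_bytes):
    """Main-loop read with interrupts enabled: the ISR may run mid-read."""
    lo = counter.lo
    if interrupt_between_bytes:
        counter.isr_increment()      # ISR fires between the two byte reads
    hi = counter.hi
    return (hi << 8) | lo


def read_protected(counter, interrupt_pending):
    """Main-loop read with interrupts disabled around the copy."""
    lo, hi = counter.lo, counter.hi  # both bytes copied atomically
    if interrupt_pending:
        counter.isr_increment()      # pending ISR runs after re-enabling
    return (hi << 8) | lo


print(hex(read_unprotected(Counter16(0x00FF), True)))  # 0x1ff
print(hex(read_protected(Counter16(0x00FF), True)))    # 0xff
```

The unprotected read returns 0x01FF, a value the counter never held (it went from 0x00FF to 0x0100). The protected read returns the consistent old value and picks up the increment on the next read.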
20.5 Arduino Ecosystem & ARM Cortex-M
Arduino
The Arduino platform abstracts AVR and ARM hardware behind a C++ API: pinMode(), digitalWrite(), analogRead(), Serial.print(). The Uno uses an ATmega328P (8-bit AVR, 16 MHz, 32 KB Flash, 2 KB SRAM). The Due/Zero/MKR series use 32-bit ARM Cortex-M cores at 48 to 84 MHz.
The setup() function runs once; loop() runs repeatedly. Wire (I2C), SPI, and Servo libraries provide peripheral abstraction. Over 10 000 third-party libraries available via the Library Manager.
ARM Cortex-M & STM32
The ARM Cortex-M family spans M0 (tiny, low-power) through M7 (DSP, FPU, 400+ MHz). STM32 (STMicroelectronics) offers over 1 000 MCU variants. The STM32F4 Discovery (Cortex-M4, 168 MHz, 1 MB Flash, 192 KB SRAM) includes a single-precision hardware FPU, and the Cortex-M4 DSP instructions provide single-cycle multiply-accumulate for DSP and motor control algorithms.
HAL (Hardware Abstraction Layer) and CMSIS (Cortex Microcontroller Software Interface Standard) provide portable driver APIs. STM32CubeMX generates initialisation code and FreeRTOS project scaffolding.
20.6 Real-Time Operating Systems: FreeRTOS
A Real-Time Operating System (RTOS) provides deterministic task scheduling, inter-task communication, and resource management. FreeRTOS is one of the most widely deployed RTOSes, running on hundreds of MCU families and supported directly by AWS (Amazon FreeRTOS).
Tasks & Scheduler
Each task is a C function running in an infinite loop with its own stack. The scheduler (preemptive, priority-based) runs on a periodic tick, typically 1 ms, generated by the SysTick timer on Cortex-M ports. Higher-priority tasks preempt lower ones. Cooperative scheduling is also available.
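Priority-based preemption can be sketched as a tick-by-tick simulation: at every tick the highest-priority ready task runs. This is a simplified host-side model, not the FreeRTOS scheduler itself; it ignores time slicing between equal priorities and blocking. The task names and timings are hypothetical, and a higher number means higher priority, as in FreeRTOS.

```python
def simulate_scheduler(tasks, ticks):
    """Return which task runs on each tick.

    tasks: list of (name, priority, release_tick, run_ticks).
    A task becomes ready at release_tick and needs run_ticks of CPU time.
    """
    remaining = {name: run for name, _, _, run in tasks}
    timeline = []
    for t in range(ticks):
        ready = [(prio, name) for name, prio, release, _ in tasks
                 if release <= t and remaining[name] > 0]
        if ready:
            _, name = max(ready)       # highest priority wins the tick
            remaining[name] -= 1
            timeline.append(name)
        else:
            timeline.append("idle")    # nothing ready: idle task runs
    return timeline


tasks = [("logger", 1, 0, 4),    # low priority, ready from tick 0
         ("control", 3, 2, 2)]   # high priority, becomes ready at tick 2
print(simulate_scheduler(tasks, 8))
# ['logger', 'logger', 'control', 'control', 'logger', 'logger', 'idle', 'idle']
```

The logger is interrupted at tick 2 as soon as the control task becomes ready, and resumes only after the control task has finished.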
Queues
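Thread-safe FIFO buffers for inter-task communication. Producer tasks block when the queue is full; consumer tasks block when it is empty. Used to pass sensor readings from an ISR to a processing task without shared-variable races; because an ISR must never block, it uses the non-blocking xQueueSendFromISR() variant.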
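The blocking and non-blocking behaviour of a bounded queue can be tried on a PC using Python's standard `queue.Queue` as a stand-in for an RTOS queue. This is an analogy only, not the FreeRTOS API; `isr_send` and `task_receive` are hypothetical helper names, and the queue length of 3 is arbitrary.

```python
import queue


def isr_send(q, sample):
    """Non-blocking enqueue, as an ISR must do; False means the queue was full."""
    try:
        q.put_nowait(sample)
        return True
    except queue.Full:
        return False


def task_receive(q, timeout_s):
    """Blocking dequeue with a timeout, as a consumer task would do."""
    try:
        return q.get(timeout=timeout_s)
    except queue.Empty:
        return None


readings = queue.Queue(maxsize=3)
accepted = [isr_send(readings, s) for s in (10, 11, 12, 13)]
received = [task_receive(readings, 0.01) for _ in range(4)]
print(accepted)  # [True, True, True, False]
print(received)  # [10, 11, 12, None]
```

The fourth sample is dropped because the queue is full and the sender may not wait, so the queue length has to be sized for the worst-case burst between consumer runs.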
Semaphores & Mutexes
Binary semaphore: signal/wait (ISR to task synchronisation). Counting semaphore: resource pool. Mutex: mutual exclusion with priority inheritance to prevent priority inversion, a critical RTOS concept.
Stack & Heap
Each task has its own stack (typically 256 to 2048 words). The FreeRTOS heap (heap_4.c) provides pvPortMalloc()/vPortFree() and coalesces adjacent free blocks to limit fragmentation. uxTaskGetStackHighWaterMark() reports the minimum free stack space a task has had since it started.
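A high-water mark is typically obtained by "painting" the stack with a known byte pattern at task creation and later counting how much of the pattern is still intact. The sketch below simulates that idea on a byte array. It is a simplified model that counts bytes (the FreeRTOS function reports words), and the helper names are hypothetical. 0xA5 is the fill value FreeRTOS uses.

```python
FILL = 0xA5  # stack fill pattern


def new_stack(size):
    """A freshly painted stack of `size` bytes."""
    return bytearray([FILL] * size)


def use_stack(stack, depth):
    """Simulate a descending stack that grows `depth` bytes from the top."""
    for i in range(len(stack) - depth, len(stack)):
        stack[i] = 0x00  # any non-fill data written by the task


def high_water_mark(stack):
    """Untouched fill bytes at the low end = minimum free space so far."""
    free = 0
    for byte in stack:
        if byte != FILL:
            break
        free += 1
    return free


stack = new_stack(256)
for depth in (100, 180, 50):   # stack usage at three points in time
    use_stack(stack, depth)
print(high_water_mark(stack))  # 76: the deepest use (180) leaves 76 bytes
```

The result reflects the deepest excursion, not the current usage, which is what makes it useful for sizing task stacks during development.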
20.7 PID Control: Mathematical Foundation
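The PID controller is the most widely used control algorithm in embedded systems. The continuous-time output \(u(t)\) given error \(e(t) = r(t) - y(t)\) is:
\[ u(t) = K_p\, e(t) + K_i \int_0^t e(\sigma)\, d\sigma + K_d \frac{de(t)}{dt} \]
where \(r(t)\) is the setpoint, \(y(t)\) is the measured output, and \(K_p\), \(K_i\), \(K_d\) are the proportional, integral, and derivative gains.
In discrete time with sample period \(T_s\) (Euler approximation):
\[ u[k] = K_p\, e[k] + K_i T_s \sum_{j=0}^{k} e[j] + K_d \frac{e[k] - e[k-1]}{T_s} \]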
Python: PID Temperature Control Simulation
The simulation models a first-order thermal plant (τ = 60 s) controlled by a discrete-time PID loop. The left panel shows the temperature response for five controller configurations covering P-only, PI, and tuned PID. The right panel plots steady-state error vs Kp, demonstrating that integral action (Ki > 0) drives the error to zero regardless of gain.
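A minimal sketch of such a simulation, without the plots, is given below. Only τ = 60 s comes from the text; the plant gain of 1, the 20 °C ambient temperature, the 50 °C setpoint, the 1 s sample period, and the example gains are all assumptions chosen for illustration. The actuator is assumed to be unsaturated, and the plant is integrated with a forward Euler step.

```python
def simulate_pid(kp, ki=0.0, kd=0.0, setpoint=50.0, ambient=20.0,
                 tau=60.0, plant_gain=1.0, ts=1.0, duration=600.0):
    """Discrete PID loop around a first-order thermal plant.

    Plant model (assumed): tau * dT/dt = -(T - ambient) + plant_gain * u
    Returns the temperature after each sample period.
    """
    temp = ambient
    integral = 0.0
    prev_error = setpoint - temp   # avoids a derivative kick on step 0
    history = []
    for _ in range(int(duration / ts)):
        error = setpoint - temp
        integral += error * ts                    # Ts * sum of e[j]
        derivative = (error - prev_error) / ts    # (e[k] - e[k-1]) / Ts
        u = kp * error + ki * integral + kd * derivative
        prev_error = error
        # Forward Euler step of the plant model
        temp += (ts / tau) * (-(temp - ambient) + plant_gain * u)
        history.append(temp)
    return history


def steady_state_error(history, setpoint=50.0):
    """Error remaining at the end of the run."""
    return setpoint - history[-1]


for label, gains in [("P", dict(kp=4.0)),
                     ("PI", dict(kp=4.0, ki=0.1)),
                     ("PID", dict(kp=4.0, ki=0.1, kd=5.0))]:
    err = steady_state_error(simulate_pid(**gains))
    print(f"{label:>3}: steady-state error = {err:.3f} degC")
```

With these assumed values the P-only loop settles about 6 °C below the setpoint, which matches the analytical result (setpoint − ambient) / (1 + Kp) = 30 / 5, while the PI and PID loops settle at the setpoint.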