A watchdog timer (WDT) is a hardware timer used in many embedded systems to detect software failures and reset the system when they occur. Its main purpose is to bring the system back to a known good state if the software crashes or gets stuck in an infinite loop. Let’s take a closer look at how watchdog timers work and the different architectures used.
What is a Watchdog Timer?
A WDT is essentially a timer that needs to be periodically refreshed by the main program. If the main program fails to refresh the WDT before the timer expires, the WDT triggers a reset of the system. This serves as a fail-safe so that a crash or hang does not leave the system stuck indefinitely.
The WDT is implemented in hardware and continues operating regardless of what the software is doing. Typically the main program just needs to write a value to a WDT refresh register at intervals shorter than the timeout period, which reloads the timer each time. If the timer ever reaches the timeout without being refreshed, the reset occurs.
Why Use a Watchdog Timer?
There are a few key reasons why a WDT is important for many embedded devices:
- Recover from software crashes – The WDT acts as an independent supervisor that can reset the system if the software malfunctions.
- Detect infinite loops – Code bugs that lead to infinite loops are detected when the refresh stops arriving and the timeout expires.
- System level resets – The WDT can usually be wired or configured to reset the whole system, peripherals included, rather than only the processor.
- Reduce downtime – Catching crashes quickly via the WDT leads to lower downtime.
- Improve reliability – Unattended or remote devices keep running without someone on site to power-cycle them.
Without a WDT, small software failures could make the system unusable until manually restarted. The WDT allows embedded systems to automatically recover from these situations.
Watchdog Timer Operation
The basic WDT operation involves a timer that counts down from a configured timeout period. The software must periodically service the WDT by writing a refresh value before the timer reaches zero. If the refresh does not occur in time, the WDT will trigger and reset the system.
The act of refreshing the WDT is sometimes referred to as “kicking the dog” or “petting the dog”. Most systems will have a dedicated watchdog timer module or integrated watchdog capability. The refresh operation is usually very simple, involving just writing a value to a specific register.
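The countdown-and-kick behavior described above can be modelled in a few lines. This is a minimal behavioral sketch in Python, not driver code for any real device: the names `Watchdog`, `kick`, `tick`, and `first_reset_tick` are hypothetical, and time is measured in abstract ticks.

```python
from typing import Optional


class Watchdog:
    """Behavioral model of a countdown watchdog (a simulation, not a driver)."""

    def __init__(self, timeout_ticks: int) -> None:
        if timeout_ticks <= 0:
            raise ValueError("timeout_ticks must be positive")
        self.timeout_ticks = timeout_ticks
        self.counter = timeout_ticks  # counts down toward zero

    def kick(self) -> None:
        """Refresh: reload the counter with the full timeout."""
        self.counter = self.timeout_ticks

    def tick(self) -> bool:
        """Advance one timer tick; return True if the counter reached zero."""
        self.counter -= 1
        return self.counter <= 0


def first_reset_tick(
    wdt: Watchdog,
    total_ticks: int,
    kick_every: int,
    hang_at: Optional[int] = None,
) -> Optional[int]:
    """Simulate software that kicks every `kick_every` ticks.

    If `hang_at` is given, the software stops kicking from that tick on.
    Returns the tick at which the watchdog would reset the system, or None.
    """
    for t in range(1, total_ticks + 1):
        hung = hang_at is not None and t >= hang_at
        if not hung and t % kick_every == 0:
            wdt.kick()
        if wdt.tick():
            return t
    return None


healthy = first_reset_tick(Watchdog(10), total_ticks=30, kick_every=4)
hung = first_reset_tick(Watchdog(10), total_ticks=30, kick_every=4, hang_at=13)
```

With a 10-tick timeout and a kick every 4 ticks, the healthy run never resets. When the simulated software hangs at tick 13, the last kick was at tick 12 and the model reports the reset at tick 21.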
The timeout period needs to be chosen carefully. It should be short enough that a software malfunction is detected quickly, but longer than the worst-case interval between refreshes during normal operation, with some margin, so that healthy software never trips it.
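As a rough illustration of that trade-off, the sketch below derives a timeout from a measured worst-case refresh interval and converts it into a counter reload value. The function names are hypothetical, and the clock, prescaler, and counter width in the example (32 kHz, divide-by-32, 12-bit) are assumed values, not taken from any particular device. Integer arithmetic is used so the rounding is predictable.

```python
def timeout_with_margin_ms(worst_case_loop_ms: int, margin_percent: int) -> int:
    """Timeout = worst-case refresh interval plus a safety margin, rounded up."""
    scaled = worst_case_loop_ms * (100 + margin_percent)
    return -(-scaled // 100)  # ceiling division


def reload_value(timeout_ms: int, clock_hz: int, prescaler: int, max_reload: int) -> int:
    """Smallest reload count whose countdown lasts at least `timeout_ms`.

    The counter is assumed to decrement at clock_hz / prescaler.
    """
    ticks = -(-(timeout_ms * clock_hz) // (1000 * prescaler))  # ceiling division
    if not 1 <= ticks <= max_reload:
        raise ValueError(f"{ticks} ticks does not fit in 1..{max_reload}")
    return ticks


# Assumed example: the main loop takes at most 20 ms, and we allow 100% margin.
timeout_ms = timeout_with_margin_ms(worst_case_loop_ms=20, margin_percent=100)
reload = reload_value(timeout_ms, clock_hz=32_000, prescaler=32, max_reload=4095)
```

Here the timeout works out to 40 ms, and at 1000 counter ticks per second that is a reload value of 40. A timeout that does not fit in the counter raises an error, which is a hint to pick a larger prescaler.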
Startup also needs consideration. On many devices the WDT stays disabled after reset until software enables it, while on others it starts automatically with a default timeout. Either way, initialization must either finish or refresh the WDT along the way before the first timeout expires.
Watchdog Timer Architecture
There are several design architecture choices when incorporating a WDT module:
Simple Standalone Module
In this basic architecture, the WDT module sits alongside the main processor and other system modules. It includes the watchdog timer circuitry and reset generation logic. The processor pings the WDT module periodically to keep refreshing it. If the WDT expires, it drives the reset line to restart the system.
Integrated Architecture
Many modern microcontrollers integrate the WDT on the same chip as the CPU, as a peripheral alongside the processor core. This simplifies the design and eliminates the need for a separate WDT component. The integrated WDT runs independently of the software executing on the core and resets it on timeout. Depending on the device, peripherals and co-processors may be reset as well.
Dual Watchdog Architecture
Rather than a single WDT module, there can be both a standalone WDT and an integrated WDT within the processor. This provides redundancy in the reset monitoring. One WDT can cover the processor while the other covers peripheral modules or does system level monitoring. Having two watchdogs makes the reset fail-safe even stronger.
Windowed Watchdog
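A windowed WDT adds logic that accepts a refresh only within a defined window of the timeout period. The window typically opens some time after the previous refresh and closes when the timeout expires. A refresh that arrives before the window opens is treated as a fault and triggers a reset, just as a missing refresh does. This catches software that refreshes too early or too often, for example a runaway loop that happens to contain the refresh call. Windowing provides further assurance of proper software operation.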
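A small behavioral model makes the window rule concrete. It assumes the common scheme in which a refresh is valid only after a minimum number of ticks has elapsed since the previous one and before the timeout expires. The class and attribute names are hypothetical, and real devices differ in how the window is configured.

```python
from typing import Optional


class WindowedWatchdog:
    """Model of a windowed watchdog: kicks are valid only inside the window."""

    def __init__(self, timeout_ticks: int, window_open_ticks: int) -> None:
        if not 0 <= window_open_ticks < timeout_ticks:
            raise ValueError("need 0 <= window_open_ticks < timeout_ticks")
        self.timeout_ticks = timeout_ticks
        self.window_open_ticks = window_open_ticks
        self.elapsed = 0  # ticks since the last accepted kick
        self.reset_reason: Optional[str] = None  # set once a reset is triggered

    def tick(self) -> None:
        """Advance one timer tick; a late or missing kick ends in a timeout."""
        if self.reset_reason is not None:
            return
        self.elapsed += 1
        if self.elapsed >= self.timeout_ticks:
            self.reset_reason = "timeout"

    def kick(self) -> None:
        """Refresh the timer; a kick before the window opens is a fault."""
        if self.reset_reason is not None:
            return
        if self.elapsed < self.window_open_ticks:
            self.reset_reason = "early refresh"
        else:
            self.elapsed = 0


wdt = WindowedWatchdog(timeout_ticks=10, window_open_ticks=6)
for _ in range(7):
    wdt.tick()
wdt.kick()                      # 7 ticks elapsed: inside the window, accepted
in_window_reason = wdt.reset_reason
for _ in range(2):
    wdt.tick()
wdt.kick()                      # only 2 ticks elapsed: too early
early_reason = wdt.reset_reason
```

In this run the first kick lands inside the window and is accepted, while the second arrives after only two ticks and is flagged as an early refresh.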
Software vs Hardware Watchdogs
While most WDT solutions are hardware based for robustness, WDT capability can also be implemented in software, for example as a system management routine that periodically checks that the rest of the software is still making progress. However, a software WDT can itself be taken down by the failures it is meant to detect, such as a locked-up processor or disabled interrupts, so hardware WDTs tend to be favored.
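One way to combine the two approaches, sketched here as an assumption rather than something this article prescribes, is a software monitor that refreshes the hardware WDT only when every monitored task has recently checked in. If any task stalls, the refresh stops and the hardware WDT eventually resets the system. The `TaskMonitor` class and the `kick_hardware` callback are hypothetical names.

```python
from typing import Callable, Dict, Iterable


class TaskMonitor:
    """Software-level monitor that gates the hardware watchdog refresh."""

    def __init__(self, task_names: Iterable[str], kick_hardware: Callable[[], None]) -> None:
        self._alive: Dict[str, bool] = {name: False for name in task_names}
        self._kick_hardware = kick_hardware  # stand-in for the real refresh write

    def check_in(self, name: str) -> None:
        """Called by a task each time it completes a healthy iteration."""
        if name not in self._alive:
            raise KeyError(f"unknown task: {name}")
        self._alive[name] = True

    def supervise(self) -> bool:
        """Called periodically; kicks only if all tasks checked in since the last kick."""
        if not all(self._alive.values()):
            return False  # withhold the kick and let the hardware WDT expire
        self._kick_hardware()
        for name in self._alive:
            self._alive[name] = False  # require fresh check-ins next period
        return True


kicks = []
monitor = TaskMonitor(["sensor", "comms"], kick_hardware=lambda: kicks.append("kick"))
monitor.check_in("sensor")
first = monitor.supervise()     # "comms" has not checked in yet
monitor.check_in("comms")
second = monitor.supervise()    # both tasks alive: hardware WDT is kicked
third = monitor.supervise()     # flags were cleared, so no kick this time
```

The design choice worth noting is that the monitor never resets anything itself. It only withholds the refresh, so the final decision stays with the hardware.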
Watchdog Usage in ARM Cortex MCUs
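Microcontroller units (MCUs) built around ARM Cortex-M processor cores almost always include a WDT, but it is an on-chip peripheral added by the chip vendor rather than part of the core itself. Watchdog features therefore depend on the specific device, not on whether it uses a Cortex-M0+, Cortex-M3, Cortex-M4, Cortex-M7, Cortex-M23, or Cortex-M33 core. Features that commonly differ between devices include:
- Enablement – Whether the WDT is optional, started by software, or forced on from reset by a configuration option
- Windowing – Whether a windowed mode is available
- Timeout range – The shortest and longest timeouts that the counter and prescaler allow
- Clock source – Whether the WDT runs from its own oscillator, independent of the main system clock
- Safety support – Whether the WDT is intended to support functional safety requirements
Some devices provide more than one watchdog. Many STM32 parts, for example, include both an independent watchdog (IWDG) that runs from a dedicated low-speed oscillator and a window watchdog (WWDG) clocked from the APB bus clock.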
The watchdog architecture in ARM Cortex-M MCUs provides robust reset generation and system recovery capabilities. WDTs are crucial for many real-world embedded applications.
Compared with an external supervisor chip, an integrated WDT offers a simpler board design, a smaller hardware footprint, and often lower power consumption. Timeout ranges and operating modes vary from one Cortex-M based MCU to another, so the device reference manual is the place to check the details.
Real-World Examples
Here are some examples of how watchdog timers are utilized in commercial and industrial devices:
- Medical devices – WDTs ensure lifesaving devices like pacemakers reboot if software crashes.
- Automotive – WDT modules are used extensively in engine controllers, infotainment, and safety systems.
- Printers – WDTs reset print controller boards if the complex software stalls mid-job.
- Alarm systems – WDTs let the system recover from faults so that security protection resumes quickly.
- Firmware updates – If new firmware keeps crashing and triggering WDT resets, a bootloader can detect this and revert to the previous firmware.
- Robots – WDTs enable robot systems to self-recover in the field during autonomous operation.
In mission-critical and safety-related systems, additional redundancy such as dual watchdogs further improves reliability.
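The firmware-update case in the list above relies on logic outside the WDT itself, usually in a bootloader. The sketch below shows one possible scheme, under stated assumptions: the bootloader can tell whether the last reset came from the watchdog, it keeps a small record in storage that survives resets, and new firmware marks itself as confirmed once it has run successfully. All names and the limit of three resets are hypothetical.

```python
from dataclasses import dataclass

MAX_UNCONFIRMED_WDT_RESETS = 3  # hypothetical limit chosen for illustration


@dataclass
class BootState:
    """Values a bootloader would keep in storage that survives a reset."""
    active_slot: str = "new"
    fallback_slot: str = "old"
    confirmed: bool = False
    wdt_resets: int = 0


def select_slot(state: BootState, reset_was_watchdog: bool) -> str:
    """Bootloader decision at each boot: which firmware image to start."""
    if not state.confirmed:
        if reset_was_watchdog:
            state.wdt_resets += 1
        if state.wdt_resets >= MAX_UNCONFIRMED_WDT_RESETS:
            # The new image keeps tripping the watchdog: swap back to the old one.
            state.active_slot, state.fallback_slot = state.fallback_slot, state.active_slot
            state.confirmed = True
            state.wdt_resets = 0
    return state.active_slot


def confirm_firmware(state: BootState) -> None:
    """Called by the application once it has proven itself healthy."""
    state.confirmed = True
    state.wdt_resets = 0


state = BootState()
boots = [select_slot(state, reset_was_watchdog=False)]              # first boot after update
boots += [select_slot(state, reset_was_watchdog=True) for _ in range(3)]
```

In this run the new image is tried on the first boot and after the first two watchdog resets, and the third watchdog reset causes the bootloader to fall back to the old image.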
Conclusion
In summary, watchdog timers provide an indispensable layer of protection, monitoring and recovery for many embedded systems. The programmable timeout and periodic refresh requirement act as an elegant fail-safe mechanism. Integrated hardware WDT modules offer robust resets with very little software overhead, since the software only has to perform the refresh. Architectural choices depend on the level of reliability required, cost and other factors. As embedded software complexity increases, so does the need for watchdog protection to achieve higher functional safety and uptime.