EKF Lane Switching & Redundancy
Executive Summary
Modern flight controllers (like the Cube Orange) have multiple redundant IMUs (Accelerometers/Gyros). ArduPilot leverages this by running Multiple EKF Instances (Lanes) in parallel. Each Lane uses a different IMU. The autopilot constantly monitors the "Health" of each Lane and automatically switches to the best one if the primary sensor fails or produces inconsistent data.
Theory & Concepts
1. Voting Systems & Fail-Operational Logic
In high-reliability engineering, Redundancy alone is not enough; the system must also be able to Isolate the faulty unit. With only two sensors that disagree, there is no way to tell which one is right, so you can't fly safely. By using Three Lanes, ArduPilot can behave like a "Majority Vote" system. If one Lane (one IMU) fails, the other two still agree, so the system is "Fail-Operational": it continues to fly safely even after a hardware failure.
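The voting idea can be sketched in a few lines. This is a generic illustration of majority voting, not ArduPilot code; the function name `find_outlier` and the tolerance value are hypothetical. It shows why three sources allow the faulty one to be isolated while two do not.

```python
from typing import Optional, Sequence

def find_outlier(readings: Sequence[float], tolerance: float) -> Optional[int]:
    """Return the index of the single reading that disagrees with the rest.

    Returns None when all readings agree or when no majority can be formed
    (for example, with only two sources).
    """
    n = len(readings)
    if n < 3:
        return None  # two sources can disagree, but neither can be outvoted
    for i in range(n):
        others = [readings[j] for j in range(n) if j != i]
        others_agree = max(others) - min(others) <= tolerance
        i_disagrees = all(abs(readings[i] - o) > tolerance for o in others)
        if others_agree and i_disagrees:
            return i
    return None

# Three sources: the third one is outvoted by the other two.
print(find_outlier([0.10, 0.11, 0.95], tolerance=0.05))
# Two sources: they disagree, but there is no way to tell which is wrong.
print(find_outlier([0.10, 0.95], tolerance=0.05))
```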
2. Common Mode Failures
Lane switching protects against Sensor Failure (stuck gyro), but not against External Failure. For example, if you fly through a massive magnetic field, all three compasses might fail in the same way. This is a "Common Mode Failure." The EKF errorScore tracks the consistency of each lane; if all lanes have high error scores, the system knows the problem is external to the flight controller.
Architecture (The Engineer's View)
The logic is managed by the AP_NavEKF3 frontend.
1. Parallel Lanes
- Configuration: EK3_IMU_MASK determines which IMUs are used.
- Execution: Up to 3 independent EKF cores run simultaneously (NavEKF3_core). They all fuse the same GPS/Baro data but use different IMU data.
- Independence: Because they rely on different physical sensors, a mechanical failure (stuck gyro) or aliasing glitch in IMU1 will corrupt Lane 1 but NOT Lane 2.
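As a rough illustration of how a bitmask such as EK3_IMU_MASK maps to lanes, the sketch below decodes a mask value into IMU numbers. It assumes bit 0 selects the first IMU and that one lane is started per selected IMU; the helper name `imus_from_mask` is hypothetical and not part of ArduPilot.

```python
from typing import List

def imus_from_mask(mask: int, max_imus: int = 3) -> List[int]:
    """Return the 1-based IMU numbers selected by the bitmask.

    Assumption: bit 0 corresponds to IMU1, bit 1 to IMU2, and so on.
    """
    return [bit + 1 for bit in range(max_imus) if mask & (1 << bit)]

# Show which IMUs (and therefore how many lanes) a few mask values select.
for mask in (1, 3, 7):
    print(f"mask={mask}: IMUs {imus_from_mask(mask)}")
```

Under these assumptions, a mask of 3 would run two lanes and a mask of 7 would run three.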
2. The Error Score
Each core calculates a normalized Error Score (0.0 to 1.0+).
- Metric: It is derived from the Test Ratios of the sensor innovations (Velocity, Position, Height, Mag, Airspeed).
- Formula (simplified): Score = MAX(Vel_Innovation, Pos_Innovation, Hgt_Innovation).
- Meaning:
  - 0.0 - 0.5: Healthy.
  - > 1.0: The EKF is rejecting sensor data (inconsistent).
- Code Path: NavEKF3_core::errorScore().
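The simplified formula and the score bands can be written out as a small sketch. It follows only the simplified formula given here; the real NavEKF3_core::errorScore() may combine or weight the test ratios differently. The function names and the "marginal" label for the band between 0.5 and 1.0 are assumptions made for illustration.

```python
def error_score(vel_ratio: float, pos_ratio: float, hgt_ratio: float) -> float:
    """Simplified score: the worst of the three innovation test ratios."""
    return max(vel_ratio, pos_ratio, hgt_ratio)

def describe(score: float) -> str:
    """Map a score onto the bands described in the text."""
    if score > 1.0:
        return "rejecting"   # EKF is rejecting sensor data
    if score <= 0.5:
        return "healthy"
    return "marginal"        # assumed label; this band is not named in the text

score = error_score(vel_ratio=0.1, pos_ratio=0.2, hgt_ratio=0.05)
print(score, describe(score))
```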
3. The Switching Logic
The frontend compares the Error Scores of all active lanes.
- The Threshold: A switch occurs if the Active Lane's score exceeds the threshold (1.0) AND another lane has a significantly better score.
- The Hysteresis: To prevent rapid toggling, the alternative lane must be better by a margin (BETTER_THRESH).
- The Failsafe Hook: Before triggering an "EKF Failsafe" (Land/RTL), the vehicle calls checkLaneSwitch(). If a healthy lane exists, it switches lanes instead of declaring a failsafe.
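The threshold and hysteresis rules can be sketched as a small selection function. This is a minimal model of the behaviour described above, not the ArduPilot implementation: `choose_lane` is a hypothetical name, and the margin value is a placeholder standing in for BETTER_THRESH rather than the real constant.

```python
from typing import Sequence

SWITCH_THRESHOLD = 1.0  # from the text: the active lane must exceed this
BETTER_MARGIN = 0.05    # placeholder for BETTER_THRESH (assumed value)

def choose_lane(scores: Sequence[float], active: int,
                threshold: float = SWITCH_THRESHOLD,
                margin: float = BETTER_MARGIN) -> int:
    """Return the index of the lane to use next.

    The active lane is kept unless its score exceeds the threshold AND
    the best alternative is better by more than the hysteresis margin.
    """
    if scores[active] <= threshold:
        return active
    best = min(range(len(scores)), key=lambda i: scores[i])
    if best != active and scores[active] - scores[best] > margin:
        return best
    return active

print(choose_lane([1.5, 0.1, 0.2], active=0))     # active lane is bad, lane index 1 is clearly better
print(choose_lane([1.2, 1.18, 1.19], active=0))   # all lanes similar: no switch
```

The second call illustrates the hysteresis: when every lane is about equally bad, switching gains nothing, so the active lane is kept.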
Common Issues & Troubleshooting
"Lane Switch" Message in Log
- Cause: One IMU disagreed with the GPS/Baro significantly more than the others.
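- Analysis: Plot the error score of each lane on one graph. In recent ArduPilot firmware this is the XKF3.ErSc field, with one trace per core (selected by the C field). If Lane 1 spikes but Lane 2 stays low, IMU1 likely suffered vibration aliasing or a hardware fault.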
"Unhealthy AHRS" / "Gyros Inconsistent"
- Cause: The gyro readings disagreed with each other at startup. ArduPilot refuses to arm if the redundant sensors disagree.
Source Code Reference
- Lane Manager: NavEKF3::checkLaneSwitch()
- Scoring: NavEKF3_core::errorScore()
Practical Guide: Analyzing EKF Health
The "EKF Lane Switch" message often scares pilots. Here is how to verify if it was a real hardware failure or just a glitch.
The Forensic Method
- Open the .bin log in Mission Planner.
- Find the EKF3 error score. In recent ArduPilot firmware it is logged as the ErSc field of the XKF3 message, and the C field identifies the core (C = 0, 1, 2 for Lane 1, 2, and 3).
- Plot the error score for all three lanes on the same graph.
- Interpret:
  - Scenario A (One Bad Apple): Lane 1 spikes to 1.5 while Lane 2 and 3 stay at 0.1.
    - Conclusion: IMU1 is faulty or suffering vibration aliasing. The system worked correctly by switching to Lane 2.
  - Scenario B (The Global Crisis): All three lanes spike to 1.0 simultaneously.
    - Conclusion: This is NOT a sensor fault. It is an external consistency issue (e.g., Compass interference, GPS glitch, or bad vibration affecting all IMUs). Switching lanes won't help here.
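The two scenarios can be told apart mechanically once the peak error score of each lane has been read off the graph. The sketch below assumes those peaks have already been extracted by hand or with a log tool of your choice; the function name `classify` and the 1.0 cut-off are illustrative choices, not ArduPilot values.

```python
from typing import Sequence

def classify(peak_scores: Sequence[float], bad: float = 1.0) -> str:
    """Classify an event from the peak error score of each lane.

    "healthy":     no lane reached the cut-off
    "lane-fault":  some, but not all, lanes reached it (Scenario A)
    "common-mode": every lane reached it (Scenario B)
    """
    failing = [i for i, score in enumerate(peak_scores) if score >= bad]
    if not failing:
        return "healthy"
    if len(failing) == len(peak_scores):
        return "common-mode"
    return "lane-fault"

print(classify([1.5, 0.1, 0.1]))  # Scenario A
print(classify([1.0, 1.0, 1.0]))  # Scenario B
```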
How to Force a Lane Switch (Bench Test)
You can verify the redundancy system on the bench (Props OFF!).
- Connect via USB and monitor the "Messages" tab.
- Take a strong magnet.
- Move it close to the flight controller (disturbing the internal mag or causing gyro bias).
- You will likely see "EKF3 Lane Switch 1" messages as the EKF detects the disturbance on one sensor (due to physical placement) before the others, or as it rejects the magnetic anomaly on the active lane and tries another.
- Note: This confirms the voting logic is active.
For more details, see the ArduPilot Wiki: EKF3 Handling.