It’s a problem familiar to anyone who’s spent a decent amount of time playing with a Raspberry Pi – over time, the flash in the SD card reaches its write cycle limits, and causes a cavalcade of confusing errors before failing entirely. While flash storage is fast, compact, and mechanically reliable, it has always had a writeable lifespan much shorter than magnetic technologies.
Of course, with proper wear levelling techniques and careful use, these issues can be mitigated successfully. The surprising thing is when a major automaker fails to implement such basic features, as was the case with several Tesla models. Due to the car’s Linux operating system logging excessively to its 8 GB eMMC storage, the flash modules have been wearing out. This leads to widespread failures in the car, typically putting it into limp mode and disabling many features controlled via the touchscreen.
With the issue affecting important subsystems such as the heater, defroster, and warning systems, the NHTSA wrote to the automaker in January requesting a recall. Tesla’s response acquiesced to this request with some consternation, downplaying the severity of the issue. Now they are claiming that the eMMC chip, ball-grid soldered to the motherboard, inaccessible without disassembling the dash, and not specifically mentioned in the owner’s manual, should be considered a “wear item”, and thus should not be subject to such scrutiny.
Certainly An Odd Wear Item
Historically, major electronic parts in automobiles are not considered consumables. While it’s not uncommon for some cars to face issues with engine control units or body control modules, they’re not typically treated as wear items to be replaced at nominal intervals. Thus far, precedent has treated these parts as something that should last the lifetime of the vehicle, to be replaced only in the case of unexpected malfunction. The Tesla case is different in that the eMMC failure is, by and large, inevitable. Rather than being a case of isolated malfunctions in a small percentage of cars, as would be expected from the occasional manufacturing defect, this is an issue affecting every car that rolled off the line up to a certain date. Failure rates are up to 30 percent in certain build months. With the computer and touchscreen being in charge of so many vital vehicle functions, it’s not a defect that can be easily ignored by the end user.
Tesla’s assertion that the eMMC chip should be considered a ‘wear item’ is a dubious one at best. Flash memory does wear out, it’s true, as Tesla points out when discussing the limits of the technology. Many parts on a modern car wear out over time – brake pads, belts, and air filters are all common examples. The difference is that these parts are all designed to be replaced by the end user or a typical mechanic.
Trying to claim that a ball-grid array chip, permanently soldered onto a PCB and buried inside the dashboard, is a wear item is patently ludicrous. If it were, we’d expect to see several things. There’d be a recommended time and mileage at which the eMMC would be changed to avoid surprise failures, and this would be listed in the manual. Additionally, Tesla’s repair process would involve desoldering the eMMC chip from the board and replacing it directly. The fact that Tesla are instead replacing the computers as a whole indicates that the part is not being treated as a wear item by anyone, anywhere.
Obviously, the chip can be replaced, but it’s no easy job. Once the computer’s main board has been extracted from the car, the storage must be backed up over JTAG. Then, the board must be carefully reflowed to remove the chip, in a delicate process that has a significant chance of damaging other components on the board. If the chip were a wear item, it wouldn’t require specialist BGA reflow equipment to change. We’d see Tesla doing it routinely, replacing a sub-$7 chip rather than swapping out entire mainboards at a cost of thousands of dollars. Granted, there are parts of modern cars that are also time-consuming to replace, such as timing belts and water pumps. However, in these cases, automakers make it clear ahead of time that these are wear items, create maintenance schedules for them, and establish standard processes to change them.
Nobody would put up with swapping out their entire front suspension setup every time their brakes wore out – automakers realised brake pads were wear items and designed accordingly. Tesla simply dropped the ball, writing too often to the flash memory, which isn’t easily replaceable. The proper solution is trivial. Either stop logging so much to flash storage, or make it easier to swap out.
And maybe put the logs in their own partition. While SD cards probably aren’t up to snuff for storing the car’s operating system, they’d make a cheap place to store non-critical logs that probably are never read anyway. Alternatively, put the eMMC chip on a removable module, or just use an M.2 drive with automotive-rated connectors.
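To get a feel for why logging volume matters so much, here is a rough back-of-the-envelope sketch in Python. Only the 8 GB capacity comes from the story above; the endurance rating, daily write volume, and write amplification factor are hypothetical placeholders rather than Tesla’s actual figures, and the model assumes ideal wear levelling across the whole chip.

```python
def emmc_lifetime_years(capacity_gb, pe_cycles, daily_writes_gb,
                        write_amplification=1.0):
    """Rough estimate of years until flash reaches its rated P/E cycles.

    Assumes ideal wear levelling, so every block ages at the same rate.
    """
    if daily_writes_gb <= 0 or write_amplification <= 0:
        raise ValueError("daily writes and write amplification must be positive")
    # Total data the chip can absorb before hitting its rated endurance.
    total_writable_gb = capacity_gb * pe_cycles
    # The flash sees more writes than the OS issues (write amplification).
    effective_daily_gb = daily_writes_gb * write_amplification
    return total_writable_gb / effective_daily_gb / 365.0


if __name__ == "__main__":
    CAPACITY_GB = 8        # from the article
    PE_CYCLES = 3000       # assumed endurance rating, purely illustrative
    WRITE_AMP = 2.0        # assumed write amplification, purely illustrative

    # Hypothetical logging volumes in GB per day, not measured values.
    for daily_gb in (10.0, 1.0):
        years = emmc_lifetime_years(CAPACITY_GB, PE_CYCLES, daily_gb, WRITE_AMP)
        print(f"{daily_gb:>5.1f} GB/day -> roughly {years:.1f} years")
```

Whatever the real numbers are, the relationship is linear under these assumptions: cut the daily writes by a factor of ten and the estimated lifespan grows tenfold, which is why trimming the logs is such an obvious fix.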
The issue is claimed to only affect models built prior to March 2018, which run on an NVIDIA Tegra 3. Later models are based on the Intel Atom, and feature a larger eMMC chip on board. These modules have yet to demonstrate the same failures, and Tesla claim they should not suffer the issue. We’ll see.