Mozilla engineer thinks 15% of crashes are due to memory errors!
<https://mas.to/@gabrielesvelto/116171750653898304>
<https://www.tomshardware.com/pc-components/ram/bit-flips-cause-up-to-15-percent-of-firefox-crashes-asserts-mozilla-engineer-figure-inferred-from-470-000-auto-submitted-crash-reports>
OK, there is the qualification "up to". This seems implausibly large.
If it was even 1.5%, I'd want ECC on my RAM. Actually, I do want it but it has been quite expensive to get. And only a very few PCs support it now.
I wonder how many machines had such crashes. A few bad machines could generate a lot of crash reports.
He suggests that memtest86 could catch these errors so that implies they are somewhat repeatable.
PS: see the same engineer complaining about Intel Raptor Lake CPUs:
<https://mas.to/@gabrielesvelto/114813152373394985>
<https://www.tomshardware.com/pc-components/cpus/firefox-dev-says-intel-raptor-lake-crashes-are-increasing-with-rising-temperatures-in-record-european-heat-wave-mozilla-staffs-tracking-overwhelmed-by-intel-crash-reports-team-disables-the-function>
On 2026-03-10 17:54, D. Hugh Redelmeier via Talk wrote:
<https://mas.to/@gabrielesvelto/116171750653898304> <https://www.tomshardware.com/pc-components/ram/bit-flips-cause-up-to-15-percent-of-firefox-crashes-asserts-mozilla-engineer-figure-inferred-from-470-000-auto-submitted-crash-reports>
OK, there is the qualification "up to". This seems implausibly large.
Bit flips could also include memory that has been swapped out and brought back into memory. Then you're subject to a couple of copies and bus transfers.
I have a bunch of servers I monitor and there are very few ECC-reported bit errors. So I would also have an issue with the 15% unless there is some quirk in the reporting process. Any chance Firefox has lots of users who are subject to increased cosmic rays, like astronauts or people on high-flying aircraft or inside nuclear reactors?
I tend to want to have ECC on all my systems, but that is mostly because way back in the days of 0.XX Linux the cheap clone PCs often had memory problems that would show up when trying to compile the kernel.
A story about RAM. In my first job, the company I was working for was using core memory and developed a memory board based on 4Kbit DRAM chips. There were lots of discussions about whether the chips would be reliable, and a whole testing protocol and hardware were developed to stress test the chips. Some in the engineering group believed that 4Kbit was the biggest you could get because things like radiation and refresh times would overwhelm the chips.
If it was even 1.5%, I'd want ECC on my RAM. Actually, I do want it but it has been quite expensive to get. And only a very few PCs support it now.
I wonder how many machines had such crashes. A few bad machines could generate a lot of crash reports.
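To put rough numbers on that concern, here is a minimal sketch with entirely made-up data. It assumes each crash report carries some per-machine identifier (`machine_id` is hypothetical; whether the real crash data exposes anything like it is not stated in the thread) and compares the share of reports flagged as memory errors with the share of machines that ever sent one.

```python
def memory_crash_share(reports):
    """Return (share of reports, share of machines) flagged as memory errors.

    reports: iterable of (machine_id, is_memory_error) tuples.
    machine_id is a hypothetical field used only for this illustration.
    """
    reports = list(reports)
    flagged = sum(1 for _, bad in reports if bad)
    per_report = flagged / len(reports)

    machines = {m for m, _ in reports}
    bad_machines = {m for m, bad in reports if bad}
    per_machine = len(bad_machines) / len(machines)
    return per_report, per_machine


# Made-up data: 2 flaky machines send 75 memory-error reports each,
# while 850 healthy machines send one ordinary crash report each.
reports = [(f"bad-{i}", True) for i in range(2) for _ in range(75)]
reports += [(f"ok-{i}", False) for i in range(850)]

per_report, per_machine = memory_crash_share(reports)
print(f"share of reports:  {per_report:.1%}")
print(f"share of machines: {per_machine:.1%}")
```

With these invented numbers, 15% of the reports are memory errors but only about 0.2% of the machines are affected, so the two figures can differ by a couple of orders of magnitude depending on how the counting is done.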
He suggests that memtest86 could catch these errors so that implies they are somewhat repeatable.
PS: see the same engineer complaining about Intel Raptor Lake CPUs:
<https://mas.to/@gabrielesvelto/114813152373394985> <https://www.tomshardware.com/pc-components/cpus/firefox-dev-says-intel-raptor-lake-crashes-are-increasing-with-rising-temperatures-in-record-european-heat-wave-mozilla-staffs-tracking-overwhelmed-by-intel-crash-reports-team-disables-the-function>
I could buy the error rates increasing based on heat to the point where it would be measurable. If you have a large enough dataset you could possibly identify trends like local heat waves. If the dataset of Firefox crashes is available then someone could likely make it their university thesis to track and prove the assertions that this guy makes.
--
Alvin Starr || land: (647)478-6285
Netvel Inc. || home: (905)513-7688
alvin@netvel.net ||
participants (2): Alvin Starr, D. Hugh Redelmeier