In February 2023, a Norfolk Southern freight train carrying hazardous materials derailed in East Palestine, Ohio, USA, resulting in environmental contamination, infrastructure damage, and the evacuation of thousands of residents. Our example Root Cause Analysis (RCA) investigates the contributing factors, including axle failure, monitoring gaps, and inspection processes, and offers practical solutions to improve rail safety and resilience. By focusing on system improvements and proactive risk management, this RCA illustrates how critical insights can help prevent similar incidents and protect communities.
Browse examples by category
Safety RCA Examples
Dale Earnhardt Fatality
On 19 February, 2001 at approximately 5:16 PM Eastern time during the final turn of the final lap of the Daytona 500 automobile race, world-famous NASCAR driver Dale Earnhardt Sr. was killed when his car slammed into the outer barrier of the race track.
Metal Dust Flash Fires and Hydrogen Explosion
On 31 January, 2011 at around 5:00AM, the company experienced an incident that led to two fatalities. Two employees were severely burned in an iron dust fire. The employees were wearing fire protective clothing; however, it did not provide adequate protection.
Interstate 5 Skagit River Bridge Collapse
The bridge collapse was caused by a semi truck impacting a truss and multiple braces with sufficient force to fracture the bridge. The truck struck them because it was traveling in the outside lane, which had less vertical clearance than the inside lane.
Near Miss Risk of Explosion
A customer at a gas station observed a construction worker smoking a cigarette within 5 feet of the pump. The convenience store building was in the process of being remodeled at the time. A group of 3 workers were working on the project.
Flint Water Crisis
This RCA breaks the Flint crisis down into two main branches: The public health hazard, as represented by the contaminated water, AND the response by all levels of government. This is a strategy we employ to examine the “error path” and the “response path” which can mitigate or exacerbate the overall impact of the problem.
Near Miss (Falling Asphalt)
A large chunk of asphalt fell from the partially demolished bridge deck to the ground between the sidewalk on C Street and the existing bridge 16/5 E Pier 16. The piece of asphalt was approximately 6' X 4' X 4" and weighed around 1800 pounds. It fell approximately 50' to the ground. Nobody was in the immediate area when the piece fell.
Ghost Ship Warehouse Fire
On 2 December, 2016 a fire broke out in an Oakland, CA warehouse during a musical performance. 36 people attending the performance died when they were unable to exit the building. However, there was potential for an additional 66 people to have been injured or killed.
Impact of Opioid Addiction
This example RCA looks at the negative public impact due to opioid addiction. As with other Sologic examples of “big” problems, we need to disclose that the primary purpose for choosing a problem of this magnitude is to demonstrate how to complete a root cause analysis on a large problem.
AA Flight 191
On 25 May, 1979 at approximately 3:02 PM, American Airlines flight 191 crashed, killing all 271 people on board and 2 on the ground. The aircraft was a DC-10, tail number N110AA. This was a regularly scheduled flight.
Hindenburg
The Hindenburg disaster is one of the best-known disasters of the 20th century. Film crews captured virtually the entire event (except the ignition!). And the dramatic images, along with the classic narrative ("Oh the humanity!") are so compelling that generations later, we still cannot look away. Therefore it's a perfect example for a root cause analysis!
West Point Treatment Plant - Sewage Released, Injury
On 9 February, 2017, the Seattle area was experiencing a significant amount of rain. Much of Seattle relies on a combined storm/waste sewage system. This means that during heavy rains, the wastewater treatment facilities experience significant volumes. In such events, roughly 90% of volume is attributed to storm water.
Notre Dame Fire
On 15 April, 2019 at about 6:20 PM, a fire started in the main attic of the Notre Dame cathedral in Paris, France. No one was injured or killed. There was, however, extensive damage to the structure and significant loss of and damage to historical artifacts, including the 600-year-old timber framing of the attic.
Sanford & Edenville Dam Failures
On the evening of 19 May, 2020, after several days of heavy rainfall, the Edenville Dam, located 20 miles north of Sologic's headquarters in Midland, Michigan, failed. Billions of gallons of unrestrained water surged downstream and subsequently caused the Sanford Dam to fail as well. The failure of these two dams resulted in historic flooding throughout surrounding areas. While there was extensive damage to infrastructure, including many homes, buildings, and roadways, there were no major injuries or fatalities.
CrowdStrike - Error Released to Production
On 19 July, 2024, a software update from CrowdStrike caused many Windows systems to crash unexpectedly. The issue occurred when the update made the system try to access information that wasn’t there, leading to widespread disruptions. As a result, many users experienced system failures after installing the update. This type of crash is known as BSOD (blue screen of death) in the Windows community.
Sinking of the El Faro
On 1 October, 2015, the US-flagged cargo vessel El Faro sank off the coast of the Bahamas after encountering Hurricane Joaquin. All 33 crew members lost their lives. The vessel was en route from the port of Jacksonville, Florida, to San Juan, Puerto Rico, carrying a cargo of vehicles and shipping containers.
NS Railway Derailment
In February 2023, a Norfolk Southern freight train carrying hazardous materials derailed in East Palestine, Ohio, USA, resulting in environmental contamination, infrastructure damage, and the evacuation of thousands of residents. Our example Root Cause Analysis (RCA) investigates the contributing factors, including axle failure, monitoring gaps, and inspection processes, and offers practical solutions to improve rail safety and resilience. By focusing on system improvements and proactive risk management, this RCA illustrates how critical insights can help prevent similar incidents and protect communities.
Quality RCA Examples
Hubble Space Telescope
Sometime after launch on 24 April, 1990, engineers testing images from the Hubble Space Telescope (HST) noticed that the images were blurry. This was the result of spherical aberration, which occurred because the primary mirror was ground too flat at the edges.
Manufacturing NCR
On 2 May, 2012 quality control issued a non-conformance for project #1234 (brass tubesheet). The non-conformance was issued due to an incorrect bolt pattern and scratched gasket surfaces.
Pipeline weld non-conformance
In July 2012, it was discovered that 5 miles of installed underground pipe did not meet specifications due to out-of-spec welds and improper radiographic inspection.
Customer Perception of Quality
On 6 November, 2010 we were notified by a customer that they wanted a service credit of $20,000. The service credit was for machining of the front wheel spindles and bearing races for an ore hauler dump truck (used in mining) to bring them back into specification.
Bourbon Warehouse Collapse
Barton Brands Distillery incurred significant financial losses with the collapse of a warehouse in Bardstown, Kentucky. The collapse led to thousands of gallons of bourbon leaking from damaged barrels, with some leakage escaping the warehouse containment into the surrounding land and waterway.
Pipeline Quality Issues
On 2 June, 2013 multiple leaks were discovered in the new process water treatment line during final hydrotesting. This delayed startup by two weeks and cost an additional $525,000 to repair the leaks and replace coupling gaskets. In addition to the startup delays and added costs, a Quality problem like this could potentially lead to reportable spills, additional cleanup costs, and lost future customer contracts.
Reliability RCA Examples
Pipeline weld non-conformance
In July 2012, it was discovered that 5 miles of installed underground pipe did not meet specifications due to out-of-spec welds and improper radiographic inspection.
Customer Perception of Quality
On 6 November, 2010 we were notified by a customer that they wanted a service credit of $20,000. The service credit was for machining of the front wheel spindles and bearing races for an ore hauler dump truck (used in mining) to bring them back into specification.
Slurry Pump Seal Leakage
The repeated failure of the new P-105 slurry pump has caused repeated unplanned shutdowns due to seal leaks, resulting in lost profit and excessive expenditures. The slurry contains 50% methyl bad stuff, an environmentally regulated chemical, so the pump must be shut down upon detection of a leak greater than 2 kg/hr. Production losses amount to $240,000 thus far.
Hawaii False Alarm
On 13 January, 2018 an alert was issued in Hawaii stating:
“BALLISTIC MISSILE THREAT INBOUND TO HAWAII.
SEEK IMMEDIATE SHELTER.
THIS IS NOT A DRILL.”
People naturally experienced terror as they rushed to shelter or to be with their loved ones. Of course, we all know the rest of the story at this point: The alert was a false alarm.
Bourbon Warehouse Collapse
Barton Brands Distillery incurred significant financial losses with the collapse of a warehouse in Bardstown, Kentucky. The collapse led to thousands of gallons of bourbon leaking from damaged barrels, with some leakage escaping the warehouse containment into the surrounding land and waterway.
Sanford & Edenville Dam Failures
On the evening of 19 May, 2020, after several days of heavy rainfall, the Edenville Dam, located 20 miles north of Sologic's headquarters in Midland, Michigan, failed. Billions of gallons of unrestrained water surged downstream and subsequently caused the Sanford Dam to fail as well. The failure of these two dams resulted in historic flooding throughout surrounding areas. While there was extensive damage to infrastructure, including many homes, buildings, and roadways, there were no major injuries or fatalities.
CrowdStrike - Error Released to Production
On 19 July, 2024, a software update from CrowdStrike caused many Windows systems to crash unexpectedly. The issue occurred when the update made the system try to access information that wasn’t there, leading to widespread disruptions. As a result, many users experienced system failures after installing the update. This type of crash is known as BSOD (blue screen of death) in the Windows community.
Sinking of the El Faro
On 1 October, 2015, the US-flagged cargo vessel El Faro sank off the coast of the Bahamas after encountering Hurricane Joaquin. All 33 crew members lost their lives. The vessel was en route from the port of Jacksonville, Florida, to San Juan, Puerto Rico, carrying a cargo of vehicles and shipping containers.
NS Railway Derailment
In February 2023, a Norfolk Southern freight train carrying hazardous materials derailed in East Palestine, Ohio, USA, resulting in environmental contamination, infrastructure damage, and the evacuation of thousands of residents. Our example Root Cause Analysis (RCA) investigates the contributing factors, including axle failure, monitoring gaps, and inspection processes, and offers practical solutions to improve rail safety and resilience. By focusing on system improvements and proactive risk management, this RCA illustrates how critical insights can help prevent similar incidents and protect communities.
IT RCA Examples
Website over budget
An agency had an opportunity to build a high-end e-commerce website for a world-class arts organization. This opportunity would add a marquee client to the portfolio and hopefully gain referrals to similar organizations in the future. The agency accepted that margins would be lower than usual, but they started the project anticipating a profit. However, a series of events occurred which caused the agency to lose money and deliver the project late.
Anti-Virus Software Issue
On 21 April, 2010 at approximately 2:00PM GMT, Company X released an update to its Virus Software Enterprise 8.7 (VSE 8.7). The update added detection for variants of the W32/Wecorl.a family of malware.
Website Unavailable
IT customer complaints were caused by a customer contacting us regarding web access problems. A 'complaint' occurs when a customer contacts us with a problem...
Online learning completion not saved
A customer reported that some members of their team had completed the eRCA coursework and passed the test, but management reporting did not reflect their course completion status.
Google Compute Engine Incident 16015
On Friday 5 August 2016, some Google Cloud Platform customers experienced increased network latency and packet loss to Google Compute Engine (GCE), Cloud VPN, Cloud Router and Cloud SQL, for a duration of 99 minutes. If you were affected by this issue, we apologize. We intend to provide a higher level reliability than this, and we are working to learn from this issue to make that a reality.
Amazon S3 Service Disruption
On 28 February, 2017, Amazon Web Services (AWS) experienced a service disruption impacting the US-EAST-1 Region. The disruption began at 9:37AM and lasted until service was restored at 1:54PM.
Operations RCA Examples
Manufacturing NCR
On 2 May, 2012 quality control issued a non-conformance for project #1234 (brass tubesheet). The non-conformance was issued due to an incorrect bolt pattern and scratched gasket surfaces.
Pipeline weld non-conformance
In July 2012, it was discovered that 5 miles of installed underground pipe did not meet specifications due to out-of-spec welds and improper radiographic inspection.
Slurry Pump Seal Leakage
The repeated failure of the new P-105 slurry pump has caused repeated unplanned shutdowns due to seal leaks, resulting in lost profit and excessive expenditures. The slurry contains 50% methyl bad stuff, an environmentally regulated chemical, so the pump must be shut down upon detection of a leak greater than 2 kg/hr. Production losses amount to $240,000 thus far.
Hawaii False Alarm
On 13 January, 2018 an alert was issued in Hawaii stating:
“BALLISTIC MISSILE THREAT INBOUND TO HAWAII.
SEEK IMMEDIATE SHELTER.
THIS IS NOT A DRILL.”
People naturally experienced terror as they rushed to shelter or to be with their loved ones. Of course, we all know the rest of the story at this point: The alert was a false alarm.
Facebook Stock Losses
In late July 2018, Facebook lost nearly 19% of its valuation, a loss of nearly $150 billion. This was the single largest one-day loss in value in history. To put this into perspective, Facebook lost more in a single day than the total value of McDonald's, 3M, or Nike.
Sanford & Edenville Dam Failures
On the evening of 19 May, 2020, after several days of heavy rainfall, the Edenville Dam, located 20 miles north of Sologic's headquarters in Midland, Michigan, failed. Billions of gallons of unrestrained water surged downstream and subsequently caused the Sanford Dam to fail as well. The failure of these two dams resulted in historic flooding throughout surrounding areas. While there was extensive damage to infrastructure, including many homes, buildings, and roadways, there were no major injuries or fatalities.
CrowdStrike - Error Released to Production
On 19 July, 2024, a software update from CrowdStrike caused many Windows systems to crash unexpectedly. The issue occurred when the update made the system try to access information that wasn’t there, leading to widespread disruptions. As a result, many users experienced system failures after installing the update. This type of crash is known as BSOD (blue screen of death) in the Windows community.
Sinking of the El Faro
On 1 October, 2015, the US-flagged cargo vessel El Faro sank off the coast of the Bahamas after encountering Hurricane Joaquin. All 33 crew members lost their lives. The vessel was en route from the port of Jacksonville, Florida, to San Juan, Puerto Rico, carrying a cargo of vehicles and shipping containers.
NS Railway Derailment
In February 2023, a Norfolk Southern freight train carrying hazardous materials derailed in East Palestine, Ohio, USA, resulting in environmental contamination, infrastructure damage, and the evacuation of thousands of residents. Our example Root Cause Analysis (RCA) investigates the contributing factors, including axle failure, monitoring gaps, and inspection processes, and offers practical solutions to improve rail safety and resilience. By focusing on system improvements and proactive risk management, this RCA illustrates how critical insights can help prevent similar incidents and protect communities.
Environmental RCA Examples
Slurry Pump Seal Leakage
The repeated failure of the new P-105 slurry pump has caused repeated unplanned shutdowns due to seal leaks, resulting in lost profit and excessive expenditures. The slurry contains 50% methyl bad stuff, an environmentally regulated chemical, so the pump must be shut down upon detection of a leak greater than 2 kg/hr. Production losses amount to $240,000 thus far.
Multiple leaks in new pipeline
After more than three miles of high-density polymer pipeline had been installed with over 800 couplings, the initial 25 PSI hydrotesting found that 50-60 percent of the couplings leaked.
Mass Fish & Turtle Die-Off in Peconic River
In April 2015, large numbers of menhaden and terrapin turtles were found dead in the Peconic River off Flanders Bay on Long Island, NY. The die-offs were attributed to massive algal blooms in the Peconic Estuary.
Flint Water Crisis
This RCA breaks the Flint crisis down into two main branches: The public health hazard, as represented by the contaminated water, AND the response by all levels of government. This is a strategy we employ to examine the “error path” and the “response path” which can mitigate or exacerbate the overall impact of the problem.
West Point Treatment Plant - Sewage Released, Injury
On 9 February, 2017, the Seattle area was experiencing a significant amount of rain. Much of Seattle relies on a combined storm/waste sewage system. This means that during heavy rains, the wastewater treatment facilities experience significant volumes. In such events, roughly 90% of volume is attributed to storm water.
Sanford & Edenville Dam Failures
On the evening of 19 May, 2020, after several days of heavy rainfall, the Edenville Dam, located 20 miles north of Sologic's headquarters in Midland, Michigan, failed. Billions of gallons of unrestrained water surged downstream and subsequently caused the Sanford Dam to fail as well. The failure of these two dams resulted in historic flooding throughout surrounding areas. While there was extensive damage to infrastructure, including many homes, buildings, and roadways, there were no major injuries or fatalities.
CrowdStrike - Error Released to Production
On 19 July, 2024, a software update from CrowdStrike caused many Windows systems to crash unexpectedly. The issue occurred when the update made the system try to access information that wasn’t there, leading to widespread disruptions. As a result, many users experienced system failures after installing the update. This type of crash is known as BSOD (blue screen of death) in the Windows community.
Sinking of the El Faro
On 1 October, 2015, the US-flagged cargo vessel El Faro sank off the coast of the Bahamas after encountering Hurricane Joaquin. All 33 crew members lost their lives. The vessel was en route from the port of Jacksonville, Florida, to San Juan, Puerto Rico, carrying a cargo of vehicles and shipping containers.
NS Railway Derailment
In February 2023, a Norfolk Southern freight train carrying hazardous materials derailed in East Palestine, Ohio, USA, resulting in environmental contamination, infrastructure damage, and the evacuation of thousands of residents. Our example Root Cause Analysis (RCA) investigates the contributing factors, including axle failure, monitoring gaps, and inspection processes, and offers practical solutions to improve rail safety and resilience. By focusing on system improvements and proactive risk management, this RCA illustrates how critical insights can help prevent similar incidents and protect communities.
Compliance RCA Examples
Manufacturing NCR
On 2 May, 2012 quality control issued a non-conformance for project #1234 (brass tubesheet). The non-conformance was issued due to an incorrect bolt pattern and scratched gasket surfaces.
Pipeline weld non-conformance
In July 2012, it was discovered that 5 miles of installed underground pipe did not meet specifications due to out-of-spec welds and improper radiographic inspection.
Flint Water Crisis
This RCA breaks the Flint crisis down into two main branches: The public health hazard, as represented by the contaminated water, AND the response by all levels of government. This is a strategy we employ to examine the “error path” and the “response path” which can mitigate or exacerbate the overall impact of the problem.
West Point Treatment Plant - Sewage Released, Injury
On 9 February, 2017, the Seattle area was experiencing a significant amount of rain. Much of Seattle relies on a combined storm/waste sewage system. This means that during heavy rains, the wastewater treatment facilities experience significant volumes. In such events, roughly 90% of volume is attributed to storm water.
Healthcare RCA Examples
Drug Reaction
On 25 July, 2010, a patient was transferred to her room from post-op. Over the course of 12 hours, the patient developed a persistent, extremely uncomfortable itch throughout her body.
Cylinder Pulled Into MRI
Recently, an MRI center experienced a near-miss event when a ferrous oxygen cylinder was pulled into the magnetic bore of an MRI machine.
Patient Re-admitted, 2nd Surgery Required, Stay Extended
This example examines an actual case of hospital readmission that included a second surgery. While recovering from the second surgery, the patient contracted a Clostridium difficile infection, which extended his stay by two days.
Impact of Opioid Addiction
This example RCA looks at the negative public impact due to opioid addiction. As with other Sologic examples of “big” problems, we need to disclose that the primary purpose for choosing a problem of this magnitude is to demonstrate how to complete a root cause analysis on a large problem.