Machine Failure Prediction Using Machine Learning [Case Study] – Intelliarts

26 January 2022
What if your manufacturing company could predict equipment failures and optimize production line performance? Learn how to use machine learning for equipment failure prediction.

The faster your product moves through the production pipeline and out the factory door, the sooner it reaches happy customers. That’s your major goal in manufacturing: to maximize production line performance and deliver finished goods to customers on time.

This goal is hard to reach, though, when repeated equipment failures drive up maintenance costs and delay production. Both expected and unexpected equipment failures can wreak havoc on your production line performance and company profits.

In this article, we describe how a machine learning (ML) approach can bring value to your manufacturing company and reduce industrial equipment failures with machine failure prediction. To illustrate, we discuss an appliance manufacturer’s case and explain how we used machine learning to help this company reduce machine failures on the factory floor.

Business challenge

Equipment failure happens all the time, and its consequences could range from easily fixed issues with minimal maintenance costs to catastrophic breakdowns with expensive downtimes, delays in production, profit reductions, and so on. Still, we can distinguish between two main reasons why reducing machine failure makes sense in manufacturing.

Reason 1: Increasing equipment failure costs

According to NY Data Science, production line costs, also known as assembly line costs, account for 50 to 75% of all manufacturing costs. This points to high equipment failure costs as the first reason why manufacturers should avoid equipment breakdowns.

So what are those equipment failure costs composed of? Let’s analyze:

  • Labor costs for removal of the broken equipment

  • New equipment components and materials for repair or rewind

  • Freight rate to transport the new parts

Aside from these direct costs, manufacturers should also consider efficiency losses and downtime costs, including expenditure on operator wages, utilities, and, most importantly, customer service.

Reason 2: Inefficient production line

Machine failure not only leads to a sudden and significant increase in operating expenses but also diminishes production line performance. This is the second reason why manufacturing companies should minimize breakdowns, and it underlines the importance of equipment failure prediction using machine learning.

Bottlenecks in the production line affect overall plant productivity. According to a study by EAM-Mosca Corporation, disruptions in assembly lines reduce manufacturing capacity by 15% on average and affect profitability. Specifically, when production lines are inconsistent:

  • Production schedules get affected

  • Shipping is delayed, and customers do not receive their goods on time

  • Customer satisfaction drops, which prompts buyers to look for more reliable suppliers

Conversely, if a manufacturing company can maximize its production line output, for example, through equipment failure prediction using machine learning, it can increase its production capacity.


Using data science in production line optimization

With the increasing emphasis on eliminating equipment failure issues in manufacturing processes, data science methods and machine learning for equipment failure prediction are being offered as the next big step toward production line optimization.

Thanks to IoT and smart manufacturing, equipment in production lines is becoming increasingly connected. Plus, it produces volumes of data. Manufacturers can then collect this data, analyze and transform it into knowledge to enhance decision-making on the shop floor.

A clear example here is the use of big data by Garant, a Canadian manufacturer of outdoor hand tools. The manufacturer started collecting and analyzing production line performance data in 2013. Over time, the company found that big data analytics gave its manufacturing operators much better visibility into production and factory performance. The project paid off in two years, allowing the company to streamline production and optimize manufacturing processes.

This connectivity of data also makes real-time monitoring of manufacturing processes possible. Manufacturers can track the entire production process and see how the goods move along the assembly line. This way, a manufacturing company can identify the strengths and weaknesses of its production line and get ideas for performance optimization.


What do equipment failures have in common with machine learning?

Based on the use of IoT and the data received from advanced monitoring, it is possible to train a machine learning model to predict equipment failure. So, manufacturers will be able to minimize equipment failures, optimize production line performance, and raise quality standards.


In their case study “Predictive Maintenance in the Metallurgical Industry”, Marta et al. discuss how machine learning and data mining can support equipment failure prediction in manufacturing, so that components get replaced right before the machine breaks down. The study describes a company that uses machine learning techniques to uncover best practices for operating and maintaining its equipment.

The data gathered about its specific operational parameters is used to extend the useful life of the equipment and, thus, prevent any equipment failures and slowdowns in the production line.

Another study by Vedika et al., titled “Real-World Data-Driven Machine Learning-Based Optimal Sensor Selection Approach for Equipment Fault Detection in a Thermal Power Plant”, explores the potential use of advanced ML algorithms for automatic early detection and diagnosis of faults in thermal power plants. The authors describe current fault detection systems, which raise an alarm and respond to abnormal machine behavior only after the damage has happened. They contrast these with systems built on advanced machine learning for equipment failure prediction, which make it possible to detect equipment problems before the damage occurs.


The intricacy of the production line data

Using data science and ML methods to solve a machine failure prediction problem in the industrial setting still has its pitfalls. And the main issue here is the intricate nature of the production line data:

  • Abundant data: Multiple sensors at each station used in manufacturing automatically produce massive data that quickly reach hundreds of gigabytes.

  • High-dimensional data: Since measurements are gathered at each station, many features are collected for every sample. Most of these could strongly correlate or be unimportant.

  • Imbalanced data: Modern manufacturing processes are highly efficient, so failures happen rarely on a shop floor. Since normal cases are more common than failures, the skewed class distribution problem appears. In other words, the distribution of data is skewed towards positive samples (success), and an ML model can find it difficult to learn the patterns of negative samples (equipment failure).

  • The lack of variety: The production line data usually comes from the same machine or machine family and is recorded under the same conditions. So, we do not have enough variety of data, which can limit how well an ML solution generalizes.
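
One common way to cope with a skewed class distribution is to weight the rare failure class more heavily during training. The sketch below computes the ratio of normal to failed samples, which is the value the XGBoost documentation suggests as a starting point for its `scale_pos_weight` parameter. The labels are made up for illustration, and encoding a failure as label 1 is an assumption of this example, not something the case study states.

```python
from collections import Counter


def imbalance_ratio(labels, failure_label=1):
    """Return (number of normal samples) / (number of failure samples)."""
    counts = Counter(labels)
    failures = counts[failure_label]
    if failures == 0:
        raise ValueError("no failure samples in the data")
    normal = len(labels) - failures
    return normal / failures


# Hypothetical labels: 1 = failure, 0 = normal operation.
labels = [0] * 995 + [1] * 5
ratio = imbalance_ratio(labels)
print(f"normal-to-failure ratio: {ratio:.0f}")
```

A ratio this large is a signal that plain accuracy will say little about how well failures are detected.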

Despite these difficulties, an intelligent system for predicting equipment failures still looks like a promising way to reduce breakdowns and increase the productivity of production lines.

To provide an example of how equipment failure can be addressed, we discuss the case study of one appliance manufacturing giant that we once worked with. Much like other companies, this manufacturer records data at every stage along its production line, so we could use these data to build an ML model for equipment failure prediction.

Case study: Appliance manufacturing

Case overview

Our client is a global appliance manufacturing company that dealt with repeated equipment failure, which affected its production line. The manufacturer is known for its commitment to quality and safety standards, and part of achieving this is due to the company’s monitoring of the production line.


For this case study, the manufacturer provided us with a huge dataset (14.3 GB) of measurements in CSV format. It contained three types of features:

  1. 968 numerical features

  2. Almost 2140 categorical features

  3. 1156 date features

Here is some extra information about this dataset:

  • Overall, there were 2,368,435 samples

  • The training and test datasets each contained more than a million samples

  • There were labels to mark each sample as good or bad

  • There were also timestamps to show when each measurement took place

Exploratory data analysis (EDA)

As expected, one of the key challenges was the need to process and analyze the huge dataset. To do this effectively, we built an AWS Fargate cluster and deployed Dask to it. Dask is an open-source library for parallel computing in Python. It’s a flexible data processing tool that handles big data for machine learning efficiently. The tool enabled us to do EDA much faster and more cost-efficiently.

Another problem we noticed was that the dataset was very sparse, meaning that there were many gaps present in the collected data. Sparsity was around 80%, so we decided to convert data from a dense format to sparse to make analysis more efficient.
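
To make the idea concrete, here is a minimal sketch of measuring sparsity and switching to a sparse representation. It uses plain Python and a dictionary keyed by cell position purely for illustration; the data and the function names are hypothetical, and the storage format actually used in the project is not described in this article.

```python
def sparsity(rows):
    """Fraction of cells that are missing (None) in a list of rows."""
    total = sum(len(row) for row in rows)
    missing = sum(value is None for row in rows for value in row)
    return missing / total if total else 0.0


def to_sparse(rows):
    """Keep only the observed cells, keyed by (row index, column index)."""
    return {
        (i, j): value
        for i, row in enumerate(rows)
        for j, value in enumerate(row)
        if value is not None
    }


# Hypothetical measurements: None marks a sensor that did not report.
dense = [
    [0.5, None, None, None, None],
    [None, None, 1.2, None, None],
    [None, None, None, None, 0.7],
    [None, 3.1, None, None, None],
]
sparse = to_sparse(dense)
print(f"sparsity: {sparsity(dense):.0%}, stored cells: {len(sparse)}")
```

In practice, a library format such as a `scipy.sparse` matrix would typically play the role of the dictionary here.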

EDA of categorical data

The categorical data had 2140 features initially, but on further evaluation, we found out that:

  • About 500 features were multivalue

  • There were 1490 features with single values

  • 150 features were just empty, so we dropped them, as they contained no information
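
The split above can be reproduced with a simple pass over the columns. The following sketch assumes that "multivalue" means more than one distinct non-missing value and "single value" means exactly one; the column names and values are hypothetical.

```python
def classify_columns(columns):
    """Group column names by how many distinct non-missing values they hold."""
    groups = {"empty": [], "single": [], "multi": []}
    for name, values in columns.items():
        distinct = {v for v in values if v is not None}
        if not distinct:
            groups["empty"].append(name)
        elif len(distinct) == 1:
            groups["single"].append(name)
        else:
            groups["multi"].append(name)
    return groups


# Hypothetical categorical columns; None marks a missing value.
columns = {
    "cat_a": [None, None, None],
    "cat_b": ["T1", None, "T1"],
    "cat_c": ["T1", "T2", None],
}
groups = classify_columns(columns)
print(groups)
```

Columns that land in the "empty" group carry no information and can be dropped before modeling.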

EDA of numerical data

Based on this data, we segregated the production flow of each product. We discovered that there were 51 stations distributed across four production lines. Counting the total number of non-zero measurements in each station (Figure 1), we saw that stations 24 and 25 had the most measurements (>200), station 32 had only one measurement, and the remaining stations had about 20 measurements each. In manufacturing, it is common for stations not to have enough sensors, which is why we cannot be 100% sure about precision in the end.

Figure 1. Number of measurements per station
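
If the feature names encode the station they come from, a per-station count of measurement columns can be derived from the names alone. The sketch below assumes a hypothetical naming scheme of the form `L<line>_S<station>_F<feature id>`; the article does not state how the features were actually named, so treat the pattern as a placeholder.

```python
import re
from collections import Counter

# Assumed (hypothetical) naming scheme: L<line>_S<station>_F<feature id>.
FEATURE_NAME = re.compile(r"L(\d+)_S(\d+)_F(\d+)")


def features_per_station(feature_names):
    """Count how many measurement columns belong to each station."""
    counts = Counter()
    for name in feature_names:
        match = FEATURE_NAME.fullmatch(name)
        if match:  # skip columns that do not follow the assumed scheme
            station = int(match.group(2))
            counts[station] += 1
    return counts


# Hypothetical column names, including one that is not a measurement.
names = ["L0_S24_F1", "L0_S24_F2", "L1_S25_F3", "L3_S32_F4", "Id"]
counts = features_per_station(names)
print(counts.most_common())
```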

To understand how parts move through the stations, we plotted the number of parts per station (Figure 2). We found that each station had a different number of parts passing through it. This could mean that there are different product classes, each following its own production path.

Figure 2. Number of parts per station

Next, we wanted to explore whether particular production lines or stations correlated with higher error rates. To do this, we calculated the fraction of defective parts in each station and production line. We found several stations with a higher error rate. However, these stations processed only a few products, so their impact on the production yield was minimal. They turned out to be reprocessing or post-processing stations.
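
The per-station calculation can be sketched as a simple group-and-count. The records below are invented, and the station numbers are used only as labels; the point is that a high defect rate should always be read together with the number of parts behind it.

```python
from collections import defaultdict


def defect_rate_by_station(records):
    """records: iterable of (station, is_defective) pairs, one per part visit.

    Returns {station: (defect rate, number of parts)}.
    """
    totals = defaultdict(int)
    defects = defaultdict(int)
    for station, is_defective in records:
        totals[station] += 1
        defects[station] += int(is_defective)
    return {s: (defects[s] / totals[s], totals[s]) for s in totals}


# Hypothetical data: a busy station and a rarely used one.
records = [(24, False)] * 98 + [(24, True)] * 2 + [(32, True), (32, False)]
rates = defect_rate_by_station(records)
for station, (rate, n_parts) in sorted(rates.items()):
    print(f"station {station}: {rate:.1%} defective over {n_parts} parts")
```

In this toy data the rarely used station has a far higher defect rate, but with only two parts it barely affects the overall yield.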

To explore how the parts moved through the production line, we aggregated the samples by line and station number. We discovered that there were several product categories in the dataset.

Feature engineering

Having finished with data cleaning, we moved on to the feature engineering stage. To avoid getting too deep into technical detail, we list only the most critical things we did (but you can ask about the detailed work on equipment failure prediction in the comments):

  1. We encoded categorical data using one-hot encoding.

  2. We clustered samples by production flow using k-Means.

  3. We deleted numerical features that were highly linearly dependent.

  4. We calculated lag features as one of the most important feature types for us.
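
Two of these steps, one-hot encoding and lag features, can be illustrated in a few lines. This is a minimal sketch under assumptions: a lag feature is taken here to be the value that the previous sample (in timestamp order) had for a given column, and the field names `timestamp`, `machine_type`, and `pressure` are hypothetical. The exact lag definitions used in the project are not given in this article.

```python
def one_hot(values):
    """Encode categories as rows of 0/1 indicators, one column per category."""
    categories = sorted(set(values))
    rows = [[int(value == category) for category in categories] for value in values]
    return categories, rows


def lag_feature(samples, key, time_key="timestamp", default=None):
    """For each sample, return the `key` value of the previous sample in time order.

    The result is aligned with the original order of `samples`; the earliest
    sample has no predecessor and gets `default`.
    """
    order = sorted(range(len(samples)), key=lambda i: samples[i][time_key])
    lagged = [default] * len(samples)
    for previous, current in zip(order, order[1:]):
        lagged[current] = samples[previous][key]
    return lagged


# Hypothetical samples, deliberately not sorted by time.
samples = [
    {"timestamp": 3, "machine_type": "B", "pressure": 1.4},
    {"timestamp": 1, "machine_type": "A", "pressure": 1.1},
    {"timestamp": 2, "machine_type": "A", "pressure": 1.3},
]
categories, encoded = one_hot([s["machine_type"] for s in samples])
previous_pressure = lag_feature(samples, "pressure")
print(categories, encoded)
print(previous_pressure)
```

Lag features of this kind let a model see what happened to the part processed just before the current one, which a row-by-row view of the data would otherwise hide.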


We experimented with different algorithms to solve the problem. To evaluate the results, we used the Matthews Correlation Coefficient score. We achieved the best score using the Extreme Gradient Boosting classifier.
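
The Matthews correlation coefficient (MCC) is a natural fit for imbalanced data because it takes all four cells of the confusion matrix into account. The sketch below implements the standard formula and compares it with accuracy on invented confusion counts; the numbers are illustrative only and are not results from this project.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)),
    defined here as 0.0 when the denominator is zero.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


def accuracy(tp, tn, fp, fn):
    """Share of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical case: 1,000 samples, 10 of them failures (the "positive" class here).
always_normal = dict(tp=0, tn=990, fp=0, fn=10)  # model never predicts a failure
some_skill = dict(tp=6, tn=986, fp=4, fn=4)      # model catches 6 of 10 failures

for name, counts in [("always normal", always_normal), ("some skill", some_skill)]:
    print(f"{name}: accuracy={accuracy(**counts):.3f}, MCC={mcc(**counts):.3f}")
```

Both hypothetical models score around 99% accuracy, yet the one that never predicts a failure gets an MCC of zero, which is why MCC is the more informative score when failures are rare.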

Business value

The outcomes of our appliance manufacturing case study highlight the effectiveness of data science methods, particularly machine learning for equipment failure prediction, in addressing the challenge of minimizing equipment failure issues in the manufacturing sector.

Specifically, some of the value that our client gained from the project included:

  • Our ML model showed over 90% accuracy in machine failure prediction, which later helped our client reduce its maintenance costs by 5%

  • Since the manufacturer could now predict which parts of the equipment were most likely to fail, it could improve its production line performance and save time and money

  • Modeling the data would help the manufacturing company better predict which equipment parts have quality defects. Overall, this information would enable the company to adhere to its quality standards and keep poor-quality components out of its final product

There is one more important thing to mention. Our Intelliarts team didn’t just develop a machine failure prediction model for our client but carried out a full-cycle data science project. We deployed the model, set up its monitoring for the manufacturer, and built dashboards so the company could track the model results.

With production line monitoring, we aimed to provide an extra benefit to the client. If the data changed or the model’s performance degraded, they would know it immediately, and we could retrain the model or investigate where the solution failed.
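
As an illustration of what such a data check might look like, the sketch below flags a batch of sensor readings whose mean drifts too far from a reference window. It is a deliberately simple stand-in with made-up numbers and a hypothetical threshold, not a description of the monitoring we actually deployed.

```python
import statistics


def mean_shift_detected(reference, batch, threshold=3.0):
    """Flag a batch whose mean is more than `threshold` standard errors
    away from the mean of the reference data."""
    ref_mean = statistics.fmean(reference)
    ref_std = statistics.stdev(reference)
    standard_error = ref_std / len(batch) ** 0.5
    batch_mean = statistics.fmean(batch)
    if standard_error == 0:
        return batch_mean != ref_mean
    return abs(batch_mean - ref_mean) / standard_error > threshold


# Hypothetical readings of one sensor.
reference = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7]
stable_batch = [10.1, 9.9, 10.0, 10.2]
shifted_batch = [11.0, 11.2, 10.9, 11.1]

print(mean_shift_detected(reference, stable_batch))
print(mean_shift_detected(reference, shifted_batch))
```

A check like this would run per feature on each new batch, with alerts feeding the same dashboards that track model quality.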

NB: False defect detection is another common business challenge for manufacturing, check this case study on how we helped the company solve this issue.

Wrap up

Think your production line has zero chance of getting any more productive? Think again…

Minimizing equipment failure is a great way to improve your production line performance and, with it, the productivity of your manufacturing company in general. A reliable machine learning system could help manufacturers forget about repeated machine breakdowns, lengthy downtimes, and increasing maintenance costs.

And in case you’re looking for untapped opportunities for improving your manufacturing performance, we at Intelliarts love to help companies solve their business challenges with the help of data science and machine learning. Reach out to us to get to know how to reduce your equipment failures and boost your business productivity.


Yurii Laba
DS/ML Engineer