Neural Network Architecture

2021-11-29T04:50:17+00:00

Building a Neural Network Architecture to Achieve 98% Accuracy in Predicting Missing Data

Sigmoid used a neural network-based deep learning library to predict missing values with up to 98% accuracy and to automate the entry of missing data fields.

Business Challenges

The customer is a leading semiconductor manufacturer that produces chips for markets ranging from mobility to computing and industrial IoT. Once the chips were manufactured and ready to ship, data such as chip description, cost, and chip size (both categorical and numerical) was collated into the product master data, which was used by various processes and functions. The sales team used the data for cost estimation, whereas the finance team used it for revenue forecasting. However, gaps in data management left the table with many missing values and data quality issues, resulting in lost sales opportunities and revenue forecasting errors.

More than 15 global product teams across multiple geographies worked together to fill in the missing values manually. The process took weeks and incurred high man-hour costs. Moreover, because the table was constantly evolving, the manual exercise never ended.

Sigmoid’s Solution

We adopted a classification-based approach and used DataWig, Amazon’s neural network-based deep learning library, to build prediction models that impute missing values. The library predicted missing string values (categorical and textual data) with high accuracy, which typical classification methods fail to do. It also allowed us to create vectors and columns from the string data without writing custom code. We productionized the classification-based ML models using SageMaker and set up an approval mechanism that approves or rejects each model based on its model score and key performance metrics.
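
To illustrate what treating imputation as classification means, here is a minimal sketch in plain Python. It is not the library used in the project and not Sigmoid’s implementation: it swaps in a small naive Bayes classifier over character n-grams, a rough stand-in for the way string columns can be turned into features automatically. The class name, column names, and sample rows are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Split a lowercased, space-padded string into character n-grams."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramImputer:
    """Illustrative naive Bayes imputer for one categorical column."""

    def fit(self, rows, input_col, target_col):
        self.input_col, self.target_col = input_col, target_col
        self.class_counts = Counter()
        self.gram_counts = defaultdict(Counter)
        self.vocab = set()
        for row in rows:
            label = row.get(target_col)
            if label is None:
                continue  # rows with a missing target are filled later, not trained on
            self.class_counts[label] += 1
            for gram in char_ngrams(row[input_col]):
                self.gram_counts[label][gram] += 1
                self.vocab.add(gram)
        return self

    def predict_one(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Laplace smoothing keeps unseen n-grams from zeroing out a class.
            denom = sum(self.gram_counts[label].values()) + len(self.vocab)
            score = math.log(count / total)
            for gram in char_ngrams(text):
                score += math.log((self.gram_counts[label][gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def impute(self, rows):
        """Return copies of the rows with missing target values predicted."""
        filled = []
        for row in rows:
            row = dict(row)
            if row.get(self.target_col) is None:
                row[self.target_col] = self.predict_one(row[self.input_col])
            filled.append(row)
        return filled


# Hypothetical product master rows; the last one has a missing segment.
products = [
    {"description": "5G mobile modem chipset", "segment": "mobility"},
    {"description": "mobile application processor", "segment": "mobility"},
    {"description": "laptop compute processor", "segment": "computing"},
    {"description": "desktop compute platform chipset", "segment": "computing"},
    {"description": "mobile modem for handsets", "segment": None},
]

imputer = NgramImputer().fit(products, "description", "segment")
filled = imputer.impute(products)
```

Rows that already have a value act as training data, and rows with a gap are treated as unlabeled examples to classify. A production setup would hold out part of the labeled rows to measure accuracy before any predicted value is written back.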

Business Impact

Sigmoid’s solution saved time and cost, reduced manual effort, and significantly corrected revenue forecasting. The new solution was 90% more efficient in predicting categorical data and recorded a standard deviation of less than 0.5 for the numeric columns.
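
Metrics of this kind could be computed on a held-out sample and fed into an approve-or-reject check for each model. The sketch below is only an illustration under stated assumptions: it assumes accuracy as the categorical score and the standard deviation of prediction errors as the numeric measure (the case study does not say exactly what the standard deviation is taken over). The function names, thresholds, and sample values are hypothetical.

```python
from statistics import pstdev


def accuracy(actual, predicted):
    """Share of categorical predictions that match the known values."""
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


def error_std(actual, predicted):
    """Population standard deviation of the numeric prediction errors."""
    return pstdev([p - a for a, p in zip(actual, predicted)])


def approve_model(cat_accuracy, num_error_std,
                  min_accuracy=0.90, max_error_std=0.5):
    """Approve only if both (hypothetical) thresholds are met."""
    return cat_accuracy >= min_accuracy and num_error_std < max_error_std


# Hypothetical held-out values for one categorical and one numeric column.
actual_segment = ["mobility", "computing", "iot", "mobility", "computing",
                  "iot", "mobility", "computing", "iot", "mobility"]
predicted_segment = ["mobility", "computing", "iot", "mobility", "computing",
                     "iot", "mobility", "computing", "iot", "computing"]
actual_cost = [1.20, 2.50, 3.10, 4.00]
predicted_cost = [1.30, 2.40, 3.30, 3.90]

segment_accuracy = accuracy(actual_segment, predicted_segment)
cost_error_std = error_std(actual_cost, predicted_cost)
approved = approve_model(segment_accuracy, cost_error_std)
```

Keeping the thresholds as parameters lets each column or product line apply its own acceptance criteria without changing the check itself.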
