The Death of the Data Cleanse

What is data cleansing? Data cleansing is a complex, multi-step process that requires a specialized set of software, people and procedures. The data cleansing process is typically handled by internal resources or external consulting firms. Whatever route is chosen, database cleaning services are a drain on human and financial capital, which is why this antiquated process is in a death spiral.

Internal Data Cleansing Strategy Expends Valuable Human Resources

From an internal standpoint, the Master Data Management (MDM) or Information Systems (IS) teams at companies are charged with keeping materials data clean and free of redundancies, while providing guidance for strategic data opportunities. These teams are on the front lines of managing bad data, so they spend vast amounts of time, effort and money on what is a laborious manual process that, in most cases, involves serious spreadsheet management. This effort is an ongoing battle that can become a long-term drain on valuable company resources.

Costly External Data Cleansing Companies Compromise Data Integrity

Companies may choose to hire external consulting firms that often rely on offshore resources to clean data and provide data exports. Their teams spend cycles manually checking the data and then cleaning it up for export back into the company’s ERP or other inventory management system—a process that can take months. The real issue with the database cleansing approach is that once the cleansed data is uploaded back to the system, it can quickly become dirty again if no controls are put into place to avoid bad data habits, such as inconsistent part naming across the organization.
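One simple control against inconsistent part naming is to compare normalized names before a new record is accepted. The following is a minimal sketch, not a method described in this article: the helper names, the sample part IDs and the normalization rules (uppercase, punctuation stripped, words sorted) are all assumptions made for illustration.

```python
import re
from collections import defaultdict

def normalize_name(name: str) -> str:
    """Build a comparison key: uppercase, punctuation to spaces, words sorted."""
    tokens = re.sub(r"[^A-Z0-9]+", " ", name.upper()).split()
    return " ".join(sorted(tokens))

def find_duplicates(parts):
    """Group part IDs whose names normalize to the same key."""
    groups = defaultdict(list)
    for part_id, name in parts:
        groups[normalize_name(name)].append(part_id)
    # Only keys shared by more than one part indicate a likely duplicate.
    return [ids for ids in groups.values() if len(ids) > 1]

# Hypothetical material records: (part ID, free-text description).
parts = [
    ("A-100", "Bearing, Ball 6204"),
    ("B-207", "ball bearing 6204"),
    ("C-315", "V-Belt A42"),
]

print(find_duplicates(parts))  # [['A-100', 'B-207']]
```

A check like this only catches differences in word order, case and punctuation. Abbreviations, misspellings and differing units would need fuzzier matching.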

The other downside of the database cleansing approach is cost. Depending on what’s included, the typical cost for data cleansing services for a database of 10,000 records ranges from $5,000 to $15,000.

The process of cleansing data can involve any or all of the following activities at these associated costs:

  • Removing duplicate records: $1,000 to $3,000, depending on whether an automated de-duplication tool or a de-duplication service provider is used.
  • Appending missing data: $0.50 to $5.00 per record, depending on data type and source.
  • Validating and cleaning bad records: $0.05 to $1.00 per record, depending on the data provider and what is verified.
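To see how these line items add up, the arithmetic can be sketched as a small estimator. The function name and the record counts in the example are hypothetical, and the code assumes that the validation price, like the append price, is charged per record.

```python
# Price ranges quoted in the list above, held in cents to avoid float rounding.
DEDUP_FLAT = (100_000, 300_000)   # $1,000 to $3,000 per project
APPEND_PER_RECORD = (50, 500)     # $0.50 to $5.00 per record
VALIDATE_PER_RECORD = (5, 100)    # $0.05 to $1.00, assumed to be per record

def estimate_cleanse_cost(records_to_append, records_to_validate, dedupe=True):
    """Return (low, high) cost in dollars for a hypothetical cleansing project."""
    low = high = 0
    if dedupe:
        low += DEDUP_FLAT[0]
        high += DEDUP_FLAT[1]
    low += records_to_append * APPEND_PER_RECORD[0]
    high += records_to_append * APPEND_PER_RECORD[1]
    low += records_to_validate * VALIDATE_PER_RECORD[0]
    high += records_to_validate * VALIDATE_PER_RECORD[1]
    return low / 100, high / 100

# Hypothetical project: 1,000 records need data appended, 5,000 need validation.
low, high = estimate_cleanse_cost(records_to_append=1_000, records_to_validate=5_000)
print(f"Estimated cost: ${low:,.2f} to ${high:,.2f}")
# Estimated cost: $1,750.00 to $13,000.00
```

The spread between the low and high figures shows how strongly the final bill depends on how many records need each activity and on the per-record rate the provider charges.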

Although the expense is high, the cost of unreliable data, production and deliveries is even higher. The manual, short-term nature of these approaches calls for an automated, intelligent solution.

AI Cleanses Data in Real Time, Reduces Inventory for Fortune 500 Manufacturer

Artificial intelligence (AI) and machine learning (ML) as data cleaning tools offer companies the option to accurately and efficiently self-cleanse their materials data in real time.

Take, for example, the average Fortune 500 manufacturer, which has anywhere from $40 million to $60 million in wasted working capital tied up in excess parts (and up to $100 million for just four weeks of inventory in some industries). “Dirty” data is a huge part of this costly problem. Redundant and disorganized materials inventory results in over-ordering, obsolete and overstocked items, and inaccurate inventory forecasts. For these reasons, manufacturers often choose to cleanse the dirty data, but the task is time- and resource-intensive, which is why one Fortune 500 manufacturer turned to AI for its inventory optimization project.

After several acquisitions, the Fortune 500 pulp and paper manufacturer, which has more than 60 North American facilities, needed to cleanse data in its maintenance, repair and operating (MRO) inventory. The company’s chief supply chain officer (CSCO) introduced an aggressive initiative to reduce working capital by $5 million, obtain enterprise inventory visibility, and rebalance inventory through its virtual MRO inventory network. When the team began to execute the MRO inventory optimization plan, it quickly became apparent that data quality was inconsistent and that the team lacked visibility across the company’s multiple systems containing material master (MM) data, including SAP.

After assessing the scope of the necessary data cleanse, the team estimated it would fall at least one year behind its inventory reduction targets if employing only traditional cleansing methods and technologies. As a promising alternative, the team turned to an AI-enabled platform.

With AI, the team was able to lay the appropriate data foundation by structuring its MM data in weeks instead of a full year. The platform automated the material entry process and harmonized the inventory data across all MM catalogs. It also provided an accurate, trustworthy view of enterprise MRO inventory as required.

The AI-enabled platform ultimately cleansed the MRO data, allowing the company to begin its own “self-cleansing” data strategy, as well as an internal “buy-from-self-first” inventory optimization strategy. Additionally, once the data was cleansed, the platform delivered optimized insights on where inventory could be reduced as well as how to avoid creating additional excess.

In just two months, the AI-enabled platform identified enough potential reductions to meet the $5 million goal. Although the platform revealed that the company’s inventory problem was worse than estimated, its fast results and visibility into which items were truly overstocked encouraged the CSCO to raise the MRO savings target from $5 million to $20 million for the same time period as the original plan. The platform continues to learn from the company’s inventory activities, execute data cleansing and provide scaled inventory optimization insights and opportunities.

AI Tolls the Bell for the Traditional Data Cleanse

With results like those this pulp and paper manufacturer achieved, the death knell is ringing for data cleansing as it is currently known. Static traditional methods lack intelligence and the ability to learn and infer, while AI data cleaning allows a company to continuously build knowledge and improve its inventory management without draining financial or human capital.
