6 Reasons to “Clean” Industrial Data at the Edge

How much time do you spend cleaning data?

A modern industrial facility can easily produce a terabyte of data each day. If your factory is like most connected operations, you probably have large volumes of this raw data streaming from connected devices to existing enterprise systems, bespoke databases, and a cloud data lake. This architecture often leads to inconsistent or even unusable data for several reasons.

We know the cloud is a key tool for digital transformation. It provides the scalability and storage capacity you need to collect and interpret vast amounts of data coming from the operational technology (OT) level.

However, cloud services are by nature IT-centric tools. They structure data differently than operational systems do, which means IT must spend a lot of time cleaning the data before the line of business can use it. And if the data moves directly to different enterprise systems, multiple teams across the organization will clean the data independently, leading to different versions of the truth.

Cleaning, contextualizing, and modeling data at the edge—before it reaches the cloud—can solve these challenges. Here are six ways an edge-native approach can take your digital transformation to the next level.

1. Give the OT Team Control
Working with industrial data from PLCs, machine controllers, RTUs, and smart sensors is an integration challenge. This data was put in place for process control, not for typical cloud use cases like predictive asset maintenance or traceability. Even when the data is accessed over OPC, it is structured by the underlying device protocols. As a result, it is not standardized across similar assets or processes, it does not have consistent units of measure, and it lacks context. All of these problems can be fixed by someone who knows the production machinery and the automation devices that control it. The OT team is the only team that can effectively decode the data, and their systems are the edge. Cleaning data at the edge puts control and responsibility in the hands of the team most capable of doing this work efficiently.
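
To make this concrete, here is a minimal Python sketch of the kind of cleanup the OT team might define at the edge: mapping raw device addresses to a named asset and attribute, and converting readings to one unit of measure. The tag addresses, asset names, and the TAG_MAP structure are all invented for illustration and do not come from any particular product.

```python
from dataclasses import dataclass

# Hypothetical metadata maintained by the OT team. The tag addresses,
# asset names, and units are made up for this example.
TAG_MAP = {
    "PLC1.DB10.W4": {"asset": "Press-01", "attribute": "oil_temperature", "unit": "degF"},
    "PLC2.N7:12": {"asset": "Press-02", "attribute": "oil_temperature", "unit": "degC"},
}


@dataclass
class Reading:
    asset: str
    attribute: str
    value: float
    unit: str


def to_celsius(value, unit):
    """Convert a temperature to degrees Celsius."""
    if unit == "degC":
        return value
    if unit == "degF":
        return (value - 32.0) * 5.0 / 9.0
    raise ValueError(f"unsupported unit: {unit}")


def contextualize(tag, raw_value):
    """Attach asset context to a raw tag value and standardize its unit."""
    meta = TAG_MAP[tag]
    value = round(to_celsius(raw_value, meta["unit"]), 2)
    return Reading(meta["asset"], meta["attribute"], value, "degC")
```

With this in place, two presses that report temperature in different units through different addresses both show up downstream as the same attribute in the same unit.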

2. Optimize System Maintenance
Over time, the factory changes: you introduce new products, optimize processes, and replace machinery. Each change affects integrated data systems across the organization, and you need to update every system that collects, communicates, or processes data related to the change on the plant floor. When companies choose to clean data in the cloud, the IT team maintains the applications and integrations without knowledge of, or context for, the changes happening in the factory. The result is reactive maintenance when data goes missing or integrations break. Conversely, a single edge-based application lets the company maintain only one system for change management. This can dramatically increase efficiency.

3. Provide a Single Version of the Truth
Remember that meeting where everyone had a different interpretation of the same data? Maybe the operations leader based her OEE report on a single eight-hour shift, while the executive team used a cloud analytic that measured efficiency on a 24-hour day. Or maybe a supply chain executive calculated a different figure for scrap costs than the operations team after pulling the information from a cloud data lake instead of the MES platform.

Contextualizing data and defining metrics as close to the machinery as possible means that all the systems that get the data will be working off a single source of truth instead of performing custom transformations at the ingest of each system.
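
As a small illustration of defining a metric once, the Python sketch below computes OEE with the commonly used formula (availability × performance × quality) over an explicitly stated time window. The function name and parameters are assumptions made for this example; your own definition of planned time or ideal cycle time may differ, which is exactly why it helps to fix it in one place.

```python
def oee(planned_s, run_s, ideal_cycle_s, total_count, good_count):
    """Compute OEE for one explicit time window.

    planned_s      planned production time in the window, in seconds
    run_s          actual run time in the window, in seconds
    ideal_cycle_s  ideal time to produce one part, in seconds
    total_count    parts produced in the window
    good_count     parts that passed quality checks
    """
    if planned_s <= 0 or run_s <= 0 or total_count <= 0:
        raise ValueError("planned_s, run_s, and total_count must be positive")
    availability = run_s / planned_s
    performance = (ideal_cycle_s * total_count) / run_s
    quality = good_count / total_count
    return availability * performance * quality


# Example window: one eight-hour shift with seven hours of run time.
shift_oee = oee(planned_s=8 * 3600, run_s=7 * 3600,
                ideal_cycle_s=10, total_count=2000, good_count=1900)
```

Because the window is an input rather than something each system picks for itself, a shift report and a 24-hour report use the same calculation and differ only in the window they state.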

4. Reduce Latency Issues
Cleaning data is expensive and slow. You certainly don’t want to clean it multiple times for different applications. If you’re sending data to the cloud and then restructuring it before sending it to other systems, you’re going to encounter latency issues. Streaming data from the edge avoids the latency that is common with batch-processed data in the cloud.

Cloud and enterprise systems don’t interpret operational data very well. They typically require data to be presented in a different format, such as name-value pairs, rather than the standard operational model of ID, value, quality, and time stamp. Edge-based modeling tools designed specifically for the OT team provide a standardized way to present the data to multiple systems across the organization, in the format and at the frequency those systems want to consume it. This reduces the latency and expense of batch cleaning in each system.
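
The format difference can be sketched in a few lines of Python. This example assumes each operational sample is a dictionary with id, value, quality, and timestamp keys, that quality is the string "good" or "bad", and that time stamps are ISO 8601 UTC strings in one consistent format. Those field names and the names lookup table are assumptions for illustration, not a specific product's schema.

```python
def to_name_value(samples, names):
    """Flatten OT samples (ID, value, quality, time stamp) into one name-value record.

    Samples whose quality is not "good" are left out of the record.
    """
    good = [s for s in samples if s["quality"] == "good"]
    if not good:
        return {}
    record = {names[s["id"]]: s["value"] for s in good}
    # ISO 8601 UTC strings in the same format sort chronologically as plain text,
    # so max() picks the most recent time stamp.
    record["timestamp"] = max(s["timestamp"] for s in good)
    return record


# Hypothetical ID-to-name mapping and samples.
names = {1: "press01.oil_temperature_c", 2: "press01.cycle_count"}
samples = [
    {"id": 1, "value": 71.2, "quality": "good", "timestamp": "2024-01-01T00:00:00Z"},
    {"id": 2, "value": 0.0, "quality": "bad", "timestamp": "2024-01-01T00:00:01Z"},
]
record = to_name_value(samples, names)
```

Dropping bad-quality values is only one possible policy. Some consumers may prefer to receive the value along with its quality flag.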

5. Minimize Costs
Transforming data in the cloud is not free. When you push “all the OT data” to the cloud without specific use cases or plans, you’re typically dealing with more data points than you actually need. This increases the volume of data you ingest and store and the bandwidth you consume. Cleanup in the cloud also requires processing and secondary storage, which adds further cost. When you transform data at the edge for a specific use, you reduce the burden on your cloud system by sending only the data you need, at the frequency the use case requires.
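
One simple way to picture this is a filter that runs at the edge before anything is sent. The Python sketch below keeps only the tags a use case needs and forwards at most one sample per tag per interval. The function name, the (tag, time, value) tuple layout, and the sample data are assumptions made for this example.

```python
def filter_for_cloud(samples, wanted, min_interval_s):
    """Keep only wanted tags, and at most one sample per tag per min_interval_s.

    samples is an iterable of (tag, time_in_seconds, value) tuples in time order.
    """
    last_sent = {}
    out = []
    for tag, ts, value in samples:
        if tag not in wanted:
            continue  # this use case does not need the tag
        prev = last_sent.get(tag)
        if prev is not None and ts - prev < min_interval_s:
            continue  # too soon since the last forwarded sample for this tag
        last_sent[tag] = ts
        out.append((tag, ts, value))
    return out


# Hypothetical one-second samples, reduced to one temperature value per minute.
raw = [("temp", 0, 20.0), ("temp", 1, 20.1), ("vib", 1, 0.3), ("temp", 60, 20.4)]
to_send = filter_for_cloud(raw, wanted={"temp"}, min_interval_s=60)
```

Here four raw samples become two, and the vibration tag never leaves the plant because no cloud use case asked for it.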

6. Ensure Your Data is Secure
Industrial data starts at the device. To keep it secure, it is important to minimize both the distance it travels and the number of hardware devices and software applications it passes through. Pushing all data to a cloud system to be cleaned and then pulling it back down to an on-premises application is not only expensive and slow but also less secure than routing the data through the internal network.

Data security also requires that we use secure protocols and understand how system integrations affect the movement and accessibility of industrial data. Integrations to different systems are established over time. Knowing which systems access which data, and from which sources, is important but rarely well documented. A proactive approach is critical: make a single application the clearing house for all industrial data integrations, so administrators can see and manage what data is being sent to which systems.
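
The clearing-house idea can be sketched as a single routing table that answers both questions: what each system is allowed to receive, and who receives a given piece of data. The destination names, attribute names, and ROUTES structure in this Python example are hypothetical.

```python
# Hypothetical routing table: destination system -> attributes it may receive.
ROUTES = {
    "cloud_data_lake": {"press01.oil_temperature_c", "press01.cycle_count"},
    "mes": {"press01.cycle_count", "press01.scrap_count"},
}


def consumers_of(attribute):
    """List every destination that receives the given attribute."""
    return sorted(dest for dest, attrs in ROUTES.items() if attribute in attrs)


def payload_for(destination, record):
    """Return only the fields of a record that the destination is allowed to receive."""
    allowed = ROUTES[destination]
    return {name: value for name, value in record.items() if name in allowed}
```

Because every integration goes through the same table, an administrator can audit data flows by reading one structure instead of reverse-engineering each point-to-point connection.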

Closing Thoughts

Digital transformation depends on edge and cloud technologies working together to put insights into the right hands in real time. I hope this article helped explain how the edge can play an important role in cleaning, contextualizing, and modeling industrial data so that it’s ready for use by IT and the line of business. How will you use the edge to advance your digital transformation projects?