To get from raw information to big savings, data analytics mainly involves turning data from widely different sources, formats, protocols and communications systems into equivalent forms that can be compared to reveal trends and opportunities for improvement.
"We're seeing growing demand from users who want confidence in analytics and getting information to the presentation layer. Some system integrators are even merging to do analytics better," says Dan Riley, analytics manager at Interstates, a system integrator in Sioux Center, Iowa, and certified member of the Control System Integrators Association (CSIA). "Users want to do advanced analytics, but first they need basic analytics functions for data aggregation and cleaning before it can be presented. We pull information from hardware and software, and it's often messy, out of spec, or comes from devices that have lost calibration. The marketing hype about 'data analytics' was going full steam two years ago, along with other buzzwords like 'digital twin' and 'asset performance management.' However, it doesn't take into account the unglamorous side of methodically, intentionally and thoroughly cleaning, preparing and making data consistent for useful analysis."
Pull problems from the patchwork
Riley adds that process plants typically run multiple controllers and patchworks of control systems, but bringing them together to present their data is an even bigger challenge. "Doing overall equipment effectiveness (OEE) on one line isn't enough," he says. "Where we could just focus on return on investment (ROI) before, now we have to run operations without everyone onsite and learn to be even more resilient. In any case, the first task is still picking a problem statement, and then finding the analytics it needs and the right technical solution. This approach is important because trying to choose the technology first can send you in the wrong direction, so even if a solution is found, it may not be able to scale to other parts of the process or plant."
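For readers who haven't computed it, OEE is the product of availability, performance and quality. A quick worked example in Python, with shift numbers made up for illustration:

```python
# OEE for a single line: availability x performance x quality.
# All inputs below are illustrative, not data from the article.
planned_time_min = 480   # one shift of planned production
downtime_min = 45        # recorded stops
ideal_cycle_s = 2.0      # ideal seconds per unit
total_units = 11_000
good_units = 10_450

run_time_min = planned_time_min - downtime_min
availability = run_time_min / planned_time_min                 # 0.906
performance = (ideal_cycle_s * total_units) / (run_time_min * 60)  # 0.843
quality = good_units / total_units                             # 0.950

oee = availability * performance * quality
print(f"OEE = {oee:.1%}")  # about 72.6%
```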
Riley reports Interstates usually integrates continuous and batch processing applications in the food and beverage, consumer goods, specialty chemicals, and value-added agricultural feed and water industries. And, even though these applications, their problems and their analytics outcomes are different, the procedure for solving them is often the same. "The issues we see most often are condition monitoring, downtime events, taking a long time to diagnose problems, and surprises impacting the health of equipment and lines," says Riley. "Users typically want analytics that can provide anomaly detection, but they also want the Holy Grail of predictive maintenance, which can look further into the future, be available closer to real-time for faster response, and cost less."
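Anomaly detection in this sense can start simply. One common baseline, shown here as a sketch rather than Interstates' actual method, flags readings that stray too far from their own recent history; the window size and z-score threshold are assumptions to be tuned per asset:

```python
import pandas as pd

def flag_anomalies(signal: pd.Series, window: int = 288,
                   z_limit: float = 3.0) -> pd.Series:
    """Mark readings more than z_limit standard deviations from the
    trailing-window mean (e.g., 288 samples = one day at 5-min polls)."""
    mean = signal.rolling(window, min_periods=window // 2).mean()
    std = signal.rolling(window, min_periods=window // 2).std()
    return ((signal - mean) / std).abs() > z_limit
```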
Talk to people to determine analytics
To achieve these goals by finding the right problems and the analytics that can solve them, Riley adds that system integrators and their clients must first talk to the right people on their operations, controls and engineering teams about what sensing, signals and parameters they need for better decisions. Second, they need to explore their existing systems for current problems and determine what analytics they're presently using to solve them. (There are still many clipboards out there and many users manually copying data into Excel spreadsheets.) The results of this investigation will indicate what problems they want to solve in the future and the analytics they'll need because existing control and data systems usually aren't within each other's 'line of sight.' Third, the teams must move data from these islands of automation to a controlled location by applying middleware, such as Kepware's KEPServerEX, Telit's deviceWise, the Node-RED development tool, or the free, open-source Konstanz Information Miner (KNIME). These middleware tools can talk to PLCs, HMIs and other plant-floor devices, and deliver production data to cloud-computing services, including server-based, on-premises versions. The data can then be analyzed by inexpensive tools like Microsoft's Power BI or other analytics software.
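In miniature, this data movement looks like the sketch below, which polls one PLC tag over OPC UA with the open-source python-opcua client and lands it in a local store for later analysis. The endpoint URL, node ID and table layout are placeholders; in practice, the middleware packages above handle this translation off the shelf:

```python
import sqlite3
import time

from opcua import Client  # open-source python-opcua (FreeOpcUa) client

# Endpoint and node ID are hypothetical placeholders.
client = Client("opc.tcp://plc.example.local:4840")
db = sqlite3.connect("line_data.db")
db.execute("CREATE TABLE IF NOT EXISTS readings (ts REAL, tag TEXT, value REAL)")

client.connect()
try:
    node = client.get_node("ns=2;s=Line1.Mixer.Temperature")
    for _ in range(60):  # poll once a second for a one-minute demo
        db.execute("INSERT INTO readings VALUES (?, ?, ?)",
                   (time.time(), "Line1.Mixer.Temperature", node.get_value()))
        db.commit()
        time.sleep(1)
finally:
    client.disconnect()
    db.close()
```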
"This is the similar to using protocol drivers to reach devices and translate data," explains Riley. "Dealing with patchworks of devices, systems and networks used to make up 50-75% of the work on a typical integration and analytics project, but then we could get data quickly. It depends on how old or new the downstream are, though they're never as fast as the newer application. The good news is many of today's smart devices have translation and data conversion built-in, so we're not stuck in as many patchworks as before. The tools are easier and more drag-and-drop, but the key is still doing upfront integration now for better data analytics later. These fundamentals are unavoidable. You can't take a data scientist onto a messy plant-floor and expect actionable results. Raw data must be cleaned and translated by some kind of middleware."
Riley adds that the two main types of analytics messes involve transactional data, such as the instructions and transactions running in a manufacturing execution system (MES) or database, and time-series data (TSD), such as the operating conditions of devices at specific points in time. Cleaning up means getting data into consistent formats, but also accessing it close to real-time in its operations database or a second database location. "We usually want to move messy data off the controls layer because queries related to analytics can potentially overwhelm operating systems. It's important to be mindful of queries and what can happen when translating information from databases," adds Riley. "Performance of queries can pose more issues for a transactional database architecture. However, because TSD is usually pulled from a historian, we're more concerned about its settings in relation to the equipment we're observing and how eventual data models will be affected. Historians are compressed databases, so they only save data when a value changes by a set degree or when they're set to record for a scheduled period. They must operate consistently to get consistent data, and they typically handle control limits and metadata. However, if they aren't checked often, users may not see when motors or other devices drift outside the limits set for them."
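That save-on-change behavior is essentially a deadband filter with a timeout. A minimal sketch of the idea, with the deadband and fallback interval as assumed settings rather than any particular historian's defaults:

```python
DEADBAND = 0.5        # engineering units: store only moves bigger than this
MAX_INTERVAL = 60.0   # seconds: store a sample at least this often

def compress(samples):
    """samples: iterable of (timestamp_s, value) pairs, in time order.
    Yields the subset a historian-style exception filter would store."""
    last_ts = last_val = None
    for ts, val in samples:
        if (last_ts is None
                or abs(val - last_val) >= DEADBAND
                or ts - last_ts >= MAX_INTERVAL):
            last_ts, last_val = ts, val
            yield ts, val
```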