The challenges of artificial intelligence: Part 2

I’m helping us laugh and possibly learn by making one of my humorous books available to download for free in this year’s first six columns. The third book is Dispersing Heat Through Conviction. Humor can open minds and it can be fun to be silly. This book contains actual quotes by operators copied from a plant’s control room logbook and top 10 lists by proteges.

Dispersing Heat Through Conviction: The Funnier Side of Process Control

Greg: Now to get serious, we’ll continue to learn what can and can’t be done with artificial intelligence (AI) to help address the challenges in our profession. We’re fortunate to have longtime associate Randolf Riess provide insights about what does and doesn’t work in AI.

The most intense and rich content of my serious books is provided by equations that often include dynamic terms (e.g., deadtime, time constants and gains) and trend charts of test results, block diagrams, and concise process and instrument diagrams (P&ID).

How can AI extract knowledge from equations? Must I create sentences that capture the knowledge? Does AI understand mathematical terms such as multiplication, addition and division?

Randy: I think this is a great question because it focuses on how a process control engineer can use AI in day-to-day work. I think the core of the question is how AI can help you understand, rather than how AI can understand. AI won’t understand these concepts, but it can give a process control engineer the information needed to understand concepts relevant to the current situation. Ideally, we want to carry a tablet PC into the plant, ask the AI questions, and get answers that help.

There’s a huge wealth of process knowledge, including loop tuning and monitoring techniques, captured in books by amazing authors like yourself, Greg Shinskey, and a few other Gregs. But it's not realistic for someone to memorize everything in a book and recall that information off the top of their head. Likewise, it's impractical to carry five or six books with you, and stop to look up answers. People just don’t do that.

This is where AI can help. AI can be trained on the content from a set of books. Specifically, AI can be trained on question-and-answer pairs relevant to a process control engineer, where the answer is an excerpt from a book. This type of custom-trained AI takes natural language questions, and retrieves the best answers from those books.

AI can take a natural language query, such as “What do I need to know about oscillation periods in loop tuning?” and retrieve the best excerpt from the books it was trained on. AI provides a paragraph or two that explains this concept in the context of a process control engineer working on a control loop. It's like having an assistant that has memorized and understands the content of the books well enough to recite the answers most relevant to your specific question.

Greg: What if my question isn’t typed exactly as one of the questions on which the AI was trained?

Randy: This is a strong feature of AI technology. The models are trained to understand general English. That is, AI understands the same question even if it’s rephrased because it understands the semantics of the question.

I could ask the same question in a few ways:

What do I need to know about oscillation periods?
Please tell me about loop tuning and oscillation periods.
Can the oscillation period affect control of my process?
Is the period of oscillation important to loop tuning?

These are slightly different questions, all with similar semantic meanings. AI learns this when we train it on specific question-and-answer pairs from the book content. When you ask the same question differently, it still retrieves the correct book content because it understands the semantic meaning of the question as it relates to process monitoring and control.

Greg: Why not just use ChatGPT?

Randy: ChatGPT and other similar hosted large language models (LLMs) were trained on data from the Internet, and may have some knowledge about process control. However, as mentioned in part 1 of this topic, ChatGPT and other LLMs have a high error rate and may return an incorrect answer 10-20% of the time. I don’t recommend following ChatGPT or any other generative AI (genAI) model for instruction on tuning a PID loop because it may get it wrong. This is what the AI folks call a “hallucination.”

What I described is not a genAI model. What I’m talking about is a custom semantic search model that understands the special meaning of words and concepts, which lets it find the best answers to process control questions from a corpus of books. That is, it returns the actual excerpts from the books that best answer the user’s question, which avoids the AI making stuff up, aka returning an error. Also, unlike genAI, semantic search is deterministic, meaning the exact same question will return the exact same answer every time.
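
The retrieval idea described here can be illustrated with a minimal Python sketch. It is not the tool Randy describes: the `embed` function below is a stand-in that only counts words, whereas a real tool would call the trained embedding model, and the excerpts, function names and `top_k` parameter are hypothetical. What the sketch does show is that the search only ranks and returns stored excerpts, so the same question always returns the same result.

```python
import math
import re
from collections import Counter

# Hypothetical excerpts standing in for segments taken from books.
EXCERPTS = [
    "The oscillation period indicates how much deadtime is in the loop.",
    "Valve deadband causes a limit cycle in loops with integral action.",
    "Signal filters add lag and can hide real process variability.",
]

def embed(text):
    """Stand-in embedding: lowercase word counts (NOT a trained model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(question, excerpts, top_k=2):
    """Return the top_k stored excerpts ranked by similarity to the question."""
    q = embed(question)
    scored = [(cosine(q, embed(e)), i) for i, e in enumerate(excerpts)]
    # Sort by score (highest first); break ties by original order so the
    # ranking is fully repeatable.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [excerpts[i] for _, i in scored[:top_k]]

if __name__ == "__main__":
    for hit in search("Is the oscillation period important?", EXCERPTS):
        print(hit)
```

Because nothing is generated, every result is text that already exists in the stored excerpts. A word-count stand-in like this one matches only on shared words, which is exactly the limitation a trained embedding model is meant to overcome.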

Greg: If it’s not genAI, is it AI?

Randy: Yes. A custom semantic search tool trains an “embedding” model, which is similar to the first layers of a genAI model, called encoder layers. A custom embedding model translates words into a numeric vector. When you train the embedding model on a set of questions and answers, it starts to understand the special meanings of words in your specific context. For example, when we talk about a controller “gain,” the word gain has a specific meaning that’s very different from its use in normal English conversation. The custom embedding model learns the process control meaning of the word, and is better able to understand your question and return the best answer within the context of process control.

Greg: How do you go about building such a tool?

Randy: I’ll outline the basic steps and explain options on how to perform each step:

Identify a set of books that we believe holds the information we want to query.
Chunk up the book into sections, about one to three paragraphs in size, trying as best as possible to retain semantically consistent sections. The idea is to capture one topic per segment.
Generate probable questions that each segment answers. I find you can use an LLM to do this with a prompt that explains the role and context of the question to be generated. However, as mentioned before, LLMs are error prone. You’ll need to filter the questions, and look for patterns to remove common errors.
Once you have the questions generated for each chunk of the book, construct a data set with each row containing the book segment and one question that’s answered by that same book segment. This is a training set that will be used to train a custom embedding model.
Train a custom embedding model by downloading a pre-trained embedding model from Hugging Face, and fine-tuning it with your question-and-answer pair data set. Training a custom embedding model is how to get AI to learn the special meanings of words in the context of process control.
Once there is a custom embedding model that performs reasonably well, compute the embedding for each book segment, and store it in a data set along with the full text for that segment.
The last step is to deploy that model to a website with an interface that takes a natural language input, computes the embedding, and performs a simple “nearest neighbor” search for the best book segments that answer the question, and displays them to the user, ranked by relevance.
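
The chunking and training-set steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Randy's implementation: the function names, the `max_paragraphs` parameter and the sample text are hypothetical, the grouping is a simple fixed count of paragraphs rather than a topic-aware split, and the questions are assumed to have already been generated by an LLM and filtered by a person.

```python
import re

def chunk_paragraphs(text, max_paragraphs=3):
    """Split text on blank lines and group consecutive paragraphs into chunks.

    Assumption: paragraphs are separated by blank lines. A real chunker would
    also respect headings so that each chunk covers one topic.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [
        "\n\n".join(paragraphs[i:i + max_paragraphs])
        for i in range(0, len(paragraphs), max_paragraphs)
    ]

def build_training_rows(chunks, questions_by_chunk):
    """Build one row per (question, segment) pair for embedding training."""
    rows = []
    for idx, chunk in enumerate(chunks):
        for question in questions_by_chunk.get(idx, []):
            rows.append({"question": question, "segment": chunk})
    return rows

if __name__ == "__main__":
    # Placeholder text standing in for book content.
    book_text = "Paragraph one.\n\nParagraph two.\n\nParagraph three.\n\nParagraph four."
    chunks = chunk_paragraphs(book_text, max_paragraphs=2)
    # Hypothetical, already-reviewed questions keyed by chunk index.
    questions = {0: ["What does paragraph one cover?"], 1: ["What does paragraph four cover?"]}
    for row in build_training_rows(chunks, questions):
        print(row["question"], "->", row["segment"][:20])
```

Each row pairs one question with the segment that answers it, which is the shape of data the training step expects. A segment with several good questions simply appears in several rows.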

Greg: Sounds like I need a team of data scientists and software engineers to build it. Are there ways to leverage existing tools to build a book query?

Randy: Yes, the process is very technical and requires Python programming and website skills to build it yourself. However, in the last year, several companies have figured out this pattern, and built tools and products that perform most of the process. Very few of them allow training of a custom embedding model, which is important for niche use cases. Instead, they use an off-the-shelf embedding model trained on the general English language, so they can be expected to miss the complex nuances of the questions and answers. The main purpose of the custom embedding model is to learn the questions a process control engineer would ask, and associate them with the book segments that answer those questions.

You can use “PDF-to-query” products built by several companies that do everything except the custom embedding model. However, don’t expect them to return answers to complex questions using process control terms that are rarely used in general English. Other tools do parts of the process and allow for custom steps, such as training the custom embedding model. One example of a tool that will do most of this process and allow for custom embeddings is Unstructured.IO. However, it still requires users to build the website and real-time query code.

Greg: Once we build this tool to return answers to process control questions, how can it be used in day-to-day operations?

Randy: Once deployed to a web interface, a process control engineer can take a tablet PC into the plant and, while troubleshooting an issue, ask the tool questions like, “What are the recommendations for tuning a loop with overshoot and oscillation in a process with deadtime?” The input can be typed or spoken. This gives process control engineers access to relevant expert information in books without leaving the plant, finding the books, and paging through them to find specific answers. The real-world benefits are quicker and better resolution of process issues. This is a portable AI assistant that can fetch the right information from a set of books, quickly and in real time.

Greg: Could this be made into an “Ask Greg AI”?

Randy: Yes, wouldn’t that be totally cool! We can take several of your books and enable them to be queried by asking questions. This would be like having you, Greg, as an assistant for a process control engineer as they troubleshoot a problem, though without your sense of humor and personality. However, there are emerging AI tools that can make an AI chatbot that sounds and acts just like a specific person. An LLM chatbot based on your responses can be trained to sound like you, and could present the answers from your books in a way that sounds like you. That would be the ultimate in cool, an “Ask Greg AI” that people could carry around with them and use as an assistant. I think this is the future of technical books: to become AI voice chatbots that will carry the knowledge in books into the future.

Greg: If the answers include the most relevant page numbers in the books from which the answers were extracted, particularly those showing equations and figures, the process control engineer can gain a more detailed understanding of the best solutions at their leisure. This can also help promote the value of these books as a source of deeper knowledge, possibly making publishers willing to enable their books to be used for AI. We need to realize that AI does not replace books but provides more effective use of books.
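
One way to support page references is to store the book title and page range alongside each segment, so every returned excerpt can point back to its source. The Python sketch below is only an illustration of that idea: the `Segment` record, its fields, the `format_citation` helper and the book title are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    """A stored book excerpt plus the metadata needed to cite it."""
    book: str
    pages: tuple  # (first_page, last_page) of the excerpt
    text: str

def format_citation(segment):
    """Return a short source line such as 'Example Book Title, pp. 12-14'."""
    first, last = segment.pages
    pages = f"p. {first}" if first == last else f"pp. {first}-{last}"
    return f"{segment.book}, {pages}"

if __name__ == "__main__":
    # Placeholder values, not taken from a real book.
    seg = Segment(book="Example Book Title", pages=(12, 14), text="Excerpt text...")
    print(seg.text)
    print("Source:", format_citation(seg))
```

Keeping the page range with the segment means the search results need no extra lookup to show where the equations and figures for an answer can be found.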

Top 10 ways to impress your management with trend charts

Make large setpoint changes that zip past valve deadband and nonlinearities.
Change the setpoint to operate on the flat part of the titration curve.
Select the tray with minimum process sensitivity for column temperature control.
Pick periods when the unit was down.
Decrease the time span, so that just a couple of data points are trended.
Increase the reporting interval, so just a couple of data points are trended.
Use thick line sizes.
Add huge signal filters.
Increase the process variable scale span, so it’s at least 10 times the region of interest.
Increase the historian’s data compression, so most changes are screened out as insignificant.