From Concept to Reality: Understanding the Power of Digital Twins—Part I.
Although today’s tech media is dominated by news such as AR glasses built for the metaverse or the steady stream of new Large Language Models (LLMs) and image-generating models, several interesting developments are lurking in the background that we hear about far less often. One of these is the Digital Twin, a technology used mainly in industrial and engineering fields. But what exactly does the term mean, where does it come from, and what significance might it have in the near future? In today’s post, we explore the basics behind this integration of the physical and the digital world.
The concept known today as the Digital Twin (DT) appeared a long time ago, although in a very different form from the one it takes today. In fact, the first such “twin” was not even digital but analog, existing in the concrete physical world. It was used by engineers after the launch of Apollo 13 in April 1970. During the mission, one of the spacecraft’s oxygen tanks exploded, ruling out a landing on the lunar surface and seriously endangering the safety of the crew. At that point, the mission became a battle for survival and a famous rescue operation, during which engineers had to solve technical problems from a distance of 200,000 miles. The key to the rescue was that NASA had a twin model of Apollo 13 on Earth, which allowed engineers to test possible solutions remotely. The fact that candidate fixes did not have to be tried immediately under live conditions aboard the crippled spacecraft contributed significantly to the crew’s safe return to Earth.
Of course, today’s systems are predominantly virtual rather than physical simulations. But the basic idea remains the same: to create models that can be used to simulate, or even predict, the behavior of a system in operation. The term “digital twin” itself was first mentioned in 1998, referring to a digital replica of the voice of actor Alan Alda in the film “Alan Alda Meets Alan Alda 2.0”. Although the concept has been around since 2002, the design of such systems has only been part of the mainstream since around 2017.
Broader technology trends also play a significant role by creating an environment in which the development of DTs can become profitable. One such example is the Internet of Things (IoT), which provides access to vast amounts of data through the communication of smart devices. This abundance of data has greatly contributed to making DTs cost-effective, and thus indispensable for business.
But what exactly do we mean by a Digital Twin? In its most concise formulation, it is a virtual model of a physical object. In operation, it can track the lifecycle of an object and use real-time data sent by sensors on the object to simulate its behavior and verify its operation. Digital Twins can replicate a wide range of real-world objects, from individual pieces of factory equipment to entire facilities such as wind turbines, or even complete cities. The technology allows you to monitor the performance of an asset, identify potential faults, and make more informed decisions about maintenance and the asset’s lifecycle.
The technological concept of a DT thus creates a virtual replica of a physical object, system, or process that simulates reality and supports decision-making. Its key components in practice are the following (a minimal code sketch after the list shows how they might fit together):
- the physical object you want to model,
- the sensors responsible for collecting the data,
- the communication infrastructure that transmits this data to the DT,
- the infrastructure that stores and processes the data, and
- the model(s) responsible for interpreting the object’s behavior and, possibly, for making predictions about it.
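To make these components concrete, here is a minimal, purely illustrative Python sketch: a toy twin of an electric motor that ingests sensor readings, flags potential faults, and predicts the motor’s thermal behavior with a simple physical model (Newton’s law of cooling). All names here, such as `SensorReading` and `MotorTwin`, are invented for the example and are not part of any DT framework.

```python
import math
from dataclasses import dataclass, field
from typing import List

@dataclass
class SensorReading:
    """One data point transmitted by a sensor on the physical object."""
    timestamp: float      # seconds since monitoring started
    temperature: float    # degrees Celsius

@dataclass
class MotorTwin:
    """Toy digital twin of an electric motor, tracking its thermal state."""
    ambient_temp: float = 20.0    # assumed ambient temperature (illustrative)
    cooling_coeff: float = 0.01   # parameter of the physical cooling model
    history: List[SensorReading] = field(default_factory=list)

    def ingest(self, reading: SensorReading) -> None:
        """Data arriving over the communication infrastructure is stored here."""
        self.history.append(reading)

    def predict_temperature(self, seconds_ahead: float) -> float:
        """Physics-based prediction using Newton's law of cooling."""
        current = self.history[-1].temperature
        decay = math.exp(-self.cooling_coeff * seconds_ahead)
        return self.ambient_temp + (current - self.ambient_temp) * decay

    def fault_suspected(self, threshold: float = 90.0) -> bool:
        """Flag a potential fault when the latest reading exceeds a safe limit."""
        return bool(self.history) and self.history[-1].temperature > threshold

# Usage: feed in live sensor data, then query the twin instead of the motor.
twin = MotorTwin()
twin.ingest(SensorReading(timestamp=0.0, temperature=95.0))
print(twin.fault_suspected())           # True -> maintenance may be needed
print(twin.predict_temperature(60.0))   # expected temperature a minute ahead
```

Even in this toy form, the division of labor mirrors the list above: the sensor supplies the data, the `ingest` method stands in for the communication and storage infrastructure, and the cooling model interprets and predicts the object’s behavior.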
At first glance, there are many similarities between the components of a digital twin and those of any Machine Learning (ML) system. Both are data-driven, i.e., individual decisions are determined by patterns that can be extracted directly from the data. Both use mathematical and statistical modeling techniques. In ML, these can include regression models, decision trees, and even neural networks to identify and predict patterns, while in DTs the models are often based on physical laws. Of course, the situation is slightly more complex, since DTs can also integrate machine learning techniques to learn from the data and fine-tune their models.
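As a minimal sketch of this hybrid idea (assuming Python with NumPy and scikit-learn; `physics_model` and `hybrid_predict` are our own names, not a standard API), a physics-based prediction can be combined with a simple regressor trained on its residual error:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def physics_model(load: np.ndarray) -> np.ndarray:
    """First-principles estimate, e.g. temperature rise proportional to load."""
    return 20.0 + 0.5 * load

# Observed sensor data: the real system deviates from the idealized physics.
load = np.linspace(0.0, 100.0, 50)
observed = physics_model(load) + 0.002 * load**2 + np.random.normal(0.0, 0.5, 50)

# Train the ML part only on the residual (observation minus physics),
# so the data-driven component corrects what the physical law misses.
residual = observed - physics_model(load)
corrector = LinearRegression().fit(load.reshape(-1, 1), residual)

def hybrid_predict(new_load: np.ndarray) -> np.ndarray:
    """Physics-based prediction fine-tuned by the learned correction."""
    return physics_model(new_load) + corrector.predict(new_load.reshape(-1, 1))

print(hybrid_predict(np.array([80.0])))  # physics plus data-driven correction
```

The design choice is that the learned component only corrects what the physical law misses, so the twin degrades gracefully to the pure physics model when little operational data is available.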
To illustrate the difference, let’s first take a closer look at some of the basic features of DT and ML systems.
In general, the data used for ML models is much more heterogeneous and varied, and the goal of data collection is to build databases from which the models can learn. ML models today are increasingly general-purpose, meaning that they can provide knowledge applicable to many different situations and circumstances. This is particularly true of the Foundation Models in widespread use today: a collective name for large-scale deep neural networks trained on a broad spectrum of general, unlabeled data. Thanks to this diverse training data, they can perform a wide range of common tasks, such as understanding human language, generating text and images, and holding conversations in natural language. This kind of flexibility also distinguishes them from classical ML models, which were typically applicable to only a single task (e.g., clustering for a specific problem).
The data used for DTs is more specific, in a sense more concrete. To operate a twin, data about the object or process is needed to build a virtual model that can represent the behavior or state of its real-world counterpart. The reason is that a DT always functions (as already mentioned) as a “simulation” of a concrete physical object. The data collected may therefore relate to the object’s lifecycle and may include design specifications and engineering information; manufacturing data covering the equipment, materials, components, methods, and quality control used to produce the object; and operational data such as real-time feedback, historical analyses, and maintenance records. Other data used in digital twinning may include business data or end-of-life procedures.
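As an illustration of how lifecycle-specific such data can be, the following sketch shows one possible, entirely hypothetical record layout for a twin; every field name is invented for the example:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DesignData:
    cad_model_id: str         # reference to the design specification
    material: str
    rated_max_temp_c: float   # engineering limit from the spec sheet

@dataclass
class ManufacturingData:
    production_line: str
    batch_number: str
    quality_check_passed: bool

@dataclass
class OperationsData:
    running_hours: float
    last_maintenance: Optional[str]   # ISO date of the last service, if any

@dataclass
class TwinRecord:
    """Everything the twin stores about one physical asset over its lifecycle."""
    asset_id: str
    design: DesignData
    manufacturing: ManufacturingData
    operations: OperationsData
```

Unlike a general-purpose ML training set, every field here describes one concrete asset, which is exactly what makes the twin a faithful simulation of its physical counterpart.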
István ÜVEGES is a researcher in Computational Linguistics at MONTANA Knowledge Management Ltd. and a researcher at the HUN-REN Centre for Social Sciences, Political and Legal Text Mining and Artificial Intelligence Laboratory (poltextLAB). His main interests include practical applications of Automation, Artificial Intelligence (Machine Learning), Legal Language (legalese) studies and the Plain Language Movement.