SHION (Smart tHermoplastic InjectiON): An Interactive Digital Twin Supporting Real-Time Shopfloor Operations

Injection molding is widely used to produce plastic components in large lot sizes. However, quality failures occur during molding cycles. These can be minimized through real-time process monitoring. This article reports on a cloud-based digital twin (DT) that is supported by AI-based control of process parameters and can be used to help companies detect product failures in real time. Process parameters and their interrelationship with quality failure were studied and used to generate models for real-time prediction of part quality. Two injection manufacturing lines in industry were chosen for data acquisition, implementation, and validation of the DT. While the DT successfully predicted faulty products in real time, adoption of traditional cloud-centric Internet of Things (IoT) approaches poses unforeseen practical challenges, such as the risk of losing data due to network issues and the prohibitive cost of regularly transferring a large amount of data to cloud services.

The plastics industry in Europe employed more than 1.6 million people and contributed €28.8 billion to the economy in 2018. Injection molding is the most widely used manufacturing process for producing plastic components. It includes four main stages: plasticization, injection, cooling, and ejection. Plastic material is fed by gravity from a hopper into a heated barrel. The plastic material melts as it moves forward in the heated barrel and is then injected under high pressure into the mold. The melt is cooled in the mold to ensure dimensional stability, and the part is ejected once it has cooled.
Part quality and dimensional integrity define injection molding productivity. The quality of molded products depends on the complex interaction of multiple factors, such as the accuracy of the injection molding machine, the quality or grade of the plastic material, and the process parameters. It is challenging to guarantee the consistency and quality of parts produced by injection molding, since quality failures may occur when these factors vary during molding cycles. Therefore, process monitoring and anomaly detection of the produced parts are essential to ensure that the quality of the delivered products meets customers' specifications. Shop-floor workers usually conduct a quality check on each piece, and the quality department periodically checks sample parts. Automating this process will save material and time. Thus, the ability to detect a production failure in real time will have a high impact on production quality and productivity.
A DT plays an important role in enabling companies to gain insight into the actual processes on the shop floor. It is defined as a high-fidelity digital representation of the operational dynamics of its physical counterpart and requires near real-time synchronization between the physical and digital counterparts [1]. The data flow between the physical and digital counterparts is the key characteristic of a DT. The adoption of DTs by companies is expected to bring higher efficiency and accuracy as well as economic benefits. Lu et al. [1] found only a small number of DT application scenarios and none of them (e.g., Dimitris Mourtzis et al. [2]
This article reports an actual implementation of a DT, called SHION [4], in an industrial setting, which allows real implementation issues, challenges, and benefits of a DT to be captured. Its main contribution is an approach for real-time product quality monitoring that helps companies detect product failures in real time and enables an injection molding operator to take swift action and minimize waste. This article starts with a literature review of DTs in the "Literature Review" section, followed by a context overview of the SHION implementation and a detailed description of the SHION architecture in the "SHION Concepts" section. The "Implementation of SHION" section explains the implementation results. The "Discussion" section discusses and reflects on the SHION implementation, whereas the "Conclusion" section concludes the research work.

LITERATURE REVIEW

Definition of DTs
Trauer et al. [5] reviewed existing DT definitions based on an industrial use case and defined a DT as a virtual dynamic representation of a physical system, which is connected to it over the entire life cycle for bidirectional (and automated) data exchange. They identified two aspects. First, the physical twin transfers data and information from the real space to a virtual digital space when needed. Second, a DT identifies product- or process-oriented improvements, control demands based on the current situation, or predictions of the near future and sends them back to the real space so that the physical product adapts accordingly. DTs are still in their infancy, and many researchers are currently starting to derive appropriate concepts as a first step toward applying DTs in practice [6]. A DT can be seen as a collection of use cases contributing to an overall product life cycle management (PLM) strategy [5]. DTs are intended to be used through all the phases of the PLM [7]: requirements capture, product design, project/production planning, reliability of an engineering project, training or real-time decision-making [7], and customer support [8]. Stanford-Clark et al. [7] identified three types of DT based on product life cycle phases: production twins (monitoring manufacturing/quality, forecasting cycle times, etc.); engineering twins (replacing physical tests, optimizing product features through simulation, etc.); and operation twins (monitoring the use phase, product improvement, etc.).
A DT requires a bidirectional, automated integration of data between the physical and virtual worlds [6]. This feature differentiates the DT concept from a digital model (in which data integration is performed manually) and a digital shadow (in which data integration from the physical world is automated, but data from the virtual world must be integrated into the physical world manually).
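The distinction between the three concepts can be written as a small decision rule. The following is a minimal sketch only: the names `IntegrationLevel` and `classify_integration` are illustrative and do not come from the cited works, and the handling of a system in which only the virtual-to-physical flow is automated is an assumption, since the definitions above do not name that case.

```python
from enum import Enum


class IntegrationLevel(Enum):
    DIGITAL_MODEL = "digital model"
    DIGITAL_SHADOW = "digital shadow"
    DIGITAL_TWIN = "digital twin"


def classify_integration(physical_to_virtual_automated: bool,
                         virtual_to_physical_automated: bool) -> IntegrationLevel:
    """Classify a system by which of its two data flows are automated."""
    if physical_to_virtual_automated and virtual_to_physical_automated:
        # Bidirectional automated exchange: the defining feature of a DT.
        return IntegrationLevel.DIGITAL_TWIN
    if physical_to_virtual_automated:
        # Only the physical-to-virtual flow is automated.
        return IntegrationLevel.DIGITAL_SHADOW
    # Data from the physical world is integrated manually.
    # Assumption: a system with only the virtual-to-physical flow automated
    # is also treated as a digital model here.
    return IntegrationLevel.DIGITAL_MODEL
```

Under this rule, a monitoring dashboard fed automatically by sensors but with no automated feedback to the machine would be classified as a digital shadow rather than a DT.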

Architectures for Implementing DT
According to the definition of a DT, any system architecture must be able to represent the physical product, the virtual product, and the communication between them. However, the most important element is data. The elements of a DT and their interactions are reflected in Figure 1, adapted from Stanford-Clark et al. [7]. Figure 1 also describes the requirements of the architecture of a DT: it must allow high-frequency real-virtual data synchronization to guarantee fidelity; it must be able to present a holistic view of available real data, synthetic data, and expert knowledge to a user; and it must be able to evaluate its behavior automatically, feed real data to virtual models, and continuously improve the models by comparing their results with the physical space. Although the DT community agrees on these features, the creation of a reference model for DTs is still a challenge [1]. Figure 2 summarizes different proposals of DT reference architectures (RA) [9][10][11]. According to Redelinghuys et al. [9], the modules of an RA must offer their functionalities as services both to other components and to external systems. This allows a distributed deployment of the architecture, so that a DT can take advantage of edge computing capacities [10]. Finally, this RA allows implementation of the Reference Architectural Model Industrie 4.0 [11].
Horizontal components of the proposed RA are related to: the bidirectional exchange of data between the physical and virtual worlds (IoT stack); storage of raw physical data, data generated by AI and simulation models, and the models themselves (data); interaction with existing systems, such as enterprise resource planning (ERP) systems, manufacturing execution systems (MES), PLM systems, or computer-aided design systems (systems of record); management of simulation models (simulation modeling); big data, machine learning (ML), and deep learning algorithms, which can be used for real-time monitoring, detecting abnormal patterns, or recommending solutions to problems (analytics and AI); interaction with users of the DT (visualization); and orchestration and management of the services implementing the given DT (process management).
Vertical components cut across the horizontal ones. They provide access to services that the other components use to simplify interaction among the components of a DT and between the DT and external systems (integration), to make this integration secure (security), and to guarantee compliance with the rules and policies of the company and the laws of the place where the DT is located (governance).

Data-Driven DT
Data-driven modeling as the basis of DTs is becoming very popular [12]. These approaches apply IoT, big data, and artificial intelligence technologies and methods, such as ML, to capture, exchange, and transform data between the virtual and physical parts of the twin.
In particular, ML models are used for monitoring and detecting undesired behaviors, such as faults in production. Fault prediction takes advantage of the fact that (appropriate) data obtained from the physical twin can be a manifestation of both known and unknown physics, so ML models can account for the full physics and use it to predict errors [12]. Besides predicting behaviors, ML is also used to understand the structure of time series, reduce the number of variables to be considered, classify images, and create synthetic data that enlarges the available training datasets and addresses a lack of data [12]. An advantage of data-driven models is that they improve with new data. Although the training of data-driven models may suffer from instabilities, the models are quite stable for making predictions once they are fully trained [12].

SHION CONCEPTS

Context of SHION
SHION was implemented within the scope of the CloudiFacturing research project [13]. The mission of CloudiFacturing is to use cloud/HPC-based modeling and simulation to foster the implementation of I4.0 in manufacturing small and medium enterprises. CloudiFacturing invited third-party consortia to propose innovative use cases that fitted the project's mission, and SHION was one of them. The SHION consortium is composed of: 1) a research center, the Technological Institute of Aragón (ITAINNOVA), with expertise in big data and artificial intelligence services; 2) an independent software provider, BMS Vision (BMS), with expertise in offering a wide variety of industrial sensors; and 3) an end user, Thermolympic, which specializes in manufacturing thermal injection plastic products and provides the industrial problem to be solved.

Industrial Problems at Thermolympic
Maximizing production quality is crucial for Thermolympic. A machine operator visually reviewed all the produced parts, which were stored in a buffer before being checked, looked for general deviations, and reported the defects. In addition, a member of the quality staff conducted statistical process control on sample parts every four hours. While the quality rate was at 98.8%, a further increase in quality was important. Automated failure detection would save material and time by avoiding rework and by reducing customer penalties caused by delays and by the delivery of undetected defective parts to the customer. The proposed approach was to take advantage of cognitive technologies, in particular ML, to extract knowledge and generate a predictive model that detects when a production defect is about to happen. Constant evolution of the generated cognitive models would also be important. To realize SHION, information from the injection molding process itself was required, as well as context information such as environmental conditions, operators' reviews, quality laboratory inspections, and piece weight. All data were collected automatically and in real time using IoT devices provided by BMS, which were installed at Thermolympic. Instead of visually reviewing every part, an operator would review parts only when SHION anticipated a possible fault. Consequently, the operator could perform additional tasks between the fault notifications received. Thus, SHION would change the nature of the work of a machine operator, a team leader, and/or quality staff from a corrective/containment flow to a preventive/predictive one.

CloudiFacturing Platform
CloudiFacturing integrates several existing software and hardware platforms for cloud-based engineering and manufacturing. One of them is the CloudBroker application, which uses the CloudBroker Platform, a back-end tool for the deployment, management, and running of compute-intensive software on various cloud infrastructures. The SHION implementation required persistent instance storage: if the connection to the instance was lost unexpectedly or a new instance had to be launched, the previous data stored on the initial instance had to be present on the new machine. Thus, a functionality to attach Amazon S3 storage to an instance as a disk was implemented in the CloudBroker application to persist the changes kept in the mounted area.

Architecture of SHION
In Figure 3, the left box shows that the IoT and MES infrastructure is deployed at the Thermolympic facilities and supported by BMS. This subsystem monitors machine parameters and environmental conditions. It sends the parameters to the cloud with a minimum injection cycle time of 14 s. It also interacts with machine operators both through the BMS DU11 terminal and through a traffic-light alarm system, which allows operators to easily view the state of the machine and its products.
The right box shows the SHION intelligent modules, which are deployed on a CloudBroker instance. SHION is powered by Argon, a Docker-based system supporting the customization of ITAINNOVA deployments. The Kong container controls access to the REST interface of the predictive monitoring system and to the Mosquitto message queues. The InfluxDB container stores time series of data from machine and environmental sensors and from operators' interactions with machines (e.g., declaring parts with a defect and its cause). The MongoDB container stores the Moriarty workflows that train predictive models (gray arrows) and support predictions in real time (green arrows). These workflows record the data received in the InfluxDB repository. The Voilà container implements the interface that supports real-time monitoring of the machine, displays the prediction results, and allows the creation of new models and the retraining of existing ones. This interface is used by team leaders and quality staff.
The real-time monitoring system is fed through a REST API. Parameters received in a JSON message launch the following workflow:
› The data are stored in an InfluxDB database and prepared for predicting defects in the injected part. The parameters are identified by the machine identifier, the part identifier, and the timestamp at which the parameters were taken.
› The predictions are performed using a model selected dynamically from the machine and part identifiers. First, a machine-part model is searched for; if none is found, a machine model is searched for; and if no machine model is found, a general model is applied. SHION allows expert users (team leaders and quality staff) to modify this default behavior by manually assigning an existing model to a machine-part pair or to a machine, so that existing know-how can be incorporated into production monitoring and failure prediction (e.g., when the injection of a part is moved from one machine to another).
› The results are communicated to workers, together with the expected probability of the prediction, over MQTT through the Mosquitto container. These messages are shown on BMS Vision DU11 devices: the required light (green, yellow, or red) is turned on, and a message asks the operator for feedback on whether a real failure exists and what caused it (see Figure 5).
While Figure 3 shows a monolithic architecture, the architecture is highly flexible. The versatility of Docker containers and the implementation of replicated data services allow a deployment in which the modules inside the green dotted line run in the cloud and are used on demand, while the monitoring modules are deployed at the Thermolympic facilities. This can reduce the cost of computational resources because high-performance computing can be used on demand for training the models.
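The dynamic model selection described in the workflow above (machine-part model first, then machine model, then general model, with manual assignment by expert users) can be sketched as a three-level lookup. This is a minimal illustration under stated assumptions, not SHION's actual code: the class `ModelRegistry`, its methods, and the model names are hypothetical.

```python
from typing import Dict, Optional, Tuple

PairKey = Tuple[str, str]  # (machine_id, part_id)


class ModelRegistry:
    """Hypothetical registry mapping machines and machine-part pairs to models."""

    def __init__(self, general_model: str) -> None:
        self.general_model = general_model
        self.pair_models: Dict[PairKey, str] = {}
        self.machine_models: Dict[str, str] = {}

    def assign(self, model: str, machine_id: str,
               part_id: Optional[str] = None) -> None:
        """Register a model, e.g. after training or by an expert user's choice."""
        if part_id is None:
            self.machine_models[machine_id] = model
        else:
            self.pair_models[(machine_id, part_id)] = model

    def select(self, machine_id: str, part_id: str) -> Tuple[str, str]:
        """Return (model, level) using the fallback order described in the text."""
        pair_model = self.pair_models.get((machine_id, part_id))
        if pair_model is not None:
            return pair_model, "machine-part"
        machine_model = self.machine_models.get(machine_id)
        if machine_model is not None:
            return machine_model, "machine"
        return self.general_model, "general"


# Usage with made-up identifiers.
registry = ModelRegistry(general_model="general-v1")
registry.assign("m07-v3", machine_id="M07")
registry.assign("m07-p112-v2", machine_id="M07", part_id="P112")
# Moving part P112 to machine M09: an expert reuses the existing model.
registry.assign("m07-p112-v2", machine_id="M09", part_id="P112")
```

Keeping the expert assignment in the same tables as the trained models means the prediction path needs only one lookup routine, whichever way a model was associated with a machine or part.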

ML Prediction Models Creation and Retraining
The first step was to normalize the raw datasets and to create differenced variables from them: for each variable, the variation between each pair of successive measurements was calculated. A second transformation was applied to these derived measurements: moving averages, the maximum dispersion in a time window, and the minimum value were calculated to obtain a summarizing set of descriptive statistical variables, which were then normalized. The third step was to reduce the number of variables by applying the principal component analysis algorithm.
In step four, a final variable selection process was carried out using random forest algorithms with cross-validation.
Once the main injection features had been selected, we trained different algorithms, such as artificial neural networks, support vector machines, and tree-based models (random forest and extra random forest). For the algorithms providing the best results, optimal hyperparameter sets were searched for and used to train the prediction models for the manufacturing lines.
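The first two transformations can be illustrated with a short standard-library sketch. Several details are assumptions made for illustration, because the text does not specify them: "maximum dispersion" is taken to be the range (maximum minus minimum) within a trailing window, normalization is taken to be a z-score using the population standard deviation, and the function names and sample values are hypothetical. The later steps (principal component analysis, random forest selection, and model training) would normally be delegated to an ML library and are not shown.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def first_differences(values: Sequence[float]) -> List[float]:
    """Variation between each pair of successive measurements."""
    return [b - a for a, b in zip(values, values[1:])]


def window_features(diffs: Sequence[float], window: int) -> List[Dict[str, float]]:
    """Summary statistics over each trailing window of differenced values."""
    rows = []
    for end in range(window, len(diffs) + 1):
        chunk = diffs[end - window:end]
        rows.append({
            "moving_avg": mean(chunk),
            # Assumption: dispersion is measured as the range of the window.
            "max_dispersion": max(chunk) - min(chunk),
            "minimum": min(chunk),
        })
    return rows


def zscore_columns(rows: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Normalize every column to zero mean and unit (population) variance."""
    if not rows:
        return []
    stats = {}
    for key in rows[0]:
        column = [row[key] for row in rows]
        stats[key] = (mean(column), pstdev(column))
    normalized = []
    for row in rows:
        normalized.append({
            # A constant column has zero spread; map it to 0.0 instead of dividing.
            key: (value - stats[key][0]) / stats[key][1] if stats[key][1] else 0.0
            for key, value in row.items()
        })
    return normalized


# Made-up melt temperature readings from five successive injection cycles.
melt_temperature = [200.0, 201.0, 203.0, 202.0, 206.0]
features = zscore_columns(window_features(first_differences(melt_temperature), window=3))
```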

IMPLEMENTATION OF SHION
The findings showed that the accuracy of the prediction models varies with the quality of the data given to the system and with whether a model exists for a machine-part pair. Data quality is affected by the state of the sensors (some sensor failures were detected during feature selection), by the number of injections, and by the operators' review processes, which make it difficult to determine the exact injection cycle (i.e., the machine parameters) that produced the failing parts. SHION obtained lower precision rates with machine (general) models (<75%) and higher scores with machine-part models (>75%): by considering the data of a single machine-part pair, ML models are better able to capture the physics of the injection process.
SHION offers quality staff and team leaders the possibility of creating new models and of retraining existing models with newly available data. The user can set the following training parameters to determine the dataset used for training: start date/end date; machine/part; and the kind of algorithm to be used. These parameters are combined and sent to the cloud to obtain new prediction models or newer versions of existing ones.
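How such training parameters might be combined into a single request can be sketched as follows. The article does not describe the message format, so the class `TrainingRequest`, its field names, the JSON layout, and the convention that a missing part identifier requests a machine-level model are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TrainingRequest:
    """Hypothetical bundle of the user-selected training parameters."""
    start_date: date
    end_date: date
    machine_id: str
    part_id: Optional[str]  # Assumption: None requests a machine-level model.
    algorithm: str

    def __post_init__(self) -> None:
        # Reject an empty or inverted date range before anything is sent.
        if self.end_date < self.start_date:
            raise ValueError("end_date must not precede start_date")

    def to_json(self) -> str:
        """Serialize the parameters into one JSON message."""
        return json.dumps({
            "start_date": self.start_date.isoformat(),
            "end_date": self.end_date.isoformat(),
            "machine_id": self.machine_id,
            "part_id": self.part_id,
            "algorithm": self.algorithm,
        }, sort_keys=True)


# Made-up example: retrain a random forest on one quarter of data.
request = TrainingRequest(date(2020, 1, 1), date(2020, 3, 31),
                          "M07", "P112", "random_forest")
message = request.to_json()
```

Validating the date range at construction time keeps an obviously invalid request from consuming cloud resources.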
Once the new models or new versions of existing models are available, qualified users can check and compare them based on their quality (accuracy, retention, etc.). They can modify the default model selection, and they can also check in real time the current status of the machines (to see the parameters used in previous injections) together with the results of the failure predictions (see Figure 4).
Some network issues were experienced during implementation. Synchronization problems between the machines and the cloud resources appeared because of the use and availability of CloudBroker resources and persistent instances. An alternative solution that provided optimal use of resources and functionality was adopted by setting up a two-server model at Thermolympic. The prediction monitoring service was deployed locally (running persistently, 24 hours a day, 7 days a week), and the cloud was reserved for training purposes at selected periods. This alternative solution significantly reduced bandwidth use and the risk of losing manufacturing data.
An iterative usability evaluation was also performed to ensure that SHION could be used effectively by its users. A usability expert observed a quality manager and an injection molding operator while they interacted with SHION. Usability issues were identified, and recommendations to resolve them were proposed and communicated so that they could be acted on.

DISCUSSION
The SHION implementation demonstrated the capability to integrate real-time production intelligence and data analytics on the shop floor by providing feedback both to machine operators and to quality monitoring staff. Although many studies have demonstrated this capability under closed laboratory conditions, it is very rare to see it applied in a real industrial setting. At K-Messe 2019, SHION was the only solution shown that functioned in actual production environments. Since SHION was validated at an actual injection molding production site, its effect on the skills demanded of Thermolympic employees could be observed. While no skill change was necessary for shop-floor workers, the quality manager needed to obtain new skills and knowledge, namely how to use effectively the technologies related to the functioning of the cyber-physical system. In the case of SHION, a basic but broad understanding of the principles behind ML prediction models and ML retraining was necessary. The need for this type of skill and knowledge was also reported by Saniuk et al. [15]
Nowadays, most new (injection) machines can provide smart (proprietary) modules (both in the cloud and on board), and there are several commercial IoT platforms (such as MindSphere [14]). However, this may not be the case in most SMEs, which commonly use older machines (between 15 and 30 years old), and this limits their ability to adopt I4.0 solutions. Fortunately, older machines can send data to the cloud through standard interfaces (such as Euromap 83), which then allow additional solutions, such as SHION, to generate intelligent models that monitor the production line in real time and to upgrade or improve the models by taking advantage of new data captured from the production line. The SHION implementation required a company that already had a basic I4.0 setup, such as Thermolympic. For companies without a basic I4.0 setup, the SHION solution could be costly to implement, as most of the cost of the solution was related to IoT readiness [16], which required purchasing hardware and software costing around €250,000 to provide data for the solution. Although SHION was implemented in only two production lines, Thermolympic estimated the reduction in quality costs (customer penalties, waste material, etc.) at €50,000 during the execution of the experiment.
Collecting and analyzing information in real time while avoiding excessive data transfer and data processing delays is the core principle of I4.0. The cloud has traditionally been used to process IoT data since it provides cheaper and virtually unlimited computing power. However, the burden of uploading data to a remote cloud could lead to inefficient use of bandwidth and energy, exposure to the risk of losing data and/or services due to network issues, and the prohibitive cost of regularly transferring so much data to the cloud. As we experienced during the SHION implementation, the traditional cloud-centric IoT approach had to shift toward a distributed model so that we could take advantage of smart and programmable cloud services at the network edge. These services offered computing and storage capabilities on a smaller scale and provided benefits such as saving energy and network bandwidth by avoiding the continuous upload of data to the cloud. This also means a reduction in communication delays and in the overall size of the data that needs to be migrated across the Internet. In the case of SHION, we employed a set of mechanisms to process data on behalf of the IoT device and to send data to the cloud only when more complex analysis is required. Thus, edge computing, fog computing, and cloud computing offer ideal technical solutions for different levels of DT requirements [17].
In the SHION implementation, the created models were supervised models. While their accuracy is acceptable, it is highly dependent on the availability of machine operators to provide feedback when reporting quality issues. The presence of unbalanced data can bias the resulting models, especially where labeled data are lacking or data labeling is insufficient [12]. This could be a barrier to improving the accuracy of the models. However, the analysis we performed on the data for creating the models suggested possible measures that could be implemented in the future to mitigate this risk:
› To include automatic detection of unexpected machine stoppages: stop events are automatically logged, and their causes should be declared by operators before restarting the machine.
› To detect undesired behavior trends in machine parameters: use past events (part faults, machine stops, etc.) to detect trends in the parameters that lead to them.
› To use previous (successful) machine settings for a given part to recommend parameters when a new production batch is being set up or a new part (never injected before) is being set up.
› To use operator feedback on model-generated alarms to retrain the models and to support the quality manager's decisions about activating, deleting, or retraining a given model.
› To consider the use of unsupervised algorithms, such as self-organizing maps, or clustering analysis algorithms, such as K-means or t-distributed stochastic neighbor embedding [12].
› To improve the capabilities of the virtual twin by including interaction with physical plastic injection models to support hybrid analysis and modeling.
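The effect of unbalanced data mentioned above can be made concrete with a small worked example. It is a sketch under stated assumptions: the defect rate of 1.2% is derived from the 98.8% quality rate reported for Thermolympic, the batch of 1,000 parts and the trivial "always good" predictor are hypothetical, and the function names are illustrative.

```python
from typing import Sequence, Tuple


def confusion_counts(actual: Sequence[bool],
                     predicted: Sequence[bool]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), where True means 'defective part'."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    tn = sum(1 for a, p in zip(actual, predicted) if not a and not p)
    return tp, fp, fn, tn


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp: int, fn: int) -> float:
    """Share of real defects that were flagged; 0.0 if there were no defects."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical batch: 1,000 parts, of which 12 (1.2%) are defective.
actual = [True] * 12 + [False] * 988
# A degenerate model that never raises an alarm.
always_good = [False] * 1000

tp, fp, fn, tn = confusion_counts(actual, always_good)
batch_accuracy = accuracy(tp, fp, fn, tn)
batch_recall = recall(tp, fn)
```

In this hypothetical batch the model that never raises an alarm is correct for 988 of 1,000 parts while detecting none of the 12 defects, which is why accuracy alone is a weak indicator when defects are rare and why reliable operator labeling of the few defective parts matters.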
To successfully implement DT solutions, the results of the models must be presented in such a way that they are understandable [18] by users. Data presentation has to allow users to perceive the presented facts in the context in which they happen, which eases decision-making [19]. Understanding the ML models is also important to demonstrate that they are not biased [12]. Explainable AI (XAI) is a current trend that aims to explain models to users in an understandable and trustworthy way. XAI aims to answer questions about how and why certain results are obtained, allowing users to justify and explain their data-driven decisions [20] and, in some scenarios, to assess legislation compliance [19].

CONCLUSION
SHION provides real-time detection of production failures in injection molding. It has successfully included human involvement in a DT and built a functioning DT architecture, which can be deployed in both server-based and edge-based configurations. The SHION implementation has shown that cloud computing was more suitable for training/retraining, whereas local installations suited real-time solutions better because they were cheaper and less risky. Our implementation also showed that resources were easy to scale and that the processes were not required to run continuously (24 h × 7 d). The implementation also showed the feasibility of using sensors available on the market to implement a DT.