Analysis of Wire Rolling Processes Using Convolutional Neural Networks

This study leverages machine learning to analyze the cross-sectional profiles of materials subjected to wire-rolling processes, focusing on the specific stages of these processes and the characteristics of the resulting microstructural profiles. The convolutional neural network (CNN), a potent tool for visual feature analysis and learning, is utilized to explore the properties and impacts of the cold plastic deformation technique. Specifically, CNNs are constructed and trained using 6400 image segments, each with a resolution of 120 × 90 pixels. The chosen architecture incorporates convolutional layers interleaved with pooling layers and the “ReLu” activation function. Notably, the results are derived from the observation of only a minuscule cropped fraction of the material’s cross-sectional profile. Following the calibration of two distinct neural networks, training and validation accuracies of 97.4%/97% and 79%/75% have been achieved. These accuracies correspond to identifying the cropped image’s location and the number of passes applied to the material. Further improvements in accuracy are reported upon integrating the two networks using a multiple-output setup, with the overall training and validation accuracies slightly increasing to 98.9%/79.4% and 94.6%/78.1%, respectively, for the two features. The study emphasizes the pivotal role of specific architectural elements, such as the rescaling parameter of the augmentation process, in attaining a satisfactory prediction rate. Lastly, we discuss the implications of our findings, which shed light on the potential of machine learning techniques in refining our understanding of wire-rolling processes and guiding the development of more efficient and sustainable manufacturing practices.


INTRODUCTION
In the pursuit of environmental sustainability and energy efficiency, there is a growing demand for the fabrication of lightweight components in the automotive, construction, and aerospace industries. Among various strategies, the use of steel wire with reduced cross-sectional area presents a promising approach to achieve weight reduction while conserving metal resources [1,2]. As a crucial material in contemporary industries, a vast quantity of wire rods is employed in diverse applications, from fencing to bridge and hoist cables. The user-end products typically necessitate precise dimensions, superior surface finish, and specific mechanical properties. Furthermore, the range of wire sizes required is vast, extending from fractions of an inch to several hundred feet [3]. In manufacturing, a variety of processes, including roller cassette die formation, drawing, and swaging, are utilized to produce wires of different sizes. Consequently, the specific mechanical characteristics of different materials significantly influence their suitability for these metalworking processes. As such, understanding and optimizing these properties is paramount to the successful production of high-quality, lightweight components.
The roller-die cassette is a promising alternative to conventional wire processes such as drawing dies. The method is a metal-reducing process in which a wire rod is pulled or drawn through two pairs of rolls. As the wire rod rolls directly through round-shaped rolls, the first pair gives rise to an oval form in the transversal direction, which might also subsequently cause finning. The second pair of rolls applies further deformation and restores the wire to a round shape, as shown in Figure 1. Studies have revealed many beneficial characteristics that suggest the technique is capable of increasing productivity and product quality [1,4]. Moreover, regarding the mechanical properties of the wires, extensive studies [4-8] have been carried out, and it was observed that roller-die drawn wires, when compared against conventionally drawn wires, are characterized by a decrease in tensile and yield strengths, considerably lower power consumption, and the possibility of increasing the reduction per pass in the rolling process without the risk of central cracks occurring in the wire.
Although previous studies have reported improvements in the roller-die cassette process, many aspects remain to be explored in the literature. It is interesting to analyze the percentage reduction of cross-section, recognize the impact of successive passes, and explore the localized features of the cross-sectional area of the processed wire rod. Specifically, the varying strain might cause inhomogeneity on the wire cross-sectional plane, resulting in differences in material hardening, structure, texture, distribution of residual stresses, and the geometric structure of the wire surface. This inhomogeneity makes it more difficult to estimate the reduction (deformation) ratio performed at room temperature, assess the texture of the cross-sectional profile, and analyze the material's tensile properties in the roller-die drawing process. Also, the use of machine learning [9] (ML) and automatic optimization techniques for such topics is largely missing in the literature.
ML, a vibrant branch of artificial intelligence (AI), is driven by the creation of robust mathematical models through a distinctive "learning" process. The resulting models capitalize on data to accomplish specified tasks, with their architecture typically being formed by ML algorithms through a training process on designated sample data, known as training data. The hallmark of an ML algorithm is its ability to develop models that are not pre-programmed for predictions or decision-making but rather learn from the data they are trained on. ML algorithms have seen widespread application across diverse sectors, including but not limited to medicine, email filtering, speech recognition, agriculture, and computer vision. Traditional approaches often falter in these fields, their performance undermined by the intricate and dynamic nature of the tasks at hand. ML has been particularly successful in mitigating such challenges. One of the most remarkable breakthroughs in the field of ML is in the domain of pattern recognition, which has seen significant advancements in recent years. Theoretically, the challenge is framed as a mathematical optimization problem concerning a loss function that forms part of the model's construct. Artificial neural networks, particularly convolutional neural networks [9] (CNNs), often serve as the backbone of these models, designed to emulate the interconnectivity of neurons within a biological brain. These networks have found crucial applications across a variety of fields, outshining conventional approximation methods, such as polynomial fit via Lagrange cardinal functions, owing to their inherent nonlinearity and complexity. One of the prominent aspects of the CNN stems from its application to image processing. Various applications have flourished in pertinent areas, notably in radiology [10-14], visual pattern recognition [15,16], and tracking [17-19]. Nonetheless, despite the extensive utilization of ML algorithms, especially CNNs, their deployment in the sphere of wire plastic deformation remains relatively preliminary [20]. Therefore, exploring the properties of different wire deformation techniques using ML, coupled with the cross-sectional profiles of the processed material, presents an intriguing opportunity. In particular, the global texture of these materials, which has been scarcely analyzed quantitatively from a pattern recognition standpoint, warrants further investigation.
Given the above considerations, the main goal of the present study is to investigate the material's microstructure, such as texture and inhomogeneity in grain morphology, associated with specific cumulative deformation. By employing the CNN, the analyses are performed on unstructured data in the form of optical microscopy images of the materials subject to different degrees of cumulative cold deformation. In particular, a CNN is established and trained by observing only a small cropped fraction of the material's cross-sectional profile. The study aims to quantitatively identify the location of the cropped image and the number of passes applied to the material in the deformation process using the roller-die cassette. The choice of carbon steel is motivated mainly by the fact that the material remains the most used in industry, owing to its favorable price-performance ratio. Besides, the material offers a wide range of properties deriving from its numerous microstructure and texture variants. Specifically, carbon steel's underlying mechanical properties make it a powerful industrial solution to many practical demands. Complementing the results reported in the literature [20-24], the present study aims to tackle the problem from a quantitative perspective using machine learning techniques.
The remainder of the paper is organized as follows. In the following section, the material and the deformation process are presented. In Sec. III, we carry out the model validation to tune the model into the optimized state and perform the analysis. Two different approaches are developed. For the first approach, one calibrates and trains two separate neural networks, aiming to identify the localization of the cropped cross-sectional profile and the number of passes. For the second one, the two networks are merged into a multi-output configuration and perform the same task. Reasonable and consistent results are obtained for both setups. The specific architecture of the CNN is scrutinized, from which one concludes that the rescaling parameter plays an essential role in guaranteeing a satisfactory outcome. Further discussions and possible implications of the current approach are given in the last section.

MATERIAL AND CROSS-SECTIONAL PROFILES
In this study, we employed a wire rod of Ø6.65 mm rolled from commercial AISI 1008 carbon steel, whose chemical composition is presented in Table 1. The process began with the chemical pickling of the wire rod in an aqueous solution of sulphuric acid, after which the rod was coated with zinc phosphate as a lubricant carrier.
Subsequently, the cold deformation process is carried out using a multi-pass roller-die drawing machine at a speed of 1.6 m/s. The process is carried out on various specimens. The roller-die cassette utilized for the cold wire rolling experiments is a commercial microcassette consisting of two sets of three rolls, as shown in Figures 1 and 2. The schedules of passes and mechanical properties are shown in Table 2.
For individual passes, measurements of the cross-section reduction are carried out. The total reduction in cross-section associated with wire rolling is found to be 57.85%, as shown in Table 2. These processes are understood to affect the material's microstructure significantly. Also, before investigating the microstructures of the samples, the standard metallographic preparation was implemented. Specifically, the transverse sections from the truncated specimen are embedded, ground, and polished. Optical microscopy is then used to examine and record the details of the profiles in order to study the effect of the processes on the microstructure.
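The relation between wire diameter and reduction in cross-section can be sketched as follows. This is a minimal illustration that assumes perfectly round sections before and after rolling; the function names are hypothetical, and the implied final diameter is derived from the quoted Ø6.65 mm and 57.85% rather than taken from Table 2.

```python
import math

def area_reduction(d_initial_mm, d_final_mm):
    """Fractional reduction in cross-sectional area for a round wire."""
    return 1.0 - (d_final_mm / d_initial_mm) ** 2

def final_diameter(d_initial_mm, reduction):
    """Diameter of a round wire after a given fractional area reduction."""
    return d_initial_mm * math.sqrt(1.0 - reduction)

def cumulative_reduction(per_pass_reductions):
    """Total reduction after successive passes: 1 - prod(1 - r_i)."""
    remaining = 1.0
    for r in per_pass_reductions:
        remaining *= 1.0 - r
    return 1.0 - remaining

d0 = 6.65         # initial rod diameter in mm (from the text)
r_total = 0.5785  # total reduction in cross-section (from the text)

# Assumption: round final section, so the diameter follows from the area.
d_end = final_diameter(d0, r_total)
print(f"implied final diameter: {d_end:.2f} mm")
```

Per-pass reductions compound multiplicatively on the remaining area, which is why `cumulative_reduction` multiplies the factors (1 - r_i) instead of adding the individual reductions.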
Figure 3 presents the cross-sectional profiles of the wire rods subject to the rolling processes. Before proceeding to the analysis using the ML technique, it is noted that qualitative studies on the texture of the cross-sectional profile of the wires have also revealed differences between deformation passes. Relevant studies have been performed regarding the properties of both the border and the center of the wire [4,26]. Nonetheless, to achieve a better and quantitative understanding of the matter, the present study aims to analyze the microstructures from the perspective of ML, as elaborated further in the following section.

CNN MODELS, TRAINING, VALIDATION, AND TEST
To explore the material's microstructure, different CNN models are constructed and trained on the data furnished by the cross-sectional profiles. The data are obtained by dividing the images of the cross-sectional profiles of the processed material into smaller images. In particular, the original profile is arbitrarily divided into images of 120 × 90 pixels.
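The cropping step can be illustrated with a short sketch. It assumes a regular, non-overlapping grid of tiles that are 120 pixels wide and 90 pixels high; the text only states that the division is arbitrary, so the grid layout, the orientation, the synthetic image, and the function names are all assumptions made for illustration.

```python
def tile_boxes(width, height, tile_w=120, tile_h=90):
    """Return (left, top, right, bottom) boxes of non-overlapping tiles."""
    boxes = []
    for top in range(0, height - tile_h + 1, tile_h):
        for left in range(0, width - tile_w + 1, tile_w):
            boxes.append((left, top, left + tile_w, top + tile_h))
    return boxes

def crop(image, box):
    """Crop a nested-list image (rows of pixel values) to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]

# Hypothetical 480 x 270 grayscale micrograph stored as rows of pixel values.
image = [[(x + y) % 256 for x in range(480)] for y in range(270)]

boxes = tile_boxes(480, 270)
tiles = [crop(image, box) for box in boxes]
print(len(tiles), "tiles of", len(tiles[0][0]), "x", len(tiles[0]), "pixels")
```

Partial tiles at the right and bottom edges are discarded here, so every crop has exactly the same shape, as a fixed-size network input requires.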
The cropped images are further separated into training, validation, and test data sets. A total of 6400 image fractions are utilized in the present study, of which 4000 are randomly selected for the training process, 1760 for the validation, and the remaining 640 are attributed to the test set. Figure 4 shows eight arbitrary cropped images used for the training set. Such a process is motivated by one of our goals, namely, to extract relevant information from observing only a tiny fraction of the cross-sectional profile. As a result, each cropped image covers far less of the microstructure than the original cross-sectional profile, as demonstrated, for instance, by Figures 3 and 4. Most standard Python libraries can readily use these images, providing the interface to use most image formats as input directly. In this work, one concentrates on the task of identifying, for a given cropped image chosen arbitrarily from the test set, its localization and the number of passes the wire rod has undergone. The main goal is to achieve reasonable accuracy in the above tasks through an adequately designed CNN.
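A random split with the sizes quoted above could look like the sketch below. The fixed seed and the function name are illustrative assumptions; the text does not describe how the random selection was implemented.

```python
import random

def split_indices(n_total=6400, n_train=4000, n_val=1760, seed=0):
    """Shuffle image indices and cut them into train/validation/test parts."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # whatever remains
    return train, val, test

train_idx, val_idx, test_idx = split_indices()
print(len(train_idx), len(val_idx), len(test_idx))  # 4000 1760 640
```

Shuffling once and slicing guarantees that the three sets are disjoint, so no cropped image used for training can leak into validation or testing.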
To this end, one utilizes two different approaches. For the first approach, one calibrates and trains two separate CNNs, referred to as "CNN A" and "CNN B", respectively, aiming to identify the localization of the cropped cross-sectional profile and the number of passes. For the second one, the two networks are merged into a multi-output configuration, denoted as "CNN C," which is subsequently employed to perform the same task. The architectures of the CNNs explored in the present study are illustrated in Figure 5.
To calibrate the model, the following aspects are analyzed: the data augmentation options, the architecture in terms of the number and size of the layers, the loss function, and optimizer schemes. The results for these aspects are elaborated in what follows, mostly using as an example the calibration of the dedicated CNN that identifies the localization, shown in Figure 5 as "CNN A". The data augmentation options employed in the present study are illustrated in Figure 6. The results are illustrated for some of the augmentations applied to a random image (shown in the top-left panel) from the training dataset. The bottom-right panel shows the overall result when all augmentations are applied simultaneously. Among different options, the color pixel range normalization, namely, the "rescale", is implemented intentionally, aiming to prevent the network from making an easy identification based on the color depth. Table 3 shows the resultant accuracy when one uses different values for the rescale parameter. As the rescale parameter increases, the training and validation accuracies increase and decrease. In other words, a moderate normalization in pixel range facilitates the training, but as it continues to increase, the network will eventually become overfitted, and the model loses its predictive power. It is nevertheless preferable to pick a larger value of the rescale parameter, as our primary goal is to distinguish between different images based primarily on the texture of the cropped profiles and not on the color pixel depth. When compared to others, the "fill" augmentation option turns out to play a crucial role. Intuitively, the "constant", "nearest", or "reflect" modes do not seem to make much of a difference in terms of information on the profiles' texture. In practice, however, the "constant" mode is found to furnish better performance, improving the training and validation accuracies by 1-2 percent for CNN A, as indicated in Table 4.
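To make the "rescale" and "fill" options concrete, the sketch below applies them to a single row of pixels. The option names resemble those of common image-augmentation utilities, but the text does not name the library used, so this pure-Python version, the assumption that rescale is a multiplicative factor, the sample values, and the function names are all illustrative.

```python
def rescale(pixels, factor):
    """Multiply every pixel value by a constant factor (range normalization)."""
    return [p * factor for p in pixels]

def shift_row(row, shift, mode="constant", cval=0):
    """Shift a 1-D row right by `shift` pixels and fill the exposed positions."""
    n = len(row)
    out = []
    for i in range(n):
        j = i - shift  # index of the source pixel
        if 0 <= j < n:
            out.append(row[j])
        elif mode == "constant":
            out.append(cval)                          # fixed fill value
        elif mode == "nearest":
            out.append(row[0] if j < 0 else row[-1])  # repeat the edge pixel
        elif mode == "reflect":
            k = -j - 1 if j < 0 else 2 * n - 1 - j    # mirror about the edge
            out.append(row[k])
        else:
            raise ValueError(f"unknown fill mode: {mode}")
    return out

row = [10, 20, 30, 40]
for mode in ("constant", "nearest", "reflect"):
    print(mode, shift_row(row, 2, mode))

print(rescale([0, 128, 255], 1 / 255))  # maps the 0..255 range onto 0..1
```

Only the "constant" mode introduces pixels that carry no texture from the original crop; the other two modes duplicate existing texture near the edge.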
The specific architecture of neural network A is analyzed and presented in Tables 5 and 6. The neural network is primarily composed of a few repeated structures of the "convolutional layer + nonlinear ReLu layer + pooling layer", and then the data go through a flatten layer and a few fully connected layers before the final output is produced by a logistic layer. A convolutional layer is implemented by F filters with respective weight W, and each S × S filter is characterized by its size S. The activation function ReLu introduces a small but essential amount of nonlinearity into the regression, and it is typically used in applications of CNNs in place of its alternatives, such as Sigmoid. The pooling layer reduces the size of the data passing through the network while maintaining the crucial information. The architecture is denoted by the form of a ratio of the size of the convolutional layer's output shape and weight to the number of convolutional kernel instances, namely, "(size, weight)/no. kernel". The intermediate pooling and fully connected layers are implied and not explicitly counted. Specifically, the architecture of a CNN is denoted as a list of N such entries, for instance, [(120, 90)/32, (60, 45)/64, (28, 21)/128], where N is the total number of convolutional layers.
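The repeated "convolutional layer + ReLu + pooling layer" block described above can be traced on a tiny example. The sketch below is a didactic pure-Python version with a single 2 × 2 filter, no padding, and 2 × 2 max pooling; the input values, the kernel, and the function names are hypothetical, and the padding and pooling type are assumptions, since the text does not specify them.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu(feature_map):
    """Set negative activations to zero, keep positive ones unchanged."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]

# Hypothetical 5 x 5 input with alternating row gradients.
image = [[1, 2, 3, 4, 5],
         [5, 4, 3, 2, 1],
         [1, 2, 3, 4, 5],
         [5, 4, 3, 2, 1],
         [1, 2, 3, 4, 5]]
kernel = [[1, -1],
          [-1, 1]]

conv = conv2d_valid(image, kernel)  # 4 x 4 feature map
pooled = max_pool(relu(conv))       # 2 x 2 after ReLu and pooling
print(pooled)
```

In a real network each layer holds many such filters (the "no. kernel" in the notation), and their weights are learned during training instead of being fixed by hand.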
We show the resultant accuracies obtained by exploring different architectures with different numbers and sizes of layers. Unless specified, the calculations are carried out using alternating filter sizes of 3 × 3 and 2 × 2 pixels with 500 epochs, a batch size of 25 for training and 11 for validation, and an "Adam" optimizer. In Table 5, different architectures are implemented by altering the dimensions of the pooling layers. In Table 6, one explores the sensitivity to the size of the neural network by using different numbers of layers. It is observed that a larger number of layers does, in general, improve performance. However, the effect is gradual, and a network with three layers and a proper architecture already provides reasonable precision. In particular, when more sophisticated architectures are employed, the training and validation accuracies mostly converge and show only slight improvement.
Different training strategies for a given network architecture also lead to substantial differences. As shown in Table 7, the training accuracy increases monotonically as one increases the batch size while maintaining the network's layout. Conversely, the validation accuracy increases and then decreases, indicating overfitting. This behavior is not observed for the number of epochs, as shown in Figure 7, and the loss function tends to converge when the network is trained for a reasonably large number of epochs.
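As a small worked example of what the batch sizes imply, the sketch below counts the number of batches processed per epoch for the data set sizes and batch sizes quoted in the text. It assumes that every sample is visited once per epoch; the function name is hypothetical.

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of batches needed to visit every sample once."""
    return math.ceil(n_samples / batch_size)

n_train, n_val = 4000, 1760  # data set sizes from the text

# Batch sizes (train, validation) quoted for the separate and merged networks.
for train_batch, val_batch in [(25, 11), (200, 44)]:
    print(train_batch, val_batch,
          steps_per_epoch(n_train, train_batch),
          steps_per_epoch(n_val, val_batch))
```

With the pair (25, 11), the training and validation sets are both traversed in the same number of steps, which may be the reason behind the otherwise unusual validation batch size.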
Table 3. The effect of the pixel range rescale in the data augmentation in CNN A. The obtained accuracies are for a CNN of the architecture [(120, 90)/32, (60, 45)/64, (28, 21)/128] with 500 epochs and a batch size of (25, 11).

Regarding the loss function, one employs the "sparse categorical cross-entropy", essentially a Softmax activation plus a cross-entropy loss. Last but not least, regarding the optimizers, we have individually considered the following choices: adaptive learning rate optimization algorithm (Adam), root mean squared propagation (RMSprop), and standard stochastic gradient descent (SGD). The results for four different choices of optimizers are presented in Figure 7 and Table 8. Generally speaking, apart from "RMSprop", all the remaining three types of optimizers give rise to reasonable trends. Nonetheless, as indicated by both Figure 7 and Table 8, it is apparent that "Adam" offers faster convergence than the other candidates, and it is therefore chosen for the current CNN architectures. Through the validation process, the optimal architectures for the three CNNs are determined. These architectures are implemented with alternating filter sizes of 3 × 3 and 2 × 2 pixels with 500 epochs and a batch size of (25, 11). The "Adam" optimizer and "sparse categorical cross-entropy" loss function are employed for CNN A. For CNN B, the "SGD + momentum" optimizer is utilized instead. The architectures of the two networks, A and B, are not entirely identical, but they have similar filter sizes so that the merger shown in the last row of Figure 5 becomes feasible. For CNN C, one uses 300 epochs and a batch size of (200, 44), and the remaining configurations are identical to CNN B. There is a subtle difference between the two choices: either to use two CNNs A & B or CNN C. First, the architectures of the two choices are largely identical, and therefore, the computational time to calibrate and train the networks is of the same order of magnitude. Second, as shown in Figure 5, CNN C consists of CNN A and B arranged in parallel. The input layer is identical to that of CNN A and B, and the output layer features a multi-output setup. The two networks do not interact with each other except in the last layer, which might lead to an overall better performance. As discussed below, this is indeed observed in our case.
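The loss described above, a Softmax activation followed by a cross-entropy with integer class labels, can be written out in a few lines. The sketch also shows one common way of training a multi-output network, namely summing the losses of the two outputs with equal weight; that weighting, the logit values, and the function names are assumptions, since the text does not state how the two outputs of CNN C are combined.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to one."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_cross_entropy(logits, label):
    """Negative log-probability of the true class given as an integer index."""
    return -math.log(softmax(logits)[label])

# Hypothetical raw network outputs for one cropped image.
location_logits = [2.0, 0.0]          # classes: 0 = border, 1 = center
passes_logits = [0.1, 0.2, 3.0, 0.3]  # classes: 0, 1, 2, or 3 passes

loss_location = sparse_categorical_cross_entropy(location_logits, 0)
loss_passes = sparse_categorical_cross_entropy(passes_logits, 2)

# Assumption: equal-weight sum of the two output losses.
total_loss = loss_location + loss_passes
print(round(loss_location, 4), round(loss_passes, 4), round(total_loss, 4))
```

"Sparse" refers only to the label format: the true class is passed as an index instead of a one-hot vector, and the resulting loss value is the same.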
Upon the completion of the model validation process, as discussed above, the resultant CNNs are put into action using the test set. Figures 8, 9, 10, and 11 showcase several instances where the trained networks successfully identify the deformation approach and instances where they fall short. In terms of performance, the two separate CNNs achieve training and validation accuracies of 97%/97.4% and 79%/75%, respectively. The accuracies obtained from the test set read 95.8% and 74.7%. As expected, the accuracy rates obtained using the test set are in reasonable agreement with those from the validation process. Moreover, we performed a cross-validation and obtained consistent values of 96% and 94.4% for the training and validation accuracies. These accuracies reflect the network's ability to pinpoint the cropped image's location and determine the number of passes applied to the material. Meanwhile, the merged multiple-output CNN, designated as CNN C, exhibits training and validation accuracies of 98.9%/79.4% and 94.1%/78.1%, respectively. A closer examination of these results reveals a consistent and reasonable accuracy pattern across both models. This suggests that the models used in this study are sufficiently robust and capable of identifying key features in the data. Despite instances of misidentification, the reasonable level of accuracy achieved in most cases indicates the potential of these models for practical applications. Nonetheless, it is also important to note that model performance can be influenced by various factors, including the quality of the input data, the complexity of the task, and the appropriateness of the model architecture for the task at hand. Therefore, while these results are encouraging, further testing and refinement may be necessary to ensure that the models maintain high levels of accuracy when applied to new or more complex datasets.
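The accuracies quoted above are fractions of correctly classified cropped images, evaluated separately for the two features. A minimal sketch of that bookkeeping is given below; the labels and predictions are made up for illustration, and the joint accuracy (both features correct at once) is an additional quantity that the text does not report.

```python
def accuracy(predicted, actual):
    """Fraction of positions where the prediction equals the true label."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

# Hypothetical (location, number of passes) labels for four test crops.
true_labels = [("border", 0), ("border", 2), ("center", 1), ("center", 3)]
pred_labels = [("border", 0), ("border", 3), ("center", 1), ("border", 3)]

location_acc = accuracy([p[0] for p in pred_labels],
                        [t[0] for t in true_labels])
passes_acc = accuracy([p[1] for p in pred_labels],
                      [t[1] for t in true_labels])
joint_acc = accuracy(pred_labels, true_labels)  # both features correct

print(location_acc, passes_acc, joint_acc)  # 0.75 0.75 0.5
```

The joint accuracy can never exceed the smaller of the two per-feature accuracies, which is worth keeping in mind when the two outputs are used together.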

CONCLUSIONS
This study showed that the machine learning algorithm is capable of quantitatively analyzing the material's microstructure in the cold plastic deformation process. Our analysis was performed for processes involving roller-die cassettes, but its extension to other approaches, such as rotary swaging, is straightforward. This was demonstrated by successfully classifying wire plastic deformation techniques using two CNN models. The first model involves the training and validation of two distinct neural networks that separately identify the localization of the cropped cross-sectional profile and the number of passes. The second model unites the two networks into a multi-output configuration and performs the same task. By properly tuning the networks, reasonable training and validation accuracies were achieved. Specifically, it is shown that the architecture is essential in guaranteeing a satisfactory accuracy rate. Machine learning techniques have enjoyed a significant boost and have attracted much attention in recent years. In particular, the robustness of a binary prediction network has been manifestly demonstrated in different areas by many authors. To our knowledge, it has yet to be widely applied to wire plastic deformation. Although not explicitly presented in this paper, the algorithm has been successfully applied to different experimental setups, aiming to extract distinct features. For the present study, the analysis, namely, the training, validation, and test processes, has been exclusively carried out using the existing drawing parameters, which underscores the feasibility of the obtained CNN. Therefore, the present study serves as an initial attempt to demonstrate that the algorithm possesses promising potential. Also, the neural networks we have developed are not generative models, such as variational autoencoders (VAEs) or generative adversarial networks (GANs). In this regard, the established architectures were designed only to make predictions, not to generate patterns similar to the training data. Generating a meaningful profile is indeed an intriguing line of research that is worth exploring. Other applications and possible implications are left for future studies.

Figure 2. The wire rolling equipment utilized in the present study.

Figure 3. The cross-sectional profiles of the wires subject to different processes. (a) The profile of the material near the border, (b) the profile of the material near the center.

Figure 4. The cropped figures from the cross-sectional profiles constitute the training set for the CNN. As described in the main text, these images are of 120 × 90 pixel size, and their lower resolution is defined by the neural network's architecture. The first row shows the cropped figures near the border of the profile, while the second row corresponds to those close to the center of the cross-sectional area. From left to right, the figures are from the profile taken from the material that has passed the roller die zero times, once, twice, and three times.

Figure 5. The schematic architectures of the two models utilized in the present study. CNN A is constituted primarily by convolutional layers in conjunction with ReLu activation and pooling layers, and the multidimensional network is then flattened and turned into two possible categories. CNN B is designed similarly but with additional dropout layers. CNN C is essentially a merger of A and B, as it contains the main architecture of both networks [(120, 90)/64, (60, 45)/128, (28, 21)/256, (14, 11)/256].

Figure 9. A few cropped figures from the test set correctly identified by CNN B. The four figures on the top row correspond to the border, while those on the bottom row correspond to the center. From left to right, the plots correspond to the profiles where the material has passed through the roller die zero times, once, twice, and three times.

Table 2. The measures and mechanical properties of AISI 1008 steel wire rods subject to different passes in roller-die drawing. The resulting total reduction is 57.85%.

Table 5. The validation of the network regarding different architectures.

Table 6. The validation of the network regarding different numbers of layers. The obtained accuracies are evaluated using different numbers of layers with 500 epochs and a batch size of (25, 11).

Table 8. The validation of the network regarding different optimizers.