Using Kolmogorov-Arnold Networks for Discrepancy Modeling of Lateral Flow in Hot Rolling of Steel Slabs

Abstract:

Kolmogorov-Arnold networks (KANs) have emerged as a promising counterpart to multi-layer perceptrons (MLPs), offering more interpretable functionality across different machine learning (ML) applications. Their main difference lies in the definition of KAN layers, which use learnable activation functions, making these networks well suited for physics-based applications. In this work, we analyze the performance of KANs in capturing the physics of the hot rolling process, an integral part of the steel manufacturing industry. First, we introduce non-dimensional parameters that encapsulate the geometrical factors of the process and perform space-filling sampling in the space spanned by these parameters. The sampled points provide the parameters for the finite element (FE) simulations, which form the ground truth (GT) data for the network. A closed-form analytical model for spread is taken from previous studies in the literature, and its predictive performance is assessed against the FE results. In defining the network's input space, several alternatives are compared; it was observed that an input space containing both the non-dimensional features and the predictions of the analytical model reduced overfitting and improved generalization. The effect of the KAN hyperparameters is evaluated, and the network with tuned parameters demonstrates optimal performance on the test set. Lastly, by applying symbolification to this network, a closed-form expression is obtained that captures the discrepancy between the analytical model and the GT results, and its performance is tested against the test-set data.
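The space-filling sampling step described above can be sketched as follows. This is a minimal illustration using Latin hypercube sampling from SciPy; the three non-dimensional parameters and their bounds are hypothetical placeholders, not values from the paper.

```python
import numpy as np
from scipy.stats import qmc

# Hypothetical non-dimensional parameters (assumed, for illustration only),
# e.g. a reduction ratio, a width-to-thickness ratio, and a contact-length
# ratio, each with an assumed range.
l_bounds = [0.1, 2.0, 0.5]
u_bounds = [0.5, 10.0, 2.0]

# Latin hypercube design: 50 space-filling points in the unit cube [0, 1)^3.
sampler = qmc.LatinHypercube(d=3, seed=0)
unit_sample = sampler.random(n=50)

# Scale the unit-cube design to the physical parameter ranges; each row then
# defines one parameter set for an FE simulation.
design = qmc.scale(unit_sample, l_bounds, u_bounds)

print(design.shape)  # (50, 3)
```

Each row of `design` would parameterize one FE run, and the resulting FE spreads (minus the analytical-model prediction) would form the discrepancy targets for the KAN.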

