Application of Q-Learning Controller for Processes with Dead Time
1. Silesian University of Technology, Faculty of Automatic Control, Electronics and Computer Science, Department of Automatic Control and Robotics, Gliwice 44-100, Poland.
Corresponding author
Jakub Musiał
ABSTRACT
This paper presents an extension of a self-improving, model-free Q-learning controller to industrial processes characterized by significant dead time. While conventional Q-learning-based control approaches have demonstrated effectiveness for systems without delay, their direct application to time-delay processes is hindered by the mismatch between control actions and their delayed observable effects. To address this limitation, the proposed method introduces a modified Q-learning update mechanism based on FIFO buffers that delay Q-value updates in accordance with the process dead time, ensuring a proper correlation between state–action pairs and the resulting system responses. Additionally, the reward policy is reformulated for the delayed update structure to support stable and convergent learning.
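The FIFO-buffered update idea can be illustrated with a minimal sketch. This is not the authors' implementation; the learning rate, discount factor, state/action discretization, and dead-time value below are all assumed for illustration. A state–action pair observed at step k is held in a queue for d steps (d being the dead time in samples) and only then updated with the reward its action actually produced:

```python
from collections import deque

# Hypothetical hyperparameters and discretization sizes (illustrative only)
ALPHA, GAMMA = 0.1, 0.9      # learning rate, discount factor
N_STATES, N_ACTIONS = 10, 3  # discretized state/action space sizes
DEAD_TIME = 4                # assumed process dead time, in samples

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
fifo = deque()               # state-action pairs awaiting their observable effect

def step(state, action, reward, next_state):
    """Push the new pair; update the pair whose delayed effect is now visible."""
    fifo.append((state, action))
    if len(fifo) > DEAD_TIME:
        s_old, a_old = fifo.popleft()
        # Standard Q-learning update, applied to the pair from DEAD_TIME steps ago,
        # so the reward is correlated with the action that caused it
        td_target = reward + GAMMA * max(Q[next_state])
        Q[s_old][a_old] += ALPHA * (td_target - Q[s_old][a_old])
```

The queue length matches the dead time expressed in sampling periods, so the ordinary Q-learning recursion is left intact and only the pair being updated is shifted back in time.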
The controller preserves the key practical advantages of the original Q2d framework, including model-free operation, bumpless initialization from existing PI controller parameters, and the ability to learn online during normal operation without externally applied excitation signals. The approach is validated through simulation studies on benchmark first-order plus dead time (FOPDT) processes with different dead times. The results demonstrate that the proposed method enables effective online improvement of setpoint tracking and disturbance rejection over a range of time-delay values and for different accuracies of the dead-time estimate.
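A minimal discrete-time FOPDT model of the kind used in such benchmark studies can be sketched as follows; the gain, time constant, dead time, and sampling period below are assumed values, not those of the paper:

```python
from collections import deque

def make_fopdt(K=1.0, tau=10.0, theta=5, Ts=1.0):
    """First-order plus dead time plant: gain K, time constant tau,
    dead time theta (in samples), sampling period Ts."""
    a = Ts / (tau + Ts)                       # Euler-discretized first-order lag
    buf = deque([0.0] * theta, maxlen=theta)  # transport-delay FIFO buffer
    y = 0.0
    def step(u):
        nonlocal y
        u_delayed = buf[0]        # input applied theta samples ago
        buf.append(u)             # maxlen deque discards the oldest sample
        y = (1 - a) * y + a * K * u_delayed
        return y
    return step

plant = make_fopdt()
outputs = [plant(1.0) for _ in range(60)]  # unit-step response
```

For a unit step, the output stays at zero for theta samples and then rises as a first-order lag toward the steady-state gain, which is exactly the delayed-effect mismatch the FIFO-based update mechanism is designed to handle.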
Overall, the proposed modification extends the applicability of Q-learning-based control to a wider class of industrial processes with time delay, providing a practical route to deploying reinforcement-learning controllers in systems where transport delay is unavoidable.