Machine learning control

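Machine learning control (MLC) is a subfield of machine learning, intelligent control, and control theory that solves optimal control problems with machine learning methods. Its main applications are complex nonlinear systems for which conventional control design methods, such as linear control theory, are not applicable.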

Types of problems and tasks

Four types of problems are commonly addressed with machine learning control:

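Control parameter identification: if the structure of the control law is known but its parameters are unknown, MLC reduces to parameter identification^[1]. Examples include optimizing the parameters of a PID controller with a genetic algorithm^[2] and applications to discrete-time optimal control^[3].
Control design as a regression problem of the first kind: if the sensor signals and the optimal actuation command are known for every state, MLC approximates a general nonlinear mapping from sensor signals to actuation commands. One example is computing a sensor feedback from a known full-state feedback. Neural networks are commonly used for this task^[4].
Control design as a regression problem of the second kind: MLC can also identify arbitrary nonlinear control laws that minimize the cost function of the plant. In this case, neither a model, nor the structure of the control law, nor the optimal actuation command needs to be known. The optimization relies only on the control performance measured on the plant. Genetic programming is a powerful regression technique for this purpose^[5].
Reinforcement learning control: the control law can be updated continually through reinforcement learning, based on measured changes in performance (rewards)^[6].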
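The first problem type above (a fixed control law structure with unknown parameters) can be illustrated with a small sketch. It is not taken from the cited works: the first-order plant, the gain ranges in `BOUNDS`, the integral-of-squared-error cost in `ise_cost`, and the genetic operators in `tune_pid` are all assumptions chosen to keep the example short.

```python
import random

# Hypothetical plant (not from the article): first-order lag
# dy/dt = (-y + u) / TAU, integrated with explicit Euler steps of size DT.
TAU, DT, STEPS = 1.0, 0.01, 500
# Assumed search ranges for the PID gains (Kp, Ki, Kd).
BOUNDS = [(0.0, 20.0), (0.0, 20.0), (0.0, 0.5)]


def ise_cost(gains):
    """Integral of squared error for a unit-step setpoint under PID control."""
    kp, ki, kd = gains
    y, integral, prev_err, cost = 0.0, 0.0, 1.0, 0.0
    for _ in range(STEPS):
        err = 1.0 - y
        integral += err * DT
        u = kp * err + ki * integral + kd * (err - prev_err) / DT
        y += DT * (-y + u) / TAU
        prev_err = err
        cost += err * err * DT
        if abs(y) > 1e6:  # diverging loop: return a large penalty instead
            return 1e12
    return cost


def tune_pid(pop_size=30, generations=20, mutation_scale=0.05, seed=0):
    """Simple genetic algorithm: elitism, 2-way tournaments, blend crossover,
    Gaussian mutation clipped to BOUNDS."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in BOUNDS]
                  for _ in range(pop_size)]
    history = []  # best cost found in each generation
    for _ in range(generations):
        scored = sorted((ise_cost(ind), ind) for ind in population)
        history.append(scored[0][0])
        next_gen = [list(scored[0][1])]  # keep the best individual unchanged
        while len(next_gen) < pop_size:
            _, p1 = min(rng.sample(scored, 2))  # tournament selection
            _, p2 = min(rng.sample(scored, 2))
            w = rng.random()  # blend weight shared by all three genes
            child = []
            for a, b, (lo, hi) in zip(p1, p2, BOUNDS):
                gene = w * a + (1.0 - w) * b
                gene += rng.gauss(0.0, mutation_scale * (hi - lo))
                child.append(min(hi, max(lo, gene)))
            next_gen.append(child)
        population = next_gen
    best_cost, best = min((ise_cost(ind), ind) for ind in population)
    return best, best_cost, history


if __name__ == "__main__":
    gains, cost, _ = tune_pid()
    print("Kp, Ki, Kd =", [round(g, 3) for g in gains], "cost =", round(cost, 5))
```

Because the assumed cost ignores actuator effort, the search tends to push the gains toward the upper ends of their ranges; a practical cost function would usually add a penalty on the control signal.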

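MLC includes neural network control, genetic algorithm based control, genetic programming control, and reinforcement learning control, among others. It overlaps methodologically with other data-driven control approaches, such as artificial intelligence and robot control.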

Applications

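MLC has been applied to many nonlinear control problems, exploring unknown and often unexpected actuation mechanisms. Example applications include: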

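Attitude control of satellites^[7].
Building thermal control^[8].
Feedback turbulence control^[2]^[9].
Remote control of underwater vehicles^[10].
Many more engineering applications of MLC are summarized in the 2002 review article by P. J. Fleming and R. C. Purshouse^[11].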

As with other nonlinear methods, MLC offers no guarantee of convergence, optimality, or robustness across a range of operating conditions.

References

[1] Thomas Bäck & Hans-Paul Schwefel (Spring 1993) "An overview of evolutionary algorithms for parameter optimization", Journal of Evolutionary Computation (MIT Press), vol. 1, no. 1, pp. 1-23.
[2] N. Benard, J. Pons-Prats, J. Periaux, G. Bugeda, J.-P. Bonnet & E. Moreau (2015) "Multi-Input Genetic Algorithm for Experimental Optimization of the Reattachment Downstream of a Backward-Facing Step with Surface Plasma Actuator", Paper AIAA 2015-2957 at 46th AIAA Plasmadynamics and Lasers Conference, Dallas, TX, USA, pp. 1-23.
[3] Zbigniew Michalewicz, Cezary Z. Janikow & Jacek B. Krawczyk (July 1992) "A modified genetic algorithm for optimal control problems", Computers & Mathematics with Applications, vol. 23, no. 12, pp. 83-94.
[4] C. Lee, J. Kim, D. Babcock & R. Goodman (1997) "Application of neural networks to turbulence control for drag reduction", Physics of Fluids, vol. 9, no. 6, pp. 1740-1747.
[5] D. C. Dracopoulos & S. Kent (December 1997) "Genetic programming for prediction and control", Neural Computing & Applications (Springer), vol. 6, no. 4, pp. 214-228.
[6] Andrew G. Barto (December 1994) "Reinforcement learning control", Current Opinion in Neurobiology, vol. 4, no. 6, pp. 888-893.
[7] Dimitris C. Dracopoulos & Antonia J. Jones (1994) "Neuro-genetic adaptive attitude control", Neural Computing & Applications (Springer), vol. 2, no. 4, pp. 183-204.
[8] Jonathan A. Wright, Heather A. Loosemore & Raziyeh Farmani (2002) "Optimization of building thermal design and control by multi-criterion genetic algorithm", Energy and Buildings, vol. 34, no. 9, pp. 959-972.
[9] Steven L. Brunton & Bernd R. Noack (2015) "Closed-loop turbulence control: Progress and challenges", Applied Mechanics Reviews, vol. 67, no. 5, article 050801, pp. 1-48.
[10] J. Javadi-Moghaddam & A. Bagheri (2010) "An adaptive neuro-fuzzy sliding mode based genetic algorithm control system for under water remotely operated vehicle", Expert Systems with Applications, vol. 37, no. 1, pp. 647-660.
[11] Peter J. Fleming & R. C. Purshouse (2002) "Evolutionary algorithms in control systems engineering: a survey", Control Engineering Practice, vol. 10, no. 11, pp. 1223-1241.

Further reading

Dimitris C. Dracopoulos (August 1997) "Evolutionary Learning Algorithms for Neural Adaptive Control", Springer. ISBN 978-3-540-76161-7.
Thomas Duriez, Steven L. Brunton & Bernd R. Noack (November 2016) "Machine Learning Control - Taming Nonlinear Dynamics and Turbulence", Springer. ISBN 978-3-319-40624-4.
