Xiaoyu Shares (IV): HJB Equations and the Hessian Matrix

2025-08-12 10:24

Share interests, spread happiness,
increase knowledge, and leave something beautiful behind!
Dear reader, this is LearningYard Academy.

Today's article is

"Xiaoyu Shares (IV): HJB Equations and the Hessian Matrix"

1. Mind map

[Figure: mind map]

2. Intensive reading

HJB equation

Definition of the HJB equation

The Hamilton-Jacobi-Bellman (HJB) equation is the core equation of optimal control theory. It characterizes the optimal value function as the solution of a partial differential equation, and solving it yields the corresponding optimal control strategy. Its theoretical basis is the dynamic programming principle proposed by Richard Bellman, and it is an important tool for describing and solving optimal control problems. With the HJB equation, the optimal decision of a controlled system in each state can be analyzed systematically so that the performance index is optimized. Because the model in the paper "Differential Game Model for Joint Upstream and Downstream Emission Reduction and Low-Carbon Promotion" [1] is a continuous-time model, this post covers only the continuous-time formulation.

Mathematical form of the HJB equation

Consider a controlled dynamic system whose state evolves according to the differential equation x'(t) = f(x(t),u(t)) with initial state x(0) = x_0. The control objective is to choose the control input u(t) up to a terminal time T so as to minimize the performance index

$$J(u) = \int_0^T L(x(t),u(t))\,dt + \Phi(x(T)),$$

where x(t) is the system state variable, u(t) is the control input, L is the instantaneous (running) cost function, and Φ is the cost function of the terminal state. To characterize the optimal cost from any time t and state x, the value function is defined as

$$V(t,x) = \min_{u(\cdot)} \left\{ \int_t^T L(x(s),u(s))\,ds + \Phi(x(T)) \;:\; x(t) = x \right\}.$$

According to the dynamic programming principle, the value function (where it is differentiable) satisfies the Hamilton-Jacobi-Bellman (HJB) equation

$$-\frac{\partial V}{\partial t}(t,x) = \min_{u} \left\{ L(x,u) + \nabla_x V(t,x)^{\top} f(x,u) \right\},$$

together with the terminal condition

$$V(T,x) = \Phi(x).$$

Solving this partial differential equation gives the optimal value function V, and the control u that attains the minimum at each (t,x) gives the corresponding optimal control strategy.
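As a rough sketch of where the equation comes from, the dynamic programming principle can be expanded to first order over a short time step. The step length h is a symbol introduced here for illustration, and the sketch assumes that V is continuously differentiable, which does not hold in every problem.

```latex
% Dynamic programming principle over a short interval [t, t+h]
V(t,x) = \min_{u(\cdot)} \left\{ \int_t^{t+h} L(x(s),u(s))\,ds
         + V\bigl(t+h,\,x(t+h)\bigr) \right\}

% First-order Taylor expansion, using x(t+h) = x + f(x,u)\,h + o(h)
V\bigl(t+h,\,x(t+h)\bigr) = V(t,x) + \frac{\partial V}{\partial t}(t,x)\,h
         + \nabla_x V(t,x)^{\top} f(x,u)\,h + o(h)

% Substitute, cancel V(t,x), divide by h, and let h \to 0
0 = \frac{\partial V}{\partial t}(t,x)
    + \min_{u} \left\{ L(x,u) + \nabla_x V(t,x)^{\top} f(x,u) \right\}
```

Moving the time derivative to the left-hand side gives the usual form of the HJB equation, with the running cost L and the dynamics f appearing only inside the minimization.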

Solution steps

[Figure: solution process of the HJB equation]
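One common route, sketched here on a toy problem that is not taken from the original post, is to minimize over u inside the HJB equation, substitute the minimizer back, guess a functional form for V, and solve the resulting ordinary differential equation. The assumed problem is x' = u with running cost L = x^2 + u^2, terminal cost Φ = 0, and horizon T. Minimizing gives u* = -V_x / 2, and the guess V(t,x) = p(t) x^2 reduces the HJB equation to p'(t) = p(t)^2 - 1 with p(T) = 0, whose exact solution is p(t) = tanh(T - t). The function names below are hypothetical.

```python
import math

def solve_riccati(T=1.0, n=1000):
    """Integrate p'(t) = p(t)**2 - 1 backward from p(T) = 0 with RK4.

    Returns the time grid t and the values p, where index 0 is t = 0.
    """
    h = T / n
    rhs = lambda p: p * p - 1.0
    p = [0.0] * (n + 1)            # p[n] holds the terminal value p(T) = 0
    for k in range(n, 0, -1):      # step from t_k down to t_{k-1}
        k1 = rhs(p[k])
        k2 = rhs(p[k] - 0.5 * h * k1)
        k3 = rhs(p[k] - 0.5 * h * k2)
        k4 = rhs(p[k] - h * k3)
        p[k - 1] = p[k] - h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    t = [k * h for k in range(n + 1)]
    return t, p

def closed_loop_cost(x0, t, p):
    """Simulate x' = u under the feedback u = -p(t) * x and sum the cost."""
    h = t[1] - t[0]
    x, cost = x0, 0.0
    for k in range(len(t) - 1):
        u = -p[k] * x                  # minimizer of the HJB right-hand side
        cost += (x * x + u * u) * h    # left Riemann sum of the running cost
        x += u * h                     # explicit Euler step of the dynamics
    return cost

t, p = solve_riccati(T=1.0, n=1000)
print("p(0) numeric :", p[0])
print("p(0) exact   :", math.tanh(1.0))
print("closed-loop cost from x0 = 1:", closed_loop_cost(1.0, t, p))
```

For x0 = 1 and T = 1 the value function predicts an optimal cost of V(0,1) = tanh(1) ≈ 0.7616, and the simulated closed-loop cost should land close to that, up to discretization error. Applying no control at all (u = 0) would cost exactly 1 on the same problem.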

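Hessian matrix

Definition of the Hessian matrix

The Hessian matrix is the square matrix of second-order partial derivatives of a multivariate function. It describes the local curvature of the function at a given point. Specifically, for a scalar function f: R^n → R of the variables x = (x_1, x_2, …, x_n), the Hessian matrix is defined as

$$H(x) = \begin{pmatrix}
\frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\
\frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2}
\end{pmatrix},
\qquad H_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}.$$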
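To make the definition concrete, the following minimal sketch approximates each entry of the Hessian by central differences. The function name `hessian_fd`, the step size `h`, and the test function f(x, y) = x^2 y + sin(y) are all assumptions chosen for illustration.

```python
import math

def hessian_fd(f, x, h=1e-4):
    """Approximate the Hessian of f at the point x by central differences."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                # Evaluate f with coordinate i moved by di*h and j by dj*h.
                y = list(x)
                y[i] += di * h
                y[j] += dj * h
                return f(y)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
    return H

# Illustrative function: f(x, y) = x^2 * y + sin(y)
f = lambda v: v[0] ** 2 * v[1] + math.sin(v[1])

H = hessian_fd(f, [1.0, 2.0])
# Analytic second derivatives at (1, 2): f_xx = 2y, f_xy = 2x, f_yy = -sin(y)
print(H)
print([[4.0, 2.0], [2.0, -math.sin(2.0)]])
```

The two printed matrices should agree to several decimal places. The approximation is symmetric by construction, which matches the symmetry of the Hessian for twice continuously differentiable functions.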

Properties of the Hessian matrix

For a twice continuously differentiable function, when the Hessian matrix is positive definite at a critical point (a point where the gradient is zero), the function is strictly convex in a neighborhood of that point, and the point is a strict local minimum.

When the Hessian matrix is negative definite at a critical point, the point is a strict local maximum.

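When the Hessian matrix at a critical point is neither positive definite nor negative definite, two cases arise. If it is indefinite (it has both positive and negative eigenvalues), the point is a saddle point. If it is only semidefinite, so that at least one eigenvalue is zero, the second-order test is inconclusive.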
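The second-order test above can be sketched in a few lines for the two-variable case, where the eigenvalues of a symmetric 2x2 matrix have a closed form. The function names and the tolerance `tol` are assumptions for illustration, and the input is assumed to be the Hessian evaluated at a critical point.

```python
import math

def eigenvalues_sym2(H):
    """Return (smallest, largest) eigenvalue of a symmetric 2x2 matrix."""
    a, b, c = H[0][0], H[0][1], H[1][1]
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean - radius, mean + radius

def classify_critical_point(H, tol=1e-9):
    """Classify a critical point from its 2x2 Hessian via eigenvalue signs."""
    lo, hi = eigenvalues_sym2(H)
    if lo > tol:
        return "local minimum"      # positive definite
    if hi < -tol:
        return "local maximum"      # negative definite
    if lo < -tol and hi > tol:
        return "saddle point"       # indefinite
    return "inconclusive"           # some eigenvalue is (numerically) zero

# Hessians at the origin of four simple functions
print(classify_critical_point([[2.0, 0.0], [0.0, 2.0]]))    # x^2 + y^2
print(classify_critical_point([[-2.0, 0.0], [0.0, -2.0]]))  # -(x^2 + y^2)
print(classify_critical_point([[2.0, 0.0], [0.0, -2.0]]))   # x^2 - y^2
print(classify_critical_point([[0.0, 0.0], [0.0, 0.0]]))    # x^4 + y^4
```

The last case shows why a zero eigenvalue leaves the test undecided: x^4 + y^4 has a minimum at the origin, yet its Hessian there is the zero matrix.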

In addition, the Hessian matrix is the Jacobian matrix of the gradient, and it supplies the second-order term of the Taylor expansion, f(x + d) ≈ f(x) + ∇f(x)^T d + (1/2) d^T H(x) d. It is widely used in second-order optimization algorithms (such as Newton's method) and in multivariable calculus.
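As an illustration of how the Hessian enters Newton's method, the sketch below minimizes the assumed test function f(x, y) = x^2 + y^2 + exp(x + y), which is strictly convex. Each iteration solves H(p) s = -∇f(p) for the step s. The function names, starting point, and tolerance are hypothetical choices, and the 2x2 linear system is solved with Cramer's rule to keep the example free of dependencies.

```python
import math

def grad(p):
    """Gradient of f(x, y) = x^2 + y^2 + exp(x + y)."""
    x, y = p
    e = math.exp(x + y)
    return [2 * x + e, 2 * y + e]

def hess(p):
    """Hessian of f(x, y) = x^2 + y^2 + exp(x + y)."""
    x, y = p
    e = math.exp(x + y)
    return [[2 + e, e], [e, 2 + e]]

def newton_minimize(p0, tol=1e-12, max_iter=50):
    """Newton iteration: repeatedly solve H * step = -gradient."""
    p = list(p0)
    for _ in range(max_iter):
        g = grad(p)
        if math.hypot(g[0], g[1]) < tol:
            break
        (a, b), (c, d) = hess(p)
        det = a * d - b * c
        # Cramer's rule for the 2x2 system H * step = -g
        sx = (-g[0] * d + b * g[1]) / det
        sy = (-a * g[1] + c * g[0]) / det
        p = [p[0] + sx, p[1] + sy]
    return p

p_star = newton_minimize([0.0, 0.0])
print("minimizer:", p_star)
print("gradient at minimizer:", grad(p_star))
```

Because the Hessian of this function is positive definite everywhere, every Newton step is a descent direction. For non-convex functions a plain Newton step can move toward a saddle point or a maximum, so practical implementations usually add a line search or a modification of the Hessian.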

That's all for today's sharing.
If you have your own thoughts on this article,
please leave us a message.
See you tomorrow.
Have a nice day!

Translation: Google Translate

Sources: ChatGPT, Baidu Baike

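References: [1] Xu Chunqiu, Zhao Daozhi, Yuan Baiyun, et al. Differential game model for joint upstream and downstream emission reduction and low-carbon promotion [J]. Journal of Management Sciences in China, 2016, 19(02): 53-65.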

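This article was compiled and published by LearningYard Academy. If there is any infringement, please leave us a message.

Copy: qiao

Layout: qiao

Review: Li Jie

Source: LearningYard Academy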
