Matrix Differentiation Explained

Basics

The essence of matrix differentiation: $\frac{dA}{dB}$ means differentiating every element of matrix $A$ with respect to every element of matrix $B$.

If $A$ is $1\times 1$ and $B$ is $1\times n$, then $\frac{dA}{dB}$ is $1\times n$.
If $A$ is $q\times p$ and $B$ is $m\times n$, then $\frac{dA}{dB}$ has $q\times p\times m\times n$ entries, one partial derivative for each pair of elements.
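The element count can be checked by brute force. The following is a minimal sketch, not a definitive implementation: `A_of_B` is a hypothetical function (not from the original article) that maps a $2\times 3$ matrix $B$ to a $2\times 2$ matrix $A$, and the partial derivatives are approximated with central finite differences. The result is stored as a nested list `D` with `D[i][j][k][l]` approximating $\partial A_{ij}/\partial B_{kl}$.

```python
def A_of_B(B):
    """Hypothetical example: maps a 2x3 matrix B to a 2x2 matrix A."""
    return [
        [B[0][0] * B[1][1], B[0][1] + B[0][2]],
        [B[1][0] ** 2, B[1][2] * B[0][0]],
    ]


def all_partials(func, B, h=1e-6):
    """Return D with D[i][j][k][l] ~= dA[i][j] / dB[k][l] (central differences)."""
    A = func(B)
    q, p = len(A), len(A[0])
    m, n = len(B), len(B[0])
    D = [[[[0.0] * n for _ in range(m)] for _ in range(p)] for _ in range(q)]
    for k in range(m):
        for l in range(n):
            # Perturb a single element of B in both directions.
            B_plus = [row[:] for row in B]
            B_minus = [row[:] for row in B]
            B_plus[k][l] += h
            B_minus[k][l] -= h
            A_plus, A_minus = func(B_plus), func(B_minus)
            for i in range(q):
                for j in range(p):
                    D[i][j][k][l] = (A_plus[i][j] - A_minus[i][j]) / (2 * h)
    return D


B = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
D = all_partials(A_of_B, B)
shape = (len(D), len(D[0]), len(D[0][0]), len(D[0][0][0]))
print(shape)  # (2, 2, 2, 3), i.e. q x p x m x n
```

Here $q=p=2$, $m=2$, $n=3$, so the derivative holds $2\times 2\times 2\times 3 = 24$ partial derivatives.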


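Scalars stay unchanged; vectors are stretched.
The front term (the function being differentiated) is stretched horizontally; the back term (the variable) is stretched vertically.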

Examples

Example 1. Find $\frac{df(x)}{dx}$, where $f(x)$ is a scalar function, $f(x)=f(x_1,x_2,x_3,\dots,x_n)$, and $x$ is a vector:

$$x=\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}$$

Solution:

$$\frac{df(x)}{dx}=\begin{bmatrix} \frac{\partial f(x)}{\partial x_1}\\ \frac{\partial f(x)}{\partial x_2}\\ \vdots\\ \frac{\partial f(x)}{\partial x_n} \end{bmatrix}$$

The scalar $f(x)$ stays unchanged; the vector $x$ is stretched vertically.
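As a rough numerical illustration of Example 1, the sketch below approximates each partial derivative with a central finite difference and stacks the results into an $n\times 1$ column. The function `f` is a hypothetical choice made only for this illustration, $f(x)=x_1^2+3x_2+x_1x_3$, and the step size `h` is likewise an assumption.

```python
def f(x):
    """Hypothetical scalar function: f(x) = x1^2 + 3*x2 + x1*x3."""
    return x[0] ** 2 + 3.0 * x[1] + x[0] * x[2]


def gradient(func, x, h=1e-6):
    """Approximate [df/dx_1, ..., df/dx_n] with central differences."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = x[:], x[:]
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((func(x_plus) - func(x_minus)) / (2 * h))
    return grad


x = [1.0, 2.0, 3.0]
g = gradient(f, x)
# The scalar f is left alone; the vector x is stretched vertically,
# so the result is an n x 1 column with the same shape as x.
column = [[gi] for gi in g]
print(column)
```

For this `f` the exact partial derivatives are $2x_1+x_3$, $3$, and $x_1$, so at $x=(1,2,3)$ the column should be close to $(5, 3, 1)^T$.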

Example 2. Find $\frac{df(x)}{dx}$, where $f(x)$ is a vector function and $x$ is a scalar:

$$f(x)=\begin{bmatrix} f_1(x)\\ f_2(x)\\ \vdots\\ f_n(x) \end{bmatrix}$$

Solution:

$$\frac{df(x)}{dx}=\begin{bmatrix} \frac{\partial f_1(x)}{\partial x} & \frac{\partial f_2(x)}{\partial x} & \cdots & \frac{\partial f_n(x)}{\partial x} \end{bmatrix}$$

The vector function is stretched horizontally; the scalar $x$ stays unchanged.
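Example 2 can be sketched numerically in the same way. The vector function below, $f(x)=(x^2,\ \sin x,\ 3x)$, is a hypothetical choice for illustration only; the derivative of each component with respect to the scalar $x$ is laid out as a $1\times n$ row.

```python
import math


def f(x):
    """Hypothetical vector function of a scalar: f(x) = (x^2, sin x, 3x)."""
    return [x ** 2, math.sin(x), 3.0 * x]


def derivative_row(func, x, h=1e-6):
    """Approximate [df_1/dx, ..., df_n/dx] as a row (central differences)."""
    f_plus, f_minus = func(x + h), func(x - h)
    return [(a - b) / (2 * h) for a, b in zip(f_plus, f_minus)]


# The scalar x is left alone; the vector function is stretched horizontally.
row = derivative_row(f, 2.0)
print(row)
```

The exact derivatives are $2x$, $\cos x$, and $3$, so at $x=2$ the row should be close to $(4,\ \cos 2,\ 3)$.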

Example 3. Find $\frac{df(x)}{dx}$, where $f(x)$ is a vector function and $x$ is a vector:

$$f(x)=\begin{bmatrix} f_1(x)\\ f_2(x)\\ \vdots\\ f_n(x) \end{bmatrix},\qquad x=\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}$$

Solution: first stretch $x$ vertically, then stretch each $\frac{\partial f(x)}{\partial x_i}$ horizontally:

$$\frac{df(x)}{dx}=\begin{bmatrix} \frac{\partial f(x)}{\partial x_1}\\ \frac{\partial f(x)}{\partial x_2}\\ \vdots\\ \frac{\partial f(x)}{\partial x_n} \end{bmatrix}= \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \frac{\partial f_2(x)}{\partial x_1} & \cdots & \frac{\partial f_n(x)}{\partial x_1}\\ \frac{\partial f_1(x)}{\partial x_2} & \frac{\partial f_2(x)}{\partial x_2} & \cdots & \frac{\partial f_n(x)}{\partial x_2}\\ \vdots & \vdots & \ddots & \vdots\\ \frac{\partial f_1(x)}{\partial x_n} & \frac{\partial f_2(x)}{\partial x_n} & \cdots & \frac{\partial f_n(x)}{\partial x_n} \end{bmatrix}$$
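The layout in Example 3 puts the variable index on the rows and the function index on the columns, so entry $(i,j)$ is $\frac{\partial f_j(x)}{\partial x_i}$. The sketch below builds that matrix with central finite differences for a hypothetical function $f(x)=(x_1x_2,\ x_1+x_2^2)$, chosen only for illustration.

```python
def f(x):
    """Hypothetical vector function: f1 = x1*x2, f2 = x1 + x2^2."""
    return [x[0] * x[1], x[0] + x[1] ** 2]


def derivative_matrix(func, x, h=1e-6):
    """Row i holds partials with respect to x_i; column j belongs to f_j."""
    rows = []
    for i in range(len(x)):
        x_plus, x_minus = x[:], x[:]
        x_plus[i] += h
        x_minus[i] -= h
        f_plus, f_minus = func(x_plus), func(x_minus)
        rows.append([(a - b) / (2 * h) for a, b in zip(f_plus, f_minus)])
    return rows


M = derivative_matrix(f, [2.0, 3.0])
for row in M:
    print(row)
```

At $x=(2,3)$ the exact matrix is $\begin{bmatrix} 3 & 1\\ 2 & 6 \end{bmatrix}$. Note that this layout is the transpose of the Jacobian as it is usually defined, where rows correspond to function components.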

Common Matrix Derivative Formulas

Example 1: Find $\frac{df(x)}{dx}$, where $f(x)=A^TX$ and

$$A=\begin{bmatrix} a_1\\ a_2\\ \vdots\\ a_n \end{bmatrix}_{n\times 1},\qquad X=\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}_{n\times 1}$$

Solution: $f(x)=A^TX=\sum_{i=1}^{n}a_ix_i$ is a scalar, so $\frac{\partial f(x)}{\partial x_i}=a_i$ and

$$\frac{df(x)}{dx}=\begin{bmatrix} \frac{\partial f(x)}{\partial x_1}\\ \frac{\partial f(x)}{\partial x_2}\\ \vdots\\ \frac{\partial f(x)}{\partial x_n} \end{bmatrix}=\begin{bmatrix} a_1\\ a_2\\ \vdots\\ a_n \end{bmatrix}=A$$


Because the transpose of a scalar is the scalar itself, $f(x)=A^TX=X^TA$, and therefore $\frac{dA^TX}{dx}=\frac{dX^TA}{dx}=A$.
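The result $\frac{dA^TX}{dx}=A$ can be sanity-checked numerically. The sketch below uses arbitrary example values for `A` and `X` (assumptions, not from the original) and compares a central finite-difference gradient of $A^TX$ against $A$.

```python
def dot(a, x):
    """Inner product a^T x of two equal-length lists."""
    return sum(ai * xi for ai, xi in zip(a, x))


def gradient(func, x, h=1e-6):
    """Approximate [df/dx_1, ..., df/dx_n] with central differences."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = x[:], x[:]
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((func(x_plus) - func(x_minus)) / (2 * h))
    return grad


A = [2.0, -1.0, 0.5]   # arbitrary example coefficients
X = [1.0, 4.0, -3.0]   # arbitrary evaluation point

g = gradient(lambda x: dot(A, x), X)
max_error = max(abs(gi - ai) for gi, ai in zip(g, A))
print(max_error < 1e-6)
```

Because $A^TX$ is linear in $X$, the gradient does not depend on the evaluation point; any other `X` should give the same answer up to rounding.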

Example 2: Find $\frac{df(x)}{dx}$, where $f(x)=X^TAX$ and

$$X=\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}_{n\times 1},\qquad A=\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$$

Solution: $f(x)=X^T_{1\times n}A_{n\times n}X_{n\times 1}$ is a scalar:

$$f(x)=\begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}=\sum_{i=1}^{n}\sum_{j=1}^{n}a_{ij}x_ix_j$$


In the double sum $\sum_{i=1}^{n}\sum_{j=1}^{n}a_{ij}x_ix_j$, the variable $x_k$ appears in the terms with $i=k$ and in the terms with $j=k$, so $\frac{\partial f(x)}{\partial x_k}=\sum_{j=1}^{n}a_{kj}x_j+\sum_{i=1}^{n}a_{ik}x_i$. Stacking these for $k=1,\dots,n$:

$$\frac{df(x)}{dx}=\begin{bmatrix} \frac{\partial f(x)}{\partial x_1}\\ \frac{\partial f(x)}{\partial x_2}\\ \vdots\\ \frac{\partial f(x)}{\partial x_n} \end{bmatrix}=\begin{bmatrix} \sum_{j=1}^{n}a_{1j}x_j+\sum_{i=1}^{n}a_{i1}x_i\\ \sum_{j=1}^{n}a_{2j}x_j+\sum_{i=1}^{n}a_{i2}x_i\\ \vdots\\ \sum_{j=1}^{n}a_{nj}x_j+\sum_{i=1}^{n}a_{in}x_i \end{bmatrix}=\begin{bmatrix} \sum_{j=1}^{n}a_{1j}x_j\\ \sum_{j=1}^{n}a_{2j}x_j\\ \vdots\\ \sum_{j=1}^{n}a_{nj}x_j \end{bmatrix}+\begin{bmatrix} \sum_{i=1}^{n}a_{i1}x_i\\ \sum_{i=1}^{n}a_{i2}x_i\\ \vdots\\ \sum_{i=1}^{n}a_{in}x_i \end{bmatrix}$$

$$=\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}+\begin{bmatrix} a_{11} & a_{21} & \cdots & a_{n1}\\ a_{12} & a_{22} & \cdots & a_{n2}\\ \vdots & \vdots & \ddots & \vdots\\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{bmatrix}\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}=AX+A^TX$$
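The formula $\frac{dX^TAX}{dx}=AX+A^TX$ can be checked the same way. The sketch below is an illustration under stated assumptions: `A` is an arbitrary non-symmetric $2\times 2$ matrix and `X` an arbitrary point, neither taken from the original. A non-symmetric `A` is used on purpose so that $AX$ and $A^TX$ differ.

```python
def quad_form(A, x):
    """Compute x^T A x as the double sum over a_ij * x_i * x_j."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def mat_vec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def gradient(func, x, h=1e-6):
    """Approximate [df/dx_1, ..., df/dx_n] with central differences."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = x[:], x[:]
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((func(x_plus) - func(x_minus)) / (2 * h))
    return grad


A = [[1.0, 2.0], [3.0, 4.0]]   # arbitrary, deliberately non-symmetric
X = [1.0, -2.0]                # arbitrary evaluation point

numeric = gradient(lambda x: quad_form(A, x), X)
analytic = [u + v for u, v in zip(mat_vec(A, X), mat_vec(transpose(A), X))]
print(analytic)  # [-8.0, -11.0]
print(max(abs(a - b) for a, b in zip(numeric, analytic)) < 1e-4)
```

For this choice $f(x)=x_1^2+5x_1x_2+4x_2^2$, whose partial derivatives $2x_1+5x_2$ and $5x_1+8x_2$ evaluate to $-8$ and $-11$ at $X=(1,-2)$, matching $AX+A^TX$. When $A$ is symmetric the formula reduces to $2AX$.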

References

https://www.bilibili.com/video/BV1xk4y1B7RQ?p=4
