Basics
The essence of matrix differentiation: $\frac{dA}{dB}$ means differentiating every element of matrix $A$ with respect to every element of matrix $B$.
If $A$ is $1\times 1$ and $B$ is $1\times n$, then $\frac{dA}{dB}$ is $1\times n$.
If $A$ is $q\times p$ and $B$ is $m\times n$, then $\frac{dA}{dB}$ has $q\times p\times m\times n$ entries, one for each pair of an element of $A$ and an element of $B$.
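Rule of thumb: a scalar stays unchanged, a vector is stretched.
The front term (the function $f$ in the numerator) is stretched horizontally; the back term (the variable $x$ in the denominator) is stretched vertically.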
Examples
Example 1. Find $\frac{df(x)}{dx}$, where $f(x)$ is a scalar function, $f(x)=f(x_1,x_2,x_3,\dots,x_n)$, and $x$ is a vector, $x=\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}$.
Solution:
$$\frac{df(x)}{dx}=\begin{bmatrix} \frac{\partial f(x)}{\partial x_1}\\ \frac{\partial f(x)}{\partial x_2}\\ \vdots\\ \frac{\partial f(x)}{\partial x_n} \end{bmatrix}$$
The scalar $f(x)$ stays unchanged; the vector $x$ is stretched vertically.
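The result of Example 1 can be checked numerically. The following is a minimal sketch, not part of the original derivation: the helper `numerical_gradient` and the test function `f` are hypothetical names chosen for illustration, and the partial derivatives are approximated with central differences.

```python
def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of df/dx for a scalar function f.

    Returns one entry per x_i; read the list as an n x 1 column,
    because the vector x is stretched vertically.
    """
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


# Hypothetical scalar function: f(x) = x1^2 + 3*x2 + x1*x3
def f(x):
    return x[0] ** 2 + 3 * x[1] + x[0] * x[2]


x0 = [1.0, 2.0, 3.0]
grad = numerical_gradient(f, x0)

# Analytic gradient for comparison: [2*x1 + x3, 3, x1]
expected = [2 * x0[0] + x0[2], 3.0, x0[0]]
print(grad, expected)
```

The numerical and analytic values should agree up to a small finite-difference error.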
Example 2. Find $\frac{df(x)}{dx}$, where $f(x)$ is a vector function, $f(x)=\begin{bmatrix} f_1(x)\\ f_2(x)\\ \vdots\\ f_n(x) \end{bmatrix}$, and $x$ is a scalar.
Solution:
$$\frac{df(x)}{dx}=\begin{bmatrix} \frac{\partial f_1(x)}{\partial x} & \frac{\partial f_2(x)}{\partial x} & \cdots & \frac{\partial f_n(x)}{\partial x} \end{bmatrix}$$
The vector function is stretched horizontally; the scalar $x$ stays unchanged.
Example 3. Find $\frac{df(x)}{dx}$, where $f(x)$ is a vector function, $f(x)=\begin{bmatrix} f_1(x)\\ f_2(x)\\ \vdots\\ f_n(x) \end{bmatrix}$, and $x$ is a vector, $x=\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}$.
Solution: first stretch $x$ vertically, then stretch $f(x)$ horizontally within each row.
$$\frac{df(x)}{dx}=\begin{bmatrix} \frac{\partial f(x)}{\partial x_1}\\ \frac{\partial f(x)}{\partial x_2}\\ \vdots\\ \frac{\partial f(x)}{\partial x_n} \end{bmatrix}=\begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \frac{\partial f_2(x)}{\partial x_1} & \cdots & \frac{\partial f_n(x)}{\partial x_1}\\ \frac{\partial f_1(x)}{\partial x_2} & \frac{\partial f_2(x)}{\partial x_2} & \cdots & \frac{\partial f_n(x)}{\partial x_2}\\ \vdots & \vdots & \ddots & \vdots\\ \frac{\partial f_1(x)}{\partial x_n} & \frac{\partial f_2(x)}{\partial x_n} & \cdots & \frac{\partial f_n(x)}{\partial x_n} \end{bmatrix}$$
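The layout in Example 3 puts the components of $x$ along the rows and the components of $f$ along the columns, so entry $(i, j)$ is $\frac{\partial f_j}{\partial x_i}$. The sketch below builds that matrix numerically; `derivative_matrix` and the two-component function `f` are hypothetical, chosen only to illustrate the layout.

```python
def derivative_matrix(f, x, h=1e-6):
    """Return D with D[i][j] approximating d f_j / d x_i.

    Rows follow x (stretched vertically), columns follow f
    (stretched horizontally), matching the layout of Example 3.
    """
    rows = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        f_plus = f(x_plus)
        f_minus = f(x_minus)
        rows.append([(p - m) / (2 * h) for p, m in zip(f_plus, f_minus)])
    return rows


# Hypothetical vector function: f1 = x1*x2, f2 = x1 + x2^2
def f(x):
    return [x[0] * x[1], x[0] + x[1] ** 2]


D = derivative_matrix(f, [2.0, 3.0])

# Analytic values at x = (2, 3):
#   row 1 = [df1/dx1, df2/dx1] = [x2, 1]    = [3, 1]
#   row 2 = [df1/dx2, df2/dx2] = [x1, 2*x2] = [2, 6]
expected = [[3.0, 1.0], [2.0, 6.0]]
print(D)
```

Note that this matrix is the transpose of the Jacobian as it is written in the other common convention, where rows follow $f$ and columns follow $x$.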
Common matrix derivative formulas
Example 1: Find $\frac{df(x)}{dx}$, where $f(x)=A^TX$, $A=\begin{bmatrix} a_1\\ a_2\\ \vdots\\ a_n \end{bmatrix}_{n\times 1}$, $X=\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}_{n\times 1}$.
Solution:
$$f(x)=A^TX=\sum^n_{i=1}a_ix_i$$
$$\frac{df(x)}{dx}=\begin{bmatrix} \frac{\partial f(x)}{\partial x_1}\\ \frac{\partial f(x)}{\partial x_2}\\ \vdots\\ \frac{\partial f(x)}{\partial x_n} \end{bmatrix}=\begin{bmatrix} a_1\\ a_2\\ \vdots\\ a_n \end{bmatrix}=A$$
Since the transpose of a scalar is the scalar itself, $f(x)=A^TX=X^TA$, and therefore $\frac{dA^TX}{dx}=\frac{dX^TA}{dx}=A$.
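The identity $\frac{dA^TX}{dx}=A$ can be sanity-checked with finite differences. This is only an illustrative sketch: the helper names `dot` and `numerical_gradient` and the sample values of `a` and `x0` are assumptions, not taken from the source.

```python
def dot(a, x):
    """Inner product a^T x of two equal-length lists."""
    return sum(ai * xi for ai, xi in zip(a, x))


def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of df/dx for a scalar function f."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


# Arbitrary sample data (hypothetical values)
a = [2.0, -1.0, 0.5]
x0 = [1.0, 4.0, -3.0]

# f(x) = A^T X; its gradient should reproduce A regardless of x0
grad = numerical_gradient(lambda x: dot(a, x), x0)
print(grad, a)
```

Because $f$ is linear in $X$, the gradient does not depend on the point `x0` at which it is evaluated.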
Example 2: Find $\frac{df(x)}{dx}$, where $f(x)=X^TAX$, $X=\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}_{n\times 1}$, $A=\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$.
Solution: $f(x)=X^T_{1\times n}A_{n\times n}X_{n\times 1}$ is a scalar.
$$f(x)=\begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}=\sum^n_{i=1}\sum^n_{j=1}a_{ij}x_ix_j$$
$$\begin{aligned}
\frac{df(x)}{dx}
&=\begin{bmatrix} \frac{\partial f(x)}{\partial x_1}\\ \frac{\partial f(x)}{\partial x_2}\\ \vdots\\ \frac{\partial f(x)}{\partial x_n} \end{bmatrix}
=\begin{bmatrix} \sum^n_{j=1}a_{1j}x_j+\sum^n_{i=1}a_{i1}x_i\\ \sum^n_{j=1}a_{2j}x_j+\sum^n_{i=1}a_{i2}x_i\\ \vdots\\ \sum^n_{j=1}a_{nj}x_j+\sum^n_{i=1}a_{in}x_i \end{bmatrix}\\
&=\begin{bmatrix} \sum^n_{j=1}a_{1j}x_j\\ \sum^n_{j=1}a_{2j}x_j\\ \vdots\\ \sum^n_{j=1}a_{nj}x_j \end{bmatrix}
+\begin{bmatrix} \sum^n_{i=1}a_{i1}x_i\\ \sum^n_{i=1}a_{i2}x_i\\ \vdots\\ \sum^n_{i=1}a_{in}x_i \end{bmatrix}\\
&=\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}
+\begin{bmatrix} a_{11} & a_{21} & \cdots & a_{n1}\\ a_{12} & a_{22} & \cdots & a_{n2}\\ \vdots & \vdots & \ddots & \vdots\\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{bmatrix}\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}\\
&=AX+A^TX
\end{aligned}$$
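As a numerical check of $\frac{dX^TAX}{dx}=AX+A^TX$, the sketch below compares a finite-difference gradient with the closed form for a deliberately non-symmetric $A$. All helper names (`matvec`, `transpose`, `quadratic_form`, `numerical_gradient`) and the sample values are hypothetical and serve only as an illustration.

```python
def matvec(A, x):
    """Matrix-vector product A x for a list-of-rows matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def transpose(A):
    """Transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*A)]


def quadratic_form(A, x):
    """f(x) = x^T A x, a scalar."""
    return sum(x_i * y_i for x_i, y_i in zip(x, matvec(A, x)))


def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of df/dx for a scalar function f."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


# Non-symmetric A, so that AX and A^T X differ (hypothetical values)
A = [[1.0, 2.0],
     [3.0, 4.0]]
x0 = [1.0, 2.0]

numeric = numerical_gradient(lambda x: quadratic_form(A, x), x0)

# Closed form from the derivation: AX + A^T X
closed_form = [u + v for u, v in zip(matvec(A, x0), matvec(transpose(A), x0))]
print(numeric, closed_form)
```

For these values $AX=(5, 11)$ and $A^TX=(7, 10)$, so the closed form is $(12, 21)$. When $A$ is symmetric the two terms coincide and the result reduces to $2AX$.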
Reference
https://www.bilibili.com/video/BV1xk4y1B7RQ?p=4