-
Preface
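Gaussian process regression differs from other regression algorithms in the following way. An ordinary regression algorithm takes an input X and aims to produce the corresponding value Y; the fitted function can take many forms, such as a linear fit or a polynomial fit. Gaussian process regression instead aims to obtain a distribution over the function f(x). So how is this done?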
For a dataset $D:(X,Y)$, let $f(x_i)=y_i$, which gives the vector $f=[f(x_1),f(x_2),\dots,f(x_n)]$. Denote the set of inputs $x_i$ to be predicted by $X^*$ and the corresponding predicted values by $f^*$.
Gaussian process regression first computes the joint probability distribution of the samples in the dataset, $f\sim N(\mu,K)$, where $\mu$ is the vector of means of $f(x_1),f(x_2),\dots,f(x_n)$ and $K$ is their covariance matrix. Then, from the prior distribution of the values to be predicted, $f^*\sim N(\mu^*,K^{**})$, together with $f\sim N(\mu,K)$, it computes the posterior distribution of $f^*$.
This raises two core questions: (1) how to compute the covariance matrix, and (2) how exactly to compute the probability distribution of $f^*$.
Put another way: given the training inputs $x_1,x_2,\dots,x_n$ and a new input $x^*$, we want to infer $f(x^*)$ from how $x^*$ relates to $x_1,x_2,\dots,x_n$.
-
Computing the covariance matrix
Define the mean function $m(x)=E(f(x))$ and the covariance function $k(x,x^{T})$, the covariance between the function values at two inputs, whose values form the matrix $K$.
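Computing the covariance matrix K directly from this raw definition is very inconvenient. Let us first understand how a Gaussian process uses a Gaussian distribution to describe samples, starting with Figure 1 and Figure 2: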
Figure 1 | Figure 2
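In Figure 1, the value of y clearly has very little correlation with the value of x. For such data we can give $f(x)$ a prior whose covariance matrix is diagonal, that is, the covariance between any two different x values is set to 0.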
In Figure 2, $f(x)$ depends strongly on x: nearby x values give nearby y values, so the data show high correlation. We can give it the prior $f(x)\sim N(0,K)$, where the covariance of $f(x)$ at $x_1,x_2,\dots,x_n$ is given by a function $k(x,x^{T})$ chosen to reflect how $f(x)$ varies with x.
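The contrast between the two priors can be checked numerically. The sketch below is only an illustration under assumptions: it uses a squared-exponential covariance with a hypothetical `length_scale` of 0.2 for the Figure 2 case, draws many functions from each prior, and measures how strongly the values at two neighboring inputs are correlated.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)   # 50 sorted inputs
n_draws = 2000

# Figure-1-like prior: diagonal covariance, values at different x are independent.
K_diag = np.eye(len(x))

# Figure-2-like prior: covariance decays with distance (assumed squared-exponential form).
length_scale = 0.2
sq_dist = (x[:, None] - x[None, :]) ** 2
K_smooth = np.exp(-0.5 * sq_dist / length_scale**2)


def neighbor_correlation(K):
    """Empirical correlation between f(x_0) and f(x_1) over many prior draws."""
    jitter = 1e-6 * np.eye(len(K))   # keeps the Cholesky factorization stable
    L = np.linalg.cholesky(K + jitter)
    draws = L @ rng.standard_normal((len(K), n_draws))   # each column ~ N(0, K)
    return np.corrcoef(draws[0], draws[1])[0, 1]


corr_diag = neighbor_correlation(K_diag)
corr_smooth = neighbor_correlation(K_smooth)
print(f"diagonal K: {corr_diag:.3f}, smooth K: {corr_smooth:.3f}")
```

With the diagonal covariance the neighboring values are essentially uncorrelated, while the distance-dependent covariance makes them move together, which is the behavior described for Figure 2.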
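We need the following two facts:
(1) A covariance matrix must be positive semidefinite.
(2) Every kernel function yields a positive semidefinite matrix. This means that all the kernel forms we learned when studying SVMs can be used here.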
The most widely used is of course the RBF kernel, where $\alpha$ and $l$ are the hyperparameters of the kernel that shape $f$.
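A commonly used form of the RBF kernel, assumed here since the author's exact expression may differ, is $k(x,x')=\alpha^{2}\exp\left(-\frac{\|x-x'\|^{2}}{2l^{2}}\right)$, with $\alpha$ acting as an amplitude and $l$ as a length scale. The sketch below builds the kernel matrix under that assumption and numerically checks the two facts stated above (symmetry and positive semidefiniteness). The function name `rbf_kernel` and the parameter values are illustrative only.

```python
import numpy as np


def rbf_kernel(X1, X2, alpha=1.0, length_scale=1.0):
    """Assumed RBF form: alpha**2 * exp(-||x - x'||**2 / (2 * length_scale**2)).

    X1 and X2 are arrays of shape (n_samples, n_features).
    """
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    sq_dist = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    sq_dist = np.maximum(sq_dist, 0.0)   # clip tiny negatives caused by round-off
    return alpha**2 * np.exp(-0.5 * sq_dist / length_scale**2)


rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(20, 1))
K = rbf_kernel(X, X, alpha=1.5, length_scale=0.3)

eigvals = np.linalg.eigvalsh(K)   # eigenvalues of the symmetric matrix K
print("symmetric:", np.allclose(K, K.T))
print("smallest eigenvalue:", eigvals.min())
```

The smallest eigenvalue may come out as a tiny negative number because of floating-point error; in practice a small multiple of the identity is often added to K before factorizing it.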
How are the kernel parameters learned? It is simple: the kernel $k(x,{x}')$ determines the distribution $f(x)\sim N(m(x),k(x,x^{T}))$, so we choose the parameters that maximize the probability $p(Y|X)$ of the observed samples under that distribution, that is, we maximize $\log p(Y|X)=\log N(\mu,K_{y})$.
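To make the objective concrete, the Gaussian log density expands to $\log p(Y|X)=-\frac{1}{2}(Y-\mu)^{T}K_{y}^{-1}(Y-\mu)-\frac{1}{2}\log|K_{y}|-\frac{n}{2}\log 2\pi$. The sketch below evaluates it for a few candidate length scales. It rests on assumptions not stated in the article: $K_y$ is taken to be an RBF kernel matrix plus a noise term $\sigma_n^2 I$, the mean is zero, and the data are synthetic. A real implementation would use a gradient-based optimizer rather than a grid.

```python
import numpy as np


def rbf(X1, X2, alpha, length_scale):
    """Assumed RBF kernel for inputs of shape (n, 1)."""
    sq_dist = (X1[:, None, 0] - X2[None, :, 0]) ** 2
    return alpha**2 * np.exp(-0.5 * sq_dist / length_scale**2)


def log_marginal_likelihood(X, y, alpha, length_scale, noise_std, mean=0.0):
    """log N(y | mean, K_y) with K_y = K + noise_std**2 * I (assumed noise model)."""
    n = len(y)
    K_y = rbf(X, X, alpha, length_scale) + noise_std**2 * np.eye(n)
    L = np.linalg.cholesky(K_y)                        # K_y = L L^T
    r = y - mean
    a = np.linalg.solve(L.T, np.linalg.solve(L, r))    # a = K_y^{-1} r
    half_log_det = np.sum(np.log(np.diag(L)))          # 0.5 * log|K_y|
    return -0.5 * r @ a - half_log_det - 0.5 * n * np.log(2.0 * np.pi)


rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 5.0, size=(30, 1)), axis=0)
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(30)   # synthetic noisy samples

candidates = [0.01, 0.1, 1.0, 10.0, 100.0]
scores = {ls: log_marginal_likelihood(X, y, 1.0, ls, 0.1) for ls in candidates}
best = max(scores, key=scores.get)
print("best length scale on the grid:", best)
```

Length scales that are far too small treat the samples as unrelated, and ones that are far too large force the function to be almost constant; both are penalized by the likelihood.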
-
Computing the posterior distribution
Figure 3
As the figure shows, the red points are the known data points, i.e. the training set, for which we give the prior distribution $f(x)\sim N(\mu,K)$.
The green crosses mark the X values of the points to be estimated, for which we give the prior distribution $f(x^*)\sim N(\mu^*,K(x^*,x^*))$.
Given $f(x)\sim N(\mu,K)$ and $f(x^*)\sim N(\mu^*,K(x^*,x^*))$, the two are jointly Gaussian:

$$\begin{bmatrix} f \\ f^* \end{bmatrix}\sim N\left(\begin{bmatrix}\mu \\ \mu^*\end{bmatrix},\begin{bmatrix}K & K^{*} \\ K^{*T} & K^{**}\end{bmatrix}\right)$$

where $K^{**}$ is the covariance matrix of $f(x^*)$, $K^{**}=k(X^*,X^*)$, and $K^{*}=k(X,X^*)$ is the covariance between the training points and the points to be estimated.
![Gaussian process regression, illustration 83](http://qn.javajgs.com/20230105/7490c088-f665-4f32-b981-c409b5e41074202301050cab28b3-83d1-44cf-bd48-8a003ec761f91.jpg)
With $p(f)$ and $p(f,f^*)$ we can obtain the conditional distribution $p(f^*|f)$, and from it the estimate of $f^*$.
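For jointly Gaussian variables the conditional has a standard closed form. Writing $K^{*}=k(X,X^*)$ for the covariance between training and test points and $K^{**}=k(X^*,X^*)$, it reads $f^*|f\sim N\left(\mu^*+K^{*T}K^{-1}(f-\mu),\;K^{**}-K^{*T}K^{-1}K^{*}\right)$. The sketch below implements it with NumPy under assumptions of its own: an RBF kernel with unit hyperparameters, constant means, noise-free observations, and a small `jitter` added for numerical stability. The names `rbf` and `gp_posterior` are illustrative.

```python
import numpy as np


def rbf(A, B, alpha=1.0, length_scale=1.0):
    """Assumed RBF kernel for 1-D input arrays A and B."""
    sq_dist = (A[:, None] - B[None, :]) ** 2
    return alpha**2 * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior(x_train, f_train, x_test, mu=0.0, mu_star=0.0, jitter=1e-8):
    """Mean and covariance of p(f* | f) for a GP with constant prior means."""
    K = rbf(x_train, x_train) + jitter * np.eye(len(x_train))
    K_s = rbf(x_train, x_test)     # K*
    K_ss = rbf(x_test, x_test)     # K**
    L = np.linalg.cholesky(K)      # K = L L^T
    k_inv_f = np.linalg.solve(L.T, np.linalg.solve(L, f_train - mu))  # K^{-1}(f - mu)
    v = np.linalg.solve(L, K_s)    # v^T v = K*^T K^{-1} K*
    mean = mu_star + K_s.T @ k_inv_f
    cov = K_ss - v.T @ v
    return mean, cov


x_train = np.array([-2.0, -1.0, 0.0, 1.5, 2.5])
f_train = np.sin(x_train)
x_test = np.array([-1.0, 0.5, 4.0])   # a training point, an interior point, a far point

mean, cov = gp_posterior(x_train, f_train, x_test)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
print("posterior mean:", mean)
print("posterior std: ", std)
```

At a test input that coincides with a training input the posterior collapses onto the observed value, while far from the data the standard deviation grows back toward the prior level.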
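The fit to the data of Figure 3 is shown in Figure 5, which plots only the mean curve. Figures 6 and 7 show the ability of Gaussian process regression to estimate the model error. In Figure 6, where the data fluctuate strongly in the latter part, the estimated variance reflects the fluctuation in that region. Figure 7 likewise shows how well Gaussian process regression estimates the mean and variance of the function value at unknown data points.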
![Gaussian process regression, illustration 97](http://qn.javajgs.com/20230105/995a676c-e444-4212-878d-287cf78655342023010542da9238-a790-4156-ab09-c6b9bfd4b7231.jpg)
![Gaussian process regression, illustration 99](http://qn.javajgs.com/20230105/42877067-5b94-4f0b-9db3-a51e6145f1cb2023010542f30574-243a-4797-b8fb-36f5a692f5b11.jpg)
![Gaussian process regression, illustration 101](http://qn.javajgs.com/20230105/168ceb33-a408-476f-9871-4c8a0f52f83e2023010572cb3f39-9c81-40f5-b52f-952edfc184e81.jpg)
-
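A tool for Gaussian processes: sklearn.gaussian_process
sklearn provides the very convenient gaussian_process module for working with Gaussian processes. We can call GaussianProcessRegressor to perform Gaussian process regression. Its interface is summarized below: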
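Parameters:

| Parameter | Type and default |
|---|---|
| `kernel` | kernel object |
| `alpha` | float or array-like, optional (default: 1e-10) |
| `optimizer` | string or callable, optional (default: "fmin_l_bfgs_b") |
| `normalize_y` | boolean, optional (default: False) |
| `copy_X_train` | bool, optional (default: True) |
| `random_state` | int, RandomState instance or None, optional (default: None) |

Attributes:

| Attribute | Type and shape |
|---|---|
| `X_train_` | array-like, shape = (n_samples, n_features) |
| `y_train_` | array-like, shape = (n_samples, [n_output_dims]) |
| `kernel_` | kernel object |
| `L_` | array-like, shape = (n_samples, n_samples) |
| `alpha_` | array-like, shape = (n_samples,) |
| `log_marginal_likelihood_value_` | float |

Methods:

| Method | Description |
|---|---|
| `fit(X, y)` | Fit the Gaussian process regression model. |
| `get_params([deep])` | Get the parameters of this estimator. |
| `log_marginal_likelihood([theta, eval_gradient])` | Return the log-marginal likelihood of theta for the training data. |
| `predict(X[, return_std, return_cov])` | Predict using the Gaussian process regression model. |
| `sample_y(X[, n_samples, random_state])` | Draw samples from the Gaussian process and evaluate them at X. |
| `score(X, y[, sample_weight])` | Return the coefficient of determination R^2 of the prediction. |
| `set_params(**params)` | Set the parameters of this estimator. |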
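As a quick tour of the methods listed above, the following sketch calls most of them on synthetic data. The data, the choice of an `RBF` kernel, and the `alpha` value are assumptions made for illustration and are not taken from the article.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 5.0, size=(25, 1))   # synthetic inputs
y = np.sin(X[:, 0])                       # noise-free targets

# alpha adds a small value to the diagonal of the kernel matrix during fitting
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-6, random_state=0)
gpr.fit(X, y)                             # also optimizes the kernel hyperparameters

X_new = np.linspace(0.0, 5.0, 10).reshape(-1, 1)
mean, std = gpr.predict(X_new, return_std=True)              # posterior mean and std
samples = gpr.sample_y(X_new, n_samples=3, random_state=0)   # three posterior draws
lml = gpr.log_marginal_likelihood(gpr.kernel_.theta)         # value at the fitted theta
r2 = gpr.score(X, y)                                         # R^2 on the training data

print("fitted kernel:", gpr.kernel_)
print("log-marginal likelihood:", lml)
print("R^2:", r2)
```

`kernel_` holds the kernel with optimized hyperparameters, which can differ from the `kernel` passed to the constructor.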
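Here is a small program that you can use for some simple experiments:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.gaussian_process import GaussianProcessRegressor

# Training data: 50 inputs in [0, 1), targets y = 2x + uniform noise in [0, 1)
x_train = np.random.random(50).reshape(50, 1)
y_train = x_train * 2 + np.random.random(50).reshape(50, 1)
plt.scatter(x_train, y_train, marker='o', color='r', label='training data', s=15)
plt.show()

# Default settings: kernel=None means a constant * RBF kernel with fixed hyperparameters
gaussian = GaussianProcessRegressor()
gaussian.fit(x_train, y_train)

# Predict the posterior mean on a grid of 100 points
x_grid = np.linspace(0.1, 1, 100)
y_mean = gaussian.predict(x_grid.reshape(100, 1))

plt.scatter(x_train, y_train, marker='o', color='r', label='training data', s=15)
plt.plot(x_grid, y_mean)
plt.show()
```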
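One possible variation on this program, offered as a sketch rather than a recommended configuration: with the default `alpha=1e-10` and a fixed kernel, the model tries to pass through every noisy point. Adding a `WhiteKernel` term lets the fit estimate a noise level, and `return_std=True` exposes the predictive standard deviation. The kernel choice and starting values below are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
x_train = rng.random((50, 1))
y_train = 2.0 * x_train[:, 0] + rng.random(50)   # same data recipe as the program above

# RBF models the smooth trend, WhiteKernel models observation noise (assumed setup)
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
gpr.fit(x_train, y_train)

x_grid = np.linspace(0.1, 1.0, 100).reshape(-1, 1)
mean, std = gpr.predict(x_grid, return_std=True)

# Approximate 95% band around the posterior mean
lower = mean - 1.96 * std
upper = mean + 1.96 * std
print("fitted kernel:", gpr.kernel_)
```

The band can be drawn with `plt.fill_between(x_grid[:, 0], lower, upper, alpha=0.3)` on top of the scatter plot from the original program.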
![Gaussian process regression, illustration 37](http://qn.javajgs.com/20230105/386391db-a90d-44ad-b7d0-fa0c45cf8170202301054d653197-a1f4-4709-9fd6-e9b32dea06be1.jpg)
![Gaussian process regression, illustration 39](http://qn.javajgs.com/20230105/c732cf63-6298-49ba-a751-bc3cb44d0a66202301055e53387e-0aba-4d93-9b5e-c3f6409c45a91.jpg)
![Gaussian process regression, illustration 65](http://qn.javajgs.com/20230105/0df2d909-14b7-4b46-8df0-3d0afa39eddf202301053fd178f5-eeee-4d31-ad43-5401dcb090bb1.jpg)