Optimize a Boosted Regression Ensemble

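This example shows how to optimize the hyperparameters of a boosted regression ensemble. The optimization minimizes the cross-validation loss of the model.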

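The problem is to model the efficiency of a car, in miles per gallon, based on its acceleration, engine displacement, horsepower, and weight. Load the carsmall data, which contains these and other predictors.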

load carsmall
X = [Acceleration Displacement Horsepower Weight];
Y = MPG;

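Fit a regression ensemble to the data using the LSBoost algorithm and surrogate splits. Optimize the resulting model by varying the number of learning cycles (NumLearningCycles), the maximum number of splits per tree (MaxNumSplits), and the learning rate (LearnRate). In addition, allow the optimization to repartition the cross-validation between iterations.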

For reproducibility, set the random seed and use the 'expected-improvement-plus' acquisition function.

rng('default')
Mdl = fitrensemble(X,Y, ...
    'Method','LSBoost', ...
    'Learners',templateTree('Surrogate','on'), ...
    'OptimizeHyperparameters',{'NumLearningCycles','MaxNumSplits','LearnRate'}, ...
    'HyperparameterOptimizationOptions',struct('Repartition',true, ...
    'AcquisitionFunctionName','expected-improvement-plus'))
|====================================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   | NumLearningC-|    LearnRate | MaxNumSplits |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    | ycles        |              |              |
|====================================================================================================================|
|    1 | Best   |      3.5219 |      15.193 |      3.5219 |      3.5219 |          383 |      0.51519 |            4 |
|    2 | Best   |      3.4752 |     0.67582 |      3.4752 |      3.4777 |           16 |      0.66503 |            7 |
|    3 | Best   |      3.1575 |      1.1182 |      3.1575 |      3.1575 |           33 |       0.2556 |           92 |
|    4 | Accept |      6.3076 |     0.64472 |      3.1575 |      3.1579 |           13 |    0.0053227 |            5 |
|    5 | Accept |      3.4449 |        7.82 |      3.1575 |      3.1579 |          277 |      0.45891 |           99 |
|    6 | Accept |      3.9806 |     0.39166 |      3.1575 |      3.1584 |           10 |      0.13017 |           33 |
|    7 | Best   |       3.059 |     0.39893 |       3.059 |        3.06 |           10 |      0.30126 |            3 |
|    8 | Accept |      3.1707 |     0.45922 |       3.059 |      3.1144 |           10 |      0.28991 |           16 |
|    9 | Accept |      3.0937 |      1.1831 |       3.059 |      3.1046 |           10 |      0.31488 |           13 |
|   10 | Accept |       3.196 |     0.34082 |       3.059 |      3.1233 |           10 |      0.32005 |           11 |
|   11 | Best   |      3.0495 |     0.44997 |      3.0495 |      3.1083 |           10 |      0.27882 |           85 |
|   12 | Best   |       2.946 |     0.85364 |       2.946 |      3.0774 |           10 |      0.27157 |            7 |
|   13 | Accept |      3.2026 |     0.42448 |       2.946 |      3.0995 |           10 |      0.25734 |           20 |
|   14 | Accept |      5.7151 |      9.5581 |       2.946 |      3.0996 |          376 |     0.001001 |           43 |
|   15 | Accept |       3.207 |      11.856 |       2.946 |      3.0937 |          499 |     0.027394 |           18 |
|   16 | Accept |      3.8606 |      1.7111 |       2.946 |      3.0937 |           36 |     0.041427 |           12 |
|   17 | Accept |      3.2026 |      10.836 |       2.946 |       3.095 |          443 |     0.019836 |           76 |
|   18 | Accept |      3.4832 |      4.9994 |       2.946 |      3.0956 |          205 |      0.99989 |            8 |
|   19 | Accept |      5.6285 |      4.4005 |       2.946 |      3.0942 |          192 |    0.0022197 |            2 |
|   20 | Accept |      3.0896 |      7.3584 |       2.946 |      3.0938 |          188 |     0.023227 |           93 |
|====================================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   | NumLearningC-|    LearnRate | MaxNumSplits |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    | ycles        |              |              |
|====================================================================================================================|
|   21 | Accept |      3.2654 |      4.1374 |       2.946 |      3.0951 |          167 |     0.023242 |           86 |
|   22 | Accept |      6.1202 |     0.56028 |       2.946 |      3.0904 |           16 |     0.010203 |           42 |
|   23 | Accept |      2.9963 |      9.0864 |       2.946 |      3.0985 |          440 |     0.076162 |            1 |
|   24 | Accept |      3.2801 |      4.1207 |       2.946 |       3.097 |          171 |     0.067074 |           69 |
|   25 | Accept |        3.47 |      11.084 |       2.946 |      3.0968 |          497 |      0.13969 |            6 |
|   26 | Accept |      3.4413 |      11.519 |       2.946 |      3.0945 |          497 |     0.051993 |           50 |
|   27 | Best   |      2.9095 |      4.5561 |      2.9095 |      2.9126 |          216 |     0.036052 |            1 |
|   28 | Accept |      3.0866 |      1.8208 |      2.9095 |      2.9153 |           78 |      0.24579 |            1 |
|   29 | Accept |      3.0473 |      5.6002 |      2.9095 |      2.9713 |          239 |     0.032173 |            1 |
|   30 | Accept |      3.0383 |     0.73582 |      2.9095 |       2.972 |           25 |      0.31894 |            1 |

[Figure: Min objective vs. Number of function evaluations. Two lines: Min observed objective and Estimated min objective.]

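__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 159.745 seconds
Total objective function evaluation time: 133.894

Best observed feasible point:
    NumLearningCycles    LearnRate    MaxNumSplits
    _________________    _________    ____________

           216           0.036052          1

Observed objective function value = 2.9095
Estimated objective function value = 2.972
Function evaluation time = 4.5561

Best estimated feasible point (according to models):
    NumLearningCycles    LearnRate    MaxNumSplits
    _________________    _________    ____________

           216           0.036052          1

Estimated objective function value = 2.972
Estimated function evaluation time = 5.6436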
Mdl =
  RegressionEnsemble
                         ResponseName: 'Y'
                CategoricalPredictors: []
                    ResponseTransform: 'none'
                      NumObservations: 94
    HyperparameterOptimizationResults: [1x1 BayesianOptimization]
                           NumTrained: 216
                               Method: 'LSBoost'
                         LearnerNames: {'Tree'}
                 ReasonForTermination: 'Terminated normally after completing the requested number of training cycles.'
                              FitInfo: [216x1 double]
                   FitInfoDescription: {2x1 cell}
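
The optimization objective is log(1+loss), so the values in the iteration table are not directly comparable to a mean squared error. As a minimal sketch, assuming Mdl is the model returned by the call above, the following reads the best observed point from the stored BayesianOptimization object and converts its objective back to a loss. The variable names results, bestPoint, and bestLoss are illustrative only.

```matlab
% Assumes Mdl was fit with 'OptimizeHyperparameters' as shown above.
results = Mdl.HyperparameterOptimizationResults;  % BayesianOptimization object

% Table holding the hyperparameter values at the best observed objective.
bestPoint = results.XAtMinObjective

% Undo the log(1+loss) transform to recover the cross-validation loss
% at that point.
bestLoss = exp(results.MinObjective) - 1
```

For the run shown here, the best observed objective of 2.9095 corresponds to a loss of roughly 17.3. Because the optimization repartitions the data at every iteration, this figure is not expected to match a later, independent kfoldLoss computation exactly.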

Compare the loss to that of a boosted, unoptimized model, and to that of the default ensemble.

loss = kfoldLoss(crossval(Mdl,'kfold',10))
loss = 18.2889
Mdl2 = fitrensemble(X,Y, ...
    'Method','LSBoost', ...
    'Learners',templateTree('Surrogate','on'));
loss2 = kfoldLoss(crossval(Mdl2,'kfold',10))
loss2 = 29.4663
Mdl3 = fitrensemble(X,Y);
loss3 = kfoldLoss(crossval(Mdl3,'kfold',10))
loss3 = 37.7424
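
To put the three losses on a common scale, the short sketch below computes the percentage reduction in cross-validated loss of the optimized model relative to each of the other two. It hard-codes the values reported above so that it runs on its own. The names losses and reduction are illustrative, and because each crossval call draws its own random partition, the percentages are only approximate.

```matlab
% Cross-validated losses reported above:
% optimized LSBoost, unoptimized LSBoost, default ensemble.
losses = [18.2889; 29.4663; 37.7424];

% Percentage reduction of the optimized loss relative to each model.
% The first entry is 0 by construction (the optimized model against itself).
reduction = 100*(1 - losses(1)./losses)
```

With these values, the optimized ensemble reduces the loss by about 38% relative to the unoptimized boosted model and by about 52% relative to the default ensemble.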

For a different way of optimizing this ensemble, see Optimize Regression Ensemble Using Cross-Validation.
