2. Hyperparameter Tuning with scikit-learn and Python (Python code, dataset included)

Resource file list:

2.intro-hyperparameter-tuning.zip contains approximately 7 files:
  1. intro-hyperparameter-tuning/
  2. intro-hyperparameter-tuning/train_svr.py 1.18KB
  3. intro-hyperparameter-tuning/pyimagesearch/
  4. intro-hyperparameter-tuning/train_svr_grid.py 1.79KB
  5. intro-hyperparameter-tuning/abalone_train.csv 142.5KB
  6. intro-hyperparameter-tuning/train_svr_random.py 1.85KB
  7. intro-hyperparameter-tuning/pyimagesearch/config.py 225B
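
The training scripts import config.CSV_PATH and config.COLS from pyimagesearch/config.py (225B, listed above), but that file's contents are not shown on this page. A minimal, hypothetical reconstruction follows, assuming the CSV is the abalone dataset with the age/ring count as the last (target) column; the actual column names in the archive may differ:

    # pyimagesearch/config.py -- hypothetical reconstruction; the real
    # 225-byte file is not reproduced on this page
    # path to the abalone training CSV included in the archive
    CSV_PATH = "abalone_train.csv"

    # column names for the dataframe; the last column is assumed to be
    # the regression target (the abalone's age)
    COLS = ["Length", "Diameter", "Height", "Whole weight",
        "Shucked weight", "Viscera weight", "Shell weight", "Age"]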

Resource description:

In this tutorial, you will learn how to tune model hyperparameters using scikit-learn and Python. We will begin by discussing what hyperparameter tuning is and why it is so important. From there, we will configure your development environment and review the project directory structure. We will then run three Python scripts:

  1. One that trains a model without any hyperparameter tuning, so we can establish a baseline (a hedged sketch follows below)
  2. One that uses an algorithm called grid search to exhaustively examine all combinations of hyperparameters; this approach guarantees a complete sweep of the hyperparameter values, but is also slow (see the sketch after the included listing)
  3. One that uses random search to sample hyperparameter values from distributions; it is not guaranteed to cover every hyperparameter value, but in practice it is often as accurate as a grid search and runs much faster (the full train_svr_random.py listing is included below)
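
The baseline script, train_svr.py (1.18KB in the file list), is not reproduced on this page. Based on the description and the structure of the included random-search listing, a minimal sketch might look like the following; fitting an SVR with scikit-learn's default hyperparameters is an assumption consistent with "no tuning", not the archive's verified contents:

    # train_svr.py -- hedged sketch of the untuned baseline; the
    # actual file in the archive may differ
    from pyimagesearch import config
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVR
    import pandas as pd

    # load the dataset, split off the label column, and hold out 15%
    # of the rows for evaluation (mirroring the included listing)
    print("[INFO] loading data...")
    dataset = pd.read_csv(config.CSV_PATH, names=config.COLS)
    dataX = dataset[dataset.columns[:-1]]
    dataY = dataset[dataset.columns[-1]]
    (trainX, testX, trainY, testY) = train_test_split(dataX,
        dataY, random_state=3, test_size=0.15)

    # standardize features using statistics computed on the training set
    scaler = StandardScaler()
    trainX = scaler.fit_transform(trainX)
    testX = scaler.transform(testX)

    # train an SVR with default hyperparameters to establish the
    # baseline R^2 score that the tuned models must beat
    model = SVR()
    model.fit(trainX, trainY)
    print("R2: {:.2f}".format(model.score(testX, testY)))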
train_svr_random.py:

    # USAGE
    # python train_svr_random.py

    # import the necessary packages
    from pyimagesearch import config
    from sklearn.model_selection import RandomizedSearchCV
    from sklearn.model_selection import RepeatedKFold
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVR
    from sklearn.model_selection import train_test_split
    from scipy.stats import loguniform
    import pandas as pd

    # load the dataset, separate the features and labels, and perform a
    # training and testing split using 85% of the data for training and
    # 15% for evaluation
    print("[INFO] loading data...")
    dataset = pd.read_csv(config.CSV_PATH, names=config.COLS)
    dataX = dataset[dataset.columns[:-1]]
    dataY = dataset[dataset.columns[-1]]
    (trainX, testX, trainY, testY) = train_test_split(dataX,
        dataY, random_state=3, test_size=0.15)

    # standardize the feature values by computing the mean, subtracting
    # the mean from the data points, and then dividing by the standard
    # deviation
    scaler = StandardScaler()
    trainX = scaler.fit_transform(trainX)
    testX = scaler.transform(testX)

    # initialize model and define the space of the hyperparameters to
    # perform the randomized search over
    model = SVR()
    kernel = ["linear", "rbf", "sigmoid", "poly"]
    tolerance = loguniform(1e-6, 1e-3)
    C = [1, 1.5, 2, 2.5, 3]
    grid = dict(kernel=kernel, tol=tolerance, C=C)

    # initialize a cross-validation fold and perform a randomized
    # search to tune the hyperparameters
    print("[INFO] randomized searching over the hyperparameters...")
    cvFold = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
    randomSearch = RandomizedSearchCV(estimator=model, n_jobs=-1,
        cv=cvFold, param_distributions=grid,
        scoring="neg_mean_squared_error")
    searchResults = randomSearch.fit(trainX, trainY)

    # extract the best model and evaluate it
    print("[INFO] evaluating...")
    bestModel = searchResults.best_estimator_
    print("R2: {:.2f}".format(bestModel.score(testX, testY)))
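
train_svr_grid.py (1.79KB in the file list) is also not reproduced here. A hedged sketch of how the grid-search variant likely differs from the listing above: GridSearchCV must evaluate every combination exhaustively, so the continuous loguniform distribution for tol is replaced with a discrete list of candidate values. The specific tolerance values below are an assumption, not the archive's exact file:

    # train_svr_grid.py -- hedged sketch of the grid-search variant;
    # the candidate values are assumptions
    from pyimagesearch import config
    from sklearn.model_selection import GridSearchCV
    from sklearn.model_selection import RepeatedKFold
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVR
    import pandas as pd

    # load, split, and standardize exactly as in the listing above
    print("[INFO] loading data...")
    dataset = pd.read_csv(config.CSV_PATH, names=config.COLS)
    dataX = dataset[dataset.columns[:-1]]
    dataY = dataset[dataset.columns[-1]]
    (trainX, testX, trainY, testY) = train_test_split(dataX,
        dataY, random_state=3, test_size=0.15)
    scaler = StandardScaler()
    trainX = scaler.fit_transform(trainX)
    testX = scaler.transform(testX)

    # define a fully discrete grid; unlike RandomizedSearchCV, every
    # combination (4 kernels x 3 tolerances x 5 C values = 60
    # candidates) is scored across all 30 CV folds
    model = SVR()
    kernel = ["linear", "rbf", "sigmoid", "poly"]
    tolerance = [1e-3, 1e-4, 1e-5]
    C = [1, 1.5, 2, 2.5, 3]
    grid = dict(kernel=kernel, tol=tolerance, C=C)

    # run the exhaustive search with the same repeated k-fold scheme
    print("[INFO] grid searching over the hyperparameters...")
    cvFold = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
    gridSearch = GridSearchCV(estimator=model, n_jobs=-1, cv=cvFold,
        param_grid=grid, scoring="neg_mean_squared_error")
    searchResults = gridSearch.fit(trainX, trainY)

    # evaluate the best estimator found by the search
    print("[INFO] evaluating...")
    bestModel = searchResults.best_estimator_
    print("R2: {:.2f}".format(bestModel.score(testX, testY)))

The trade-off between the two scripts is the one the description highlights: the grid search's fixed candidate lists guarantee full coverage at the cost of many more fits, while the random search draws a bounded number of samples (RandomizedSearchCV's n_iter, 10 by default) from the distributions, which is why it typically finishes much faster.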