Hyperparameter Optimization Using Grid Search and Cross-Validation

Cross-validation and grid search are two important techniques for optimizing a model's hyperparameters to get the best performance. Unlike ordinary model parameters, hyperparameters are not learned during training and typically need manual fine-tuning to get optimal results.

In this article we will demonstrate hyperparameter optimization for a Support Vector Classifier (SVC) performing classification on the Iris dataset. Some of the key hyperparameters for an SVC are the C value, the gamma value, and the kernel function.

  • Import the necessary packages, including GridSearchCV, as sketched below.

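The imports might look something like this (a minimal sketch using scikit-learn's standard modules):

```python
# Dataset loader, the SVC estimator, and GridSearchCV for the search
from sklearn.datasets import load_iris
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
```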

  • Import the Iris dataset and create the features (X) and labels (y).

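A minimal sketch of loading the data (the variable names X and y are illustrative):

```python
# Load the Iris dataset and separate features from labels
iris = load_iris()
X = iris.data    # shape (150, 4): sepal/petal measurements
y = iris.target  # shape (150,): species labels 0, 1, 2
```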

  • Instantiate the classifier (Support Vector Machine) and inspect its parameters. Although this classifier has many hyperparameters, we will focus on optimizing the three main ones: C, gamma, and kernel.
  • We will try five different values for C, four values for kernel, and five values for gamma, as listed in the param_grid below.
  • Note that the candidate values for the desired parameters must be passed in dictionary format (key-value pairs), as sketched after this list.

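A sketch of this step; the specific candidate values below are illustrative choices that include the winning combination reported later:

```python
# Instantiate the classifier and list its tunable hyperparameters
svc = SVC()
print(svc.get_params())

# Candidate values: 5 for C, 4 kernels, 5 for gamma,
# passed as a dictionary of key-value pairs
param_grid = {
    'C': [0.1, 1, 10, 100, 1000],
    'kernel': ['linear', 'poly', 'rbf', 'sigmoid'],
    'gamma': [1, 0.1, 0.01, 0.001, 0.0001],
}
```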

  • Run all candidate models with 10-fold cross-validation, as sketched after this list. The link at the end of this article explains the utility of cross-validation in more detail. CAUTION: both grid search and cross-validation are resource-intensive operations, so use them carefully on larger datasets.
  • In this example, the best parameters come out to be C = 10, gamma = 0.01, and kernel = 'linear', giving the best accuracy score of 0.98 on this data.

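One way to run the search, assuming the X, y, and param_grid defined above:

```python
# Exhaustive search over the grid with 10-fold cross-validation
grid = GridSearchCV(SVC(), param_grid, cv=10, scoring='accuracy')
grid.fit(X, y)

print(grid.best_params_)  # e.g. {'C': 10, 'gamma': 0.01, 'kernel': 'linear'}
print(grid.best_score_)   # e.g. 0.98 on this data
```

With 5 x 4 x 5 = 100 candidate combinations and 10 folds each, this trains 1,000 models, which is why the caution above matters for bigger datasets.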

Finally, here is an excellent link to learn more about cross-validation:

https://scikit-learn.org/stable/modules/cross_validation.html

Thanks for reading. Please share!