Linear Discriminant Analysis (LDA) with Scikit

Linear Discriminant Analysis (LDA) is similar to Principal Component Analysis (PCA) in that both reduce dimensionality. However, there are certain nuances of LDA that we should be aware of:

  • LDA is supervised: it needs a categorical dependent variable, and it finds the linear combinations of the original variables that give the maximum separation among the groups. PCA, on the other hand, is unsupervised
  • LDA can also be used for classification, whereas PCA is generally used for unsupervised learning
  • LDA doesn’t need the number of discriminants to be passed in ahead of time. Generally speaking, the number of discriminants will be the lower of the number of variables and the number of categories minus 1
  • LDA is more robust and, in certain cases, can be run without standardizing or normalizing the variables first (the class predictions of classical LDA do not change when a variable is linearly rescaled)
  • LDA is generally preferred for bigger data sets and machine learning
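
The points about supervision and the number of discriminants can be checked with a few lines of scikit-learn. This is only a minimal sketch: the Iris dataset is an assumption chosen for illustration (4 variables, 3 categories) and is not necessarily the data used in this post.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Iris: 4 numeric variables, 3 categories (illustrative dataset, an assumption)
X, y = load_iris(return_X_y=True)

# n_components is left at its default, so scikit-learn keeps
# min(n_variables, n_categories - 1) = min(4, 2) = 2 discriminants
lda = LinearDiscriminantAnalysis()

# Unlike PCA, fitting needs the labels y, because LDA is supervised
X_lda = lda.fit_transform(X, y)

print(X_lda.shape)                    # (150, 2)
print(lda.explained_variance_ratio_)  # share of between-class variance per discriminant
```

With 4 variables and 3 categories, the lower of 4 and 3 - 1 is 2, which is why the transformed data has two columns.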

Let the action begin now:

[Images: lda1, LDA2, LDA3, LDA4, LDA5]
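
For readers who want something to run, here is a hedged sketch of LDA used as a classifier in scikit-learn. It is not the original code from this post: the Iris dataset, the 70/30 split and the variable names (`lda_clf`, `X_test_lda`) are all assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Illustrative data (an assumption, not the post's dataset)
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows for testing, keeping class proportions equal
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Fit LDA on the training rows only
lda_clf = LinearDiscriminantAnalysis()
lda_clf.fit(X_train, y_train)

# Use the fitted model as a classifier
y_pred = lda_clf.predict(X_test)
acc = accuracy_score(y_test, y_pred)
print("Accuracy:", acc)
print(confusion_matrix(y_test, y_pred))

# The same fitted model also projects new rows onto the discriminants
X_test_lda = lda_clf.transform(X_test)
print(X_test_lda.shape)
```

The same object handles both jobs: `predict` gives class labels, while `transform` gives the reduced-dimension coordinates that could be plotted or passed to another model.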

Cheers!

Logistic Regression using Scikit Python

If you are not familiar with logistic regression, please read this article first. Moreover, if you are not familiar with the sklearn machine learning model building process, please read this article as well.

Assuming you are now familiar, this is how you can build a logistic regression model in Python using the machine learning library Scikit. Please read here about the dataset and dummy coding.

[Images: clf1, clf2, clf3, clf4, clf5, clf6, clf7, clf8, clf9, clf10]
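
As a rough sketch of the overall flow (dummy coding, train/test split, fitting, evaluation), something like the following would work. It is not the original code from this post: the data is synthetic, and the column names `age`, `region` and `bought` are hypothetical stand-ins for the dataset referred to above.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic, hypothetical data standing in for the post's dataset
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "age": rng.normal(40, 10, size=n),
    "region": rng.choice(["north", "south", "west"], size=n),
})

# Synthetic binary target: odds rise with age and are higher for "south"
logit = (0.08 * (df["age"] - 40) + 1.5 * (df["region"] == "south") - 0.5).to_numpy()
df["bought"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Dummy coding: drop_first=True keeps "north" as the reference category
X = pd.get_dummies(df[["age", "region"]], columns=["region"],
                   drop_first=True, dtype=float)
y = df["bought"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Fit the classifier; max_iter is raised so the solver has room to converge
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))

# One coefficient per column, on the log-odds scale
coefs = dict(zip(X.columns, clf.coef_[0]))
print(coefs)
```

Dropping the first dummy column avoids perfectly collinear predictors, so each remaining region coefficient is read relative to the reference category. Note that scikit-learn's `LogisticRegression` applies L2 regularization by default, so the coefficients will be slightly shrunk compared with an unpenalized fit.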

Cheers!