Feature Selection in Machine Learning - RFECV (PRACTICAL)

Note: With recent versions of scikit-learn (1.0 and later), the code in this tutorial needs a small change to work as intended.

If plotting the results raises an error, please refer to the note below the video!

Note on updates to scikit-learn: if you are using scikit-learn version 1.0 or later, this will most likely apply.

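Instead of using fit.grid_scores_ to get (and then plot) our results, we must now use fit.cv_results_['mean_test_score']. The grid_scores_ attribute was deprecated in scikit-learn 1.0 and removed in 1.2, so on newer versions accessing it raises an AttributeError.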

In the tutorial, we use fit.grid_scores_ three times in our plotting code, and each of these needs to be replaced with the new code.
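
If the same notebook has to run on both older and newer scikit-learn installs, one option is a small helper that picks whichever attribute is available. This is only a sketch: the name mean_cv_scores is made up for illustration, the stand-in objects and their scores are invented so the snippet runs without scikit-learn, and it assumes the older grid_scores_ attribute holds one mean score per feature count.

```python
from types import SimpleNamespace


def mean_cv_scores(selector):
    """Return the mean cross-validated score per feature count.

    Hypothetical helper (not part of scikit-learn): prefers the newer
    cv_results_ dict and falls back to the older grid_scores_ attribute.
    """
    if hasattr(selector, "cv_results_"):
        return list(selector.cv_results_["mean_test_score"])
    # Assumption: on older versions grid_scores_ is a 1-D sequence of mean scores
    return list(selector.grid_scores_)


# Stand-in objects with invented scores, mimicking a fitted RFECV
new_style = SimpleNamespace(cv_results_={"mean_test_score": [0.71, 0.84, 0.80]})
old_style = SimpleNamespace(grid_scores_=[0.71, 0.84, 0.80])

scores_new = mean_cv_scores(new_style)
scores_old = mean_cv_scores(old_style)
```

In the tutorial you would pass the fitted RFECV object (fit) to the helper instead of the stand-in objects.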

Your plot code will now need to look like this instead:


plt.plot(range(1, len(fit.cv_results_['mean_test_score']) + 1), fit.cv_results_['mean_test_score'], marker = "o")
plt.ylabel("Model Score")
plt.xlabel("Number of Features")
plt.title(f"Feature Selection using RFE \n Optimal number of features is {optimal_feature_count} (at score of {round(max(fit.cv_results_['mean_test_score']),4)})")
plt.tight_layout()
plt.show()


Happy learning!