
Recognizing Handwritten Digits with scikit-learn

 



In this post, we are going to analyze the digits data-set that ships with the scikit-learn library. We will train a Support Vector Machine (SVM) and then predict the values of a few handwritten digits that the model has not seen.

Let us start by importing our libraries.


Our data-set is stored in digits.
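Since the original code was shown as an image, here is a minimal sketch of what the import and loading step might look like. It assumes the data-set is loaded with `datasets.load_digits()`; the variable name `digits` comes from the post, everything else is our assumption.

```python
from sklearn import datasets

# Load the bundled digits data-set into `digits` (name taken from the post).
digits = datasets.load_digits()

# `data` holds each image flattened to 64 values, `images` keeps the 8x8 grid,
# and `target` holds the digit each image represents.
print(digits.data.shape)
print(digits.images.shape)
print(digits.target[:10])
```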

Following is an example of a digit in our data-set. Each image consists of 64 pixels (8×8).


The 1792nd element in our data-set
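A digit like this one can be displayed with matplotlib. The sketch below assumes the caption counts from one, so the "1792nd element" is index 1791; the colour map and figure size are our own choices, not taken from the post.

```python
import matplotlib.pyplot as plt
from sklearn import datasets

digits = datasets.load_digits()

# Assumption: "1792nd element" is counted from one, i.e. index 1791.
index = 1791

# Draw the 8x8 image in greyscale and show its label as the title.
fig, ax = plt.subplots(figsize=(2, 2))
ax.imshow(digits.images[index], cmap=plt.cm.gray_r, interpolation="nearest")
ax.set_title(f"label: {digits.target[index]}")
ax.axis("off")
plt.show()
```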

Let us train our SVM on the first 1790 images in our data-set. After that, we will use the remaining images as our test data and check the trained model's accuracy.


The predicted and target values are the same.
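The training and prediction step might look like the sketch below. The split at 1790 comes from the post and leaves the last 7 of the 1797 images for testing; the hyperparameters `gamma=0.001` and `C=100.0` are values we are assuming, since the original code is not visible.

```python
from sklearn import datasets, svm

digits = datasets.load_digits()
n_train = 1790  # split point taken from the post

# Assumed hyperparameters; the post does not state which ones were used.
clf = svm.SVC(gamma=0.001, C=100.0)
clf.fit(digits.data[:n_train], digits.target[:n_train])

# Predict the remaining images and compare against their true labels.
predicted = clf.predict(digits.data[n_train:])
expected = digits.target[n_train:]
print("predicted:", predicted)
print("target:   ", expected)
print("accuracy: ", (predicted == expected).mean())
```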

As we can see, the model classifies every test image correctly, giving 100% accuracy on the test data. Let us now define a function that trains the SVM on training sets of varying size and returns its accuracy. We will start with 3 images in our training data, work our way up to 1790, and store each model's accuracy in a dictionary.


The values dictionary holds all the accuracies.
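One way such a function and dictionary could be written is sketched below. It assumes the test set is always the last 7 images (from index 1790 on) and reuses assumed hyperparameters; the helper name `accuracy_for` is hypothetical. To keep the run short, the sketch samples every 100th training size, whereas `range(3, 1791)` would try every size as described above.

```python
from sklearn import datasets, svm

digits = datasets.load_digits()
TEST_START = 1790  # assumption: the last 7 images are always the test set


def accuracy_for(n_train):
    """Train on the first n_train images and score on the held-out images."""
    clf = svm.SVC(gamma=0.001, C=100.0)  # assumed hyperparameters
    clf.fit(digits.data[:n_train], digits.target[:n_train])
    predicted = clf.predict(digits.data[TEST_START:])
    return float((predicted == digits.target[TEST_START:]).mean())


# Coarse sweep over training sizes, always including the full 1790.
sizes = list(range(3, 1790, 100)) + [1790]

# Map each training-set size to the accuracy of the model trained on it.
values = {n: accuracy_for(n) for n in sizes}
print(values)
```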

Let us plot our dictionary.


accuracy vs. size of training-set
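A plot of this kind can be produced with matplotlib, roughly as follows. To stay self-contained, the sketch rebuilds a small accuracy dictionary first, under the same assumptions as before: the last 7 images as a fixed test set, assumed hyperparameters, and a hypothetical helper `accuracy_for`. Only every 200th training size is sampled here.

```python
import matplotlib.pyplot as plt
from sklearn import datasets, svm

digits = datasets.load_digits()


def accuracy_for(n_train, test_start=1790):
    """Accuracy on the images from test_start on, after training on the first n_train."""
    clf = svm.SVC(gamma=0.001, C=100.0)  # assumed hyperparameters
    clf.fit(digits.data[:n_train], digits.target[:n_train])
    predicted = clf.predict(digits.data[test_start:])
    return float((predicted == digits.target[test_start:]).mean())


# Training-set size -> accuracy, sampled coarsely to keep the run short.
values = {n: accuracy_for(n) for n in range(3, 1791, 200)}

# Line plot with training-set size on the x-axis and accuracy on the y-axis.
fig, ax = plt.subplots()
ax.plot(list(values.keys()), list(values.values()), marker="o")
ax.set_xlabel("size of training set")
ax.set_ylabel("accuracy")
ax.set_title("accuracy vs. size of training set")
plt.show()
```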

As the plot shows, well above 95% of our models achieve 100% accuracy on the test images. Hence we can conclude that the SVM predicts the test digits perfectly for more than 95% of the training-set sizes we tried.




Contributed by 

Vikash Patel 
