How to Use MinMax Scaler in Python

Shashanka Shekhar
Published in Stackademic
4 min read · Mar 18, 2024

Python is a high-level, general-purpose and interpreted programming language. It is known for its ease of use, powerful standard library and dynamic semantics. Python is widely used in various sectors including machine learning, artificial intelligence, data analysis, web development and many more. Its simple, easy-to-learn syntax emphasizes readability and therefore reduces the cost of program maintenance.

What is a MinMaxScaler?

In Python, the MinMaxScaler is a preprocessing utility in the sklearn.preprocessing module of the scikit-learn library. It scales each feature (i.e., column in your data) individually such that it is in the given range on the training set, typically between zero and one.

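The transformation is given by:

X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
X_scaled = X_std * (max - min) + min

where min, max = feature_range, X.min(axis=0) is the minimum value of each feature, and X.max(axis=0) is the maximum value of each feature.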

This transformation is often used as an alternative to zero mean, unit variance scaling. While MinMaxScaler doesn’t reduce the effect of outliers, it linearly scales them down into a fixed range.
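
The scaling rule can be checked by hand. The sketch below applies it with plain NumPy to a small made-up array (the values and the names X, feature_min and feature_max are illustrative assumptions, not part of the dataset used later) and compares the result with what MinMaxScaler returns.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data: 4 samples, 2 features (not from the article's dataset).
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 40.0],
              [5.0, 50.0]])

feature_min, feature_max = 0.0, 1.0  # the default feature_range

# Step 1: rescale each column to [0, 1] using that column's own min and max.
X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
# Step 2: stretch and shift into the requested range.
X_manual = X_std * (feature_max - feature_min) + feature_min

# The same computation done by scikit-learn.
X_sklearn = MinMaxScaler(feature_range=(feature_min, feature_max)).fit_transform(X)

print(np.allclose(X_manual, X_sklearn))  # True
print(X_manual[:, 0])                    # first column becomes 0, 0.25, 0.5, 1
```

Because each column is scaled with its own minimum and maximum, the two features end up on the same [0, 1] scale even though their original ranges differ.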

The problem we will be solving

A snapshot of our data

This is the Rain in Australia dataset, in which a number of parameters are used to predict whether or not it will rain on a given day. The shape of the data is (145460, 23).

RIAUS.info()
There are 145460 rows and 23 columns in total.

RIAUS is the DataFrame that stores our data; its 23 columns are listed by RIAUS.info().

We will first extract all the numeric columns in our DataFrame, that is, the columns whose Dtype is int64 or float64. Then we will apply MinMaxScaler to those numeric columns.

1. Using a list comprehension to extract numeric columns:

num_cols = [cols for cols in RIAUS.columns if RIAUS[cols].dtype in ['int64', 'float64']]

After this code runs, num_cols is a list of the names of all numerical columns in the DataFrame RIAUS. To learn more about how this code works refer to this link. It’s short and easy.
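
As a possible alternative, pandas can do the same selection with DataFrame.select_dtypes. The sketch below uses a tiny made-up frame named df as a stand-in for RIAUS (the rows are invented for illustration) and shows that both approaches return the same column names.

```python
import pandas as pd

# Hypothetical miniature frame standing in for RIAUS; the values are made up.
df = pd.DataFrame({
    "Location": ["Albury", "Sydney", "Perth"],
    "MinTemp": [13.4, 7.4, 12.9],
    "Rainfall": [0.6, 0.0, 1.2],
    "Humidity9am": [71, 44, 38],
})

# The list comprehension approach from the article.
num_cols_loop = [c for c in df.columns if df[c].dtype in ["int64", "float64"]]

# select_dtypes(include="number") keeps every numeric dtype, not only int64/float64.
num_cols_builtin = df.select_dtypes(include="number").columns.tolist()

print(num_cols_loop)     # ['MinTemp', 'Rainfall', 'Humidity9am']
print(num_cols_builtin)  # ['MinTemp', 'Rainfall', 'Humidity9am']
```

The two lists match here, but select_dtypes would also pick up columns stored as int32 or float32, which the explicit dtype list would miss.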

2. Using MinMaxScaler for the numeric columns:

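import numpy as np
from sklearn.preprocessing import MinMaxScaler

# One scaler object, refit on each column in turn
scaler = MinMaxScaler()

for cols in num_cols:
    k = np.array(RIAUS[cols])    # column as a 1D NumPy array
    k = k.reshape(-1, 1)         # MinMaxScaler expects a 2D array
    k = scaler.fit_transform(k)  # scale to the default range [0, 1]
    RIAUS[cols] = k              # write the scaled values back
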
  1. scaler = MinMaxScaler(): This line creates an instance of the MinMaxScaler class.
  2. for cols in num_cols:: This line starts a loop that will iterate over each column name in the list num_cols.
  3. k = np.array(RIAUS[cols]): This line converts the data in the current column of the DataFrame RIAUS into a numpy array and assigns it to the variable k.
  4. k = k.reshape(-1, 1): This line reshapes the array k into a 2D array with one column. The -1 in the reshape function means that the size in that dimension is inferred from the length of the array and the remaining dimensions.
  5. k = scaler.fit_transform(k): This line fits the scaler to the data in k (i.e., it computes the minimum and maximum values), and then it transforms k by scaling it to the range [0, 1]. The transformed data is assigned back to k.
  6. RIAUS[cols] = k: This line replaces the original data in the current column of RIAUS with the scaled data in k.
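
The reshape step is the easiest one to overlook, so here is a minimal sketch of why it is needed. It uses a made-up array named values in place of a real column and shows the shape before and after reshape(-1, 1), as well as the error scikit-learn raises when a 1D array is passed directly.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical values standing in for one numeric column of RIAUS.
values = np.array([2.0, 4.0, 6.0, 10.0])
print(values.shape)    # (4,)  -> 1D

# -1 lets NumPy infer the number of rows; 1 fixes a single column.
column = values.reshape(-1, 1)
print(column.shape)    # (4, 1) -> 2D, one feature

scaled = MinMaxScaler().fit_transform(column)
print(scaled.ravel())  # values 0, 0.25, 0.5, 1

# Passing the 1D array directly is rejected by scikit-learn.
try:
    MinMaxScaler().fit_transform(values)
    raised_for_1d = False
except ValueError:
    raised_for_1d = True
print(raised_for_1d)   # True
```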

So, in summary, this code scales the data in each numerical column of the DataFrame RIAUS to the range [0, 1] using the MinMaxScaler.
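
Since MinMaxScaler already works column by column, the loop can also be replaced by a single call on all numeric columns at once. This is a sketch of that variant, not the method used above; df is a made-up stand-in for RIAUS with invented values.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical miniature frame standing in for RIAUS; the values are made up.
df = pd.DataFrame({
    "Location": ["Albury", "Sydney", "Perth"],
    "MinTemp": [13.4, 7.4, 12.9],
    "Rainfall": [0.6, 0.0, 1.2],
})

num_cols = df.select_dtypes(include="number").columns.tolist()

scaler = MinMaxScaler()
# A DataFrame slice is already 2D, so no reshape is needed.
df[num_cols] = scaler.fit_transform(df[num_cols])

print(df[num_cols].min().tolist())  # each column's minimum is now 0.0
print(scaler.data_min_)             # original per-column minimums kept by the scaler
print(scaler.data_max_)             # original per-column maximums kept by the scaler
```

One difference from the loop is that this scaler keeps the minimum and maximum of every column, so the same object can later be used with inverse_transform to recover the original values.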

With this we are done. You can now use the scaled numeric columns with your preferred ML algorithms.
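
If the data will be split into training and test sets before modelling, a common practice is to fit the scaler on the training rows only and reuse it for the test rows, since the range is defined on the training set. The sketch below illustrates this with made-up arrays X_train and X_test; it is an illustration under that assumption, not a step performed in this article.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical single-feature data; the values are made up.
X_train = np.array([[0.0], [5.0], [10.0]])
X_test = np.array([[2.5], [12.0]])

scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learns min=0 and max=10
X_test_scaled = scaler.transform(X_test)        # reuses the training min and max

print(X_train_scaled.ravel())  # values 0, 0.5, 1
print(X_test_scaled.ravel())   # values 0.25, 1.2
```

Note that the test value 12.0 maps to 1.2, outside [0, 1], because it is larger than anything seen during fitting.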

To learn how to use StandardScaler in Python refer to this link.

To learn how to use ordinal encoding in Python refer to this link.

To extract categorical columns in Python refer to this link.

To learn how to use OneHotEncoder in Python refer to this link.

To learn how to use PCA in Python refer to this link.

To read more stories like this you can follow me with this link.

References:

  1. https://www.geeksforgeeks.org/what-is-python/
  2. https://www.python.org/doc/essays/blurb/
  3. https://www.britannica.com/technology/Python-computer-language
  4. https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html


Contributor for Microsoft Power BI. I like Data Analysis and Data Science. Also I enjoy sports, videogames and Japanese Anime in my free time.