Identifying the type of each variable you are working with is one of the first steps in data analysis. It determines which kinds of analysis are appropriate later on.

At the most general level, data can be divided into quantitative and qualitative.

Quantitative data, as the name implies, is data in which numbers have a mathematical value: they indicate a quantity, an amount, or a measurement of some characteristic.

With quantitative measures, numbers mean themselves. No additional information is needed to interpret them: 1.5 is 1.5, 5 is 5, 100 is 100.

Quantitative variables can be measured on a discrete or a continuous scale. A discrete scale is quantitative, but it can take only separate, countable values, with nothing in between. Take the number of children in a family as an example: there may be 1 child, 3 children, 5 children, or even 10, but never 1.5 or 3.75. The possible values are isolated points.

A continuous scale can take any value within an interval, including fractional values; in principle the interval can stretch from -∞ to +∞. For example, we can measure time in days, hours, seconds, milliseconds, and so on, with as much precision as the instrument allows. A continuous scale is defined at every possible value in its range.
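As a rough illustration of the difference, here is a minimal Python sketch. The sample values and the helper `looks_discrete` are hypothetical, and the whole-number check is only a heuristic: a continuous quantity recorded in whole units (say, age in years) would pass it too.

```python
# Hypothetical sample data, made up for illustration.
children_per_family = [1, 3, 5, 10]           # counts: whole numbers only
durations_hours = [0.5, 1.25, 36.0, 0.001]    # measurements: any value in a range


def looks_discrete(values):
    """Heuristic (an assumption, not a rule): every value is a whole number."""
    return all(float(v).is_integer() for v in values)


print(looks_discrete(children_per_family))
print(looks_discrete(durations_hours))
```

The check only inspects the recorded values. Whether a variable is truly discrete or continuous depends on what is being measured, so the final call still belongs to the analyst.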

Qualitative variables reflect a property or quality of the objects being studied. Here numbers do not mean themselves, as in the quantitative case. They stand for qualities or properties of the objects; in other words, they serve as labels for categories.

For example, say we compare people living in one state with people living in another. We can encode people from California as 1 and New Yorkers as 2. The numbers 1 and 2 mean nothing beyond denoting these categories, which are the focus of our analysis.
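To see why such codes are only labels, here is a small Python sketch. The list of people and the two code tables (`codes_a`, `codes_b`) are made up for illustration. Swapping the codes leaves the group sizes untouched, while an arithmetic result such as the mean changes, which suggests the mean of the codes carries no real information.

```python
from collections import Counter

# Hypothetical sample: the state each person lives in.
people = ["California", "New York", "New York", "California", "New York"]

# Two equally valid, arbitrary encodings of the same categories.
codes_a = {"California": 1, "New York": 2}
codes_b = {"California": 2, "New York": 1}


def encode(values, codes):
    """Replace each category name with its numeric label."""
    return [codes[value] for value in values]


encoded_a = encode(people, codes_a)
encoded_b = encode(people, codes_b)

# Group sizes do not depend on which label a category received.
group_sizes_a = sorted(Counter(encoded_a).values())
group_sizes_b = sorted(Counter(encoded_b).values())

# The mean of the labels does depend on the encoding, so it is not meaningful.
mean_a = sum(encoded_a) / len(encoded_a)
mean_b = sum(encoded_b) / len(encoded_b)

print(group_sizes_a, group_sizes_b)
print(mean_a, mean_b)
```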

Qualitative variables are divided into nominal and ordinal types.

Let’s take a closer look at each of these types, starting with nominal variables, the most basic scale. The only information a nominal variable carries is which class or group an object belongs to. Objects can be sorted into qualitatively different classes, but the classes themselves have no order.

For example, we can study people from different states, or people with different eye colors: blue, green, brown. These are all nominal variables. No eye color ranks above another, so there is no order in these values.

Ordinal variables differ from nominal variables in one respect: the categories have an order. The values not only divide objects into classes or groups but also rank them.

For example, take school grades: A, B, C, D, F. We can say that a person who received an A was most likely better prepared for the test than a person who received an F. We cannot say by how much, but we can say for sure that an A is better than a D.
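The grade example can be sketched in Python as follows. The names `GRADE_ORDER`, `rank`, and `is_better`, and the sample list of grades, are hypothetical. The sketch uses ranks only for comparing and sorting; the distance between two ranks is deliberately not interpreted, since an ordinal scale does not say how far apart its categories are.

```python
# Grades listed from worst to best; the position defines the order.
GRADE_ORDER = ["F", "D", "C", "B", "A"]
rank = {grade: position for position, grade in enumerate(GRADE_ORDER)}


def is_better(first, second):
    """Return True if the first grade ranks above the second."""
    return rank[first] > rank[second]


# Hypothetical sample of grades.
grades = ["B", "F", "A", "D", "C", "A"]

# Sorting by rank orders the grades from worst to best.
sorted_grades = sorted(grades, key=rank.get)

print(is_better("A", "D"))
print(sorted_grades)
```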

Thank you for reading!

