2. Explain the concept of a data set obtained from a population of statistical units and a set of variables (attributes)

A data set is a collection of data. In the case of tabular data, a data set corresponds to one or more database tables, where every column of a table represents a particular variable (attribute) and each row corresponds to a given record of the data set. A record is called an observation and describes one statistical unit. The data set lists a value of each variable, such as a person's hobby, for every statistical unit. A data set can be univariate (one variable) or multivariate (multiple variables).

Univariate data set:- When we conduct a study that looks at only one variable, we say that we are working with univariate data. Suppose, for example, that we conduct a survey to estimate the average weight of students in this class.
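
As a minimal sketch of the weight survey, the univariate case can be written as a single list of values with one summary statistic. The weights below are invented for illustration and the name weights_kg is an assumption, not part of the original example.

```python
from statistics import mean

# Hypothetical weights in kilograms, one value per student
# (invented for illustration only).
weights_kg = [58.0, 72.5, 64.0, 80.0, 69.5]

# Univariate analysis: a single variable is summarized on its own.
average_weight = mean(weights_kg)
print(f"Average weight: {average_weight:.1f} kg")
```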

Bivariate data set:- When we conduct a study that examines the relationship between two variables, we are working with bivariate data. Suppose we conducted a study to see if there is a relationship between the height and weight of students in this class.
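
One possible way to look at the height and weight relationship is a correlation coefficient. The sketch below uses Pearson's correlation, which is one choice among several and is not prescribed by the text above. All measurements are invented for illustration, and pearson_r is a hypothetical helper name.

```python
from math import sqrt

# Hypothetical paired measurements (invented for illustration):
# position i in both lists belongs to the same student.
heights_cm = [160.0, 175.0, 168.0, 182.0, 171.0]
weights_kg = [58.0, 72.5, 64.0, 80.0, 69.5]


def pearson_r(xs, ys):
    """Return the Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Bivariate analysis: two variables are studied together.
r = pearson_r(heights_cm, weights_kg)
print(f"Correlation between height and weight: {r:.2f}")
```

A value of r close to 1 or -1 suggests a strong linear relationship, while a value close to 0 suggests little or no linear relationship.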

Here is one practical example:

Let’s take a group of 30 people in a class; this is our statistical population. Each person in the population (the class, in this case) is a statistical unit. Let’s imagine that a survey is constructed that consists of 3 questions:

  • What is your age?
  • What is your gender?
  • What is your favorite programming language?

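Each one of these questions represents a variable which will be added to our dataset. Each person’s answer to one of these questions is a value of that variable, and the three answers given by one person together form one observation, that is, one row of the table. All of this information forms our dataset: 30 rows (statistical units) by 3 columns (variables).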
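
The survey can be sketched as a small table in code, with one row per statistical unit and one key per variable. Only three rows are shown instead of 30, every answer is invented for illustration, and the key names (age, gender, favorite_language) are assumptions derived from the three questions.

```python
# Hypothetical survey answers (invented for illustration).
# Each dictionary is one observation, i.e. one statistical unit (one person).
dataset = [
    {"age": 21, "gender": "female", "favorite_language": "Python"},
    {"age": 23, "gender": "male", "favorite_language": "Java"},
    {"age": 22, "gender": "female", "favorite_language": "C++"},
]

# The variables (attributes) are the columns of the table.
variables = list(dataset[0].keys())

# The number of statistical units is the number of rows.
n_units = len(dataset)

# One column: all values of a single variable across the population.
ages = [row["age"] for row in dataset]

# One row: all values recorded for a single statistical unit.
first_unit = dataset[0]

print("Variables:", variables)
print("Number of units:", n_units)
print("Ages:", ages)
print("First observation:", first_unit)
```

Reading the table by column gives a variable, and reading it by row gives an observation.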