1.1. Things to know
• In its plural sense, statistics refers to the data themselves or to numerical quantities computed from a set of data that are systematically collected and analyzed.
• In its singular sense, statistics refers to the scientific discipline consisting of the theory and methods for processing collections of quantitative and qualitative data that are useful when making decisions in the face of uncertainty.
Statistics as a science is concerned with understanding the structure in a data set. As such, statisticians are involved with methods of data collection, organization, and analysis, as well as the interpretation of results.
However, uncovering patterns embedded under the backdrop of uncertainty involves not just science but also art.
Statistics makes it possible to:
• describe or characterize persons, objects, situations, and phenomena with some reliability;
• make assessments and comparisons in an objective manner;
• make evidence-based decisions.
Some Applications of Statistics
• Determining the level of patient satisfaction with the nursing care administered by student nurses at Central Mindanao University (CMU).
• Determining the distribution of the number of text messages sent per day by CMU students enrolled in Statistics subjects.
• Comparing the Statistics exam results of the different CMU colleges.
• Examining the relationship between faculty status and work commitment.
• Predicting the number of CMU students for the next school year, 2016-2017.
Major Categories of Statistics
Descriptive Statistics – methods concerned with collecting, describing, and analyzing a set of data without drawing conclusions (or inferences) beyond the data.
Inferential Statistics – methods concerned with the analysis of a subset of data leading to predictions or inferences about the entire set of data, that is, methods that generalize results beyond the data collected, provided that the data collected are a part (sample) of a larger set of items (population).
Examples of Descriptive Statistics
• The total number of CMU students who are university scholars.
• The CMU registrar cited statistics showing an increase in the number of CMU students during the past five years.
Example of Inferential Statistics
• A new milk formulation designed to improve the psychomotor development of infants was tested on randomly selected infants. Based on the results, it was concluded that the new milk formulation is effective in improving the psychomotor development of infants.
Key Definitions
Universe – is the set of all entities under study, that is, the collection of things or observational units under study.
Variable – is a characteristic observed or measured on every unit of the universe.
Population – is the set of all possible values of the variable.
Sample – is a subset of the population.
Parameters – are numerical measures that describe the population or universe of interest.
Statistics – are numerical measures of a sample.
Frame – a listing of all the elements in a population.
Census – the process in which information is gathered for all units in the population.
Sample survey or sampling – the process in which information is gathered from only a part of the population.
“A statistic is to a sample as a parameter is to a population”.
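The distinction between a parameter and a statistic can be illustrated with a minimal Python sketch. The population below is simulated and purely hypothetical (it is not CMU data), and the seed, population size, and sample size are arbitrary assumptions chosen for illustration.

```python
import random
import statistics

random.seed(1)  # arbitrary seed so the sketch is reproducible

# Hypothetical population: heights (cm) of 1,000 students, simulated for illustration.
population = [random.gauss(160, 8) for _ in range(1000)]

# Parameter: a numerical measure computed from the whole population (as in a census).
parameter_mean = statistics.mean(population)

# Sample: a subset of the population, drawn at random without replacement.
sample = random.sample(population, 50)

# Statistic: the same measure computed from the sample only.
statistic_mean = statistics.mean(sample)

print(f"population mean (parameter): {parameter_mean:.2f}")
print(f"sample mean (statistic):     {statistic_mean:.2f}")
```

The two printed values will usually be close but not equal. Using the sample value to say something about the population value is the task of inferential statistics.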
Qualitative variables – These are variables that yield observations by which individuals can be categorized according to some characteristic or quality.
e.g., gender, marital status and blood type; they are expressed in categories.
Quantitative variables – These are variables that yield observations that can be measured.
Constant – This is a variable that assumes only one value.
Data collected on particular variables are classified as either qualitative or quantitative. Qualitative data (e.g., gender, marital status and blood type) are usually expressed in categories. Quantitative data (e.g., weight, height, systolic blood pressure and body mass index) are expressed in numbers; they are obtained by measuring or counting.
Quantitative data are classified as either discrete or continuous.
• Discrete data – This refers to any data that can be counted, e.g., the number of patients in a hospital, or the number of students who obtained a grade of 1.0 in Math 15 and Math 34. These data assume only a countable number of values.
• Continuous data – This refers to any data that can be measured, e.g., systolic blood pressure, weight and height. These data can take infinitely many possible values, which correspond to points on a continuous scale with no gaps or interruptions.
Note: Arithmetical operations on quantitative data have some physical interpretation. A variable that takes numerical values is not necessarily quantitative; the issue is whether performing arithmetical operations on the data would make any sense. For example, the sum of two zip codes, or the difference between your cellular phone number and your seatmate's, has no meaning, so zip codes and phone numbers are not quantitative. The figure on the next page illustrates the classification of data collected on particular variables.
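The point of the note can be sketched in Python. All values below are invented for illustration, and the variable names are hypothetical. Storing the zip codes as strings is one possible way, assumed here, to signal that they are categories even though they are written with digits.

```python
from collections import Counter
import statistics

# Hypothetical observations, invented for illustration only.
weights_kg = [52.5, 61.0, 58.2, 70.4]          # quantitative, continuous
text_messages = [12, 30, 7, 30]                # quantitative, discrete
zip_codes = ["8710", "8714", "8710", "8700"]   # qualitative, although written with digits

# Arithmetic on quantitative data has a physical interpretation.
mean_weight = statistics.mean(weights_kg)
total_messages = sum(text_messages)

# For qualitative data, counting how often each category occurs is the
# sensible summary; adding or averaging the codes would be meaningless.
zip_counts = Counter(zip_codes)
most_common_zip, frequency = zip_counts.most_common(1)[0]

print(mean_weight, total_messages, most_common_zip, frequency)
```

An average weight and a total message count can both be interpreted, whereas the only meaningful summary of the zip codes here is how often each one occurs.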
Identity – the property that enables a person to distinguish one number from another. Numbers are recognized by the shapes in which they are written.
Order – the property that numbers or observations are arranged in a sequence. For any integers \(A\) and \(B\), we can determine whether \(A>B\), \(A=B\), or \(A<B\).
Additivity – the property that allows us to add two or more numbers. For any real numbers \(A, B, C\), and \(D\), because of the equality of scale, we can determine whether \(A-B=C-D\), \(A-B>C-D\), or \(A-B<C-D\).
Absolute zero – the property that there is a level at which there is nothing of the characteristic being measured.
• Nominal scale – the lowest level of measurement; it is most often used with variables that are qualitative rather than quantitative in nature.
- Examples: gender, eye color, smoking status and nationality.
• Ordinal scale – Data at this level possess the properties of identity and order.
• Interval scale – Data at this level possess the properties of identity, order and additivity but do not have the absolute zero property.
• Ratio scale – Data at this level possess the properties of identity, order, additivity (equality of scale) and absolute zero.
- Examples: weight and height of persons.
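The four scales can be summarized as a lookup from each scale to the properties it possesses. The Python sketch below is only an illustration: the names `SCALE_PROPERTIES` and `scale_for` are hypothetical, "additivity" is the property the notes also call equality of scale, and the nominal entry (identity only) follows the usual textbook reading rather than an explicit statement in these notes.

```python
# Properties possessed by each level of measurement.
# The nominal entry (identity only) is an assumption based on the usual reading.
SCALE_PROPERTIES = {
    "nominal":  {"identity"},
    "ordinal":  {"identity", "order"},
    "interval": {"identity", "order", "additivity"},
    "ratio":    {"identity", "order", "additivity", "absolute zero"},
}


def scale_for(properties):
    """Return the name of the scale whose property set equals `properties`."""
    wanted = set(properties)
    for name, props in SCALE_PROPERTIES.items():
        if props == wanted:
            return name
    raise ValueError(f"no scale has exactly these properties: {sorted(wanted)}")


print(scale_for({"identity", "order"}))
```

Each scale contains all the properties of the one before it, which is why the scales are described as levels: a ratio variable can always be treated as interval, ordinal or nominal, but not the other way around.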