Measurement Scales & Data Presentation in Stata


Giovanni Minchio
Yuxin Zhang


Quantitative Methods Lab, Lesson 2.1
08 Oct. 2024

Variables

In scientific research, a variable refers to anything that can take on different values across a data set.

Scales of measurement


Measurement scales, also called levels of measurement, indicate how variables are recorded.


Summary of variable types & scales

Check your understanding

Choose appropriate measurement scales to record your data:

- Competition ranking (first/second/third...)
- Temperature (in Fahrenheit/Celsius)
- Age
- Human height (in centimeters)
- Human weight (in kilograms)
- Preferred political party
- Sex (male/female) 
- Number of siblings
- Occupations
- Year of birth
- Number of pages in the last book you read
- Household income
- Place of residence (countryside/town/city/metropolis)
- Mortality rate (from 0% to 100%)
- Favorite ice cream flavor
- Number of cigarettes smoked


Game time!

https://create.kahoot.it/details/303928b3-92ad-416a-80de-73ad67c01752

Commands in Stata

Now let’s explore how to present different types of variables in Stata.

Do

cd "/Users/yuxin/Documents/STATALAB2024-25"

Note: cd stands for “change directory”.
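
To confirm that the working directory was set as intended, a quick check along these lines may help. This is only a sketch: pwd prints the current working directory and dir lists what it contains, so the subfolders used later in this lesson should show up in the listing.

```stata
* print the current working directory
pwd
* list the files and subfolders it contains
dir
```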

You can also open a log file, which records your commands together with their output.

help log
clear

You can open a log from the Stata menu (File > Log > Begin...), or type the command:

log using "output/lesson2.log", replace

“output” is one of the subfolders of “STATALAB2024-25”, the folder we set as our working directory. Once the working directory has been set with cd, we do not need to repeat the full path.

“lesson2” is the name I assigned to this log file, but you can give any name you prefer for your own log file.

use "datafile/ESS10.dta", clear

“datafile” is another subfolder of “STATALAB2024-25”, and “ESS10.dta” is the name of the data set stored in it.

describe
codebook agea
log off

After log off, logging is suspended: neither commands nor output are written to the log file until you run log on. Try it, and you will see that the next two commands are not saved in the log file.

describe cntry
codebook cntry
log on
gen x = 1
log close
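
When the same commands are re-run, log using fails if a log is already open. One common pattern, sketched here with the same file name as above, is to start with capture log close. The placement of the analysis commands is only illustrative.

```stata
* close any log left open by an earlier run; capture suppresses the
* error that log close would raise if no log is open
capture log close
log using "output/lesson2.log", replace

* report whether a log is open and where it is being written
log query

* ... analysis commands go here ...

log close
```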

Download data

https://ess.sikt.no/en/?tab=overview

Load data into Stata

use "datafile/ESS10.dta", clear
browse

Data presentation by distributions

We can present variables by their distributions to illustrate how the values of the variables are spread across different categories or ranges.

Let’s present some variables from the ESS 10 data set we downloaded. The examples in this section use variables such as hhmmb, happy, yrbrn, polintr, cntry, prtcleit, vote, edulvlb, and gndr.

When we explore a variable:

Present a numerical variable by summarize

codebook hhmmb
hhmmb                     Number of people living regularly as member of household
----------------------------------------------------------------------------------

                  Type: Numeric (byte)
                 Label: hhmmb, but 13 nonmissing values are not labeled

                 Range: [1,13]                        Units: 1
         Unique values: 13                        Missing .: 0/37,611
       Unique mv codes: 2                        Missing .*: 144/37,611

              Examples: 1     
                        2     
                        3     
                        4     
summarize hhmmb
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       hhmmb |     37,467    2.546534    1.341311          1         13
codebook happy
happy                                                            How happy are you
----------------------------------------------------------------------------------

                  Type: Numeric (byte)
                 Label: happy, but 9 nonmissing values are not labeled

                 Range: [0,10]                        Units: 1
         Unique values: 11                        Missing .: 0/37,611
       Unique mv codes: 3                        Missing .*: 90/37,611

              Examples: 6     
                        7     
                        8     
                        9     
summarize happy
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       happy |     37,521    7.241918    1.933552          0         10
sum happy, detail
                      How happy are you
-------------------------------------------------------------
      Percentiles      Smallest
 1%            1              0
 5%            4              0
10%            5              0       Obs              37,521
25%            6              0       Sum of wgt.      37,521

50%            8                      Mean           7.241918
                        Largest       Std. dev.      1.933552
75%            8             10
90%           10             10       Variance       3.738622
95%           10             10       Skewness       -.932757
99%           10             10       Kurtosis       4.086627

(Note: sum is an abbreviation of summarize; su, summ, summari, etc. also work. Try them out yourself and see which abbreviations do not work.)
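
summarize also keeps its results in r(), which can be handy when a later command needs the mean or the standard deviation. A minimal sketch using the happy variable from above:

```stata
* quietly suppresses the printed table; results are still stored in r()
quietly summarize happy
* list everything summarize stored
return list
* reuse individual stored results
display "mean = " r(mean) ", sd = " r(sd) ", N = " r(N)
```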

codebook yrbrn
yrbrn                                                                Year of birth
----------------------------------------------------------------------------------

                  Type: Numeric (int)
                 Label: yrbrn, but 77 nonmissing values are not labeled

                 Range: [1931,2007]                   Units: 1
         Unique values: 77                        Missing .: 0/37,611
       Unique mv codes: 3                        Missing .*: 292/37,611

              Examples: 1953  
                        1964  
                        1976  
                        1989  
sum yrbrn
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       yrbrn |     37,319    1970.412    18.41365       1931       2007
sum yrbrn, d
                        Year of birth
-------------------------------------------------------------
      Percentiles      Smallest
 1%         1934           1931
 5%         1941           1931
10%         1946           1931       Obs              37,319
25%         1956           1931       Sum of wgt.      37,319

50%         1970                      Mean           1970.412
                        Largest       Std. dev.      18.41365
75%         1985           2006
90%         1997           2007       Variance       339.0624
95%         2001           2007       Skewness       .0467735
99%         2005           2007       Kurtosis       2.066355
codebook polintr
polintr                                                 How interested in politics
----------------------------------------------------------------------------------

                  Type: Numeric (byte)
                 Label: polintr

                 Range: [1,4]                         Units: 1
         Unique values: 4                         Missing .: 0/37,611
       Unique mv codes: 3                        Missing .*: 88/37,611

            Tabulation: Freq.   Numeric  Label
                        3,430         1  Very interested
                       11,836         2  Quite interested
                       13,412         3  Hardly interested
                        8,845         4  Not at all interested
                           30        .a  Refusal
                           45        .b  Don't know
                           13        .c  No answer
sum polintr
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     polintr |     37,523    2.737468    .9208133          1          4
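
polintr is recorded on an ordinal scale with four labeled categories, so its mean is hard to interpret. As a sketch, a frequency table or the median may describe such a variable better:

```stata
* frequencies and percentages for each answer category
tabulate polintr
* the detail option adds percentiles, including the median (50%)
summarize polintr, detail
```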

Present categorical variables


Frequency table (one-way) by tabulate

codebook cntry
cntry                                                                      Country
----------------------------------------------------------------------------------

                  Type: String (str2)

         Unique values: 22                        Missing "": 0/37,611

              Examples: "CZ"
                        "GR"
                        "IS"
                        "MK"
tabulate cntry
    Country |      Freq.     Percent        Cum.
------------+-----------------------------------
         BE |      1,341        3.57        3.57
         BG |      2,718        7.23       10.79
         CH |      1,523        4.05       14.84
         CZ |      2,476        6.58       21.42
         EE |      1,542        4.10       25.52
         FI |      1,577        4.19       29.72
         FR |      1,977        5.26       34.97
         GB |      1,149        3.05       38.03
         GR |      2,799        7.44       45.47
         HR |      1,592        4.23       49.70
         HU |      1,849        4.92       54.62
         IE |      1,770        4.71       59.33
         IS |        903        2.40       61.73
         IT |      2,640        7.02       68.75
         LT |      1,659        4.41       73.16
         ME |      1,278        3.40       76.55
         MK |      1,429        3.80       80.35
         NL |      1,470        3.91       84.26
         NO |      1,411        3.75       88.01
         PT |      1,838        4.89       92.90
         SI |      1,252        3.33       96.23
         SK |      1,418        3.77      100.00
------------+-----------------------------------
      Total |     37,611      100.00
* single condition:
tabulate cntry if cntry == "IT"
    Country |      Freq.     Percent        Cum.
------------+-----------------------------------
         IT |      2,640      100.00      100.00
------------+-----------------------------------
      Total |      2,640      100.00
* multiple conditions:
tabulate cntry if inlist(cntry, "IT", "FR", "NL")
    Country |      Freq.     Percent        Cum.
------------+-----------------------------------
         FR |      1,977       32.48       32.48
         IT |      2,640       43.37       75.85
         NL |      1,470       24.15      100.00
------------+-----------------------------------
      Total |      6,087      100.00
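
The inlist() condition above can also be written with the logical "or" operator, and negated to select everything else. A short sketch on the same variable:

```stata
* same selection written with the "or" operator |
tabulate cntry if cntry == "IT" | cntry == "FR" | cntry == "NL"
* the reverse selection: every country except these three
tabulate cntry if !inlist(cntry, "IT", "FR", "NL")
```
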
describe prtcleit
Variable      Storage   Display    Value
    name         type    format    label      Variable label
----------------------------------------------------------------------------------
prtcleit        byte    %4.0g      prtcleit   Which party feel closer to, Italy
codebook prtcleit
prtcleit                                         Which party feel closer to, Italy
----------------------------------------------------------------------------------

                  Type: Numeric (byte)
                 Label: prtcleit

                 Range: [1,40]                        Units: 1
         Unique values: 19                        Missing .: 34,971/37,611
       Unique mv codes: 5                        Missing .*: 2,054/37,611

              Examples: .     
                        .     
                        .     
                        .     
labelbook prtcleit
Value label prtcleit 
----------------------------------------------------------------------------------

      Values                                    Labels
       Range:  [1,40]                    String length:  [4,36]
           N:  27                Unique at full length:  yes
        Gaps:  yes                 Unique at length 12:  yes
  Missing .*:  4                           Null string:  no
                               Leading/trailing blanks:  no
                                    Numeric -> numeric:  no
  Definition
           1   Movimento 5 Stelle
           2   Partido Democratico
           3   Lega
           4   Forza Italia
           5   Fratelli d'Italia con Giorgia Meloni
           6   Liberi e Uguali (LEU)
           7   + Europa
           8   Noi con l'Italia - UDC
           9   Potere al popolo
          10   Casapound Italia
          11   Italia Europa Insieme
          12   Il popolo della famiglia
          13   Civica Popolare Lorenzin
          14   SVP-PATT
          31   Altro
          33   Italia Viva
          34   Unione Valdotaine
          35   Partito Comunista
          36   Vox Italia
          37   Partito Socialista
          38   Verdi/ Europa Verde
          39   Italexit
          40   Azione di Calenda
          .a   Not applicable
          .b   Refusal
          .c   Don't know
          .d   No answer

   Variables:  prtcleit
label list prtcleit
prtcleit:
           1 Movimento 5 Stelle
           2 Partido Democratico
           3 Lega
           4 Forza Italia
           5 Fratelli d'Italia con Giorgia Meloni
           6 Liberi e Uguali (LEU)
           7 + Europa
           8 Noi con l'Italia - UDC
           9 Potere al popolo
          10 Casapound Italia
          11 Italia Europa Insieme
          12 Il popolo della famiglia
          13 Civica Popolare Lorenzin
          14 SVP-PATT
          31 Altro
          33 Italia Viva
          34 Unione Valdotaine
          35 Partito Comunista
          36 Vox Italia
          37 Partito Socialista
          38 Verdi/ Europa Verde
          39 Italexit
          40 Azione di Calenda
          .a Not applicable
          .b Refusal
          .c Don't know
          .d No answer
tab prtcleit
   Which party feel closer to, Italy |      Freq.     Percent        Cum.
-------------------------------------+-----------------------------------
                  Movimento 5 Stelle |        101       17.24       17.24
                 Partido Democratico |        188       32.08       49.32
                                Lega |         78       13.31       62.63
                        Forza Italia |         57        9.73       72.35
Fratelli d'Italia con Giorgia Meloni |        108       18.43       90.78
               Liberi e Uguali (LEU) |          9        1.54       92.32
                            + Europa |          2        0.34       92.66
              Noi con l'Italia - UDC |          4        0.68       93.34
                    Potere al popolo |          6        1.02       94.37
                            SVP-PATT |          8        1.37       95.73
                               Altro |          3        0.51       96.25
                         Italia Viva |          5        0.85       97.10
                   Unione Valdotaine |          1        0.17       97.27
                   Partito Comunista |          5        0.85       98.12
                          Vox Italia |          1        0.17       98.29
                  Partito Socialista |          2        0.34       98.63
                 Verdi/ Europa Verde |          2        0.34       98.98
                            Italexit |          3        0.51       99.49
                   Azione di Calenda |          3        0.51      100.00
-------------------------------------+-----------------------------------
                               Total |        586      100.00
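
With this many categories, ordering the rows by frequency can make the table easier to read. A minimal sketch using the sort option of one-way tabulate:

```stata
* list parties from the most to the least frequently chosen
tabulate prtcleit, sort
```
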
codebook vote
vote                                                  Voted last national election
----------------------------------------------------------------------------------

                  Type: Numeric (byte)
                 Label: vote

                 Range: [1,3]                         Units: 1
         Unique values: 3                         Missing .: 0/37,611
       Unique mv codes: 3                        Missing .*: 459/37,611

            Tabulation: Freq.   Numeric  Label
                       26,794         1  Yes
                        7,764         2  No
                        2,594         3  Not eligible to vote
                          193        .a  Refusal
                          221        .b  Don't know
                           45        .c  No answer
tab vote
 Voted last national |
            election |      Freq.     Percent        Cum.
---------------------+-----------------------------------
                 Yes |     26,794       72.12       72.12
                  No |      7,764       20.90       93.02
Not eligible to vote |      2,594        6.98      100.00
---------------------+-----------------------------------
               Total |     37,152      100.00
tab vote, missing
 Voted last national |
            election |      Freq.     Percent        Cum.
---------------------+-----------------------------------
                 Yes |     26,794       71.24       71.24
                  No |      7,764       20.64       91.88
Not eligible to vote |      2,594        6.90       98.78
             Refusal |        193        0.51       99.29
          Don't know |        221        0.59       99.88
           No answer |         45        0.12      100.00
---------------------+-----------------------------------
               Total |     37,611      100.00
tab1 cntry vote, m
-> tabulation of cntry  

    Country |      Freq.     Percent        Cum.
------------+-----------------------------------
         BE |      1,341        3.57        3.57
         BG |      2,718        7.23       10.79
         CH |      1,523        4.05       14.84
         CZ |      2,476        6.58       21.42
         EE |      1,542        4.10       25.52
         FI |      1,577        4.19       29.72
         FR |      1,977        5.26       34.97
         GB |      1,149        3.05       38.03
         GR |      2,799        7.44       45.47
         HR |      1,592        4.23       49.70
         HU |      1,849        4.92       54.62
         IE |      1,770        4.71       59.33
         IS |        903        2.40       61.73
         IT |      2,640        7.02       68.75
         LT |      1,659        4.41       73.16
         ME |      1,278        3.40       76.55
         MK |      1,429        3.80       80.35
         NL |      1,470        3.91       84.26
         NO |      1,411        3.75       88.01
         PT |      1,838        4.89       92.90
         SI |      1,252        3.33       96.23
         SK |      1,418        3.77      100.00
------------+-----------------------------------
      Total |     37,611      100.00

-> tabulation of vote  

 Voted last national |
            election |      Freq.     Percent        Cum.
---------------------+-----------------------------------
                 Yes |     26,794       71.24       71.24
                  No |      7,764       20.64       91.88
Not eligible to vote |      2,594        6.90       98.78
             Refusal |        193        0.51       99.29
          Don't know |        221        0.59       99.88
           No answer |         45        0.12      100.00
---------------------+-----------------------------------
               Total |     37,611      100.00
describe edulvlb
Variable      Storage   Display    Value
    name         type    format    label      Variable label
----------------------------------------------------------------------------------
edulvlb         int     %6.0g      edulvlb    Highest level of education
labelbook edulvlb
Value label edulvlb 
----------------------------------------------------------------------------------

      Values                                    Labels
       Range:  [0,5555]                  String length:  [5,69]
           N:  31                Unique at full length:  yes
        Gaps:  yes                 Unique at length 12:  no
  Missing .*:  3                           Null string:  no
                               Leading/trailing blanks:  no
                                    Numeric -> numeric:  no
  Definition
           0   Not completed ISCED level 1
         113   ISCED 1, completed primary education
         129   Vocational ISCED 2C < 2 years, no access ISCED 3
         212   General/pre-vocational ISCED 2A/2B, access ISCED 3 vocational
         213   General ISCED 2A, access ISCED 3A general/all 3
         221   Vocational ISCED 2C >= 2 years, no access ISCED 3
         222   Vocational ISCED 2A/2B, access ISCED 3 vocational
         223   Vocational ISCED 2, access ISCED 3 general/all
         229   Vocational ISCED 3C < 2 years, no access ISCED 5
         311   General ISCED 3 >=2 years, no access ISCED 5
         312   General ISCED 3A/3B, access ISCED 5B/lower tier 5A
         313   General ISCED 3A, access upper tier ISCED 5A/all 5
         321   Vocational ISCED 3C >= 2 years, no access ISCED 5
         322   Vocational ISCED 3A, access ISCED 5B/ lower tier 5A
         323   Vocational ISCED 3A, access upper tier ISCED 5A/all 5
         412   General ISCED 4A/4B, access ISCED 5B/lower tier 5A
         413   General ISCED 4A, access upper tier ISCED 5A/all 5
         421   ISCED 4 programmes without access ISCED 5
         422   Vocational ISCED 4A/4B, access ISCED 5B/lower tier 5A
         423   Vocational ISCED 4A, access upper tier ISCED 5A/all 5
         510   ISCED 5A short, intermediate/academic/general tertiary below
               bachelor
         520   ISCED 5B short, advanced vocational qualifications
         610   ISCED 5A medium, bachelor/equivalent from lower tier tertiary
         620   ISCED 5A medium, bachelor/equivalent from upper/single tier
               tertiary
         710   ISCED 5A long, master/equivalent from lower tier tertiary
         720   ISCED 5A long, master/equivalent from upper/single tier tertiary
         800   ISCED 6, doctoral degree
        5555   Other
          .a   Refusal
          .b   Don't know
          .c   No answer

   Variables:  edulvlb
tab edulvlb
             Highest level of education |      Freq.     Percent        Cum.
----------------------------------------+-----------------------------------
            Not completed ISCED level 1 |        317        0.85        0.85
   ISCED 1, completed primary education |      2,252        6.01        6.86
Vocational ISCED 2C < 2 years, no acces |          8        0.02        6.88
General/pre-vocational ISCED 2A/2B, acc |        223        0.60        7.48
General ISCED 2A, access ISCED 3A gener |      4,358       11.64       19.11
Vocational ISCED 2C >= 2 years, no acce |         44        0.12       19.23
Vocational ISCED 2A/2B, access ISCED 3  |        341        0.91       20.14
Vocational ISCED 2, access ISCED 3 gene |         44        0.12       20.26
Vocational ISCED 3C < 2 years, no acces |        508        1.36       21.62
General ISCED 3A/3B, access ISCED 5B/lo |         93        0.25       21.86
General ISCED 3A, access upper tier ISC |      5,304       14.16       36.03
Vocational ISCED 3C >= 2 years, no acce |      4,078       10.89       46.92
Vocational ISCED 3A, access ISCED 5B/ l |        666        1.78       48.69
Vocational ISCED 3A, access upper tier  |      5,386       14.38       63.08
General ISCED 4A/4B, access ISCED 5B/lo |         17        0.05       63.12
General ISCED 4A, access upper tier ISC |         19        0.05       63.17
ISCED 4 programmes without access ISCED |        836        2.23       65.40
Vocational ISCED 4A/4B, access ISCED 5B |         91        0.24       65.65
Vocational ISCED 4A, access upper tier  |        996        2.66       68.31
ISCED 5A short, intermediate/academic/g |        203        0.54       68.85
ISCED 5B short, advanced vocational qua |      1,626        4.34       73.19
ISCED 5A medium, bachelor/equivalent fr |      1,665        4.45       77.64
ISCED 5A medium, bachelor/equivalent fr |      3,133        8.37       86.00
ISCED 5A long, master/equivalent from l |        792        2.11       88.12
ISCED 5A long, master/equivalent from u |      3,961       10.58       98.69
               ISCED 6, doctoral degree |        408        1.09       99.78
                                  Other |         81        0.22      100.00
----------------------------------------+-----------------------------------
                                  Total |     37,450      100.00
* do not show labels:
tabulate edulvlb, nolabel 
    Highest |
   level of |
  education |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        317        0.85        0.85
        113 |      2,252        6.01        6.86
        129 |          8        0.02        6.88
        212 |        223        0.60        7.48
        213 |      4,358       11.64       19.11
        221 |         44        0.12       19.23
        222 |        341        0.91       20.14
        223 |         44        0.12       20.26
        229 |        508        1.36       21.62
        312 |         93        0.25       21.86
        313 |      5,304       14.16       36.03
        321 |      4,078       10.89       46.92
        322 |        666        1.78       48.69
        323 |      5,386       14.38       63.08
        412 |         17        0.05       63.12
        413 |         19        0.05       63.17
        421 |        836        2.23       65.40
        422 |         91        0.24       65.65
        423 |        996        2.66       68.31
        510 |        203        0.54       68.85
        520 |      1,626        4.34       73.19
        610 |      1,665        4.45       77.64
        620 |      3,133        8.37       86.00
        710 |        792        2.11       88.12
        720 |      3,961       10.58       98.69
        800 |        408        1.09       99.78
       5555 |         81        0.22      100.00
------------+-----------------------------------
      Total |     37,450      100.00

Brainstorming:

How can we recode this variable to simplify it without losing too much information? Note that it measures education across countries.

Try it out yourselves using the commands we have learned so far, e.g., recode, drop, gen, egen, rename.

There is no single correct solution, and it depends on how you justify your choice. You may also discuss it with your peers.
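
As one possible starting point, and not the intended answer, the detailed codes could be collapsed by their first digit into broad ISCED levels. In this sketch the new variable name edu5, the cut-points, the category labels, and the decision to set “Other” (5555) to missing are all assumptions made for illustration. The /// line continuation works in a do-file, not in the Command window.

```stata
* collapse the detailed codes into five broad levels
* (cut-points and labels are illustrative assumptions)
recode edulvlb ///
    (0/129   = 1 "Primary or less") ///
    (212/229 = 2 "Lower secondary") ///
    (311/323 = 3 "Upper secondary") ///
    (412/423 = 4 "Post-secondary, non-tertiary") ///
    (510/800 = 5 "Tertiary") ///
    (5555    = .), generate(edu5)

* inspect the new variable, including missing values
tabulate edu5, missing
```

Values not covered by any rule, such as the missing codes .a, .b, and .c, are copied unchanged into the new variable.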

This task is NOT mandatory, but if you would like to see how your peers approach it and get some feedback, you can upload your idea or solution to our Moodle section “Lab exercises” by the end of this week.


Crosstabulation (contingency table)

With tabulate, the first variable defines the rows and the second variable defines the columns.

tab vote gndr 
 Voted last national |        Gender
            election |      Male     Female |     Total
---------------------+----------------------+----------
                 Yes |    12,503     14,291 |    26,794 
                  No |     3,465      4,299 |     7,764 
Not eligible to vote |     1,298      1,296 |     2,594 
---------------------+----------------------+----------
               Total |    17,266     19,886 |    37,152 
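
Raw counts are hard to compare when the groups differ in size. As a sketch, the row and column options of two-way tabulate add percentages to the same table:

```stata
* column percentages: share of each voting answer within each gender
tabulate vote gndr, column
* row percentages only, without the raw counts
tabulate vote gndr, row nofreq
```
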
table (cntry vote) (gndr)
                                 |           Gender         
                                 |    Male   Female    Total
---------------------------------+--------------------------
Country                          |                          
  BE                             |                          
    Voted last national election |                          
      Yes                        |     517      526    1,043
      No                         |      75       73      148
      Not eligible to vote       |      76       68      144
      Total                      |     668      667    1,335
  BG                             |                          
    Voted last national election |                          
      Yes                        |     900      988    1,888
      No                         |     341      415      756
      Not eligible to vote       |      39       24       63
      Total                      |   1,280    1,427    2,707
  CH                             |                          
    Voted last national election |                          
      Yes                        |     440      404      844
      No                         |     157      162      319
      Not eligible to vote       |     174      150      324
      Total                      |     771      716    1,487
  CZ                             |                          
    Voted last national election |                          
      Yes                        |     641      846    1,487
      No                         |     345      445      790
      Not eligible to vote       |      87       95      182
      Total                      |   1,073    1,386    2,459
  EE                             |                          
    Voted last national election |                          
      Yes                        |     437      596    1,033
      No                         |     158      153      311
      Not eligible to vote       |      92       97      189
      Total                      |     687      846    1,533
  FI                             |                          
    Voted last national election |                          
      Yes                        |     609      656    1,265
      No                         |     128       90      218
      Not eligible to vote       |      41       48       89
      Total                      |     778      794    1,572
  FR                             |                          
    Voted last national election |                          
      Yes                        |     508      517    1,025
      No                         |     284      306      590
      Not eligible to vote       |     157      150      307
      Total                      |     949      973    1,922
  GB                             |                          
    Voted last national election |                          
      Yes                        |     393      489      882
      No                         |      89      123      212
      Not eligible to vote       |      23       30       53
      Total                      |     505      642    1,147
  GR                             |                          
    Voted last national election |                          
      Yes                        |   1,124    1,210    2,334
      No                         |     157      200      357
      Not eligible to vote       |      33       32       65
      Total                      |   1,314    1,442    2,756
  HR                             |                          
    Voted last national election |                          
      Yes                        |     509      556    1,065
      No                         |     180      281      461
      Not eligible to vote       |      15       18       33
      Total                      |     704      855    1,559
  HU                             |                          
    Voted last national election |                          
      Yes                        |     484      811    1,295
      No                         |     157      265      422
      Not eligible to vote       |      55       71      126
      Total                      |     696    1,147    1,843
  IE                             |                          
    Voted last national election |                          
      Yes                        |     624      716    1,340
      No                         |     128      144      272
      Not eligible to vote       |      85       66      151
      Total                      |     837      926    1,763
  IS                             |                          
    Voted last national election |                          
      Yes                        |     363      380      743
      No                         |      28       34       62
      Not eligible to vote       |      40       51       91
      Total                      |     431      465      896
  IT                             |                          
    Voted last national election |                          
      Yes                        |     867      932    1,799
      No                         |     244      323      567
      Not eligible to vote       |     113       96      209
      Total                      |   1,224    1,351    2,575
  LT                             |                          
    Voted last national election |                          
      Yes                        |     398      709    1,107
      No                         |     215      287      502
      Not eligible to vote       |      23       16       39
      Total                      |     636    1,012    1,648
  ME                             |                          
    Voted last national election |                          
      Yes                        |     564      513    1,077
      No                         |      55       73      128
      Not eligible to vote       |      10       20       30
      Total                      |     629      606    1,235
  MK                             |                          
    Voted last national election |                          
      Yes                        |     502      582    1,084
      No                         |     112      167      279
      Not eligible to vote       |      16       15       31
      Total                      |     630      764    1,394
  NL                             |                          
    Voted last national election |                          
      Yes                        |     641      614    1,255
      No                         |      73       67      140
      Not eligible to vote       |      36       37       73
      Total                      |     750      718    1,468
  NO                             |                          
    Voted last national election |                          
      Yes                        |     576      543    1,119
      No                         |      64       47      111
      Not eligible to vote       |      78       95      173
      Total                      |     718      685    1,403
  PT                             |                          
    Voted last national election |                          
      Yes                        |     540      697    1,237
      No                         |     190      295      485
      Not eligible to vote       |      37       58       95
      Total                      |     767    1,050    1,817
  SI                             |                          
    Voted last national election |                          
      Yes                        |     384      441      825
      No                         |     149      160      309
      Not eligible to vote       |      49       49       98
      Total                      |     582      650    1,232
  SK                             |                          
    Voted last national election |                          
      Yes                        |     482      565    1,047
      No                         |     136      189      325
      Not eligible to vote       |      19       10       29
      Total                      |     637      764    1,401
  Total                          |                          
    Voted last national election |                          
      Yes                        |  12,503   14,291   26,794
      No                         |   3,465    4,299    7,764
      Not eligible to vote       |   1,298    1,296    2,594
      Total                      |  17,266   19,886   37,152
------------------------------------------------------------

Present a numerical variable by a categorical variable using table

table gndr, statistic(mean happy)
         |      Mean
---------+----------
Gender   |          
  Male   |  7.245336
  Female |  7.238955
  Total  |  7.241918
--------------------
table gndr, stat(mean happy) stat(sd happy) stat(n happy)
         |      Mean   Standard deviation   Number of nonmissing values
---------+-------------------------------------------------------------
Gender   |                                                             
  Male   |  7.245336              1.89884                        17,421
  Female |  7.238955             1.963183                        20,100
  Total  |  7.241918             1.933552                        37,521
-----------------------------------------------------------------------
tabstat happy, by(gndr) stat(mean sd n)
Summary for variables: happy
Group variable: gndr (Gender)

     gndr |      Mean        SD         N
----------+------------------------------
     Male |  7.245336   1.89884     17421
   Female |  7.238955  1.963183     20100
----------+------------------------------
    Total |  7.241918  1.933552     37521
-----------------------------------------

P.S. What about the median and the mode?

tabstat happy, by(gndr) stat(median)
Summary for variables: happy
Group variable: gndr (Gender)

     gndr |       p50
----------+----------
     Male |         8
   Female |         8
----------+----------
    Total |         8
---------------------
* or:
tabstat happy, by(gndr) stat(p50)
Summary for variables: happy
Group variable: gndr (Gender)

     gndr |       p50
----------+----------
     Male |         8
   Female |         8
----------+----------
    Total |         8
---------------------
* mode: look for the most frequent value in a frequency table
table happy gndr
                    |           Gender         
                    |    Male   Female    Total
--------------------+--------------------------
How happy are you   |                          
  Extremely unhappy |     106      139      245
  1                 |      84      109      193
  2                 |     201      261      462
  3                 |     447      504      951
  4                 |     541      686    1,227
  5                 |   1,658    1,989    3,647
  6                 |   1,725    2,086    3,811
  7                 |   3,610    3,824    7,434
  8                 |   4,823    5,359   10,182
  9                 |   2,524    2,915    5,439
  Extremely happy   |   1,702    2,228    3,930
  Total             |  17,421   20,100   37,521
-----------------------------------------------
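
The mode can also be computed as a number instead of being read off the frequency table. A minimal sketch using egen's mode() function, assuming the ESS10 data are still loaded; mode_happy is a hypothetical helper variable name, not part of the data set:

```stata
* store the most frequent value of happy within each gender
* (if a group has several modes, egen returns missing unless you add minmode or maxmode)
egen mode_happy = mode(happy), by(gndr)

* mode_happy is constant within each gender, so the mean simply displays it
tabstat mode_happy, by(gndr) stat(mean)

* remove the helper variable afterwards
drop mode_happy
```

The result should agree with the row that has the largest count in each column of the frequency table above.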

Data visualization of Numerical Vars 1

Histograms (continuous vars)

histogram netustm
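
The default histogram can be made easier to read by setting the bin width and labelling the axes. A sketch, assuming netustm records minutes of internet use per day (check with codebook netustm); the bin width and the title text are illustrative choices, not part of the original lesson:

```stata
* one bar per 60 minutes, y-axis in percent instead of density
histogram netustm, width(60) percent ///
    xtitle("Internet use per day (minutes)") ///
    title("Distribution of daily internet use")
```

The /// line continuation works in a do-file; in the Command window, type the command on one line.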

Data visualization of Numerical Vars 2

Bar charts (discrete vars)

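* default: bar height is the percentage of observations in each category
graph bar, over(netusoft)

* (count): bar height is the number of observations in each category
graph bar (count), over(netusoft)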

Data visualization of Categorical Vars

Bar charts

graph bar (count), over(cntry)

Data visualization of Multiple Vars (continuous by categorical)

Box plots

graph box netustm, over(cntry)

Internet use in mins by gender by country

graph box netustm, over(gndr) over(cntry) horizontal

Data visualization (between two continuous)

Scatter plots

scatter netustm yrbrn

scatter netustm yrbrn || lfit netustm yrbrn
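
A fitted line is easier to interpret next to a numerical summary of the association. A sketch, assuming both variables are numeric with missing codes already set to missing; the axis titles are illustrative and assume yrbrn is year of birth:

```stata
* Pearson correlation between the two variables
correlate netustm yrbrn

* same plot as above, written with twoway and smaller markers
twoway (scatter netustm yrbrn, msize(tiny)) (lfit netustm yrbrn), ///
    ytitle("Internet use per day (minutes)") xtitle("Year of birth")
```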

Summary: presenting a single variable

Variable    Central tendency measure    Plot type
Nominal     mode                        barplot
Ordinal     mode, median                barplot (correct order)
Interval    mode, median, mean          barplot/histogram
Ratio       mode, median, mean          histogram
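
The summary can be tried out on variables used in this lesson. A sketch, assuming the ESS10 data are loaded; which scale each variable belongs to is our own reading (cntry as nominal, the 0-10 happy rating as ordinal or interval, netustm as ratio), so check with codebook before relying on it:

```stata
* nominal (cntry): frequencies sorted so that the mode comes first, plus a bar chart
tabulate cntry, sort
graph bar (count), over(cntry)

* ordinal or interval (happy): frequencies in category order, plus the median
tabulate happy
tabstat happy, stat(median)

* ratio (netustm): mean, median and other summary statistics, plus a histogram
summarize netustm, detail
histogram netustm
```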

Google bff!

There are often several methods and commands in Stata that produce similar outputs. If you'd like to explore further, Google is always your best friend! Some useful resources:


Image source: https://ru.pinterest.com/pin/680958406132830397/

Image source: https://devcamp.com/site_blogs/top-5-programming-memes

Mandatory Assignment 1

Due date: 9 Oct. 2024, 23:59

  • Work individually (though you can discuss with your peers)

  • Select 3 variables from today’s data set based on your own interests

  • Explore these variables using what we’ve learned

  • Describe each variable and organize the outputs neatly

  • Convert your work to a PDF file and name it: surname_quanlab_1

  • Upload to Moodle under “Lab Materials” assignment section: