Stat 5301 HW3

Question 1

The experimental units are the online shoppers who visit the company’s website. The company displays different web pages to entice these shoppers to buy its featured shoes, so the subjects are the shoppers who visit the site.

The different conditions applied to these online shoppers are the two different web pages. Thus, the treatments are:

1: Web page describing the comfort of the running shoes.

2: Web page describing the discounted price for the shoes.

The company will compare the two marketing strategies by looking at the sales of the featured running shoes. Thus, the outcome for this experiment is the sales of the featured running shoes.

The experiment could be improved by using randomization. Otherwise, the results will depend too heavily on whether shoppers in general tend to buy shoes at a certain time of day, regardless of the marketing. If each shopper is assigned one of the two treatments at random throughout the day, then differences in sales can more confidently be attributed to the marketing rather than to the time of day. A comparison of the two treatments is only valid when both are applied to similar groups of shoppers, and the time of day at which people shop is one way the groups can differ.
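
A minimal sketch of the randomization described above, in R. The number of visitors, the arrival hours, and the names `pages`, `assignment`, and `period` are assumptions made only for illustration; they are not part of the original experiment.

```r
# Hypothetical sketch: give each arriving visitor one of the two pages at random.
set.seed(1)                        # only so the sketch is reproducible
n_visitors <- 200                  # assumed number of visitors
pages <- c("comfort", "discount")  # the two treatments

# Each visitor independently receives one page with probability 1/2
assignment <- sample(pages, n_visitors, replace = TRUE)

# Hypothetical arrival hour (0-23) for each visitor
hour <- sample(0:23, n_visitors, replace = TRUE)

# Group the hours into parts of the day: 0-11, 12-17, 18-23
period <- cut(hour, breaks = c(-1, 11, 17, 23),
              labels = c("morning", "afternoon", "evening"))

# Both pages should now appear in every part of the day
table(period, assignment)
```

Because the page is chosen independently of arrival time, time of day can no longer systematically favor one treatment.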

A placebo (control) treatment in this experiment, as stated in the question, would be a time frame in which no marketing web page is displayed. Adding this treatment would improve the experiment because it lets the experimenters compare sales under each marketing page with sales under the standard web page. This shows whether an increase in sales comes from the marketing or simply from customers who wanted the shoes anyway. In the improved design described in (b), the placebo treatment would be applied at random to customers throughout the day. Together, randomization and a placebo treatment let the company compare the marketing strategies with each other and also measure how much effect the marketing has relative to the normal web page.

NOTE: I am assuming the experimenters have a way to link each sale of the featured shoes to the treatment the customer was under when visiting the web page.
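
As a rough illustration of that assumption, the sketch below records the treatment shown to each visitor and summarizes purchases by treatment. The data frame `visits`, its columns, and the purchase probability of 0.1 are all hypothetical; the purchases are simulated with the same probability in every arm, so the output carries no information about real sales.

```r
# Hypothetical sketch: log each visitor's treatment, then compare purchase rates.
set.seed(2)
arms <- c("control", "comfort", "discount")  # standard page plus the two marketing pages

visits <- data.frame(
  visitor_id = 1:300,
  arm        = sample(arms, 300, replace = TRUE),    # random assignment
  purchased  = rbinom(300, size = 1, prob = 0.1)     # simulated placeholder, not real sales
)

# Proportion of visitors in each arm who bought the featured shoes
sales_by_arm <- aggregate(purchased ~ arm, data = visits, FUN = mean)
sales_by_arm
```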

Question 2

Factors: Calcium (with three levels: 0, 200, 400 mg) and Vitamin D (with two levels: 0, 50 IU)

Treatments: 6 treatments; Tablet with:

no Calcium and no Vitamin D (placebo)
no Calcium and 50 IU of Vitamin D
200 mg of Calcium and no Vitamin D
200 mg of Calcium and 50 IU of Vitamin D
400 mg of Calcium and no Vitamin D
400 mg of Calcium and 50 IU of Vitamin D
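
The six treatments listed above are every combination of the two factors. As a small sketch, they can be enumerated in R with `expand.grid`; the column names `vitamin_d_iu`, `calcium_mg`, and `treatment` are chosen here for illustration.

```r
# Enumerate all combinations of the two factors.
# expand.grid varies its first argument fastest, so Vitamin D is listed first
# to reproduce the order of the list above.
treatments <- expand.grid(vitamin_d_iu = c(0, 50),
                          calcium_mg   = c(0, 200, 400))
treatments$treatment <- seq_len(nrow(treatments))  # label the rows 1 to 6
treatments
```

Row 1 is the placebo (no Calcium, no Vitamin D) and row 6 is 400 mg of Calcium with 50 IU of Vitamin D.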

Gender may itself affect the response, so we do not want to deliberately give men and women different treatments. If we did, any difference between treatments could not be separated from the difference between men and women.
When rolling the die, there is no guarantee that every treatment will be assigned. For example, we could roll a 1 for every subject, and then everyone in the study would be assigned treatment 1, the placebo, leaving nothing to compare. Further, we want an equal number of experimental units assigned to each treatment, and rolling a die does not guarantee this either.
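
The imbalance can be seen in a short simulation. This is only a sketch: the seed and the 10,000 repetitions are arbitrary choices made for illustration.

```r
# Simulate assigning 60 subjects by rolling a fair die once per subject.
set.seed(3)
rolls <- sample(1:6, 60, replace = TRUE)

# Number of subjects that each treatment received in this one run
group_sizes <- table(factor(rolls, levels = 1:6))
group_sizes

# Repeat many times: how often does every treatment get exactly 10 subjects?
balanced <- replicate(10000, {
  r <- sample(1:6, 60, replace = TRUE)
  all(tabulate(r, nbins = 6) == 10)
})
mean(balanced)  # estimated proportion of perfectly balanced runs
```

Perfectly balanced groups occur very rarely under die rolling, which is why shuffling a fixed list of ids is preferred.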

We can randomize the 60 subjects using computer software. We assign each subject a unique id number, 1-60. In R we create a data frame with a column for the id and a column for gender, and then ask R to put the id numbers in random order. From this random order we take consecutive groups of 10 ids (group 1: positions 1-10 of the random order, group 2: positions 11-20, group 3: positions 21-30, group 4: positions 31-40, group 5: positions 41-50, group 6: positions 51-60). Each group is assigned a different treatment (group 1: treatment 1, group 2: treatment 2, …, group 6: treatment 6). We will see an example of one such design matrix in part (d).

Outline of the design:

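# One row per subject: ids 1-30 are men, ids 31-60 are women
subjects <- data.frame(id=1:60,
                       gender=c(rep("M", 30), rep("F", 30)))

# Shuffle the ids, then fill the matrix row by row so each treatment (row) gets 10 ids
random.order <- sample(subjects$id)
design <- matrix(random.order, nrow=6, ncol=10, byrow=TRUE)
rownames(design) <- paste("Treatment", 1:6)
design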

##             [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## Treatment 1   53   40   16   56   21   17   18    3   39    20
## Treatment 2   44   19   57   26   60   42   43   37   52     1
## Treatment 3   36   23    4   15   50   29   10   30   25    54
## Treatment 4   28    6   47   27   11   14    2   41   34    31
## Treatment 5    9   49    8   45   24    5   51   22   12    55
## Treatment 6   35    7   38   33   58   59   48   13   32    46

Blocking by men and women:

# Split the subjects into the two blocks
subjects_men <- subjects[subjects$gender=="M",]
subjects_women <- subjects[subjects$gender=="F",]
# Shuffle the ids separately within each block
random.men.order <- sample(subjects_men$id)
random.women.order <- sample(subjects_women$id)
# Within each block, each treatment (row) receives 5 subjects
men_design <- matrix(random.men.order, nrow=6, ncol=5, byrow=TRUE)
rownames(men_design) <- paste("Treatment", 1:6)
women_design <- matrix(random.women.order, nrow=6, ncol=5, byrow=TRUE)
rownames(women_design) <- paste("Treatment", 1:6)

men_design

##             [,1] [,2] [,3] [,4] [,5]
## Treatment 1    4   27   16   13   18
## Treatment 2    7   20    5    3    8
## Treatment 3   25   11   10   15    6
## Treatment 4   21   17   28   19   26
## Treatment 5   23   12   29    9   30
## Treatment 6    1   14   24    2   22

women_design

##             [,1] [,2] [,3] [,4] [,5]
## Treatment 1   52   57   56   35   45
## Treatment 2   48   42   49   33   34
## Treatment 3   43   31   50   37   55
## Treatment 4   53   32   60   58   39
## Treatment 5   46   51   54   40   36
## Treatment 6   59   38   47   44   41

Overall design:

block_design <- cbind(men_design, women_design)
block_design

##             [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## Treatment 1    4   27   16   13   18   52   57   56   35    45
## Treatment 2    7   20    5    3    8   48   42   49   33    34
## Treatment 3   25   11   10   15    6   43   31   50   37    55
## Treatment 4   21   17   28   19   26   53   32   60   58    39
## Treatment 5   23   12   29    9   30   46   51   54   40    36
## Treatment 6    1   14   24    2   22   59   38   47   44    41
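
A quick way to confirm that a blocked design like this is balanced is to store the assignment in long format and cross-tabulate gender against treatment. The sketch below builds its own random assignment rather than reusing the matrices above, so the particular ids it assigns will differ; the `treatment` column and the seed are assumptions for illustration.

```r
# Sketch: randomize within each gender block and check the balance.
set.seed(4)
subjects <- data.frame(id = 1:60,
                       gender = c(rep("M", 30), rep("F", 30)))
subjects$treatment <- NA_integer_

for (g in c("M", "F")) {
  in_block <- subjects$gender == g
  # Shuffle five copies of each treatment label among the 30 subjects in the block
  subjects$treatment[in_block] <- sample(rep(1:6, each = 5))
}

# Every gender-by-treatment cell should contain exactly 5 subjects
table(subjects$gender, subjects$treatment)
```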

Question 3

Population: 584 longleaf pine trees
Sample: 40 trees
Concerns about sample design: I do not have any concerns with the sample design because the 40 trees are described as a random sample. I take this to mean that all 584 trees had an equal chance of being picked. I also assume that the 40 trees picked were in fact longleaf pine trees.
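
For illustration, a random sample of this kind could be drawn in R as follows, assuming the 584 trees are numbered 1 to 584. The numbering and the names `tree_ids` and `sampled_trees` are hypothetical.

```r
# Sketch: draw 40 of the 584 trees so that every tree is equally likely to be chosen.
set.seed(5)
tree_ids <- 1:584
sampled_trees <- sample(tree_ids, size = 40)  # sample() draws without replacement by default
sort(sampled_trees)
```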

Population: All workers
Sample: 100 restaurant workers
Concerns about sample design: The stress of restaurant workers does not represent the stress of all types of workers, because different types of workers encounter different kinds of work stress. So the sample does not represent the whole population.

Population: U.S. mothers
Sample: Households with children under 15 who were contacted through random digit dialing.
Concerns about sample design: My first concern is that this sample misses those who do not have phones. Also, because the responses were gathered through phone calls, the survey has a high non-response rate (40%). It is possible that households with stay-at-home mothers are more likely to have someone at home to answer the survey call, in which case they would be overrepresented among the responses.

Question 4

This is a simple random sample because it gives every resident of Sweden an equal chance of being chosen.

This is a stratified random sample because the emails are first divided into work and personal email addresses, and a random sample is then taken from each group.
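
The general idea of stratified sampling can be sketched in R as below. The data frame `emails`, the stratum sizes (400 work and 600 personal), and the 25 draws per stratum are invented for illustration and do not come from the question.

```r
# Sketch of stratified sampling on hypothetical data.
set.seed(6)
emails <- data.frame(
  email_id = 1:1000,
  type     = rep(c("work", "personal"), times = c(400, 600))
)

n_per_stratum <- 25                      # assumed sample size per stratum
by_stratum <- split(emails, emails$type) # one data frame per stratum

# Take a simple random sample of rows inside each stratum, then stack the results
strat_sample <- do.call(rbind, lapply(by_stratum, function(s) {
  s[sample(nrow(s), n_per_stratum), ]
}))

table(strat_sample$type)  # 25 rows from each stratum
```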

This is a voluntary response sample because the reviewers volunteered themselves to rate the external hard drive. People with strong opinions are the most likely to volunteer, and we see that those who chose to give reviews were at opposite ends of opinion (1-star versus 4- and 5-star ratings).

Jessica Wheeler

September 23, 2015
