1 Data Description

Please load the MSR package and the data camera from the package. The baseline levels for the attributes are already set for you. To see a detailed description of the data, run ?camera or help(camera) in the R console.

# load the package MSR
library(MSR)

# load the data `camera`
data(camera)

# get a sneak peek of the data
head(camera)
##   id      memory_format                 lens    price  ratings
## 1  1            SD Card Plastic Lens 8x Zoom 300 euro 51.05633
## 2  1 Compact Flash Card   Glass Lens 8x Zoom 100 euro 89.76444
## 3  1            SD Card Plastic Lens 4x Zoom 100 euro 20.50488
## 4  1 Compact Flash Card Plastic Lens 8x Zoom 200 euro 61.27721
## 5  1       Memory Stick Plastic Lens 8x Zoom 100 euro 62.21202
## 6  1            SD Card   Glass Lens 8x Zoom 200 euro 33.26009

The data has 3 product attributes for a camera and 3 levels for each attribute. The levels of these attributes are listed below. Note that the baseline levels are already set for all the attributes (Memory Stick, Glass Lens 8x Zoom, 300 euro).

## $memory_format
## [1] "Memory Stick"       "SD Card"            "Compact Flash Card"
## 
## $lens
## [1] "Glass Lens 8x Zoom"   "Plastic Lens 8x Zoom" "Plastic Lens 4x Zoom"
## 
## $price
## [1] "300 euro" "100 euro" "200 euro"

2 Estimating the Coefficients

We first estimate a linear regression of ratings on the three attributes (memory_format, lens, price) to obtain the coefficients of the levels. The baseline levels are already set for you. In your own practice, you may use the relevel() function to choose your own baselines. In theory, you can use any level as the baseline for an attribute.

# running the linear regression 
mdl <- lm(ratings ~ memory_format + lens + price, camera)

# get a summary of the model results
summary(mdl)
## 
## Call:
## lm(formula = ratings ~ memory_format + lens + price, data = camera)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -45.959  -9.616   0.387   9.410  54.870 
## 
## Coefficients:
##                                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                      46.6175     1.5757  29.585  < 2e-16 ***
## memory_formatSD Card             -0.1962     1.4588  -0.134  0.89307    
## memory_formatCompact Flash Card   0.5834     1.4588   0.400  0.68938    
## lensPlastic Lens 8x Zoom         -3.7992     1.4588  -2.604  0.00943 ** 
## lensPlastic Lens 4x Zoom         -9.9566     1.4588  -6.825 2.14e-11 ***
## price100 euro                    16.5377     1.4588  11.336  < 2e-16 ***
## price200 euro                     6.2857     1.4588   4.309 1.92e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14.73 on 605 degrees of freedom
## Multiple R-squared:  0.2281, Adjusted R-squared:  0.2204 
## F-statistic: 29.79 on 6 and 605 DF,  p-value: < 2.2e-16

From the estimation results, we see that the overall model is significant with F-stat 29.79 and p-value < 0.05. Therefore, we reject the null hypothesis that all coefficients are zero. Note that it is unnecessary to check the R-squared as we are not doing prediction here.

Moreover, a closer look at the coefficients shows that the two levels of memory_format are insignificant, with p-values of 0.89 and 0.69 respectively. Therefore, we cannot reject the null hypothesis that these coefficients are zero. We will use this observation in the next step.

3 Obtaining the Partworths

In class, we have discussed three rules to transform the coefficients to partworths:

  1. Baseline levels partworths = 0
  2. Insignificant levels partworths = 0
  3. Significant levels partworths = coefficients

Here, we present two options. For those of you who are not familiar with R, we will create a table in Word/Excel/PowerPoint to record the results. For those of you who are R lovers, please see the final section for the R code.

First, we create a coefficients table to record the coefficients and p-values of all levels. In your own practice, please follow a similar procedure. I will show you the table here, but you can create your table in Word or Excel. Note that in the table, the coefficients of the baselines are zero and their p-values are empty. This is because each baseline is the reference level: its effect is absorbed into the intercept, so its coefficient is fixed at zero and it has no p-value.

Attributes     Levels              Coefficients  P-values
Memory Format  SD card             -0.196        0.893
               Compact flash card   0.583        0.689
               Memory stick         0.000
Lens           Plastic 8x Zoom     -3.799        0.009
               Plastic 4x Zoom     -9.957        2.14E-11
               Glass 8x Zoom        0.000
Price          100 euro            16.538        < 2E-16
               200 euro             6.286        1.92E-05
               300 euro             0.000


Based on the coefficients table, we then follow the three rules to obtain the partworths of all levels. The partworths of all baselines (Memory Stick, Glass 8x Zoom and 300 euro) are zero. The partworths of the insignificant levels (SD Card and Compact Flash Card) are also zero. For all the other levels, the partworths equal the coefficients.

Attributes     Levels              Coefficients  P-values  Partworths
Memory Format  SD card             -0.196        0.893      0.000
               Compact flash card   0.583        0.689      0.000
               Memory stick         0.000                   0.000
Lens           Plastic 8x Zoom     -3.799        0.009     -3.799
               Plastic 4x Zoom     -9.957        2.14E-11  -9.957
               Glass 8x Zoom        0.000                   0.000
Price          100 euro            16.538        < 2E-16   16.538
               200 euro             6.286        1.92E-05   6.286
               300 euro             0.000                   0.000

4 Making Use of the Results

Given the partworths, we can use them to generate some insights into consumers. Here, we focus on three applications:

  • Calculating the importance of attributes;
  • Checking the most preferred products;
  • Comparing product ideas.

4.1 Calculating the importance of attributes

To calculate the importance of an attribute, we look at the partworths of all its levels and take the difference between the highest and the lowest partworth. \[ \text{importance} = \max(\text{partworth}) - \min(\text{partworth}) \] Following the equation, we have:

  • Importance of Memory Format = 0 - 0 = 0
  • Importance of Lens = 0 - (-9.957) = 9.957
  • Importance of Price = 16.538 - 0 = 16.538

We can also normalize the importance for reporting purposes. Relative importances sum to 1, so our clients can read the results much more easily. To do the normalization, we divide each importance by the sum of all importances.

\[ \begin{aligned}
\text{Memory Format} &= \frac{0}{0+9.957+16.538} = 0 \\
\text{Lens} &= \frac{9.957}{0+9.957+16.538} = 0.376 \\
\text{Price} &= \frac{16.538}{0+9.957+16.538} = 0.624
\end{aligned} \]

4.2 Obtaining the most preferred product profile

To find the product profile consumers like the most, we combine the level with the highest partworth from each attribute. For Lens, it’s Glass 8x Zoom. For Price, it’s 100 euro. For Memory Format, all the levels have the same partworth (all zeros). This means we find no evidence that consumers care which memory format is included, so the company can choose any memory format without affecting the predicted preference.

4.3 Comparing product ideas

In real research projects, you are often asked by your clients to compare some product profiles. These profiles are probably “good candidates” from within the company. For example, suppose your client asks you to compare the following two product ideas:

  • Camera A: SD card, Plastic 4x Zoom and priced at 100 euro.
  • Camera B: Memory Stick, Plastic 8x Zoom and priced at 200 euro.

To compare the two products, you just need to add up the partworths of the levels of the products.

  • For Camera A, we have \(0 - 9.957 + 16.538 = 6.581\).
  • For Camera B, we have \(0 - 3.799 + 6.286 = 2.487\).

From the calculation, consumers prefer Camera A (6.581 > 2.487).

5 For R Lovers: R Code to Obtain Partworths and Make Use of Results (ignore this if you are not familiar with R)

First, we create three vectors for three attributes with names.

Memory_Format <- setNames(rep(0,3),
                          levels(camera$memory_format))
Lens <- setNames(rep(0,3),
                          levels(camera$lens))
Price <- setNames(rep(0,3),
                          levels(camera$price))

We next obtain the coefficients from the estimation results and set the insignificant coefficients to zero, keeping only those with p-values below 0.05.

# getting the coefficients table
results <- summary(mdl)$coefficients
coefficients <- results[-1,1]*(results[-1,4]<0.05)

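# reshape into a 2 x 3 matrix, filled column by column: one column per attribute
# (this works because every attribute has exactly two non-baseline levels)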
coefficients <- matrix(coefficients, nrow = 2, ncol = 3)

We next pack the coefficients into the three partworth vectors and create a list called partworths to store all partworth values.

Memory_Format[2:3] <- coefficients[,1]
Lens[2:3] <- coefficients[,2]
Price[2:3] <- coefficients[,3]

partworths <- list(Memory_Format = Memory_Format,
                   Lens = Lens,
                   Price = Price)
partworths
## $Memory_Format
##       Memory Stick            SD Card Compact Flash Card 
##                  0                  0                  0 
## 
## $Lens
##   Glass Lens 8x Zoom Plastic Lens 8x Zoom Plastic Lens 4x Zoom 
##             0.000000            -3.799195            -9.956553 
## 
## $Price
##  300 euro  100 euro  200 euro 
##  0.000000 16.537702  6.285731

Given the partworths table, we then calculate the importance of different attributes and the relative importance.

importance <- unlist(lapply(partworths,
                            function(x){max(x)-min(x)}))
importance
## Memory_Format          Lens         Price 
##      0.000000      9.956553     16.537702
relative_importance <- importance/sum(importance)
relative_importance
## Memory_Format          Lens         Price 
##     0.0000000     0.3758005     0.6241995

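The most preferred product combines the level with the highest partworth from each attribute. Because all Memory_Format partworths are tied at zero, which() returns all three of its levels.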

most_preferred <- lapply(partworths,
                         function(x){which(x == max(x))})
most_preferred
## $Memory_Format
##       Memory Stick            SD Card Compact Flash Card 
##                  1                  2                  3 
## 
## $Lens
## Glass Lens 8x Zoom 
##                  1 
## 
## $Price
## 100 euro 
##        2

Lastly, we compare the two hypothetical product profiles: A vs. B.

  • Camera A: SD card, Plastic 4x Zoom and priced at 100 euro.
  • Camera B: Memory Stick, Plastic 8x Zoom and priced at 200 euro.

Utility_A <- partworths$Memory_Format[2] +
  partworths$Lens[3] + 
  partworths$Price[2]
setNames(Utility_A,"Utility_A")
## Utility_A 
##  6.581148
Utility_B <- partworths$Memory_Format[1] +
  partworths$Lens[2] + 
  partworths$Price[3]
setNames(Utility_B,"Utility_B")
## Utility_B 
##  2.486536
---
title: 'R Tutorial: Conjoint Analysis'
output:
  html_document:
    df_print: default
    code_download: yes
    theme: readable
    toc: yes
    toc_float: 
      collapsed: no
      smooth_scroll: no
    number_sections: yes
  pdf_document:
    toc: yes
---

# Data Description 

Please load the `MSR` package and the data `camera` from the package. The baseline levels for the attributes are already set for you. To see a detailed description of the data, run `?camera` or `help(camera)` in the R console.

```{r}
# load the package MSR
library(MSR)

# load the data `camera`
data(camera)

# get a sneak peek of the data
head(camera)
```

The data has 3 product attributes for a camera and 3 levels for each attribute. The levels of these attributes are listed below. Note that the baseline levels are already set for all the attributes (`Memory Stick`, `Glass Lens 8x Zoom`, `300 euro`).

```{r, echo=FALSE}
# check the levels of all attributes
lapply(camera[,2:4],levels)
```

# Estimating the Coefficients

We first estimate a linear regression of ratings on the three attributes (`memory_format`, `lens`, `price`) to obtain the coefficients of the levels. The baseline levels are already set for you. In your own practice, you may use the `relevel()` function to choose your own baselines. In theory, you can use any level as the baseline for an attribute.
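
As a minimal sketch of how `relevel()` changes a baseline, the chunk below uses a small toy factor rather than the real `camera` data. The name `price_demo` is hypothetical and chosen only for illustration.

```{r}
# toy factor standing in for an attribute column (not the real `camera` data)
price_demo <- factor(c("100 euro", "200 euro", "300 euro", "100 euro"))

# by default the levels are sorted, so the baseline (first level) is "100 euro"
levels(price_demo)

# make "300 euro" the baseline (reference) level
price_demo <- relevel(price_demo, ref = "300 euro")
levels(price_demo)
```

After releveling, the chosen level comes first and the remaining levels keep their previous order, which is the ordering `lm()` uses to pick the reference category.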

```{r}
# running the linear regression 
mdl <- lm(ratings ~ memory_format + lens + price, camera)

# get a summary of the model results
summary(mdl)

```

From the estimation results, we see that the overall model is significant with `F-stat` 29.79 and `p-value < 0.05`. Therefore, we reject the null hypothesis that all coefficients are zero. Note that it is unnecessary to check the R-squared as we are not doing prediction here. 
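
If you want the p-value of the overall F-test as a number rather than reading it off the printout, one way is to evaluate the upper tail of the F distribution with `pf()`. The sketch below types in the F statistic and degrees of freedom reported in the summary; the variable names are hypothetical. With a fitted model, the same three numbers are also stored in `summary(mdl)$fstatistic`.

```{r}
# F statistic and degrees of freedom as reported by summary(mdl)
f_value  <- 29.79
df_model <- 6
df_resid <- 605

# upper-tail probability of the F distribution = p-value of the overall test
p_overall <- pf(f_value, df_model, df_resid, lower.tail = FALSE)

# TRUE means we reject the null hypothesis at the 5% level
p_overall < 0.05
```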

Moreover, a closer look at the coefficients shows that the two levels of `memory_format` are insignificant, with `p-values` of 0.89 and 0.69 respectively. Therefore, we cannot reject the null hypothesis that these coefficients are zero. We will use this observation in the next step.

# Obtaining the Partworths

In class, we have discussed three rules to transform the coefficients to partworths:

  1. Baseline levels partworths = 0
  2. Insignificant levels partworths = 0
  3. Significant levels partworths = coefficients  
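
The three rules can be sketched as a small helper function. This is only an illustration under assumptions: `to_partworths()` and `lens_pw_demo` are hypothetical names, the 0.05 threshold is the one used in this tutorial, and the Lens numbers are copied by hand from the regression summary.

```{r}
# hypothetical helper: apply the three rules to one attribute
to_partworths <- function(coefs, pvals, baseline, alpha = 0.05) {
  pw <- coefs                    # rule 3: significant levels keep their coefficient
  pw[pvals >= alpha] <- 0        # rule 2: insignificant levels are set to 0
  c(setNames(0, baseline), pw)   # rule 1: the baseline level is 0
}

# Lens coefficients and p-values copied from the regression summary
lens_pw_demo <- to_partworths(
  coefs    = c("Plastic Lens 8x Zoom" = -3.7992, "Plastic Lens 4x Zoom" = -9.9566),
  pvals    = c(0.00943, 2.14e-11),
  baseline = "Glass Lens 8x Zoom"
)
lens_pw_demo
```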

Here, we present two options. For those of you who are not familiar with R, we will create a table in Word/Excel/PowerPoint to record the results. For those of you who are R lovers, please see the final section for the R code.

First, we create a coefficients table to record the coefficients and p-values of all levels. In your own practice, please follow a similar procedure. I will show you the table here, but you can create your table in Word or Excel. Note that in the table, the coefficients of the baselines are zero and their p-values are empty. This is because each baseline is the reference level: its effect is absorbed into the intercept, so its coefficient is fixed at zero and it has no p-value.

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-fymr{border-color:inherit;font-weight:bold;text-align:left;vertical-align:top}
.tg .tg-0pky{border-color:inherit;text-align:left;vertical-align:top}
</style>
<table class="tg">
<thead>
  <tr>
    <th class="tg-fymr">Attributes</th>
    <th class="tg-fymr">Levels</th>
    <th class="tg-fymr">Coefficients</th>
    <th class="tg-fymr">P-values</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-0pky">Memory Format</td>
    <td class="tg-0pky">SD card</td>
    <td class="tg-0pky">-0.196</td>
    <td class="tg-0pky">0.893</td>
  </tr>
  <tr>
    <td class="tg-0pky"></td>
    <td class="tg-0pky">Compact flash card</td>
    <td class="tg-0pky">0.583</td>
    <td class="tg-0pky">0.689</td>
  </tr>
  <tr>
    <td class="tg-0pky"></td>
    <td class="tg-0pky">Memory stick</td>
    <td class="tg-0pky">0.000</td>
    <td class="tg-0pky"></td>
  </tr>
  <tr>
    <td class="tg-0pky">Lens</td>
    <td class="tg-0pky">Plastic 8x Zoom</td>
    <td class="tg-0pky">-3.799</td>
    <td class="tg-0pky">0.009</td>
  </tr>
  <tr>
    <td class="tg-0pky"></td>
    <td class="tg-0pky">Plastic 4x Zoom</td>
    <td class="tg-0pky">-9.957</td>
    <td class="tg-0pky">2.14E-11</td>
  </tr>
  <tr>
    <td class="tg-0pky"></td>
    <td class="tg-0pky">Glass 8x Zoom</td>
    <td class="tg-0pky">0.000</td>
    <td class="tg-0pky"></td>
  </tr>
  <tr>
    <td class="tg-0pky">Price</td>
    <td class="tg-0pky">100 euro</td>
    <td class="tg-0pky">16.538</td>
    <td class="tg-0pky">&lt; 2E-16</td>
  </tr>
  <tr>
    <td class="tg-0pky"></td>
    <td class="tg-0pky">200 euro</td>
    <td class="tg-0pky">6.286</td>
    <td class="tg-0pky">1.92E-05</td>
  </tr>
  <tr>
    <td class="tg-0pky"></td>
    <td class="tg-0pky">300 euro</td>
    <td class="tg-0pky">0.000</td>
    <td class="tg-0pky"></td>
  </tr>
</tbody>
</table><br>

Based on the coefficients table, we then follow the three rules to obtain the partworths of all levels. The partworths of all baselines (`Memory Stick`, `Glass 8x Zoom` and `300 euro`) are zero. The partworths of the insignificant levels (`SD Card` and `Compact Flash Card`) are also zero. For all the other levels, the partworths equal the coefficients.

<table class="tg">
<thead>
  <tr>
    <th class="tg-fymr">Attributes</th>
    <th class="tg-fymr">Levels</th>
    <th class="tg-fymr">Coefficients</th>
    <th class="tg-fymr">P-values</th>
    <th class="tg-fymr">Partworths</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-0pky">Memory Format</td>
    <td class="tg-0pky">SD card</td>
    <td class="tg-0pky">-0.196</td>
    <td class="tg-0pky">0.893</td>
    <td class="tg-0pky">0.000</td>
  </tr>
  <tr>
    <td class="tg-0pky"></td>
    <td class="tg-0pky">Compact flash card</td>
    <td class="tg-0pky">0.583</td>
    <td class="tg-0pky">0.689</td>
    <td class="tg-0pky">0.000</td>
  </tr>
  <tr>
    <td class="tg-0pky"></td>
    <td class="tg-0pky">Memory stick</td>
    <td class="tg-0pky">0.000</td>
    <td class="tg-0pky"></td>
    <td class="tg-0pky">0.000</td>
  </tr>
  <tr>
    <td class="tg-0pky">Lens</td>
    <td class="tg-0pky">Plastic 8x Zoom</td>
    <td class="tg-0pky">-3.799</td>
    <td class="tg-0pky">0.009</td>
    <td class="tg-0pky">-3.799</td>
  </tr>
  <tr>
    <td class="tg-0pky"></td>
    <td class="tg-0pky">Plastic 4x Zoom</td>
    <td class="tg-0pky">-9.957</td>
    <td class="tg-0pky">2.14E-11</td>
    <td class="tg-0pky">-9.957</td>
  </tr>
  <tr>
    <td class="tg-0pky"></td>
    <td class="tg-0pky">Glass 8x Zoom</td>
    <td class="tg-0pky">0.000</td>
    <td class="tg-0pky"></td>
    <td class="tg-0pky">0.000</td>
  </tr>
  <tr>
    <td class="tg-0pky">Price</td>
    <td class="tg-0pky">100 euro</td>
    <td class="tg-0pky">16.538</td>
    <td class="tg-0pky">&lt; 2E-16</td>
    <td class="tg-0pky">16.538</td>
  </tr>
  <tr>
    <td class="tg-0pky"></td>
    <td class="tg-0pky">200 euro</td>
    <td class="tg-0pky">6.286</td>
    <td class="tg-0pky">1.92E-05</td>
    <td class="tg-0pky">6.286</td>
  </tr>
  <tr>
    <td class="tg-0pky"></td>
    <td class="tg-0pky">300 euro</td>
    <td class="tg-0pky">0.000</td>
    <td class="tg-0pky"></td>
    <td class="tg-0pky">0.000</td>
  </tr>
</tbody>
</table>

# Making Use of the Results

Given the partworths, we can use them to generate some insights into consumers. Here, we focus on three applications: 

* Calculating the importance of attributes;
* Checking the most preferred products;
* Comparing product ideas.

## Calculating the importance of attributes

To calculate the importance of an attribute, we look at the partworths of all its levels and take the difference between the highest and the lowest partworth.
$$ \text{importance} = \max(\text{partworth}) - \min(\text{partworth}) $$
Following the equation, we have:

* Importance of Memory Format = 0 - 0 = 0
* Importance of Lens = 0 - (-9.957) = 9.957
* Importance of Price = 16.538 - 0 = 16.538

We can also normalize the importance for reporting purposes. Relative importances sum to 1, so our clients can read the results much more easily. To do the normalization, we divide each importance by the sum of all importances.

$$
\begin{aligned}
\text{Memory Format} &= \frac{0}{0+9.957+16.538} = 0 \\
\text{Lens} &= \frac{9.957}{0+9.957+16.538} = 0.376 \\
\text{Price} &= \frac{16.538}{0+9.957+16.538} = 0.624
\end{aligned}
$$

## Obtaining the most preferred product profile

To find the product profile consumers like the most, we combine the level with the highest partworth from each attribute. For `Lens`, it's `Glass 8x Zoom`. For `Price`, it's `100 euro`. For `Memory Format`, all the levels have the same partworth (all zeros). This means we find no evidence that consumers care which memory format is included, so the company can choose any memory format without affecting the predicted preference.

## Comparing product ideas

In real research projects, you are often asked by your clients to compare some product profiles. These profiles are probably "good candidates" from within the company. For example, suppose your client asks you to compare the following two product ideas: 

* Camera A: SD card, Plastic 4x Zoom and priced at 100 euro. 
* Camera B: Memory Stick, Plastic 8x Zoom and priced at 200 euro. 

To compare the two products, you just need to add up the partworths of the levels of the products. 

* For Camera A, we have $0 - 9.957 + 16.538 = 6.581$.
* For Camera B, we have $0 - 3.799 + 6.286 = 2.487$.

From the calculation, consumers prefer Camera A (6.581 > 2.487).

# For R Lovers: R Code to Obtain Partworths and Make Use of Results (ignore this if you are not familiar with R)

First, we create three vectors for three attributes with names. 

```{r}
Memory_Format <- setNames(rep(0,3),
                          levels(camera$memory_format))
Lens <- setNames(rep(0,3),
                          levels(camera$lens))
Price <- setNames(rep(0,3),
                          levels(camera$price))
```

We next obtain the coefficients from the estimation results and set the insignificant coefficients to zero, keeping only those with p-values below 0.05.

```{r}
# getting the coefficients table
results <- summary(mdl)$coefficients
coefficients <- results[-1,1]*(results[-1,4]<0.05)

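# reshape into a 2 x 3 matrix, filled column by column: one column per attribute
# (this works because every attribute has exactly two non-baseline levels)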
coefficients <- matrix(coefficients, nrow = 2, ncol = 3)
```
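
The reshape above depends on position, so it assumes every attribute has the same number of levels. When that is not the case, one possible alternative is to look up each coefficient by its name, which `lm()` builds as the attribute name followed by the level name. The sketch below is self-contained and uses hand-typed values from the summary; `coef_demo`, `levels_demo` and `pw_by_name` are hypothetical names.

```{r}
# coefficients after zeroing the insignificant ones (values from the summary)
coef_demo <- c("memory_formatSD Card" = 0,
               "memory_formatCompact Flash Card" = 0,
               "lensPlastic Lens 8x Zoom" = -3.7992,
               "lensPlastic Lens 4x Zoom" = -9.9566,
               "price100 euro" = 16.5377,
               "price200 euro" = 6.2857)

# attribute levels with the baseline first, as in levels(camera$...)
levels_demo <- list(
  memory_format = c("Memory Stick", "SD Card", "Compact Flash Card"),
  lens  = c("Glass Lens 8x Zoom", "Plastic Lens 8x Zoom", "Plastic Lens 4x Zoom"),
  price = c("300 euro", "100 euro", "200 euro")
)

# look up "<attribute><level>"; the baseline has no coefficient, so it becomes 0
pw_by_name <- lapply(names(levels_demo), function(a) {
  vals <- coef_demo[paste0(a, levels_demo[[a]])]
  vals[is.na(vals)] <- 0
  setNames(unname(vals), levels_demo[[a]])
})
names(pw_by_name) <- names(levels_demo)
pw_by_name$price
```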

We next pack the coefficients into the three partworth vectors and create a list called `partworths` to store all partworth values.

```{r}
Memory_Format[2:3] <- coefficients[,1]
Lens[2:3] <- coefficients[,2]
Price[2:3] <- coefficients[,3]

partworths <- list(Memory_Format = Memory_Format,
                   Lens = Lens,
                   Price = Price)
partworths
```

Given the partworths table, we then calculate the importance of different attributes and the relative importance. 

```{r}
importance <- unlist(lapply(partworths,
                            function(x){max(x)-min(x)}))
importance

relative_importance <- importance/sum(importance)
relative_importance
```

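The most preferred product combines the level with the highest partworth from each attribute. Because all `Memory_Format` partworths are tied at zero, `which()` returns all three of its levels.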

```{r}
most_preferred <- lapply(partworths,
                         function(x){which(x == max(x))})
most_preferred
```
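
If you only need the level names and not their positions, a small variation returns the names directly. This is a sketch on a hand-typed copy of the rounded partworths table; `pw_demo` and `best_levels` are hypothetical names.

```{r}
# rounded partworths copied from the table in the earlier section
pw_demo <- list(
  Memory_Format = c("Memory Stick" = 0, "SD Card" = 0, "Compact Flash Card" = 0),
  Lens  = c("Glass Lens 8x Zoom" = 0, "Plastic Lens 8x Zoom" = -3.799,
            "Plastic Lens 4x Zoom" = -9.957),
  Price = c("300 euro" = 0, "100 euro" = 16.538, "200 euro" = 6.286)
)

# names of the level(s) with the highest partworth; ties return several names
best_levels <- lapply(pw_demo, function(x) names(x)[x == max(x)])
best_levels
```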

Lastly, we compare the two hypothetical product profiles: A vs. B. 

* Camera A: SD card, Plastic 4x Zoom and priced at 100 euro. 
* Camera B: Memory Stick, Plastic 8x Zoom and priced at 200 euro. 

```{r}
Utility_A <- partworths$Memory_Format[2] +
  partworths$Lens[3] + 
  partworths$Price[2]
setNames(Utility_A,"Utility_A")

Utility_B <- partworths$Memory_Format[1] +
  partworths$Lens[2] + 
  partworths$Price[3]
setNames(Utility_B,"Utility_B")
```
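
Indexing by position (`[2]`, `[3]`) is easy to get wrong when the level order changes. As a sketch of an alternative, the hypothetical helper `profile_utility()` below looks up each level by name and sums the partworths. It uses a hand-typed copy of the rounded partworths, so the totals match the manual calculation rather than the unrounded output above.

```{r}
# rounded partworths copied from the table in the earlier section
pw_demo <- list(
  Memory_Format = c("Memory Stick" = 0, "SD Card" = 0, "Compact Flash Card" = 0),
  Lens  = c("Glass Lens 8x Zoom" = 0, "Plastic Lens 8x Zoom" = -3.799,
            "Plastic Lens 4x Zoom" = -9.957),
  Price = c("300 euro" = 0, "100 euro" = 16.538, "200 euro" = 6.286)
)

# hypothetical helper: sum the partworths of the levels named in `profile`
profile_utility <- function(partworths, profile) {
  sum(mapply(function(attr, level) partworths[[attr]][[level]],
             names(profile), profile))
}

camera_a <- c(Memory_Format = "SD Card", Lens = "Plastic Lens 4x Zoom",
              Price = "100 euro")
camera_b <- c(Memory_Format = "Memory Stick", Lens = "Plastic Lens 8x Zoom",
              Price = "200 euro")

utility_a <- profile_utility(pw_demo, camera_a)
utility_b <- profile_utility(pw_demo, camera_b)
c(Utility_A = utility_a, Utility_B = utility_b)
```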





