Many cosmetic companies today have introduced Virtual Try-on technology (VTO) on their online shopping websites. VTO can provide a better shopping experience and help customers find the right product according to relevant studies. However, for companies, whether VTO can improve the sales revenue remains unclear.
In this paper, we design an experiment to answer three questions:
Compared to on-line shopping without VTO, is mean sales of lipsticks higher in on-line shopping with VTO?
Compared to the difference in mean sales of lipsticks between shopping with and without VTO in customers older than 30, is the difference higher in customers younger than 30?
Compared to the difference in mean sales of lipsticks between shopping with and without VTO in best-seller products, is the difference higher in miscellaneous products?
Our experiment is based on the lipsticks and customers of YSL company. We randomly select 2000 customers and assign them equally into two groups. In the control group, customers view 20 lipsticks and shop online without VTO. In the treatment group, customers should use VTO before making buying decisions for all 20 lipsticks. In our experiment, we balance the age and race of selected customers, and the color and popularity of lipsticks according to YSL’s customer segmentation and product list.
The results of our experiment are analyzed through the mean sales of the two groups. To answer the first question, we perform a two-sample t-test to see whether the mean sales of the treatment group are significantly higher than those of the control group. To answer the second and third questions, we perform a two-way ANOVA and Tukey’s HSD test to see whether the difference in mean sales of lipsticks between shopping with and without VTO differs significantly between the two subgroups.
We hope our research can provide cosmetic companies with a unique perspective on VTO’s profitability and usage.
The fast development of e-commerce has spurred various technologies that improve the customer experience in online shopping. Virtual try-on (VTO) is one of the most popular technologies used by cosmetic companies to help customers decide on the right product for them. For example, YSL has placed VTO in a prominent position on its website, see Figure 1. Customers upload their images to the VTO system, which uses augmented reality (AR) to let them try on a product, e.g., lipstick, online and in real time. The system then renders the source image with the estimated makeup applied, see Figure 2. It is applicable across multiple product ranges and customer appearances.
For customers, VTO helps them choose a suitable lipstick and creates a better online shopping experience. For companies, however, it remains an open question whether the investment in this technology is worthwhile. Stakeholders may ask: will VTO increase sales? And if so, by how much, and which products will benefit from it? These managerial questions can be answered through a controlled experiment.
This paper aims to answer three research questions, with the hypotheses listed below:
1. Compared to on-line shopping without VTO, is mean sales of lipsticks higher in on-line shopping with VTO?
Null Hypo: Compared to on-line shopping without VTO, the mean sales of lipsticks are not higher (smaller or equal) in on-line shopping with VTO.
Alternative Hypo: Compared to on-line shopping without VTO, the mean sales of lipsticks are higher in on-line shopping with VTO.
Meaningful effect size: 10% higher in the mean sales of lipsticks with VTO than without VTO.
2. Compared to the difference in mean sales of lipsticks between shopping with and without VTO in customers older than 30, is the difference higher in customers younger than 30?
Null Hypo: Compared to the difference in mean sales of lipsticks between shopping with and without VTO in customers older than 30, the difference is not higher (smaller or equal) in customers younger than 30.
Alternative Hypo: Compared to the difference in mean sales of lipsticks between shopping with and without VTO in customers older than 30, the difference is higher in customers younger than 30.
Meaningful effect size: 10% higher in the difference in mean sales between with and without VTO in customers younger than 30 than the difference in customers older than 30.
3. Compared to the difference in mean sales of lipsticks between shopping with and without VTO in best-seller lipsticks, is the difference higher in miscellaneous lipsticks?
Null Hypo: Compared to the changes in mean sales of bestseller category lipsticks by using VTO, the change in mean sales of the miscellaneous category lipsticks by using VTO is not larger.
Alternative Hypo: Compared to the changes in mean sales of bestseller category lipsticks by using VTO, the change in mean sales of the miscellaneous category lipsticks by using VTO is larger.
Meaningful effect size: 10% higher in the difference in mean sales between with and without VTO in miscellaneous products than the difference in best-seller products.
Despite the wide adoption of VTO by leading cosmetic brands, it remains unclear how VTO affects product sales and under what circumstances these effects differ. Earlier studies (Bialkova, 2022) mostly focus on the try-on experience, described in terms such as ‘Satisfaction’, ‘Immersion’, and ‘Interactivity’. These measures are prone to inconsistency because subjective feelings differ greatly between individuals. Few studies have looked into VTO’s impact on sales, a major concern for cosmetic companies that scholars have largely neglected. Additionally, when considering what may change the effects of VTO, most studies focus on technology and system design. For example, Atieh (2017) concluded that better aesthetic quality and estimation accuracy of AR lead to a higher level of enjoyment during the shopping experience. Similarly, Bigne (2021) stated that a system design that provides interactivity, realism, and ease of use positively influences the virtual try-on experience. However, besides technology, customers and products also greatly influence VTO’s effects. Take lipstick as an example. Some colors do not suit certain racial groups for cultural and aesthetic reasons (Collins, 2021), a fact that VTO cannot change. In this case, we assume that VTO will not help the sales of these products and is more likely to discourage customers. This gap in effects across customers and products was neglected by earlier studies, yet understanding it can help cosmetic companies make the best use of VTO by targeting the right customers and products. Given these limitations of current research, further study of the effects of VTO is necessary.
The population of interest is customers who have purchased YSL’s lipsticks and have no virtual try-on experience before the study.
The research aims to recruit 2000 customers who have purchased YSL’s lipsticks and have no prior virtual try-on experience to participate in our experiment. We cooperate with the YSL sales team. Recruitment materials invite customers who intend to buy lipsticks to take a survey before they pay, and we then use the survey data to randomly select 2000 people as the sample. Our study is designed to find out whether Virtual Try-On (VTO) technology helps increase the sales of lipsticks and improve customers’ purchasing intention, so we place a high value on participants’ potential interest in lipsticks. To include all target audiences who need lipsticks and to avoid bias, we will not oversample groups with higher purchase prevalence, such as women aged between 20 and 40.
We look into recruited customers’ profiles and purchase histories (Table 1) to see whether the growth or decline in sales after the VTO application differs by lipstick color preference. Product preference is also included to shed light on the potential application of VTO to other product categories such as blusher. Age group gives insight into customers’ perception of VTO, i.e., whether a certain age group finds VTO helpful and accurate.
Table 1. Socio-demographic Characteristics of Sample
| Demographic characteristic | Sample |
|---|---|
| N | 2000 |
| Female, % | 85% |
| Other (including male, other, prefer not to tell), % | 15% |
| Average age (years) | 28 |
| Age group, % | |
| Below 20 | 15 |
| 20-30 | 35 |
| 30-40 | 33 |
| 40-50 | 13 |
| 50-60 | 3 |
| Above 60 | 1 |
| Race, % | |
| White | 57.8 |
| Hispanic and Latino | 18.7 |
| African American | 12.1 |
| Asian | 5.9 |
| Native American | 0.7 |
| Hawaiian | 0.2 |
| Other | 4.6 |
| Relationship status, % | |
| Married or engaged | 16 |
| In a relationship | 31 |
| Single (Never married) | 44 |
| Other (Including divorced, separated, and widowed, etc) | 9 |
| Customer profile: purchase history | |
| Lipsticks purchase preference, % (who have bought one group more than others) | |
| Best-seller | 31 |
| Reddish orange | 20 |
| Pink/rosy | 22 |
| Nude | 18 |
| Brunet | 7 |
| Others (have only purchased one lipstick & have bought each group evenly) | 2 |
| Product preference, % | |
| Have only bought lipsticks | 59 |
| Have bought other products | 36 |
| Never buy at YSL’s official site | 5 |
Operational Procedures
1) For research question 1 and research question 2
Participants will be randomly divided into two groups: group A and group B. Group A will be the experimental group, where participants use Virtual Try-On to help them shop for lipsticks, while group B will be the control group, where participants are only shown photos of the lipsticks with their color and lipstick numbers. 20 lipsticks of different shades are involved in the experiment, see Table 2. Specifically, to account for different lipstick textures, we select two of YSL’s most heavily promoted series: VINYL CREAM LIP STAIN and ROUGE VOLUPTÉ SHINE LIPSTICK BALM. Each series, composed of 10 lipsticks, has 5 subcategories. One is the best-seller group, kept separate because we assume that VTO has a negligible effect on customers’ purchasing intention and behavior for these products. The others form the miscellaneous group, divided by lipstick color: Reddish orange, Pink/Rosy, Nude, and Brunet. Each color category has two lipsticks per series so that results do not hinge on a single product within a group.
Table 2. 20 lipsticks in different shades
| Categories | Best-seller Group | Miscellaneous: reddish orange | Miscellaneous: pink/rosy | Miscellaneous: nude | Miscellaneous: brunet |
|---|---|---|---|---|---|
| VINYL CREAM LIP STAIN | 407 | 406 | 405 | 404 | 401 |
| VINYL CREAM LIP STAIN | 416 | 408 | 410 | 434 | 409 |
| ROUGE VOLUPTÉ SHINE LIPSTICK BALM | 80 | 14 | 12 | 44 | 122 |
| ROUGE VOLUPTÉ SHINE LIPSTICK BALM | 83 | 46 | 84 | 150 | 131 |
2) For research question 3
Participants will again first be randomly divided into two groups: group A and group B. Group A will be the experimental group, where participants use Virtual Try-On to help them shop for lipsticks, while group B will be the control group, where participants are only shown photos of the lipsticks with their color and lipstick numbers. The difference from the design for questions 1 and 2 is that we further divide group A and group B into groups a1 and a2, and groups b1 and b2. Groups a1 and b1 will only be shown four lipsticks from the best-seller group, while groups a2 and b2 will only be shown four lipsticks from the miscellaneous group. The dependent variable will be the total number of lipsticks each participant buys. In this way, we can use a two-way ANOVA to test whether the effect of VTO differs between the lipstick style groups.
On October 21st, 2022, our team works with YSL’s sales team to narrow down the population of interest, followed by random sampling of 2000 customers for the study. After sampling, we use a questionnaire survey to gather information and fix the exact experiment day. The preparation stage is designed to take 15 days. Research implementation begins immediately after preparation: once our page designers provide a secure and exclusive interface, the experiment is conducted and finishes by November 25th. Since we have three research questions, analyzed with one t-test and two two-way ANOVA tests, the data analysis lasts 5 days. Starting on November 30th, with all experiments and simulations done, our team spends 5 days integrating the results and writing the report to illustrate, both descriptively and quantitatively, how Virtual Try-On technology affects YSL’s lipstick sales.
Table 3. Timetable
| Procedures | Start date | Duration days | End date |
|---|---|---|---|
| Preparation | 2022.10.21 | 15 days | 2022.11.05 |
| Research Implementation | 2022.11.05 | 20 days | 2022.11.25 |
| Data Analysis | 2022.11.25 | 5 days | 2022.11.30 |
| Experiment Report | 2022.11.30 | 5 days | 2022.12.05 |
The study will be conducted online, and participants can use their smartphones or laptops with cameras to do the Virtual Try-On. Members of each group will be asked whether they are going to buy any lipsticks and, if so, which colors they would like to buy. Apart from these two questions, they will also be surveyed about other aspects of their experience in the experiment, and table 2 will demonstrate the detailed factors we will analyze.
First, we coordinate with YSL to recruit volunteer customers who will take part in our study. In this process, YSL will use its own servers and equipment to store all customer data. As a well-known fashion company, YSL invests heavily in protecting customer data from leaks, which helps ensure data security. Next, researchers should be trained in data security before performing the study. The training should include three parts: (1) raise researchers’ awareness by explaining what data security means; (2) describe scenarios in which researchers might compromise data security; (3) explain the severe consequences of not following the data security requirements. Most importantly, during the study, researchers should be given only the limited permissions needed to access the data they require. They should also use secure methods to transfer data, such as encrypted USB drives or encrypted documents sent by email. After the study is finished, all results involving customer data will also be stored on YSL’s servers.
Outcomes (dependent variables)
Outcomes (dependent variables) are the quantities (or sales) of lipsticks every participant purchases by the end of the study. Researchers will show the 20 lipsticks to participants one at a time, and participants will decide whether to purchase each one. In this way, the study generates a binary outcome (0 represents not buy, 1 represents buy) for every lipstick and every subject. Adding up the binary outcomes over all lipsticks gives the total number of lipsticks every subject purchased. Every lipstick is priced at $40, so sales revenue is easy to calculate from the number of lipsticks purchased.
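The aggregation described above can be sketched in a few lines of R. This is only an illustration: the purchase matrix is randomly generated, the purchase probability of 0.5 is an arbitrary assumption, and the names `purchases`, `quantity` and `revenue` are hypothetical.

```r
# Hypothetical data: 3 participants x 20 lipsticks, 1 = buy, 0 = not buy.
# The purchase probability 0.5 is an arbitrary assumption for illustration.
set.seed(1)
purchases <- matrix(rbinom(3 * 20, size = 1, prob = 0.5), nrow = 3)

price <- 40                      # price per lipstick in dollars, as stated in the text
quantity <- rowSums(purchases)   # number of lipsticks each participant bought
revenue <- quantity * price      # sales per participant in dollars

print(data.frame(quantity = quantity, revenue = revenue))
```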
Yi-En Tseng and 20%, Zijian Li and 20%, Yuxuan Zhang and 60%
Treatments (Independent Variables)
Treatments (Independent Variables) describe whether participants purchase lipsticks using virtual try-on. The VTO group includes those who purchase lipsticks after trying them on virtually, while the non-VTO group includes those who purchase lipsticks after only checking the pictures. We expect customers to purchase more lipsticks with VTO because they are more likely to feel confident that the lipsticks suit them. However, another possibility is that customers purchase fewer lipsticks with VTO because they are less likely to buy impulsively online. We therefore divide participants into a control group (non-VTO) and a treatment group (VTO) to study how VTO affects lipstick sales.
Yalan Liu and 50%, Wenlu Guo and 50%
Other variables
The first other variable is age. Those who are younger than 30 years old are considered young people, while those who are older than 30 years old are considered non-young people.
The second other variable is lipstick style. There are two different types of lipstick—best seller group and miscellaneous group. Best seller group includes 4 lipsticks and miscellaneous group includes 16 lipsticks.
Yalan Liu and 50%, Wenlu Guo and 50%
Question 1
In research question 1, we compare the mean sales of the treatment group (with VTO) with those of the control group (without VTO). The sales from one individual customer are made up of his or her buying choices on the 20 lipsticks. The sales from one group are the sum of the sales of all its individual customers. To analyze the difference in mean sales, we perform a two-sample t-test because our dependent variable is numeric and our independent variable is a categorical variable with two levels (VTO and non-VTO).
We aim to reject the null hypothesis that the mean sales of the treatment group are smaller than or equal to those of the control group. We set the significance level for rejection at 0.05 and the minimum effect size to be considered meaningful at 10% higher mean sales in the treatment group than in the control group.
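As a rough sketch of this test, the R snippet below runs a one-sided two-sample t-test on simulated purchase counts and then checks the 10% threshold. The purchase probabilities (0.50 and 0.60) and all variable names are assumptions made for illustration; `t.test` uses the Welch variant by default, which is one reasonable choice rather than a requirement of the design.

```r
# Simulated quantities out of 20 lipsticks per customer (assumed probabilities)
set.seed(42)
n_per_group <- 1000
control   <- rbinom(n_per_group, size = 20, prob = 0.50)  # without VTO
treatment <- rbinom(n_per_group, size = 20, prob = 0.60)  # with VTO

# One-sided test: H1 is that the treatment mean is greater than the control mean
result <- t.test(treatment, control, alternative = "greater")

# Relative lift in mean sales, compared against the 10% meaningful effect size
relative_lift <- (mean(treatment) - mean(control)) / mean(control)
meaningful <- result$p.value < 0.05 && relative_lift >= 0.10

print(result$p.value)
print(relative_lift)
```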
Question 2
In research question 2, we compare the difference in mean sales of lipsticks between shopping with and without VTO in customers older than 30 with the same difference in customers younger than 30. According to the combination of VTO and age, we divide customers into four groups: group A (young people and VTO), group B (young people and non-VTO), group C (non-young people and VTO), and group D (non-young people and non-VTO). Young people are those below 30 years old, while non-young people are those above 30 years old. To answer the question, we use a two-way ANOVA model. Its F statistic is the ratio of the mean square between groups to the mean square within groups. If this ratio is significantly larger than 1, we can infer that there are significant differences between the four groups. The interaction term between VTO and age group indicates whether the effect of VTO differs by age, and the grouped means show the direction of that difference, which lets us answer the question.
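A minimal sketch of this analysis in R is given below, assuming simulated data in which only young customers with VTO have a higher purchase probability. The probabilities, the data frame `dat` and its column names are hypothetical. Because age groups are sampled, the design is not perfectly balanced, and `aov` reports sequential sums of squares; the interaction term is entered last, so its test is unaffected by the ordering.

```r
set.seed(7)
n <- 2000
dat <- data.frame(
  group     = factor(rep(c("VTO", "Non-VTO"), each = n / 2)),
  age_group = factor(sample(c("Young", "Non_young"), n, replace = TRUE))
)

# Assumed purchase probabilities: VTO only helps young customers here
p <- ifelse(dat$group == "VTO" & dat$age_group == "Young", 0.65, 0.50)
dat$quantity <- rbinom(n, size = 20, prob = p)

# Two-way ANOVA with interaction between treatment and age group
fit <- aov(quantity ~ group * age_group, data = dat)
tab <- summary(fit)[[1]]

# Third row of the table is the group:age_group interaction
interaction_p <- tab[["Pr(>F)"]][3]

# Grouped means for the four cells (groups A to D in the text)
cell_means <- tapply(dat$quantity, list(dat$group, dat$age_group), mean)
print(tab)
print(cell_means)
```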
Question 3
Research question 3 asks whether the difference in mean sales of lipsticks between shopping with and without VTO is higher for miscellaneous lipsticks than for bestseller ones. Since there are two categorical independent variables (with VTO/without VTO and Bestseller/Miscellaneous lipsticks) and a numeric dependent variable (the mean sales of lipsticks), we choose a two-way ANOVA for the analysis. This method allows a separate analysis of the effect of each independent variable on the dependent variable, and it also allows us to test their combined (interaction) effect.
To make the analysis clearer, we recode the variable with VTO/without VTO as VTO/Non-VTO, and recode Bestseller/Miscellaneous (lipsticks) as 1/0 to obtain a binary independent variable for further analysis.
Our sample size is 2000 customers. This size is large enough for us to balance the customer profile difference that may affect customers’ buying behavior and usage of VTO, such as race, age and purchasing experience. Additionally, with this large sample size, we can have a favorable statistical power (over 90%) for identifying the effect.
We assume cosmetic companies can freely choose the sample size with their own online shopping records. Considering the cost involved in conducting the experiment, here we propose the minimum sample size to detect the effect with over 85% power. For the first question, if we want to identify a medium effect (0.5), we need at least 60 customers in each sample. If we want to identify a small effect (0.2), we need at least 360 customers in each sample. For the second and third question, for a medium effect we need at least 20 customers in each group (40 in total), and for small effect we need at least 114 customers in each group (228 in total).
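For the first question, a sample-size calculation of this kind can be sketched with base R's `power.t.test`. The sketch assumes that the effect sizes 0.5 and 0.2 are standardized mean differences (Cohen's d, so `sd = 1`), that the test is one-sided at the 0.05 level, and that the target power is 0.85; these settings are assumptions, not stated in the text. The resulting sizes should be of the same order as the figures quoted above. The ANOVA calculations for the second and third questions are not covered here.

```r
# Per-group sample size for a one-sided two-sample t-test at 85% power.
# delta is the standardized effect size because sd is fixed at 1 (assumption).
n_medium <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.85,
                         type = "two.sample", alternative = "one.sided")$n
n_small  <- power.t.test(delta = 0.2, sd = 1, sig.level = 0.05, power = 0.85,
                         type = "two.sample", alternative = "one.sided")$n

# Round up, since a fraction of a customer cannot be recruited
print(ceiling(c(medium = n_medium, small = n_small)))
```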
Question 1
If we can reject the null hypothesis and detect a meaningful effect of VTO on sales, we will recommend that cosmetic companies introduce VTO. Cosmetic companies can increase their sales by encouraging customers to use VTO before making their buying decisions. However, if we fail to reject the null hypothesis, we will have no evidence that VTO improves sales, and the profitability of this technology should be reconsidered.
Question 2
The result of the two-way ANOVA test shows that, compared to the difference in mean sales of lipsticks between shopping with and without VTO for customers older than 30 years old, the difference is higher for customers younger than 30 years old. This implies that younger customers are affected more by VTO than older customers. The null hypothesis, that the difference in mean sales of lipsticks between shopping with and without VTO is smaller for customers younger than 30 than for customers older than 30, is therefore rejected.
If VTO has a significantly positive effect on customers younger than 30, the finding can strongly influence decision-making from YSL’s marketing department up to management. A useful interpretation is that, when buying lipsticks, younger customers find VTO helpful for judging whether a lipstick suits them, so sales increase accordingly. It is hard to choose which customers should be shown VTO, since we do not have profiles of new customers. However, there are a few key takeaways. Younger customers tend to be more interactive and more accepting of changes in the online shopping process, which leaves room for future marketing applications. Business recommendations can build on this finding. For example, interactive games or pop-up buttons used for marketing may interest younger customers and help the company reach its target impressions for certain content. The result also hints at a difference between online and brick-and-mortar shopping, since the difference in sales is driven by the online shopping behavior of customers of different ages. Customers’ inclination toward online versus brick-and-mortar shopping is worth looking into. Online, younger customers are more likely to buy with VTO, but things may go differently in a mall; for instance, younger customers may not care for paper coupons as much as older customers do. The company needs to know where its sales come from, why they are made, and who makes the purchase through which channel.
However, if the null hypothesis is not rejected, we cannot conclude that VTO affects the sales of younger customers more than those of older customers. Failing to reject the null hypothesis would challenge our perception of customers’ purchasing behavior and would merit more in-depth research on the interaction of factors contributing to this result. To make further recommendations, we would seek to answer (1) whether other confounding factors make younger customers consider VTO unhelpful, and (2) what, instead of VTO, reassures younger customers that a lipstick color suits them.
Question 3
Our null hypothesis was that VTO exerts the same effect on the mean sales of the miscellaneous lipsticks and the bestseller ones. Based on the results of our analysis, we reject it. The two-way ANOVA test gives statistically significant results indicating that the use of VTO has a larger influence on the mean sales of miscellaneous lipsticks. In detail, each independent variable (group and lipstyle) has a p-value of less than 0.05, and the interaction of the two also has a significant p-value of 0.0079. We therefore conclude that the alternative hypothesis H1, that the difference in mean sales of lipsticks between shopping with and without VTO is higher for miscellaneous lipsticks than for bestseller ones, is supported in this scenario.
In addition, we use Tukey’s Honest Significant Difference test to quantify the difference between VTO’s effect on the bestseller group and on the miscellaneous group. The data show that VTO increases the mean sales of bestseller lipsticks by 0.21, while it increases the mean sales of miscellaneous lipsticks by 0.45. The gap between these two improvements is 0.24. By contrast, the difference in mean sales between bestseller and miscellaneous products without VTO is 0.896, as indicated by the comparison Non-VTO:1-Non-VTO:0. These comparisons further support the alternative hypothesis of our research question and thus the rejection of the null hypothesis.
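The pairwise comparisons can be obtained in R roughly as follows. This is a sketch on simulated data: the purchase probabilities are assumptions chosen for illustration and will not reproduce the figures reported above. The column names `group` and `lipstyle` follow the text, with lipstyle coded 1 for bestseller and 0 for miscellaneous.

```r
set.seed(11)
n <- 2000
dat <- data.frame(
  group    = factor(rep(c("VTO", "Non-VTO"), each = n / 2)),
  lipstyle = factor(rep(c("1", "0"), times = n / 2))  # 1 = bestseller, 0 = miscellaneous
)

# Assumed purchase probabilities per cell (illustration only)
p <- with(dat, ifelse(lipstyle == "1",
                      ifelse(group == "VTO", 0.65, 0.60),
                      ifelse(group == "VTO", 0.50, 0.40)))
dat$quantity <- rbinom(n, size = 4, prob = p)  # each subgroup sees four lipsticks

fit   <- aov(quantity ~ group * lipstyle, data = dat)
tukey <- TukeyHSD(fit, which = "group:lipstyle")
pairs <- tukey[["group:lipstyle"]]   # matrix with columns diff, lwr, upr, p adj

# VTO effect within each lipstick style, and the gap between the two effects
effect_misc <- pairs["VTO:0-Non-VTO:0", "diff"]
effect_best <- pairs["VTO:1-Non-VTO:1", "diff"]
gap <- effect_misc - effect_best
print(pairs)
```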

There are two major limitations. Firstly, we only have one dependent variable, quantities (or sales), in our research. This is one-sided because participants may see VTO as a novelty the first time they use it and so purchase more lipsticks than they otherwise would. Also, an organization cannot depend on sales alone to make decisions, since customers’ purchasing behavior can be assessed from multiple angles such as experience, perception, and satisfaction. In the future, we could add more dependent variables to our experiment. Secondly, our research only focuses on lipstick. YSL has many make-up products, such as eye shadow and blush. When customers want to purchase these products online, they could also use VTO. In the future, we could explore the effects of VTO on other make-up products.
Compared to on-line shopping without VTO, is mean sales of lipsticks higher in on-line shopping with VTO?
In this research question, we define no-effect as the situation where the mean sales of lipsticks with VTO is lower or no more than 5% higher than the mean sales without VTO.
We use simulation to generate a dataset that enables us to test the effect. We use the rbinom function in R to generate binary variables that indicate whether a customer buys a lipstick or not. To simulate the situation where there is no significant difference in mean sales between using VTO and not, we set similar buying probabilities in the two groups.
For the customer group that uses VTO, we set the probability of buying a lipstick between 0.4 and 0.8. For the customer group that does not use VTO, we set the probability of buying a lipstick about 5% higher or lower than the probability in the VTO group. The buying probabilities, broken down by use of VTO, age group and product type, are summarized in Table 4.
Table 4. Parameter setting for no-effect scenario simulation
| Treatment | Age group | Best seller | Reddish orange | Pink or rosy | Nude | Brunet |
|---|---|---|---|---|---|---|
| VTO | Below 30 | 0.6-0.8 | 0.6 | 0.5-0.7 | 0.5-0.7 | 0.6-0.7 |
| VTO | Over 30 | 0.6 | 0.4 | 0.4 | 0.4 | 0.4 |
| Non-VTO | Below 30 | 0.6-0.7 | 0.43-0.53 | 0.73-0.83 | 0.43-0.63 | 0.43-0.73 |
| Non-VTO | Over 30 | 0.55 | 0.3-0.5 | 0.3-0.5 | 0.3-0.4 | 0.3-0.5 |
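# We pack the data-generation code into a function
library(data.table)
library(dplyr)
library(forcats)
n <- 2000  # total number of simulated customers
experiment <- function(seedn){
set.seed(seed = seedn)
n <- 2000  # local copy so the function does not depend on the global value
# first half of the customers use VTO, second half do not
group <- c(rep.int(x = 'VTO', times = n/2), rep.int(x = 'Non-VTO', times = n/2))
VTO.dat <- data.table(group = group)
# draw an age bracket for every customer using the proportions in Table 1
VTO.dat[, Age := sample(x = c("Below 20", "20-30", "30-40", "40-50", "50-60", "Above 60"), size = n, replace = TRUE, prob = c(0.15, 0.35, 0.33, 0.13, 0.03, 0.01))]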
VTO.dat <- VTO.dat %>%
mutate(Age_group = fct_recode(.f = Age, "Young_people" = "Below 20", "Young_people" = "20-30", "Non_young_people" = "30-40", "Non_young_people" = "40-50","Non_young_people" = "Above 60", "Non_young_people" = "50-60"))
nVTO_young=nrow(VTO.dat[group=='VTO'&Age_group=='Young_people'])
nNonVTO_young=nrow(VTO.dat[group=='Non-VTO'&Age_group=='Young_people'])
nVTO_old=nrow(VTO.dat[group=='VTO'&Age_group=='Non_young_people'])
nNonVTO_old=nrow(VTO.dat[group=='Non-VTO'&Age_group=='Non_young_people'])
VTO.dat<- data.table(VTO.dat)
#create binary variables indicating whether each customer buys this lipstick (1) or not (0)
#quantity_Best_seller_Group-use_vto
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_407 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_407 := rbinom(n=nVTO_old,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_416 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_416 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_80 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_80 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_83 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_83 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
#quantity_Best_seller_Group-not_use_vto
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_407 := rbinom(n=nNonVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_407 := rbinom(n=nNonVTO_old,size=1,prob=0.55) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_416 := rbinom(n=nNonVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_416 := rbinom(n=nNonVTO_old,size=1,prob=0.55) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_80 := rbinom(n=nNonVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_80 := rbinom(n=nNonVTO_old,size=1,prob=0.55) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_83 := rbinom(n=nNonVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_83 := rbinom(n=nNonVTO_old,size=1,prob=0.55) ]
#add up the best seller
VTO.dat[,quantity_Best_seller_Group := quantity_407+quantity_416 +quantity_80+quantity_83 ]
#__________________________________________________________________
#quantity_reddish_orange
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_406 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_406 := rbinom(n=nVTO_old,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_408 := rbinom(n=nVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_408 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_14 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_14 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_46 := rbinom(n=nVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_46 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_406 := rbinom(n=nNonVTO_young,size=1,prob=0.53) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_406 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_408 := rbinom(n=nNonVTO_young,size=1,prob=0.43) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_408 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_14 := rbinom(n=nNonVTO_young,size=1,prob=0.53) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_14 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_46 := rbinom(n=nNonVTO_young,size=1,prob=0.43) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_46 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[,quantity_reddish_orange := quantity_406+quantity_408 +quantity_14+quantity_46]
#__________________________________________________________________
#quantity_pink_or_rosy
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_405 := rbinom(n=nVTO_young,size=1,prob=0.8) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_405 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_410 := rbinom(n=nVTO_young,size=1,prob=0.8) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_410 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_12 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_12 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_84 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_84 := rbinom(n=nVTO_old,size=1,prob=0.3) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_405 := rbinom(n=nNonVTO_young,size=1,prob=0.83) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_405 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_410 := rbinom(n=nNonVTO_young,size=1,prob=0.83) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_410 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_12 := rbinom(n=nNonVTO_young,size=1,prob=0.73) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_12 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_84 := rbinom(n=nNonVTO_young,size=1,prob=0.83) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_84 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[,quantity_pink_or_rosy := quantity_405+quantity_410 +quantity_12+quantity_84]
#__________________________________________________________________
#quantity_nude
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_404 := rbinom(n=nVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_404 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_434 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_434 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_44 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_44 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_150 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_150 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_404 := rbinom(n=nNonVTO_young,size=1,prob=0.43) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_404 := rbinom(n=nNonVTO_old,size=1,prob=0.2) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_434 := rbinom(n=nNonVTO_young,size=1,prob=0.63) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_434 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_44 := rbinom(n=nNonVTO_young,size=1,prob=0.63) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_44 := rbinom(n=nNonVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_150 := rbinom(n=nNonVTO_young,size=1,prob=0.53) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_150 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[,quantity_nude := quantity_404+quantity_434 +quantity_44+quantity_150]
#__________________________________________________________________
#quantity_brunet
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_401 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_401 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_409 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_409 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_122 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_122 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_131 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_131 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_401 := rbinom(n=nNonVTO_young,size=1,prob=0.43) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_401 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_409 := rbinom(n=nNonVTO_young,size=1,prob=0.73) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_409 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_122 := rbinom(n=nNonVTO_young,size=1,prob=0.43) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_122 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_131 := rbinom(n=nNonVTO_young,size=1,prob=0.63) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_131 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[,quantity_brunet := quantity_401+quantity_409 +quantity_122+quantity_131]
#__________________________________________________________________
#quantity and sales
VTO.dat[,quantity := quantity_Best_seller_Group+quantity_reddish_orange+quantity_pink_or_rosy+quantity_nude+quantity_brunet]
VTO.dat[,sales := quantity*40]
return(VTO.dat)
}
#create analyze.experiment function to perform 2 sample t test with the datasets generated above
analyze.experiment<-function(the.dat){
#t.test
salestest<-t.test(x=the.dat[group == "VTO",sales],y=the.dat[group == "Non-VTO",sales],alternative = "greater")
#collect coefficients and calculate effect size
sales.effect<-salestest$estimate[1]-salestest$estimate[2]
effect.size<-sales.effect/salestest$estimate[2]
lower.bound<-salestest$conf.int[1]
upper.bound<-salestest$conf.int[2]
t<-salestest$statistic
p<-salestest$p.value
result<-data.table(effect=sales.effect,upper_ci=upper.bound,lower_ci=lower.bound,effect.size=effect.size,t=t,p=p)
return(result)
}
#perform the simulation B = 2000 times; each round uses a different seed, so each round draws a new data set
B<-2000 #number of simulation rounds
n<-2000 #number of customers per round
RNGversion(vstr="3.6")
set.seed(seed=198)
exp=1
sim=experiment(exp)
round=rep.int(x=exp,times=n)
round=as.data.table(round)
s=cbind(round,sim)
x <- 2:B
for (exp in x) {
sim=experiment(exp)
round=rep.int(x=exp,times=n)
round=as.data.table(round)
ss=cbind(round,sim)
s=rbind(s,ss)
}
We run the simulation 2000 times to estimate the mean effect. Across the 2000 simulated experiments, the mean effect is 1.728, which suggests that there is no meaningful difference in mean sales between shopping with and without VTO. In addition, the p-value is higher than 0.05 in 91.6% of simulations, so the test rarely reports a false positive.
If our future study based on real customer data yields such results, we should conclude that VTO does not improve our sales revenue in general.
Table 5. Simulation results for no-effect scenario
| Research Question | Scenario | Mean Effect in Simulated Data | 95% Confidence Interval of Mean Effect | Percentage of False Positives | Percentage of True Negatives | Percentage of False Negatives | Percentage of True Positives |
| Question 1 | No Effect | 1.728 | [-26.53, inf] | 0.0315 | 0.9685 |
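The false-positive share in a no-effect scenario can be estimated along the following lines. This is a minimal sketch rather than the code behind Table 5: the helper name false_positive_rate and its default arguments are assumptions, and the design is simplified so that all 20 lipsticks share one purchase probability in both groups.

```r
# Hypothetical helper: share of one-sided t-tests that reject at level alpha
# when both groups have the same true purchase probability (no real effect).
false_positive_rate <- function(n_rounds = 200, n_per_group = 1000,
                                n_products = 20, prob = 0.5,
                                price = 40, alpha = 0.05) {
  p_values <- vapply(seq_len(n_rounds), function(i) {
    set.seed(i)  # one seed per round, as in the simulation loop
    # Sales per customer = price * number of lipsticks bought
    vto     <- price * rbinom(n = n_per_group, size = n_products, prob = prob)
    non_vto <- price * rbinom(n = n_per_group, size = n_products, prob = prob)
    t.test(x = vto, y = non_vto, alternative = "greater")$p.value
  }, numeric(1))
  mean(p_values < alpha)
}

fpr <- false_positive_rate()
```

With identical probabilities in both groups, fpr should stay close to the chosen alpha; one minus this share is the true-negative share.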
# analyze the t-test result using the analyze.experiment function
s.results=s[,analyze.experiment(the.dat=.SD),keyby='round']
s.results[,summary(effect)]
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.84 13.23 16.32 16.38 19.56 31.20
summary(s.results)
round effect upper_ci lower_ci
Min. : 1.0 Min. : 1.84 Min. :Inf Min. :-5.861
1st Qu.: 500.8 1st Qu.:13.23 1st Qu.:Inf 1st Qu.: 5.524
Median :1000.5 Median :16.32 Median :Inf Median : 8.594
Mean :1000.5 Mean :16.38 Mean :Inf Mean : 8.640
3rd Qu.:1500.2 3rd Qu.:19.56 3rd Qu.:Inf 3rd Qu.:11.826
Max. :2000.0 Max. :31.20 Max. :Inf Max. :23.542
effect.size t p
Min. :0.004412 Min. :0.3932 Min. :0.0000000
1st Qu.:0.032443 1st Qu.:2.8200 1st Qu.:0.0000167
Median :0.040302 Median :3.4737 Median :0.0002624
Mean :0.040458 Mean :3.4846 Mean :0.0064424
3rd Qu.:0.048501 3rd Qu.:4.1589 3rd Qu.:0.0024256
Max. :0.078305 Max. :6.7050 Max. :0.3471123
s.results[,mean(p<0.05)] #the power of experiment; 1-power = type2 error
[1] 0.9685
s.results[,mean(effect.size>0.1)]
[1] 0
In this research question, we define a meaningful effect as mean lipstick sales that are at least 10% higher with VTO than without VTO.
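The 10% criterion can be written as a relative effect: the difference in mean sales divided by the mean sales of the non-VTO group, which is how effect.size is computed in analyze.experiment. The sketch below illustrates this with made-up sales figures; the helper names relative_effect and is_meaningful are assumptions introduced for illustration.

```r
# Relative effect: difference in mean sales as a share of the non-VTO mean
relative_effect <- function(sales_vto, sales_non_vto) {
  (mean(sales_vto) - mean(sales_non_vto)) / mean(sales_non_vto)
}

# Hypothetical threshold check: is the relative lift above the 10% mark?
is_meaningful <- function(sales_vto, sales_non_vto, threshold = 0.1) {
  relative_effect(sales_vto, sales_non_vto) > threshold
}

# Toy sales figures (made up for illustration only)
toy_vto     <- c(480, 480, 400)
toy_non_vto <- c(400, 400, 400)
lift        <- relative_effect(toy_vto, toy_non_vto)
meaningful  <- is_meaningful(toy_vto, toy_non_vto)
```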
We use simulation to generate a data set that lets us test the effect. We use the rbinom function in R to generate binary variables that indicate whether a customer buys a lipstick or not. For customers who use VTO and are younger than 30, we set the probability of buying best-seller products to 0.8 and miscellaneous products to 0.7. For customers who use VTO and are older than 30, we set the probability of buying best-seller products to 0.6 and miscellaneous products to 0.4. For customers who do not use VTO and are younger than 30, we set the probability of buying best-seller products between 0.78 and 0.81 and miscellaneous products between 0.63 and 0.73. For customers who do not use VTO and are older than 30, we set the probability of buying best-seller products between 0.59 and 0.61 and miscellaneous products between 0.34 and 0.43. Table 6 summarizes the buying probability for each group by use of VTO, age group and product type.
Table 6. Parameter setting for an effected scenario simulation
| Group | Age group | Best seller | Reddish orange | Pink or rosy | Nude | Brunet |
| VTO | Below 30 | 0.8 | 0.7 | 0.7 | 0.7 | 0.7 |
| VTO | Over 30 | 0.6 | 0.4 | 0.4 | 0.4 | 0.4 |
| Non-VTO | Below 30 | 0.78-0.81 | 0.63-0.73 | 0.7 | 0.63-0.73 | 0.68-0.72 |
| Non-VTO | Over 30 | 0.59-0.61 | 0.34-0.43 | 0.39-0.41 | | |
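A parameter table like Table 6 can also drive the data generation directly, which avoids one rbinom line per lipstick. The sketch below is an illustrative alternative, not the function used in this paper: the helper simulate_sales and the probs table are assumptions, only the VTO rows of Table 6 are encoded, and each product type is assumed to contain four lipsticks that share one purchase probability.

```r
library(data.table)

# Purchase probabilities for the VTO rows of Table 6, one per product type
probs <- data.table(
  Age_group    = rep(c("Young_people", "Non_young_people"), each = 5),
  product_type = rep(c("best_seller", "reddish_orange", "pink_or_rosy",
                       "nude", "brunet"), times = 2),
  prob         = c(0.8, 0.7, 0.7, 0.7, 0.7,
                   0.6, 0.4, 0.4, 0.4, 0.4)
)

# Hypothetical helper: for each customer, draw how many of the lipsticks in
# each product type are bought, then convert the total quantity to sales.
simulate_sales <- function(age_groups, probs, per_type = 4, price = 40) {
  customers <- data.table(id = seq_along(age_groups), Age_group = age_groups)
  # One row per customer and product type
  long <- merge(customers, probs, by = "Age_group", allow.cartesian = TRUE)
  long[, quantity := rbinom(n = .N, size = per_type, prob = prob)]
  long[, .(sales = price * sum(quantity)), keyby = id]
}

set.seed(1)
ages <- sample(c("Young_people", "Non_young_people"), size = 500, replace = TRUE)
vto_sales <- simulate_sales(ages, probs)
```

Keeping the probabilities in one table makes it easier to compare the simulated settings with the table reported in the text.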
#We pack the data-generation code into a function
library(data.table)
library(dplyr) #provides %>% and mutate
library(forcats) #provides fct_recode
experiment<-function(seedn){
set.seed(seed=seedn)
n<-2000 #number of customers, split equally between the two groups
group<-c(rep.int(x='VTO',times=n/2),rep.int(x='Non-VTO',times=n/2))
VTO.dat<- data.table(group=group)
VTO.dat[,Age := sample(x=c("Below 20" ,"20-30" , "30-40" ,"40-50" , "50-60" ,"Above 60"), size = n, replace =TRUE,prob=c(0.15,0.35, 0.33, 0.13,0.03, 0.01))]
VTO.dat <- VTO.dat %>%
mutate(Age_group = fct_recode(.f = Age, "Young_people" = "Below 20", "Young_people" = "20-30", "Non_young_people" = "30-40", "Non_young_people" = "40-50","Non_young_people" = "Above 60", "Non_young_people" = "50-60"))
nVTO_young=nrow(VTO.dat[group=='VTO'&Age_group=='Young_people'])
nNonVTO_young=nrow(VTO.dat[group=='Non-VTO'&Age_group=='Young_people'])
nVTO_old=nrow(VTO.dat[group=='VTO'&Age_group=='Non_young_people'])
nNonVTO_old=nrow(VTO.dat[group=='Non-VTO'&Age_group=='Non_young_people'])
VTO.dat<- data.table(VTO.dat)
#create binary variables that indicate whether each customer buys a given lipstick
#quantity_Best_seller_Group-use_vto
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_407 := rbinom(n=nVTO_young,size=1,prob=0.9) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_407 := rbinom(n=nVTO_old,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_416 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_416 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_80 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_80 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_83 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_83 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
#quantity_Best_seller_Group-not_use_vto
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_407 := rbinom(n=nNonVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_407 := rbinom(n=nNonVTO_old,size=1,prob=0.55) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_416 := rbinom(n=nNonVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_416 := rbinom(n=nNonVTO_old,size=1,prob=0.55) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_80 := rbinom(n=nNonVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_80 := rbinom(n=nNonVTO_old,size=1,prob=0.55) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_83 := rbinom(n=nNonVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_83 := rbinom(n=nNonVTO_old,size=1,prob=0.55) ]
#add up the best seller
VTO.dat[,quantity_Best_seller_Group := quantity_407+quantity_416 +quantity_80+quantity_83 ]
#__________________________________________________________________
#quantity_reddish_orange
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_406 := rbinom(n=nVTO_young,size=1,prob=0.9) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_406 := rbinom(n=nVTO_old,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_408 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_408 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_14 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_14 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_46 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_46 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_406 := rbinom(n=nNonVTO_young,size=1,prob=0.53) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_406 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_408 := rbinom(n=nNonVTO_young,size=1,prob=0.43) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_408 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_14 := rbinom(n=nNonVTO_young,size=1,prob=0.53) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_14 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_46 := rbinom(n=nNonVTO_young,size=1,prob=0.43) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_46 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[,quantity_reddish_orange := quantity_406+quantity_408 +quantity_14+quantity_46]
#__________________________________________________________________
#quantity_pink_or_rosy
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_405 := rbinom(n=nVTO_young,size=1,prob=0.9) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_405 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_410 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_410 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_12 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_12 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_84 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_84 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_405 := rbinom(n=nNonVTO_young,size=1,prob=0.83) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_405 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_410 := rbinom(n=nNonVTO_young,size=1,prob=0.83) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_410 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_12 := rbinom(n=nNonVTO_young,size=1,prob=0.73) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_12 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_84 := rbinom(n=nNonVTO_young,size=1,prob=0.83) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_84 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[,quantity_pink_or_rosy := quantity_405+quantity_410 +quantity_12+quantity_84]
#__________________________________________________________________
#quantity_nude
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_404 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_404 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_434 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_434 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_44 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_44 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_150 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_150 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_404 := rbinom(n=nNonVTO_young,size=1,prob=0.43) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_404 := rbinom(n=nNonVTO_old,size=1,prob=0.2) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_434 := rbinom(n=nNonVTO_young,size=1,prob=0.63) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_434 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_44 := rbinom(n=nNonVTO_young,size=1,prob=0.63) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_44 := rbinom(n=nNonVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_150 := rbinom(n=nNonVTO_young,size=1,prob=0.53) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_150 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[,quantity_nude := quantity_404+quantity_434 +quantity_44+quantity_150]
#__________________________________________________________________
#quantity_brunet
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_401 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_401 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_409 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_409 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_122 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_122 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_131 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_131 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_401 := rbinom(n=nNonVTO_young,size=1,prob=0.43) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_401 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_409 := rbinom(n=nNonVTO_young,size=1,prob=0.73) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_409 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_122 := rbinom(n=nNonVTO_young,size=1,prob=0.43) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_122 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_131 := rbinom(n=nNonVTO_young,size=1,prob=0.63) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_131 := rbinom(n=nNonVTO_old,size=1,prob=0.3) ]
VTO.dat[,quantity_brunet := quantity_401+quantity_409 +quantity_122+quantity_131]
#__________________________________________________________________
#quantity and sales
VTO.dat[,quantity := quantity_Best_seller_Group+quantity_reddish_orange+quantity_pink_or_rosy+quantity_nude+quantity_brunet]
VTO.dat[,sales := quantity*40]
return(VTO.dat)
}
#create analyze.experiment function to perform 2 sample t test with the datasets generated above
analyze.experiment<-function(the.dat){
#t.test
salestest<-t.test(x=the.dat[group == "VTO",sales],y=the.dat[group == "Non-VTO",sales],alternative = "greater")
#collect coefficients and calculate effect size
sales.effect<-salestest$estimate[1]-salestest$estimate[2]
effect.size<-sales.effect/salestest$estimate[2]
lower.bound<-salestest$conf.int[1]
upper.bound<-salestest$conf.int[2]
t<-salestest$statistic
p<-salestest$p.value
result<-data.table(effect=sales.effect,upper_ci=upper.bound,lower_ci=lower.bound,effect.size=effect.size,t=t,p=p)
return(result)
}
#perform the simulation B = 2000 times; each round uses a different seed, so each round draws a new data set
B<-2000 #number of simulation rounds
n<-2000 #number of customers per round
RNGversion(vstr="3.6")
set.seed(seed=198)
exp=1
sim=experiment(exp)
round=rep.int(x=exp,times=n)
round=as.data.table(round)
s=cbind(round,sim)
x <- 2:B
for (exp in x) {
sim=experiment(exp)
round=rep.int(x=exp,times=n)
round=as.data.table(round)
ss=cbind(round,sim)
s=rbind(s,ss)
}
Across the 2000 simulated experiments, the mean effect (the difference in mean sales) is 20.27. This effect amounts to 12.85% of the mean sales of the non-VTO group, which suggests a significant and meaningful increase in sales after using VTO. In addition, the p-value is lower than 0.05 in 99.1% of simulations, so the experiment has strong power.
If our future study based on real customer data yields such results, we should conclude that VTO does improve our sales revenue by at least 10% in general.
Table 7. Simulation results for an effected scenario
| Research Question | Scenario | Mean Effect in Simulated Data | 95% Confidence Interval of Mean Effect | Percentage of False Positives | Percentage of True Negatives | Percentage of False Negatives | Percentage of True Positives |
| Question 1 | Effect: (Expected Size) | 20.27 | [-6.13, inf] | 0.0475 | 0.984 |
# analyze the t-test result using the analyze.experiment function
s.results=s[,analyze.experiment(the.dat=.SD),keyby='round']
s.results[,summary(effect)]
Min. 1st Qu. Median Mean 3rd Qu. Max.
35.64 49.00 52.32 52.32 55.68 70.32
summary(s.results)
round effect upper_ci lower_ci
Min. : 1.0 Min. :35.64 Min. :Inf Min. :27.24
1st Qu.: 500.8 1st Qu.:49.00 1st Qu.:Inf 1st Qu.:40.57
Median :1000.5 Median :52.32 Median :Inf Median :43.87
Mean :1000.5 Mean :52.32 Mean :Inf Mean :43.87
3rd Qu.:1500.2 3rd Qu.:55.68 3rd Qu.:Inf 3rd Qu.:47.20
Max. :2000.0 Max. :70.32 Max. :Inf Max. :61.97
effect.size t p
Min. :0.0867 Min. : 6.983 Min. :0.000e+00
1st Qu.:0.1195 1st Qu.: 9.524 1st Qu.:0.000e+00
Median :0.1285 Median :10.182 Median :0.000e+00
Mean :0.1285 Mean :10.193 Mean :1.420e-15
3rd Qu.:0.1371 3rd Qu.:10.844 3rd Qu.:0.000e+00
Max. :0.1743 Max. :13.861 Max. :1.965e-12
s.results[,mean(p<0.05)] #the power of experiment; 1-power = type2 error
[1] 1
s.results[,mean(effect.size>0.1)]
[1] 0.984
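Each share above is estimated from a finite number of simulation rounds, so it carries some Monte Carlo error. The sketch below shows one way to quantify that error, assuming the rounds are independent; the helper name mc_uncertainty is hypothetical, and the inputs are the 0.984 share and the 2000 rounds reported above.

```r
# Hypothetical helper: standard error and exact 95% interval for a share
# that was estimated from n_rounds independent simulation rounds.
mc_uncertainty <- function(share, n_rounds) {
  hits <- round(share * n_rounds)  # number of rounds that met the criterion
  ci <- binom.test(x = hits, n = n_rounds)$conf.int
  list(se    = sqrt(share * (1 - share) / n_rounds),
       lower = ci[1],
       upper = ci[2])
}

u <- mc_uncertainty(share = 0.984, n_rounds = 2000)
```

A narrow interval indicates that running more rounds would change the reported share only slightly.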
Compared to the difference in mean sales of lipsticks between shopping with and without VTO in customers older than 30, is the difference higher in customers younger than 30?
In this research question, we define no effect as the situation where the difference in mean sales of lipsticks between shopping with and without VTO for customers younger than 30 is lower than or equal to the same difference for customers older than 30.
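The quantity compared here is a difference of differences: the VTO lift among younger customers minus the VTO lift among older customers. The sketch below computes it on toy data and runs the ANOVA interaction test and Tukey's HSD on the same data. It is an illustration under stated assumptions, not the analysis code of this paper: the helper lift_gap, the toy data frame, and the single purchase probability of 0.5 for every cell are all made up.

```r
# Hypothetical helper: VTO lift for younger customers minus VTO lift for older ones
lift_gap <- function(dat) {
  m <- tapply(dat$sales, list(dat$group, dat$Age_group), mean)
  (m["VTO", "Young_people"] - m["Non-VTO", "Young_people"]) -
    (m["VTO", "Non_young_people"] - m["Non-VTO", "Non_young_people"])
}

set.seed(1)
# Balanced toy design: 250 customers in each of the four cells
toy <- expand.grid(id = 1:250,
                   group = c("VTO", "Non-VTO"),
                   Age_group = c("Young_people", "Non_young_people"),
                   stringsAsFactors = TRUE)
# Same purchase probability in every cell, so the true gap is zero
toy$sales <- 40 * rbinom(n = nrow(toy), size = 20, prob = 0.5)

gap <- lift_gap(toy)

# Two-way ANOVA; the interaction term tests whether the lift differs by age group
fit <- aov(sales ~ group * Age_group, data = toy)
interaction_p <- summary(fit)[[1]][["Pr(>F)"]][3]

# Tukey's HSD for all pairwise comparisons among the four cells
pairwise <- TukeyHSD(fit, which = "group:Age_group")
```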
We use simulation to generate a dataset that lets us test the effect. We use the rbinom function in R to generate binary variables that indicate whether a customer buys a lipstick or not. To simulate the situation where all four groups have the same sales, we set similar buying probabilities in the four groups.
For customers who use VTO and are younger than 30, we set the probability of buying lipsticks between 0.4 and 0.8. For customers who use VTO and are older than 30, we set the probability of buying lipsticks between 0.4 and 0.9. For customers who do not use VTO, in either age group, we set the probability of buying lipsticks about 5% higher or lower than the probability for the VTO group of the same age. Table 8 summarizes the buying probability for each group by use of VTO, age group and product type.
Table 8. Parameter setting for no-effect scenario simulation
| Group | Age group | Best seller | Reddish orange | Pink or rosy | Nude | Brunet |
| VTO | Below 30 | 0.6-0.7 | 0.4-0.5 | 0.4-0.6 | 0.5-0.7 | 0.4-0.55 |
| VTO | Over 30 | 0.5-0.7 | 0.5 | 0.5-0.6 | 0.5-0.6 | 0.4-0.5 |
| Non-VTO | Below 30 | 0.4-0.6 | 0.5 | 0.3-0.5 | 0.4-0.5 | 0.4-0.5 |
| Non-VTO | Over 30 | 0.4-0.5 | 0.4-0.5 | 0.5 | 0.5-0.6 | 0.4-0.5 |
n<-2000 #number of customers, split equally between the two groups
library(data.table)
library(dplyr) #provides %>% and mutate
library(forcats) #provides fct_recode
experiment<-function(seedn){
set.seed(seed=seedn)
group<-c(rep.int(x='VTO',times=n/2),rep.int(x='Non-VTO',times=n/2))
VTO.dat<- data.table(group=group)
VTO.dat[,Age := sample(x=c("Below 20" ,"20-30" , "30-40" ,"40-50" , "50-60" ,"Above 60"), size = n, replace =TRUE,prob=c(0.15,0.35, 0.33, 0.13,0.03, 0.01))]
VTO.dat <- VTO.dat %>%
mutate(Age_group = fct_recode(.f = Age, "Young_people" = "Below 20", "Young_people" = "20-30", "Non_young_people" = "30-40", "Non_young_people" = "40-50","Non_young_people" = "Above 60", "Non_young_people" = "50-60"))
nVTO_young=nrow(VTO.dat[group=='VTO'&Age_group=='Young_people'])
nNonVTO_young=nrow(VTO.dat[group=='Non-VTO'&Age_group=='Young_people'])
nVTO_old=nrow(VTO.dat[group=='VTO'&Age_group=='Non_young_people'])
nNonVTO_old=nrow(VTO.dat[group=='Non-VTO'&Age_group=='Non_young_people'])
VTO.dat<- data.table(VTO.dat)
#quantity_Best_seller_Group-use_vto
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_407 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_407 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_416 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_416 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_80 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_80 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_83 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_83 := rbinom(n=nVTO_old,size=1,prob=0.7) ]
#quantity_Best_seller_Group-not_use_vto
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_407 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_407 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_416 := rbinom(n=nNonVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_416 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_80 := rbinom(n=nNonVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_80 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_83 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_83 := rbinom(n=nNonVTO_old,size=1,prob=0.4) ]
#add up the best seller
VTO.dat[,quantity_Best_seller_Group := quantity_407+quantity_416 +quantity_80+quantity_83 ]
#__________________________________________________________________
#quantity_reddish_orange
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_406 := rbinom(n=nVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_406 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_408 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_408 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_14 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_14 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_46 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_46 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_406 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_406 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_408 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_408 := rbinom(n=nNonVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_14 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_14 := rbinom(n=nNonVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_46 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_46 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[,quantity_reddish_orange := quantity_406+quantity_408 +quantity_14+quantity_46]
#__________________________________________________________________
#quantity_pink_or_rosy
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_405 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_405 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_410 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_410 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_12 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_12 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_84 := rbinom(n=nVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_84 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_405 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_405 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_410 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_410 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_12 := rbinom(n=nNonVTO_young,size=1,prob=0.3) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_12 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_84 := rbinom(n=nNonVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_84 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[,quantity_pink_or_rosy := quantity_405+quantity_410 +quantity_12+quantity_84]
#__________________________________________________________________
#quantity_nude
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_404 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_404 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_434 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_434 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_44 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_44 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_150 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_150 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_404 := rbinom(n=nNonVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_404 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_434 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_434 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_44 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_44 := rbinom(n=nNonVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_150 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_150 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[,quantity_nude := quantity_404+quantity_434 +quantity_44+quantity_150]
#__________________________________________________________________
#quantity_brunet
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_401 := rbinom(n=nVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_401 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_409 := rbinom(n=nVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_409 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_122 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_122 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_131 := rbinom(n=nVTO_young,size=1,prob=0.55) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_131 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_401 := rbinom(n=nNonVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_401 := rbinom(n=nNonVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_409 := rbinom(n=nNonVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_409 := rbinom(n=nNonVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_122 := rbinom(n=nNonVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_122 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_131 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_131 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[,quantity_brunet := quantity_401+quantity_409 +quantity_122+quantity_131]
#__________________________________________________________________
#quantity and sales
VTO.dat[,quantity := quantity_Best_seller_Group+quantity_reddish_orange+quantity_pink_or_rosy+quantity_nude+quantity_brunet]
VTO.dat[,sales := quantity*40]
return(VTO.dat)
}
analyze.experiment<-function(data3){
model1 <- lm(sales~group+Age_group+group*Age_group,data=data3)
anova<-anova(model1)
interactionv2<-aov(sales~group+Age_group+group*Age_group,data=data3)
Tukey<-TukeyHSD(interactionv2, ordered=FALSE, conf.level=.95)
p_group_Age_group<-anova$`Pr(>F)`[3] # the p value of anova
Tukey_p_higher_than_0.05<-sum(Tukey$`group:Age_group`[,4]>0.05) #count the number of p value over 0.05 ( for Tukey,all 6 combo considered)
VTO_young_Non_VTO_young<-Tukey$`group:Age_group`[1,1] #row 1: VTO minus Non-VTO among young customers
VTO_old_Non_VTO_old<-Tukey$`group:Age_group`[6,1] #row 6: VTO minus Non-VTO among non-young customers
diff<-VTO_young_Non_VTO_young-VTO_old_Non_VTO_old
result<-data.table(p_group_Age_group=p_group_Age_group,
Tukey_p_higher_than_0.05=Tukey_p_higher_than_0.05,
VTO_young_Non_VTO_young=VTO_young_Non_VTO_young,
VTO_old_Non_VTO_old=VTO_old_Non_VTO_old,
diff=diff)
return(result)
}
#run the simulation 2000 times
n<-2000
RNGversion(vstr=3.6)
set.seed(seed=198)
#simulate each round with its own seed, tag it with the round number and stack all rounds
s<-rbindlist(lapply(X=1:n,FUN=function(i){data.table(round=i,experiment(i))}))
Across 2000 simulated experiments, the mean difference in sales between the young group using VTO and the young group without VTO is 30.06, and the mean difference between the old group using VTO and the old group without VTO is 31.75. The mean effect is -1.69, which suggests that VTO does not increase sales meaningfully more for young customers than for old customers. We obtain a p-value below 0.05 in 6.4% of simulations, so the false-positive rate stays close to the nominal 5% level.
Table 9. Simulation results for no-effect scenario
| Research Question | Scenario | Mean Effect in Simulated Data ** | Range of Effect in Simulated Data ** | Percentage of False Positives | Percentage of True Negatives | Percentage of False Negatives | Percentage of True Positives |
|---|---|---|---|---|---|---|---|
| Question 2 | No Effect | | [-29.312, 22.877] | 0.064 | 0.936 | | |
s.results=s[,analyze.experiment(data3=.SD),keyby='round']
summary(s.results)
round p_group_Age_group Tukey_p_higher_than_0.05
Min. : 1.0 Min. :0.001451 Min. :0.000
1st Qu.: 500.8 1st Qu.:0.216219 1st Qu.:1.000
Median :1000.5 Median :0.473612 Median :2.000
Mean :1000.5 Mean :0.479631 Mean :1.426
3rd Qu.:1500.2 3rd Qu.:0.729629 3rd Qu.:2.000
Max. :2000.0 Max. :0.999925 Max. :2.000
VTO_young_Non_VTO_young VTO_old_Non_VTO_old diff
Min. :23.73 Min. :22.55 Min. :-24.036
1st Qu.:38.19 1st Qu.:35.73 1st Qu.: -2.856
Median :42.24 Median :39.64 Median : 2.606
Mean :42.10 Mean :39.71 Mean : 2.390
3rd Qu.:45.90 3rd Qu.:43.64 3rd Qu.: 7.887
Max. :57.64 Max. :58.59 Max. : 25.294
s.results[,mean(p_group_Age_group<0.05)]
[1] 0.064
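The share of simulations with a p-value below 0.05 can be illustrated with a much smaller, self-contained sketch. This is not the simulation used in this paper: the helper `simulate_p()`, the single buying probability of 0.5 for all four groups, the 100 customers per cell and the 200 rounds are all assumptions chosen so that the example runs quickly in base R.

```r
# Hypothetical illustration: false-positive rate of the interaction test
# when the true interaction is zero (all groups share one buying probability).
simulate_p <- function(seed, n_per_cell = 100, p_buy = 0.5) {
  set.seed(seed)
  # balanced 2 x 2 design: treatment by age group
  dat <- expand.grid(id = seq_len(n_per_cell),
                     group = c("Non-VTO", "VTO"),
                     Age_group = c("Young_people", "Non_young_people"),
                     stringsAsFactors = TRUE)
  # each customer decides on 20 lipsticks; sales = quantity * 40
  dat$sales <- rbinom(n = nrow(dat), size = 20, prob = p_buy) * 40
  fit <- lm(sales ~ group * Age_group, data = dat)
  # third row of the ANOVA table is the group:Age_group interaction
  anova(fit)[["Pr(>F)"]][3]
}

p_values <- vapply(1:200, simulate_p, numeric(1))
false_positive_rate <- mean(p_values < 0.05)
true_negative_rate <- 1 - false_positive_rate
print(c(false_positive = false_positive_rate,
        true_negative = true_negative_rate))
```

Because no interaction is built into the data, every rejection in this sketch is a false positive, and the rate should stay near the 5% significance level.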
In this research question, we define an expected effect as the situation where the difference in mean sales of lipsticks between shopping with and without VTO is more than 10% higher for customers younger than 30 than for customers older than 30.
We use simulation to generate a dataset that lets us test for this effect. We use the rbinom function in R to generate binary variables that indicate whether a customer buys each lipstick. To simulate a situation in which VTO raises sales more for younger customers than for older customers, we set different buying probabilities for the four groups.
For customers who use VTO and are below 30, we set the probability of buying each lipstick between 0.4 and 0.8. For customers who use VTO and are over 30, we set it between 0.4 and 0.9. For customers who do not use VTO, whether below or over 30, we set the probability around 5% higher or lower than that of the VTO group of the same age. The buying probabilities are summarized by use of VTO, age group and product type in Table 10.
Table 10. Parameter setting for an effect scenario simulation
| Treatment | Age group | Best seller | Reddish orange | Pink or rosy | Nude | Brunet |
|---|---|---|---|---|---|---|
| VTO | Below 30 | 0.6-0.8 | 0.4-0.7 | 0.5-0.6 | 0.5-0.7 | 0.6-0.8 |
| VTO | Over 30 | 0.5-0.7 | 0.5 | 0.5-0.9 | 0.4-0.6 | 0.5-0.8 |
| Non-VTO | Below 30 | 0.4-0.6 | 0.5 | 0.3-0.5 | 0.4-0.5 | 0.4-0.7 |
| Non-VTO | Over 30 | 0.4-0.5 | 0.4-0.5 | 0.5 | 0.5-0.8 | 0.4-0.5 |
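As a rough check on these settings, the expected effect can be worked out directly from the buying probabilities, without any simulation. The sketch below copies the per-lipstick probabilities from the simulation code of this scenario and uses the price of 40 per lipstick from that code; the names `probs`, `expected_sales` and `expected_diff` are ours and appear only in this illustration.

```r
# Buying probabilities per lipstick, copied from the effect-scenario code
# (order: best seller, reddish orange, pink or rosy, nude, brunet; 4 shades each).
probs <- list(
  VTO_young    = c(0.6, 0.8, 0.6, 0.6,  0.4, 0.7, 0.4, 0.7,  0.5, 0.5, 0.6, 0.5,
                   0.7, 0.5, 0.5, 0.7,  0.6, 0.7, 0.8, 0.6),
  VTO_old      = c(0.5, 0.5, 0.6, 0.7,  0.5, 0.5, 0.5, 0.5,  0.5, 0.5, 0.9, 0.6,
                   0.4, 0.5, 0.6, 0.6,  0.8, 0.8, 0.6, 0.5),
  NonVTO_young = c(0.5, 0.6, 0.4, 0.5,  0.5, 0.5, 0.5, 0.5,  0.5, 0.5, 0.3, 0.4,
                   0.4, 0.5, 0.5, 0.5,  0.4, 0.4, 0.7, 0.5),
  NonVTO_old   = c(0.5, 0.5, 0.5, 0.4,  0.5, 0.4, 0.4, 0.5,  0.5, 0.5, 0.5, 0.5,
                   0.5, 0.5, 0.8, 0.5,  0.4, 0.4, 0.5, 0.5)
)

price <- 40
# expected sales per customer = price * expected number of lipsticks bought
expected_sales <- sapply(probs, function(p) price * sum(p))

effect_young <- expected_sales[["VTO_young"]] - expected_sales[["NonVTO_young"]]
effect_old   <- expected_sales[["VTO_old"]] - expected_sales[["NonVTO_old"]]
expected_diff <- effect_young - effect_old

print(round(c(effect_young = effect_young,
              effect_old = effect_old,
              diff = expected_diff), 2))
```

Under these inputs the expected VTO effect is 96 for customers below 30 and 72 for customers over 30, so the expected difference between the two effects is 24.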
#We pack the data-generation code into a function
n<-2000
library(data.table)
library(dplyr) #provides %>% and mutate
library(forcats) #provides fct_recode
experiment<-function(seedn){
set.seed(seed=seedn)
#first half of the customers is the treatment group, second half the control group
group<-c(rep.int(x='VTO',times=n/2),rep.int(x='Non-VTO',times=n/2))
VTO.dat<- data.table(group=group)
#draw an age band for every customer with the given probabilities
VTO.dat[,Age := sample(x=c("Below 20" ,"20-30" , "30-40" ,"40-50" , "50-60" ,"Above 60"), size = n, replace =TRUE,prob=c(0.15,0.35, 0.33, 0.13,0.03, 0.01))]
#collapse the age bands into two groups: below 30 and 30 or above
VTO.dat <- VTO.dat %>%
mutate(Age_group = fct_recode(.f = Age, "Young_people" = "Below 20", "Young_people" = "20-30", "Non_young_people" = "30-40", "Non_young_people" = "40-50","Non_young_people" = "Above 60", "Non_young_people" = "50-60"))
#make sure we have a data.table again before using data.table syntax
VTO.dat<- data.table(VTO.dat)
#count the customers in each treatment-by-age cell
nVTO_young=nrow(VTO.dat[group=='VTO'&Age_group=='Young_people'])
nNonVTO_young=nrow(VTO.dat[group=='Non-VTO'&Age_group=='Young_people'])
nVTO_old=nrow(VTO.dat[group=='VTO'&Age_group=='Non_young_people'])
nNonVTO_old=nrow(VTO.dat[group=='Non-VTO'&Age_group=='Non_young_people'])
#quantity_Best_seller_Group-use_vto
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_407 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_407 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_416 := rbinom(n=nVTO_young,size=1,prob=0.8) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_416 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_80 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_80 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_83 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_83 := rbinom(n=nVTO_old,size=1,prob=0.7) ]
#quantity_Best_seller_Group-not_use_vto
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_407 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_407 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_416 := rbinom(n=nNonVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_416 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_80 := rbinom(n=nNonVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_80 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_83 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_83 := rbinom(n=nNonVTO_old,size=1,prob=0.4) ]
#add up the best seller
VTO.dat[,quantity_Best_seller_Group := quantity_407+quantity_416 +quantity_80+quantity_83 ]
#__________________________________________________________________
#quantity_reddish_orange
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_406 := rbinom(n=nVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_406 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_408 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_408 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_14 := rbinom(n=nVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_14 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_46 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_46 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_406 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_406 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_408 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_408 := rbinom(n=nNonVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_14 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_14 := rbinom(n=nNonVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_46 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_46 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[,quantity_reddish_orange := quantity_406+quantity_408 +quantity_14+quantity_46]
#__________________________________________________________________
#quantity_pink_or_rosy
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_405 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_405 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_410 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_410 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_12 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_12 := rbinom(n=nVTO_old,size=1,prob=0.9) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_84 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_84 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_405 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_405 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_410 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_410 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_12 := rbinom(n=nNonVTO_young,size=1,prob=0.3) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_12 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_84 := rbinom(n=nNonVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_84 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[,quantity_pink_or_rosy := quantity_405+quantity_410 +quantity_12+quantity_84]
#__________________________________________________________________
#quantity_nude
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_404 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_404 := rbinom(n=nVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_434 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_434 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_44 := rbinom(n=nVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_44 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_150 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_150 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_404 := rbinom(n=nNonVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_404 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_434 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_434 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_44 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_44 := rbinom(n=nNonVTO_old,size=1,prob=0.8) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_150 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_150 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[,quantity_nude := quantity_404+quantity_434 +quantity_44+quantity_150]
#__________________________________________________________________
#quantity_brunet
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_401 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_401 := rbinom(n=nVTO_old,size=1,prob=0.8) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_409 := rbinom(n=nVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_409 := rbinom(n=nVTO_old,size=1,prob=0.8) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_122 := rbinom(n=nVTO_young,size=1,prob=0.8) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_122 := rbinom(n=nVTO_old,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Young_people',quantity_131 := rbinom(n=nVTO_young,size=1,prob=0.6) ]
VTO.dat[group == "VTO"&Age_group=='Non_young_people',quantity_131 := rbinom(n=nVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_401 := rbinom(n=nNonVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_401 := rbinom(n=nNonVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_409 := rbinom(n=nNonVTO_young,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_409 := rbinom(n=nNonVTO_old,size=1,prob=0.4) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_122 := rbinom(n=nNonVTO_young,size=1,prob=0.7) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_122 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Young_people',quantity_131 := rbinom(n=nNonVTO_young,size=1,prob=0.5) ]
VTO.dat[group == "Non-VTO"&Age_group=='Non_young_people',quantity_131 := rbinom(n=nNonVTO_old,size=1,prob=0.5) ]
VTO.dat[,quantity_brunet := quantity_401+quantity_409 +quantity_122+quantity_131]
#__________________________________________________________________
#quantity and sales
VTO.dat[,quantity := quantity_Best_seller_Group+quantity_reddish_orange+quantity_pink_or_rosy+quantity_nude+quantity_brunet]
VTO.dat[,sales := quantity*40]
return(VTO.dat)
}
#create analyze.experiment function to perform Anova test
analyze.experiment<-function(data3){
model1 <- lm(sales~group+Age_group+group*Age_group,data=data3)
anova<-anova(model1)
interactionv2<-aov(sales~group+Age_group+group*Age_group,data=data3)
Tukey<-TukeyHSD(interactionv2, ordered=FALSE, conf.level=.95)
p_group_Age_group<-anova$`Pr(>F)`[3] # the p value of anova
Tukey_p_higher_than_0.05<-sum(Tukey$`group:Age_group`[,4]>0.05) #count the number of p value over 0.05 ( for Tukey,all 6 combo considered)
VTO_young_Non_VTO_young<-Tukey$`group:Age_group`[1,1] #row 1: VTO minus Non-VTO among young customers
VTO_old_Non_VTO_old<-Tukey$`group:Age_group`[6,1] #row 6: VTO minus Non-VTO among non-young customers
diff<-VTO_young_Non_VTO_young-VTO_old_Non_VTO_old
result<-data.table(p_group_Age_group=p_group_Age_group,
Tukey_p_higher_than_0.05=Tukey_p_higher_than_0.05,
VTO_young_Non_VTO_young=VTO_young_Non_VTO_young,
VTO_old_Non_VTO_old=VTO_old_Non_VTO_old,
diff=diff)
return(result)
}
#run the simulation 2000 times
n<-2000
RNGversion(vstr=3.6)
set.seed(seed=198)
#simulate each round with its own seed, tag it with the round number and stack all rounds
s<-rbindlist(lapply(X=1:n,FUN=function(i){data.table(round=i,experiment(i))}))
Across 2000 simulated experiments, the mean difference in sales between the young group using VTO and the young group without VTO is 96.04, and the mean difference between the old group using VTO and the old group without VTO is 71.87. The mean effect is 24.17, which suggests that VTO increases sales meaningfully more for young customers than for old customers. We obtain a p-value below 0.05 in 87% of simulations, so the experiment has strong power.
Table 11. Simulation results for an effect scenario
| Research Question | Scenario | Mean Effect in Simulated Data ** | Range of Effect in Simulated Data ** | Percentage of False Positives | Percentage of True Negatives | Percentage of False Negatives | Percentage of True Positives |
|---|---|---|---|---|---|---|---|
| Question 2 | Effect: (Expected Size) | 24.166 | [-4.515, 48.396] | | | 13% | 87% |
s.results=s[,analyze.experiment(data3=.SD),keyby='round']
summary(s.results)
round p_group_Age_group Tukey_p_higher_than_0.05
Min. : 1.0 Min. :0.0000000 Min. :0.000
1st Qu.: 500.8 1st Qu.:0.0001638 1st Qu.:1.000
Median :1000.5 Median :0.0018659 Median :1.000
Mean :1000.5 Mean :0.0287945 Mean :1.206
3rd Qu.:1500.2 3rd Qu.:0.0150387 3rd Qu.:2.000
Max. :2000.0 Max. :0.9625853 Max. :2.000
VTO_young_Non_VTO_young VTO_old_Non_VTO_old diff
Min. : 79.81 Min. :55.09 Min. :-4.515
1st Qu.: 92.23 1st Qu.:68.22 1st Qu.:18.897
Median : 95.98 Median :72.05 Median :24.277
Mean : 96.04 Mean :71.87 Mean :24.166
3rd Qu.: 99.84 3rd Qu.:75.38 3rd Qu.:29.274
Max. :113.30 Max. :91.71 Max. :48.396
s.results[,mean(p_group_Age_group<0.05)]
[1] 0.87
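The quantity `diff` in analyze.experiment is a difference in differences of group means. The following sketch, on hypothetical toy data with assumed cell sizes and mean sales, illustrates that the interaction coefficient of a two-by-two linear model equals this difference in differences up to sign. The sign depends only on which age group R treats as the reference level. All values in the sketch are invented for illustration and are not results of our experiment.

```r
set.seed(1)
# Hypothetical toy data: four cells with unequal sizes and assumed mean sales
cell_size <- c(60, 55, 45, 40)
cell_mean <- c(380, 480, 390, 460)
dat <- data.frame(
  group = factor(rep(c("Non-VTO", "VTO", "Non-VTO", "VTO"), times = cell_size)),
  Age_group = factor(rep(c("Young_people", "Young_people",
                           "Non_young_people", "Non_young_people"),
                         times = cell_size),
                     levels = c("Young_people", "Non_young_people"))
)
dat$sales <- rnorm(nrow(dat), mean = rep(cell_mean, times = cell_size), sd = 90)

# difference in differences computed from the four cell means
means <- tapply(dat$sales, list(dat$group, dat$Age_group), mean)
effect_young <- means["VTO", "Young_people"] - means["Non-VTO", "Young_people"]
effect_old <- means["VTO", "Non_young_people"] - means["Non-VTO", "Non_young_people"]
did <- effect_young - effect_old

# the same quantity from the linear model with an interaction term;
# the reference levels are Non-VTO and Young_people, so the coefficient
# measures (effect among non-young) - (effect among young) = -did
fit <- lm(sales ~ group * Age_group, data = dat)
interaction_coef <- coef(fit)[["groupVTO:Age_groupNon_young_people"]]

print(c(did = did, interaction_coef = interaction_coef))
```

The F-test on the interaction term therefore tests whether this difference in differences is zero, which is the hypothesis behind the p-value `p_group_Age_group`.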
Compared to the difference in mean sales of lipsticks between shopping with and without VTO for bestseller products, is the difference higher for miscellaneous products?
In this research question, we define no effect as the situation where the change in mean sales of miscellaneous lipsticks from using VTO is lower than, or no more than 5% higher than, the change in mean sales of bestsellers from using VTO.
We use simulation to generate a dataset that lets us test for this effect. We use the sample function in R to generate quantity variables, which indicate how many lipsticks each customer buys. The purpose is to simulate the situation where the change in mean sales from using VTO does not differ significantly between miscellaneous lipsticks and bestsellers.
The buying probabilities are summarized by use of VTO, lipstyle and buying quantity in Table 12. Customers in the bestseller group are more likely to buy larger quantities; however, the effects of VTO on the two groups are set to be quite similar.
Table 12. Parameter setting for no-effect scenario simulation
| Treatment | Lipstyle | Quantity_0 | Quantity_1 | Quantity_2 | Quantity_3 | Quantity_4 |
|---|---|---|---|---|---|---|
| VTO | Bestseller | 0.01 | 0.10 | 0.25 | 0.38 | 0.18 |
| VTO | Miscellaneous | 0.09 | 0.32 | 0.33 | 0.16 | 0.10 |
| Non-VTO | Bestseller | 0.04 | 0.13 | 0.25 | 0.35 | 0.15 |
| Non-VTO | Miscellaneous | 0.15 | 0.35 | 0.33 | 0.13 | 0.04 |
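The expected effect implied by these settings can be worked out by hand. The sketch below copies the weights from Table 12 and computes the expected quantity per customer in each cell. Two of the rows sum to 0.92 rather than 1, and because the `prob` weights passed to `sample()` need not sum to one, the sketch rescales each row before averaging. The variable names are ours and serve only this illustration.

```r
# Weights copied from Table 12; positions correspond to quantities 0 to 4.
quantities <- 0:4
weights <- list(
  VTO_bestseller       = c(0.01, 0.10, 0.25, 0.38, 0.18),
  VTO_miscellaneous    = c(0.09, 0.32, 0.33, 0.16, 0.10),
  NonVTO_bestseller    = c(0.04, 0.13, 0.25, 0.35, 0.15),
  NonVTO_miscellaneous = c(0.15, 0.35, 0.33, 0.13, 0.04)
)

# row totals: the two bestseller rows sum to 0.92, not 1
weight_sums <- sapply(weights, sum)

# rescale each row to sum to one, then take the expected quantity
expected_quantity <- sapply(weights, function(w) sum(quantities * w / sum(w)))

effect_misc <- expected_quantity[["VTO_miscellaneous"]] -
  expected_quantity[["NonVTO_miscellaneous"]]
effect_best <- expected_quantity[["VTO_bestseller"]] -
  expected_quantity[["NonVTO_bestseller"]]
expected_diff <- effect_misc - effect_best

print(round(weight_sums, 2))
print(round(c(effect_misc = effect_misc,
              effect_best = effect_best,
              diff = expected_diff), 4))
```

With these weights the expected VTO effect is 0.30 lipsticks for the miscellaneous group and about 0.196 for the bestseller group, so the expected difference between the two effects is about 0.104.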
#simulate the data for the third question
#seed=198
#form the dataset
experiment<-function(seedn){
set.seed(seed=seedn)
n=2000
group<-c(rep.int(x='VTO',times=n/2),rep.int(x='Non-VTO',times=n/2))
VTO.dat<- data.table(group=group)
VTO.dat[group == "VTO",lipstyle:=c(rep.int(x=1,times=n/4),rep.int(x=0,times=n/4)) ]
VTO.dat[group == "Non-VTO",lipstyle:=c(rep.int(x=1,times=n/4),rep.int(x=0,times=n/4)) ]
VTO.dat[group == "VTO"&lipstyle==1, quantity := sample(x=c(0,1,2,3,4), size =n/4, replace =T,prob=c(0.01,0.10, 0.25, 0.38,0.18))]
VTO.dat[group == "VTO"&lipstyle==0, quantity := sample(x=c(0,1,2,3,4), size =n/4, replace =T,prob=c(0.09,0.32, 0.33, 0.16,0.10))]
VTO.dat[group == "Non-VTO"&lipstyle==1, quantity := sample(x=c(0,1,2,3,4), size =n/4, replace =T,prob=c(0.04,0.13, 0.25, 0.35,0.15))]
VTO.dat[group == "Non-VTO"&lipstyle==0, quantity := sample(x=c(0,1,2,3,4), size =n/4, replace =T,prob=c(0.15,0.35, 0.33, 0.13,0.04))]
VTO.dat <- VTO.dat %>% mutate(testID=1:2000)
data3 <- data.frame(VTO.dat, stringsAsFactors = TRUE)
data3$lipstyle <- as.factor(data3$lipstyle)
return(data3)
}
#do the anova test
analyze.experiment<-function(data3){
test3 <- lm(quantity~group+lipstyle+group*lipstyle, data=data3)
anova<-anova(test3)
interaction3 <- aov(quantity~group+lipstyle+group*lipstyle, data=data3)
Tukey<-TukeyHSD(interaction3, ordered=FALSE, conf.level=.95)
p_group_lipstyle<-anova$`Pr(>F)`[3] # the p value of anova
Tukey_p_higher_than_0.05<-sum(Tukey$`group:lipstyle`[,4]>0.05) #count the number of p value over 0.05 ( for Tukey,all 6 combo considered)
VTO_0_Non_VTO_0<-Tukey$`group:lipstyle`[1,1] #row 1: VTO minus Non-VTO for miscellaneous lipsticks (lipstyle 0)
VTO_1_Non_VTO_1<-Tukey$`group:lipstyle`[6,1] #row 6: VTO minus Non-VTO for bestsellers (lipstyle 1)
diff<-VTO_0_Non_VTO_0-VTO_1_Non_VTO_1
result<-data.table(p_group_lipstyle=p_group_lipstyle,
Tukey_p_higher_than_0.05=Tukey_p_higher_than_0.05,
VTO_0_Non_VTO_0=VTO_0_Non_VTO_0,
VTO_1_Non_VTO_1=VTO_1_Non_VTO_1,
diff=diff)
return(result)
}
#run the simulation 2000 times
n<-2000
RNGversion(vstr=3.6)
set.seed(seed=198)
#simulate each round with its own seed, tag it with the round number and stack all rounds
s<-rbindlist(lapply(X=1:n,FUN=function(i){data.table(round=i,experiment(i))}))
We run the simulation 2000 times to estimate the mean effect. Across the 2000 simulated experiments, the mean effect is 0.102, which suggests that the change in sales from using VTO does not differ meaningfully between the bestseller group and the miscellaneous group. The mean p-value across simulations is 0.3421, which is higher than 0.05.
Table 13. Simulation results for no-effect scenario
| Research Question | Scenario | Mean Effect in Simulated Data ** | Range of Effect in Simulated Data ** | Percentage of False Positives | Percentage of True Negatives | Percentage of False Negatives | Percentage of True Positives |
|---|---|---|---|---|---|---|---|
| Question 3 | No Effect | 0.102 | [-0.238, 0.436] | 0.199 | 0.801 | | |
s.results=s[,analyze.experiment(data3=.SD),keyby='round']
summary(s.results)
round p_group_lipstyle Tukey_p_higher_than_0.05 VTO_0_Non_VTO_0
Min. : 1.0 Min. :0.0000023 Min. :0.000 Min. :0.0760
1st Qu.: 500.8 1st Qu.:0.0803692 1st Qu.:0.000 1st Qu.:0.2540
Median :1000.5 Median :0.2637702 Median :0.000 Median :0.3000
Mean :1000.5 Mean :0.3421416 Mean :0.348 Mean :0.2998
3rd Qu.:1500.2 3rd Qu.:0.5728970 3rd Qu.:1.000 3rd Qu.:0.3460
Max. :2000.0 Max. :1.0000000 Max. :2.000 Max. :0.5620
VTO_1_Non_VTO_1 diff
Min. :0.0080 Min. :-0.238
1st Qu.:0.1560 1st Qu.: 0.042
Median :0.1980 Median : 0.100
Mean :0.1978 Mean : 0.102
3rd Qu.:0.2400 3rd Qu.: 0.162
Max. :0.4420 Max. : 0.436
s.results[,mean(p_group_lipstyle<0.05)]
[1] 0.199
In this research question, we define a meaningful effect as a change in sales of lipsticks from using VTO that is 10% higher in the miscellaneous group than in the bestseller group.
We use simulation to generate a dataset that lets us test for this effect. We use the sample function in R to generate quantity variables, which indicate how many lipsticks each customer buys. The purpose is to simulate the situation where the change in mean sales from using VTO differs significantly between miscellaneous lipsticks and bestsellers.
The buying probabilities are summarized by use of VTO, lipstyle and buying quantity in Table 14. Customers in the bestseller group are more likely to buy larger quantities. In addition, the effect of VTO is designed to be larger for the miscellaneous group than for the bestseller group: using VTO shifts more probability toward higher quantities in the miscellaneous group.
Table 14. Parameter setting for an effect scenario simulation
| Treatment | Lipstyle | Quantity_0 | Quantity_1 | Quantity_2 | Quantity_3 | Quantity_4 |
|---|---|---|---|---|---|---|
| VTO | Bestseller | 0.01 | 0.07 | 0.37 | 0.38 | 0.17 |
| VTO | Miscellaneous | 0.08 | 0.25 | 0.28 | 0.33 | 0.06 |
| Non-VTO | Bestseller | 0.04 | 0.13 | 0.25 | 0.35 | 0.15 |
| Non-VTO | Miscellaneous | 0.15 | 0.35 | 0.33 | 0.13 | 0.04 |
#simulate the data for the third question
#seed=198
#form the dataset
experiment<-function(seedn){
set.seed(seed=seedn)
n=2000
group<-c(rep.int(x='VTO',times=n/2),rep.int(x='Non-VTO',times=n/2))
VTO.dat<- data.table(group=group)
VTO.dat[group == "VTO",lipstyle:=c(rep.int(x=1,times=n/4),rep.int(x=0,times=n/4)) ]
VTO.dat[group == "Non-VTO",lipstyle:=c(rep.int(x=1,times=n/4),rep.int(x=0,times=n/4)) ]
VTO.dat[group == "VTO"&lipstyle==1, quantity := sample(x=c(0,1,2,3,4), size =n/4, replace =T,prob=c(0.01,0.07, 0.37, 0.38,0.17))]
VTO.dat[group == "VTO"&lipstyle==0, quantity := sample(x=c(0,1,2,3,4), size =n/4, replace =T,prob=c(0.08,0.25, 0.28, 0.33,0.06))]
VTO.dat[group == "Non-VTO"&lipstyle==1, quantity := sample(x=c(0,1,2,3,4), size =n/4, replace =T,prob=c(0.04,0.13, 0.25, 0.35,0.15))]
VTO.dat[group == "Non-VTO"&lipstyle==0, quantity := sample(x=c(0,1,2,3,4), size =n/4, replace =T,prob=c(0.15,0.35, 0.33, 0.13,0.04))]
VTO.dat <- VTO.dat %>% mutate(testID=1:2000)
data3 <- data.frame(VTO.dat, stringsAsFactors = TRUE)
data3$lipstyle <- as.factor(data3$lipstyle)
return(data3)
}
#do the anova test
analyze.experiment<-function(data3){
test3 <- lm(quantity~group+lipstyle+group*lipstyle, data=data3)
anova<-anova(test3)
interaction3 <- aov(quantity~group+lipstyle+group*lipstyle, data=data3)
Tukey<-TukeyHSD(interaction3, ordered=FALSE, conf.level=.95)
p_group_lipstyle<-anova$`Pr(>F)`[3] # the p value of anova
Tukey_p_higher_than_0.05<-sum(Tukey$`group:lipstyle`[,4]>0.05) #count the number of p value over 0.05 ( for Tukey,all 6 combo considered)
VTO_0_Non_VTO_0<-Tukey$`group:lipstyle`[1,1] #row 1: VTO minus Non-VTO for miscellaneous lipsticks (lipstyle 0)
VTO_1_Non_VTO_1<-Tukey$`group:lipstyle`[6,1] #row 6: VTO minus Non-VTO for bestsellers (lipstyle 1)
diff<-VTO_0_Non_VTO_0-VTO_1_Non_VTO_1
result<-data.table(p_group_lipstyle=p_group_lipstyle,
Tukey_p_higher_than_0.05=Tukey_p_higher_than_0.05,
VTO_0_Non_VTO_0=VTO_0_Non_VTO_0,
VTO_1_Non_VTO_1=VTO_1_Non_VTO_1,
diff=diff)
return(result)
}
# Run the simulation 2000 times
n <- 2000
RNGversion(vstr = 3.6)
set.seed(seed = 198) # experiment() reseeds with the round index, so rounds 1..n determine the simulated data
# Stack all simulated experiments, tagging each row with its simulation round
s <- rbindlist(lapply(1:n, function(exp) {
  sim <- experiment(exp)
  cbind(round = exp, sim)
}))
Across the 2000 simulated experiments, the mean effect (the VTO effect on sales of miscellaneous products minus the VTO effect on sales of best-seller products) is 0.3253, which suggests that VTO produces a meaningfully larger change in sales in the miscellaneous group. The p-value of the interaction term is below 0.05 in 95.45% of the simulations, so the experiment has strong statistical power.
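As a rough cross-check on the simulated mean effect, the expected effect can be computed directly from the quantity weights used in the simulation code. The sketch below is not part of the original analysis; the names `support`, `cell_weights`, and `expected_qty` are introduced here for illustration. It assumes the weights are rescaled to sum to one, as R's `sample()` does, which matters for the Non-VTO best-seller weights because they sum to 0.92.

```r
# Purchase quantities a customer can choose (0 to 4 lipsticks)
support <- 0:4

# Probability weights copied from the simulation code
cell_weights <- list(
  vto_best = c(0.01, 0.07, 0.37, 0.38, 0.17),
  vto_misc = c(0.08, 0.25, 0.28, 0.33, 0.06),
  non_best = c(0.04, 0.13, 0.25, 0.35, 0.15), # sums to 0.92
  non_misc = c(0.15, 0.35, 0.33, 0.13, 0.04)
)

# Expected quantity per cell, rescaling the weights so they sum to 1
expected_qty <- sapply(cell_weights, function(p) sum(support * p / sum(p)))

# VTO effect within each product type, and the difference between the two
effect_misc <- expected_qty[["vto_misc"]] - expected_qty[["non_misc"]]
effect_best <- expected_qty[["vto_best"]] - expected_qty[["non_best"]]
did <- effect_misc - effect_best

round(c(effect_misc = effect_misc, effect_best = effect_best, did = did), 4)
```

Under these assumptions the expected VTO effects are about 0.48 for miscellaneous products and about 0.15 for best-sellers, a difference of about 0.33, which is close to the simulated means of 0.4783, 0.153, and 0.3253.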
Table 15. Simulation results for the scenario with an effect
| Research Question | Scenario | Simulated Data **: Mean Effect | Simulated Data **: Range of Effect [Min, Max] | Percentage of False Positives | Percentage of True Negatives | Percentage of False Negatives | Percentage of True Positives |
|---|---|---|---|---|---|---|---|
| Question 3 | Effect: Expected Size | 0.3253 | [0.0240, 0.6500] | | | 0.0455 | 0.9545 |
s.results=s[,analyze.experiment(data3=.SD),keyby='round']
summary(s.results)
round p_group_lipstyle Tukey_p_higher_than_0.05 VTO_0_Non_VTO_0
Min. : 1.0 Min. :0.0000000 Min. :0.000 Min. :0.2520
1st Qu.: 500.8 1st Qu.:0.0000202 1st Qu.:0.000 1st Qu.:0.4340
Median :1000.5 Median :0.0003251 Median :1.000 Median :0.4760
Mean :1000.5 Mean :0.0097197 Mean :0.575 Mean :0.4783
3rd Qu.:1500.2 3rd Qu.:0.0033480 3rd Qu.:1.000 3rd Qu.:0.5220
Max. :2000.0 Max. :0.7882708 Max. :1.000 Max. :0.6880
VTO_1_Non_VTO_1 diff
Min. :-0.044 Min. :0.0240
1st Qu.: 0.112 1st Qu.:0.2655
Median : 0.152 Median :0.3260
Mean : 0.153 Mean :0.3253
3rd Qu.: 0.194 3rd Qu.:0.3840
Max. : 0.382 Max. :0.6500
s.results[,mean(p_group_lipstyle<0.05)]
[1] 0.9545
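The power figure above is itself a proportion estimated from a finite number of simulations, so it carries some Monte Carlo error. The following sketch, which is not part of the original analysis, gives a rough sense of that uncertainty using a binomial standard error and the exact binomial interval from `binom.test()`. The variable names are introduced here for illustration, and the count of significant simulations is inferred from 0.9545 × 2000.

```r
# Number of simulated experiments and the estimated power reported above
n_sims <- 2000
power_hat <- 0.9545

# Number of simulations with an interaction p-value below 0.05 (inferred)
n_signif <- round(power_hat * n_sims)

# Monte Carlo standard error of the estimated power
mc_se <- sqrt(power_hat * (1 - power_hat) / n_sims)

# Exact (Clopper-Pearson) 95% interval for the power
power_ci <- binom.test(n_signif, n_sims)$conf.int

mc_se
power_ci
```

With 2000 simulations the standard error is roughly 0.005, so the conclusion that power is well above the conventional 0.8 threshold does not hinge on simulation noise.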
Bialkova, S., & Barr, C. (2022). Virtual try-on: How to enhance consumer experience? In 2022 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW) (pp. 01-08). IEEE.
Bigne, E. (2021). A model of adoption of AR-based self-service technologies: a two country comparison. International Journal of Retail & Distribution Management.
Collins, H. N., Johnson, P. I., Calderon, N. M., et al. (2021). Differences in personal care product use by race/ethnicity among women in California: implications for chemical exposures. Journal of Exposure Science & Environmental Epidemiology. https://doi.org/10.1038/s41370-021-00404-7
Poushneh, A., & Vasquez-Parraga, A. Z. (2017). Discernible impact of augmented reality on retail customer’s experience, satisfaction, and willingness to buy. Journal of Retailing and Consumer Services, 34, 229-234.