Cluster Analysis in R. In data analytics there are two broad kinds of
approaches: supervised and unsupervised. Clustering is an unsupervised
method for finding subgroups of observations within a data set. When we
cluster, we want observations in the same group to show similar patterns
and observations in different groups to be dissimilar.
When there is no response variable, an unsupervised method is
appropriate: it seeks relationships among the n observations without
being trained on a response variable. Clustering thus allows us to
identify homogeneous groups in the dataset and categorize the
observations accordingly.
One of the simplest clustering methods is K-means, the most commonly
used method for splitting a dataset into a set of k groups, where k is
chosen by the analyst. It applies when a dataset contains several
variables but no response variable. Cluster analysis is often used for
segmenting markets into groups of similar customers or patterns.
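As a minimal sketch of the K-means idea described above, the following uses base R's `kmeans()` on synthetic data. The data, the variable names, and the choice of k = 3 are assumptions made purely for illustration; they are not taken from any dataset in this report.

```r
# Synthetic data (illustrative only): two numeric features, three groups
set.seed(42)
centers <- rbind(c(0, 0), c(5, 5), c(0, 5))
x <- do.call(rbind, lapply(1:3, function(i) {
  cbind(rnorm(50, mean = centers[i, 1], sd = 0.5),
        rnorm(50, mean = centers[i, 2], sd = 0.5))
}))
colnames(x) <- c("feature_1", "feature_2")

# Standardise so both variables contribute equally to Euclidean distance
x_scaled <- scale(x)

# k = 3 clusters; nstart = 25 reruns the algorithm from 25 random starts
# and keeps the solution with the lowest within-cluster sum of squares
km <- kmeans(x_scaled, centers = 3, nstart = 25)

table(km$cluster)   # cluster sizes
km$centers          # cluster centroids on the standardised scale
km$tot.withinss     # total within-cluster sum of squares
```

In practice k is not known in advance; a common heuristic is to compare `tot.withinss` across several candidate values of k.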
In summary, cluster analysis is a statistical technique that groups
similar observations into clusters based on their characteristics.
Clustering is the task of dividing the population or data points into a
number of groups such that data points in the same group are more
similar to one another than to data points in other groups. The goal of
clustering is to identify patterns or groups of similar objects within a
data set of interest.
Two research scenarios where cluster analysis may be used as
the data analysis strategy.
Cluster analysis is widely used across industries, business functions,
and research fields. To illustrate its usefulness in research, consider
the following two examples.
1. Data is essential for brands and organizations that want to draw inferences about what customers think. Cluster analysis is a critical component of data analysis in market research: it helps brands derive trends and identify groups of customers by demographics, purchase behaviors, likes and dislikes, and more.
This market research method buckets information into smaller groups,
which helps show how different groups of people respond under comparable
conditions. Organizations and academics may define clusters using
different pre-defined criteria for what constitutes a cluster, but the
underlying theme of the analysis is the same.
Furthermore, brands have long used cluster analysis to make sense of
purchasing behavior studies and trends among their client base by
applying demographic segmentation. Geographic location, gender, age,
annual family income, and other criteria are commonly evaluated.
These parameters highlight how various customer groups make different
purchasing decisions; as a result, large retailers use this data to
decide how to promote to each audience. This also helps increase the
return on marketing spend while reducing client attrition.
2. Classes, or conceptually significant collections
of items with similar properties, are fundamental in how individuals
examine and describe the world. Indeed, humans are adept at grouping
objects (clustering) and allocating specific objects to these categories
(classification). Even very young children, for example, can swiftly
describe the items in an image as buildings, automobiles, people,
animals, plants, and so on. Clusters are prospective classes in the
context of data interpretation, and cluster analysis is the study of
strategies for automatically finding classes. The following is an
excellent example from Biology.
Biologists have spent many years developing a taxonomy (hierarchical classification) of all living things, which includes kingdoms, phyla, classes, orders, families, genera, and species. As a result, it is perhaps not surprising that much of the early work in cluster analysis tried to develop a field of mathematical taxonomy capable of automatically discovering such classification systems. More recently, clustering has been used to examine the massive volumes of genetic data that are now available. For example, cluster analysis has been used to identify groups of genes with similar functions.
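Because a taxonomy is hierarchical, hierarchical clustering is the natural counterpart to the example above. The following is a rough sketch using base R's `hclust()` on the built-in `iris` measurements; the choice of average linkage and of cutting the tree at three groups are assumptions made for illustration only.

```r
# Built-in data: four flower measurements for three iris species
data(iris)
measurements <- scale(iris[, 1:4])   # standardise the numeric columns

# Pairwise Euclidean distances, then agglomerative clustering
d  <- dist(measurements, method = "euclidean")
hc <- hclust(d, method = "average")

# Cut the tree into three groups and compare them with the known species
groups <- cutree(hc, k = 3)
table(cluster = groups, species = iris$Species)

# plot(hc, labels = FALSE) would draw the dendrogram
```

The cross-tabulation shows how far groups discovered without labels line up with an existing classification, which is the question the mathematical taxonomy work was asking.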
The one-way multivariate analysis of variance (one-way MANOVA) is
used to determine whether there are any differences between independent
groups on more than one continuous dependent variable. In this regard,
it differs from a one-way ANOVA, which considers only one dependent
variable. The following are some scenarios where a one-way MANOVA is
used.
Two research scenarios where one-way MANOVA may be used as the data analysis strategy.
1. You could use a one-way MANOVA to see if there were differences in
how attractive and intelligent drug users in movies are perceived to be
(i.e., the two dependent variables are “perceptions of attractiveness”
and “perceptions of intelligence”, while the independent variable is
“drug users in movies”, which has three independent groups: “non-user”,
“experimenter”, and “regular user”). Alternatively, you could use a
one-way MANOVA to see if there were differences in students’ short-term
and long-term recall of facts based on four different lengths of lecture
(i.e., the two dependent variables are “short-term memory recall” and
“long-term memory recall”, and the independent variable is “lecture
duration”, which has four independent groups: “30 minutes”,
“60 minutes”, “90 minutes”, and “120 minutes”).
2. A researcher allocates 33 individuals to one of three groups at random. The first group receives interactive dietary information from an online website. Group 2 gets the same information from a nurse practitioner, and Group 3 gets it from a video recorded by the same nurse practitioner. To see if there is a difference between the presentation types, the researcher looks at three separate ratings of the presentation: difficulty, usefulness, and importance. The researcher is particularly interested in whether the interactive website is preferable because it is the most cost-effective method of presenting information.
Additional Example
A clinical psychologist enrolls 100 persons with panic disorder in
his study. For eight weeks, each participant receives one of four types
of treatment. At the completion of treatment, each subject takes part in
a structured interview during which the clinical psychologist rates them
in three categories: physiological, emotional, and cognitive. The
clinical psychologist wants to determine which sort of treatment best
reduces panic disorder symptoms as measured by physiological, emotional,
and cognitive measures.
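A design like the dietary-information scenario (three groups, three ratings) could be analysed roughly as follows with base R's `manova()`. This is only a sketch: the ratings are simulated, the group labels and column names are hypothetical, and the numbers carry no empirical meaning.

```r
set.seed(1)
n_per_group <- 11   # 33 participants split evenly across three groups
group <- factor(rep(c("website", "nurse", "video"), each = n_per_group))

# Simulated ratings (invented values, for illustration only)
ratings <- data.frame(
  group      = group,
  difficulty = rnorm(33, mean = 5, sd = 1),
  usefulness = rnorm(33, mean = 6, sd = 1),
  importance = rnorm(33, mean = 6, sd = 1)
)

# One-way MANOVA: three dependent variables, one grouping factor
fit <- manova(cbind(difficulty, usefulness, importance) ~ group,
              data = ratings)

summary(fit, test = "Wilks")   # omnibus multivariate test
summary.aov(fit)               # follow-up univariate ANOVAs, one per rating
```

The multivariate test asks whether the groups differ on the three ratings jointly; the univariate follow-ups indicate which ratings contribute to any difference.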
Linear Discriminant Analysis (LDA) is a dimensionality reduction
technique. It reduces the number of dimensions (i.e. variables) in a
dataset while retaining as much of the information that separates the
groups as possible.
In essence, it finds the linear combinations of the original variables
that provide the best possible separation between the groups. Below, we
provide some scenarios where LDA can be used.
Two research scenarios where linear discriminant analysis may be used as the data analysis strategy.
1. LDA can be used to identify customers. With LDA, we can select the
features that characterize the demographic of shoppers who are more
inclined to buy a specific product in a mall.
2. LDA is useful in medicine for classifying a patient’s disease based
on a variety of factors relating to the patient’s health and current
medical treatments. It classifies the disease as mild, moderate, or
severe based on these factors. Doctors can then adjust the pace of
treatment according to this classification.
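To make the idea of linear combinations that separate groups concrete, here is a small sketch using `lda()` from the MASS package on the built-in `iris` data. The dataset stands in for the customer and patient examples above, which have no data attached, so the variables here are an illustrative assumption.

```r
library(MASS)   # provides lda()

data(iris)

# Fit LDA: predict species from the four flower measurements
fit <- lda(Species ~ ., data = iris)

# Coefficients of the linear discriminants: each column is one linear
# combination of the original variables
fit$scaling

# Classify the training observations and cross-tabulate against the truth
pred <- predict(fit, iris)
table(predicted = pred$class, actual = iris$Species)
mean(pred$class == iris$Species)   # proportion classified correctly

# pred$x holds the discriminant scores: four variables reduced to two
head(pred$x)
```

With three groups there are at most two discriminant functions, which is where the dimensionality reduction comes from. Accuracy measured on the training data is optimistic; a held-out set or cross-validation gives a fairer estimate.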
Factor analysis is a statistical method used to search for unobserved
variables, called factors, that underlie a set of observed variables,
called indicators. It uses the correlation structure among the observed
variables to model a smaller number of unobserved, latent variables
known as factors. Researchers use this method when subject-area
knowledge suggests that latent factors cause observable variables to
covary. Use factor analysis to identify these hidden variables.
Analysts often refer to the observed variables as indicators because
they literally indicate information about the factor. Factor analysis
treats these indicators as linear combinations of the factors in the
analysis plus an error. The procedure assesses how much of the variance
each factor explains within the indicators. The idea is that the latent
factors create commonalities in some of the observed variables.
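The model just described, in which each indicator is a linear combination of factors plus an error, can be sketched with base R's `factanal()`. The data below are simulated from two hypothetical latent factors with made-up loadings, so that the recovered structure can be compared with the structure that generated it.

```r
set.seed(123)
n <- 500

# Two latent factors (unobserved in a real study)
f1 <- rnorm(n)
f2 <- rnorm(n)

# Six indicators: each is loading * factor + error (loadings are invented)
indicators <- data.frame(
  x1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x2 = 0.7 * f1 + rnorm(n, sd = 0.7),
  x3 = 0.9 * f1 + rnorm(n, sd = 0.4),
  x4 = 0.8 * f2 + rnorm(n, sd = 0.6),
  x5 = 0.7 * f2 + rnorm(n, sd = 0.7),
  x6 = 0.9 * f2 + rnorm(n, sd = 0.4)
)

# Maximum-likelihood factor analysis with two factors, varimax rotation
fa <- factanal(indicators, factors = 2, rotation = "varimax")

print(fa$loadings, cutoff = 0.3)   # which indicators load on which factor
fa$uniquenesses                    # share of variance NOT explained by factors
```

One minus an indicator's uniqueness is its communality, the share of its variance that the factors explain.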
For example, socioeconomic status (SES) is a factor you can’t measure
directly. However, you can assess occupation, income, and education
levels. These variables all relate to socioeconomic status. People with
a particular socioeconomic status tend to have similar values for the
observable variables. If the factor (SES) has a strong relationship with
these indicators, then it accounts for a large portion of the variance
in the indicators. Hence, examples are as follows:
Two research scenarios where factor analysis may be used as the data analysis strategy.
1. Factor analysis is used to uncover “factors” that
explain a wide range of test results. For example, intelligence studies
discovered that persons who perform well on a verbal ability test
also perform well on other tests that require verbal skills.
Researchers explained this by using factor analysis to identify
one factor, commonly referred to as verbal intelligence, which
represents a person’s ability to solve problems involving verbal
skills.
Factor analysis in psychology is most often associated with
intelligence research. However, it also has been used to find factors in
a broad range of domains such as personality, attitudes, beliefs, etc.
It is linked to psychometrics, as it can assess the validity of an
instrument by finding if the instrument indeed measures the postulated
factors.
Hence, to find “factors” that can explain a range of test outcomes,
factor analysis is performed. For instance, intelligence studies have
shown that persons who perform well on verbal aptitude exams also
perform well on other verbal aptitude tests. Researchers used factor
analysis to identify one factor, commonly referred to as verbal
intelligence, which reflects an individual’s capacity to solve
challenges involving verbal skills.
2. Data mining and machine learning go hand in hand, and factor
analysis can serve as a machine learning tool. Machine learning
workflows use factor analysis to reduce the number of variables in a
dataset to a smaller, more informative set of factors. It is a popular
unsupervised machine learning technique for dimensionality reduction.
Combined with other machine learning methods, factor analysis can
support data mining and speed up data exploration.
Factor analysis can also simplify data mining by grouping variables
that are correlated, so that redundant variables can be summarized by a
few factors. Uncovering links and correlations among many variables has
long been a challenge for data scientists, and this statistical
technique helps address it.
Also, marketing promotes products, services, and brands, and factor analysis can aid marketing analysis. Businesses use it to establish the links between aspects of a marketing campaign and thereby improve long-term performance. It can also link customer satisfaction to post-campaign feedback to quantify campaign efficacy and audience impact. Thus, factor analysis may improve marketing and customer satisfaction, which in turn can increase sales.
‘canondat.sav’ data is not found
a. Screen the data for violation of assumptions. Perform appropriate transformations, where appropriate. Be sure to explain the bases for your transformations.
library(mvtnorm)
library(QuantPsyc)
## Warning: package 'QuantPsyc' was built under R version 4.2.3
## Loading required package: boot
## Loading required package: dplyr
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
## Loading required package: purrr
## Loading required package: MASS
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
##
## Attaching package: 'QuantPsyc'
## The following object is masked from 'package:base':
##
## norm
library(energy)
## Warning: package 'energy' was built under R version 4.2.3
library(readxl)
library(rmarkdown)
valuegendata <- read_excel("C:/Users/63966/Downloads/valuegendata.xlsx")
valuegendata
## # A tibble: 2,315 × 37
## ComJesus ImpRe…¹ Gender SchType Race Birth…² Birth…³ Birth…⁴ parstat FathSDA
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 1 2 1 3 5 1 1 0 2 2
## 2 2 4 1 2 5 0 0 0 2 3
## 3 5 2 2 2 2 0 0 0 4 2
## 4 5 3 2 2 4 1 0 0 1 3
## 5 1 5 1 3 6 1 1 1 3 2
## 6 4 1 1 2 6 1 1 1 2 2
## 7 4 1 2 2 4 1 0 1 2 2
## 8 2 3 2 3 3 0 0 0 1 3
## 9 5 4 1 2 5 0 0 0 1 3
## 10 2 1 1 2 2 1 0 1 1 2
## # … with 2,305 more rows, 27 more variables: MothSDA <chr>, GradeLvel <chr>,
## # baptism <chr>, howold <chr>, spiritualmaturity <dbl>, PerDev <dbl>,
## # Grace <dbl>, Works <dbl>, CongClimate <dbl>, LV_Altruim <dbl>,
## # LV_Adventism <dbl>, LV_Materialism <dbl>, Den_Loyal <dbl>,
## # AdventStd_Diss <dbl>, AtSchl <dbl>, FamClim <dbl>, SchClimate <dbl>,
## # QualRelEd <dbl>, SpiritInfluence <dbl>, Rate_Church <dbl>,
## # Rate_School <dbl>, IntRelig <dbl>, ExtRelig <dbl>, FrndRel <dbl>, …
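The printout above shows that several columns holding numeric codes were read in as character (`<chr>`). One possible screening step, sketched here on a small toy data frame rather than on `valuegendata` itself, is to convert such columns to numeric and count missing values per column. The column names are borrowed from the output above, but the values are invented for illustration.

```r
# Toy data frame mimicking the structure above: numeric codes stored as text
toy <- data.frame(
  Gender  = c("1", "2", "2", NA),
  SchType = c("3", "2", "2", "3"),
  Grace   = c(22, 17, NA, 27),
  stringsAsFactors = FALSE
)

# Identify the character columns and convert them to numeric
chr_cols <- vapply(toy, is.character, logical(1))
toy[chr_cols] <- lapply(toy[chr_cols], as.numeric)

str(toy)              # all columns are now numeric
colSums(is.na(toy))   # number of missing values per column
```

Whether a converted code should then be treated as numeric or as a factor depends on what it measures; a categorical code such as gender would normally become a factor before modelling.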
Null Hypothesis: The variables follow a multivariate normal
distribution.
Alternative Hypothesis: The variables do not follow a multivariate normal distribution.
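Using the `mvn` function in the MVN package on the numeric variables, with code such as `mvn(data = valuegendata, mvnTest = "hz")` for the Henze-Zirkler test, we conclude that the variables follow a multivariate normal distribution, that is, we fail to reject the null hypothesis.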
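A complementary screening check that needs only base R is to compare squared Mahalanobis distances with chi-square quantiles. The sketch below uses synthetic data as a stand-in. The `MAH_1` and `ProbMah` columns in the data set appear, judging only by their names, to hold quantities of this kind, but that is an assumption.

```r
set.seed(7)
p <- 3
x <- matrix(rnorm(200 * p), ncol = p)   # synthetic stand-in for the numeric variables

# Squared Mahalanobis distance of each row from the sample centroid
d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))

# Under multivariate normality d2 is approximately chi-square with p df
p_val <- pchisq(d2, df = p, lower.tail = FALSE)

# Rows with a very small upper-tail probability are candidate outliers
which(p_val < 0.001)

# Q-Q plot: points close to the line are consistent with normality
qqplot(qchisq(ppoints(length(d2)), df = p), d2,
       xlab = "Chi-square quantiles",
       ylab = "Squared Mahalanobis distance")
abline(0, 1)
```

The 0.001 cutoff is a common convention for flagging multivariate outliers, not a value taken from this report.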
b. Run canonical correlation analysis to determine the nature of the relationships between the two sets of variables.
require(ggplot2)
## Loading required package: ggplot2
require(GGally)
## Loading required package: GGally
## Warning: package 'GGally' was built under R version 4.2.3
## Registered S3 method overwritten by 'GGally':
## method from
## +.gg ggplot2
require(CCA)
## Loading required package: CCA
## Warning: package 'CCA' was built under R version 4.2.3
## Loading required package: fda
## Warning: package 'fda' was built under R version 4.2.3
## Loading required package: splines
## Loading required package: fds
## Warning: package 'fds' was built under R version 4.2.3
## Loading required package: rainbow
## Warning: package 'rainbow' was built under R version 4.2.3
## Loading required package: pcaPP
## Warning: package 'pcaPP' was built under R version 4.2.3
## Loading required package: RCurl
## Loading required package: deSolve
## Warning: package 'deSolve' was built under R version 4.2.3
##
## Attaching package: 'fda'
## The following object is masked from 'package:boot':
##
## melanoma
## The following object is masked from 'package:graphics':
##
## matplot
## Loading required package: fields
## Warning: package 'fields' was built under R version 4.2.3
## Loading required package: spam
## Warning: package 'spam' was built under R version 4.2.3
## Spam version 2.9-1 (2022-08-07) is loaded.
## Type 'help( Spam)' or 'demo( spam)' for a short introduction
## and overview of this package.
## Help for individual functions is also obtained by adding the
## suffix '.spam' to the function name, e.g. 'help( chol.spam)'.
##
## Attaching package: 'spam'
## The following objects are masked from 'package:mvtnorm':
##
## rmvnorm, rmvt
## The following objects are masked from 'package:base':
##
## backsolve, forwardsolve
## Loading required package: viridis
## Warning: package 'viridis' was built under R version 4.2.3
## Loading required package: viridisLite
##
## Try help(fields) to get started.
colnames(valuegendata) <- c("comJesus", "ImpReligion", "Gender", "SchType", "Race", "Birth_ME", "Birth_Mother", "Birth_Dad", "parstat", "FathSDA", "MothSDA", "GradeLvel", "baptism", "howold", "spiritualmaturity", "PerDev", "Grace", "Works", "CongClimate", "LV_Altruim", "LV_Adventism", "LV_Materialism", "Den_Loyal", "AdventStd_Diss", "AtSchl", "FamClim", "SchClimate", "QualRelEd", "SpiritInfluence", "Rate_Church", "Rate_School", "IntRelig", "ExtRelig", "FrndRel", "AdventOrtho", "MAH_1", "ProbMah")
summary(valuegendata)
## comJesus ImpReligion Gender SchType
## Length:2315 Length:2315 Length:2315 Length:2315
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## Race Birth_ME Birth_Mother Birth_Dad
## Length:2315 Length:2315 Length:2315 Length:2315
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## parstat FathSDA MothSDA GradeLvel
## Length:2315 Length:2315 Length:2315 Length:2315
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## baptism howold spiritualmaturity PerDev
## Length:2315 Length:2315 Min. :12.00 Min. : 4.00
## Class :character Class :character 1st Qu.:35.00 1st Qu.:13.00
## Mode :character Mode :character Median :41.00 Median :17.00
## Mean :40.78 Mean :17.08
## 3rd Qu.:47.00 3rd Qu.:21.00
## Max. :60.00 Max. :32.00
##
## Grace Works CongClimate LV_Altruim LV_Adventism
## Min. : 7.00 Min. : 3.00 Min. :11.0 Min. : 4.00 Min. :2.000
## 1st Qu.:17.00 1st Qu.: 7.00 1st Qu.:41.0 1st Qu.:11.00 1st Qu.:5.000
## Median :22.00 Median :10.00 Median :49.0 Median :12.93 Median :6.000
## Mean :21.79 Mean :10.12 Mean :47.8 Mean :12.35 Mean :5.971
## 3rd Qu.:27.00 3rd Qu.:14.00 3rd Qu.:56.0 3rd Qu.:14.00 3rd Qu.:7.000
## Max. :35.00 Max. :15.00 Max. :66.0 Max. :16.00 Max. :8.000
##
## LV_Materialism Den_Loyal AdventStd_Diss AtSchl
## Min. :2.000 Min. : 5.00 Min. : 8.00 Min. : 4.00
## 1st Qu.:3.000 1st Qu.:16.00 1st Qu.:27.00 1st Qu.:15.00
## Median :4.000 Median :19.00 Median :35.00 Median :18.00
## Mean :4.508 Mean :18.21 Mean :34.47 Mean :16.94
## 3rd Qu.:6.000 3rd Qu.:21.00 3rd Qu.:42.00 3rd Qu.:20.00
## Max. :8.000 Max. :23.00 Max. :64.00 Max. :24.00
##
## FamClim SchClimate QualRelEd SpiritInfluence
## Min. : 5.00 Min. : 9.00 Min. : 8.00 Min. : 27.00
## 1st Qu.:23.00 1st Qu.:22.00 1st Qu.:29.00 1st Qu.: 77.00
## Median :26.00 Median :25.00 Median :34.00 Median : 89.00
## Mean :25.03 Mean :24.52 Mean :33.71 Mean : 88.22
## 3rd Qu.:29.00 3rd Qu.:28.00 3rd Qu.:40.00 3rd Qu.:101.00
## Max. :30.00 Max. :36.00 Max. :48.00 Max. :135.00
##
## Rate_Church Rate_School IntRelig ExtRelig
## Min. :10.00 Min. :10.00 Min. : 7.00 Min. : 7.00
## 1st Qu.:40.00 1st Qu.:40.00 1st Qu.:23.00 1st Qu.:15.00
## Median :49.00 Median :45.00 Median :26.00 Median :18.00
## Mean :49.04 Mean :45.51 Mean :25.78 Mean :18.29
## 3rd Qu.:57.00 3rd Qu.:53.38 3rd Qu.:29.00 3rd Qu.:21.00
## Max. :70.00 Max. :70.00 Max. :35.00 Max. :35.00
##
## FrndRel AdventOrtho MAH_1 ProbMah
## Min. : 4.00 Min. : 25.0 Min. : 0.7353 Min. :0.000003
## 1st Qu.:10.00 1st Qu.:128.0 1st Qu.: 4.8931 1st Qu.:0.257340
## Median :11.37 Median :138.0 Median : 7.1437 Median :0.521208
## Mean :11.37 Mean :134.5 Mean : 7.9965 Mean :0.509567
## 3rd Qu.:13.00 3rd Qu.:144.0 3rd Qu.:10.1107 3rd Qu.:0.768939
## Max. :16.00 Max. :150.0 Max. :40.3615 Max. :0.999432
## NA's :16 NA's :16
xtabs(~comJesus, data = valuegendata)
## comJesus
## 1 2 3 4 5
## 21 196 284 1026 772
psych <- valuegendata[, 1:18]
acad <- valuegendata[, 19:37]
ggpairs(psych)
## Warning: Removed 16 rows containing non-finite values (`stat_g_gally_count()`).
## Warning: Removed 16 rows containing missing values (`stat_boxplot()`).
## Warning: Removed 7 rows containing non-finite values (`stat_g_gally_count()`).
## Warning: Removed 7 rows containing missing values (`stat_boxplot()`).
## Warning: Removed 31 rows containing non-finite values (`stat_g_gally_count()`).
## Warning: Removed 31 rows containing missing values (`stat_boxplot()`).
## Warning: Removed 10 rows containing non-finite values (`stat_g_gally_count()`).
## Warning: Removed 10 rows containing missing values (`stat_boxplot()`).
## Warning: Removed 12 rows containing non-finite values (`stat_g_gally_count()`).
## Warning: Removed 12 rows containing missing values (`stat_boxplot()`).
## Warning: Removed 9 rows containing non-finite values (`stat_g_gally_count()`).
## Warning: Removed 9 rows containing missing values (`stat_boxplot()`).
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
ggpairs(acad)
## Warning in ggally_statistic(data = data, mapping = mapping, na.rm = na.rm, :
## Removed 16 rows containing missing values
## Warning: Removed 16 rows containing missing values (`geom_point()`).
## Warning: Removed 16 rows containing non-finite values (`stat_density()`).
c. Report your results. At the very least, you should present the following: the research problem under investigation; the hypothesis/hypotheses being tested; descriptive statistics (means, standard deviations, inter-correlations within and between sets of variables); and the results of the canonical correlation analysis. Be sure to include summary tables.
Research problem under investigation: Is there an association
between the variables in SET 1 and the variables in SET 2?

Null Hypothesis: The two sets of variables are not linearly related.

Alternative Hypothesis: The two sets of variables are linearly related.

RESULTS: We fail to reject the null hypothesis; the two sets are not linearly related.
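
The hypotheses above are usually assessed with Wilks' lambda computed from the canonical correlations. The following is a minimal sketch of that test using only base R, not the analysis actually run for this report. The matrices `set1` and `set2` are simulated, hypothetical stand-ins for SET 1 and SET 2, and the chi-square statistic uses Bartlett's approximation. Real data with missing values would first need the incomplete rows dropped (for example with `na.omit()`).

```r
# Simulated data only: set1 and set2 are hypothetical stand-ins for SET 1 and SET 2.
set.seed(1)
n <- 200
set1 <- matrix(rnorm(n * 3), nrow = n, ncol = 3)
set2 <- matrix(rnorm(n * 2), nrow = n, ncol = 2)

# Canonical correlations from stats::cancor (base R).
cc <- cancor(set1, set2)
rho <- cc$cor

p <- ncol(set1)
q <- ncol(set2)

# Wilks' lambda for H0: all canonical correlations are zero,
# i.e. the two sets are not linearly related.
wilks <- prod(1 - rho^2)

# Bartlett's chi-square approximation with p * q degrees of freedom.
chi_sq <- -(n - 1 - (p + q + 1) / 2) * log(wilks)
df <- p * q
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

cat("Canonical correlations:", round(rho, 3), "\n")
cat("Wilks' lambda:", round(wilks, 3), "\n")
cat("Chi-square:", round(chi_sq, 3), "on", df, "df; p-value:", round(p_value, 3), "\n")
```

A p-value above the chosen significance level (commonly 0.05) means the null hypothesis is not rejected, which is the form of conclusion reported above.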