Introduction:

For our Week 5 discussion, each of us posted a dataset exemplifying “untidy” data. Project 2 requires us to take three of the peer-posted examples from Week 5, tidy the data, and then perform the requested analysis.

This portion focuses on Kory’s untidy example: a listing of African American therapists in LA.

Sources:
- African American Therapists in LA: https://www.psychologytoday.com/us/therapists/ca/los-angeles?category=african-american
- Classmate: Kory Martin & his post

Let’s take a look at the data.

library(tidyverse) # provides read_csv, %>%, str_replace, separate_wider_delim, and ggplot
therapists <- read_csv("https://raw.githubusercontent.com/d-ev-craig/DATA607/main/Projects/Project2%20-%20Untidy%20Data/therapists.csv")
## Rows: 3200 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): X1, X2, X3
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(therapists)
## # A tibble: 6 × 3
##   X1          X2                                                           X3   
##   <chr>       <chr>                                                        <chr>
## 1 Uriah Cty   "Marriage & Family Therapist, MA, LMFT"                      Los …
## 2 <NA>        "Maybe you remember this scene from a movie; picture it, a … & On…
## 3 <NA>        "(213) 513-5553"                                             <NA> 
## 4 <NA>         <NA>                                                        <NA> 
## 5 James Birks "Marriage & Family Therapist, LMFT"                          Los …
## 6 <NA>        "Accepting Teletherapy Clients Only. In today's world It ca… & On…
therapists
## # A tibble: 3,200 × 3
##    X1            X2                                                        X3   
##    <chr>         <chr>                                                     <chr>
##  1 Uriah Cty     "Marriage & Family Therapist, MA, LMFT"                   Los …
##  2 <NA>          "Maybe you remember this scene from a movie; picture it,… & On…
##  3 <NA>          "(213) 513-5553"                                          <NA> 
##  4 <NA>           <NA>                                                     <NA> 
##  5 James Birks   "Marriage & Family Therapist, LMFT"                       Los …
##  6 <NA>          "Accepting Teletherapy Clients Only. In today's world It… & On…
##  7 <NA>           <NA>                                                     <NA> 
##  8 <NA>           <NA>                                                     <NA> 
##  9 Taronda Jones "Clinical Social Work/Therapist, LCSW"                    Los …
## 10 <NA>          "Are you looking for someone to help you navigate throug… & On…
## # … with 3,190 more rows

Variables of Interest

The variables that we need to create after taking a look seem to be the following:
- Name
- Focus
- Credentials
- Description
- Phone Number
- Location
- Mode
- Accepting Clients

We will expand the columns one at a time to get our values, starting with the names in the first column.

name <- therapists %>% filter(!is.na(X1))
colnames(name)[1] <- 'name'
name
## # A tibble: 800 × 3
##    name                 X2                                          X3          
##    <chr>                <chr>                                       <chr>       
##  1 Uriah Cty            Marriage & Family Therapist, MA, LMFT       Los Angeles…
##  2 James Birks          Marriage & Family Therapist, LMFT           Los Angeles…
##  3 Taronda Jones        Clinical Social Work/Therapist, LCSW        Los Angeles…
##  4 Christina Harrison   Clinical Social Work/Therapist, LCSW        Los Angeles…
##  5 Eric Michael Katende Marriage & Family Therapist Associate, AMFT Los Angeles…
##  6 Brittany Williams    Pre-Licensed Professional                   Los Angeles…
##  7 Claudia Williams     Pre-Licensed Professional, MSW, ACSW        Los Angeles…
##  8 Bradlisia Dixon      Marriage & Family Therapist, LMFT           Los Angeles…
##  9 Dr. Daryl M Rowe     Psychologist, PhD                           Los Angeles…
## 10 Camille Tenerife     Marriage & Family Therapist, LMFT           Los Angeles…
## # … with 790 more rows
name <- name[,1]

Expanding Second Column

In the second column, we are looking to pull out the following variables:
- Specialty
- Description
- Phone Number

We will approach this by building index sequences that pick out the rows holding each variable. Each set of values becomes a column, and the columns are combined into a new data frame. I am choosing this method over pivot_wider because the values are stacked in a single column with no key column to pivot on, while each variable sits at a fixed position within a repeating four-row sequence.
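As a minimal sketch of the fixed-sequence idea, the toy example below uses made-up values (the vectors `x1` and `x2` and the name `record_len` are illustrative, not taken from the real file). It first checks the assumption that a name appears only on the first row of every four-row record, then reshapes the stacked column so each record becomes one row.

```r
# Toy columns mimicking the repeating four-row layout (values are made up).
x1 <- c("Name A", NA, NA, NA, "Name B", NA, NA, NA)
x2 <- c("Therapist, LMFT", "Description A", "(000) 000-0001", NA,
        "Psychologist, PhD", "Description B", NA, NA)

record_len <- 4  # assumed number of rows per therapist record

# Check the assumption: a name sits on rows 1, 5, 9, ... and nowhere else.
expected_name_row <- (seq_along(x1) - 1) %% record_len == 0
layout_ok <- identical(!is.na(x1), expected_name_row)

# Fill a matrix row by row so each record becomes one row;
# column i then holds the i-th line of every record.
wide <- matrix(x2, ncol = record_len, byrow = TRUE)

tidy <- data.frame(
  name      = x1[expected_name_row],
  specialty = wide[, 1],
  desc      = wide[, 2],
  phone     = wide[, 3]
)

print(layout_ok)
print(tidy)
```

If `layout_ok` came back FALSE on the real data, the sequence-based extraction would silently misalign names and values, so a check of this kind may be worth running before relying on it.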

#coalesce method -not used
#nameExpanded <- name %>% slice(rep(1:n(),each = 4))
#nameExpanded

#therapists2 <- data.frame(nameExpanded,therapists[,-1])
#therapists2

#df %>%
#mutate(A = coalesce(A,B))

#Extract values by sequences
specialSeq <- seq(from=1,to=3200, by = 4)
specialty <- therapists$X2[specialSeq]
therapists2 <- cbind(name,specialty)


#Desc Sequence
descSeq <- seq(from=2,to=3200, by = 4)
desc <- therapists$X2[descSeq]
therapists2 <- data.frame(therapists2,desc)


#phone number Sequence
phoneSeq <- seq(from=3,to=3200, by = 4)
phone <- therapists$X2[phoneSeq]
therapists2 <- data.frame(therapists2,phone)

#The 4th cell in each sequence of column 2 is always NA
#scraps Sequence
#scrapSeq <- seq(from=4,to=3200, by = 4)
#scrap <- therapists$X2[scrapSeq]
#therapists2 <- data.frame(therapists2,scrap)
head(therapists2)
##                   name                                   specialty
## 1            Uriah Cty       Marriage & Family Therapist, MA, LMFT
## 2          James Birks           Marriage & Family Therapist, LMFT
## 3        Taronda Jones        Clinical Social Work/Therapist, LCSW
## 4   Christina Harrison        Clinical Social Work/Therapist, LCSW
## 5 Eric Michael Katende Marriage & Family Therapist Associate, AMFT
## 6    Brittany Williams                   Pre-Licensed Professional
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               desc
## 1 Maybe you remember this scene from a movie; picture it, a peaceful airplane ride becomes turbulent. The plane begins losing altitude rapidly. The flight attendant urgently instructs you to "put your oxygen mask on first" before helping others. That simple, but critical statement, is just as important in our everyday lives.  Remembering to take time to yourself, non selfishly, can be difficult for anyone who tends to place others' needs first. Together,  we'll explore your thoughts, feelings, decisions, choices, wants, and needs in your therapy. I will assist you in" finding your voice" and help you to "put your oxygen mask on first.
## 2                                                        Accepting Teletherapy Clients Only. In today's world It can be so difficult to connect to our truth and live as our authentic selves. My mission is to create a non-judgmental, supportive and affirmative environment that will encourage you to heal, to grow and to reach your fullest potential. Through collaboration, empathetic listening and challenging negative patterns we will work together and help you reach your goals. Embarking on a therapeutic journey takes hard work, vulnerability and commitment. So if you are willing to put in the work I am committed to helping you succeed.
## 3                                                                                                                                                        Are you looking for someone to help you navigate through difficult times? Do you find yourself feeling alone with no one to help you find solutions to your problems? Are you a coupIe struggling to have a healthy relationship? I am a Licensed Clinical Social Worker in the State of California, Oregon and Washington providing hope and encouragement to those in need. I have over 9 years' experience as a psychotherapist and have worked with all ages, from early childhood to the aged adult.
## 4                                                                                                                                                                                         My primary goal in working in mental health is to put myself out of a job. Given that each of us intrinsically know what we need to heal and grow, my approach to psychotherapy is to couple your expertise of being you – your reality and lived experiences – with my skillset, in order to collaboratively work towards achieving your goals. Through a culturally-affirming and healing-centered stance, I utilize evidence-based treatments to facilitate our work.
## 5                                                               Reaching out for help is one of the most humbling experiences we can go through in life. In a world that is constantly pushing narratives on us based on race, gender, sexuality and/or dominant cultural beliefs, it is life-affirming to engage preferred narratives that support and empower our own values. In difficult times it is easy to feel misunderstood or lost, longing to reconnect with our own resources. My work aims for those reconnections. My approach is collaborative, working from a place of respectful curiosity. I am not the expert on your lived experience. You are.
## 6                                                                                                                                                             I am in the final year of my master's degree in social work. As a masters-level clinician, I want to collaborate with you to address and achieve your therapeutic goals.   My experience includes working with survivors of domestic violence, sexual assault, and elder abuse, among other victimizations. I have facilitated domestic violence groups where I have educated survivors on the cycle of abuse, healthy relationships, and setting healthy boundaries with the people in their lives.
##            phone
## 1 (213) 513-5553
## 2           <NA>
## 3 (323) 347-3314
## 4 (213) 320-6802
## 5 (213) 212-7852
## 6           <NA>

Expanding Third Column

In the third column we want to pull the following variables:
- Location
- Online
- Client Acceptance

#Location Seq
locSeq <- seq(from=1,to=3200, by = 4)
loc <- therapists$X3[locSeq]
therapists2 <- data.frame(therapists2,loc)

#online Seq
onlineSeq <- seq(from=2,to=3200, by = 4)
online <- therapists$X3[onlineSeq]
therapists2 <- data.frame(therapists2,online)

#Client Seq
clientSeq <- seq(from=3,to=3200, by = 4)
client <- therapists$X3[clientSeq]
therapists2 <- data.frame(therapists2,client)

# 4th value in column is all NA
# scrapSeq <- seq(from=4,to=3200, by = 4)
# scrap <- therapists$X3[scrapSeq]
# therapists2 <- data.frame(therapists2,scrap)
# therapists2

NA Handling

A few columns contain NAs that I’d prefer to replace with text so that they are easily readable to an end user.

therapists2$specialty[is.na(therapists2$specialty)]<- 'None Listed'
therapists2$phone[is.na(therapists2$phone)]<- 'None Listed'
therapists2$online[is.na(therapists2$online)]<- 'No' # check this column
therapists2$client[is.na(therapists2$client)]<- 'Accepting'

therapists2$online <- str_replace(therapists2$online, '& ','')
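The replacements above can be sketched on a toy data frame as follows. The data and the helper names (`toy`, `labels`, `na_before`) are hypothetical; only the replacement labels come from the text. Counting the NAs per column first gives a way to verify afterwards that each label was applied as many times as expected, which may help with the column flagged for checking.

```r
# Toy data frame with made-up values in the same shape as the real columns.
toy <- data.frame(
  phone  = c("(000) 000-0001", NA, "(000) 000-0003"),
  online = c("& Online", NA, "& Online"),
  client = c(NA, "Not accepting new clients", NA),
  stringsAsFactors = FALSE
)

# Count NAs per column before replacing anything.
na_before <- colSums(is.na(toy))

# Column name -> label substituted for NA (labels as used in the text).
labels <- c(phone = "None Listed", online = "No", client = "Accepting")
for (col in names(labels)) {
  toy[[col]][is.na(toy[[col]])] <- labels[[col]]
}

# Strip the leading "& " using a fixed (non-regex) match.
toy$online <- sub("& ", "", toy$online, fixed = TRUE)

print(na_before)
print(toy)
```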

Expanding Location Further

We will also expand our location column into city, state, and zip code columns.

therapists2 <- therapists2 %>% separate_wider_delim(loc, delim = ',', names = c('city','var'))

therapists2
## # A tibble: 800 × 8
##    name                 specialty          desc  phone city  var   online client
##    <chr>                <chr>              <chr> <chr> <chr> <chr> <chr>  <chr> 
##  1 Uriah Cty            Marriage & Family… "May… (213… Los … "   … Online Accep…
##  2 James Birks          Marriage & Family… "Acc… None… Los … "   … Online Accep…
##  3 Taronda Jones        Clinical Social W… "Are… (323… Los … "   … Online Accep…
##  4 Christina Harrison   Clinical Social W… "My … (213… Los … "   … No     Not a…
##  5 Eric Michael Katende Marriage & Family… "Rea… (213… Los … "   … Online Accep…
##  6 Brittany Williams    Pre-Licensed Prof… "I a… None… Los … "   … Online Accep…
##  7 Claudia Williams     Pre-Licensed Prof… "“We… (424… Los … "   … Online Accep…
##  8 Bradlisia Dixon      Marriage & Family… "Are… (424… Los … "   … Online Accep…
##  9 Dr. Daryl M Rowe     Psychologist, PhD  "As … (323… Los … "   … No     Not a…
## 10 Camille Tenerife     Marriage & Family… "As … (424… Los … "   … Online Accep…
## # … with 790 more rows
therapists2$var <- str_replace(therapists2$var, 'CA ','')

therapists2$var <- trimws(therapists2$var)

therapists2 <- therapists2 %>% dplyr::rename(zip = var) # dplyr syntax is new_name = old_name

state <- rep('CA',800)

therapists2 <- cbind(therapists2,state)

head(therapists2)
##                   name                                   specialty
## 1            Uriah Cty       Marriage & Family Therapist, MA, LMFT
## 2          James Birks           Marriage & Family Therapist, LMFT
## 3        Taronda Jones        Clinical Social Work/Therapist, LCSW
## 4   Christina Harrison        Clinical Social Work/Therapist, LCSW
## 5 Eric Michael Katende Marriage & Family Therapist Associate, AMFT
## 6    Brittany Williams                   Pre-Licensed Professional
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               desc
## 1 Maybe you remember this scene from a movie; picture it, a peaceful airplane ride becomes turbulent. The plane begins losing altitude rapidly. The flight attendant urgently instructs you to "put your oxygen mask on first" before helping others. That simple, but critical statement, is just as important in our everyday lives.  Remembering to take time to yourself, non selfishly, can be difficult for anyone who tends to place others' needs first. Together,  we'll explore your thoughts, feelings, decisions, choices, wants, and needs in your therapy. I will assist you in" finding your voice" and help you to "put your oxygen mask on first.
## 2                                                        Accepting Teletherapy Clients Only. In today's world It can be so difficult to connect to our truth and live as our authentic selves. My mission is to create a non-judgmental, supportive and affirmative environment that will encourage you to heal, to grow and to reach your fullest potential. Through collaboration, empathetic listening and challenging negative patterns we will work together and help you reach your goals. Embarking on a therapeutic journey takes hard work, vulnerability and commitment. So if you are willing to put in the work I am committed to helping you succeed.
## 3                                                                                                                                                        Are you looking for someone to help you navigate through difficult times? Do you find yourself feeling alone with no one to help you find solutions to your problems? Are you a coupIe struggling to have a healthy relationship? I am a Licensed Clinical Social Worker in the State of California, Oregon and Washington providing hope and encouragement to those in need. I have over 9 years' experience as a psychotherapist and have worked with all ages, from early childhood to the aged adult.
## 4                                                                                                                                                                                         My primary goal in working in mental health is to put myself out of a job. Given that each of us intrinsically know what we need to heal and grow, my approach to psychotherapy is to couple your expertise of being you – your reality and lived experiences – with my skillset, in order to collaboratively work towards achieving your goals. Through a culturally-affirming and healing-centered stance, I utilize evidence-based treatments to facilitate our work.
## 5                                                               Reaching out for help is one of the most humbling experiences we can go through in life. In a world that is constantly pushing narratives on us based on race, gender, sexuality and/or dominant cultural beliefs, it is life-affirming to engage preferred narratives that support and empower our own values. In difficult times it is easy to feel misunderstood or lost, longing to reconnect with our own resources. My work aims for those reconnections. My approach is collaborative, working from a place of respectful curiosity. I am not the expert on your lived experience. You are.
## 6                                                                                                                                                             I am in the final year of my master's degree in social work. As a masters-level clinician, I want to collaborate with you to address and achieve your therapeutic goals.   My experience includes working with survivors of domestic violence, sexual assault, and elder abuse, among other victimizations. I have facilitated domestic violence groups where I have educated survivors on the cycle of abuse, healthy relationships, and setting healthy boundaries with the people in their lives.
##            phone        city   zip online                    client state
## 1 (213) 513-5553 Los Angeles 90048 Online                 Accepting    CA
## 2    None Listed Los Angeles 90044 Online                 Accepting    CA
## 3 (323) 347-3314 Los Angeles 90008 Online                 Accepting    CA
## 4 (213) 320-6802 Los Angeles 90066     No Not accepting new clients    CA
## 5 (213) 212-7852 Los Angeles 90004 Online                 Accepting    CA
## 6    None Listed Los Angeles 90019 Online                 Accepting    CA
#Factorizing
therapists2$zip <- as.factor(therapists2$zip)
therapists2$client <- as.factor(therapists2$client)
therapists2$online <- as.factor(therapists2$online)
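The location split above could also be sketched in a single step with a regular expression. This is an illustration only: it assumes every location string has the form "City, ST 12345", and the sample strings and the names `loc`, `pattern`, and `location` are made up for the example.

```r
# Made-up location strings in the assumed "City, ST 12345" format.
loc <- c("Los Angeles, CA 90048", "Los Angeles,   CA 90044 ")

# Capture groups: city, two-letter state, five-digit zip.
pattern <- "^\\s*([^,]+),\\s*([A-Z]{2})\\s+([0-9]{5})\\s*$"

# regexec() returns the full match plus one entry per capture group.
# A string that does not match yields character(0), which would break
# the rbind below, so inputs should be checked first on real data.
parts <- regmatches(loc, regexec(pattern, loc))
m <- do.call(rbind, parts)

location <- data.frame(
  city  = m[, 2],
  state = m[, 3],
  zip   = m[, 4],  # kept as character so leading zeros would survive
  stringsAsFactors = FALSE
)

print(location)
```

Extracting the state from the string, rather than assigning a constant, would also surface any listing outside California instead of silently labeling it "CA".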

Analysis

From here it’s a pretty tidy data set. I did not split name, since some of the ‘names’ are company names and are best left as a single entity so they stay easily recognizable. I also left the Unicode characters in the desc column, since they may be worth keeping if the data is imported into another platform.

The analysis chosen:
- Identify the percentage of therapists offering online vs. on-site service
- Identify the percentage accepting new patients
- Break down the group by zip code

#Some totals
therapists2 %>% dplyr::group_by(online) %>% dplyr::summarize(n())
## # A tibble: 2 × 2
##   online `n()`
##   <fct>  <int>
## 1 No       102
## 2 Online   698
therapists2 %>% dplyr::group_by(client) %>% dplyr::summarize(n())
## # A tibble: 3 × 2
##   client                    `n()`
##   <fct>                     <int>
## 1 Accepting                   683
## 2 Not accepting new clients    78
## 3 Waitlist for new clients     39
therapists2 %>% dplyr::group_by(zip) %>% dplyr::summarize(total=n()) %>% arrange(desc(total))
## # A tibble: 65 × 2
##    zip   total
##    <fct> <int>
##  1 90025    80
##  2 90001    45
##  3 90066    38
##  4 90008    37
##  5 90034    34
##  6 90045    33
##  7 90064    33
##  8 90048    26
##  9 90036    21
## 10 90024    20
## # … with 55 more rows
#Percentage of Therapists offering online
onlineOffer <- sum(therapists2$online =='Online') #Summing number of therapists offering online
noOnline <- sum(therapists2$online == 'No') #Summing number of therapists not offering online


#percentageNotOnline <- noOnline/800 #800 is the total number of therapists
percentageOnline <- (onlineOffer/800 * 100)
print(paste0('Percentage Offering Online Service is ',percentageOnline, '%'))
## [1] "Percentage Offering Online Service is 87.25%"
newPatients <- sum(therapists2$client =='Accepting') #Summing number of therapists accepting new patients
percentagePatient <- (newPatients/800 * 100)
print(paste0('Percentage Accepting New Patients is ',percentagePatient,'%'))
## [1] "Percentage Accepting New Patients is 85.375%"
#Percentage accepting both

both <- sum(therapists2$online == 'Online' & therapists2$client == 'Accepting')
percentageBoth <- (both/800 *100)
print(paste0('Percentage accepting new patients and offering online service is ', percentageBoth,'%'))
## [1] "Percentage accepting new patients and offering online service is 82.375%"
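The percentages above divide by a hard-coded 800. As a hedged alternative sketch, the mean of a logical vector gives the same proportion without needing the row count. The toy data and the helper `pct` below are hypothetical.

```r
# Toy data with made-up values in the same shape as the real columns.
toy <- data.frame(
  online = c("Online", "Online", "No", "Online"),
  client = c("Accepting", "Not accepting new clients", "Accepting", "Accepting"),
  stringsAsFactors = FALSE
)

# The mean of TRUE/FALSE values is the share of TRUEs; scale to percent.
pct <- function(flag) round(100 * mean(flag), 2)

pct_online    <- pct(toy$online == "Online")
pct_accepting <- pct(toy$client == "Accepting")
pct_both      <- pct(toy$online == "Online" & toy$client == "Accepting")

print(paste0("Percentage Offering Online Service is ", pct_online, "%"))
print(paste0("Percentage Accepting New Patients is ", pct_accepting, "%"))
print(paste0("Percentage accepting new patients and offering online service is ", pct_both, "%"))
```

Because the denominator comes from the data itself, the calculation stays correct if rows are later added or filtered out.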
# Bar charts of online availability and client acceptance, then a breakdown by zip

g <- ggplot(data=therapists2,aes(x=online, fill=online))
g + geom_bar()

g2 <- ggplot(data=therapists2,aes(x=client, fill=client))
g2 + geom_bar()

#Ordering the table by zip code, then by whether they offer online services, then by whether they're accepting new clients
therapists2 <- therapists2 %>% arrange(zip,online,client)



#Table to help find breakdown by each zip code
therapists2 %>% dplyr::group_by(zip,client,online) %>% dplyr::summarize(total = n())
## `summarise()` has grouped output by 'zip', 'client'. You can override using the
## `.groups` argument.
## # A tibble: 134 × 4
## # Groups:   zip, client [122]
##    zip   client                    online total
##    <fct> <fct>                     <fct>  <int>
##  1 90001 Accepting                 No         1
##  2 90001 Accepting                 Online    40
##  3 90001 Not accepting new clients No         3
##  4 90001 Waitlist for new clients  Online     1
##  5 90002 Accepting                 Online    12
##  6 90003 Accepting                 Online     3
##  7 90003 Waitlist for new clients  Online     1
##  8 90004 Accepting                 Online    13
##  9 90005 Accepting                 Online     8
## 10 90005 Not accepting new clients No         1
## # … with 124 more rows
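The long zip/client/online table above can be hard to scan. As one possible sketch, a base R cross-tabulation puts zip codes in rows and categories in columns, and `prop.table` converts the counts to within-zip shares. The toy values below are made up for illustration.

```r
# Toy data with made-up zip codes and online flags.
toy <- data.frame(
  zip    = c("90001", "90001", "90002", "90001"),
  online = c("Online", "No", "Online", "Online"),
  stringsAsFactors = FALSE
)

# Counts of therapists per zip code (rows) and online flag (columns).
counts <- table(toy$zip, toy$online)

# Row-wise proportions: share of each zip code's therapists per category.
shares <- prop.table(counts, margin = 1)

print(counts)
print(round(shares, 2))
```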

Summary

In summary, of the 800 total therapists:
- 698 offer online service (87.25%)
- 683 are accepting new patients (85.375%), 78 are not accepting, and 39 have a waitlist
- Zip code 90025 has the highest total, with 80 therapists in the area