Project 3: Singular Value Decomposition

Data Transformation: Raw Data

## [1] "Raw.Df:   c(24983, 101)"

Data Transformation: Data Filtering

  • The dataset is very large (24,983 rows by 101 columns), so I remove some data to make it more manageable
  • Subset by the number of jokes each user rated (column 0), keeping users who rated 80 jokes; the value 80 was picked at random
  • Remove every column that is entirely null (jokes nobody rated)
  • The final subset is more manageable: 145 rows (reviewers) and 100 jokes
  • Check the occurrence of NAs
  • Impute the NAs with the column means
  • Center and scale the data subset
## [1] "Subset.1 (Num Jokes Filter):   c(145, 100)"
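
The filtering steps above could look roughly like the following. This is a minimal sketch on toy data, not the code used for the report: the names `raw_df`, `subset1`, and `target`, the toy threshold of 3, and the decision to drop the count column after filtering are all assumptions for illustration.

```r
# Toy stand-in for the raw ratings (the real Raw.Df is 24,983 x 101).
# Column V1 holds the number of jokes each user rated; V2..V6 are joke ratings.
set.seed(1)
ratings <- matrix(round(runif(6 * 5, -10, 10), 2), nrow = 6,
                  dimnames = list(NULL, paste0("V", 2:6)))
ratings[, "V4"] <- NA        # a joke nobody rated
ratings[1, "V2"] <- NA       # two individual missing ratings
ratings[4, "V3"] <- NA
raw_df <- data.frame(V1 = rowSums(!is.na(ratings)), ratings)

# Keep users with a given number of rated jokes (hypothetical value; the report uses 80),
# and drop the count column so only joke ratings remain.
target  <- 3
subset1 <- raw_df[raw_df$V1 == target, -1, drop = FALSE]

# Remove columns in which every value is NA.
all_na  <- colSums(is.na(subset1)) == nrow(subset1)
subset1 <- subset1[, !all_na, drop = FALSE]

print(paste("Subset.1 (Num Jokes Filter):", paste(dim(subset1), collapse = " x ")))
```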

Data Transformation: NA Review

  • Check the proportion of NAs in the full subset
  • Impute the NAs with the column means
  • Recheck the proportion of NAs after imputation to confirm none remain
## [1] "The occurances of NAs Subset.1:  0.2"
## [1] "The occurances of NAs Subset.1:  0"
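
A minimal sketch of the NA check and column-mean imputation, using a small toy matrix in place of Subset.1. The helper `na_share()` and the variable names are hypothetical and chosen only for this example.

```r
# Toy matrix standing in for Subset.1 (users in rows, jokes in columns).
subset1 <- matrix(c( 2, NA,  4,
                    NA,  1,  6,
                     4,  3, NA,
                     6,  5,  8),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(NULL, c("V2", "V3", "V4")))

# Proportion of missing cells in a matrix.
na_share <- function(m) sum(is.na(m)) / length(m)
print(paste("Share of NAs before imputation:", na_share(subset1)))

col_means <- colMeans(subset1, na.rm = TRUE)       # per-joke mean of the observed ratings
na_idx    <- which(is.na(subset1), arr.ind = TRUE) # (row, col) position of every NA
subset1[na_idx] <- col_means[na_idx[, "col"]]      # fill each NA with its column mean

print(paste("Share of NAs after imputation:", na_share(subset1)))
```

Filling with the column mean keeps each joke's average unchanged, at the cost of shrinking its variance.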

Data Transformation: Center & Scale the Dataset

## [1] "Subset.1(Scaled):   c(145, 100)"
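
Centering and scaling can be sketched with base R's `scale()`, which works column by column. Whether the report scaled by column (per joke) is an assumption here, and the toy matrix below only stands in for the imputed subset.

```r
# Toy stand-in for the imputed subset (no NAs left).
subset1 <- matrix(c(2, 3, 4,
                    4, 1, 6,
                    4, 3, 6,
                    6, 5, 8),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(NULL, c("V2", "V3", "V4")))

# Subtract each column's mean and divide by its standard deviation.
subset1_scaled <- scale(subset1, center = TRUE, scale = TRUE)

print(round(colMeans(subset1_scaled), 10))  # column means are zero up to rounding
print(apply(subset1_scaled, 2, sd))         # column standard deviations are one
print(paste("Subset.1(Scaled):", paste(dim(subset1_scaled), collapse = " x ")))
```

A column with zero variance would come out as NaN, so constant columns should be dropped before scaling.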

EDA: Joke Histograms

  • Distribution of the full subset1.scaled matrix
  • Distribution of the user means of subset1.scaled, from rowMeans()
  • Distribution of the joke means of subset1.scaled, from colMeans()
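
The three histograms could be drawn along these lines. This is a sketch on random stand-in data (`ratings_mat` is a hypothetical name), so the shapes will not match the report's plots.

```r
# Random stand-in with the same shape as the subset: 145 users x 100 jokes.
set.seed(42)
ratings_mat <- matrix(rnorm(145 * 100), nrow = 145)

op <- par(mfrow = c(1, 3))                       # three panels side by side
hist(as.vector(ratings_mat), main = "All ratings", xlab = "Rating")
hist(rowMeans(ratings_mat),  main = "User means",  xlab = "rowMeans()")
hist(colMeans(ratings_mat),  main = "Joke means",  xlab = "colMeans()")
par(op)                                          # restore the previous layout
```

If a matrix has been centered by column, its `colMeans()` are zero up to rounding, so the joke-mean histogram is only informative before column centering.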

Modeling: Manual Singular Value Decomposition
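
A minimal sketch of a manual SVD with base R's `svd()`, assuming the scaled matrix is factored as A = U D Vᵀ and then truncated to the first k singular values. The toy matrix, the choice k = 2, and all variable names are assumptions for illustration, not the report's actual code.

```r
# Toy stand-in for the scaled ratings matrix.
set.seed(7)
A <- scale(matrix(rnorm(8 * 5), nrow = 8))

s <- svd(A)                                   # s$u, s$d, s$v with A = U diag(d) t(V)
full <- s$u %*% diag(s$d) %*% t(s$v)
print(max(abs(A - full)))                     # reconstruction error, near machine precision

# Keep only the k largest singular values to get a rank-k approximation.
k   <- 2
A_k <- s$u[, 1:k] %*% diag(s$d[1:k], nrow = k) %*% t(s$v[, 1:k])

# Share of the total squared singular values retained by the first k components.
var_kept <- sum(s$d[1:k]^2) / sum(s$d^2)
print(round(var_kept, 3))
```

The entries of `A_k` act as smoothed ratings, and they are what the recommendation step ranks.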

Modeling: Manual SVD New Joke Recommendation

  • Select a random user; I picked 72 because I know that row has some NAs in raw.data
  • Filter row.names.svd and actual.user.data for user 72
  • Get the joke predictions for user 72
  • Remove all the jokes that the user previously rated
  • Collect the top new joke recommendations in a data frame and print them
  • Print user 72’s favorite joke in the actual dataset
ID Prediction
V89 1.6201198
V99 1.4471087
V95 1.3147159
V82 1.2564554
V83 1.2472436
V85 1.2294766
V81 1.2061194
V90 1.1993620
V88 1.1740979
V87 1.1585494
V77 1.0959425
V76 1.0860230
V93 1.0733386
V79 1.0577904
V72 1.0572973
V84 1.0230551
V86 0.9982176
V10 0.9958562
V34 0.9901881
V74 0.9690001
V5 0.9552345
V94 0.9081638
V101 0.8683392
V78 0.7876121
V98 0.7858298
V75 0.7850924
V100 0.7535020
V96 0.7101319
V97 0.7060250
V91 0.6681112
V80 0.6457942
V59 0.5683559
ID JokeNumber JokeText
V89 89 A radio conversation of a US naval ship with Canadian authorities … Americans: Please divert your course 15 degrees to the North to avoid a collision. Canadians: Recommend you divert YOUR course 15 degrees to the South to avoid a collision. Americans: This is the Captain of a US Navy ship. I say again, divert YOUR course. Canadians: No. I say again, you divert YOUR course. Americans: This is the aircraft carrier USS LINCOLN, the second largest ship in the United States’ Atlantic Fleet. We are accompanied by three destroyers, three cruisers and numerous support vessels. I demand that you change your course 15 degrees north, that’s ONE FIVE DEGREES NORTH, or counter-measures will be undertaken to ensure the safety of this ship. Canadians: This is a lighthouse. Your call.
V99 99 A bus station is where a bus stops. A train station is where a train stops. On my desk I have a work station…
V95 95 Just a thought .. Before criticizing someone, walk a mile in their shoes. Then when you do criticize them, you will be a mile away and have their shoes!
V82 82 Q: How do you keep a computer programmer in the shower all day long? A: Give them a shampoo with a label that says “rinse, lather, repeat”.
V83 83 What a woman says: “This place is a mess! C’mon, You and I need to clean up, Your stuff is lying on the floor and you’ll have no clothes to wear, if we don’t do laundry right now!” What a man hears: blah, blah, blah, blah, C’mon blah, blah, blah, blah, you and I blah, blah, blah, blah, on the floor blah, blah, blah, blah, no clothes blah, blah, blah, blah, RIGHT NOW!
V85 85 Q: How many Presidents does it take to screw in a light bulb? A: It depends upon your definition of screwing a light bulb.
ID JokeNumber JokeText
V55 55 A woman has twins, and gives them up for adoption. One of them goes to a family in Egypt and is named “Amal.” The other goes to a family in Spain; they name him “Juan.” Years later, Juan sends a picture of himself to his mom. Upon receiving the picture, she tells her husband that she wishes she also had a picture of Amal. Her husband responds, “But they are twins - if you’ve seen Juan, you’ve seen Amal.”
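
The recommendation steps listed at the top of this section can be sketched as follows. Everything here is a toy assumption: the 6 x 8 ratings matrix, the user `"u3"`, the rank k = 2, and the names `rated`, `filled`, `pred`, and `recs` are hypothetical and are not the objects used in the report.

```r
# Toy ratings: 6 users x 8 jokes, with a mask marking which cells were actually rated.
set.seed(3)
n_users <- 6; n_jokes <- 8
ratings <- matrix(round(rnorm(n_users * n_jokes), 2), nrow = n_users,
                  dimnames = list(paste0("u", 1:n_users),
                                  paste0("V", 2:(n_jokes + 1))))
rated <- matrix(TRUE, n_users, n_jokes, dimnames = dimnames(ratings))
rated["u3", c("V4", "V7", "V9")] <- FALSE      # jokes that user u3 has not rated

# Impute the unrated cells with column means, then take a rank-k SVD approximation.
filled <- ratings
filled[!rated] <- NA
cm  <- colMeans(filled, na.rm = TRUE)
idx <- which(is.na(filled), arr.ind = TRUE)
filled[idx] <- cm[idx[, "col"]]

k <- 2
s <- svd(filled)
pred <- s$u[, 1:k] %*% diag(s$d[1:k], nrow = k) %*% t(s$v[, 1:k])
dimnames(pred) <- dimnames(ratings)

# Keep only jokes the user has not rated, sorted by predicted rating.
user      <- "u3"
new_jokes <- !rated[user, ]
recs <- data.frame(ID = names(which(new_jokes)),
                   Prediction = pred[user, new_jokes],
                   stringsAsFactors = FALSE)
recs <- recs[order(recs$Prediction, decreasing = TRUE), ]
rownames(recs) <- NULL
print(recs)

# The user's favorite joke among those actually rated.
rated_ids <- names(which(rated[user, ]))
favorite  <- rated_ids[which.max(ratings[user, rated_ids])]
print(favorite)
```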

Modeling: Recommenderlab Singular Value Decomposition

  • Convert the data to a “realRatingMatrix”
  • Split the data into train/test datasets
  • Run the Recommender function
  • Access the joke predictions for a specific user. I could not select user 72 because of the train/test sample, so I selected user 20
  • Print user 20’s favorite joke in the actual dataset
ID Prediction
V67 0.7147266
V54 0.7017651
V17 0.6981405
V16 0.6925065
V92 0.6819355
V49 0.6742383
V3 0.6637131
V9 0.6559380
V11 0.6559016
V70 0.6531204
V50 0.6524799
V55 0.6493488
V26 0.6489915
V89 0.6477634
V81 0.6446870
V22 0.6410212
V30 0.6409484
V37 0.6362936
V47 0.6362581
V82 0.6331612
V41 0.6242400
V19 0.6228781
V68 0.6227169
V4 0.6208399
V85 0.6202696
V83 0.6194318
V33 0.6160997
V90 0.6130822
V88 0.6108557
V93 0.6106620
V69 0.6097609
V66 0.6088052
V28 0.6043038
V72 0.6018931
V78 0.6016115
V39 0.6002933
V7 0.5982826
V76 0.5981089
V58 0.5971256
V6 0.5943214
V73 0.5896175
V74 0.5895097
V42 0.5868121
V48 0.5865084
V59 0.5862263
V36 0.5834887
V46 0.5809431
V79 0.5790882
V57 0.5781848
V27 0.5773127
V84 0.5756710
V60 0.5747504
V45 0.5737886
V29 0.5729655
V100 0.5717924
V31 0.5672953
V75 0.5554902
V8 0.5483455
V40 0.5473070
V98 0.5410895
V91 0.5364175
V62 0.5348333
V21 0.5344838
V15 0.5322301
V18 0.5316212
V56 0.5295603
V13 0.5270320
V96 0.5212690
V80 0.5199422
V32 0.5166465
V35 0.5030441
V44 0.4980550
V23 0.4957757
V71 0.4957418
V64 0.4927519
V101 0.4828276
V5 0.4707692
V52 0.4545357
V14 0.4519977
V99 0.4481885
V2 0.0000000
V10 0.0000000
V12 0.0000000
V20 0.0000000
V24 0.0000000
V25 0.0000000
V34 0.0000000
V38 0.0000000
V43 0.0000000
V51 0.0000000
V53 0.0000000
V61 0.0000000
V63 0.0000000
V65 0.0000000
V77 0.0000000
V86 0.0000000
V87 0.0000000
V94 0.0000000
V95 0.0000000
V97 0.0000000
ID JokeNumber JokeText
V67 67 Once upon a time, two brooms fell in love and decided to get married. Before the ceremony, the bride broom informed the groom broom that she was expecting a little whiskbroom. The groom broom was aghast! “How is this possible?” he asked. “We’ve never swept together!”
V54 54 The Pope dies and, naturally, goes to heaven. He’s met by the reception committee, and after a whirlwind tour he is told that he can enjoy any of the myriad of recreations available. He decides that he wants to read all of the ancient original text of the Holy Scriptures, so he spends the next eon or so learning languages. After becoming a linguistic master, he sits down in the library and begins to pour over every version of the Bible, working back from most recent “Easy Reading” to the original script. All of a sudden there is a scream in the library. The Angels come running in only to find the Pope huddled in his chair, crying to himself and muttering, “An ‘R’! The scribes left out the ‘R’.” A particularly concerned Angel takes him aside, offering comfort, asks him what the problem is and what does he mean. After collecting his wits, the Pope sobs again, “It’s the letter ‘R’. They left out the ‘R’. The word was supposed to be CELEBRATE!”
V17 17 How many men does it take to screw in a light bulb? One…men will screw anything.
ID JokeNumber JokeText
V51 51 Did you hear that Clinton has announced there is a new national bird? The spread eagle.
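
The recommenderlab workflow listed at the top of this section might look like the sketch below. It assumes the recommenderlab package is installed and uses a random toy matrix; the 80/20 split, `given = 3`, `k = 5`, and the choice of the first test user are assumptions for illustration, not the settings used in the report.

```r
library(recommenderlab)

# Toy ratings: 50 users x 20 jokes with nonzero values on a -10..10 scale,
# and about 20% of the cells set to NA (unrated).
set.seed(123)
vals <- setdiff(seq(-10, 10, by = 0.5), 0)
m <- matrix(sample(vals, 50 * 20, replace = TRUE), nrow = 50,
            dimnames = list(paste0("u", 1:50), paste0("V", 2:21)))
m[sample(length(m), 200)] <- NA

rrm <- as(m, "realRatingMatrix")               # NAs become missing ratings

# Split users into train and test; 'given' ratings per test user stay visible.
scheme <- evaluationScheme(rrm, method = "split", train = 0.8, given = 3)
rec    <- Recommender(getData(scheme, "train"), method = "SVD",
                      parameter = list(k = 5))

# Predict ratings for one test user and list the highest predictions.
known    <- getData(scheme, "known")
pred     <- predict(rec, known[1, ], type = "ratings")
pred_vec <- as(pred, "matrix")[1, ]
print(head(sort(pred_vec, decreasing = TRUE)))
```

Because the split is random, a given user may land in the training set, in which case predictions for that user are not available from the test data.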

Summary

In Project 3, I explored the capabilities of SVD models by building one manually and by using the recommenderlab library. I find the SVD models somewhat difficult to interpret and need to do additional research.