It is often useful to distinguish between personal and organisational accounts in document or discourse analysis of Twitter.
This note sets out some ideas for doing this.
We use:

- around 3,000 tweets from @PublicHealthBot (3,190 in this sample) - this account is a retweet engine, so we are interested in the characteristics of the retweeters rather than the primary tweeters
- the textfeatures and spacyr packages

We are primarily interested in the user descriptions - in this case retweet_description - and in usernames.
The first idea looks for disclaimers: we filter tweets where the retweet description contains the word view, View, views or Views. Of the 3,190 tweets, 254 contain one or more of these terms.
We can see how successful this strategy is by reviewing a sample of the descriptions.
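A minimal sketch of this step, assuming the rtweet data sits in a data frame called `tweets`; the exact regex used for the disclaimer words is an assumption:

```r
library(dplyr)
library(stringr)

# Keep tweets whose retweeter description mentions "view(s)"/"View(s)",
# then review a random sample of the matching descriptions.
tweets %>%
  filter(str_detect(retweet_description, "\\b[Vv]iews?\\b")) %>%
  select(retweet_name, name, retweet_description) %>%
  sample_n(10) %>%
  knitr::kable()
```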
retweet_name | name | retweet_description |
---|---|---|
Prof. Iain Buchan | Public Health Data Science | Public Health and Clinical Informatics researcher helping data and algorithms work at scale for societal gain. Views my own. |
David Baker | Public Health Data Science | Director of Partnerships and Innovation at Sandwell and West Birmingham NHS Trust. Views are my own. Thank you |
Eustis Corrigan | Public Health Data Science | Senior Managing Director, #CBIZ #Memphis (NYSE: @CBZ) | Collaborative Leader | Valued Advisor | Views are mine - RTs are not endorsements |
Vidhi Thakkar | Public Health Data Science | Health Policy PhD HSPR, views are my own, Digital Health, Educator, @MacHealthSci, patient oriented research, health systems design, lifelong learner @ihpmeuoft |
EmTech MENA | Public Health Data Science | Discover the emerging technologies that will change the world through the #EmTechMENA platform. Organized by @TechReviewAR in partnership with @dubaifuture |
Michelle Fenner | Public Health Data Science | Michelle Fenner - #digital #healthcare Digital Marketing and Storytelling Perpetual Mediator. Views are 100% my own! |
Jeff Acaba | Public Health Data Science | #zerodiscrimination, fulfilment of #SRHR, recognition of #LGBTIQ rights, & respect to #humanrights as we #endAIDS and #endTB. Views & RTs my own. 🏳️🌈☕️ |
Pharma Tech Outlook | Public Health Data Science | Pharma Tech Outlook has contributors from the most established organizations who have been presenting their viewpoint using this unique print platform. |
Carol Sinclair | Public Health Data Science | Assoc Director, Public Health&Intelligence, NHSNSS/Non-Exec Director, Scottish Ambulance Service. Passionate about innovation in public sector. All views my own |
Brett Keller | Public Health Data Science | Roving data guy, occasional musician. Job: @PSIimpact. Alum of @WilsonSchool econ/policy, @JohnsHopkinsSPH global epi, @TrumanScholars. Opinions/views = mine. |
In this sample we can see that the retweeters are often recognisable individuals, though not always.
This approach is unlikely to catch all personal accounts, so we will need additional strategies.
For now we add the disclaimer flag to the tweet table as a new column (a feature), as sketched below.
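A sketch of adding the flag, using the same assumed data frame name and pattern as above; the column name `views` matches the data glimpse further down:

```r
# Flag descriptions that contain a "views ..." style disclaimer (0/1).
tweets <- tweets %>%
  mutate(views = as.numeric(str_detect(retweet_description, "\\b[Vv]iews?\\b")))
```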
The second idea is that individuals are more likely than organisations to use first person singular pronouns such as I and me.
This could be detected with heavyweight NLP tools, but a much quicker option is the textfeatures package, which counts personal pronouns alongside many other text features.
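The call might look like the sketch below, assuming the same `tweets` data frame; `normalize = FALSE` is an assumption made to keep the raw counts shown in the output:

```r
library(textfeatures)

# Compute text features (pronoun counts, sentiment scores, word dimensions, ...)
# for each retweeter's profile description.
features <- textfeatures(tweets$retweet_description, normalize = FALSE)
features
```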
## ↪ Counting features in text...
## ↪ Sentiment analysis...
## ↪ Parts of speech...
## ↪ Word dimensions started
## ✔ Job's done!
## # A tibble: 3,190 x 134
## n_urls n_uq_urls n_hashtags n_uq_hashtags n_mentions n_uq_mentions n_chars
## <int> <int> <int> <int> <int> <int> <int>
## 1 0 0 4 4 0 0 129
## 2 0 0 0 0 1 1 21
## 3 0 0 2 2 1 1 127
## 4 0 0 0 0 0 0 124
## 5 1 1 1 1 2 2 99
## 6 0 0 0 0 0 0 136
## 7 0 0 0 0 0 0 121
## 8 0 0 1 1 0 0 144
## 9 0 0 0 0 0 0 36
## 10 0 0 0 0 0 0 133
## # … with 3,180 more rows, and 127 more variables: n_uq_chars <int>,
## # n_commas <int>, n_digits <int>, n_exclaims <int>, n_extraspaces <int>,
## # n_lowers <int>, n_lowersp <dbl>, n_periods <int>, n_words <int>,
## # n_uq_words <int>, n_caps <int>, n_nonasciis <int>, n_puncts <int>,
## # n_capsp <dbl>, n_charsperword <dbl>, sent_afinn <dbl>, sent_bing <dbl>,
## # sent_syuzhet <dbl>, sent_vader <dbl>, n_polite <dbl>, n_first_person <int>,
## # n_first_personp <int>, n_second_person <int>, n_second_personp <int>,
## # n_third_person <int>, n_tobe <int>, n_prepositions <int>, w1 <dbl>,
## # w2 <dbl>, w3 <dbl>, w4 <dbl>, w5 <dbl>, w6 <dbl>, w7 <dbl>, w8 <dbl>,
## # w9 <dbl>, w10 <dbl>, w11 <dbl>, w12 <dbl>, w13 <dbl>, w14 <dbl>, w15 <dbl>,
## # w16 <dbl>, w17 <dbl>, w18 <dbl>, w19 <dbl>, w20 <dbl>, w21 <dbl>,
## # w22 <dbl>, w23 <dbl>, w24 <dbl>, w25 <dbl>, w26 <dbl>, w27 <dbl>,
## # w28 <dbl>, w29 <dbl>, w30 <dbl>, w31 <dbl>, w32 <dbl>, w33 <dbl>,
## # w34 <dbl>, w35 <dbl>, w36 <dbl>, w37 <dbl>, w38 <dbl>, w39 <dbl>,
## # w40 <dbl>, w41 <dbl>, w42 <dbl>, w43 <dbl>, w44 <dbl>, w45 <dbl>,
## # w46 <dbl>, w47 <dbl>, w48 <dbl>, w49 <dbl>, w50 <dbl>, w51 <dbl>,
## # w52 <dbl>, w53 <dbl>, w54 <dbl>, w55 <dbl>, w56 <dbl>, w57 <dbl>,
## # w58 <dbl>, w59 <dbl>, w60 <dbl>, w61 <dbl>, w62 <dbl>, w63 <dbl>,
## # w64 <dbl>, w65 <dbl>, w66 <dbl>, w67 <dbl>, w68 <dbl>, w69 <dbl>,
## # w70 <dbl>, w71 <dbl>, w72 <dbl>, w73 <dbl>, …
textfeatures creates a range of other text features for each tweet which might be useful - it also attaches sentiment scores calculated with a variety of sentiment algorithms.
We’ll combine the features generated this way with the original tweet data and add a new column flagging tweets whose description contains at least one first person singular pronoun.
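A sketch of this step, assuming the feature tibble from above is called `features`:

```r
# Bind the generated features onto the tweet data and flag descriptions
# containing at least one first person singular pronoun.
tweets_tf <- bind_cols(tweets, features) %>%
  mutate(sing_pron = as.numeric(n_first_person > 0))
```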
We can then summarise the overlap between tweets that carry disclaimers and those with first person singular references.
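For example:

```r
# Cross-tabulate the disclaimer flag against the singular pronoun flag.
tweets_tf %>%
  count(views, sing_pron) %>%
  knitr::kable()
```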
views | sing_pron | n |
---|---|---|
0 | 0 | 2612 |
0 | 1 | 324 |
1 | 0 | 73 |
1 | 1 | 181 |
The third idea uses natural language processing, via the spacyr package. spaCy's models are trained on very large corpora such as Wikipedia to recognise entity categories including persons and organisations. We can annotate the description text with these labels to infer whether each account is a person, an organisation, or both, and combine these annotations with those from the first two ideas. This gives a number of tweet categories, summarised in the table below; a sketch of the parsing call appears after the data glimpse.
## Observations: 3,190
## Variables: 229
## $ user_id <chr> "983658091874054144", "983658091874054144", "…
## $ status_id <chr> "1212881931458752517", "1212876982377230337",…
## $ created_at <dttm> 2020-01-02 23:42:41, 2020-01-02 23:23:01, 20…
## $ screen_name <chr> "PublicHealthBot", "PublicHealthBot", "Public…
## $ text <chr> "Manager, Epidemiology Analytics\nBuckinghams…
## $ source <chr> "PHDS bot mark 2", "PHDS bot mark 2", "PHDS b…
## $ display_text_width <dbl> 120, 140, 144, 113, 140, 120, 140, 140, 140, …
## $ reply_to_status_id <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ reply_to_user_id <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ reply_to_screen_name <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ is_quote <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FAL…
## $ is_retweet <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRU…
## $ favorite_count <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ retweet_count <int> 1, 1, 1, 1, 2, 3, 2, 1, 1, 1, 1, 1, 1, 4, 1, …
## $ quote_count <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ reply_count <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ hashtags <list> ["Epijobs", NA, "AI", NA, NA, NA, NA, NA, <"…
## $ symbols <list> [NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ urls_url <list> ["jobs.jnj.com/jobs/101019121…", NA, NA, NA,…
## $ urls_t.co <list> ["https://t.co/X4nIl0LapT", NA, NA, NA, NA, …
## $ urls_expanded_url <list> ["https://jobs.jnj.com/jobs/1010191211?lang=…
## $ media_url <list> [NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ media_t.co <list> [NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ media_expanded_url <list> [NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ media_type <list> [NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ ext_media_url <list> [NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ ext_media_t.co <list> [NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ ext_media_expanded_url <list> [NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ ext_media_type <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ mentions_user_id <list> ["632184479", <"943222483326513152", "119496…
## $ mentions_screen_name <list> ["Epijobs", <"sami_barrit", "VPrasadMDMPH">,…
## $ lang <chr> "en", "en", "en", "en", "en", "en", "en", "en…
## $ quoted_status_id <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ quoted_text <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ quoted_created_at <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ quoted_source <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ quoted_favorite_count <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ quoted_retweet_count <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ quoted_user_id <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ quoted_screen_name <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ quoted_name <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ quoted_followers_count <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ quoted_friends_count <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ quoted_statuses_count <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ quoted_location <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ quoted_description <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ quoted_verified <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ retweet_status_id <chr> "1212881274391670785", "1212873956149075968",…
## $ retweet_text <chr> "Manager, Epidemiology Analytics\nBuckinghams…
## $ retweet_created_at <dttm> 2020-01-02 23:40:04, 2020-01-02 23:10:59, 20…
## $ retweet_source <chr> "Twitter Web App", "Twitter for iPhone", "Lin…
## $ retweet_favorite_count <int> 3, 4, 3, 1, 1, 6, 2, 1, 1, 2, 2, 1, 4, 3, 2, …
## $ retweet_retweet_count <int> 1, 1, 1, 1, 2, 3, 2, 1, 1, 1, 1, 1, 1, 4, 1, …
## $ retweet_user_id <chr> "632184479", "943222483326513152", "306053959…
## $ retweet_screen_name <chr> "Epijobs", "sami_barrit", "StephTweetChat", "…
## $ retweet_name <chr> "Epi Job Openings", "Barrit Sami", "Steph S. …
## $ retweet_followers_count <int> 1058, 88, 7544, 341, 567, 2809, 12099, 55, 12…
## $ retweet_friends_count <int> 91, 797, 6984, 685, 2059, 702, 344, 68, 481, …
## $ retweet_statuses_count <int> 41375, 789, 8952, 314, 664, 10499, 16069, 666…
## $ retweet_location <chr> "", "Les Marolles", "St Louis, MO", "Philadel…
## $ retweet_description <chr> "Your \"ONE STOP\" for the latest Job Opportu…
## $ retweet_verified <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FAL…
## $ place_url <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ place_name <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ place_full_name <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ place_type <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ country <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ country_code <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ geo_coords <list> [<NA, NA>, <NA, NA>, <NA, NA>, <NA, NA>, <NA…
## $ coords_coords <list> [<NA, NA>, <NA, NA>, <NA, NA>, <NA, NA>, <NA…
## $ bbox_coords <list> [<NA, NA, NA, NA, NA, NA, NA, NA>, <NA, NA, …
## $ status_url <chr> "https://twitter.com/PublicHealthBot/status/1…
## $ name <chr> "Public Health Data Science", "Public Health …
## $ location <chr> "England, United Kingdom", "England, United K…
## $ description <chr> "Interested in #PublicHealth and #DataScience…
## $ url <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ protected <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FAL…
## $ followers_count <int> 755, 755, 755, 755, 755, 755, 755, 755, 755, …
## $ friends_count <int> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
## $ listed_count <int> 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 1…
## $ statuses_count <int> 10620, 10620, 10620, 10620, 10620, 10620, 106…
## $ favourites_count <int> 9945, 9945, 9945, 9945, 9945, 9945, 9945, 994…
## $ account_created_at <dttm> 2018-04-10 10:48:59, 2018-04-10 10:48:59, 20…
## $ verified <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FAL…
## $ profile_url <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ profile_expanded_url <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ account_lang <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ profile_banner_url <chr> "https://pbs.twimg.com/profile_banners/983658…
## $ profile_background_url <chr> "http://abs.twimg.com/images/themes/theme1/bg…
## $ profile_image_url <chr> "http://pbs.twimg.com/profile_images/98366123…
## $ views <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, …
## $ n_urls <int> 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ n_uq_urls <int> 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ n_hashtags <int> 4, 0, 2, 0, 1, 0, 0, 1, 0, 0, 1, 0, 6, 0, 11,…
## $ n_uq_hashtags <int> 4, 0, 2, 0, 1, 0, 0, 1, 0, 0, 1, 0, 6, 0, 11,…
## $ n_mentions <int> 0, 1, 1, 0, 2, 0, 0, 0, 0, 0, 4, 0, 1, 0, 0, …
## $ n_uq_mentions <int> 0, 1, 1, 0, 2, 0, 0, 0, 0, 0, 4, 0, 1, 0, 0, …
## $ n_chars <int> 129, 21, 127, 124, 99, 136, 121, 144, 36, 133…
## $ n_uq_chars <int> 34, 13, 37, 28, 27, 33, 33, 27, 18, 38, 25, 1…
## $ n_commas <int> 0, 0, 1, 1, 1, 2, 0, 0, 0, 1, 0, 3, 3, 4, 0, …
## $ n_digits <int> 0, 0, 0, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, …
## $ n_exclaims <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ n_extraspaces <int> 1, 0, 1, 0, 1, 1, 0, 0, 0, 3, 4, 0, 0, 0, 1, …
## $ n_lowers <int> 97, 20, 104, 113, 92, 116, 98, 138, 32, 109, …
## $ n_lowersp <dbl> 0.7538462, 0.9545455, 0.8203125, 0.9120000, 0…
## $ n_periods <int> 1, 0, 1, 0, 1, 2, 2, 1, 0, 4, 0, 0, 2, 3, 0, …
## $ n_words <int> 17, 3, 19, 19, 15, 22, 22, 17, 3, 23, 13, 5, …
## $ n_uq_words <int> 17, 3, 16, 15, 15, 20, 19, 17, 3, 22, 11, 5, …
## $ n_caps <int> 22, 0, 13, 9, 3, 10, 18, 4, 4, 14, 3, 1, 18, …
## $ n_nonasciis <int> 0, 4, 0, 0, 0, 12, 3, 0, 0, 0, 0, 0, 0, 0, 16…
## $ n_puncts <int> 9, 0, 8, 1, 2, 4, 3, 1, 0, 4, 5, 1, 9, 2, 17,…
## $ n_capsp <dbl> 0.17692308, 0.04545455, 0.10937500, 0.0800000…
## $ n_charsperword <dbl> 7.222222, 5.500000, 6.400000, 6.250000, 6.250…
## $ sent_afinn <dbl> 1, 0, -3, 0, 3, 0, 1, 1, 0, 2, 4, 0, -1, 0, 2…
## $ sent_bing <dbl> 0, 0, -1, 0, 2, 0, 0, 1, 0, 1, 2, 0, 0, 1, 3,…
## $ sent_syuzhet <dbl> 0.60, 0.00, 0.05, 2.20, 1.95, 0.00, 0.85, 0.7…
## $ sent_vader <dbl> 0.4, 0.0, -1.5, 0.0, 5.0, 0.0, 2.3, 1.9, 0.0,…
## $ n_polite <dbl> 0.5000000, 0.0000000, -0.6250000, 0.0000000, …
## $ n_first_person <int> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, …
## $ n_first_personp <int> 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ n_second_person <int> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ n_second_personp <int> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, …
## $ n_third_person <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ n_tobe <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, …
## $ n_prepositions <int> 2, 0, 0, 3, 3, 4, 2, 2, 0, 2, 1, 0, 1, 1, 0, …
## $ w1 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w2 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w3 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w4 <dbl> 0.0000000, 0.0000000, 0.1176471, 0.0000000, 0…
## $ w5 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0588235…
## $ w6 <dbl> 0.00000000, 0.00000000, 0.05882353, 0.0000000…
## $ w7 <dbl> 0.05555556, 0.00000000, 0.00000000, 0.0000000…
## $ w8 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w9 <dbl> 0.00000000, 0.00000000, 0.05882353, 0.0000000…
## $ w10 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w11 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w12 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w13 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ w14 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.1176470…
## $ w15 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w16 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w17 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w18 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w19 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w20 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w21 <dbl> 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0…
## $ w22 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w23 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w24 <dbl> 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0…
## $ w25 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w26 <dbl> 0.00000000, 0.00000000, 0.11764706, 0.0000000…
## $ w27 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w28 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w29 <dbl> 0.16666667, 0.00000000, 0.00000000, 0.0000000…
## $ w30 <dbl> 0.05555556, 0.00000000, 0.00000000, 0.0000000…
## $ w31 <dbl> 0.00000000, 0.00000000, 0.05882353, 0.1176470…
## $ w32 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w33 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w34 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w35 <dbl> 0.00000000, 0.50000000, 0.00000000, 0.0000000…
## $ w36 <dbl> 0.00000000, 0.00000000, 0.05882353, 0.0000000…
## $ w37 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w38 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.1176470…
## $ w39 <dbl> 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0…
## $ w40 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0588235…
## $ w41 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w42 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w43 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w44 <dbl> 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0…
## $ w45 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ w46 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0588235…
## $ w47 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w48 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w49 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w50 <dbl> 0.00000000, 0.00000000, 0.05882353, 0.0000000…
## $ w51 <dbl> 0.05555556, 0.00000000, 0.00000000, 0.0588235…
## $ w52 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w53 <dbl> 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0…
## $ w54 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0588235…
## $ w55 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w56 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0588235…
## $ w57 <dbl> 0.05555556, 0.00000000, 0.00000000, 0.0000000…
## $ w58 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w59 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w60 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w61 <dbl> 0.05555556, 0.00000000, 0.00000000, 0.0000000…
## $ w62 <dbl> 0.05555556, 0.00000000, 0.00000000, 0.0000000…
## $ w63 <dbl> 0.00000000, 0.00000000, 0.05882353, 0.0000000…
## $ w64 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ w65 <dbl> 0.00000000, 0.00000000, 0.29411765, 0.0000000…
## $ w66 <dbl> 0.05555556, 0.00000000, 0.00000000, 0.0000000…
## $ w67 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ w68 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w69 <dbl> 0.00000000, 0.00000000, 0.05882353, 0.0588235…
## $ w70 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w71 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.1176470…
## $ w72 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w73 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w74 <dbl> 0.00000000, 0.50000000, 0.05882353, 0.0000000…
## $ w75 <dbl> 0.16666667, 0.00000000, 0.00000000, 0.0000000…
## $ w76 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w77 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w78 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w79 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w80 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w81 <dbl> 0.11111111, 0.00000000, 0.00000000, 0.0000000…
## $ w82 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w83 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w84 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w85 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w86 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0588235…
## $ w87 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w88 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w89 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w90 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w91 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w92 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w93 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w94 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w95 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ w96 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ w97 <dbl> 0.05555556, 0.00000000, 0.00000000, 0.0000000…
## $ w98 <dbl> 0.05555556, 0.00000000, 0.00000000, 0.0588235…
## $ w99 <dbl> 0.05555556, 0.00000000, 0.00000000, 0.0000000…
## $ w100 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.0000000…
## $ sing_pron <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, …
## $ desc1 <chr> "Epi Job Openings Your \"ONE STOP\" for the l…
## $ row <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14…
## $ doc_id <chr> "text1", "text2", "text3", "text4", "text5", …
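A sketch of the parsing step: the retweeter name and description pasted together are held in `desc1` (visible in the glimpse above), and spacy_parse assigns the default document ids "text1", "text2", ... seen in `doc_id`:

```r
library(spacyr)

# Start the spaCy backend and parse each combined description, keeping
# part-of-speech tags, named entities and noun phrases.
spacy_initialize()

parsed <- spacy_parse(tweets_tf$desc1,
                      pos = TRUE, tag = TRUE,
                      entity = TRUE, nounphrase = TRUE)
head(parsed)
```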
## Found 'spacy_condaenv'. spacyr will use this environment
## successfully initialized (spaCy Version: 2.0.16, language model: en)
## (python options: type = "condaenv", value = "spacy_condaenv")
## doc_id sentence_id token_id token lemma pos tag entity
## 1 text1 1 1 Epi epi PROPN NNP ORG_B
## 2 text1 1 2 Job job PROPN NNP ORG_I
## 3 text1 1 3 Openings openings PROPN NNP ORG_I
## 4 text1 2 1 Your -PRON- ADJ PRP$
## 5 text1 2 2 " " PUNCT ``
## 6 text1 2 3 ONE one NUM CD CARDINAL_B
## nounphrase whitespace
## 1 beg TRUE
## 2 mid TRUE
## 3 end_root TRUE
## 4 beg TRUE
## 5 mid FALSE
## 6 mid TRUE
annotated_description | person |
---|---|
Socially Determined Technology startup based in Washington [ GPE_B ] , DC [ GPE_B ] | Using # DataAnalytics [ WORK_OF_ART_B ] to improve health outcomes | Experts in the [ ORG_B ] Social [ ORG_I ] Determinants [ ORG_I ] of [ ORG_I ] Health [ ORG_I ] ( # SDOH ) | org |
Pharma [ ORG_B ] Tech [ ORG_I ] Outlook [ ORG_I ] Pharma Tech Outlook has contributors from the most established organizations who have been presenting their viewpoint using this unique print platform .. Pharma [ ORG_B ] Tech [ ORG_I ] Outlook [ ORG_I ] Pharma Tech Outlook has contributors from the most established organizations who have been presenting their viewpoint using this unique print platform . | org |
postDICOM PostDICOM is the [ DATE_B ] next [ DATE_I ] generation [ DATE_I ] # PACS [ PERSON_B ] which is built using Cloud [ ORG_B ] technologies .. It is especially designed for medical students , # doctors and # hospitals .. It is especially designed for medical students , # doctors and # hospitals . | person |
caitie hawley global health professional working at UW [ ORG_B ] .. 3/31/18 .. all views my own .. all views my own .. she / her / hers. she / her / hers | org |
INCLINE INCLINE is a [ ORG_B ] Climate [ ORG_I ] Change [ ORG_I ] Research [ ORG_I ] Support [ ORG_I ] Center [ ORG_I ] , created in 2011 [ DATE_B ] at the [ ORG_B ] University [ ORG_I ] of [ ORG_I ] São [ ORG_I ] Paulo [ ORG_I ] ( [ ORG_I ] USP [ ORG_I ] ) , Brazil [ GPE_B ] . | org |
Mark [ PERSON_B ] Pienaar [ PERSON_I ] | person |
Peter [ PERSON_B ] Daszak [ PERSON_I ] @EcoHealthNYC President .. @theNASEM. Forum on # microbialthreats Chair [ ORG_B ] .. Zoologist .. Parasitologist .. Ecologist .. British [ NORP_B ] .. American [ NORP_B ] . | person |
UNC Global Solutions Twitter home of Research [ ORG_B ] , [ ORG_I ] Innovation [ ORG_I ] and [ ORG_I ] Global [ ORG_I ] Solutions [ ORG_I ] at UNC [ ORG_B ] Gillings [ ORG_I ] School [ ORG_I ] of [ ORG_I ] Global [ ORG_I ] Public [ ORG_I ] Health [ ORG_I ] .. Find out about global health events and news here ! | org |
Adelaide Terracciano Digital Evangelist | Technology Enthusiast | Talk to me about [ MONEY_B ] # [ MONEY_I ] DigitalStrategy [ MONEY_I ] # [ MONEY_I ] Futurism # AI # IoT [ MONEY_B ]. Adelaide Terracciano Digital Evangelist | Technology Enthusiast | Talk to me about [ MONEY_B ] # [ MONEY_I ] DigitalStrategy [ MONEY_I ] # [ MONEY_I ] Futurism # AI # IoT [ MONEY_B ] | org |
Lisa [ PERSON_B ] Sullivan [ PERSON_I ] Principal-#Foresight [ PERSON_I ] & [ PERSON_I ] # Strategy at INFUSE [ ORG_B ] Corp. [ ORG_I ]. # [ MONEY_B ] education [ MONEY_I ] # [ MONEY_B ] workforce [ MONEY_I ]. # [ MONEY_B ] emergingtech [ MONEY_I ] # socimp # [ MONEY_B ] Futurist [ MONEY_I ] # [ MONEY_B ] Strategist [ MONEY_I ] # [ MONEY_I ] LongNow9125 [ GPE_B ] # FutureReady [ MONEY_B ] | person |
Pulse Lab Kampala Pulse Lab Kampala [ ORG_B ] is a UN [ ORG_B ] inter - agency [ NORP_B ] initiative , exploring opportunities for big data & [ ORG_B ] data innovation to achieve the [ LOC_B ] Global [ LOC_I ] Goals [ LOC_I ] .. http://t.co/RiOsbeH1ey | org |
CDC [ ORG_B ] Genomics [ ORG_I ] & [ ORG_I ] Precision [ ORG_I ] Health [ ORG_I ] CDC [ ORG_I ] Office [ ORG_I ] of [ ORG_I ] Genomics [ ORG_I ] and [ ORG_I ] Precision [ ORG_I ] Public [ ORG_I ] Health [ ORG_I ] : Using Genomics [ ORG_B ] & [ ORG_I ] Precision [ ORG_I ] Health [ ORG_I ] to Save Lives and Prevent Disease .. Links ≠ Endorsements . | org |
Reason [ ORG_B ] Digital [ ORG_I ] Working [ ORG_I ] exclusively in digital with projects that do social good .. Follow us for advice on websites , mobile , apps , email , innovation and more .. Follow us for advice on websites , mobile , apps , email , innovation and more . | org |
into .. AI -. The Global AI Ecosystem # intoAI get into AI [ GPE_B ] .. The # [ CARDINAL_B ] AI Knowledge and Global [ ORG_B ] Community [ ORG_I ] Platform [ ORG_I ] .. Helps you learn more about [ MONEY_B ] # [ MONEY_I ] AI [ MONEY_I ] .. Helps you learn more about [ MONEY_B ] # [ MONEY_I ] AI [ MONEY_I ] .. See @Neurons_AI , @Awards_AI @Events_AI @RoboticsandAI. # intoAI | org |
Edgar [ PERSON_B ] Garcia [ PERSON_I ]. That is why!!!/A mi me daban 2 [ CARDINAL_B ] ! ! !. That is why!!!/A mi me daban 2 [ CARDINAL_B ] ! ! ! | person |
Joseph [ PERSON_B ] Loh [ PERSON_I ] # PrivateEquity [ ORG_B ] and # Investment [ PERSON_B ] Expert [ PERSON_I ] in # EmergingMarkets [ MONEY_B ] .. Click the link to learn the basics of # PrivateEquity [ MONEY_B ] ( # PE [ MONEY_B ] ) . | person |
bryan gottsman Focus [ ORG_B ] on [ ORG_I ] the edge , AI [ PERSON_B ] , emerging tech .. Background in # HealthIT [ MONEY_B ] .. Husband [ PERSON_B ] and father making it happen every day for my family. Husband [ PERSON_B ] and father making it happen every day for my family | person |
Paul [ PERSON_B ] Blaser [ PERSON_I ] Data [ PERSON_I ] design , data visualization , data architecture , data quality .. Munging data in python , SQL [ ORG_B ] , & [ WORK_OF_ART_B ] .net .. # datavisualization # [ MONEY_B ] dataviz [ MONEY_I ] # [ CARDINAL_B ] dataisbeautiful # Tableau [ MONEY_B ] | person |
David [ PERSON_B ] Harris [ PERSON_I ] CEO , American [ ORG_B ] Jewish [ ORG_I ] Committee [ ORG_I ] ( AJC ) …. “. Too often we enjoy the comfort of opinion without the discomfort of thought .. Too often we enjoy the comfort of opinion without the discomfort of thought ..” - President John [ PERSON_B ] F. [ PERSON_I ] Kennedy [ PERSON_I ] | person |
CDC [ ORG_B ] Genomics [ ORG_I ] & [ ORG_I ] Precision [ ORG_I ] Health [ ORG_I ] CDC [ ORG_I ] Office [ ORG_I ] of [ ORG_I ] Genomics [ ORG_I ] and [ ORG_I ] Precision [ ORG_I ] Public [ ORG_I ] Health [ ORG_I ] : Using Genomics [ ORG_B ] & [ ORG_I ] Precision [ ORG_I ] Health [ ORG_I ] to Save Lives and Prevent Disease .. Links ≠ Endorsements . | org |
desc1 |
---|
Brian Martin M2M / IoT Sales Professional. Currently at Vodafone IoT (hence the tweets). Keeping networks, machines and people connected. Passionate about Life & Rugby. |
Monique Thornton, MPH #Healthcomm: @Westat | Editor: @LTPHmedia | Writer: @theddrj | Let's talk: #socialmedia, #edutainment, #digitalhealth | Ops mine. #amwriting #publichealth |
Bee 布瑞雅 Epidemiologist in-training @queensu, artist by design. | Formerly @iepi_mcmaster, @McMaster_MI @ccghr| Bridging data ethics & infectious disease | She/her. |
Kayla Beth Translator terrestrielle / PhD student @uniofoxford @OxfordDemSci / researching experiences of air quality / views are my own |
Ashley Perry Senior Vice President, Socially Determined. Improving the health of communities and the performance of health care organizations through #SDOH #analytics. |
Melinda Mills Professor University of Oxford, @NuffieldCollege, Director @OxfordDemSci PI @sociogenome project |
MPIDR Max Planck Institute for Demographic Research (MPIDR), Rostock, Germany. News & press releases. |
Center for Innovative Design and Analysis The Center of Innovative Design and Analysis is part of the Department of Biostatistics & Informatics in the @Coloradosph at @CUAnschutz. |
𝕍𝕠𝕝𝕜𝕒𝕟 𝕋𝕠𝕡𝕒𝕝𝕝𝕚 Criminologist | Andrew Young School of Policy Studies | Into Urban Crime, Cybercrime, Future Crime, Cashless Economy, Offender Psychology, Crime Policy, Crypto |
Sam Konneh Entrepreneurial blogger, content writer seeking to merge my vast business, governance knowledge with the exciting SEO, marketing and emerging technologies |