Using scispacy
for analysing fluoride texts - Python packages for scientific and clinical NER.
#
library(reticulate)
py_install("python")
#py_install("pytorch")
#py_install("spacy-transformers", pip = TRUE)
py_install("pandas")
#import("torch")
library(tidyverse)
#py_install("ssl")
py_install("spacy")
py_install("scispacy", pip = TRUE)
py_install("https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.4.0/en_core_sci_scibert-0.4.0.tar.gz", pip = TRUE)
py_install("https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.4.0/en_ner_craft_md-0.4.0.tar.gz", pip = TRUE)
import("spacy")
## Module(spacy)
import("scispacy")
## Module(scispacy)
import("en_core_sci_scibert")
## Module(en_core_sci_scibert)
import flair
import spacy
import scispacy
import pandas as pd
import spacy_transformers
from spacy import displacy
nlp = spacy.load("en_core_sci_scibert")
nlp1 = spacy.load("en_ner_craft_md")
text = "Alterations in the hypocretin receptor 2 and preprohypocretin genes produce narcolepsy in some animals, butter bullfinch, octreotide."
doc = nlp(text)
print(doc.ents )
## (Alterations, hypocretin receptor 2, preprohypocretin, genes, narcolepsy, animals, butter, bullfinch, octreotide)
displacy.render(doc, jupyter = True, style = 'ent')
text = "Alterations in the hypocretin receptor 2 and preprohypocretin genes produce narcolepsy in some animals, butter bullfinch, octreotide."
def createTable(nlp, document):
    doc = nlp(document)
    values = {}
    for x in doc.ents:
        values[x.text] = x.label_
    return values
import spacy
nlp = spacy.load("en_core_sci_scibert")
createTable(nlp, text)
## {'Alterations': 'ENTITY', 'hypocretin receptor 2': 'ENTITY', 'preprohypocretin': 'ENTITY', 'genes': 'ENTITY', 'narcolepsy': 'ENTITY', 'animals': 'ENTITY', 'butter': 'ENTITY', 'bullfinch': 'ENTITY', 'octreotide': 'ENTITY'}
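Because the dictionary returned by `createTable` is keyed by entity text, a term mentioned twice would be stored only once, and its position in the text is lost. The following is a minimal sketch of an alternative that keeps one row per mention. The names `entity_rows` and `FakeSpan` are hypothetical; `FakeSpan` is only a stand-in so the example runs without a model, and the third row is an invented repeat mention for illustration. Spans in `doc.ents` expose the same `text`, `label_`, `start_char` and `end_char` attributes.

```python
from collections import namedtuple

# Hypothetical stand-in for a spaCy Span, so the sketch runs without a model.
# Real spans from doc.ents expose the same four attributes.
FakeSpan = namedtuple("FakeSpan", ["text", "label_", "start_char", "end_char"])


def entity_rows(ents):
    """Return one dict per entity mention, preserving duplicates and order."""
    return [
        {"text": e.text, "label": e.label_, "start": e.start_char, "end": e.end_char}
        for e in ents
    ]


demo_ents = [
    FakeSpan("narcolepsy", "ENTITY", 76, 86),
    FakeSpan("animals", "ENTITY", 95, 102),
    FakeSpan("narcolepsy", "ENTITY", 120, 130),  # invented repeat mention, kept as its own row
]
rows = entity_rows(demo_ents)
for row in rows:
    print(row)
```

With a loaded model this would be called as `entity_rows(nlp(text).ents)`, and the resulting list of dicts can be passed straight to `pd.DataFrame(rows)`.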
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained('t5-base')
# T5 is an encoder-decoder model, so it is loaded with AutoModelForSeq2SeqLM
# (AutoModelWithLMHead is deprecated and raises a FutureWarning)
model = AutoModelForSeq2SeqLM.from_pretrained('t5-base', return_dict=True)
sequence = ("In May, Churchill was still generally unpopular with many Conservatives and probably most of the Labour Party. Chamberlain "
"remained Conservative Party leader until October when ill health forced his resignation. By that time, Churchill had won the "
"doubters over and his succession as party leader was a formality."
" "
"He began his premiership by forming a five-man war cabinet which included Chamberlain as Lord President of the Council, "
"Labour leader Clement Attlee as Lord Privy Seal (later as Deputy Prime Minister), Halifax as Foreign Secretary and Labour's "
"Arthur Greenwood as a minister without portfolio. In practice, these five were augmented by the service chiefs and ministers "
"who attended the majority of meetings. The cabinet changed in size and membership as the war progressed, one of the key "
"appointments being the leading trades unionist Ernest Bevin as Minister of Labour and National Service. In response to "
"previous criticisms that there had been no clear single minister in charge of the prosecution of the war, Churchill created "
"and took the additional position of Minister of Defence, making him the most powerful wartime Prime Minister in British "
"history. He drafted outside experts into government to fulfil vital functions, especially on the Home Front. These included "
"personal friends like Lord Beaverbrook and Frederick Lindemann, who became the government's scientific advisor."
" "
"At the end of May, with the British Expeditionary Force in retreat to Dunkirk and the Fall of France seemingly imminent, "
"Halifax proposed that the government should explore the possibility of a negotiated peace settlement using the still-neutral "
"Mussolini as an intermediary. There were several high-level meetings from 26 to 28 May, including two with the French "
"premier Paul Reynaud. Churchill's resolve was to fight on, even if France capitulated, but his position remained precarious "
"until Chamberlain resolved to support him. Churchill had the full support of the two Labour members but knew he could not "
"survive as Prime Minister if both Chamberlain and Halifax were against him. In the end, by gaining the support of his outer "
"cabinet, Churchill outmanoeuvred Halifax and won Chamberlain over. Churchill believed that the only option was to fight on "
"and his use of rhetoric hardened public opinion against a peaceful resolution and prepared the British people for a long war "
"– Jenkins says Churchill's speeches were 'an inspiration for the nation, and a catharsis for Churchill himself'."
" "
"His first speech as Prime Minister, delivered to the Commons on 13 May was the 'blood, toil, tears and sweat' speech. It was "
"little more than a short statement but, Jenkins says, 'it included phrases which have reverberated down the decades'.")
inputs = tokenizer.encode("summarize: " + sequence,
                          return_tensors='pt',
                          max_length=512,
                          truncation=True)
inputs[0]
## tensor([21603, 10, 86, 932, 6, 2345, 1092, 47, 341, 2389,
## 73, 27302, 28, 186, 23053, 7, 11, 1077, 167, 13,
## 8, 16117, 3450, 5, 9572, 521, 77, 3, 7361, 23053,
## 3450, 2488, 552, 1797, 116, 3, 1092, 533, 5241, 112,
## 25372, 5, 938, 24, 97, 6, 2345, 1092, 141, 751,
## 8, 3228, 277, 147, 11, 112, 22289, 38, 1088, 2488,
## 47, 3, 9, 4727, 485, 5, 216, 1553, 112, 2761,
## 2009, 57, 3, 10454, 3, 9, 874, 18, 348, 615,
## 4566, 84, 1285, 9572, 521, 77, 38, 2809, 1661, 13,
## 8, 2063, 6, 16117, 2488, 205, 3335, 71, 8692, 15,
## 38, 2809, 276, 5927, 63, 21085, 41, 5867, 52, 38,
## 3, 16911, 5923, 3271, 201, 31150, 38, 11957, 7471, 11,
## 16117, 31, 7, 13962, 1862, 2037, 38, 3, 9, 6323,
## 406, 4833, 5, 86, 1032, 6, 175, 874, 130, 3,
## 28984, 57, 8, 313, 5752, 7, 11, 6323, 7, 113,
## 5526, 8, 2942, 13, 4677, 5, 37, 4566, 2130, 16,
## 812, 11, 4757, 38, 8, 615, 2188, 15, 26, 6,
## 80, 13, 8, 843, 14936, 271, 8, 1374, 1668, 7,
## 7021, 343, 28031, 493, 2494, 38, 3271, 13, 16117, 11,
## 868, 1387, 5, 86, 1773, 12, 1767, 12334, 7, 24,
## 132, 141, 118, 150, 964, 712, 6323, 16, 1567, 13,
## 8, 22670, 13, 8, 615, 6, 2345, 1092, 990, 11,
## 808, 8, 1151, 1102, 13, 3271, 13, 24296, 6, 492,
## 376, 8, 167, 2021, 615, 715, 5923, 3271, 16, 2390,
## 892, 5, 216, 3, 23505, 1067, 2273, 139, 789, 12,
## 23334, 3362, 3621, 6, 902, 30, 8, 1210, 7383, 5,
## 506, 1285, 525, 803, 114, 2809, 28148, 14370, 11, 22705,
## 14482, 15, 2434, 6, 113, 1632, 8, 789, 31, 7,
## 4290, 8815, 5, 486, 8, 414, 13, 932, 6, 28,
## 8, 2390, 31578, 1208, 5205, 16, 10742, 12, 6393, 157,
## 12546, 11, 8, 2589, 13, 1410, 13045, 27432, 6, 31150,
## 4382, 24, 8, 789, 225, 2075, 8, 5113, 13, 3,
## 9, 3, 26262, 3065, 7025, 338, 8, 341, 18, 8992,
## 8792, 6887, 7, 32, 6129, 38, 46, 25960, 63, 5,
## 290, 130, 633, 306, 18, 4563, 4677, 45, 2208, 12,
## 2059, 932, 6, 379, 192, 28, 8, 2379, 2761, 1838,
## 419, 63, 29, 402, 26, 5, 2345, 1092, 31, 7,
## 7785, 47, 12, 2870, 30, 6, 237, 3, 99, 1410,
## 2468, 17680, 920, 6, 68, 112, 1102, 3, 7361, 554,
## 1720, 2936, 552, 9572, 521, 77, 13803, 12, 380, 376,
## 5, 2345, 1092, 141, 8, 423, 380, 13, 8, 192,
## 16117, 724, 68, 2124, 3, 88, 228, 59, 7905, 38,
## 5923, 3271, 3, 99, 321, 9572, 521, 77, 11, 31150,
## 130, 581, 376, 5, 86, 8, 414, 6, 57, 3,
## 11866, 8, 380, 13, 112, 12231, 4566, 6, 2345, 1092,
## 91, 348, 16445, 26, 31150, 11, 751, 9572, 521, 77,
## 147, 5, 2345, 1092, 6141, 24, 8, 163, 1182, 47,
## 12, 2870, 30, 11, 112, 169, 13, 23051, 614, 4632,
## 452, 3474, 581, 3, 9, 9257, 3161, 11, 2657, 8,
## 2390, 151, 21, 3, 9, 307, 615, 3, 104, 28779,
## 845, 2345, 1092, 31, 7, 26147, 130, 3, 31, 152,
## 3773, 1])
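The tensor above holds exactly 512 token ids, so with `truncation=True` and `max_length=512` the end of the passage (the final paragraph about the 13 May speech) never reaches the model. One possible workaround, sketched here under simplifying assumptions, is to split the text into chunks and summarise each chunk separately. The helper `chunk_words` and the budget of 350 words are hypothetical choices; whitespace-separated words are only a rough proxy for T5 tokens.

```python
def chunk_words(text, max_words=350):
    """Split text into consecutive chunks of at most max_words whitespace-separated words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


# Synthetic 800-word input, used only to show the chunk sizes
demo = " ".join(f"w{i}" for i in range(800))
chunks = chunk_words(demo)
print([len(c.split()) for c in chunks])  # [350, 350, 100]
```

Each chunk could then be prefixed with `"summarize: "` and encoded in the same way as the full sequence. Splitting on fixed word counts can cut a sentence in half, so a sentence-aware splitter may give cleaner summaries.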
from transformers import pipeline
summarizer = pipeline("summarization")
summarized = summarizer(sequence, min_length=75, max_length=300)
# Print summarized text
print(summarized)
## [{'summary_text': ' In May, Churchill was still generally unpopular with many Conservatives and probably most of the Labour Party . Chamberlain remained Conservative Party leader until October when ill health forced his resignation . By that time, Churchill had won the doubters over and his succession as party leader was a formality . He began his premiership by forming a five-man war cabinet which included Chamberlain as Lord President of the Council, Labour leader Clement Attlee as Lord Privy Seal .'}]
summary_ids = model.generate(inputs, max_length = 500, min_length=80, length_penalty=5, num_beams=2)
summary = tokenizer.decode(summary_ids[0])
summary[6:]
## "churchill formed a five-man war cabinet which included chamberlain as Lord President of the Council, labour leader Clement Attlee as Lord Privy Seal (later as Deputy Prime Minister), Halifax as Foreign Secretary and labour's Arthur Greenwood as a minister without portfolio. he drafted outside experts into government to fulfil vital functions, especially on the home front. he was the most powerful wartime prime minister in British history.</s>"
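The slice `summary[6:]` removes the leading `<pad> ` marker by position, but the trailing `</s>` remains in the output above. A minimal sketch that strips the markers by name instead is shown below. `strip_special_tokens` is a hypothetical helper, and the two token strings are simply the ones visible in the decoded text. When the tokenizer is at hand, `tokenizer.decode(summary_ids[0], skip_special_tokens=True)` is the more direct route.

```python
def strip_special_tokens(text, tokens=("<pad>", "</s>")):
    """Remove the listed marker strings and tidy the surrounding whitespace."""
    for tok in tokens:
        text = text.replace(tok, "")
    # Collapse runs of whitespace left behind by the removed markers
    return " ".join(text.split())


raw = "<pad> churchill formed a five-man war cabinet.</s>"
cleaned = strip_special_tokens(raw)
print(cleaned)  # churchill formed a five-man war cabinet.
```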
py$summary %>%
as_tibble() %>%
gt::gt()
| value |
|---|
| <pad> churchill formed a five-man war cabinet which included chamberlain as Lord President of the Council, labour leader Clement Attlee as Lord Privy Seal (later as Deputy Prime Minister), Halifax as Foreign Secretary and labour's Arthur Greenwood as a minister without portfolio. he drafted outside experts into government to fulfil vital functions, especially on the home front. he was the most powerful wartime prime minister in British history.</s> |
library(tidytext)
py$sequence %>%
myScrapers::text_summariser(., n = 5) %>%
as_tibble() %>%
gt::gt()
| value |
|---|
| Churchill had the full support of the two labour members but knew he could not survive as prime minister if both chamberlain and halifax were against him. In the end, by gaining the support of his outer cabinet, churchill outmanoeuvred halifax and won chamberlain over. He began his premiership by forming a five-man war cabinet which included chamberlain as lord president of the council, labour leader clement attlee as lord privy seal (later as deputy prime minister), halifax as foreign secretary and labour's arthur greenwood as a minister without portfolio. In may, churchill was still generally unpopular with many conservatives and probably most of the labour party. In response to previous criticisms that there had been no clear single minister in charge of the prosecution of the war, churchill created and took the additional position of minister of defence, making him the most powerful wartime prime minister in british history. |