Set up Python within the R environment:
knitr::opts_chunk$set(echo = TRUE)
#reticulate allows for Python code in the R environment
library(reticulate)
#set environment and options specific for reticulate
use_condaenv("r-reticulate", required=TRUE)
Import packages:
import pandas as pd
import numpy as np
import string
import re
import html
import unicodedata
import nltk
from nltk.corpus import stopwords
import emoji
import matplotlib.pyplot as plt
import datetime as dt
from nltk.stem import WordNetLemmatizer
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.svm import SVC
from tensorflow import keras
from tensorflow.keras import layers, models, callbacks
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
import tensorflow as tf
from transformers import (
AutoTokenizer,
TFBertForSequenceClassification,
TFTrainingArguments
)
import datasets
import evaluate
Bring in data:
df = pd.read_csv('data_huang_devansh.csv', usecols=['Content','Label'])
See here for information about the data: https://data.mendeley.com/datasets/9sxpkmm8xn/1
This is a compilation of over 800 thousand text snippets that have been hand-labeled for whether or not they constitute hate speech.
The data is made up of only two columns:
- Content: the text of the snippet
- Label: the hand-coded value for whether the text is considered hate speech (1) or not (0)
Let’s lightly tidy and then glance at the data:
#retain only string values
df = df[df['Content'].apply(type)==str]
#remove duplicate Content values
df.drop_duplicates(subset='Content', inplace=True)
df.sample(10, random_state=11219)['Content']
## 740181 RT @geechiegal843: I act like ion ee kno dat b...
## 448765 XD FUCK! FUCK! FUUUUUUUUUUUUCK This is amazing...
## 515694 "\n\nand your atual post:\n\n Hello, Sorry I c...
## 410886 You suck Kat and Andre. Poorest form ever!\nOn...
## 742422 Talk about pussy power 😒
## 663120 niggers
## 789583 Saturday night spent in bed on my own cus ever...
## 81354 ITS NOT A PERSONAL ATTACK HES MY BRO STOP B...
## 92492 Quick, tell someone with real power to block m...
## 401243 @iqy007 @alwalawalbaraaa @DanieleRaineri One w...
## Name: Content, dtype: object
Immediately I can tell there are some aspects of the text worth cleaning up for model-building purposes, including:
- Capitalization
- Stray spaces and punctuation
- User @ mentions
- The “RT” signifier for retweets

I also know from the data documentation that there are emojis, datetimes, and hyperlinks in the data. I should account for all of these.
Let’s also quickly check our class distribution:
df['Label'].value_counts()
## Label
## 0 467033
## 1 93352
## Name: count, dtype: int64
While there are notably fewer instances of hate speech than non-hate speech, this is still a workable distribution. Moreover, in real-world social media moderation, it is common that only a small minority of content violates a hate speech policy.
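For reference, that works out to roughly one hate speech example in six:
#class share of hate speech, computed from the counts above
counts = df['Label'].value_counts()
print(f"hate speech share: {counts[1]/counts.sum():.1%}")
#prints 'hate speech share: 16.7%' given the counts above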
At this point, I think it’s best to sample our data down to about 50k rows. In an ideal world we would train on all of our data, but via trial and error I have determined that training on the full 500k+ rows is untenable with my setup.
At this stage, I’ll take my samples based on the proportions in the dataset.
hs_sample_count = int(df['Label'].value_counts()[1]/len(df)*50000)
non_hs_sample_count = 50000 - hs_sample_count
#select hate speech samples in proportion to the full data (~8.3k rows)
df_hs = df[df['Label']==1].sample(hs_sample_count, random_state=11219)
#select non hate speech samples for the remainder (~41.7k rows)
df_non_hs = df[df['Label']==0].sample(non_hs_sample_count, random_state=11219)
#concatenate
df1 = pd.concat([df_hs, df_non_hs], axis=0)
#shuffle to randomly order (note df1, not df, so the proportional sample is kept)
df1 = df1.sample(frac=1, random_state=11219)
Cleaning
Our SVM and neural network may require some different cleaning techniques, but both would benefit from some top-line, model-agnostic cleaning.
First let’s see how the emojis are presented in our text data, since sometimes they appear as the actual symbol and other times as a standardized code.
df1['emoji_list'] = df1['Content'].apply(emoji.emoji_list)
df1['emoji_count'] = df1['emoji_list'].apply(len)
df1.sort_values(by='emoji_count', ascending=False).iloc[5,0]
## 'Šupak meraklije haha😂😂😂😂😂😟😟😟😂😂😂😂'
This shows that the emojis themselves appear in the text and should be handled accordingly.
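If we wanted word-level signal from the emojis, one option (not used here) would be converting them to text codes with the emoji package; a minimal sketch:
#optional: replace emoji symbols with text codes, e.g. 😂 -> ':face_with_tears_of_joy:'
print(emoji.demojize('haha😂😂'))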
Now I can do some model-agnostic cleaning:
# Pre-compiled regex patterns
URL_RE = re.compile(r"https?://\S+|www\.\S+", flags=re.IGNORECASE)
USERHANDLE_RE = re.compile(r"@\w+") # matches @username
RT_RE = re.compile(r"^(RT)\b\s*", flags=re.IGNORECASE)
WS_RE = re.compile(r"\s+")
def clean_shared(text):
text = str(text)
# 1. HTML entity & Unicode NFC normalization
text = html.unescape(text)
text = unicodedata.normalize("NFC", text)
# 2. Remove 'RT' at start
text = RT_RE.sub("", text)
# 3. Remove @mentions anywhere
text = USERHANDLE_RE.sub("", text)
# 4. Replace URLs
text = URL_RE.sub("<URL>", text)
# 5. Collapse whitespace
text = WS_RE.sub(" ", text)
# 6. Final trim & lowercase
return text.strip().lower()
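A quick sanity check on a made-up tweet (hypothetical input string):
#hypothetical input exercising each cleaning step
print(clean_shared("RT @someuser Check THIS out!! https://t.co/abc123   😂"))
#expected output: check this out!! <url> 😂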
Run cleaning function:
#clean data
df1['text_clean'] = df1['Content'].apply(clean_shared)
#reduce back to only relevant columns
df1 = df1[['Content', 'text_clean', 'Label']]
See cleaned data:
df1.sample(20, random_state=11219)['text_clean']
## 596334 {{unblock|yo
## 363211 ` peace, shaad iko, thank you for referring th...
## 72297 ` == ha, ha! == you are jealous? dont be, no n...
## 616358 " you're welcome and good luck with the cup! "
## 253212 we are currently making the unreferenced claim...
## 586368 " radiation therapy implications jellytussle, ...
## 343630 ` == mind explaining this edit? == this one? i...
## 600782 i was last in newry in 2003, and they were cel...
## 750978 * lives in chicago * thinks gun control works ...
## 98321 ` :::i am not going to do biasing here but the...
## 39241 == evidence on adil's sockpuppetry == hi mores...
## 650232 venanalysis is not my main information source,...
## 6004 ` :::::::::oh mark, for shame! why lie? i didn...
## 334039 ` :::``1894...simple transformer substations, ...
## 391399 == wikicup 2015 fp nomination == hi there, jus...
## 250952 ` == ``please read`` notice == i'm curious why...
## 62127 == marc mysterio == hi, this kww editor appear...
## 414862 so twitter is calling kat crazy eyes!!!! bless...
## 726615 shit i wish it was socially acceptable for me ...
## 754426 i know she is very good friends with siobhan a...
## Name: text_clean, dtype: object
There is still a lot of stray punctuation: we can strip it for our SVM, but keep it for our neural network, where punctuation tends to carry semantic signal.
SVM
I will borrow a function from my previous assignment for cleaning and lemmatizing my text.
#pre-compile patterns once
NUM_RE = re.compile(r"\d+")
PUNCT_REMOVE = string.punctuation.replace("<", "").replace(">", "")
def clean_and_lemmatize(text, lemmatizer):
#remove digits
text = NUM_RE.sub("", text)
#remove punctuation (but keep < and >)
text = text.translate(str.maketrans("", "", PUNCT_REMOVE))
#collapse any new whitespace and trim
text = re.sub(r"\s+", " ", text).strip()
#tokenize & lemmatize
tokens = nltk.word_tokenize(text)
lemmas = [lemmatizer.lemmatize(t) for t in tokens]
return " ".join(lemmas)
lemmatizer = WordNetLemmatizer()
df1['text_clean_SVM'] = df1['text_clean'].apply(lambda x: clean_and_lemmatize(x, lemmatizer))
Now I can vectorize the data. I’ll use TF-IDF vectorization, which weights a term’s frequency by how rare the term is across documents, rather than relying on raw counts. I’ll use a max_features value of 5000, so as not to overdo it given my relatively small sample.
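As a toy illustration of the weighting, on a hypothetical three-document corpus, terms that appear in every document get the lowest idf:
#tf-idf downweights terms that appear in many documents
toy = TfidfVectorizer().fit(["the cat sat", "the dog sat", "the cat ran"])
print(dict(zip(toy.get_feature_names_out(), toy.idf_.round(2))))
#'the' (in all 3 docs) gets idf 1.0; 'dog' and 'ran' (1 doc each) get ~1.69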
#create the vectorizer
vectorizer_1g_5k = TfidfVectorizer(
stop_words='english',
max_features=5000
)
#fit and transform the text column
X_SVM = vectorizer_1g_5k.fit_transform(df1['text_clean_SVM'])
And set y:
y_SVM = df1['Label']
Now we can split the data:
X_train_SVM, X_test_SVM, y_train_SVM, y_test_SVM = train_test_split(
X_SVM, y_SVM, test_size=0.2, random_state=11219
)
While in the past I have experimented with different kernels for SVMs, the linear kernel is generally understood to be the best option for text classification: TF-IDF features are high-dimensional and sparse, and classes are often close to linearly separable in that space.
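(As an aside, for sparse text features at this scale, sklearn’s LinearSVC is usually a much faster drop-in for a linear-kernel SVC; a minimal sketch, not used below:)
from sklearn.svm import LinearSVC
#liblinear-based linear SVM, typically much faster than SVC(kernel='linear') on sparse text
clf_fast = LinearSVC().fit(X_train_SVM, y_train_SVM)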
#linear kernel
clf_linear1 = SVC(kernel='linear')
clf_linear1.fit(X_train_SVM, y_train_SVM)
pred_linear1 = clf_linear1.predict(X_test_SVM)
print("Linear Kernel")
## Linear Kernel
print(classification_report(y_test_SVM, pred_linear1))
## precision recall f1-score support
##
## 0 0.89 0.97 0.93 8333
## 1 0.71 0.39 0.51 1667
##
## accuracy 0.87 10000
## macro avg 0.80 0.68 0.72 10000
## weighted avg 0.86 0.87 0.86 10000
Not bad accuracy overall, but performance on hate speech is much lower, especially recall. This could be an issue of class distribution: specifically, the comparatively low number of hate speech examples, even if the overall sample of 50k should be sufficient.
Experiment 1: Class Distribution & Majority Undersampling
Since I’ll be downsampling already, it may be best to simultaneously under-sample the majority class to retain more examples of hate speech. While class balance isn’t required for my models, it may help ensure any accuracy gains are well distributed across classes.
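An alternative worth noting, though not pursued here, is keeping the original distribution and reweighting errors on the rare class instead; a one-line sketch:
#alternative (not used here): penalize mistakes on the minority class more heavily
clf_weighted = SVC(kernel='linear', class_weight='balanced')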
#select 25k hate speech samples
df_hs = df[df['Label']==1].sample(25000, random_state=11219)
#select 25k non hate speech samples
df_non_hs = df[df['Label']==0].sample(25000, random_state=11219)
#concatenate
df2 = pd.concat([df_hs, df_non_hs], axis=0)
#shuffle to randomly order
df2 = df2.sample(frac=1, random_state=11219)
Re-clean:
#clean data
df2['text_clean'] = df2['Content'].apply(clean_shared)
#reduce back to only relevant columns
df2 = df2[['Content', 'text_clean', 'Label']]
Re-clean (specific for SVM) and lemmatize:
lemmatizer = WordNetLemmatizer()
df2['text_clean_SVM'] = df2['text_clean'].apply(lambda x: clean_and_lemmatize(x, lemmatizer))
Re-vectorize:
#create the vectorizer
vectorizer_1g_5k = TfidfVectorizer(
stop_words='english',
max_features=5000
)
#fit and transform the text column
X_SVM2 = vectorizer_1g_5k.fit_transform(df2['text_clean_SVM'])
And set y:
y_SVM2 = df2['Label']
Re-split the data:
X_train_SVM2, X_test_SVM2, y_train_SVM2, y_test_SVM2 = train_test_split(
X_SVM2, y_SVM2, test_size=0.2, random_state=11219
)
Re-train model:
#linear kernel
clf_linear2 = SVC(kernel='linear')
clf_linear2.fit(X_train_SVM2, y_train_SVM2)
pred_linear2 = clf_linear2.predict(X_test_SVM2)
Accuracy report:
print("Linear Kernel")
## Linear Kernel
print(classification_report(y_test_SVM2, pred_linear2))
## precision recall f1-score support
##
## 0 0.80 0.82 0.81 4977
## 1 0.82 0.80 0.81 5023
##
## accuracy 0.81 10000
## macro avg 0.81 0.81 0.81 10000
## weighted avg 0.81 0.81 0.81 10000
We can see a major positive impact on hate speech identification from the re-sampling, though there is a small dip in performance for non-hate speech.
2-grams and Max Features
Let’s see if we can boost performance by expanding the features to include bi-grams. While this massively increases the number of candidate features, the cap on max_features should keep the analysis from getting out of hand.
Re-vectorize:
#create the vectorizer
vectorizer_2g_5k = TfidfVectorizer(
stop_words='english',
ngram_range=(1,2),
max_features=5000
)
#fit and transform the text column
X_SVM3 = vectorizer_2g_5k.fit_transform(df2['text_clean_SVM'])
y_SVM3 = df2['Label']
Re-split the data:
X_train_SVM3, X_test_SVM3, y_train_SVM3, y_test_SVM3 = train_test_split(
X_SVM3, y_SVM3, test_size=0.2, random_state=11219
)
Re-train model:
#linear kernel
clf_linear3 = SVC(kernel='linear')
clf_linear3.fit(X_train_SVM3, y_train_SVM3)
pred_linear3 = clf_linear3.predict(X_test_SVM3)
Accuracy report:
print("Linear Kernel")
## Linear Kernel
print(classification_report(y_test_SVM3, pred_linear3))
## precision recall f1-score support
##
## 0 0.80 0.82 0.81 4977
## 1 0.81 0.80 0.81 5023
##
## accuracy 0.81 10000
## macro avg 0.81 0.81 0.81 10000
## weighted avg 0.81 0.81 0.81 10000
Interestingly, we see very little change here. With max_features still capped at 5000, many bi-grams simply displace informative uni-grams, so the feature set barely changes in practice. We can stick with 1-grams.
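One way to confirm this is to count how many of the 5000 selected features are actually bi-grams:
#bi-gram features contain a space between their two tokens
feats = vectorizer_2g_5k.get_feature_names_out()
print(sum(' ' in f for f in feats), 'of', len(feats), 'features are bi-grams')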
Let’s look at the most important features:
#extract feature names and class coefficients
feature_names = np.array(vectorizer_1g_5k.get_feature_names_out())
coefs = clf_linear2.coef_.toarray() # shape = (1, n_features) for binary
#coefs[0] holds the weights for the class 1 (hate) vs class 0 (non-hate) decision boundary
#identify top-k features for each class
def top_features(class_coef, names, k=20):
#largest positive = pushes toward class 1 (hate)
top_pos_idxs = np.argsort(class_coef)[-k:][::-1]
#largest negative = pushes toward class 0 (non‑hate)
top_neg_idxs = np.argsort(class_coef)[:k]
return pd.DataFrame({
"token_pos": names[top_pos_idxs],
"coef_pos": class_coef[top_pos_idxs],
"token_neg": names[top_neg_idxs],
"coef_neg": class_coef[top_neg_idxs],
})
top20 = top_features(coefs[0], feature_names, k=20)
print(top20)
## token_pos coef_pos token_neg coef_neg
## 0 idiot 5.672962 xd -2.086868
## 1 asshole 4.778240 claimed -2.002501
## 2 retard 4.448382 ex -1.878344
## 3 faggot 4.401133 bihday -1.674916
## 4 stupid 4.126624 staff -1.672323
## 5 retarded 4.086403 rt -1.663113
## 6 loser 3.897166 tradition -1.656991
## 7 fuck 3.855585 mentioned -1.651375
## 8 sexist 3.763206 happiness -1.651329
## 9 twat 3.742369 pp -1.645938
## 10 hell 3.672426 stated -1.642616
## 11 moron 3.572866 youtube -1.634592
## 12 nigger 3.468446 affected -1.633192
## 13 bullshit 3.452559 stormfront -1.618396
## 14 spic 3.412869 average -1.608695
## 15 piss 3.353462 published -1.600724
## 16 crap 3.347687 pattern -1.594814
## 17 bastard 3.310819 article -1.580812
## 18 whore 3.237551 bias -1.572208
## 19 mongy 3.227393 particularly -1.572154
Building a FastText-Style FFNN Model
This model follows the FastText recipe: learn word embeddings, average them across the sequence (GlobalAveragePooling1D), and feed the result to a small dense classifier.
#prepare data
texts = df2['text_clean'].tolist()
labels = df2['Label'].values
#tokenize into integer sequences
max_words = 10000
maxlen = 100
tokenizer = Tokenizer(num_words=max_words, oov_token='<OOV>')
tokenizer.fit_on_texts(texts)
seqs = tokenizer.texts_to_sequences(texts)
X = pad_sequences(seqs, maxlen=maxlen, padding='post')
y = labels
#train/val split
X_train, X_val, y_train, y_val = train_test_split(
X, y, test_size=0.2, stratify=y, random_state=11219
)
#build FastText‑style model
model = models.Sequential([
layers.Embedding(input_dim=max_words, output_dim=100, input_length=maxlen),
layers.GlobalAveragePooling1D(),
layers.Dense(64, activation='relu'),
layers.Dropout(0.5),
layers.Dense(1, activation='sigmoid'),
])
model.compile(
optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy']
)
#train
es = callbacks.EarlyStopping(patience=2, restore_best_weights=True)
model.fit(
X_train, y_train,
validation_data=(X_val, y_val),
epochs=10,
batch_size=128,
callbacks=[es],
verbose=2
)
## Epoch 1/10
## 313/313 - 2s - loss: 0.5745 - accuracy: 0.6957 - val_loss: 0.4624 - val_accuracy: 0.7813 - 2s/epoch - 6ms/step
## Epoch 2/10
## 313/313 - 2s - loss: 0.4265 - accuracy: 0.8115 - val_loss: 0.4059 - val_accuracy: 0.8260 - 2s/epoch - 5ms/step
## Epoch 3/10
## 313/313 - 1s - loss: 0.3717 - accuracy: 0.8416 - val_loss: 0.3994 - val_accuracy: 0.8234 - 1s/epoch - 5ms/step
## Epoch 4/10
## 313/313 - 1s - loss: 0.3387 - accuracy: 0.8571 - val_loss: 0.4000 - val_accuracy: 0.8249 - 1s/epoch - 5ms/step
## Epoch 5/10
## 313/313 - 1s - loss: 0.3221 - accuracy: 0.8663 - val_loss: 0.4074 - val_accuracy: 0.8240 - 1s/epoch - 5ms/step
## <tf_keras.src.callbacks.History object at 0x38ef320b0>
#evaluate
loss, acc = model.evaluate(X_val, y_val, verbose=0)
print(f'Validation accuracy: {acc:.3f}')
## Validation accuracy: 0.823
Full evaluation:
#predict on the validation set
y_pred = (model.predict(X_val) >= 0.5).astype(int).ravel()
## 313/313 [==============================] - 0s 388us/step
#print a report
print(classification_report(y_val, y_pred, target_names=['non‑hate','hate']))
## precision recall f1-score support
##
## non‑hate 0.86 0.77 0.81 5000
## hate 0.79 0.87 0.83 5000
##
## accuracy 0.82 10000
## macro avg 0.83 0.82 0.82 10000
## weighted avg 0.83 0.82 0.82 10000
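For raw counts rather than rates, a confusion matrix gives the same picture; a quick sketch:
from sklearn.metrics import confusion_matrix
#rows = true class (non-hate, hate); columns = predicted class
print(confusion_matrix(y_val, y_pred))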
Now let’s make the data as big as possible while retaining class balance:
df['Label'].value_counts()
## Label
## 0 467033
## 1 93352
## Name: count, dtype: int64
#take all 93,352 hate speech rows, sample an equal number of non-hate rows, and shuffle
df3 = pd.concat([df[df['Label']==0].sample(93352, random_state=11219),
                 df[df['Label']==1]],
                axis=0).sample(frac=1, random_state=11219)
df3['text_clean'] = df3['Content'].apply(clean_shared)
df3['Label'].value_counts()
## Label
## 1 93352
## 0 93352
## Name: count, dtype: int64
#prepare data
texts = df3['text_clean'].tolist()
labels = df3['Label'].values
#tokenize into integer sequences
max_words = 10000
maxlen = 100
tokenizer = Tokenizer(num_words=max_words, oov_token='<OOV>')
tokenizer.fit_on_texts(texts)
seqs = tokenizer.texts_to_sequences(texts)
X = pad_sequences(seqs, maxlen=maxlen, padding='post')
y = labels
#train/val split
X_train, X_val, y_train, y_val = train_test_split(
X, y, test_size=0.2, stratify=y, random_state=11219
)
#build FastText‑style model
model = models.Sequential([
layers.Embedding(input_dim=max_words, output_dim=100, input_length=maxlen),
layers.GlobalAveragePooling1D(),
layers.Dense(64, activation='relu'),
layers.Dropout(0.5),
layers.Dense(1, activation='sigmoid'),
])
model.compile(
optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy']
)
#train
es = callbacks.EarlyStopping(patience=2, restore_best_weights=True)
model.fit(
X_train, y_train,
validation_data=(X_val, y_val),
epochs=10,
batch_size=128,
callbacks=[es],
verbose=2
)
## Epoch 1/10
## 1167/1167 - 5s - loss: 0.4616 - accuracy: 0.7823 - val_loss: 0.3834 - val_accuracy: 0.8330 - 5s/epoch - 5ms/step
## Epoch 2/10
## 1167/1167 - 5s - loss: 0.3728 - accuracy: 0.8405 - val_loss: 0.3722 - val_accuracy: 0.8374 - 5s/epoch - 5ms/step
## Epoch 3/10
## 1167/1167 - 6s - loss: 0.3532 - accuracy: 0.8484 - val_loss: 0.3702 - val_accuracy: 0.8384 - 6s/epoch - 5ms/step
## Epoch 4/10
## 1167/1167 - 6s - loss: 0.3399 - accuracy: 0.8521 - val_loss: 0.3767 - val_accuracy: 0.8345 - 6s/epoch - 5ms/step
## Epoch 5/10
## 1167/1167 - 5s - loss: 0.3285 - accuracy: 0.8564 - val_loss: 0.3754 - val_accuracy: 0.8360 - 5s/epoch - 4ms/step
## <tf_keras.src.callbacks.History object at 0x38e314130>
#evaluate
loss, acc = model.evaluate(X_val, y_val, verbose=0)
print(f'Validation accuracy: {acc:.3f}')
## Validation accuracy: 0.838
#predict on the validation set
y_pred = (model.predict(X_val) >= 0.5).astype(int).ravel()
## 1167/1167 [==============================] - 0s 375us/step
#print a report
print(classification_report(y_val, y_pred, target_names=['non‑hate','hate']))
## precision recall f1-score support
##
## non‑hate 0.85 0.82 0.84 18671
## hate 0.83 0.86 0.84 18670
##
## accuracy 0.84 37341
## macro avg 0.84 0.84 0.84 37341
## weighted avg 0.84 0.84 0.84 37341
Now let’s change the layer width, doubling the hidden dense layer from 64 to 128 units:
#build FastText‑style model
model = models.Sequential([
layers.Embedding(input_dim=max_words, output_dim=100, input_length=maxlen),
layers.GlobalAveragePooling1D(),
layers.Dense(128, activation='relu'),
layers.Dropout(0.5),
layers.Dense(1, activation='sigmoid'),
])
model.compile(
optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy']
)
#train
es = callbacks.EarlyStopping(patience=2, restore_best_weights=True)
model.fit(
X_train, y_train,
validation_data=(X_val, y_val),
epochs=10,
batch_size=128,
callbacks=[es],
verbose=2
)
## Epoch 1/10
## 1167/1167 - 6s - loss: 0.4508 - accuracy: 0.7881 - val_loss: 0.3860 - val_accuracy: 0.8314 - 6s/epoch - 5ms/step
## Epoch 2/10
## 1167/1167 - 5s - loss: 0.3675 - accuracy: 0.8413 - val_loss: 0.3715 - val_accuracy: 0.8385 - 5s/epoch - 5ms/step
## Epoch 3/10
## 1167/1167 - 6s - loss: 0.3491 - accuracy: 0.8492 - val_loss: 0.4087 - val_accuracy: 0.8183 - 6s/epoch - 5ms/step
## Epoch 4/10
## 1167/1167 - 6s - loss: 0.3358 - accuracy: 0.8541 - val_loss: 0.3702 - val_accuracy: 0.8380 - 6s/epoch - 5ms/step
## Epoch 5/10
## 1167/1167 - 6s - loss: 0.3231 - accuracy: 0.8575 - val_loss: 0.3742 - val_accuracy: 0.8375 - 6s/epoch - 5ms/step
## Epoch 6/10
## 1167/1167 - 6s - loss: 0.3148 - accuracy: 0.8605 - val_loss: 0.3810 - val_accuracy: 0.8357 - 6s/epoch - 5ms/step
## <tf_keras.src.callbacks.History object at 0x38e391630>
#evaluate
loss, acc = model.evaluate(X_val, y_val, verbose=0)
print(f'Validation accuracy: {acc:.3f}')
## Validation accuracy: 0.838
#predict on the validation set
y_pred = (model.predict(X_val) >= 0.5).astype(int).ravel()
## 1167/1167 [==============================] - 0s 389us/step
#print a report
print(classification_report(y_val, y_pred, target_names=['non‑hate','hate']))
## precision recall f1-score support
##
## non‑hate 0.86 0.81 0.83 18671
## hate 0.82 0.87 0.84 18670
##
## accuracy 0.84 37341
## macro avg 0.84 0.84 0.84 37341
## weighted avg 0.84 0.84 0.84 37341