Each day, around 5,000 to 6,000 tweets mention IT departments (or sometimes CIOs).
Twitter is a particularly interesting social network because of its audience (decision makers, businesses).
As we are going to see, the influence of these tweets can be very high: some tweets about IT departments are retweeted up to 120,000 times!
What if we could automatically analyze all of these tweets from around the world and interpret them?
That is our approach here: each tweet about a specific IT department, or about IT departments in general, is semantically compared to:
- Cost / expense / budget semantic field
- Value / service / quality semantic field
To build the semantic model, we trained a Word2Vec model; see the last part of this study for more information.
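As an illustration of what such a comparison could look like, here is a minimal sketch. It assumes that a tweet is scored by averaging its word vectors and taking the cosine similarity with the averaged vectors of each semantic field; the study does not specify the exact formula, so this is only one plausible reading. The vectors in `TOY_VECTORS` are invented for the example and stand in for a trained Word2Vec model, and the names `mean_vector`, `cosine` and `field_score` are hypothetical.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A trained Word2Vec model would supply the real vectors.
TOY_VECTORS = {
    "cost": [0.90, 0.10, 0.00],
    "expense": [0.85, 0.15, 0.05],
    "budget": [0.80, 0.20, 0.10],
    "fees": [0.85, 0.05, 0.10],
    "value": [0.10, 0.90, 0.10],
    "service": [0.15, 0.85, 0.10],
    "quality": [0.20, 0.80, 0.00],
    "innovation": [0.10, 0.85, 0.20],
}

# Seed words for the two semantic fields named in the text.
COST_FIELD = ["cost", "expense", "budget"]
VALUE_FIELD = ["value", "service", "quality"]


def mean_vector(words, vectors=TOY_VECTORS):
    """Average the vectors of the known words; return None if none is known."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def field_score(tweet_words, field_words, vectors=TOY_VECTORS):
    """Similarity of a tokenized tweet to a semantic field (0.0 if unknown)."""
    tweet_vec = mean_vector(tweet_words, vectors)
    field_vec = mean_vector(field_words, vectors)
    if tweet_vec is None or field_vec is None:
        return 0.0
    return cosine(tweet_vec, field_vec)
```

With these toy vectors, `field_score(["innovation"], VALUE_FIELD)` is larger than `field_score(["innovation"], COST_FIELD)`, which is the behaviour the VALUE and COST scores are meant to capture.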
The following tweets are highly similar to the ‘VALUE’ semantic field (VALUE score / COST score > 2):
‘Learn how a data-driven culture enables innovation and empowerment. https://t.co/VU3Ggk2lkN’
‘The CIO Series: The journey to sophisticated artificial intelligence https://t.co/MpXruZtLG1’
The following tweets are highly similar to the ‘COST’ semantic field (VALUE score / COST score < 0.5):
‘12 “best practices” IT should avoid at all costs | CIO @ITCatalysts https://t.co/0JQrl407vg’
‘@TMFOtter you’re right! Our CIO criticized the fees & tax inefficiency baked into the strategy. Indexing removes both issues.’
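The two thresholds used above (ratio above 2, ratio below 0.5) can be written as a small decision rule. The sketch below assumes both scores are positive similarity values; the `NEUTRAL` label for tweets in between and the handling of a zero COST score are assumptions added for the example, not part of the study.

```python
def classify(value_score, cost_score, high=2.0, low=0.5):
    """Label a tweet from its VALUE score / COST score ratio."""
    if cost_score <= 0:
        # Assumption: the ratio is undefined here, so fall back on the
        # VALUE score alone.
        return "VALUE" if value_score > 0 else "NEUTRAL"
    ratio = value_score / cost_score
    if ratio > high:
        return "VALUE"
    if ratio < low:
        return "COST"
    # Hypothetical label for tweets between the two thresholds.
    return "NEUTRAL"
```

For example, `classify(0.8, 0.3)` returns `"VALUE"` (ratio of about 2.67) and `classify(0.2, 0.6)` returns `"COST"` (ratio of about 0.33).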
1/ Extract tweets using the Twitter API
2/ Clean the tweets and remove "stop words" using a specific dictionary => The objective here is to keep only the words that carry a good level of semantic information
3/ The key part is to build a semantic similarity model over a huge corpus of English text
4/ Then, with this model, we calculate the similarity of each tweet to both semantic fields:
- Cost / expense / budget semantic field
- Value / service / quality semantic field
5/ Finally, we geocode the tweet locations using the Google API and plot them on the map
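Step 2 of the list above can be sketched as follows. This is only an illustration: the stop-word list is a deliberately tiny, hypothetical one (the study relies on its own dictionary), and stripping URLs and @mentions before tokenizing is an assumption about what "cleaning" involves.

```python
import re

# Hypothetical, deliberately tiny stop-word list; the study uses its own dictionary.
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "how", "our", "into", "at", "all"}

URL_RE = re.compile(r"https?://\S+")   # links such as https://t.co/...
MENTION_RE = re.compile(r"@\w+")       # user mentions such as @ITCatalysts
TOKEN_RE = re.compile(r"[a-z]+")       # lowercase alphabetic tokens


def clean_tweet(text, stop_words=STOP_WORDS):
    """Return the lowercase words of a tweet, without URLs, mentions or stop words."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    tokens = TOKEN_RE.findall(text.lower())
    return [t for t in tokens if t not in stop_words]


words = clean_tweet(
    "Learn how a data-driven culture enables innovation and empowerment. "
    "https://t.co/VU3Ggk2lkN"
)
print(words)
# ['learn', 'data', 'driven', 'culture', 'enables', 'innovation', 'empowerment']
```

The resulting word list is what would then be passed to the similarity model in step 4.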