First, install R itself, the programming language. RStudio is only an interface built on top of the language, so R has to be installed before it. You can download the newest R version for your operating system under this link:
Official Page of the Comprehensive R Archive Network
After installing and opening R, you see a blank console that basically works like a calculator.
Example in R
Here you see some basic examples of what you can do and how the language works. As in many languages, you can use the basic arithmetic operators, and you can assign a value to an object (here it is x).
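A minimal sketch of this kind of input is shown below. The numbers and the object names other than x are illustrative assumptions, not the ones from the screenshot.

```r
# Arithmetic works directly at the prompt, like a calculator.
sum_result  <- 2 + 3      # addition
prod_result <- 4 * 5      # multiplication

# Assign a value to an object called x and reuse it in a calculation.
x <- 10
y <- x / 2 + 1            # uses the value currently stored in x

print(sum_result)
print(prod_result)
print(y)
```

Typing the name of an object on its own line, such as x, also prints its current value in the console.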
However, this bare interface is hard to keep organized. RStudio lets you manage a multitude of variables, functions and data at the same time, so we will install it and do our text analysis there.
RStudio can be downloaded here:
After installing RStudio you will see an interface that consists of four panes, which you can arrange the way you want. One pane holds only your code. The console is the same as the blank R interface we saw before and shows the output of your code. The environment pane lists all the datasets, variables, functions and other objects you have created (like x). The fourth pane shows plots and the help pages. You will also see a tab called “Packages”. RQDA, the program we will use for our text analysis, is a package. A package is basically someone else's code for doing something specific, so whenever you want to do something specific you will often use a package.
The same calculation we did in the blank R interface looks like this in RStudio:
RStudio Interface
Tip: If you are unsure about something in your code or a function, or just want to know more about a package, place the cursor on the respective object and press F1.
RQDA is a package, and all packages are installed in the same way. You need a working internet connection. Use the command install.packages() and put the name of the package in quotation marks. Then, whenever you want to use the package, load it with library() and the name of the package, this time without quotation marks.
install.packages("RQDA")
library(RQDA)
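Before installing, it can help to check whether a package is already present. The sketch below assumes a small helper called is_installed, which is a hypothetical name and not part of R or RQDA. It relies on base R's system.file(), which returns an empty string when a package cannot be found.

```r
# Hypothetical helper: TRUE if the package is found in the local library.
# system.file() only looks the package up; it does not load it.
is_installed <- function(pkg) {
  nzchar(system.file(package = pkg))
}

pkg <- "RQDA"
if (is_installed(pkg)) {
  message(pkg, " is already installed; load it with library(", pkg, ").")
} else {
  message(pkg, " is not installed yet; install it first.")
}
```

The same check works for any other package name, for example is_installed("RGtk2").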
After the first installation you will see the window below. It only pops up the first time; just click OK to install GTK+, which the application needs.
Install GTK+
Click OK here!
If there are problems installing RQDA, the cause may be the RGtk2 package on which RQDA is based. In that case you can install RGtk2 manually with the following command (the URL points to the Windows build for R 3.3):
install.packages("https://cran.r-project.org/bin/windows/contrib/3.3/RGtk2_2.20.31.zip", repos=NULL)
Packages can also be installed manually by clicking “Install” in the Packages window and selecting the package there.
Once all of this is installed and RQDA is loaded, a window pops up that looks like this:
RQDA window
This window is where we will work for our text analysis. All the analyzed texts, the codes and memos as well as other attributes that you can use are saved in a “project” which is a .rqda file. The .rqda file is basically a SQLite file that is like a database for all the information that we will collect.
So first you create a new project and save it in the folder where you want to keep it. Since we will all work on the same project, it will be saved in our Google Drive. After each researcher has added their part of the coding and analysis, the next researcher can open the project and continue working on it.
Small tip: Working with R can also help you organize and name your files in a more structured way. R often has problems reading file names that contain dots or spaces, so use an underscore (_) instead and try to choose short, fitting names.
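If you have many files to rename, the naming rule from the tip can be sketched in a few lines of base R. The helper clean_file_name and the example file name are hypothetical. The sketch replaces every run of characters that are not plain letters or digits with a single underscore and keeps the file extension.

```r
# Hypothetical helper: make a file name safe by replacing dots, spaces and
# other special characters in the base name with "_".
# Note: non-ASCII letters such as umlauts are replaced as well.
clean_file_name <- function(file_name) {
  ext  <- tools::file_ext(file_name)            # e.g. "txt"
  stem <- tools::file_path_sans_ext(file_name)  # name without the extension
  stem <- gsub("[^A-Za-z0-9]+", "_", stem)      # runs of other characters -> "_"
  stem <- gsub("^_+|_+$", "", stem)             # drop leading/trailing "_"
  if (nzchar(ext)) paste0(stem, ".", ext) else stem
}

print(clean_file_name("Interview Nr. 1 final.txt"))
```

The function handles one file name at a time; for a whole folder you could apply it to each name with sapply() before renaming the files.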
In this example the new project is called “Beispiel” which is the German word for example. Now you know a new German word, too.
After you create a new project, the R console immediately prints the path where the project is saved, which can look like this:
"C:\\Users\\ASUS\\Documents\\Uni\\M.A.-Studium_Uni_Heidelberg_2017\\Fallstudien_quali\\Diskursanalyse\\Beispielinterview\\Beispiel"
During your research you can also create so-called Memos. They let you take notes at different levels of your research, so if you have important information about a file or a project, a memo is a good place to leave it for yourself and other researchers.
Now we read in the texts that we will analyze. They need to be in .txt format, the most basic text format. It lets us work with the bare text and avoids problems that could arise if the text were later opened in another format.
The files I use in this example are Word files, so instead of keeping them as .docx I save them as .txt so that RQDA can read them. After saving the texts as .txt, you can read them into RQDA by clicking on Import and selecting the text from the folder you used.
However, sometimes you have a finished text that you want to put into the program directly, for example a newspaper article, a political speech or text from a homepage. In that case you do not need the detour via a .txt file. Just click on New (which is Neu in German, another word), paste the text into the window (2.) and click Save To Project.
Most importantly, the text needs to be coded, which is done with codes and code categories. The code categories are the larger constructs that you develop theoretically in your research. In our example, “Sexist Language” can be a code category with different subcategories such as “Sexist Slur” or “Devaluation”. The subcategories are what is actually coded, and the broader code category shows us in general which concept was used where.
After creating the codes that you need, you open the text file that you want to analyze (1.), select the code that you want to assign to a specific part of the text (2.), select the part of the text that you want to code (3.) and click on Mark (4.).
Now the marked part is blue and assigned to the code; you can change the colors of the codes under Settings. In the list of your codes you can also double-click on a code to see all the text fragments assigned to it, which gives you an overview of the kind of fragments you have chosen so far. If you coded a text fragment wrongly, just click on the fragment and choose the Unmark option.
The other options here are again Memo, where you can leave notes on a code or a text, and Rename, which changes the name of the code.
Based on your codes you can create Code Categories, each of which you assign a group of codes to. “Hinzufügen” is just “add” in German. Another word.
With these code categories you can then do what Goertz (2012) methodologically calls concept specification. It has a basic level, which is the general concept, a second level, which shows the dimensions of the concept, and an indicator/data level, which consists of the specific codes of the concept.
| Level | Corresponds to |
|---|---|
| Basic level | General concept (e.g. Sexism) |
| Second level | Code category |
| Indicator/data level | Code |
To do so, go to the list of Codes, select Add to Code Category... and pick the code category that the code belongs to.
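The three levels can also be written down as a small data structure, which may help when planning the code system before clicking it together in RQDA. The sketch below uses the example names from the text; the list layout and the object names concept and code_lookup are assumptions made for illustration and are not something RQDA itself uses.

```r
# Concept specification as a nested list:
# basic level > second level (code category) > indicator level (codes).
concept <- list(
  basic_level  = "Sexism",
  second_level = list(
    "Sexist Language" = c("Sexist Slur", "Devaluation")
  )
)

# Flatten into a lookup table with one row per code.
code_lookup <- do.call(rbind, lapply(names(concept$second_level), function(category) {
  data.frame(
    concept  = concept$basic_level,
    category = category,
    code     = concept$second_level[[category]],
    stringsAsFactors = FALSE
  )
}))

print(code_lookup)
```

Adding another code category only means adding another named entry to second_level.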
Besides the information we get from our coding, there is general information about our files that we want to save, for example the wave in which the report was made and the channel. To do that we create attributes that we give to the files. Let's say we create an attribute for the channel to which the report belongs. First you create the attribute by giving it a name, which here is “Channel”. Then you click on Class to define whether the information you give the files is numeric or character. Here we want to store the channel's name, so the class is character.
To assign an attribute value to a file, click on the attribute that you want to give to the file, then click on the file, click on Attributes and enter the Value for that file. Since the file chosen in this example belongs to the TV channel Al Sharqiya and the class is character, we just type in Al Sharqiya, leave the text field and click Save and Close.
Doing this for all files gives us an overview of which texts belong to which category. To see all assigned attributes, go to the Files section, right-click and choose View Attributes; you then get a list of all files and their attribute values. Note that the Wave attribute is numeric, so the value shown for each file is just the number of the wave it belongs to.
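To illustrate the difference between a character and a numeric attribute, here is a sketch of what such an attribute overview could look like as an R data frame. The file names, the second channel name and the wave numbers are made up for illustration; only Al Sharqiya comes from the example above.

```r
# Hypothetical attribute overview: one row per file.
file_attributes <- data.frame(
  filename = c("report_01", "report_02", "report_03"),
  Channel  = c("Al Sharqiya", "Al Sharqiya", "Channel B"),  # character class
  Wave     = c(1, 1, 2),                                    # numeric class
  stringsAsFactors = FALSE
)

# str() shows the class of each column.
str(file_attributes)

# Number of files per channel and wave.
files_per_group <- table(file_attributes$Channel, file_attributes$Wave)
print(files_per_group)
```

Because Wave is numeric you can also filter on it directly, for example file_attributes[file_attributes$Wave > 1, ].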
In the course of your work you will assign a lot of codes and build different code categories. To put them in context and get a bigger picture that is quantifiable, you can go back to RStudio and use the functions RQDA provides to summarize the large .rqda file you will create.
With getCodingTable() you get all your codings (the codes here are the archetypes that we use in our analysis) together with the following information on each:
| Variable | Meaning |
|---|---|
| cid | Code id |
| fid | File id |
| codename | Code name |
| filename | File name |
| CodingLength | Number of characters in the coded segment |
| index1 | Start position of the coded segment |
| index2 | End position of the coded segment |
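Since the result is an ordinary data frame, you can filter it like any other table. The sketch below uses a hand-made stand-in with the columns listed above, because the real call needs RQDA loaded and a project open. All values are hypothetical, and computing CodingLength as index2 minus index1 is an assumption made here for illustration.

```r
# Hypothetical stand-in for the result of getCodingTable().
# With RQDA loaded and a project open you would instead write:
#   codings <- getCodingTable()
codings <- data.frame(
  cid      = c(1, 1, 2),
  fid      = c(1, 2, 1),
  codename = c("Sexist Slur", "Sexist Slur", "Devaluation"),
  filename = c("report_01", "report_02", "report_01"),
  index1   = c(10, 40, 120),
  index2   = c(35, 90, 150),
  stringsAsFactors = FALSE
)
# Assumption: the length is the distance between start and end position.
codings$CodingLength <- codings$index2 - codings$index1

# All codings that belong to one code.
slur_codings <- codings[codings$codename == "Sexist Slur", ]
print(slur_codings)

# All codings in one file, ordered by their position in the text.
in_report_01 <- codings[codings$filename == "report_01", ]
in_report_01 <- in_report_01[order(in_report_01$index1), ]
print(in_report_01)
```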
With summaryCodings() you get, as the name says, a summary of all codings, which is useful because the table above becomes very long after a while. It gives you:
| Variable | Meaning |
|---|---|
| NumOfCoding | Number of codings for each code. |
| AvgLength | Average number of characters in codings for each code. |
| NumOfFile | Number of files coded for each code. |
| CodingOfFile | Number of codings for each file. Returns NULL if byFile is FALSE. |
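To make clear what these four quantities mean, the sketch below computes analogous numbers by hand with base R from a small hypothetical coding table. It is not how RQDA implements summaryCodings(); it only illustrates the definitions in the table, and all data values are made up.

```r
# Hypothetical codings: one row per coded text segment.
codings <- data.frame(
  codename     = c("Sexist Slur", "Sexist Slur", "Devaluation"),
  filename     = c("report_01", "report_02", "report_01"),
  CodingLength = c(25, 50, 30),
  stringsAsFactors = FALSE
)

# Number of codings for each code.
num_of_coding <- table(codings$codename)

# Average number of characters in the codings of each code.
avg_length <- tapply(codings$CodingLength, codings$codename, mean)

# Number of distinct files in which each code was used.
num_of_file <- tapply(codings$filename, codings$codename,
                      function(f) length(unique(f)))

# Number of codings in each file.
coding_of_file <- table(codings$filename)

print(num_of_coding)
print(avg_length)
print(num_of_file)
print(coding_of_file)
```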
All of this information can be stored in objects, so we can go on to calculate other interesting quantities such as correlations, for example to find out which channel used which category the most. With those quantities you can also plot relations, as in this example of a network plot of codes and categories:
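As a rough sketch of the "which channel used which code the most" question, the example below joins a hypothetical coding table with hypothetical file attributes and builds a cross-tabulation. This is a simple count table rather than a correlation, and every file name and value in it is made up for illustration.

```r
# Hypothetical codings: one row per coded text segment.
codings <- data.frame(
  filename = c("report_01", "report_01", "report_01", "report_02"),
  codename = c("Sexist Slur", "Sexist Slur", "Devaluation", "Devaluation"),
  stringsAsFactors = FALSE
)

# Hypothetical file attributes: the channel each file belongs to.
file_attributes <- data.frame(
  filename = c("report_01", "report_02"),
  Channel  = c("Al Sharqiya", "Channel B"),
  stringsAsFactors = FALSE
)

# Attach the channel to every coding, then count codes per channel.
merged <- merge(codings, file_attributes, by = "filename")
code_by_channel <- table(merged$Channel, merged$codename)
print(code_by_channel)

# Most frequent code for each channel (ties go to the first column).
top_code <- apply(code_by_channel, 1, function(counts) names(counts)[which.max(counts)])
print(top_code)
```

A table like code_by_channel is also a convenient starting point for plots, for example with barplot(code_by_channel).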
Picture from:
https://lucidmanager.org/qualitative-data-science/
Goertz, G. (2012). Social Science Concepts: A User’s Guide.
Huang, R. (2018). RQDA: R-based Qualitative Data Analysis. R package version 0.3-1. URL http://rqda.r-forge.r-project.org.
Mayring, P. (2004). Qualitative content analysis. A Companion to Qualitative Research, 1, 159-176.
Sartori, G. (1970). Concept misformation in comparative politics. American Political Science Review, 64(4), 1033-1053.