This Dataset

The dataset created has 55 observations (one for each city) of seven variables. The variables tracked are:

C_ID– City identification. This is a standard identifier variable to have in datasets. Each city is assigned a number in order of its appearance in the book.

name– Name of the city.

page– the page on which the city appears in the book. For cities that span multiple pages, this is the page on which the city starts.

chapter– which chapter the city is found in.

category– the category that the city has been assigned to. The categories are as follows: continuous, dead, desire, eyes, hidden, memory, names, signs, sky, thin, and trading.

numincat– the city's position within its category. For example, in the above output we can see that Dorothea is 1 because it is the first city in the category of desire, Tamara is also 1 because it is the first city in the category of signs, and Zaira is 3 because it is the third city in the category of memory.
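As a minimal sketch of how a column like numincat could be derived rather than typed in, the base R function ave() can number the rows within each category. The six rows below are an illustrative subset (the first six cities in order of appearance) entered by hand, not the author's dataset, and the data frame name `cities` is an assumption.

```r
# Illustrative subset: the first six cities in order of appearance,
# typed in by hand (not read from the author's dataset).
cities <- data.frame(
  name     = c("Diomira", "Isidora", "Dorothea", "Zaira", "Anastasia", "Tamara"),
  category = c("memory", "memory", "desire", "memory", "desire", "signs"),
  stringsAsFactors = FALSE
)

# Within each category, number the cities 1, 2, 3, ... in row order.
cities$numincat <- ave(seq_len(nrow(cities)), cities$category, FUN = seq_along)

print(cities)
```

With these rows, Dorothea and Tamara both get 1 and Zaira gets 3, matching the description above. This only works if the rows are already sorted by order of appearance (that is, by C_ID).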

length– how long (in pages) each city is.

A Note About The Plots

The plots used here are beeswarm plots. A beeswarm plot is a type of scatterplot in which points are nudged apart so that they do not overlap, which improves readability and often makes the points resemble a swarm of bees.
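For readers who want to try a beeswarm plot themselves, here is a minimal sketch using ggplot2 with the ggbeeswarm package. The text does not say which package produced the plots, so ggbeeswarm is an assumption, as are the hand-typed six-city subset and the `cex` spacing value.

```r
library(ggplot2)
library(ggbeeswarm)  # assumed package; the text does not name one

# Illustrative subset: the first six cities, all from chapter 1,
# typed in by hand (not read from the author's dataset).
cities <- data.frame(
  name     = c("Diomira", "Isidora", "Dorothea", "Zaira", "Anastasia", "Tamara"),
  chapter  = 1L,
  category = c("memory", "memory", "desire", "memory", "desire", "signs"),
  numincat = c(1L, 2L, 1L, 3L, 2L, 1L)
)

# geom_beeswarm() draws points like geom_point(), but shifts points that
# would otherwise collide sideways so that each one stays visible.
p <- ggplot(cities, aes(x = factor(chapter), y = numincat, colour = category)) +
  geom_beeswarm(cex = 3) +
  labs(x = "Chapter", y = "Number in category", colour = "Category")

print(p)
```

Three of these cities share a number in category of 1, so an ordinary scatterplot would draw them on top of one another. The beeswarm layout is what separates them.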

The Plots: Visualising Calvino’s Invisible Cities with R

The first plot shows the chapter number on the x-axis and the number in category on the y-axis. Each dot represents one city and is colour-coded according to category. Assuming all goes well with knitting the code into the output document, the reader should be able to hover over each dot with the cursor and see the name of the city it represents, as well as some other information.

The second plot shows category on the x-axis, and page number on the y-axis. The categories are again colour-coded, and shown from left to right in order of first appearance. Again, each dot represents one city, and hovering over each dot should provide further information about the data point.
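Ordering categories by first appearance, rather than alphabetically, is usually a matter of setting factor levels. The sketch below shows one way to do it in base R. The vector is an illustrative hand-typed subset (the categories of the first six cities), and the name `category_f` is an assumption.

```r
# Categories of the first six cities, in order of appearance (illustrative).
category <- c("memory", "memory", "desire", "memory", "desire", "signs")

# Default behaviour: factor levels are sorted alphabetically.
levels(factor(category))

# Levels in order of first appearance instead; plotting functions lay out
# a discrete axis in level order, so this controls left-to-right placement.
category_f <- factor(category, levels = unique(category))
levels(category_f)
```

The first call puts desire before memory, while the second keeps memory first because it appears first in the book.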

That being said, one limitation (or feature, depending on viewpoint) of this plot is that, because its shape is determined by the page each city starts on, it is heavily affected by the breaks between chapters and by the tendency of cities to get longer in some later chapters. The breaks between chapters, where there are pages of Marco Polo and Kublai Khan conversing, can be seen in the periodic horizontal gaps between points at places such as pages 20 to 30 or 130 to 140. The tendency for cities to be longer in certain later chapters causes the lack of symmetry in places such as the bottom left and top right corners. Because chapter 1’s cities are significantly shorter on average than chapter 9’s, the bottom four (memory-category) points form a more equilateral triangle, while the top four (hidden-category) points form a longer, more isosceles one. If desired, the irregularities caused by using page number on the y-axis can be corrected by using chapter number instead.

The third plot shows category on the x-axis and chapter number on the y-axis. The points are still colour-coded by category, the categories are still in order of first appearance, and each dot still represents a single city.

Note not only the cascade of the data across categories and chapters, but also that this plot has perfect twofold rotational symmetry around the city of Baucis. That is to say, if this plot were rotated 180° around its centre point (which falls exactly on the data point for Baucis), the overall shape would look exactly the same.
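The symmetry claim can also be checked numerically. The sketch below rebuilds the (category, chapter) pairs from an assumed layout and tests whether rotating every point 180° about the centre gives back the same set of points. The layout assumes ten cities in chapters 1 and 9, five in each of chapters 2 to 8, and categories numbered 1 to 11 in order of first appearance. It is reconstructed from the book's structure, not read from the author's dataset, and the function name `is_rotation_symmetric` is hypothetical.

```r
# Assumed layout (see above): category index 1-11 by first appearance.
first  <- data.frame(cat_idx = rep(1:4, times = 4:1), chapter = 1L)
middle <- do.call(rbind, lapply(2:8, function(m) {
  data.frame(cat_idx = (m - 1):(m + 3), chapter = m)
}))
last   <- data.frame(cat_idx = rep(8:11, times = 1:4), chapter = 9L)
pts    <- rbind(first, middle, last)

# A 180-degree rotation about (cx, cy) maps (x, y) to (2*cx - x, 2*cy - y).
# The plot is symmetric if the rotated points, duplicates included,
# are the same collection as the original points.
is_rotation_symmetric <- function(x, y, cx, cy) {
  before <- sort(paste(x, y))
  after  <- sort(paste(2 * cx - x, 2 * cy - y))
  identical(before, after)
}

cx <- mean(range(pts$cat_idx))  # centre of the category axis
cy <- mean(range(pts$chapter))  # centre of the chapter axis

nrow(pts)
is_rotation_symmetric(pts$cat_idx, pts$chapter, cx, cy)
```

Under these assumptions the centre lands on category 6, chapter 5, and exactly one city sits there. The check covers only the underlying coordinates. Whether the drawn plot looks symmetric also depends on the beeswarm layout spreading tied points evenly on both sides.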