Source file ⇒ lec30.Rmd

Today

  1. Tree structure of an XML document
  2. Writing an XML document
  3. Make a KML document for geographic data (Google Earth)

1. Tree structure of an XML document

Task for you

Make a tree diagram (starting at the root) for one of the <record> nodes of the apricot XML.

2. Creating an XML document from within R

First we will see how to create an XML document from within R. Later we will see how to read and process XML from within R.

It helps to have the structure of the XML document (i.e., the XML tree) in mind before you do anything in R, especially when you’re creating a new XML document.

You will need the XML package and the functions newXMLDoc, newXMLNode, and saveXML.

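The (simplified) syntax for newXMLNode is:

newXMLNode(name, attrs = NULL, doc = NULL, parent = NULL)

where

name is the name of the node (a string)
attrs is a named character vector of attributes for the node
doc is the XML document object (created with newXMLDoc) that the node belongs to; supply it when creating the root node
parent is the parent node object (a variable holding a node created earlier, not its name as a string)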
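
The simple example in these notes does not use attrs, so here is a minimal sketch of that argument. The node name "record" and the attribute values are hypothetical, chosen only for illustration; xmlAttrs is the XML package function for reading a node's attributes back.

```r
library(XML)

# Hypothetical node and attribute values, for illustration only.
# attrs takes a named character vector: names become attribute names.
node <- newXMLNode("record", attrs = c(id = "1", source = "demo"))

# Inspect the attributes that were set on the node.
xmlAttrs(node)

# Serialize just this node to a string to see the attributes in the tag.
node_text <- saveXML(node)
cat(node_text, "\n")
```

Printing the node should show both attributes inside the opening <record> tag.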

Simple example:

Here are the R commands to create a small document with a toplevel root node, a level1 child, and a level2 leaf that holds some text:

library(XML)

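doc <- newXMLDoc()                              # empty XML document
root <- newXMLNode("toplevel", doc = doc)       # root node, attached to the document
child1 <- newXMLNode("level1", parent = root)   # child of the root
newXMLNode("level2", "This is the content", parent = child1)   # leaf node with text
saveXML(doc, file = "/Users/Adam/Desktop/simple.xml")          # write the document to disk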

Note: I only need to assign a node to a variable if I later refer to it as a parent. For the leaf nodes, I just call newXMLNode without assigning. The R variable names (e.g., root, child1) are not part of the resulting XML file.
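
One way to check the result without opening the file is to call saveXML with no file argument, in which case it returns the XML as a character string. The sketch below rebuilds the same small document so that it runs on its own; the variable name xml_text is an assumption introduced here.

```r
library(XML)

# Rebuild the simple document from the example above.
doc <- newXMLDoc()
root <- newXMLNode("toplevel", doc = doc)
child1 <- newXMLNode("level1", parent = root)
newXMLNode("level2", "This is the content", parent = child1)

# With no file argument, saveXML returns the document as a string.
xml_text <- saveXML(doc)
cat(xml_text)

# The R variable names do not appear in the output; only the node names do.
xmlName(xmlRoot(doc))
```

The printed text should show level2 nested inside level1, which is nested inside toplevel.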

Task for you

Run the above R code in RStudio and confirm that you get the desired XML. Next, add a sibling element (level1a), with different content, to the above XML.

3. Keyhole Markup Language (KML)

KML is an XML format designed for geographic information and visualization (e.g., for use in Google Earth or Google Maps). XML is the broader category of markup that includes KML.

KML, like HTML, has predefined tags (for example, <Document> or <Placemark>). Here is the definitive introduction to KML: https://developers.google.com/kml/documentation/kml_tut#for-more-information To find out about KML tags, refer to the KML reference link on that page.

Here is a simple example of a KML file.

<?xml version="1.0"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
 <Document>
  <Placemark>
   <name>New York City</name>
   <description>New York City</description>
   <Point>
     <coordinates>-74.006393,40.714172,0</coordinates>
   </Point>
  </Placemark>
 </Document>
</kml>

Task for you:

  1. Copy the above KML file into Sublime and save it as simple.kml.
  2. Download Google Earth: https://www.google.com/earth/download/ge/agree.html
  3. Open simple.kml with Google Earth.

Some special features of KML:

The root node is called <kml> and has a special attribute, xmlns (short for XML namespace), as in <kml xmlns="http://www.opengis.net/kml/2.2">. The namespace indicates that the elements in the file belong to the KML vocabulary.

The child of the <kml> node is called <Document> and its child is called <Placemark>.

A Placemark is one of the most commonly used features in Google Earth. It marks a position on the Earth’s surface, using a yellow pushpin as the icon. The simplest Placemark includes only a <Point> element, which specifies the location of the Placemark. You can specify a name and a custom icon for the Placemark, and you can also add other geometry elements to it.

Placemark has children including <Point>, <TimeStamp>, <description>, etc. (tag names are case-sensitive). For example, if you want to put a <TimeStamp> on each Placemark, you need to follow the syntax described here: https://developers.google.com/kml/documentation/kmlreference#timestamp
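
As a minimal sketch of what that looks like from R: in KML, a <TimeStamp> element holds a <when> child containing a date-time string. The date used below is hypothetical, and the Placemark is built as a standalone node (no document) just to keep the example short.

```r
library(XML)

# A standalone Placemark node, for illustration only.
pm <- newXMLNode("Placemark")
newXMLNode("name", "New York City", parent = pm)

# <TimeStamp> wraps a <when> element; the date-time here is a made-up value.
ts <- newXMLNode("TimeStamp", parent = pm)
newXMLNode("when", "2015-04-01T12:00:00Z", parent = ts)

pt <- newXMLNode("Point", parent = pm)
newXMLNode("coordinates", "-74.006393,40.714172,0", parent = pt)

# Serialize the node to a string and print it.
pm_text <- saveXML(pm)
cat(pm_text, "\n")
```

Note that ts is assigned to a variable because it is later used as a parent, while the <when> leaf is not.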

Making a KML file

Steps:

  1. Diagram the tree structure of this document.

  2. Load the XML package: library(XML)

  3. Create the document: doc <- newXMLDoc()

  4. Using newXMLNode, create the root node.

root <- newXMLNode("kml",namespaceDefinitions = "http://www.opengis.net/kml/2.2", doc = doc)

  5. Use newXMLNode to create the Document node and its descendants.

docmt <- newXMLNode("Document", parent = root)
pm <- newXMLNode("Placemark", parent = docmt)
name <- newXMLNode("name", "New York City", parent = pm)
description <- newXMLNode("description", "New York City", parent = pm)
pt <- newXMLNode("Point", parent = pm)
newXMLNode("coordinates", "-74.006393,40.714172,0", parent = pt)

  6. Save the KML document.

saveXML(doc, file = "/Users/Adam/Desktop/simple.kml")
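
The same steps extend naturally to several locations by looping over a table of coordinates. What follows is only a sketch under stated assumptions: the places data frame is hypothetical (its second row is an illustrative addition, not from the lecture), and the file is written to a temporary location rather than the Desktop path used above.

```r
library(XML)

# Hypothetical data: one row per Placemark. The second row is illustrative only.
places <- data.frame(
  name = c("New York City", "Los Angeles"),
  lon  = c(-74.006393, -118.2437),
  lat  = c(40.714172, 34.0522),
  stringsAsFactors = FALSE
)

doc   <- newXMLDoc()
root  <- newXMLNode("kml", namespaceDefinitions = "http://www.opengis.net/kml/2.2", doc = doc)
docmt <- newXMLNode("Document", parent = root)

for (i in seq_len(nrow(places))) {
  pm <- newXMLNode("Placemark", parent = docmt)
  newXMLNode("name", places$name[i], parent = pm)
  pt <- newXMLNode("Point", parent = pm)
  # KML coordinates are written as longitude,latitude,altitude.
  coords <- paste(places$lon[i], places$lat[i], 0, sep = ",")
  newXMLNode("coordinates", coords, parent = pt)
}

# Write to a temporary file (an assumption; use any path you like).
out_file <- tempfile(fileext = ".kml")
saveXML(doc, file = out_file)
kml_text <- saveXML(doc)
```

Only pm and pt are stored inside the loop, since they are the nodes that later serve as parents.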

Task for you

Put these steps together in a .Rmd file, run it, and open simple.kml in Sublime. Does it look correct?