Source file ⇒ lec30.Rmd
Make a tree diagram (starting at the root) for one of the <record> nodes of the apricot XML.
First we will see how to create an XML document from within R. Later we will see how to read and process XML from within R.
It will be helpful to have the structure of the XML document (i.e., the XML tree) in mind before you do anything in R, especially when you’re creating a new XML document.
You will need the XML package and the functions newXMLDoc, newXMLNode, and saveXML.
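The (simplified) syntax for newXMLNode is:
newXMLNode(name, ..., attrs = NULL, doc = NULL, parent = NULL)
where
name is the name of the node
... is the content of the node (for example, a character string placed right after name)
attrs is a named character vector of attributes
doc is the XML document object that the node belongs to (used when creating the root node)
parent is the parent node object (the R variable holding it, not its name as a string)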
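As a minimal sketch of how these arguments fit together, the snippet below builds a tiny document in which one node carries attributes. The node names (fruits, record, name) and the attributes (id, unit) are made up for illustration; they are not taken from the apricot XML file.

```r
library(XML)

# Hypothetical node and attribute names, chosen only for illustration
doc  <- newXMLDoc()
root <- newXMLNode("fruits", doc = doc)               # root node, attached to the document
rec  <- newXMLNode("record",
                   attrs = c(id = "1", unit = "kg"),  # named character vector of attributes
                   parent = root)                     # parent is a node object, not a string
newXMLNode("name", "apricot", parent = rec)           # unnamed second argument becomes text content

xmlAttrs(rec)       # the attributes stored on <record>
cat(saveXML(doc))   # print the serialized document to the console
```

Note that parent receives the R variable rec or root, so any node you want to hang children from has to be assigned first.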
Here are the R commands to create a small XML document with nested toplevel, level1, and level2 nodes:
library(XML)
doc <- newXMLDoc()
root <- newXMLNode("toplevel", doc = doc)
child1 <- newXMLNode("level1", parent = root)
newXMLNode("level2", "This is the content", parent = child1)
saveXML(doc, file = "/Users/Adam/Desktop/simple.xml")
Note: I only need to store (assign to a variable) the nodes that I later refer to as parents. For the leaf nodes, I just call newXMLNode without assigning the result. The R variable names (e.g., root, child1) are not part of the resulting XML file.
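One way to check the result without opening the saved file is to call saveXML with no file argument, in which case it returns the document as a character string. The sketch below repeats the commands above and prints that string; the variable name xml_text is just a label chosen here.

```r
library(XML)

# Rebuild the same small document as above
doc    <- newXMLDoc()
root   <- newXMLNode("toplevel", doc = doc)
child1 <- newXMLNode("level1", parent = root)
newXMLNode("level2", "This is the content", parent = child1)

# With no file argument, saveXML() returns the XML as a string
xml_text <- saveXML(doc)
cat(xml_text)

# The R variable names (root, child1) should not appear in the XML
grepl("child1", xml_text, fixed = TRUE)
```

The printed text should contain the tags toplevel, level1, and level2 nested in that order, and the final grepl call is a quick check that only node names, not R variable names, end up in the file.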
Run the above R code in RStudio and confirm that you get the desired XML. Next, add a sibling element (level1a), with different content, to the above XML.
KML (Keyhole Markup Language) is a type of XML file designed specifically for geographic information and visualization (e.g., for use in Google Earth, Google Maps, etc.). XML is the broader category of markup that includes KML.
KML, like HTML, has predefined tags (for example, <Document> or <Placemark>). Here is the definitive introduction to KML: https://developers.google.com/kml/documentation/kml_tut#for-more-information To find out about KML tags, refer to the KML reference link on that page.
Here is a simple example of a KML file.
<?xml version="1.0"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document>
<Placemark>
<name>New York City</name>
<description>New York City</description>
<Point>
<coordinates>-74.006393,40.714172,0</coordinates>
</Point>
</Placemark>
</Document>
</kml>
Some special features of KML:
The root node is called <kml> and carries an XML namespace declaration (the xmlns attribute), <kml xmlns="http://www.opengis.net/kml/2.2">, to indicate that it is a KML file.
The child of the <kml> node is called <Document> and its child is called <Placemark>.
A Placemark is one of the most commonly used features in Google Earth. It marks a position on the Earth’s surface, using a yellow pushpin as the icon. The simplest Placemark includes only a
A Placemark has children including <Point>, <TimeStamp>, <description>, etc. For example if you want to put a
Steps:
Diagram the tree structure of this document.
Load the XML package.
library(XML)
Create the document.
doc <- newXMLDoc()
Using newXMLNode, create the root node.
root <- newXMLNode("kml", namespaceDefinitions = "http://www.opengis.net/kml/2.2", doc = doc)
Use newXMLNode to create the Document node and its children.
docmt <- newXMLNode("Document", parent = root)
pm <- newXMLNode("Placemark", parent = docmt)
name <- newXMLNode("name", "New York City", parent = pm)
description <- newXMLNode("description", "New York City", parent = pm)
pt <- newXMLNode("Point", parent = pm)
newXMLNode("coordinates", "-74.006393,40.714172,0", parent = pt)
Save the document to a file.
saveXML(doc, file = "/Users/Adam/Desktop/simple.kml")
Put these together in a .Rmd file and open simple.kml in a text editor such as Sublime Text. Does it look correct?
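Besides inspecting the file by eye, you can check it from R by parsing it back in. The sketch below repeats the steps above but writes to a temporary directory (an assumption made here so that the code runs on any machine, instead of the Desktop path), then reads the file and pulls out the coordinates. Because the KML elements live in the default namespace, the XPath query needs a prefix; the prefix k is an arbitrary label chosen for this example.

```r
library(XML)

# Build the KML document as in the steps above
doc   <- newXMLDoc()
root  <- newXMLNode("kml",
                    namespaceDefinitions = "http://www.opengis.net/kml/2.2",
                    doc = doc)
docmt <- newXMLNode("Document", parent = root)
pm    <- newXMLNode("Placemark", parent = docmt)
newXMLNode("name", "New York City", parent = pm)
newXMLNode("description", "New York City", parent = pm)
pt    <- newXMLNode("Point", parent = pm)
newXMLNode("coordinates", "-74.006393,40.714172,0", parent = pt)

# Assumption: save to a temporary directory rather than the Desktop
kml_file <- file.path(tempdir(), "simple.kml")
saveXML(doc, file = kml_file)

# Parse the saved file and query it, mapping the prefix "k" to the KML namespace
parsed <- xmlParse(kml_file)
ns     <- c(k = "http://www.opengis.net/kml/2.2")
coords <- xpathSApply(parsed, "//k:coordinates", xmlValue, namespaces = ns)

# KML stores coordinates as longitude, latitude, altitude
lon_lat_alt <- as.numeric(strsplit(coords[[1]], ",")[[1]])
lon_lat_alt
```

If the parse step fails or coords comes back empty, the document is either not well-formed or is missing the namespace declaration on the root node.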