Reading and writing data in XML format

XML (Extensible Markup Language) is a common, human-readable text format used for many purposes. Like JSON, it can serve as a file format for your apps, and web services often return data in it.

You can create, open, modify and manage XML using XMLDocument and related classes. The XML document format uses tags similar to what you may see in HTML. These tags create XML nodes that contain your data. The tags are case-sensitive.

Here is example XML that describes three teams in a fictional baseball league:

<?xml version="1.0" encoding="UTF-8"?>
<League>
    <Team name="Seagulls">
            <Player name="Bob" position="1B" />
            <Player name="Tom" position="2B" />
    </Team>
    <Team name="Pigeons">
            <Player name="Bill" position="1B" />
            <Player name="Tim" position="2B" />
    </Team>
    <Team name="Crows">
            <Player name="Ben" position="1B" />
            <Player name="Ty" position="2B" />
    </Team>
</League>

A tag is anything between angle brackets, such as <Team>. A tag may contain attributes; name is an attribute of the Team tag. So in this snippet of XML:

<Team name="Seagulls">

Team is a tag and name is an attribute.

XML structure is strict. Every tag must have a closing tag, which is the tag name prefixed with a slash. The closing tag for <Team> is </Team>. A tag with no content can instead be written as a single self-closing tag, like this:

<Player name="Bob" position="1B" />

Because this tag ends in “/>” it is considered to close itself.
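
These rules are checked when XML is parsed. As a minimal sketch, assuming LoadXML accepts a String of XML (as it is used later in this topic) and raises an XMLException for malformed input, the following tries to load a snippet whose closing tag differs from its opening tag only in case. The wellFormed variable is a hypothetical name used for illustration.

```xojo
' Sketch: </team> does not match <Team> because tag names are case-sensitive
Var xml As New XMLDocument
Var wellFormed As Boolean = True

Try
  xml.LoadXML("<Team name=""Seagulls""></team>")
Catch e As XMLException
  ' The parser is expected to reject the mismatched tags
  wellFormed = False
End Try
```

If the assumption holds, wellFormed ends up False for this input, and stays True once the closing tag is corrected to </Team>.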

Tag names can be anything you want. Because this is your file format, you define the tags and other specifics of its format.

Several classes help you read, process and create XML files, including XMLDocument, XMLNode, XMLElement and XMLNodeList. XMLDocument is the primary class: you use it to create, load and modify XML documents.

To create an XML document, you create a new instance of XMLDocument and then add nodes for the tags you have defined.

Creating an XML document

To create the XML shown at the beginning of this topic, you first create your XMLDocument instance:

Var xml As New XMLDocument

Now you can add the topmost node (called the root) to the XML document:

Var root As XMLNode
root = xml.AppendChild(xml.CreateElement("League"))

And now you can add the first team to the root:

Var team As XMLNode
team = root.AppendChild(xml.CreateElement("Team"))

The team has an attribute containing its name:

team.SetAttribute("name", "Seagulls")

Now you can add the players to the team:

Var player As XMLNode
player = team.AppendChild(xml.CreateElement("Player"))
player.SetAttribute("name", "Bob")
player.SetAttribute("position", "1B")

player = team.AppendChild(xml.CreateElement("Player"))
player.SetAttribute("name", "Tom")
player.SetAttribute("position", "2B")

Now you are done with the first team and its players. Repeat the code to do the next two teams:

team = root.AppendChild(xml.CreateElement("Team"))
team.SetAttribute("name", "Pigeons")

player = team.AppendChild(xml.CreateElement("Player"))
player.SetAttribute("name", "Bill")
player.SetAttribute("position", "1B")

player = team.AppendChild(xml.CreateElement("Player"))
player.SetAttribute("name", "Tim")
player.SetAttribute("position", "2B")

team = root.AppendChild(xml.CreateElement("Team"))
team.SetAttribute("name", "Crows")

player = team.AppendChild(xml.CreateElement("Player"))
player.SetAttribute("name", "Ben")
player.SetAttribute("position", "1B")

player = team.AppendChild(xml.CreateElement("Player"))
player.SetAttribute("name", "Ty")
player.SetAttribute("position", "2B")
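
Because the three teams follow the same pattern, the repeated code can also be written as a loop. The following is a sketch of one possible alternative to the step-by-step code above, not the only way to do it. The arrays teamNames, firstBase and secondBase are hypothetical names introduced for illustration; each holds one entry per team.

```xojo
' Sketch: build the same League document from arrays instead of repeated code
Var xml As New XMLDocument
Var root As XMLNode = xml.AppendChild(xml.CreateElement("League"))

' Hypothetical parallel arrays: index i describes one team
Var teamNames() As String = Array("Seagulls", "Pigeons", "Crows")
Var firstBase() As String = Array("Bob", "Bill", "Ben")
Var secondBase() As String = Array("Tom", "Tim", "Ty")

For i As Integer = 0 To teamNames.LastIndex
  ' One Team node per entry, with its name attribute
  Var team As XMLNode = root.AppendChild(xml.CreateElement("Team"))
  team.SetAttribute("name", teamNames(i))

  ' First baseman for this team
  Var player As XMLNode = team.AppendChild(xml.CreateElement("Player"))
  player.SetAttribute("name", firstBase(i))
  player.SetAttribute("position", "1B")

  ' Second baseman for this team
  player = team.AppendChild(xml.CreateElement("Player"))
  player.SetAttribute("name", secondBase(i))
  player.SetAttribute("position", "2B")
Next
```

This version declares its own xml and root variables, so it is meant to replace the step-by-step code rather than run alongside it.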

Lastly, add the code to save the file:

Var f As FolderItem
f = FolderItem.ShowSaveFileDialog("", "league.xml")
If f <> Nil Then
  xml.SaveXml(f)
End If

This prompts you to save the file (with a default name of league.xml). After the file is saved, you can open it using a text editor or a web browser (Firefox displays XML nicely).

Loading an XML document

To load this XML file and display its contents in a Text Area, first prompt the user to select the file:

Var f As FolderItem
f = FolderItem.ShowOpenFileDialog("")
If f = Nil Then Return

Now you have a valid file, so try to open it as an XML file:

Var xml As New XMLDocument
Try
  xml.LoadXML(f)
Catch e As XMLException
  MessageBox("XML Error.")
  Return
End Try

If the selected file does not contain valid XML, LoadXML raises an XMLException. The exception is caught and an error message is displayed.

Now you have a valid XML file which you can process using the methods and properties of the XMLDocument and XMLNode classes.

The following code has three parts. Part one verifies that the XML file has a root node called “League”. Part two is a loop that processes each team. Part three is an inner loop that processes each player on each team. The XML data is output to a Text Area named XMLArea.

Var root As XMLNode
root = xml.FirstChild

If root.Name = "League" Then
  Var team, player As XMLNode
  team = root.FirstChild

  While team <> Nil
    XMLArea.AddText(team.GetAttribute("name") + EndOfLine)
    player = team.FirstChild
    While player <> Nil
      XMLArea.AddText("--->" + player.GetAttribute("name") + " " + player.GetAttribute("position") + EndOfLine)
      player = player.NextSibling
    Wend
    team = team.NextSibling
  Wend
Else
  MessageBox("Not a League XML file.")
End If
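
The loops above treat every child of the root as a team and every child of a team as a player. As a hedged variation, the sketch below checks each node's Name before using it, so unexpected tags are skipped. The <Note /> element and the lines array are hypothetical and exist only to illustrate the idea; the XML is loaded from a String so the example stands on its own.

```xojo
' Sketch: same traversal, but ignore any child that is not a Team or Player
Var xml As New XMLDocument
xml.LoadXML("<League><Note /><Team name=""Crows""><Player name=""Ben"" position=""1B"" /></Team></League>")

Var lines() As String
Var root As XMLNode = xml.FirstChild

If root.Name = "League" Then
  Var team As XMLNode = root.FirstChild
  While team <> Nil
    If team.Name = "Team" Then
      lines.Add(team.GetAttribute("name"))

      Var player As XMLNode = team.FirstChild
      While player <> Nil
        If player.Name = "Player" Then
          lines.Add("--->" + player.GetAttribute("name") + " " + player.GetAttribute("position"))
        End If
        player = player.NextSibling
      Wend
    End If
    team = team.NextSibling
  Wend
End If
```

Collecting the output in an array rather than writing to the Text Area directly is only a choice made to keep the sketch independent of any user interface.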

Processing large XML files

When you use XMLDocument to load XML files, the entire XML gets loaded into memory at once. This can become a problem for extremely large XML files. In these situations, you can use the XMLReader class to process the XML file in smaller pieces.

XMLReader has a large set of event handlers that are called as parts of the XML file are loaded. You can use these event handlers to look at the data and process it yourself, perhaps only saving the parts you need.

Refer to the Reference Guide for information about the various event handlers available with XMLReader.

Extending an XML file format

XML is a great file format that you should consider using instead of plain text files with a custom format. A major advantage of using an XML file over a plain text file is that XML makes it much easier to update your file format.

For example, to add a “coach” attribute to the Team tag, you write the attribute when the file is saved and modify the loading code to process the attribute only if it exists.

In the saving code, add a new attribute after each Team node is created, such as this:

team.SetAttribute("coach", "Coach Mark")

With this change, create a new XML file and load it using the existing loading code. The XML file loads properly, but the new coach attribute is ignored. You have just extended your XML file format without breaking its ability to be loaded.

Of course, you'll want to update the loading code so that it can display the coach. To do that, add another line to get the coach attribute after getting the team name attribute:

XMLArea.AddText(team.GetAttribute("name") + EndOfLine)
XMLArea.AddText(team.GetAttribute("coach") + EndOfLine)
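
Files saved before the format change have no coach attribute. The sketch below shows one way to process the attribute only when it is present. It assumes that GetAttribute returns an empty string for an attribute that does not exist; check the Reference Guide to confirm this. The sample XML and the lines array are hypothetical.

```xojo
' Sketch: one team has a coach attribute, the other does not
Var xml As New XMLDocument
xml.LoadXML("<League><Team name=""Seagulls"" coach=""Coach Mark"" /><Team name=""Pigeons"" /></League>")

Var lines() As String
Var team As XMLNode = xml.FirstChild.FirstChild

While team <> Nil
  Var line As String = team.GetAttribute("name")

  ' Assumption: a missing attribute comes back as an empty string
  Var coach As String = team.GetAttribute("coach")
  If coach <> "" Then
    line = line + " (coach: " + coach + ")"
  End If

  lines.Add(line)
  team = team.NextSibling
Wend
```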

Searching the XML

To find things in the XML, you have these options:

  • Directly navigate to the tags to get the node you want.

  • Iterate through the tags and their values until you find what you want.

  • Search using XQL to find what you want.

Using the Team XML shown at the beginning of this topic, suppose you want to get the first player on the Crows. You can directly access that by relying on the structure of the XML like this:

Var teamXML As New XMLDocument
teamXML.LoadXml(kTeamXML) ' kTeamXML is the Team XML from the top of this topic

Var crowTeam As XMLNode

' First child is "League" and we want its 3rd child to get the Crows team
crowTeam = teamXML.FirstChild.Child(2)

' The first child of the Crows team is the player "Ben"
Var player As XMLNode = crowTeam.FirstChild

Var playerName As String = player.GetAttribute("name") ' = "Ben"

If you wanted to then change that player's position to shortstop, you can do so like this:

player.SetAttribute("position", "SS")
Var newXML As String = teamXML.ToString ' view the new XML

Or suppose you want to get all the players on the Crows. In this case you can go directly to the Crows node and then iterate through all its children like this:

Var teamXML As New XMLDocument
teamXML.LoadXml(kTeamXML)

Var crowTeam As XMLNode

' First child is "League" and we want its 3rd child to get the Crows team
crowTeam = teamXML.FirstChild.Child(2)

' Now iterate through all the children of the Crows team to get all its players
Var playerNames() As String
For i As Integer = 0 To crowTeam.ChildCount - 1
  playerNames.Add(crowTeam.Child(i).GetAttribute("name"))
Next

' playerNames = "Ben", "Ty"
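
Both examples above depend on the Crows being the third team in the file. As a sketch of the second option in the list (iterating until you find what you want), the following looks for the team by its name attribute instead of its position. The league variable is a hypothetical name; kTeamXML is the same constant used above.

```xojo
Var teamXML As New XMLDocument
teamXML.LoadXML(kTeamXML) ' kTeamXML is the Team XML from the top of this topic

Var league As XMLNode = teamXML.FirstChild
Var crowTeam As XMLNode ' stays Nil if no team is named "Crows"

' Check each team's name attribute and stop at the first match
For i As Integer = 0 To league.ChildCount - 1
  If league.Child(i).GetAttribute("name") = "Crows" Then
    crowTeam = league.Child(i)
    Exit For
  End If
Next
```

Code that uses crowTeam afterwards should first check that it is not Nil, since the file may not contain a team with that name.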

Lastly, what if you want to find all the first basemen? Each team has one, so you could iterate through every node to find them. Alternatively, you can use XQL (XML Query Language) to search for all nodes with "1B" as the value of the position attribute. XQL has its own syntax. This query finds all Player nodes whose position attribute is "1B":

//Player[@position="1B"]

Pass the query to the XMLDocument.XQL method to get back a list of nodes that match. Because the query is passed as a string literal, each quotation mark inside it is doubled:

Var teamXML As New XMLDocument
teamXML.LoadXml(kTeamXML)

' Use XQL (XML Query Language) to find all players with "1B" as the position attribute
Var firstBasemen As XmlNodeList
firstBasemen = teamXML.Xql("//Player[@position=""1B""]")

Var firstBasemenNames() As String
For i As Integer = 0 To firstBasemen.Length - 1
  firstBasemenNames.Add(firstBasemen.Item(i).GetAttribute("name"))
Next
' firstBasemenNames = "Bob", "Bill", "Ben"
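
A query can also select nodes by their parent. As a hedged sketch, the following asks for every Player under the Team whose name attribute is "Crows", which avoids depending on the order of the teams. It assumes that XQL accepts this XPath-style path with an attribute test followed by a child step; verify this against the Reference Guide. The names crowPlayers and crowPlayerNames are hypothetical.

```xojo
Var teamXML As New XMLDocument
teamXML.LoadXML(kTeamXML) ' kTeamXML is the Team XML from the top of this topic

' Assumed query: all Player nodes directly under the Team named "Crows"
Var crowPlayers As XMLNodeList
crowPlayers = teamXML.XQL("//Team[@name=""Crows""]/Player")

' Collect the name attribute of each matching node
Var crowPlayerNames() As String
For i As Integer = 0 To crowPlayers.Length - 1
  crowPlayerNames.Add(crowPlayers.Item(i).GetAttribute("name"))
Next
```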

See also

XMLDocument, XMLNode classes