Working with Child Nodes

Because the nodes in an XML document are arranged in a tree, you need to process child nodes in order to move from level to level. The following code shows how to read all the nodes in a DOM tree and print them out in XML format:

' Import System.Xml and add a project reference to it
Imports System
Imports System.Collections
Imports System.Xml

Module Module1

    Sub Main()
        ' Create a new XmlDocument object
        Dim xd As New XmlDocument()
        Try
            xd.Load("\XmlFile1.xml")
        Catch e As XmlException
            Console.WriteLine("Exception caught: " + e.ToString())
            ' Nothing was parsed, so there is nothing to print
            Return
        End Try

        ' Now we've parsed the file, get the Document Element
        Dim doc As XmlNode = xd.DocumentElement

        ' Process any child nodes
        If doc.HasChildNodes Then
            processChildren(doc, 0)
        End If
    End Sub

    Private Sub processChildren(ByVal xn As XmlNode, ByVal level As Integer)
        Dim istr As String
        istr = indent(level)

        Select Case xn.NodeType
            Case XmlNodeType.Comment
                ' output the comment
                Console.WriteLine(istr + "<!--" + xn.Value + "-->")
            Case XmlNodeType.ProcessingInstruction
                ' output the PI
                Console.WriteLine(istr + "<?" + xn.Name + " " + xn.Value + "?>")
            Case XmlNodeType.Text
                ' output the text
                Console.WriteLine(istr + xn.Value)
            Case XmlNodeType.Element
                ' Get the child node list
                Dim ch As XmlNodeList = xn.ChildNodes

                ' Write the start tag
                Console.Write(istr + "<" + xn.Name)

                ' Process the attributes
                Dim atts As XmlAttributeCollection = xn.Attributes
                If Not atts Is Nothing Then
                    Dim en As IEnumerator = atts.GetEnumerator()
                    While en.MoveNext()
                        Dim at As XmlNode = CType(en.Current, XmlNode)
                        ' Quote the value so the output is well-formed XML
                        Console.Write(" " + at.Name + "=""" + at.Value + """")
                    End While
                End If

                ' Close the start tag
                Console.WriteLine(">")

                ' recursively process child nodes
                Dim ie As IEnumerator = ch.GetEnumerator()
                While ie.MoveNext()
                    Dim nd As XmlNode = CType(ie.Current, XmlNode)
                    processChildren(nd, level + 2)
                End While

                ' Write the end tag
                Console.WriteLine(istr + "</" + xn.Name + ">")
        End Select
    End Sub

    ' Function to return a string representing the indentation level
    Private Function indent(ByVal i As Integer) As String
        Dim s As String
        If i <= 0 Then
            Return ""
        End If
        ' Build a string of i spaces
        s = New String(" "c, i)
        Return s
    End Function

End Module

In this code, I start in Main() by using the HasChildNodes property to check whether there is anything to be done. HasChildNodes is a Boolean property: it returns True if the current node has at least one child and False if it has none. (If you need the actual number of children, use ChildNodes.Count.) If there are child nodes, processChildren() is called. This routine maintains an indentation level so that the output can be printed to look like properly indented XML. I first call the private indent() function to create a string that can be prepended to each line to maintain the indentation.
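As a quick check of what HasChildNodes reports, here is a minimal sketch that loads a small document from a string and compares the property with ChildNodes.Count. The XML literal and the module name HasChildNodesDemo are made up for illustration and are not part of the listing above.

```vbnet
Imports System
Imports System.Xml

Module HasChildNodesDemo
    Sub Main()
        Dim xd As New XmlDocument()
        ' Hypothetical sample document, not the XmlFile1.xml used above
        xd.LoadXml("<root><item>one</item><empty/></root>")

        Dim root As XmlNode = xd.DocumentElement
        ' HasChildNodes is a Boolean; ChildNodes.Count gives the actual number
        Console.WriteLine(root.HasChildNodes)      ' True
        Console.WriteLine(root.ChildNodes.Count)   ' 2

        ' The <empty/> element has no children at all
        Dim emptyNode As XmlNode = root.LastChild
        Console.WriteLine(emptyNode.HasChildNodes) ' False
    End Sub
End Module
```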

Once that has been done, a Select Case statement is used to match the node type. I am not checking for every possible node type, but I am processing the most common ones. Comments and processing instructions have their values printed out enclosed in the appropriate delimiters, whereas text is printed out as is. XML elements are more interesting because they can have both attributes and child nodes of their own. Attributes are represented by a collection of name/value pairs called an XmlAttributeCollection, whereas child nodes are represented by a list called an XmlNodeList.

I use the ChildNodes property to get a list of the children as an XmlNodeList, and write the starting angle bracket < and the node name.

If there are any attributes for this node, I'll need to put them between the name and the closing angle bracket, so I get the attributes as an XmlAttributeCollection. The Attributes property returns a null reference (Nothing) only for nodes that are not elements; an element without attributes yields an empty collection, so the check against Nothing is purely defensive here. The GetEnumerator() method returns an enumerator to walk over the collection, and I can then use the Name and Value properties on each node to output the attribute. Once all the attributes have been output, I can write the closing angle bracket for the start tag.
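Because XmlAttributeCollection is enumerable, the explicit enumerator can also be replaced with a For Each loop. The sketch below shows that alternative; it is not the approach used in the listing, and the sample element and the module name AttributeDemo are invented for illustration.

```vbnet
Imports System
Imports System.Xml

Module AttributeDemo
    Sub Main()
        Dim xd As New XmlDocument()
        ' Hypothetical sample element with two attributes
        xd.LoadXml("<book id=""42"" lang=""en""/>")

        Dim el As XmlNode = xd.DocumentElement
        Console.Write("<" + el.Name)

        ' For Each walks the collection without a hand-written enumerator
        For Each at As XmlAttribute In el.Attributes
            Console.Write(" " + at.Name + "=""" + at.Value + """")
        Next

        Console.WriteLine("/>")
    End Sub
End Module
```

For this input the program prints <book id="42" lang="en"/>. Note that Value is written as is, so a value containing characters such as & or " would need escaping before the output could be reparsed.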

Because all the children are XmlNodes as well, I can process them by making a recursive call to the processChildren() routine, increasing the indent level by two for each nested call.
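The recursion pattern is easier to see in a stripped-down form. The following sketch prints only element names, indented by nesting depth; the routine name PrintNames, the module name, and the sample XML are hypothetical and serve only to isolate the recursive step.

```vbnet
Imports System
Imports System.Xml

Module RecursionDemo
    Sub Main()
        Dim xd As New XmlDocument()
        ' Hypothetical sample document with three levels of nesting
        xd.LoadXml("<a><b><c/></b><d/></a>")
        PrintNames(xd.DocumentElement, 0)
    End Sub

    ' Prints each element name, indented two spaces per nesting level
    Private Sub PrintNames(ByVal xn As XmlNode, ByVal level As Integer)
        ' Skip anything that is not an element (text, comments, and so on)
        If xn.NodeType <> XmlNodeType.Element Then Return

        Console.WriteLine(New String(" "c, level) + xn.Name)

        ' Each child is itself an XmlNode, so the same routine handles it
        For Each child As XmlNode In xn.ChildNodes
            PrintNames(child, level + 2)
        Next
    End Sub
End Module
```

Running this prints a at column zero, b and d indented by two spaces, and c indented by four, which mirrors how the level argument grows with each nested call.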

When you compile and run the program, you'll find that the output looks very similar to the input, differing only on minor points of formatting.
