[python] libxml2 xpath on child node

Let’s say you have XML like:

<clients>
   <client>
     <name>foo</name>
     <address>...</address>
     <email>...</email>
     <orders>
       <order>
          <id>id1</id>
          <items>...</items>
       </order>
       <order>
          <id>id2</id>
          <items>...</items>
       </order>
     </orders>
   </client>
   <client>
   ...
   </client>
</clients>

Now suppose you want to get (order_id, client_name) pairs, and you want to do it elegantly with XPath rather than by walking the DOM or, worse, writing a SAX parser. Getting all the “client” nodes is easy:

import libxml2

doc = libxml2.parseFile('clients.xml')
ctxt = doc.xpathNewContext()
clients = ctxt.xpathEval('/clients/client')

# clean up nicely (only once you are done running queries)
doc.freeDoc()
ctxt.xpathFreeContext()

But how do you then run an XPath query on each node you found, to get the client name and orders? You have to tell the context object to change its context node, so that the next query is evaluated relative to the node you chose:

for client in clients:
    # make this <client> the context node for the relative queries below
    ctxt.setContextNode(client)
    client_name = ctxt.xpathEval('name')[0].getContent()

    orders = ctxt.xpathEval('orders/order')
    for order in orders:
        # narrow the context further, to the current <order>
        ctxt.setContextNode(order)
        order_id = ctxt.xpathEval('id')[0].getContent()
        print(order_id + " " + client_name)

And that’s it. I’m writing this down because documentation for libxml2’s Python bindings is scarce, and it took me a while to discover the setContextNode method.

Complete script:

import libxml2

doc = libxml2.parseFile('clients.xml')
ctxt = doc.xpathNewContext()
clients = ctxt.xpathEval('/clients/client')

for client in clients:
    # make this <client> the context node for the relative queries below
    ctxt.setContextNode(client)
    client_name = ctxt.xpathEval('name')[0].getContent()

    orders = ctxt.xpathEval('orders/order')
    for order in orders:
        # narrow the context further, to the current <order>
        ctxt.setContextNode(order)
        order_id = ctxt.xpathEval('id')[0].getContent()
        print(order_id + " " + client_name)

# clean up nicely
doc.freeDoc()
ctxt.xpathFreeContext()
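
If you want to sanity-check the expected pairs on a machine without libxml2 installed, here is a rough sketch of the same relative-query pattern using the standard library’s xml.etree.ElementTree. Note that this is a swapped-in library, not the libxml2 bindings: in ElementTree, paths passed to an element are already resolved relative to that element, so there is no context object to switch. The sample data is cut down from the XML above, and the second client ("bar" with order "id3") is made up purely for illustration, as is the function name order_client_pairs.

```python
import xml.etree.ElementTree as ET

# Cut-down sample mirroring the structure above.
# The "bar" client and order "id3" are hypothetical illustration data.
SAMPLE_XML = """
<clients>
  <client>
    <name>foo</name>
    <orders>
      <order><id>id1</id></order>
      <order><id>id2</id></order>
    </orders>
  </client>
  <client>
    <name>bar</name>
    <orders>
      <order><id>id3</id></order>
    </orders>
  </client>
</clients>
"""


def order_client_pairs(xml_text):
    """Return (order_id, client_name) tuples in document order."""
    root = ET.fromstring(xml_text)  # root is the <clients> element
    pairs = []
    for client in root.findall('client'):
        # 'name' is looked up relative to this <client> element
        client_name = client.findtext('name')
        for order in client.findall('orders/order'):
            # 'id' is looked up relative to this <order> element
            pairs.append((order.findtext('id'), client_name))
    return pairs


for order_id, client_name in order_client_pairs(SAMPLE_XML):
    print(order_id + " " + client_name)
```

ElementTree only supports a limited subset of XPath, so for anything beyond simple child paths like these the libxml2 approach above is still the one to reach for.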

3 thoughts on “[python] libxml2 xpath on child node”

  1. 6 years later: Thank you very much, I was lost looking into the poor Python documentation exactly for this! It looks like they didn’t make the docs any better in this whole time :)

    1. Awesome! I’m very happy that the things I’m writing here are useful, especially after this long time. That’s why I’ve started this blog in the first place – to write how I resolved the problems I’ve encountered, so other people don’t have to dig deep to find the solution.
