Let’s say you have XML like:
    <clients>
      <client>
        <name>foo</name>
        <address>...</address>
        <email>...</email>
        <orders>
          <order>
            <id>id1</id>
            <items>...</items>
          </order>
          <order>
            <id>id2</id>
            <items>...</items>
          </order>
        </orders>
      </client>
      <client> ... </client>
    </clients>
Now you’d like to get pairs of order ID and client name. And you’d like to do it in an elegant way, using XPath rather than DOM navigation or, worse, a SAX parser. Getting all “client” nodes is easy:
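    import libxml2

    doc = libxml2.parseFile('clients.xml')
    ctxt = doc.xpathNewContext()
    # select every <client> element under the root <clients>
    clients = ctxt.xpathEval('/clients/client')

    # clean up nicely, but only once you are done with the nodes;
    # the context is freed before the document it refers to
    ctxt.xpathFreeContext()
    doc.freeDoc()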
But how do you run an XPath query on each node you found, to get the client name and orders? You have to tell the context object to change its context node, so that the next query is evaluated relative to the node you chose:
    for client in clients:
        # make the following queries relative to this <client>
        ctxt.setContextNode(client)
        client_name = ctxt.xpathEval('name')[0].getContent()
        orders = ctxt.xpathEval('orders/order')
        for order in orders:
            # now make the queries relative to this <order>
            ctxt.setContextNode(order)
            order_id = ctxt.xpathEval('id')[0].getContent()
            print(order_id + " " + client_name)
And that’s it. I’m writing this down because the documentation for libxml2’s Python bindings is scarce, and it took me a while to find out about the setContextNode method.
Complete script:
    import libxml2

    doc = libxml2.parseFile('clients.xml')
    ctxt = doc.xpathNewContext()

    clients = ctxt.xpathEval('/clients/client')
    for client in clients:
        # make the following queries relative to this <client>
        ctxt.setContextNode(client)
        client_name = ctxt.xpathEval('name')[0].getContent()
        orders = ctxt.xpathEval('orders/order')
        for order in orders:
            # now make the queries relative to this <order>
            ctxt.setContextNode(order)
            order_id = ctxt.xpathEval('id')[0].getContent()
            print(order_id + " " + client_name)

    # clean up nicely: free the context first, then the document
    ctxt.xpathFreeContext()
    doc.freeDoc()
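If you want to cross-check the output without libxml2 installed, here is a rough sketch of the same nested, node-relative lookup using the standard library's xml.etree.ElementTree instead. This is a different library, not the libxml2 API: ElementTree supports only a subset of XPath, and its paths are always relative to the element you call them on, so there is no context object to move around. The sample data (including the second client, "bar") and the function name order_client_pairs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data shaped like the XML above, elided parts left out.
SAMPLE = """
<clients>
  <client>
    <name>foo</name>
    <orders>
      <order><id>id1</id></order>
      <order><id>id2</id></order>
    </orders>
  </client>
  <client>
    <name>bar</name>
    <orders>
      <order><id>id3</id></order>
    </orders>
  </client>
</clients>
"""


def order_client_pairs(xml_text):
    """Return a list of (order_id, client_name) tuples, one per order."""
    root = ET.fromstring(xml_text)  # root is the <clients> element
    pairs = []
    for client in root.findall('client'):
        # Paths are evaluated relative to the element they are called on.
        client_name = client.findtext('name')
        for order in client.findall('orders/order'):
            pairs.append((order.findtext('id'), client_name))
    return pairs


for order_id, client_name in order_client_pairs(SAMPLE):
    print(order_id + " " + client_name)
```

The loop structure mirrors the libxml2 version: the outer query picks the clients, and each inner query is scoped to one client or one order.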