Quick Start#
Python Support#
- This library supports these Python versions::
3.11
This library may be compatible with earlier versions of python as well.
Installation#
Use pip to install this libary:
pip install lxml-util
Introduction#
This is a python library providing the additional functionality to the lxml python library.
Limitations of lxml library:
When e is object of lxml.etree.Element
- e.find() cannot handle prefixed tag names like “c:chart” and raises
exceptions
- e.findall() cannot handle prefixed tag names like “c:chart” and raises
exceptions
- e.makeelement() cannot handle prefixed tag names like “c:chart” and
raises exceptions
- e.get() cannot handle prefixed attribute names like “r:id” and raises
exceptions
- e.set() cannot handle prefixed attribute names like “r:id” and raises
exceptions
- e does not have direct access to its children through the attributes.
e.chart gives attribute error though chart is child of element e
- No handy method in e object to find the local name. Have to use
etree.QName object to find the namespace of the element
- No handy method in e object to find the namespace. Have to use
etree.QName object to find the namespace of the element
- No handy method in e object to get the fully Qualified Name from
prefixed tag name
- Advantages of lxml-util library:
All the above mentioned limitations are handled in this library
from lxmlutil.etree import ElementBase
from lxml import etree
fallback = etree.ElementDefaultClassLookup(ElementBase)
parser = etree.XMLParser()
parser.set_element_class_lookup(fallback)
e = etree.parse("path/to/xml/file", parser).getroot()
e.nsmap
# {'c': 'http://schemas.openxmlformats.org/drawingml/2006/chart',
# 'a': 'http://schemas.openxmlformats.org/drawingml/2006/main',
# 'r': 'http://schemas.openxmlformats.org/officeDocument/2006/relationships',
# 'c16r2': 'http://schemas.microsoft.com/office/drawing/2015/06/chart'}
# Object e now has additional methods defined in lxmlutil.etree.ElementBase
# which are not available in lxml.etree.ElementBase
e.qn("c:chart") # "{http://schemas.openxmlformats.org/drawingml/2006/chart}chart"
e.ln # "chartSpace"
e.ns # "http://schemas.openxmlformats.org/drawingml/2006/chart"
e.find("c:chart") # Gives error, cannot handle prefixed tag
e.findqn("c:chart") # <Element {http://schemas.openxmlformats.org/drawingml/2006/chart}chart at 0x189d739f520>
e.find("c:chart//c:areaChart") # Gives error, cannot handle prefixed tag
e.findqn("c:chart//c:areaChart") # <Element {http://schemas.openxmlformats.org/drawingml/2006/chart}areaChart at 0x189d739f700>
e.findall("c:chart//c:areaChart/c:axId") # Gives error, cannot handle prefixed tag
e.findallqn("c:chart//c:areaChart/c:axId") # [<Element {http://schemas.openxmlformats.org/drawingml/2006/chart}axId at 0x189d739f7a0>, <Element {http://schemas.openxmlformats.org/drawingml/2006/chart}axId at 0x189d739f7f0>]
e.get('r:id') # Gives error, cannot handled prefixed attribute names
e.getqn('r:id') # resolves the prefixed name and gets the attribute
e.set('r:id', "rId4") # Gives error, cannot handle prefixed attribute names
e.setqn('r:id', "rId4") # resolves the prefixed name and sets the attribute
e.makeelement('c:graph') # Gives error, cannot handle prefixed tag names
e.meqn('c:graph') # <Element {http://schemas.openxmlformats.org/drawingml/2006/chart}graph at 0x189d739f7f0>
# Dynamic attributes:
e.c_chart # equivalent to e.findqn('c:chart')
# <Element {http://schemas.openxmlformats.org/drawingml/2006/chart}chart at 0x189d739f7f0>
# c is prefix, chart is localname
e.c_chart__c_plotArea # equivalent to e.findqn('c:chart/c:plotArea')
# <Element {http://schemas.openxmlformats.org/drawingml/2006/chart}plotArea at 0x189d739f7a0>
# Two underscores means './'
e.c_chart___c_areaChart # equivalent to e.findqn('c:chart//c:areaChart')
# <Element {http://schemas.openxmlformats.org/drawingml/2006/chart}areaChart at 0x189d739f520>
# Three underscores means './/'
e.c_externalData_r_id # equivalent to e.findqn('c:externalData[@r:id]')
# <Element {http://schemas.openxmlformats.org/drawingml/2006/chart}externalData at 0x189d739f8e0>
# c is prefix for element tag
# externalData is element localname
# r is attribute prefix for element 'c:externalData'
# id is attribute name for element 'c:externalData'
e.___mc_Choice # equivalent to e.findqn('.//mc:Choice')
# <Element {http://schemas.openxmlformats.org/markup-compatibility/2006}Choice at 0x189d739fd90>
#Though mc is not in nsmap of e, this method tries to find what could
#be the namespace of mc prefix based on nsmap of children of element e