XPath is a handy expression language for running queries on XML. This post is about how to use it with XML namespaces in Java (javax.xml.xpath).
This Java code uses an XPath expression to extract the value of the bar attribute from a simple document:
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
String xml = "<data><foo bar=\"hello\" /></data>";
String value = xpath.evaluate(
    "/data/foo/attribute::bar",
    new InputSource(new StringReader(xml)));
System.out.println(value);
When run, it prints hello on the console.
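As an aside, XPath 1.0 also defines @ as an abbreviation for the attribute:: axis. The sketch below is a self-contained variant of the listing above; the class name AttributeAxisDemo and its evaluate helper are illustrative and not part of the original post. Both spellings should select the same attribute.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; the name is not from the original post.
final class AttributeAxisDemo {

  // Evaluates an XPath expression against an XML string and returns
  // the string value of the result.
  static String evaluate(String expression, String xml) {
    XPath xpath = XPathFactory.newInstance().newXPath();
    try {
      return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
    } catch (XPathExpressionException e) {
      throw new IllegalArgumentException(expression, e);
    }
  }

  public static void main(String[] args) {
    String xml = "<data><foo bar=\"hello\" /></data>";
    // Full axis syntax, as used in the listing above.
    System.out.println(evaluate("/data/foo/attribute::bar", xml));
    // Abbreviated syntax for the same attribute.
    System.out.println(evaluate("/data/foo/@bar", xml));
  }
}
```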
XML with namespaces
When the XML uses namespaces, things get a little trickier. These two documents are functionally equivalent:
<?xml version="1.0" encoding="utf-8"?>
<!-- ns1.xml -->
<data xmlns:foo="http://foo" xmlns:bar="http://bar"
      xmlns="http://def">
  <foo:value>1</foo:value>
  <bar:value>2</bar:value>
</data>

<?xml version="1.0" encoding="utf-8"?>
<!-- ns2.xml -->
<data xmlns:bar="http://foo" xmlns:foo="http://bar"
      xmlns="http://def">
  <bar:value>1</bar:value>
  <foo:value>2</foo:value>
</data>
Note that the namespace prefixes (foo and bar) have been swapped round, but the value element in the namespace http://foo contains the value 1 in both documents. Likewise, the value element in the http://bar namespace contains the number 2 in both documents.
Since the namespace prefixes can vary between documents, namespaced XPath expressions need to map their own prefixes to the URIs. The namespace URIs act as constant identifiers - that's their job! In the Java API, this mapping is performed by implementing the NamespaceContext interface.
This code uses a NamespaceContext to extract the value in the http://foo namespace from each of the documents:
InputSource ns1xml = new InputSource("ns1.xml");
InputSource ns2xml = new InputSource("ns2.xml");
NamespaceContext context = new NamespaceContextMap(
    "foo", "http://foo",
    "bar", "http://bar",
    "def", "http://def");
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
xpath.setNamespaceContext(context);
XPathExpression expression = xpath.compile("/def:data/foo:value");
System.out.println(expression.evaluate(ns1xml));
System.out.println(expression.evaluate(ns2xml));
Note that the expression was compiled for reuse. Output:
1
1
The prefixes given to the context only need to be consistent with
the XPath expressions, not the documents. This code works just as well:
InputSource ns1xml = new InputSource("ns1.xml");
InputSource ns2xml = new InputSource("ns2.xml");
NamespaceContext context = new NamespaceContextMap(
    "abc", "http://foo",
    "pqr", "http://bar",
    "xyz", "http://def");
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
xpath.setNamespaceContext(context);
XPathExpression expression = xpath.compile("/xyz:data/abc:value");
System.out.println(expression.evaluate(ns1xml));
System.out.println(expression.evaluate(ns2xml));
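For readers who want to try this without creating files, here is a minimal self-contained sketch. It inlines the two documents as strings and uses a small lookup-only NamespaceContext in place of NamespaceContextMap. The class name PrefixSwapDemo and the lookupOnly helper are hypothetical, and the sketch assumes the XPath engine only ever calls getNamespaceURI; it deliberately does not implement the full NamespaceContext contract.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; names here are illustrative, not from the post.
final class PrefixSwapDemo {

  // Same content as ns1.xml and ns2.xml, inlined so no files are needed.
  static final String NS1 = "<data xmlns:foo=\"http://foo\" xmlns:bar=\"http://bar\""
      + " xmlns=\"http://def\"><foo:value>1</foo:value><bar:value>2</bar:value></data>";
  static final String NS2 = "<data xmlns:bar=\"http://foo\" xmlns:foo=\"http://bar\""
      + " xmlns=\"http://def\"><bar:value>1</bar:value><foo:value>2</foo:value></data>";

  // Lookup-only context. ASSUMPTION: the XPath engine only calls
  // getNamespaceURI, so the other two methods are left unsupported.
  static NamespaceContext lookupOnly(final Map<String, String> prefixToUri) {
    return new NamespaceContext() {
      public String getNamespaceURI(String prefix) {
        if (prefix == null) {
          throw new IllegalArgumentException("null prefix");
        }
        String uri = prefixToUri.get(prefix);
        return uri == null ? XMLConstants.NULL_NS_URI : uri;
      }

      public String getPrefix(String namespaceURI) {
        throw new UnsupportedOperationException();
      }

      public Iterator<String> getPrefixes(String namespaceURI) {
        throw new UnsupportedOperationException();
      }
    };
  }

  // Evaluates the expression using prefixes chosen by the caller
  // (abc, pqr, xyz), which appear in neither document.
  static String evaluate(String expression, String xml) {
    Map<String, String> mappings = new HashMap<String, String>();
    mappings.put("abc", "http://foo");
    mappings.put("pqr", "http://bar");
    mappings.put("xyz", "http://def");
    XPath xpath = XPathFactory.newInstance().newXPath();
    xpath.setNamespaceContext(lookupOnly(mappings));
    try {
      // A fresh InputSource per call: a StringReader cannot be re-read.
      return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
    } catch (XPathExpressionException e) {
      throw new IllegalArgumentException(expression, e);
    }
  }

  public static void main(String[] args) {
    System.out.println(evaluate("/xyz:data/abc:value", NS1));
    System.out.println(evaluate("/xyz:data/abc:value", NS2));
    System.out.println(evaluate("/xyz:data/pqr:value", NS1));
    System.out.println(evaluate("/xyz:data/pqr:value", NS2));
  }
}
```

The abc expression should yield the http://foo value from both documents and the pqr expression the http://bar value, regardless of which prefix each document happens to use.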
Unfortunately, there are no implementations of NamespaceContext
provided in the standard library (well, there is one in StAX but it is
of limited utility). If you choose to implement it yourself, take note
of the entire contract as defined in the javadoc. A sample
implementation is provided below.
Note 2011/07: I've corrected the above listings to remove a namespace mapping of ("", "http://def") with an expression starting with /:data. This expression is not legal syntax - see the comments for more details.
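To illustrate why the default namespace still needs a prefix in the expression: in XPath 1.0 an unprefixed name in a step matches only names in no namespace, so it does not pick up a document's xmlns default. The sketch below uses a hypothetical one-element document and a hypothetical class DefaultNamespaceDemo; the single-prefix context is lookup-only and assumes the engine calls nothing but getNamespaceURI.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original post.
final class DefaultNamespaceDemo {

  // Illustrative document: every element is in the default namespace.
  static final String XML =
      "<data xmlns=\"http://def\"><value>1</value></data>";

  // Evaluates with no NamespaceContext set at all.
  static String withoutContext(String expression) {
    return evaluate(XPathFactory.newInstance().newXPath(), expression);
  }

  // Evaluates with one caller-chosen prefix bound to the given URI.
  static String withPrefix(String expression, final String prefix, final String uri) {
    XPath xpath = XPathFactory.newInstance().newXPath();
    // Lookup-only context: ASSUMES only getNamespaceURI is called.
    xpath.setNamespaceContext(new NamespaceContext() {
      public String getNamespaceURI(String p) {
        if (p == null) {
          throw new IllegalArgumentException("null prefix");
        }
        return prefix.equals(p) ? uri : XMLConstants.NULL_NS_URI;
      }

      public String getPrefix(String namespaceURI) {
        throw new UnsupportedOperationException();
      }

      public Iterator<String> getPrefixes(String namespaceURI) {
        throw new UnsupportedOperationException();
      }
    });
    return evaluate(xpath, expression);
  }

  private static String evaluate(XPath xpath, String expression) {
    try {
      return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
    } catch (XPathExpressionException e) {
      throw new IllegalArgumentException(expression, e);
    }
  }

  public static void main(String[] args) {
    // Unprefixed steps look for elements in no namespace: nothing matches,
    // so the result is the empty string.
    System.out.println("[" + withoutContext("/data/value") + "]");
    // Binding any non-empty prefix to http://def makes the steps match.
    System.out.println("[" + withPrefix("/d:data/d:value", "d", "http://def") + "]");
  }
}
```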
Listing: NamespaceContextMap.java
package xml; // the test listing imports xml.NamespaceContextMap

import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

/**
 * An implementation of <a
 * href="http://java.sun.com/javase/6/docs/api/javax/xml/namespace/NamespaceContext.html">
 * NamespaceContext </a>. Instances are immutable.
 *
 * @author McDowell
 */
public final class NamespaceContextMap implements NamespaceContext {

  private final Map<String, String> prefixMap;
  private final Map<String, Set<String>> nsMap;

  /**
   * Constructor that takes a map of XML prefix-namespaceURI values. A defensive
   * copy is made of the map. An IllegalArgumentException will be thrown if the
   * map attempts to remap the standard prefixes defined in the NamespaceContext
   * contract.
   *
   * @param prefixMappings
   *          a map of prefix:namespaceURI values
   */
  public NamespaceContextMap(Map<String, String> prefixMappings) {
    prefixMap = createPrefixMap(prefixMappings);
    nsMap = createNamespaceMap(prefixMap);
  }

  /**
   * Convenience constructor.
   *
   * @param mappingPairs
   *          pairs of prefix-namespaceURI values
   */
  public NamespaceContextMap(String... mappingPairs) {
    this(toMap(mappingPairs));
  }

  private static Map<String, String> toMap(String... mappingPairs) {
    // reject a dangling prefix with no URI instead of failing on an index
    if (mappingPairs.length % 2 != 0) {
      throw new IllegalArgumentException(
          "expected prefix-namespaceURI pairs");
    }
    Map<String, String> prefixMappings = new HashMap<String, String>(
        mappingPairs.length / 2);
    // step through the arguments two at a time: prefix, then URI
    for (int i = 0; i < mappingPairs.length; i += 2) {
      prefixMappings.put(mappingPairs[i], mappingPairs[i + 1]);
    }
    return prefixMappings;
  }

  private Map<String, String> createPrefixMap(
      Map<String, String> prefixMappings) {
    Map<String, String> prefixMap = new HashMap<String, String>(
        prefixMappings);
    // the contract fixes the bindings of the "xml" and "xmlns" prefixes
    addConstant(prefixMap, XMLConstants.XML_NS_PREFIX,
        XMLConstants.XML_NS_URI);
    addConstant(prefixMap, XMLConstants.XMLNS_ATTRIBUTE,
        XMLConstants.XMLNS_ATTRIBUTE_NS_URI);
    return Collections.unmodifiableMap(prefixMap);
  }

  private void addConstant(Map<String, String> prefixMap,
      String prefix, String nsURI) {
    String previous = prefixMap.put(prefix, nsURI);
    if (previous != null && !previous.equals(nsURI)) {
      throw new IllegalArgumentException(prefix + " -> "
          + previous + "; see NamespaceContext contract");
    }
  }

  private Map<String, Set<String>> createNamespaceMap(
      Map<String, String> prefixMap) {
    // invert the prefix map: each URI maps to every prefix bound to it
    Map<String, Set<String>> nsMap = new HashMap<String, Set<String>>();
    for (Map.Entry<String, String> entry : prefixMap.entrySet()) {
      String nsURI = entry.getValue();
      Set<String> prefixes = nsMap.get(nsURI);
      if (prefixes == null) {
        prefixes = new HashSet<String>();
        nsMap.put(nsURI, prefixes);
      }
      prefixes.add(entry.getKey());
    }
    // make the prefix sets read-only so iterators cannot remove entries
    for (Map.Entry<String, Set<String>> entry : nsMap.entrySet()) {
      Set<String> readOnly = Collections
          .unmodifiableSet(entry.getValue());
      entry.setValue(readOnly);
    }
    return nsMap;
  }

  @Override
  public String getNamespaceURI(String prefix) {
    checkNotNull(prefix);
    String nsURI = prefixMap.get(prefix);
    return nsURI == null ? XMLConstants.NULL_NS_URI : nsURI;
  }

  @Override
  public String getPrefix(String namespaceURI) {
    checkNotNull(namespaceURI);
    Set<String> set = nsMap.get(namespaceURI);
    return set == null ? null : set.iterator().next();
  }

  @Override
  public Iterator<String> getPrefixes(String namespaceURI) {
    checkNotNull(namespaceURI);
    Set<String> set = nsMap.get(namespaceURI);
    // an unbound URI yields an empty iterator rather than a NullPointerException
    if (set == null) {
      return Collections.<String> emptySet().iterator();
    }
    return set.iterator();
  }

  private void checkNotNull(String value) {
    if (value == null) {
      throw new IllegalArgumentException("null");
    }
  }

  /**
   * @return an unmodifiable map of the mappings in the form prefix-namespaceURI
   */
  public Map<String, String> getMap() {
    return prefixMap;
  }
}
Listing: NamespaceContextMapTest.java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import org.junit.Assert;
import org.junit.Test;
import xml.NamespaceContextMap;

//JUnit 4 test
public class NamespaceContextMapTest {

  @Test
  public void testContext() {
    Map<String, String> mappings = new HashMap<String, String>();
    mappings.put("foo", "http://foo");
    mappings.put("altfoo", "http://foo");
    mappings.put("bar", "http://bar");
    mappings.put(XMLConstants.XML_NS_PREFIX,
        XMLConstants.XML_NS_URI);
    NamespaceContext context = new NamespaceContextMap(mappings);
    for (Map.Entry<String, String> entry : mappings.entrySet()) {
      String prefix = entry.getKey();
      String namespaceURI = entry.getValue();
      Assert.assertEquals("namespaceURI", namespaceURI,
          context.getNamespaceURI(prefix));
      boolean found = false;
      Iterator<?> prefixes = context.getPrefixes(namespaceURI);
      while (prefixes.hasNext()) {
        if (prefix.equals(prefixes.next())) {
          found = true;
          break;
        }
        try {
          prefixes.remove();
          Assert.fail("rw");
        } catch (UnsupportedOperationException e) {
          // expected: the iterator must be read-only
        }
      }
      Assert.assertTrue("prefix: " + prefix, found);
      Assert.assertNotNull("prefix: " + prefix,
          context.getPrefix(namespaceURI));
    }
    Map<String, String> ctxtMap = ((NamespaceContextMap) context)
        .getMap();
    for (Map.Entry<String, String> entry : mappings.entrySet()) {
      Assert.assertEquals(entry.getValue(),
          ctxtMap.get(entry.getKey()));
    }
    System.out.println(context.toString());
  }

  @Test
  public void testModify() {
    NamespaceContextMap context = new NamespaceContextMap();
    try {
      Map<String, String> ctxtMap = context.getMap();
      ctxtMap.put("a", "b");
      Assert.fail("rw");
    } catch (UnsupportedOperationException e) {
      // expected: the map must be read-only
    }
    try {
      Iterator<String> it = context
          .getPrefixes(XMLConstants.XML_NS_URI);
      it.next();
      it.remove();
      Assert.fail("rw");
    } catch (UnsupportedOperationException e) {
      // expected: the iterator must be read-only
    }
  }

  @Test
  public void testConstants() {
    NamespaceContext context = new NamespaceContextMap();
    Assert.assertEquals(XMLConstants.XML_NS_URI,
        context.getNamespaceURI(XMLConstants.XML_NS_PREFIX));
    Assert.assertEquals(XMLConstants.XMLNS_ATTRIBUTE_NS_URI,
        context.getNamespaceURI(XMLConstants.XMLNS_ATTRIBUTE));
    Assert.assertEquals(XMLConstants.XML_NS_PREFIX,
        context.getPrefix(XMLConstants.XML_NS_URI));
    // reverse lookup for xmlns (the original repeated the forward lookup here)
    Assert.assertEquals(XMLConstants.XMLNS_ATTRIBUTE,
        context.getPrefix(XMLConstants.XMLNS_ATTRIBUTE_NS_URI));
  }
}
Found a great impl: org.apache.ws.commons.util.NamespaceContextImpl.
You can use the following maven dependency for it:
<dependency>
  <groupId>org.apache.ws.commons</groupId>
  <artifactId>ws-commons-util</artifactId>
  <version>1.0.1</version>
  <scope>test</scope>
</dependency>
Thanks for your post, it's a pity we cannot find any implementation of NamespaceContext provided in the standard library.
Can I use your sample code for the NamespaceContextMap or is it protected by copyright?
Thanks again and seeya
@Anonymous - anyone is free to use the sample code in this post with the caveats noted at the bottom of the page.
Is there any way to use the default namespace without using ":" (as in the standard?), i.e. /data/foo:value instead of /:data/foo:value?
I am using xalan 2.7.1 and it doesn't work, and if I use saxon I get "Unexpected colon at start of token".
@Anonymous - I wasn't aware that implementations varied. I would be inclined to just namespace everything: "/xyz:data/abc:value"
According to http://www.w3.org/2007/01/applets/xpathApplet.html
/:data/foo:value is an invalid expression.
@Anonymous - thanks for the link to the applet; I was not aware of it.
Prompted by the comments, I've checked the spec. Namespace prefixes must be at least one character long.
Here are the relevant parts of the lexical structure:
PrefixedName ::= Prefix ':' LocalPart
Prefix ::= NCName
NCName ::= Name - (Char* ':' Char*) /* An XML Name, minus the ":" */
Name ::= NameStartChar (NameChar)*
Support for XPath in the Java 6 runtime is for version 1.0.
The fact that /:elementName worked as an expression was just an accident of the implementation.
I shall correct the post.
dang, I just implemented this in Clojure as a function that takes a hash-map of prefixes to URI strings, and returns a full implementation of NamespaceContext, and it's literally 7 lines of code.
(defn namespace-map
  [mapping]
  (let [prefixes (fn [uri] (map key (filter #(= uri (val %)) mapping)))]
    (proxy [Object NamespaceContext] []
      (getNamespaceURI [prefix] (get mapping prefix))
      (getPrefix [uri] (first (prefixes uri)))
      (getPrefixes [uri] (.iterator (prefixes uri))))))
Nice, but note that your type does not meet the class contract for NamespaceContext as it does not perform the special constant handling required by the API documentation.
Thanks! I fixed it up.
(defn namespace-map
  "Returns an implementation of NamespaceContext ... actual usefulness TBD"
  [mapping]
  (let [defaults {XMLConstants/XML_NS_PREFIX XMLConstants/XML_NS_URI
                  XMLConstants/XMLNS_ATTRIBUTE XMLConstants/XMLNS_ATTRIBUTE_NS_URI}
        mapping (merge mapping defaults)
        prefixes (fn [uri] (map key (filter #(= uri (val %)) mapping)))]
    (proxy [Object NamespaceContext] []
      (getNamespaceURI [prefix] (get mapping prefix))
      (getPrefix [uri] (first (prefixes uri)))
      (getPrefixes [uri] (.iterator (prefixes uri))))))
I find this article http://www.ibm.com/developerworks/xml/library/x-nmspccontext/index.html?ca=drs- from IBM very helpful. It provides 3 approaches:
1) Hard-coded solution. Implement the NamespaceContext interface and hard-code the mapping in the code. Only works for the XML you are targeting.
2) Read namespaces from the document. Use Document.lookupNamespaceURI(String prefix) and Document.lookupPrefix(String namespaceURI). Works for all XML files but needs a lookup each time an XPath is evaluated.
3) Read the namespaces from the document and cache them. Only look up namespaces once in the constructor, then cache them.
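As a rough illustration of approach 2, the following sketch delegates lookups to a parsed DOM document. It is not the IBM article's code: the class name DocumentNamespaceContext and its evaluate helper are hypothetical, getPrefixes is left unsupported on the assumption that XPath evaluation does not need it, and only declarations in scope at the document element are found.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical class; a sketch of "read namespaces from the document".
final class DocumentNamespaceContext implements NamespaceContext {

  private final Document document;

  DocumentNamespaceContext(Document document) {
    this.document = document;
  }

  public String getNamespaceURI(String prefix) {
    if (prefix == null) {
      throw new IllegalArgumentException("null prefix");
    }
    // Fixed bindings required by the NamespaceContext contract.
    if (XMLConstants.XML_NS_PREFIX.equals(prefix)) {
      return XMLConstants.XML_NS_URI;
    }
    if (XMLConstants.XMLNS_ATTRIBUTE.equals(prefix)) {
      return XMLConstants.XMLNS_ATTRIBUTE_NS_URI;
    }
    // DOM uses null, not "", to ask for the default namespace.
    String uri = document.lookupNamespaceURI(prefix.length() == 0 ? null : prefix);
    return uri == null ? XMLConstants.NULL_NS_URI : uri;
  }

  public String getPrefix(String namespaceURI) {
    if (namespaceURI == null) {
      throw new IllegalArgumentException("null namespaceURI");
    }
    return document.lookupPrefix(namespaceURI);
  }

  public Iterator<String> getPrefixes(String namespaceURI) {
    // ASSUMPTION: not needed for XPath evaluation in this sketch.
    throw new UnsupportedOperationException();
  }

  // Parses the XML with namespace awareness switched on, then evaluates
  // the expression using the document's own prefixes.
  static String evaluate(String xml, String expression) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      Document document = factory.newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      XPath xpath = XPathFactory.newInstance().newXPath();
      xpath.setNamespaceContext(new DocumentNamespaceContext(document));
      return xpath.evaluate(expression, document);
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Same content as ns2.xml from the post: bar is bound to http://foo.
    String ns2 = "<data xmlns:bar=\"http://foo\" xmlns:foo=\"http://bar\""
        + " xmlns=\"http://def\"><bar:value>1</bar:value><foo:value>2</foo:value></data>";
    System.out.println(evaluate(ns2, "//bar:value"));
  }
}
```

One trade-off with this design: the expression now has to use whatever prefixes the document declares, so the same expression can select different elements in ns1.xml and ns2.xml. That gives up the prefix independence the post describes.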