Unfortunately, there is no foolproof process for determining
how to validate an arbitrary XML document. If you are receiving a
document, you should not leave the choice of validation mechanism to a
remote party (e.g. by downloading a DTD from a URI the document specifies).
Doing so opens your application to, at the very least, a potential
denial-of-service attack. A validation mechanism may not even be
specified in the document: W3C XML Schema (XSD) does not require one,
and RELAX NG does not seem to support such a mechanism. Then there are
XML documents that just don't have a schema of any form.
Nevertheless, there are times when you need to inspect a document
to find out what it is. Most commonly, you need to support multiple
versions of a document format, where the structure and validation
mechanisms change over time.
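
As a rough illustration of that kind of inspection, here is a minimal
sketch in Python using the standard library's expat bindings. It reads
only the prolog and the root start tag, records whatever hints the
document offers (DOCTYPE system identifier, root namespace,
xsi:schemaLocation) without fetching anything, and then picks a schema
from a local table. The names sniff, choose_local_schema and
LOCAL_SCHEMAS, the urn:example namespaces and the file paths are all
invented for this example. It is not a hardened parser, so untrusted
input should still be size-limited before it gets this far.

```python
import xml.parsers.expat

# Hypothetical registry: (namespace, root element) -> locally stored schema.
# The namespaces and file paths are invented for illustration.
LOCAL_SCHEMAS = {
    ("urn:example:order:v1", "order"): "schemas/order-v1.xsd",
    ("urn:example:order:v2", "order"): "schemas/order-v2.xsd",
}

# With namespace processing on, expat reports names as "uri local".
XSI_SCHEMA_LOCATION = "http://www.w3.org/2001/XMLSchema-instance schemaLocation"


class _RootSeen(Exception):
    """Raised internally to stop parsing after the root start tag."""


def sniff(xml_bytes):
    """Collect identifying hints from the prolog and root element only.

    Malformed input before the root element raises expat.ExpatError.
    """
    info = {
        "doctype_system_id": None,
        "namespace": None,
        "root": None,
        "schema_location_hint": None,
    }

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Record the declared DTD location; nothing is downloaded.
        info["doctype_system_id"] = system_id

    def on_start(name, attrs):
        namespace, _, local = name.rpartition(" ")
        info["namespace"] = namespace or None
        info["root"] = local
        # Kept as a hint only; it is never used to fetch a schema.
        info["schema_location_hint"] = attrs.get(XSI_SCHEMA_LOCATION)
        raise _RootSeen

    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    try:
        parser.Parse(xml_bytes, True)
    except _RootSeen:
        pass
    return info


def choose_local_schema(info):
    """Return the local schema path for a known document type, else None."""
    return LOCAL_SCHEMAS.get((info["namespace"], info["root"]))
```

The point of the design is that the document only identifies itself.
The decision about which schema to trust stays with the receiver, and
an unrecognised document simply yields None.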
Note: when talking about validation, this post is not
referring to whether the XML is well-formed. Any XML parser
should be able to check the syntax. This is about external constraints
imposed on the document structure via a schema, DTD, etc.
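
To make the distinction concrete, here is a small sketch of a
well-formedness check using Python's standard ElementTree parser. The
function name is_well_formed is just a label for this example. Passing
this check says nothing about whether the document satisfies any
schema or DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_bytes):
    """Return True if the input parses as XML, regardless of any schema."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError:
        return False
    return True
```

For instance, is_well_formed(b"<order><surprise/></order>") is true
even if the schema for order documents has never heard of a surprise
element, whereas b"<order><item></order>" fails because the item tag
is never closed.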