Why is XML Schema so restrictive?
Friday, July 4th, 2008
In one of our recent projects we have to deal with XML files that are not 100% correct. We process schematic plans of train tracks that are defined in XML. The XML is generated from a database, and the data in that database are not clean, so we don't always get correct XML. I mean the structure is OK, but the data are not always valid. For example, you can have a turnout that refers to a track that does not exist. A turnout can also have an invalid string or an empty string as its track number.
What bothers me the most is the second case, invalid input: the structure is good, but where a number should be there is some random text. I validate such things manually in the code, but it is not nice at all. I also set up XML Schema validation, but it doesn't help much. Schema validation is all or nothing: either the whole document is valid, or it is rejected.
This approach is not always what you want. I would like to have a controlled way of reporting both warnings and errors, and of failing only on errors. Have you ever seen anything like that? An XML validator that is less than strict and can produce warnings. Something that produces a warning on the following fragment:
<track id="11" length="" />
which should be this:
<track id="11" length="25" />
The first fragment would produce an error or a warning, so we know that some part does not have valid data, but processing of the XML file would continue. That would help a lot with XML processing.
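To make the idea concrete, here is a minimal sketch in Python of the kind of lenient check described above. It is only an illustration under assumptions, not an existing validator: the `turnout` element, its `track` attribute, and the names `Issue`, `check_plan` and `has_errors` are all hypothetical, and so is the choice to treat a missing `id` as an error while treating a bad `length` or a dangling reference as a warning. Only the `track` element with `id` and `length` comes from the fragments above.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Issue:
    severity: str  # "warning" or "error"
    message: str


def is_positive_int(text):
    """Return True if text parses as an integer greater than zero."""
    try:
        return int(text) > 0
    except ValueError:
        return False


def check_plan(xml_text):
    """Collect issues instead of stopping at the first invalid value."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Broken structure is fatal: nothing else can be checked.
        return [Issue("error", f"malformed XML: {exc}")]

    issues = []
    track_ids = set()
    for track in root.iter("track"):
        track_id = track.get("id")
        if track_id is None:
            # Assumption: a track without an id cannot be used at all.
            issues.append(Issue("error", "track without id"))
            continue
        track_ids.add(track_id)
        if not is_positive_int(track.get("length", "")):
            issues.append(Issue("warning", f"track {track_id}: invalid length"))

    # Hypothetical turnout element with a "track" reference attribute.
    for turnout in root.iter("turnout"):
        ref = turnout.get("track")
        if ref not in track_ids:
            issues.append(Issue("warning", f"turnout refers to unknown track {ref!r}"))
    return issues


def has_errors(issues):
    """Processing should stop only when at least one error was reported."""
    return any(issue.severity == "error" for issue in issues)
```

A caller would run `check_plan` on the file contents, log every issue, and abort only when `has_errors` returns true. The schema can still be used to check the structure; a pass like this one covers only the data that is allowed to be dirty.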