XML Xdoc - easy data conversion from delimited text to XML

XML Xdoc -encoding switch

XML documents can contain binary (non-ASCII) characters such as à, ô and ©, but for such characters to be interpreted correctly the XML declaration has to specify an appropriate encoding type.

Here are some common encoding types:

<?xml version="1.0" encoding="UTF-8"?>

<?xml version="1.0" encoding="UTF-16"?>

<?xml version="1.0" encoding="windows-1252"?>

<?xml version="1.0" encoding="ISO-8859-1"?>
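To see why the declared encoding matters, the following minimal Python sketch prints the bytes that the character à occupies under each of the encodings listed above. It is an illustration only and is not part of XML Xdoc; the names SAMPLE, ENCODINGS and encoded_bytes are hypothetical, and UTF-16 is shown in its little-endian form without a byte order mark.

```python
# Illustrative only: how one character is stored under each encoding.
SAMPLE = "\u00e0"  # the character à

# Python codec names for the four encodings listed above.
# "utf-16-le" is little-endian UTF-16 without a byte order mark.
ENCODINGS = ["utf-8", "utf-16-le", "windows-1252", "iso-8859-1"]


def encoded_bytes(text):
    """Map each encoding name to the bytes it produces for `text`."""
    return {name: text.encode(name) for name in ENCODINGS}


if __name__ == "__main__":
    for name, raw in encoded_bytes(SAMPLE).items():
        print(name, raw.hex())
```

The same character is one byte (e0) in Windows-1252 and ISO-8859-1, two bytes (c3 a0) in UTF-8, and two different bytes (e0 00) in little-endian UTF-16, so a reader that assumes the wrong encoding will misread it.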

There are two common error messages which can occur if you try to open an XML document in a browser such as Internet Explorer 5 or higher.

Error: An invalid character was found in text content.
This error message will occur if a character in the XML document does not match the encoding attribute. Normally you will get this error message if the data file that was converted contains binary characters but no encoding attribute was specified, in which case the parser assumes UTF-8. Such data was probably created with a single-byte encoding editor like Notepad.
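The situation described above can be reproduced outside a browser. The sketch below uses Python's standard xml.etree.ElementTree parser, not Internet Explorer or XML Xdoc, so the wording of the error differs; the helper name parses_cleanly and the sample row are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def parses_cleanly(document):
    """Return True if the bytes parse as XML, False on a parse error."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# One row whose text was saved with a single-byte encoding (ISO-8859-1).
BODY = "<row><ProductName>C\u00f4te de Blaye</ProductName></row>".encode("iso-8859-1")

# No encoding attribute: the parser assumes UTF-8 and rejects the byte for ô.
undeclared = b'<?xml version="1.0"?>' + BODY

# Matching encoding attribute: the same bytes are accepted.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + BODY

if __name__ == "__main__":
    print("undeclared parses:", parses_cleanly(undeclared))
    print("declared parses:  ", parses_cleanly(declared))
```

Only the declaration differs between the two documents; the data bytes are identical.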

Error: Switch from current encoding to specified encoding not supported.
This error message will occur if your data file was originally created using Unicode UTF-16 but the encoding attribute specified an encoding built on 8-bit units like Windows-1252, ISO-8859-1 or UTF-8. You can also get this error message if your document was saved with an 8-bit encoding, but the encoding attribute specified a 16-bit encoding like UTF-16.
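A mismatch of this kind can often be spotted before conversion by looking for a byte order mark at the start of the data file. The following is a minimal sketch under that assumption; sniff_bom and declaration_conflicts are hypothetical helper names, and a file saved without a byte order mark will not be detected.

```python
import codecs
from typing import Optional


def sniff_bom(data: bytes) -> Optional[str]:
    """Guess the encoding family from a leading byte order mark, if any.

    Limitation: a UTF-32 little-endian mark also begins with FF FE and
    would be reported as UTF-16 by this simple check.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "UTF-16"
    return None


def declaration_conflicts(data: bytes, declared: str) -> bool:
    """True when a byte order mark contradicts the declared encoding."""
    detected = sniff_bom(data)
    return detected is not None and detected.lower() != declared.lower()


if __name__ == "__main__":
    # A document saved as UTF-16 (Python adds the byte order mark)
    # whose declaration wrongly names ISO-8859-1.
    saved = '<?xml version="1.0" encoding="ISO-8859-1"?><a/>'.encode("utf-16")
    print(sniff_bom(saved), declaration_conflicts(saved, "ISO-8859-1"))
```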

The solution to both these errors is to specify an appropriate encoding attribute.

XML Xdoc has an -encoding switch which can be used to specify an appropriate encoding type. However, when processing binary data, XML Xdoc will automatically add ISO-8859-1 as an encoding attribute if no encoding information has been specified.
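The defaulting rule described above can be sketched in a few lines of Python. This is not XML Xdoc's actual code: the function xml_declaration and the constant DEFAULT_BINARY_ENCODING are hypothetical, and what Xdoc emits when neither -binary nor -encoding is given is not stated here, so the sketch simply omits the encoding attribute in that case.

```python
from typing import Optional

# Assumed default, taken from the rule described in the text above.
DEFAULT_BINARY_ENCODING = "ISO-8859-1"


def xml_declaration(binary: bool, encoding: Optional[str] = None) -> str:
    """Build an XML declaration, applying the binary-data default."""
    if encoding is None and binary:
        # Binary data with no explicit encoding falls back to the default.
        encoding = DEFAULT_BINARY_ENCODING
    if encoding is None:
        # Assumption: no encoding attribute when nothing was requested.
        return '<?xml version="1.0" standalone="yes"?>'
    return f'<?xml version="1.0" encoding="{encoding}" standalone="yes"?>'


if __name__ == "__main__":
    print(xml_declaration(binary=True))
    print(xml_declaration(binary=True, encoding="windows-1252"))
```

An explicit encoding always wins over the default, which mirrors how the -encoding switch is described.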

Command line
Xdoc -url=http://www.trah.com/xml/data/databin1.csv -binary=1 -encoding=ISO-8859-1 -tagfile=firstline

produces this file

<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<!-- Generated by Trah XML Xdoc 2004-4-17 10:53:34 -->
<document>
  <row>
    <CategoryName>Beverages</CategoryName>
    <Description>Soft drinks, coffees, teas, beers, and ales</Description>
    <ProductName>Chai</ProductName>
  </row>
  <row>
    <CategoryName>Confections</CategoryName>
    <Description>Desserts, candies, and sweet breads</Description>
    <ProductName>Gumbàr Gummibàrchen</ProductName>
  </row>
  <row>
    <CategoryName>Beverages</CategoryName>
    <Description>Soft drinks, coffees, teas, beers, and ales</Description>
    <ProductName>Côte de Blaye</ProductName>
  </row>
</document>

Note that it is essential that the flag -binary=1 is also used.

See the online manual entry for the -binary switch.

www.xdoc.co.uk
Copyright © 2001-2006 Trah®