Facing exception: Invalid byte 2 of 4-byte UTF-8 sequence.
Hi All,
When I parse an XML document that contains some UTF-8 characters with
the JDOM parser, I get the exception below:
Malformed XML, Caused by: 'Invalid byte 2 of 4-byte UTF-8 sequence.'
at com.clarify.boss.utility.xml.SimpleXmlParser.build(SimpleXmlParser.java:236)
at com.clarify.boss.msf.handler.RespHeaderInitiateHandler.getStandardHeader(RespHeaderInitiateHandler.java:366)
at com.clarify.boss.msf.handler.RespHeaderInitiateHandler.execute(RespHeaderInitiateHandler.java:289)
at com.clarify.boss.utility.appcontroller.support.AbstractHandler.execute(AbstractHandler.java:42)
at com.clarify.boss.utility.appcontroller.support.ApplicationControllerImpl.handleRequest(ApplicationControllerImpl.java:174)
at com.clarify.boss.utility.appcontroller.support.ApplicationControllerImpl.execute(ApplicationControllerImpl.java:311)
at com.clarify.boss.msf.support.ServiceFaultPublisherAB.executeImpl(ServiceFaultPublisherAB.java:87)
at com.clarify.boss.common.base.BossActionBeanBase.execute(BossActionBeanBase.java:125)
at com.clarify.boss.sa.msf.xbean.InvokeResponseXB.executeImpl(InvokeResponseXB.java:198)
at com.clarify.cbo.XBeanImpl.baselineExecuteImpl_(XBeanImpl.java:275)
at com.amdocs.oss.sm.core.common.XBeanBase.baselineExecuteImpl_(XBeanBase.java:75)
at com.clarify.cbo.XBeanImpl.execute(XBeanImpl.java:197)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:615)
at com.clarify.sam.JavaDispatch.invokeMethodImp(JavaDispatch.java:396)
at com.clarify.sam.JavaDispatch.invokeMethod(JavaDispatch.java:348)
at com.clarify.sam.ActionBeanService.invokeBeanMethod(ActionBeanService.java:509)
at com.clarify.sam.ActionBeanService.invokeAifOperation(ActionBeanService.java:128)
at com.clarify.sam.AppFrameworkBindingHandler.executeOperation(AppFrameworkBindingHandler.java:69)
at com.amdocs.aif.consumer.ServiceContext.executeWithRetries(ServiceContext.java:900)
at com.amdocs.aif.consumer.ServiceContext.executeOperationImpl(ServiceContext.java:756)
at com.amdocs.aif.consumer.ServiceContext.executeOperation(ServiceContext.java:676)
at com.amdocs.aif.consumer.ServiceContext.executeOperation(ServiceContext.java:323)
at com.clarify.boss.errorhandler.resolver.ResolverLauncherSynchXB.executeImpl(ResolverLauncherSynchXB.java:157)
... 35 more
Caused by: org.jdom.input.JDOMParseException: Error on line 72: Invalid byte 2 of 4-byte UTF-8 sequence.
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:468)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:770)
at com.clarify.boss.utility.xml.SimpleXmlParser.build(SimpleXmlParser.java:231)
... 60 more
Caused by: org.xml.sax.SAXParseException: Invalid byte 2 of 4-byte UTF-8 sequence.
at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:453)
... 62 more
I have declared the encoding in the XML itself as UTF-8:
<?xml version="1.0" encoding="UTF-8"?>
Initially I suspected that the XML file itself was damaged, because
the same XML parsed fine when I submitted it as input to the same
application server, but failed when it was sent from a friend's
machine. Could that be the cause?
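To rule out a damaged file, one check that could be run on both
machines is a strict UTF-8 decode of the raw bytes. The following is
only a sketch: the class name Utf8Check is made up for illustration,
and it assumes Java 7 or later for StandardCharsets and Files, which
may be newer than the JVM on the application server.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of the application: reports whether a
// file's bytes are well-formed UTF-8.
public class Utf8Check {

    /** Returns true only if the whole byte array is well-formed UTF-8. */
    static boolean isValidUtf8(byte[] data) {
        // REPORT makes the decoder throw instead of silently substituting
        // a replacement character for malformed input.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java Utf8Check <file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(args[0]
                + (isValidUtf8(data) ? ": valid UTF-8" : ": NOT valid UTF-8"));
    }
}
```

If the copy on my friend's machine fails this check while the copy on
the server passes, the file was probably re-encoded somewhere in
transit rather than broken by the parser.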
But now I have found something even more interesting. I changed the
declared encoding to ISO-8859-1, i.e. <?xml version="1.0"
encoding="ISO-8859-1"?>, and to my surprise it worked.
This has confused me. I thought ISO-8859-1 was a charset that is a
subset of UTF-8. Why, then, does UTF-8 fail where ISO-8859-1 works?
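For reference, here is a small experiment with made-up bytes (not
taken from the real file) showing how the two decoders treat the same
input. ISO-8859-1 maps every byte value to a character, so decoding
with it never fails, while UTF-8 requires bytes of 0x80 and above to
form well-formed multi-byte sequences. The class name CharsetCompare
is hypothetical, and Java 7 or later is assumed for StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: decodes the same bytes with both charsets.
public class CharsetCompare {

    /** Decodes bytes as ISO-8859-1; every byte maps to one character. */
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    /** Decodes bytes as UTF-8; malformed input becomes U+FFFD. */
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "caf" followed by 0xE9, which is e-acute as a single ISO-8859-1 byte.
        byte[] latin1Bytes = { 'c', 'a', 'f', (byte) 0xE9 };

        // ISO-8859-1 accepts the lone 0xE9 byte as U+00E9.
        String asLatin1 = decodeLatin1(latin1Bytes);
        System.out.println("ISO-8859-1 last char: U+"
                + Integer.toHexString(asLatin1.charAt(3)).toUpperCase());

        // In UTF-8, 0xE9 starts a 3-byte sequence; with no continuation
        // bytes it is malformed, and new String(...) substitutes U+FFFD.
        String asUtf8 = decodeUtf8(latin1Bytes);
        System.out.println("UTF-8 contains U+FFFD: "
                + (asUtf8.indexOf('\uFFFD') >= 0));

        // The same character encoded as UTF-8 takes two bytes (0xC3 0xA9).
        System.out.println("UTF-8 byte count for U+00E9: "
                + "\u00e9".getBytes(StandardCharsets.UTF_8).length);
    }
}
```

This only illustrates the decoders; it does not show which bytes on
line 72 of my document actually trigger the error.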
Lastly, I can't change the encoding in my XML, because I would then
have to repeat all the regression testing on my application. So please
let me know where I have gone wrong.
The Java code that I'm using is:
/*
 * (non-Javadoc)
 *
 * @see com.clarify.boss.utility.xml.XmlParser#build(org.springframework.core.io.Resource)
 */
public Document build(Resource source) {
    try {
        return (getSystemId() == null
                ? getSaxBuilder().build(source.getInputStream())
                : getSaxBuilder().build(source.getInputStream(), getSystemId()));
    } catch (Exception e) {
        e.printStackTrace();
        BossErrorCode bossErrorCode = new BossErrorCode(ErrorCode.BOSS_MALFORMED_XML);
        throw new BossException(bossErrorCode,
                new String[] {e.getCause().getMessage()}, e);
    }
}
The getter that creates the SAX builder is:
/**
 * Getter method for the <b>saxBuilder</b> property
 *
 * @return Returns the saxBuilder.
 */
private PropertyAwareSAXBuilder getSaxBuilder() {
    if (saxBuilder == null) {
        PropertyAwareSAXBuilder myParser = new PropertyAwareSAXBuilder(isValidate());
        myParser.setFeature("http://apache.org/xml/features/validation/schema", isValidate());
        myParser.setFeature("http://xml.org/sax/features/namespaces", true);
        //CatalogResolver myResolver = new CatalogResolver();
        CatalogResolver myResolver = getCatalogResolver();
        myParser.setEntityResolver(myResolver);
        setSaxBuilder(myParser);
        Iterator it = getProperties().keySet().iterator();
        while (it.hasNext()) {
            String name = (String) it.next();
            saxBuilder.setProperty(name, getProperties().get(name));
        }
    }
    return saxBuilder;
}
Regards,
Dhirendra