I am using org.w3c.dom to extract values from some HTML.

The HTML contains a span tag like this: <SPAN class='item c123'>-</SPAN>

My code is like this:

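        StringReader reader = new StringReader(html());
        InputSource inputSource = new InputSource(reader);
        SAX2DOM sax2dom = new SAX2DOM();

        Parser tagSoupParser = new Parser();
        tagSoupParser.setFeature(Parser.namespacesFeature, false);
        // Route the SAX events into the DOM builder and actually run the parse;
        // without these two calls the resulting document stays empty.
        tagSoupParser.setContentHandler(sax2dom);
        tagSoupParser.parse(inputSource);

        Document document = (Document) sax2dom.getDOM();
        NodeList trElements = document.getElementsByTagName("span");

        // item(0) returns null when no span was found
        Node node = trElements.item(0);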

I would like to do something like this:

But that throws a "cannot be cast to org.w3c.dom.Element" exception.

How can I get the value of the class attribute in that span tag?


         String xml = "<SPAN class='item c123'>bla bla</SPAN>";
         DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
         DocumentBuilder db = dbf.newDocumentBuilder();
         Document doc = db.parse(new InputSource(new StringReader(xml )));
         Element elm = (Element)doc.getElementsByTagName("SPAN").item(0);
         System.out.println("content = " + elm.getTextContent());
         System.out.println("class = " + elm.getAttribute("class"));

This works here.
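
If the cast to Element keeps failing in your setup, one possible workaround is to read the attribute through Node.getAttributes(), which needs no cast at all. The following is only a minimal sketch under some assumptions: the class name SpanClassReader and the method classOfFirstSpan are made up for illustration, and it uses the standard DocumentBuilder from the snippet above rather than TagSoup, so the input has to be well-formed and the tag name lookup is case-sensitive.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of any library.
class SpanClassReader {

    // Returns the class attribute of the first <SPAN> element,
    // or null if there is no such element or no such attribute.
    static String classOfFirstSpan(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(new InputSource(new StringReader(xml)));

            // item(0) returns null when the list is empty
            Node node = doc.getElementsByTagName("SPAN").item(0);
            if (node == null || node.getAttributes() == null) {
                return null;
            }

            // Look the attribute up on the Node itself, so no cast to Element is needed
            Node attr = node.getAttributes().getNamedItem("class");
            return attr == null ? null : attr.getNodeValue();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse input", e);
        }
    }

    public static void main(String[] args) {
        // prints "item c123"
        System.out.println(classOfFirstSpan("<SPAN class='item c123'>-</SPAN>"));
    }
}
```

With the document built in the question, the same getAttributes() lookup could be applied to the node taken from trElements.item(0).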


