com.lightdev.lib.html
Class HTMLFilter

java.lang.Object
  extended by com.lightdev.lib.html.HTMLFilter
All Implemented Interfaces:
XmlParserCallback

public class HTMLFilter
extends Object
implements XmlParserCallback

HTMLFilter parses an HTML input stream using the Light Development XML parser and strips all HTML tags, returning only the content text.

Version:
1, June 26, 2005
Author:
Ulrich Hilger, Light Development, http://www.lightdev.com, info@lightdev.com. Published under the terms and conditions of the BSD License; for details, see the file license.txt in the distribution package of this software.

Constructor Summary
HTMLFilter()
          construct a new instance of class HTMLFilter
 
Method Summary
 String filter(InputStream in)
          filter a given HTML input stream, i.e. strip all tags and return content text only
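 String filter(String expr)
          filter an HTML expression given as a String; presumably strips all tags and returns content text only, as filter(InputStream in) does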
 void handleEndTag(String tagName)
          handle an end tag, unused here
 void handleStartTag(String tagName, AttributeSet a)
          handle a start tag, unused here
 void handleText(String text)
          handle text, i.e. append it to the filtered content buffer
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HTMLFilter

public HTMLFilter()
construct a new instance of class HTMLFilter

Method Detail

filter

public String filter(InputStream in)
              throws IOException,
                     IllegalCharacterException
filter a given HTML input stream, i.e. strip all tags and return content text only

Parameters:
in - the HTML input stream
Returns:
a String containing content text with HTML tags removed
Throws:
IOException
IllegalCharacterException
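
The contract described above (read HTML from a stream, return only the text between tags) can be illustrated with a small self-contained sketch. TagStripSketch is a hypothetical stand-in written for this example; it is not the Light Development parser, it assumes UTF-8 input, and it ignores entities, comments and script content.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in, NOT the Light Development implementation.
public class TagStripSketch {

    // Mirrors the documented shape of HTMLFilter.filter(InputStream):
    // read markup from a stream and return only the text between tags.
    public static String filter(InputStream in) throws IOException {
        StringBuilder content = new StringBuilder();
        boolean insideTag = false;
        // UTF-8 is an assumption; this page does not name an encoding.
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {
                if (c == '<') {
                    insideTag = true;            // a tag starts, stop collecting
                } else if (c == '>' && insideTag) {
                    insideTag = false;           // the tag ends, resume collecting
                } else if (!insideTag) {
                    content.append((char) c);    // content text between tags
                }
            }
        }
        return content.toString();
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body><p>Hello <b>World</b></p></body></html>";
        InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
        System.out.println(filter(in)); // prints "Hello World"
    }
}
```

With the library on the classpath, the equivalent call would presumably be new HTMLFilter().filter(in), which may additionally throw IllegalCharacterException as declared above.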

filter

public String filter(String expr)
              throws IOException,
                     IllegalCharacterException
filter an HTML expression given as a String; presumably strips all tags and returns content text only, as filter(InputStream in) does

Parameters:
expr - the HTML expression to filter
Throws:
IOException
IllegalCharacterException

handleStartTag

public void handleStartTag(String tagName,
                           AttributeSet a)
handle a start tag, unused here

Specified by:
handleStartTag in interface XmlParserCallback
Parameters:
tagName - the name of the tag that has been encountered
a - the attributes of the tag that has been encountered
See Also:
XmlParserCallback.handleStartTag(java.lang.String, javax.swing.text.AttributeSet)

handleEndTag

public void handleEndTag(String tagName)
handle an end tag, unused here

Specified by:
handleEndTag in interface XmlParserCallback
Parameters:
tagName - the name of the tag that has been encountered
See Also:
XmlParserCallback.handleEndTag(java.lang.String)

handleText

public void handleText(String text)
handle text, i.e. append it to the filtered content buffer

Specified by:
handleText in interface XmlParserCallback
Parameters:
text - the text portion that has been parsed
See Also:
XmlParserCallback.handleText(java.lang.String)
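
The three callbacks above suggest a simple division of labour: tag events are ignored and text events are appended to a buffer. The following sketch shows that pattern in isolation. ParserCallback, TextCollector and getText are hypothetical names introduced for this example; only the three method signatures are taken from this page, and the events are fed in by hand rather than by the real parser.

```java
import javax.swing.text.AttributeSet;
import javax.swing.text.SimpleAttributeSet;

public class CallbackSketch {

    // Hypothetical stand-in for XmlParserCallback; signatures copied from this page.
    interface ParserCallback {
        void handleStartTag(String tagName, AttributeSet a);
        void handleEndTag(String tagName);
        void handleText(String text);
    }

    // Collects content text and deliberately ignores tag events.
    static class TextCollector implements ParserCallback {
        private final StringBuilder buffer = new StringBuilder();

        @Override
        public void handleStartTag(String tagName, AttributeSet a) {
            // unused: start tags contribute nothing to the filtered text
        }

        @Override
        public void handleEndTag(String tagName) {
            // unused: end tags contribute nothing to the filtered text
        }

        @Override
        public void handleText(String text) {
            buffer.append(text); // append text to the filtered content buffer
        }

        String getText() {
            return buffer.toString();
        }
    }

    public static void main(String[] args) {
        TextCollector collector = new TextCollector();
        // Events a parser might emit for <p>Hello <b>World</b></p>
        collector.handleStartTag("p", SimpleAttributeSet.EMPTY);
        collector.handleText("Hello ");
        collector.handleStartTag("b", SimpleAttributeSet.EMPTY);
        collector.handleText("World");
        collector.handleEndTag("b");
        collector.handleEndTag("p");
        System.out.println(collector.getText()); // prints "Hello World"
    }
}
```

Keeping the tag handlers empty means the collector never needs to know which elements exist; whatever the parser reports as text ends up in the result.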