public class TextExtractingVisitor extends NodeVisitor
Parser parser = new Parser(...);
TextExtractingVisitor visitor = new TextExtractingVisitor();
parser.visitAllNodesWith(visitor);
String textInPage = visitor.getExtractedText();
| Constructor | Description |
|---|---|
| TextExtractingVisitor() | |
| Modifier and Type | Method | Description |
|---|---|---|
| java.lang.String | getExtractedText() | |
| void | visitEndTag(Tag tag) | Called for each Tag visited that is an end tag. |
| void | visitStringNode(Text stringNode) | Called for each StringNode visited. |
| void | visitTag(Tag tag) | Called for each Tag visited. |
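
The callbacks listed above follow the visitor pattern: the parser walks the nodes and calls one method per node kind, and the visitor accumulates state along the way. The following is a minimal, self-contained sketch of that idea, not the library's implementation. All types in it (`Event`, `SketchVisitor`, `SketchTextExtractor`, `visitAll`) are hypothetical stand-ins for the real `org.htmlparser` classes, and the `pre` handling is an assumption chosen only to show why the tag callbacks can matter to a text extractor.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for illustration only; these are NOT the org.htmlparser types.
class TextExtractionSketch {

    /** Kinds of parse event in this hypothetical model. */
    enum Kind { TAG, END_TAG, TEXT }

    /** One parse event: a tag name for TAG/END_TAG, or raw text for TEXT. */
    static final class Event {
        final Kind kind;
        final String value;
        Event(Kind kind, String value) { this.kind = kind; this.value = value; }
    }

    /** Base visitor with no-op callbacks, mirroring the callback names in the table. */
    static class SketchVisitor {
        void visitTag(String tagName) { }
        void visitEndTag(String tagName) { }
        void visitStringNode(String text) { }
    }

    /** Accumulates text; the <pre> handling is an assumption made for this sketch. */
    static final class SketchTextExtractor extends SketchVisitor {
        private final StringBuilder extracted = new StringBuilder();
        private boolean insidePre = false;

        @Override void visitTag(String tagName) {
            if ("pre".equalsIgnoreCase(tagName)) insidePre = true;
        }

        @Override void visitEndTag(String tagName) {
            if ("pre".equalsIgnoreCase(tagName)) insidePre = false;
        }

        @Override void visitStringNode(String text) {
            // Collapse whitespace runs outside <pre>; keep preformatted text verbatim.
            extracted.append(insidePre ? text : text.replaceAll("\\s+", " "));
        }

        String getExtractedText() { return extracted.toString(); }
    }

    /** Plays the role of parser.visitAllNodesWith(visitor): dispatches events in order. */
    static void visitAll(List<Event> events, SketchVisitor visitor) {
        for (Event e : events) {
            switch (e.kind) {
                case TAG: visitor.visitTag(e.value); break;
                case END_TAG: visitor.visitEndTag(e.value); break;
                case TEXT: visitor.visitStringNode(e.value); break;
            }
        }
    }

    /** Runs the extractor over a tiny hand-built event stream. */
    static String demo() {
        List<Event> events = Arrays.asList(
            new Event(Kind.TAG, "p"),
            new Event(Kind.TEXT, "Hello   world"),
            new Event(Kind.END_TAG, "p"),
            new Event(Kind.TAG, "pre"),
            new Event(Kind.TEXT, "  x"),
            new Event(Kind.END_TAG, "pre"));
        SketchTextExtractor visitor = new SketchTextExtractor();
        visitAll(events, visitor);
        return visitor.getExtractedText();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Keeping the accumulated text inside the visitor and exposing it through a single getter matches the usage shown at the top of this page: run the traversal first, then read the result once.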
Methods inherited from class NodeVisitor: beginParsing, finishedParsing, shouldRecurseChildren, shouldRecurseSelf, visitRemarkNode

public java.lang.String getExtractedText()

public void visitStringNode(Text stringNode)

Description copied from class: NodeVisitor. Called for each StringNode visited.

Overrides: visitStringNode in class NodeVisitor

Parameters: stringNode - The string node being visited.

public void visitTag(Tag tag)

Description copied from class: NodeVisitor. Called for each Tag visited.

Overrides: visitTag in class NodeVisitor

Parameters: tag - The tag being visited.

public void visitEndTag(Tag tag)

Description copied from class: NodeVisitor. Called for each Tag visited that is an end tag.

Overrides: visitEndTag in class NodeVisitor

Parameters: tag - The end tag being visited.

HTML Parser is an open source library released under LGPL.