Package nu.validator.htmlparser.impl
Class TreeBuilder<T>
- java.lang.Object
-
- nu.validator.htmlparser.impl.TreeBuilder<T>
-
- All Implemented Interfaces:
TokenHandler, TreeBuilderState<T>
- Direct Known Subclasses:
CoalescingTreeBuilder
public abstract class TreeBuilder<T> extends java.lang.Object implements TokenHandler, TreeBuilderState<T>
-
-
Field Summary
Fields
protected char[] charBuffer
protected int charBufferLen
protected org.xml.sax.ErrorHandler errorHandler
protected Tokenizer tokenizer
-
Constructor Summary
Constructors
protected TreeBuilder()
-
Method Summary
protected void accumulateCharacters(char[] buf, int start, int length)
protected abstract void addAttributesToElement(T element, HtmlAttributes attributes)
protected abstract void appendCharacters(T parent, char[] buf, int start, int length)
protected abstract void appendChildrenToNewParent(T oldParent, T newParent)
protected abstract void appendComment(T parent, char[] buf, int start, int length)
protected abstract void appendCommentToDocument(char[] buf, int start, int length)
protected void appendDoctypeToDocument(java.lang.String name, java.lang.String publicIdentifier, java.lang.String systemIdentifier)
protected abstract void appendElement(T child, T newParent)
protected abstract void appendIsindexPrompt(T parent)
boolean cdataSectionAllowed()
    Checks if the CDATA sections are allowed.
void characters(char[] buf, int start, int length)
    Receive character tokens.
void comment(char[] buf, int start, int length)
    Receive a comment token.
protected abstract T createElement(java.lang.String ns, java.lang.String name, HtmlAttributes attributes)
protected T createElement(java.lang.String ns, java.lang.String name, HtmlAttributes attributes, T form)
protected abstract T createHtmlElementSetAsRoot(HtmlAttributes attributes)
protected T currentNode()
protected abstract void detachFromParent(T element)
void doctype(java.lang.String name, java.lang.String publicIdentifier, java.lang.String systemIdentifier, boolean forceQuirks)
    Receive a doctype token.
protected void documentMode(DocumentMode m, java.lang.String publicIdentifier, java.lang.String systemIdentifier, boolean html4SpecificAdditionalErrorChecks)
protected void elementPopped(java.lang.String ns, java.lang.String name, T node)
protected void elementPushed(java.lang.String ns, java.lang.String name, T node)
protected void end()
void endTag(ElementName elementName)
    Receive an end tag token.
void endTokenization()
    Perform final cleanup.
void eof()
    The end-of-file token.
static java.lang.String extractCharsetFromContent(java.lang.String attributeValue)
    C++ memory note: The return value must be released.
protected void fatal()
    Reports a condition that would make the infoset incompatible with XML 1.0 as fatal.
protected void fatal(java.lang.Exception e)
void flushCharacters()
    Flushes the pending characters.
T getDeepTreeSurrogateParent()
    Returns the deepTreeSurrogateParent.
org.xml.sax.ErrorHandler getErrorHandler()
    Returns the errorHandler.
T getFormPointer()
    Returns the formPointer.
T getHeadPointer()
    Returns the headPointer.
nu.validator.htmlparser.impl.StackNode<T>[] getListOfActiveFormattingElements()
    Returns the listOfActiveFormattingElements.
int getListOfActiveFormattingElementsLength()
    Return the length of the list of active formatting elements.
int getMode()
    Returns the mode.
int getOriginalMode()
    Returns the originalMode.
nu.validator.htmlparser.impl.StackNode<T>[] getStack()
    Returns the stack.
int getStackLength()
    Return the length of the stack.
protected abstract boolean hasChildren(T element)
protected abstract void insertFosterParentedCharacters(char[] buf, int start, int length, T table, T stackParent)
protected abstract void insertFosterParentedChild(T child, T table, T stackParent)
boolean isFramesetOk()
    Returns the framesetOk.
boolean isNeedToDropLF()
    Returns the needToDropLF.
boolean isQuirks()
    Returns the quirks.
boolean isScriptingEnabled()
    Returns the scriptingEnabled.
void loadState(TreeBuilderState<T> snapshot, Interner interner)
protected void markMalformedIfScript(T elt)
TreeBuilderState<T> newSnapshot()
    Creates a comparable snapshot of the tree builder state.
protected void requestSuspension()
void setDoctypeExpectation(DoctypeExpectation doctypeExpectation)
    Sets the doctypeExpectation.
void setDocumentModeHandler(DocumentModeHandler documentModeHandler)
    Sets the documentModeHandler.
void setErrorHandler(org.xml.sax.ErrorHandler errorHandler)
    Sets the errorHandler.
void setFragmentContext(java.lang.String context)
    The argument MUST be an interned string or null.
void setFragmentContext(java.lang.String context, java.lang.String ns, T node, boolean quirks)
    The argument MUST be an interned string or null.
void setIgnoringComments(boolean ignoreComments)
void setNamePolicy(XmlViolationPolicy namePolicy)
void setReportingDoctype(boolean reportingDoctype)
    Sets the reportingDoctype.
void setScriptingEnabled(boolean scriptingEnabled)
    Sets the scriptingEnabled.
boolean snapshotMatches(TreeBuilderState<T> snapshot)
protected void start(boolean fragmentMode)
void startTag(ElementName elementName, HtmlAttributes attributes, boolean selfClosing)
    Receive a start tag token.
void startTokenization(Tokenizer self)
    This method is called at the start of tokenization before any other methods on this interface are called.
boolean wantsComments()
    If this handler implementation cares about comments, return true.
void zeroOriginatingReplacementCharacter()
    Reports a U+0000 that's being turned into a U+FFFD.
-
-
-
Field Detail
-
tokenizer
protected Tokenizer tokenizer
-
errorHandler
protected org.xml.sax.ErrorHandler errorHandler
-
charBuffer
protected char[] charBuffer
-
charBufferLen
protected int charBufferLen
-
-
Method Detail
-
fatal
protected void fatal() throws org.xml.sax.SAXException
Reports a condition that would make the infoset incompatible with XML 1.0 as fatal.
- Throws:
org.xml.sax.SAXException
org.xml.sax.SAXParseException
-
fatal
protected final void fatal(java.lang.Exception e) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
startTokenization
public final void startTokenization(Tokenizer self) throws org.xml.sax.SAXException
Description copied from interface: TokenHandler
This method is called at the start of tokenization before any other methods on this interface are called. Implementations should hold the reference to the Tokenizer in order to set the content model flag and in order to be able to query for Locator data.
- Specified by:
startTokenization in interface TokenHandler
- Parameters:
self - the Tokenizer.
- Throws:
org.xml.sax.SAXException - if something went wrong
-
doctype
public final void doctype(java.lang.String name, java.lang.String publicIdentifier, java.lang.String systemIdentifier, boolean forceQuirks) throws org.xml.sax.SAXException
Description copied from interface: TokenHandler
Receive a doctype token.
- Specified by:
doctype in interface TokenHandler
- Parameters:
name - the name
publicIdentifier - the public id
systemIdentifier - the system id
forceQuirks - whether the token is correct
- Throws:
org.xml.sax.SAXException - if something went wrong
-
comment
public final void comment(char[] buf, int start, int length) throws org.xml.sax.SAXException
Description copied from interface: TokenHandler
Receive a comment token. The data is junk if wantsComments() returned false.
- Specified by:
comment in interface TokenHandler
- Parameters:
buf - a buffer holding the data
start - the offset into the buffer
length - the number of code units to read
- Throws:
org.xml.sax.SAXException - if something went wrong
-
characters
public final void characters(char[] buf, int start, int length) throws org.xml.sax.SAXException
Description copied from interface: TokenHandler
Receive character tokens. This method has the same semantics as the SAX method of the same name.
- Specified by:
characters in interface TokenHandler
- Parameters:
buf - a buffer holding the data
start - offset into the buffer
length - the number of code units to read
- Throws:
org.xml.sax.SAXException - if something went wrong
- See Also:
TokenHandler.characters(char[], int, int)
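Because the (buf, start, length) triple follows the SAX convention, only the slice buf[start] through buf[start + length - 1] belongs to the token; the rest of the buffer should not be read. The following is a minimal, self-contained sketch of that convention using only the JDK. The class name CharSliceDemo and the helper sliceToString are hypothetical and are not part of this package.

```java
public class CharSliceDemo {
    /** Copies the reported slice out of a (possibly larger, reused) buffer. */
    public static String sliceToString(char[] buf, int start, int length) {
        // new String(char[], int, int) copies exactly `length` code units from `start`.
        return new String(buf, start, length);
    }

    public static void main(String[] args) {
        char[] buf = "<p>Hello</p>".toCharArray();
        // Only the five code units starting at offset 3 belong to the token.
        System.out.println(sliceToString(buf, 3, 5)); // prints "Hello"
    }
}
```

Copying the slice is the safe default if the data must outlive the call, since a caller following the SAX convention is free to reuse the buffer afterwards.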
-
zeroOriginatingReplacementCharacter
public void zeroOriginatingReplacementCharacter() throws org.xml.sax.SAXException
Description copied from interface: TokenHandler
Reports a U+0000 that's being turned into a U+FFFD.
- Specified by:
zeroOriginatingReplacementCharacter in interface TokenHandler
- Throws:
org.xml.sax.SAXException - if something went wrong
- See Also:
TokenHandler.zeroOriginatingReplacementCharacter()
-
eof
public final void eof() throws org.xml.sax.SAXException
Description copied from interface: TokenHandler
The end-of-file token.
- Specified by:
eof in interface TokenHandler
- Throws:
org.xml.sax.SAXException - if something went wrong
-
endTokenization
public final void endTokenization() throws org.xml.sax.SAXException
Description copied from interface: TokenHandler
Perform final cleanup.
- Specified by:
endTokenization in interface TokenHandler
- Throws:
org.xml.sax.SAXException - if something went wrong
- See Also:
TokenHandler.endTokenization()
-
startTag
public void startTag(ElementName elementName, HtmlAttributes attributes, boolean selfClosing) throws org.xml.sax.SAXException
Description copied from interface: TokenHandler
Receive a start tag token.
- Specified by:
startTag in interface TokenHandler
- Parameters:
elementName - the tag name
attributes - the attributes
selfClosing - TODO
- Throws:
org.xml.sax.SAXException - if something went wrong
-
extractCharsetFromContent
public static java.lang.String extractCharsetFromContent(java.lang.String attributeValue)
C++ memory note: The return value must be released.
- Returns:
- Throws:
org.xml.sax.SAXException
StopSniffingException
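This page does not describe how extractCharsetFromContent parses its argument. As a rough illustration of the general idea only (pulling a charset label out of a meta content value such as "text/html; charset=utf-8"), here is a simplified, self-contained sketch. The class CharsetSketch and the method extractCharset are hypothetical, the parsing rules are an assumption, and edge-case behavior may differ from the real method.

```java
public class CharsetSketch {
    private static final String KEY = "charset";

    /** Returns the charset label found in a content value, or null if none is found. */
    public static String extractCharset(String content) {
        for (int idx = 0; idx + KEY.length() <= content.length(); idx++) {
            // Case-insensitive match of "charset" at position idx.
            if (!content.regionMatches(true, idx, KEY, 0, KEY.length())) {
                continue;
            }
            int i = skipWhitespace(content, idx + KEY.length());
            if (i >= content.length() || content.charAt(i) != '=') {
                continue; // "charset" not followed by '=': keep searching
            }
            i = skipWhitespace(content, i + 1);
            if (i >= content.length()) {
                return null;
            }
            char c = content.charAt(i);
            if (c == '"' || c == '\'') {
                // Quoted value: runs to the matching quote; unterminated means no result.
                int end = content.indexOf(c, i + 1);
                return end < 0 ? null : content.substring(i + 1, end);
            }
            // Unquoted value: runs to whitespace, ';', or end of input.
            int end = i;
            while (end < content.length()
                    && !Character.isWhitespace(content.charAt(end))
                    && content.charAt(end) != ';') {
                end++;
            }
            return content.substring(i, end);
        }
        return null;
    }

    private static int skipWhitespace(String s, int i) {
        while (i < s.length() && Character.isWhitespace(s.charAt(i))) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println(extractCharset("text/html; charset=utf-8")); // prints "utf-8"
    }
}
```

The "must be released" memory note applies to the C++ translation mentioned in the description; in Java the returned String is ordinary garbage-collected data.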
-
endTag
public void endTag(ElementName elementName) throws org.xml.sax.SAXException
Description copied from interface: TokenHandler
Receive an end tag token.
- Specified by:
endTag in interface TokenHandler
- Parameters:
elementName - the tag name
- Throws:
org.xml.sax.SAXException - if something went wrong
-
accumulateCharacters
protected void accumulateCharacters(char[] buf, int start, int length) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
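The charBuffer and charBufferLen fields, together with accumulateCharacters and flushCharacters, suggest a pending-text buffer that is filled incrementally and emitted in one piece. The sketch below models that pattern in isolation, using only the JDK. The class PendingCharacters, its growth policy, and the list it flushes into are all assumptions made for illustration; this is not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PendingCharacters {
    // Names mirror the fields listed on this page; sizes and policy are assumed.
    private char[] charBuffer = new char[8];
    private int charBufferLen = 0;
    private final List<String> flushed = new ArrayList<>();

    /** Appends a slice to the pending buffer, growing the buffer when needed. */
    public void accumulate(char[] buf, int start, int length) {
        int needed = charBufferLen + length;
        if (needed > charBuffer.length) {
            charBuffer = Arrays.copyOf(charBuffer, Math.max(needed, charBuffer.length * 2));
        }
        System.arraycopy(buf, start, charBuffer, charBufferLen, length);
        charBufferLen = needed;
    }

    /** Emits the pending characters as one run and resets the buffer length. */
    public void flush() {
        if (charBufferLen > 0) {
            flushed.add(new String(charBuffer, 0, charBufferLen));
            charBufferLen = 0;
        }
    }

    public List<String> flushed() {
        return flushed;
    }

    public static void main(String[] args) {
        PendingCharacters pending = new PendingCharacters();
        pending.accumulate("Hello, ".toCharArray(), 0, 7);
        pending.accumulate("world!".toCharArray(), 0, 6);
        pending.flush();
        System.out.println(pending.flushed()); // prints "[Hello, world!]"
    }
}
```

Batching in this way lets adjacent character tokens end up as a single text run instead of one run per token.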
-
requestSuspension
protected final void requestSuspension()
-
createElement
protected abstract T createElement(java.lang.String ns, java.lang.String name, HtmlAttributes attributes) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
createElement
protected T createElement(java.lang.String ns, java.lang.String name, HtmlAttributes attributes, T form) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
createHtmlElementSetAsRoot
protected abstract T createHtmlElementSetAsRoot(HtmlAttributes attributes) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
detachFromParent
protected abstract void detachFromParent(T element) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
hasChildren
protected abstract boolean hasChildren(T element) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
appendElement
protected abstract void appendElement(T child, T newParent) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
appendChildrenToNewParent
protected abstract void appendChildrenToNewParent(T oldParent, T newParent) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
insertFosterParentedChild
protected abstract void insertFosterParentedChild(T child, T table, T stackParent) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
insertFosterParentedCharacters
protected abstract void insertFosterParentedCharacters(char[] buf, int start, int length, T table, T stackParent) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
appendCharacters
protected abstract void appendCharacters(T parent, char[] buf, int start, int length) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
appendIsindexPrompt
protected abstract void appendIsindexPrompt(T parent) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
appendComment
protected abstract void appendComment(T parent, char[] buf, int start, int length) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
appendCommentToDocument
protected abstract void appendCommentToDocument(char[] buf, int start, int length) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
addAttributesToElement
protected abstract void addAttributesToElement(T element, HtmlAttributes attributes) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
markMalformedIfScript
protected void markMalformedIfScript(T elt) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
start
protected void start(boolean fragmentMode) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
end
protected void end() throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
appendDoctypeToDocument
protected void appendDoctypeToDocument(java.lang.String name, java.lang.String publicIdentifier, java.lang.String systemIdentifier) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
elementPushed
protected void elementPushed(java.lang.String ns, java.lang.String name, T node) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
elementPopped
protected void elementPopped(java.lang.String ns, java.lang.String name, T node) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
documentMode
protected void documentMode(DocumentMode m, java.lang.String publicIdentifier, java.lang.String systemIdentifier, boolean html4SpecificAdditionalErrorChecks) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
wantsComments
public boolean wantsComments()
Description copied from interface: TokenHandler
If this handler implementation cares about comments, return true. If not, return false.
- Specified by:
wantsComments in interface TokenHandler
- Returns:
- whether this handler wants comments
- See Also:
TokenHandler.wantsComments()
-
setIgnoringComments
public void setIgnoringComments(boolean ignoreComments)
-
setErrorHandler
public final void setErrorHandler(org.xml.sax.ErrorHandler errorHandler)
Sets the errorHandler.
- Parameters:
errorHandler - the errorHandler to set
-
getErrorHandler
public org.xml.sax.ErrorHandler getErrorHandler()
Returns the errorHandler.
- Returns:
- the errorHandler
-
setFragmentContext
public final void setFragmentContext(java.lang.String context)
The argument MUST be an interned string or null.
- Parameters:
context-
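The "interned string" requirement refers to java.lang.String.intern(), which returns the canonical instance from the JVM string pool. The sketch below shows, with the JDK only, why a string built at run time is not automatically interned and how a caller could canonicalize it before passing it on. The class InternDemo and the helper internedContext are hypothetical; how the tree builder uses the interned value internally is not stated on this page.

```java
public class InternDemo {
    /** Returns the canonical pooled instance of the name, or null for null. */
    public static String internedContext(String name) {
        return name == null ? null : name.intern();
    }

    public static void main(String[] args) {
        // A string constructed at run time is a distinct object from the pooled literal.
        String runtimeValue = new String("td");
        System.out.println(runtimeValue == "td");                  // prints "false"
        // After intern(), reference comparison against the literal succeeds.
        System.out.println(internedContext(runtimeValue) == "td"); // prints "true"
    }
}
```

Compile-time string literals are already interned, so a hard-coded element name satisfies the requirement without an explicit intern() call.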
-
cdataSectionAllowed
public boolean cdataSectionAllowed() throws org.xml.sax.SAXException
Description copied from interface: TokenHandler
Checks if the CDATA sections are allowed.
- Specified by:
cdataSectionAllowed in interface TokenHandler
- Returns:
true if CDATA sections are allowed
- Throws:
org.xml.sax.SAXException - if something went wrong
- See Also:
TokenHandler.cdataSectionAllowed()
-
setFragmentContext
public final void setFragmentContext(java.lang.String context, java.lang.String ns, T node, boolean quirks)
The argument MUST be an interned string or null.
- Parameters:
context-
-
currentNode
protected final T currentNode()
-
isScriptingEnabled
public boolean isScriptingEnabled()
Returns the scriptingEnabled.
- Returns:
- the scriptingEnabled
-
setScriptingEnabled
public void setScriptingEnabled(boolean scriptingEnabled)
Sets the scriptingEnabled.
- Parameters:
scriptingEnabled - the scriptingEnabled to set
-
setDoctypeExpectation
public void setDoctypeExpectation(DoctypeExpectation doctypeExpectation)
Sets the doctypeExpectation.
- Parameters:
doctypeExpectation - the doctypeExpectation to set
-
setNamePolicy
public void setNamePolicy(XmlViolationPolicy namePolicy)
-
setDocumentModeHandler
public void setDocumentModeHandler(DocumentModeHandler documentModeHandler)
Sets the documentModeHandler.
- Parameters:
documentModeHandler - the documentModeHandler to set
-
setReportingDoctype
public void setReportingDoctype(boolean reportingDoctype)
Sets the reportingDoctype.
- Parameters:
reportingDoctype - the reportingDoctype to set
-
flushCharacters
public final void flushCharacters() throws org.xml.sax.SAXException
Flushes the pending characters. Public for document.write use cases only.
- Throws:
org.xml.sax.SAXException
-
newSnapshot
public TreeBuilderState<T> newSnapshot() throws org.xml.sax.SAXException
Creates a comparable snapshot of the tree builder state. Snapshot creation is only supported immediately after a script end tag has been processed. In C++ the caller is responsible for calling delete on the returned object.
- Returns:
- a snapshot.
- Throws:
org.xml.sax.SAXException
-
snapshotMatches
public boolean snapshotMatches(TreeBuilderState<T> snapshot)
-
loadState
public void loadState(TreeBuilderState<T> snapshot, Interner interner) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
getFormPointer
public T getFormPointer()
Description copied from interface: TreeBuilderState
Returns the formPointer.
- Specified by:
getFormPointer in interface TreeBuilderState<T>
- Returns:
- the formPointer
- See Also:
TreeBuilderState.getFormPointer()
-
getHeadPointer
public T getHeadPointer()
Returns the headPointer.
- Specified by:
getHeadPointer in interface TreeBuilderState<T>
- Returns:
- the headPointer
-
getDeepTreeSurrogateParent
public T getDeepTreeSurrogateParent()
Returns the deepTreeSurrogateParent.
- Specified by:
getDeepTreeSurrogateParent in interface TreeBuilderState<T>
- Returns:
- the deepTreeSurrogateParent
-
getListOfActiveFormattingElements
public nu.validator.htmlparser.impl.StackNode<T>[] getListOfActiveFormattingElements()
Description copied from interface: TreeBuilderState
Returns the listOfActiveFormattingElements.
- Specified by:
getListOfActiveFormattingElements in interface TreeBuilderState<T>
- Returns:
- the listOfActiveFormattingElements
- See Also:
TreeBuilderState.getListOfActiveFormattingElements()
-
getStack
public nu.validator.htmlparser.impl.StackNode<T>[] getStack()
Description copied from interface: TreeBuilderState
Returns the stack.
- Specified by:
getStack in interface TreeBuilderState<T>
- Returns:
- the stack
- See Also:
TreeBuilderState.getStack()
-
getMode
public int getMode()
Returns the mode.
- Specified by:
getMode in interface TreeBuilderState<T>
- Returns:
- the mode
-
getOriginalMode
public int getOriginalMode()
Returns the originalMode.
- Specified by:
getOriginalMode in interface TreeBuilderState<T>
- Returns:
- the originalMode
-
isFramesetOk
public boolean isFramesetOk()
Returns the framesetOk.
- Specified by:
isFramesetOk in interface TreeBuilderState<T>
- Returns:
- the framesetOk
-
isNeedToDropLF
public boolean isNeedToDropLF()
Returns the needToDropLF.
- Specified by:
isNeedToDropLF in interface TreeBuilderState<T>
- Returns:
- the needToDropLF
-
isQuirks
public boolean isQuirks()
Returns the quirks.
- Specified by:
isQuirks in interface TreeBuilderState<T>
- Returns:
- the quirks
-
getListOfActiveFormattingElementsLength
public int getListOfActiveFormattingElementsLength()
Description copied from interface: TreeBuilderState
Return the length of the list of active formatting elements.
- Specified by:
getListOfActiveFormattingElementsLength in interface TreeBuilderState<T>
- Returns:
- the length of the list of active formatting elements.
- See Also:
TreeBuilderState.getListOfActiveFormattingElementsLength()
-
getStackLength
public int getStackLength()
Description copied from interface: TreeBuilderState
Return the length of the stack.
- Specified by:
getStackLength in interface TreeBuilderState<T>
- Returns:
- the length of the stack.
- See Also:
TreeBuilderState.getStackLength()
-
-