Class XhtmlBaseParser
- java.lang.Object
-
- org.apache.maven.doxia.parser.AbstractParser
-
- org.apache.maven.doxia.parser.AbstractXmlParser
-
- org.apache.maven.doxia.parser.XhtmlBaseParser
-
- All Implemented Interfaces:
LogEnabled, HtmlMarkup, Markup, XmlMarkup, Parser
- Direct Known Subclasses:
FmlContentParser, XdocParser, XhtmlParser
public class XhtmlBaseParser extends AbstractXmlParser implements HtmlMarkup
Common base parser for xhtml events.
- Since:
- 1.1
- Version:
- $Id: XhtmlBaseParser.java 1726411 2016-01-23 16:34:09Z hboutemy $
- Author:
- Jason van Zyl, ltheussl
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.maven.doxia.parser.AbstractXmlParser
AbstractXmlParser.CachedFileEntityResolver
-
-
Field Summary
-
Fields inherited from interface org.apache.maven.doxia.markup.HtmlMarkup
A, ABBR, ACRONYM, ADDRESS, APPLET, AREA, B, BASE, BASEFONT, BDO, BIG, BLOCKQUOTE, BODY, BR, BUTTON, CAPTION, CDATA_TYPE, CENTER, CITE, CODE, COL, COLGROUP, DD, DEL, DFN, DIR, DIV, DL, DT, EM, ENTITY_TYPE, FIELDSET, FONT, FORM, FRAME, FRAMESET, H1, H2, H3, H4, H5, H6, HEAD, HR, HTML, I, IFRAME, IMG, INPUT, INS, ISINDEX, KBD, LABEL, LEGEND, LI, LINK, MAP, MENU, META, NOFRAMES, NOSCRIPT, OBJECT, OL, OPTGROUP, OPTION, P, PARAM, PRE, Q, S, SAMP, SCRIPT, SELECT, SMALL, SPAN, STRIKE, STRONG, STYLE, SUB, SUP, TABLE, TAG_TYPE_END, TAG_TYPE_SIMPLE, TAG_TYPE_START, TBODY, TD, TEXTAREA, TFOOT, TH, THEAD, TITLE, TR, TT, U, UL, VAR
-
Fields inherited from interface org.apache.maven.doxia.markup.Markup
COLON, EOL, EQUAL, GREATER_THAN, LEFT_CURLY_BRACKET, LEFT_SQUARE_BRACKET, LESS_THAN, MINUS, PLUS, QUOTE, RIGHT_CURLY_BRACKET, RIGHT_SQUARE_BRACKET, SEMICOLON, SLASH, SPACE, STAR
-
Fields inherited from interface org.apache.maven.doxia.parser.Parser
ROLE, TXT_TYPE, UNKNOWN_TYPE, XML_TYPE
-
Fields inherited from interface org.apache.maven.doxia.markup.XmlMarkup
BANG, CDATA, DOCTYPE_START, ENTITY_START, XML_NAMESPACE
-
-
Constructor Summary
Constructors
- XhtmlBaseParser()
-
Method Summary
Modifier and Type | Method | Description
protected boolean | baseEndTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) | Goes through a common list of possible html end tags.
protected boolean | baseStartTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) | Goes through a common list of possible html start tags.
protected void | consecutiveSections(int newLevel, Sink sink) | Make sure sections are nested consecutively.
protected int | getSectionLevel() | Return the current section level.
protected void | handleCdsect(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) | Handles CDATA sections.
protected void | handleComment(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) | Handles comments.
protected void | handleEndTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) | Goes through the possible end tags.
protected void | handleStartTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) | Goes through the possible start tags.
protected void | handleText(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) | Handles text events.
protected void | init() | Initialize the parser.
protected void | initXmlParser(org.codehaus.plexus.util.xml.pull.XmlPullParser parser) | Initializes the parser with custom entities or other options.
protected boolean | isScriptBlock() | Checks if we are currently inside a <script> tag.
protected boolean | isVerbatim() | Checks if we are currently inside a <pre> tag.
void | parse(java.io.Reader source, Sink sink) | Parses the given source model and emits Doxia events into the given sink.
protected void | setSectionLevel(int newLevel) | Set the current section level.
protected java.lang.String | validAnchor(java.lang.String id) | Checks if the given id is a valid Doxia id and if not, returns a transformed one.
protected void | verbatim() | Start verbatim mode.
protected void | verbatim_() | Stop verbatim mode.
-
Methods inherited from class org.apache.maven.doxia.parser.AbstractXmlParser
getAttributesFromParser, getLocalEntities, getText, getType, handleEntity, handleUnknown, isCollapsibleWhitespace, isIgnorableWhitespace, isTrimmableWhitespace, isValidate, parse, setCollapsibleWhitespace, setIgnorableWhitespace, setTrimmableWhitespace, setValidate
-
Methods inherited from class org.apache.maven.doxia.parser.AbstractParser
doxiaVersion, enableLogging, executeMacro, getBasedir, getLog, getMacroManager, isEmitComments, isSecondParsing, parse, setEmitComments, setSecondParsing
-
-
-
-
Method Detail
-
parse
public void parse(java.io.Reader source, Sink sink) throws ParseException
Parses the given source model and emits Doxia events into the given sink.
- Specified by:
parse in interface Parser
- Overrides:
parse in class AbstractXmlParser
- Parameters:
source - not null reader that provides the source document. You could use newReader methods from ReaderFactory.
sink - A sink that consumes the Doxia events.
- Throws:
ParseException - if the model could not be parsed.
-
initXmlParser
protected void initXmlParser(org.codehaus.plexus.util.xml.pull.XmlPullParser parser) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
Initializes the parser with custom entities or other options. Adds all XHTML (HTML 4.0) entities to the parser so that they can be recognized and resolved without an additional DTD.
- Overrides:
initXmlParser in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem initializing the parser
-
baseStartTag
protected boolean baseStartTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink)
Goes through a common list of possible html start tags. These include only tags that can go into the body of an xhtml document and so should be re-usable by different xhtml-based parsers.
The currently handled tags are:
<h2>, <h3>, <h4>, <h5>, <h6>, <p>, <pre>, <ul>, <ol>, <li>, <dl>, <dt>, <dd>, <b>, <strong>, <i>, <em>, <code>, <samp>, <tt>, <a>, <table>, <tr>, <th>, <td>, <caption>, <br/>, <hr/>, <img/>.
- Parameters:
parser - A parser.
sink - the sink to receive the events.
- Returns:
- True if the event has been handled by this method, i.e. the tag was recognized, false otherwise.
-
baseEndTag
protected boolean baseEndTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink)
Goes through a common list of possible html end tags. These should be re-usable by different xhtml-based parsers. The tags handled here are the same as for baseStartTag(XmlPullParser,Sink), except for the empty elements (<br/>, <hr/>, <img/>).
- Parameters:
parser - A parser.
sink - the sink to receive the events.
- Returns:
- True if the event has been handled by this method, false otherwise.
-
handleStartTag
protected void handleStartTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException, MacroExecutionException
Goes through the possible start tags. Just calls baseStartTag(XmlPullParser,Sink); this should be overridden by implementing parsers to include additional tags.
- Specified by:
handleStartTag in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
sink - the sink to receive the events.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
MacroExecutionException - if there's a problem executing a macro
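The override pattern described for handleStartTag can be sketched without Doxia on the classpath. Everything below is a hypothetical stand-in: the class names, the use of plain tag-name strings instead of an XmlPullParser, the list of recorded events instead of a Sink, and the extra "source" tag are all assumptions made for illustration. It shows one possible ordering (common tags first, parser-specific tags second), not how any particular Doxia parser is written.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a base parser that knows a few common tags. */
class BaseTagHandlerSketch {
    /** Recorded events, standing in for calls on a sink. */
    final List<String> events = new ArrayList<>();

    /** Returns true if the tag was recognized and an event was recorded. */
    protected boolean baseStartTag(String tagName) {
        switch (tagName) {
            case "p":
                events.add("paragraph");
                return true;
            case "b":
            case "strong":
                events.add("bold");
                return true;
            default:
                return false;
        }
    }

    /** Default behavior: only the common tags are handled. */
    protected void handleStartTag(String tagName) {
        baseStartTag(tagName);
    }
}

/** Hypothetical subclass that adds one tag of its own. */
class ExtendedTagHandlerSketch extends BaseTagHandlerSketch {
    @Override
    protected void handleStartTag(String tagName) {
        // Let the shared handler try first; stop if it recognized the tag.
        if (baseStartTag(tagName)) {
            return;
        }
        // Otherwise handle the additional, parser-specific tags.
        if ("source".equals(tagName)) {
            events.add("verbatim");
        } else {
            events.add("unknown:" + tagName);
        }
    }
}
```

The boolean return value of the base method is what makes the fallback possible: the subclass only needs to act when the shared handler reports that it did not recognize the tag.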
-
handleEndTag
protected void handleEndTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException, MacroExecutionException
Goes through the possible end tags. Just calls baseEndTag(XmlPullParser,Sink); this should be overridden by implementing parsers to include additional tags.
- Specified by:
handleEndTag in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
sink - the sink to receive the events.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
MacroExecutionException - if there's a problem executing a macro
-
handleText
protected void handleText(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
Handles text events.
This is a default implementation: if the parser points to a non-empty text element, it is emitted as a text event into the specified sink.
- Overrides:
handleText in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
sink - the sink to receive the events. Not null.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
-
handleComment
protected void handleComment(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
Handles comments.
This is a default implementation: all data are emitted as comment events into the specified sink.
- Overrides:
handleComment in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
sink - the sink to receive the events. Not null.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
-
handleCdsect
protected void handleCdsect(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
Handles CDATA sections.
This is a default implementation: all data are emitted as text events into the specified sink.
- Overrides:
handleCdsect in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
sink - the sink to receive the events. Not null.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
-
consecutiveSections
protected void consecutiveSections(int newLevel, Sink sink)
Make sure sections are nested consecutively.
HTML doesn't have any sections, only sectionTitles (<h2> etc), which means we have to open or close any sections that are missing in between.
For instance, if the following sequence is parsed:
<h3></h3> <h6></h6>
we have to insert two section starts before we open the <h6>. In the following sequence
<h6></h6> <h3></h3>
we have to close two sections before we open the <h3>.
The current level is set to newLevel afterwards.
- Parameters:
newLevel - the new section level, all upper levels have to be closed.
sink - the sink to receive the events.
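The bookkeeping described for consecutiveSections can be illustrated with a small, self-contained sketch. It is not the Doxia implementation: the class SectionNestingSketch, the string event names, the heading(int) helper, and the mapping of <h2> through <h6> to levels 1 through 5 are all assumptions. The sketch also assumes that the section belonging to the heading itself is opened by the caller after consecutiveSections returns.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a parser's section bookkeeping; not Doxia code. */
final class SectionNestingSketch {
    /** Current section level; 0 means no section is open. */
    private int sectionLevel = 0;

    /** Recorded events in emission order, standing in for a sink. */
    final List<String> events = new ArrayList<>();

    /** Closes sections at or below newLevel, opens missing ancestors, then stores newLevel. */
    void consecutiveSections(int newLevel) {
        if (newLevel < 1) {
            throw new IllegalArgumentException("newLevel must be at least 1");
        }
        // Close every open section whose level is >= newLevel, deepest first.
        while (sectionLevel >= newLevel) {
            events.add("section" + sectionLevel + "_");
            sectionLevel--;
        }
        // Open the ancestor sections missing between the current level and newLevel.
        while (sectionLevel < newLevel - 1) {
            sectionLevel++;
            events.add("section" + sectionLevel);
        }
        sectionLevel = newLevel;
    }

    /** Simulates a heading at the given level: fix the nesting, then open its own section. */
    void heading(int level) {
        consecutiveSections(level);
        events.add("section" + level);
    }

    public static void main(String[] args) {
        SectionNestingSketch sketch = new SectionNestingSketch();
        sketch.heading(2); // stands in for <h3>
        sketch.heading(5); // stands in for <h6>
        System.out.println(sketch.events);
    }
}
```

In this sketch, going from level 2 to level 5 inserts the two intermediate starts (levels 3 and 4) before the level 5 section is opened. Going back from level 5 to level 2 closes the two intermediate sections as well as the level 5 section being left and the previously open level 2 section, so the number of close events it records is larger than the count of intermediate levels alone.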
-
getSectionLevel
protected int getSectionLevel()
Return the current section level.
- Returns:
- the current section level.
-
setSectionLevel
protected void setSectionLevel(int newLevel)
Set the current section level.
- Parameters:
newLevel - the new section level.
-
verbatim_
protected void verbatim_()
Stop verbatim mode.
-
verbatim
protected void verbatim()
Start verbatim mode.
-
isVerbatim
protected boolean isVerbatim()
Checks if we are currently inside a <pre> tag.
- Returns:
- true if we are currently in verbatim mode.
-
isScriptBlock
protected boolean isScriptBlock()
Checks if we are currently inside a <script> tag.
- Returns:
- true if we are currently inside <script> tags.
- Since:
- 1.1.1.
-
validAnchor
protected java.lang.String validAnchor(java.lang.String id)
Checks if the given id is a valid Doxia id and if not, returns a transformed one.
- Parameters:
id - The id to validate.
- Returns:
- A transformed id or the original id if it was already valid.
- See Also:
DoxiaUtils.encodeId(String)
-
init
protected void init()
Initialize the parser. This is called first by Parser.parse(java.io.Reader, org.apache.maven.doxia.sink.Sink) and can be used to set the parser into a clear state so it can be re-used.
- Overrides:
init in class AbstractParser
-
-