From gitlab at salsa.debian.org Fri May 1 13:48:59 2026 From: gitlab at salsa.debian.org (bastif (@bastif)) Date: Fri, 01 May 2026 12:48:59 +0000 Subject: [Git][java-team/libhtml5parser-java][pristine-tar] pristine-tar data for libhtml5parser-java_1.4+r20260416.orig.tar.xz Message-ID: <69f4a13b535b6_52ffdd54402b6@godard.mail> bastif pushed to branch pristine-tar at Debian Java Maintainers / libhtml5parser-java Commits: 03768c8f by Fab Stz at 2026-05-01T14:42:21+02:00 pristine-tar data for libhtml5parser-java_1.4+r20260416.orig.tar.xz - - - - - 2 changed files: - + libhtml5parser-java_1.4+r20260416.orig.tar.xz.delta - + libhtml5parser-java_1.4+r20260416.orig.tar.xz.id Changes: ===================================== libhtml5parser-java_1.4+r20260416.orig.tar.xz.delta ===================================== Binary files /dev/null and b/libhtml5parser-java_1.4+r20260416.orig.tar.xz.delta differ ===================================== libhtml5parser-java_1.4+r20260416.orig.tar.xz.id ===================================== @@ -0,0 +1 @@ +ed16c3dadfb172bbf4906696309855f1cd7894f6 View it on GitLab: https://salsa.debian.org/java-team/libhtml5parser-java/-/commit/03768c8fd805784a227ae98276f44a2bc1185c0f
From gitlab at salsa.debian.org Fri May 1 13:49:12 2026 From: gitlab at salsa.debian.org (bastif (@bastif)) Date: Fri, 01 May 2026 12:49:12 +0000 Subject: [Git][java-team/libhtml5parser-java][master] 3 commits: New upstream version 1.4+r20260416 Message-ID: <69f4a1486da9f_52ffdbc4405df@godard.mail> bastif pushed to branch master at Debian Java Maintainers / libhtml5parser-java Commits: 5c6cfbbf by Fab Stz at 2026-05-01T14:42:20+02:00 New upstream version 1.4+r20260416 - - - - - 0ebed37d by Fab Stz at 2026-05-01T14:42:21+02:00 Update upstream source from tag 'upstream/1.4+r20260416' Update to upstream version '1.4+r20260416' with Debian dir 5e8917984c28eeee6bf5550e9507b67bb09d4117 - - - - - 3f5082d3 by Fab Stz at 2026-05-01T14:48:28+02:00 Update changelog to 1.4+r20260416-1 - - - - - 18 changed files: - ? .github/dependabot.yml - .github/workflows/build.yml - + CONTRIBUTING.md - debian/changelog - gwt-src/nu/validator/htmlparser/gwt/BrowserTreeBuilder.java - src/nu/validator/htmlparser/dom/DOMTreeBuilder.java - src/nu/validator/htmlparser/impl/AttributeName.java - src/nu/validator/htmlparser/impl/ElementName.java - src/nu/validator/htmlparser/impl/Portability.java - src/nu/validator/htmlparser/impl/Tokenizer.java - src/nu/validator/htmlparser/impl/TreeBuilder.java - src/nu/validator/htmlparser/sax/SAXTreeBuilder.java - src/nu/validator/htmlparser/xom/XOMTreeBuilder.java - src/nu/validator/saxtree/CharBufferNode.java - src/nu/validator/saxtree/ParentNode.java - translator-src/nu/validator/htmlparser/cpptranslate/CppTypes.java - translator-src/nu/validator/htmlparser/cpptranslate/CppVisitor.java - translator-src/nu/validator/htmlparser/cpptranslate/HVisitor.java Changes: ===================================== .github/dependabot.yml deleted ===================================== @@ -1,10 +0,0 @@ -version: 2 -updates: - - package-ecosystem: "github-actions" - directory: "/" - schedule: - interval: "weekly" - - package-ecosystem: "maven" - directory: "/" - schedule: - 
interval: "weekly" ===================================== .github/workflows/build.yml ===================================== @@ -11,7 +11,7 @@ jobs: runs-on: ${{ matrix.os }} strategy: matrix: - java: [24, 21, 17, 11.0.23] + java: [25, 21, 17, 11.0.23] os: [ubuntu-latest, macos-latest, windows-latest] name: Java ${{ matrix.java }} steps: ===================================== CONTRIBUTING.md ===================================== @@ -0,0 +1,96 @@ +# Contributing to htmlparser + +## Adding new elements + +When adding new elements to the parser, you must regenerate the element name hash tables in `src/nu/validator/htmlparser/impl/ElementName.java`. + +### Step 1: Add the new element constant + +Add a new `static final ElementName` constant for your element, following the existing pattern: + +```java +public static final ElementName MYNEWELEMENT = new ElementName( + "mynewelement", "mynewelement", + // CPPONLY: NS_NewHTMLElement, + // CPPONLY: NS_NewSVGUnknownElement, + TreeBuilder.OTHER); +``` + +The flags (like `TreeBuilder.OTHER`, `SPECIAL`, `SCOPING`, etc.) depend on how the element should be handled by the tree builder. + +### Step 2: Uncomment the code generation sections + +Uncomment three sections in `ElementName.java`: + +1. **The imports** near the top (~lines 26-39): + - `java.io.*` + - `java.util.*` + - `java.util.regex.*` + +2. **`implements Comparable`** on the class declaration (~line 49) + +3. **The code generation block** marked with: + `"START CODE ONLY USED FOR GENERATING CODE uncomment and run to regenerate"` + That includes the `main()` method and helper functions (~lines 272-659) + +### Step 3: Add case to treeBuilderGroupToName() if needed + +If your element uses a new `TreeBuilder` group constant, add a case for it in the `treeBuilderGroupToName()` method within the code generation block. 
+ +### Step 4: Compile and run + +Compile the project: + +```bash +mvn compile +``` + +Run the `ElementName` class with paths to the Gecko tag-list files: + +```bash +java -cp target/classes nu.validator.htmlparser.impl.ElementName \ + /path/to/nsHTMLTagList.h \ + /path/to/SVGTagList.h +``` + +**For Java-only builds** (not Gecko), you can use empty dummy files: + +```bash +mkdir -p /tmp/tagfiles +touch /tmp/tagfiles/nsHTMLTagList.h /tmp/tagfiles/SVGTagList.h +java -cp target/classes nu.validator.htmlparser.impl.ElementName \ + /tmp/tagfiles/nsHTMLTagList.h \ + /tmp/tagfiles/SVGTagList.h +``` + +> [!NOTE] +> Using empty files means the `CPPONLY` comments will all show `NS_NewHTMLUnknownElement`. For Gecko builds, use the actual files from moz-central: +> - `parser/htmlparser/nsHTMLTagList.h` +> - `dom/svg/SVGTagList.h` + +### Step 5: Update the generated arrays + +The program outputs: +1. All element constant definitions (with updated `CPPONLY` comments if using real Gecko tag files) +2. The `ELEMENT_NAMES` array in level-order binary search tree order +3. The `ELEMENT_HASHES` array with corresponding hash values + +Replace the existing `ELEMENT_NAMES` and `ELEMENT_HASHES` arrays in the file with the generated output. The arrays must stay in sync: element at position N in `ELEMENT_NAMES` must have its hash at position N in `ELEMENT_HASHES`. + +### Step 6: Re-comment the code generation sections + +After regeneration, comment out the sections you uncommented in Step 2 to restore the file to its normal state. + +### Step 7: Run tests + +Verify your changes work correctly: + +```bash +mvn test +``` + +### Technical Details + +The hash function (`bufToHash`) creates a unique integer for each element name using the element's length and specific character positions. The arrays are organized as a level-order binary search tree for O(log n) lookup performance. + +If you encounter a hash collision (two elements with the same hash), the regeneration will report an error. 
That would require modifying the hash function, which has not been necessary historically. ===================================== debian/changelog ===================================== @@ -1,3 +1,9 @@ +libhtml5parser-java (1.4+r20260416-1) UNRELEASED; urgency=medium + + * New upstream version 1.4+r20260416-1 + + -- Fab Stz Fri, 01 May 2026 14:42:33 +0200 + libhtml5parser-java (1.4+r20250916-1) unstable; urgency=medium [ Fab Stz ] ===================================== gwt-src/nu/validator/htmlparser/gwt/BrowserTreeBuilder.java ===================================== @@ -474,4 +474,107 @@ class BrowserTreeBuilder extends CoalescingTreeBuilder { fatal(e); } } + + private static native JavaScriptObject getNextSibling( + JavaScriptObject node) /*-{ + return node.nextSibling; + }-*/; + + private static native String getLocalName( + JavaScriptObject node) /*-{ + return node.localName; + }-*/; + + private static native String getNamespaceURI( + JavaScriptObject node) /*-{ + return node.namespaceURI; + }-*/; + + private static native boolean hasAttribute( + JavaScriptObject node, String name) /*-{ + return node.hasAttribute(name); + }-*/; + + @Override + // https://html.spec.whatwg.org/multipage/form-elements.html#maybe-clone-an-option-into-selectedcontent + // Implements "maybe clone an option into selectedcontent" + protected void optionElementPopped(JavaScriptObject option) + throws SAXException { + try { + // Find the nearest ancestor + JavaScriptObject selectedContent = findSelectedContent( + select); + if (selectedContent == null) { + return; + } + + // Check option selectedness + boolean hasSelectedAttr = hasAttribute(option, "selected"); + if (!hasSelectedAttr && hasChildNodes(selectedContent)) { + // Not the first option and no explicit selected attr + return; + } + + // Clear selectedcontent children and deep-clone option children + while (hasChildNodes(selectedContent)) { + removeChild(selectedContent, getFirstChild(selectedContent)); + } + for (JavaScriptObject 
child = getFirstChild(option); + child != null; child = getNextSibling(child)) { + appendChild(selectedContent, cloneNodeDeep(child)); + } + } catch (JavaScriptException e) { + fatal(e); + } + } + + private JavaScriptObject findSelectedContent( + JavaScriptObject root) { + JavaScriptObject current = getFirstChild(root); + if (current == null) { + return null; + } + JavaScriptObject next; + for (;;) { + if (getNodeType(current) == 1 + && "selectedcontent".equals(getLocalName(current)) + && "http://www.w3.org/1999/xhtml".equals( + getNamespaceURI(current))) { + return current; + } + if ((next = getFirstChild(current)) != null) { + current = next; + continue; + } + for (;;) { + if (current == root) { + return null; + } + if ((next = getNextSibling(current)) != null) { + current = next; + break; + } + current = getParentNode(current); + } + } + } } ===================================== src/nu/validator/htmlparser/dom/DOMTreeBuilder.java ===================================== @@ -354,4 +354,89 @@ class DOMTreeBuilder extends CoalescingTreeBuilder { fatal(e); } } + + @Override + // https://html.spec.whatwg.org/multipage/form-elements.html#maybe-clone-an-option-into-selectedcontent + // Implements "maybe clone an option into selectedcontent" + protected void optionElementPopped(Element option) throws SAXException { + try { + // Find the nearest ancestor + Element selectedContent = findSelectedContent(select); + if (selectedContent == null) { + return; + } + + // Check option selectedness + boolean hasSelected = option.hasAttribute("selected"); + if (!hasSelected && selectedContent.hasChildNodes()) { + // Not the first option and no explicit selected attr + return; + } + + // Clear selectedcontent children and deep-clone option children + while (selectedContent.hasChildNodes()) { + selectedContent.removeChild(selectedContent.getFirstChild()); + } + for (Node child = option.getFirstChild(); child != null; + child = child.getNextSibling()) { + 
selectedContent.appendChild(child.cloneNode(true)); + } + } catch (DOMException e) { + fatal(e); + } + } + + private Element findSelectedContent(Element root) { + Node current = root.getFirstChild(); + if (current == null) { + return null; + } + Node next; + for (;;) { + if (current.getNodeType() == Node.ELEMENT_NODE) { + Element elt = (Element) current; + if ("selectedcontent".equals(elt.getLocalName()) + && "http://www.w3.org/1999/xhtml".equals( + elt.getNamespaceURI())) { + return elt; + } + } + if ((next = current.getFirstChild()) != null) { + current = next; + continue; + } + for (;;) { + if (current == root) { + return null; + } + if ((next = current.getNextSibling()) != null) { + current = next; + break; + } + current = current.getParentNode(); + } + } + } } ===================================== src/nu/validator/htmlparser/impl/AttributeName.java ===================================== @@ -806,7 +806,9 @@ public final class AttributeName public static final AttributeName SRCDOC = new AttributeName(ALL_NO_NS, "srcdoc", "srcdoc", "srcdoc", "srcdoc", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); public static final AttributeName STDDEVIATION = new AttributeName(ALL_NO_NS, "stddeviation", "stddeviation", "stdDeviation", "stddeviation", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); public static final AttributeName SANDBOX = new AttributeName(ALL_NO_NS, "sandbox", "sandbox", "sandbox", "sandbox", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); + public static final AttributeName SHADOWROOTCUSTOMELEMENTREGISTRY = new AttributeName(ALL_NO_NS, "shadowrootcustomelementregistry", "shadowrootcustomelementregistry", "shadowrootcustomelementregistry", "shadowrootcustomelementregistry", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); public static final AttributeName SHADOWROOTDELEGATESFOCUS = new AttributeName(ALL_NO_NS, "shadowrootdelegatesfocus", "shadowrootdelegatesfocus", "shadowrootdelegatesfocus", 
"shadowrootdelegatesfocus", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); + public static final AttributeName SHADOWROOTSLOTASSIGNMENT = new AttributeName(ALL_NO_NS, "shadowrootslotassignment", "shadowrootslotassignment", "shadowrootslotassignment", "shadowrootslotassignment", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); public static final AttributeName WORD_SPACING = new AttributeName(ALL_NO_NS, "word-spacing", "word-spacing", "word-spacing", "word-spacing", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); public static final AttributeName ACCENTUNDER = new AttributeName(ALL_NO_NS, "accentunder", "accentunder", "accentunder", "accentunder", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); public static final AttributeName ACCEPT_CHARSET = new AttributeName(ALL_NO_NS, "accept-charset", "accept-charset", "accept-charset", "accept-charset", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); @@ -1199,37 +1201,37 @@ public final class AttributeName public static final AttributeName RY = new AttributeName(ALL_NO_NS, "ry", "ry", "ry", "ry", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); public static final AttributeName REFY = new AttributeName(ALL_NO_NS, "refy", "refy", "refY", "refy", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); private final static @NoLength AttributeName[] ATTRIBUTE_NAMES = { - MARKERUNITS, - BASELINE, - STOP_COLOR, + MARKERWIDTH, + BASELINE_SHIFT, + SHAPE, CLEAR, - XREF, - AUTOPLAY, - FONT_STYLE, + PROFILE, + XLINK_SHOW, + FONT_WEIGHT, ARIA_DISABLED, OPACITY, - ONBEFOREPRINT, - PATH, - ALINK, - ONMOUSEDOWN, - COLS, - COLUMNLINES, + ONMESSAGE, + ONCHANGE, + ZOOMANDPAN, + ONMOUSEOUT, + CLASSID, + ACCUMULATE, Y, ARIA_MULTISELECTABLE, ROTATE, SHADOWROOTCLONABLE, - LINEBREAK, - REPEATDUR, - ORIGIN, - RADIUS, - TABLEVALUES, - POINTSATZ, - NUMOCTAVES, - CLIPPATHUNITS, - ONDRAGEND, - ROWS, - PATTERNTRANSFORM, - VIEWTARGET, + INTERCEPT, + ROLE, + MARGINHEIGHT, + OPTIMUM, 
+ SCALE, + POINTSATX, + FLOOD_OPACITY, + CLIP_RULE, + ONDRAGENTER, + ROWSPAN, + ONSTART, + VALUE, MIN, K3, ARIA_CHANNEL, @@ -1237,31 +1239,31 @@ public final class AttributeName LOCAL, ONABORT, HIDDEN, - ACCEPT_CHARSET, - DIRECTION, - OBJECT, - ONBEFORECUT, - SIZE, - IMAGE_RENDERING, - MATHBACKGROUND, - DIVISOR, - LINK, - FILL_OPACITY, - FORM, - OPEN, - XLINK_TITLE, - COLOR_INTERPOLATION, - ONZOOM, - STROKE, - LOOP, - COORDS, - STARTOFFSET, - LOWSRC, - CONTEXTMENU, - KEYTIMES, - TEXT_DECORATION, - REQUIRED, - CY, + WORD_SPACING, + DEFER, + ONBEFOREUNLOAD, + ONKEYPRESS, + SPREADMETHOD, + IMAGESIZES, + HIGH, + BEGIN, + VISIBILITY, + FILL_RULE, + FRAMESPACING, + KERNELUNITLENGTH, + WHEN, + COLOR_PROFILE, + ONFOCUSIN, + STROKE_LINEJOIN, + HTTP_EQUIV, + ATTRIBUTETYPE, + ONDRAGSTART, + KEYSYSTEM, + CONTROLS, + FONTSIZE, + SYSTEMLANGUAGE, + ONSUBMIT, + REFX, END, SRC, Y1, @@ -1276,183 +1278,183 @@ public final class AttributeName FETCHPRIORITY, BORDER, RENDERING_INTENT, - SANDBOX, - BEVELLED, - CODEBASE, - FACE, - NAME, - ONRESET, - ONSELECTSTART, - REFERRERPOLICY, - STRETCHY, - HREFLANG, - DRAGGABLE, - LONGDESC, - TARGETY, - MATHSIZE, - ACTIVE, - MANIFEST, - TABINDEX, - MASK, - CELLPADDING, - REPLACE, - FRAMEBORDER, - SUMMARY, - KERNELMATRIX, - POINTER_EVENTS, - TRANSFORM, - XMLNS, - AUTOCAPITALIZE, - EXPONENT, - ONMOUSEENTER, - ONMOUSEUP, - STROKE_DASHARRAY, - COMPACT, - GLYPH_ORIENTATION_HORIZONTAL, - SHAPE_RENDERING, - ABBR, - NOHREF, - OPERATOR, - BIAS, - CLASS, - PRESERVEALPHA, - ALTTEXT, - FILTER, - FONT_SIZE_ADJUST, - RT, - RESTART, - WRITING_MODE, - GROUPALIGN, - VALUES, - FX, - RY, - DIR, - IN2, - REL, - R, - K1, - X2, - XML_SPACE, - ARIA_LABELLEDBY, - ARIA_SELECTED, - ARIA_PRESSED, - ARIA_SECRET, - ARIA_TEMPLATEID, - ARIA_MULTILINE, - ARIA_RELEVANT, - ARIA_AUTOCOMPLETE, - ARIA_HASPOPUP, - DEFAULT, - HSPACE, - MOVABLELIMITS, - RSPACE, - SEPARATORS, - ENABLE_BACKGROUND, - CHECKED, - ONSCROLL, - SPECULAREXPONENT, - GRADIENTTRANSFORM, - LOADING, - SEED, - SRCDOC, - 
WORD_SPACING, + STDDEVIATION, ACCENT, - BASELINE_SHIFT, CODE, - DEFER, EDGE, - INTERCEPT, LINETHICKNESS, - ONBEFOREUNLOAD, ORDER, - ONMESSAGE, ORIENTATION, - ONKEYPRESS, ONRESIZE, - ROLE, SIZES, - SPREADMETHOD, DIFFUSECONSTANT, - PROFILE, ALIGNMENT_BASELINE, - IMAGESIZES, LANG, - MARGINHEIGHT, TARGET, - HIGH, MATHVARIANT, - ONCHANGE, ACTIONTYPE, - BEGIN, LIMITINGCONEANGLE, - OPTIMUM, SCRIPTSIZEMULTIPLIER, - VISIBILITY, MARKERHEIGHT, - MARKERWIDTH, AMPLITUDE, - FILL_RULE, ONCLICK, - SCALE, AZIMUTH, - FRAMESPACING, PRIMITIVEUNITS, - ZOOMANDPAN, EVENT, - KERNELUNITLENGTH, ONEND, - POINTSATX, STANDBY, - WHEN, XLINK_ARCROLE, - XLINK_SHOW, AUTOCOMPLETE, - COLOR_PROFILE, COLOR_INTERPOLATION_FILTERS, - FLOOD_OPACITY, ONLOAD, - ONFOCUSIN, ONMOUSELEAVE, - ONMOUSEOUT, RQUOTE, - STROKE_LINEJOIN, STROKE_WIDTH, - CLIP_RULE, DISPLAYSTYLE, - HTTP_EQUIV, SCOPED, - SHAPE, TEMPLATE, - ATTRIBUTETYPE, CHARSET, - ONDRAGENTER, ONDRAGDROP, - ONDRAGSTART, AS, - CLASSID, CLOSURE, - KEYSYSTEM, MINSIZE, - ROWSPAN, SUBSCRIPTSHIFT, - CONTROLS, ENCTYPE, - FONT_WEIGHT, FONT_FAMILY, - FONTSIZE, LIST, - ONSTART, PATTERNUNITS, - SYSTEMLANGUAGE, TEXTLENGTH, - ACCUMULATE, COLUMNSPACING, - ONSUBMIT, RESULT, - VALUE, CX, - REFX, FY, + DIR, + IN2, + REL, + R, + K1, + X2, + XML_SPACE, + ARIA_LABELLEDBY, + ARIA_SELECTED, + ARIA_PRESSED, + ARIA_SECRET, + ARIA_TEMPLATEID, + ARIA_MULTILINE, + ARIA_RELEVANT, + ARIA_AUTOCOMPLETE, + ARIA_HASPOPUP, + DEFAULT, + HSPACE, + MOVABLELIMITS, + RSPACE, + SEPARATORS, + ENABLE_BACKGROUND, + CHECKED, + ONSCROLL, + SPECULAREXPONENT, + GRADIENTTRANSFORM, + LOADING, + SEED, + SRCDOC, + SHADOWROOTCUSTOMELEMENTREGISTRY, + ACCEPT_CHARSET, + BEVELLED, + BASELINE, + CODEBASE, + DIRECTION, + FACE, + LINEBREAK, + NAME, + OBJECT, + ONRESET, + ONBEFOREPRINT, + ONSELECTSTART, + ONBEFORECUT, + REFERRERPOLICY, + REPEATDUR, + STRETCHY, + SIZE, + HREFLANG, + XREF, + DRAGGABLE, + IMAGE_RENDERING, + LONGDESC, + ORIGIN, + TARGETY, + MATHBACKGROUND, + MATHSIZE, + PATH, + ACTIVE, + DIVISOR, + 
MANIFEST, + RADIUS, + TABINDEX, + LINK, + MASK, + MARKERUNITS, + CELLPADDING, + FILL_OPACITY, + REPLACE, + TABLEVALUES, + FRAMEBORDER, + FORM, + SUMMARY, + ALINK, + KERNELMATRIX, + OPEN, + POINTER_EVENTS, + POINTSATZ, + TRANSFORM, + XLINK_TITLE, + XMLNS, + AUTOPLAY, + AUTOCAPITALIZE, + COLOR_INTERPOLATION, + EXPONENT, + NUMOCTAVES, + ONMOUSEENTER, + ONZOOM, + ONMOUSEUP, + ONMOUSEDOWN, + STROKE_DASHARRAY, + STROKE, + COMPACT, + CLIPPATHUNITS, + GLYPH_ORIENTATION_HORIZONTAL, + LOOP, + SHAPE_RENDERING, + STOP_COLOR, + ABBR, + COORDS, + NOHREF, + ONDRAGEND, + OPERATOR, + STARTOFFSET, + BIAS, + COLS, + CLASS, + LOWSRC, + PRESERVEALPHA, + ROWS, + ALTTEXT, + CONTEXTMENU, + FILTER, + FONT_STYLE, + FONT_SIZE_ADJUST, + KEYTIMES, + RT, + PATTERNTRANSFORM, + RESTART, + TEXT_DECORATION, + WRITING_MODE, + COLUMNLINES, + GROUPALIGN, + REQUIRED, + VALUES, + VIEWTARGET, + FX, + CY, REFY, ALT, DUR, @@ -1511,7 +1513,8 @@ public final class AttributeName SHADOWROOTMODE, SHADOWROOTREFERENCETARGET, SHADOWROOTSERIALIZABLE, - STDDEVIATION, + SHADOWROOTSLOTASSIGNMENT, + SANDBOX, SHADOWROOTDELEGATESFOCUS, ACCENTUNDER, ACCESSKEY, @@ -1707,262 +1710,263 @@ public final class AttributeName RX, BY, DY, - }; - private final static int[] ATTRIBUTE_HASHES = { - 1854497003, - 1747939528, - 1941454586, - 1681174213, - 1776114564, - 1915025672, - 2001669450, - 1680165421, - 1721347639, - 1754792749, - 1805715716, - 1898428101, - 1922699851, - 1983347764, - 2016787611, - 71827457, - 1680282148, - 1689324870, - 1740045858, - 1752985897, - 1756471625, - 1788254870, - 1823580230, - 1874698443, - 1906423097, - 1921894426, - 1933145837, - 1972863609, - 1991392548, - 2007019632, - 2060302634, - 57205395, - 911736834, - 1680181996, - 1680368221, - 1685882101, - 1704526375, - 1734182982, - 1747299630, - 1749027145, - 1754606246, - 1754907227, - 1757053236, - 1785174319, - 1804036350, - 1816144023, - 1853862084, - 1867620412, - 1884343396, - 1905628916, - 1910441627, - 1916278099, - 1922567078, - 1924585254, - 
1937777860, - 1966439670, - 1974849131, - 1988132214, - 2000162011, - 2004199576, - 2009071951, - 2024616088, - 2081947650, - 53006051, - 60345635, - 885522434, - 1680095865, - 1680165533, - 1680229115, - 1680343801, - 1680437801, - 1682440540, - 1687620127, - 1692408896, - 1716623661, - 1731048742, - 1739583824, - 1740130375, - 1747792072, - 1748552744, - 1749856356, - 1754214628, - 1754645079, - 1754858317, - 1756190926, - 1756804936, - 1767875272, - 1782518297, - 1786821704, - 1791070327, - 1804235064, - 1814656326, - 1820928104, - 1824377064, - 1854464212, - 1865910347, - 1873590471, - 1884142379, - 1891186903, - 1903612236, - 1906408542, - 1908462185, - 1910503637, - 1915394254, - 1917327080, - 1922413292, - 1922671417, - 1924462384, - 1932870919, - 1934917372, - 1941409583, - 1965349396, - 1972196486, - 1972909592, - 1982640164, - 1983461061, - 1990062797, - 1999273799, - 2001578182, - 2001814704, - 2005925890, - 2008084807, - 2010452700, - 2018908874, - 2026741958, - 2066743298, - 2089811970, - 52488851, - 55077603, - 59825747, - 68157441, - 878182402, - 901775362, - 1037879561, - 1680159327, - 1680165437, - 1680165692, - 1680198203, - 1680231247, - 1680315086, - 1680345965, - 1680413393, - 1680452349, - 1681879063, - 1683805446, - 1686731997, - 1689048326, - 1689839946, - 1699185409, - 1714763319, - 1721189160, - 1723336432, - 1733874289, - 1736416327, - 1739927860, - 1740096054, + RY, + }; + private final static int[] ATTRIBUTE_HASHES = { + 1854474395, + 1747839118, + 1941438085, + 1681174213, + 1772032615, + 1910527802, + 2001634459, + 1680165421, + 1721347639, + 1754647353, + 1804978712, + 1894552650, + 1922679386, + 1983266615, + 2015950026, + 71827457, + 1680282148, + 1689324870, + 1740045858, + 1751679545, + 1756302628, + 1787193500, + 1822002839, + 1874261045, + 1906419001, + 1917953597, + 1932986153, + 1972744939, + 1991021879, + 2006516551, + 2026975253, + 57205395, + 911736834, + 1680181996, + 1680368221, + 1685882101, + 1704526375, + 1734182982, 
1742183484, + 1748869205, + 1754546894, + 1754872618, + 1756874572, + 1785051290, + 1801312388, + 1814986837, + 1825677514, + 1867448617, + 1884267068, + 1903759600, + 1909819252, + 1916210285, + 1922470745, + 1924570799, + 1935597338, + 1965561677, + 1972962123, + 1987410233, + 2000125224, + 2001898808, + 2008408414, + 2023146024, + 2075005220, + 53006051, + 60345635, + 885522434, + 1680095865, + 1680165533, + 1680229115, + 1680343801, + 1680437801, + 1682440540, + 1687620127, + 1692408896, + 1716623661, + 1731048742, + 1739583824, + 1740119884, 1747446838, - 1747839118, 1748306996, - 1748869205, 1749399124, - 1751679545, 1753297133, - 1754546894, 1754643237, - 1754647353, 1754798923, - 1754872618, 1754958648, - 1756302628, 1756737685, - 1756874572, 1765800271, - 1772032615, 1780975314, - 1785051290, 1786740932, - 1787193500, 1790814502, - 1801312388, 1804069019, - 1804978712, 1814558026, - 1814986837, 1820262641, - 1822002839, 1823841492, - 1825677514, 1854302364, - 1854474395, 1864698185, - 1867448617, 1872034503, - 1874261045, 1881750231, - 1884267068, 1889633006, - 1894552650, 1900548965, - 1903759600, 1905754853, - 1906419001, 1907701479, - 1909819252, 1910441773, - 1910527802, 1915295948, - 1916210285, 1916337499, - 1917953597, 1922319046, - 1922470745, 1922665052, - 1922679386, 1924206934, - 1924570799, 1924738716, - 1932986153, 1933508940, - 1935597338, 1941253366, - 1941438085, 1942026440, - 1965561677, 1966454567, - 1972744939, 1972904522, - 1972962123, 1980235778, - 1983266615, 1983416119, - 1987410233, 1988788535, - 1991021879, 1991643278, - 2000125224, 2001210183, - 2001634459, 2001710299, - 2001898808, 2004957380, - 2006516551, 2007064812, - 2008408414, 2009141482, - 2015950026, 2016910397, - 2023146024, 2024763702, - 2026975253, 2065170434, - 2075005220, 2083520514, + 52488851, + 55077603, + 59825747, + 68157441, + 878182402, + 901775362, + 1037879561, + 1680159327, + 1680165437, + 1680165692, + 1680198203, + 1680231247, + 1680315086, + 1680345965, 
+ 1680413393, + 1680452349, + 1681879063, + 1683805446, + 1686731997, + 1689048326, + 1689839946, + 1699185409, + 1714763319, + 1721189160, + 1723336432, + 1733874289, + 1736416327, + 1739927860, + 1740096054, + 1740185423, + 1747299630, + 1747792072, + 1747939528, + 1748552744, + 1749027145, + 1749856356, + 1752985897, + 1754214628, + 1754606246, + 1754645079, + 1754792749, + 1754858317, + 1754907227, + 1756190926, + 1756471625, + 1756804936, + 1757053236, + 1767875272, + 1776114564, + 1782518297, + 1785174319, + 1786821704, + 1788254870, + 1791070327, + 1804036350, + 1804235064, + 1805715716, + 1814656326, + 1816144023, + 1820928104, + 1823580230, + 1824377064, + 1853862084, + 1854464212, + 1854497003, + 1865910347, + 1867620412, + 1873590471, + 1874698443, + 1884142379, + 1884343396, + 1891186903, + 1898428101, + 1903612236, + 1905628916, + 1906408542, + 1906423097, + 1908462185, + 1910441627, + 1910503637, + 1915025672, + 1915394254, + 1916278099, + 1917327080, + 1921894426, + 1922413292, + 1922567078, + 1922671417, + 1922699851, + 1924462384, + 1924585254, + 1932870919, + 1933145837, + 1934917372, + 1937777860, + 1941409583, + 1941454586, + 1965349396, + 1966439670, + 1972196486, + 1972863609, + 1972909592, + 1974849131, + 1982640164, + 1983347764, + 1983461061, + 1988132214, + 1990062797, + 1991392548, + 1999273799, + 2000162011, + 2001578182, + 2001669450, + 2001814704, + 2004199576, + 2005925890, + 2007019632, + 2008084807, + 2009071951, + 2010452700, + 2016787611, + 2018908874, + 2024616088, + 2026741958, + 2060302634, + 2066743298, + 2081947650, 2091784484, 50917059, 52489043, @@ -2021,7 +2025,8 @@ public final class AttributeName 1739914974, 1739962169, 1740045862, - 1740119884, + 1740109544, + 1740130375, 1740222216, 1747295467, 1747309881, @@ -2217,5 +2222,6 @@ public final class AttributeName 2073034754, 2081423362, 2082471938, + 2089811970, }; } ===================================== src/nu/validator/htmlparser/impl/ElementName.java 
===================================== @@ -1424,7 +1424,11 @@ TreeBuilder.OTHER); public static final ElementName SELECT = new ElementName("select", "select", // CPPONLY: NS_NewHTMLSelectElement, // CPPONLY: NS_NewSVGUnknownElement, -TreeBuilder.SELECT | SPECIAL); +TreeBuilder.SELECT | SPECIAL | SCOPING); +public static final ElementName SELECTEDCONTENT = new ElementName("selectedcontent", "selectedcontent", +// CPPONLY: NS_NewHTMLElement, +// CPPONLY: NS_NewSVGUnknownElement, +TreeBuilder.OTHER); public static final ElementName SLOT = new ElementName("slot", "slot", // CPPONLY: NS_NewHTMLSlotElement, // CPPONLY: NS_NewSVGUnknownElement, @@ -1484,18 +1488,18 @@ TreeBuilder.TBODY_OR_THEAD_OR_TFOOT | SPECIAL | FOSTER_PARENTING | OPTIONAL_END_ private final static @NoLength ElementName[] ELEMENT_NAMES = { FIGCAPTION, CITE, -FRAMESET, +FEOFFSET, H1, CLIPPATH, METER, -RADIALGRADIENT, +SELECT, B, BGSOUND, SOURCE, DL, RP, -NOFRAMES, -MTEXT, +PROGRESS, +NOSCRIPT, VIEW, DIV, G, @@ -1507,10 +1511,10 @@ TEXTPATH, ANIMATETRANSFORM, SECTION, HR, -CANVAS, -BASEFONT, -FEDISTANTLIGHT, -OUTPUT, +DEFS, +DATALIST, +FONT, +PLAINTEXT, TFOOT, FEMORPHOLOGY, COL, @@ -1533,14 +1537,14 @@ OPTION, VIDEO, BR, FOOTER, -TR, -DETAILS, -DT, -FOREIGNOBJECT, -FESPOTLIGHT, -INPUT, -RT, -TT, +ADDRESS, +MS, +APPLET, +FIELDSET, +FEPOINTLIGHT, +LINEARGRADIENT, +OBJECT, +RECT, SLOT, MENU, FECONVOLVEMATRIX, @@ -1585,23 +1589,23 @@ SAMP, ANIMATECOLOR, FECOMPONENTTRANSFER, HEADER, -NOBR, -ADDRESS, -DEFS, -MS, -PROGRESS, -APPLET, -DATALIST, -FIELDSET, -FEOFFSET, -FEPOINTLIGHT, -FONT, -LINEARGRADIENT, -NOSCRIPT, -OBJECT, -PLAINTEXT, -RECT, -SELECT, +TR, +CANVAS, +DETAILS, +NOFRAMES, +DT, +BASEFONT, +FOREIGNOBJECT, +FRAMESET, +FESPOTLIGHT, +FEDISTANTLIGHT, +INPUT, +MTEXT, +RT, +OUTPUT, +TT, +RADIALGRADIENT, +SELECTEDCONTENT, SCRIPT, TEXT, FEDROPSHADOW, @@ -1689,22 +1693,23 @@ FEFUNCR, FILTER, FEGAUSSIANBLUR, MARKER, +NOBR, }; private final static int[] ELEMENT_HASHES = { 1900845386, 1748359220, -2001349720, 
+2001349736, 876609538, 1798686984, 1971465813, -2007781534, +2008125638, 59768833, 1730965751, 1756474198, 1864368130, 1938817026, -1988763672, -2005324101, +1990037800, +2005719336, 2060065124, 52490899, 62390273, @@ -1716,10 +1721,10 @@ private final static int[] ELEMENT_HASHES = { 1881498736, 1907661127, 1967128578, -1982935782, -1999397992, -2001392798, -2006329158, +1983533124, +2000525512, +2001495140, +2006896969, 2008851557, 2085266636, 51961587, @@ -1742,14 +1747,14 @@ private final static int[] ELEMENT_HASHES = { 1925844629, 1963982850, 1967795958, -1973420034, -1983633431, -1998585858, -2001309869, -2001392795, -2003183333, -2005925890, -2006974466, +1982173479, +1986527234, +1998724870, +2001349704, +2001392796, +2004635806, +2006028454, +2007601444, 2008325940, 2021937364, 2068523856, @@ -1794,23 +1799,23 @@ private final static int[] ELEMENT_HASHES = { 1965334268, 1967788867, 1968836118, -1971938532, -1982173479, -1983533124, -1986527234, -1990037800, -1998724870, -2000525512, -2001349704, -2001349736, -2001392796, -2001495140, -2004635806, -2005719336, -2006028454, -2006896969, -2007601444, -2008125638, +1973420034, +1982935782, +1983633431, +1988763672, +1998585858, +1999397992, +2001309869, +2001349720, +2001392795, +2001392798, +2003183333, +2005324101, +2005925890, +2006329158, +2006974466, +2007781534, +2008305999, 2008340774, 2008994116, 2051837468, @@ -1898,5 +1903,6 @@ private final static int[] ELEMENT_HASHES = { 1967795910, 1968053806, 1971461414, +1971938532, }; } ===================================== src/nu/validator/htmlparser/impl/Portability.java ===================================== @@ -31,6 +31,7 @@ import nu.validator.htmlparser.common.Interner; public final class Portability { + // [NOCPP[ public static int checkedAdd(int a, int b) throws SAXException { // This can't be translated code, because in C++ signed integer overflow is UB, so the below code would be wrong. 
assert a >= 0; @@ -41,6 +42,7 @@ public final class Portability { } return sum; } + // ]NOCPP] // Allocating methods ===================================== src/nu/validator/htmlparser/impl/Tokenizer.java ===================================== @@ -932,7 +932,7 @@ public class Tokenizer implements Locator, Locator2 { // ]NOCPP] - HtmlAttributes emptyAttributes() { + @Inline HtmlAttributes emptyAttributes() { // [NOCPP[ if (newAttributesEachTime) { return new HtmlAttributes(mappingLangToXmlLang); @@ -944,7 +944,7 @@ public class Tokenizer implements Locator, Locator2 { // ]NOCPP] } - @Inline private void appendCharRefBuf(char c) { + private void appendCharRefBuf(char c) { // CPPONLY: assert charRefBufLen < charRefBuf.length: // CPPONLY: "RELEASE: Attempted to overrun charRefBuf!"; charRefBuf[charRefBufLen++] = c; @@ -983,11 +983,8 @@ public class Tokenizer implements Locator, Locator2 { * the UTF-16 code unit to append */ @Inline private void appendStrBuf(char c) { - // CPPONLY: assert strBufLen < strBuf.length: "Previous buffer length insufficient."; // CPPONLY: if (strBufLen == strBuf.length) { - // CPPONLY: if (!EnsureBufferSpace(1)) { - // CPPONLY: assert false: "RELEASE: Unable to recover from buffer reallocation failure"; - // CPPONLY: } // TODO: Add telemetry when outer if fires but inner does not + // CPPONLY: EnsureBufferSpaceShouldNeverHappen(1); // CPPONLY: } strBuf[strBufLen++] = c; } @@ -1000,9 +997,22 @@ public class Tokenizer implements Locator, Locator2 { * * @return the buffer as a string */ - protected String strBufToString() { + @Inline protected String strBufToString() { + // CPPONLY: String digitAtom = TryAtomizeForSingleDigit(); + // CPPONLY: if (digitAtom) { + // CPPONLY: return digitAtom; + // CPPONLY: } + // CPPONLY: + // CPPONLY: boolean maybeAtomize = false; + // CPPONLY: if (!newAttributesEachTime) { + // CPPONLY: if (attributeName == AttributeName.CLASS || + // CPPONLY: attributeName == AttributeName.TYPE) { + // CPPONLY: maybeAtomize = 
true; + // CPPONLY: } + // CPPONLY: } + // CPPONLY: String str = Portability.newStringFromBuffer(strBuf, 0, strBufLen - // CPPONLY: , tokenHandler, !newAttributesEachTime && attributeName == AttributeName.CLASS + // CPPONLY: , tokenHandler, maybeAtomize ); clearStrBufAfterUse(); return str; @@ -1014,7 +1024,7 @@ public class Tokenizer implements Locator, Locator2 { * * @return the buffer as local name */ - private void strBufToDoctypeName() { + @Inline private void strBufToDoctypeName() { doctypeName = Portability.newLocalNameFromBuffer(strBuf, strBufLen, interner); clearStrBufAfterUse(); } @@ -1025,7 +1035,7 @@ public class Tokenizer implements Locator, Locator2 { * @throws SAXException * if the token handler threw */ - private void emitStrBuf() throws SAXException { + @Inline private void emitStrBuf() throws SAXException { if (strBufLen > 0) { tokenHandler.characters(strBuf, 0, strBufLen); clearStrBufAfterUse(); @@ -1094,13 +1104,12 @@ public class Tokenizer implements Locator, Locator2 { // ]NOCPP] } - private void appendStrBuf(@NoLength char[] buffer, int offset, int length) throws SAXException { - int newLen = Portability.checkedAdd(strBufLen, length); - // CPPONLY: assert newLen <= strBuf.length: "Previous buffer length insufficient."; + @Inline private void appendStrBuf(@NoLength char[] buffer, int offset, int length) throws SAXException { + // Years of crash stats have shown that the this addition doesn't overflow, as it logically + // shouldn't. 
+ int newLen = strBufLen + length; // CPPONLY: if (strBuf.length < newLen) { - // CPPONLY: if (!EnsureBufferSpace(length)) { - // CPPONLY: assert false: "RELEASE: Unable to recover from buffer reallocation failure"; - // CPPONLY: } // TODO: Add telemetry when outer if fires but inner does not + // CPPONLY: EnsureBufferSpaceShouldNeverHappen(length); // CPPONLY: } System.arraycopy(buffer, offset, strBuf, strBufLen, length); strBufLen = newLen; @@ -1455,12 +1464,6 @@ public class Tokenizer implements Locator, Locator2 { */ int pos = start - 1; - /** - * The index of the first char in buf that is - * part of a coalesced run of character tokens or - * Integer.MAX_VALUE if there is not a current run being - * coalesced. - */ switch (state) { case DATA: case RCDATA: @@ -1486,19 +1489,24 @@ public class Tokenizer implements Locator, Locator2 { break; } - /** - * The number of chars in buf that have - * meaning. (The rest of the array is garbage and should not be - * examined.) - */ // CPPONLY: if (mViewSource) { // CPPONLY: mViewSource.SetBuffer(buffer); - // CPPONLY: pos = stateLoop(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); + // CPPONLY: if (htmlaccelEnabled()) { + // CPPONLY: pos = StateLoopViewSourceSIMD(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); + // CPPONLY: } else { + // CPPONLY: pos = StateLoopViewSourceALU(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); + // CPPONLY: } // CPPONLY: mViewSource.DropBuffer((pos == buffer.getEnd()) ? 
pos : pos + 1); // CPPONLY: } else if (tokenHandler.WantsLineAndColumn()) { - // CPPONLY: pos = stateLoop(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); + // CPPONLY: if (htmlaccelEnabled()) { + // CPPONLY: pos = StateLoopLineColSIMD(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); + // CPPONLY: } else { + // CPPONLY: pos = StateLoopLineColALU(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); + // CPPONLY: } + // CPPONLY: } else if (htmlaccelEnabled()) { + // CPPONLY: pos = StateLoopFastestSIMD(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); // CPPONLY: } else { - // CPPONLY: pos = stateLoop(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); + // CPPONLY: pos = StateLoopFastestALU(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); // CPPONLY: } // [NOCPP[ pos = stateLoop(state, c, pos, buffer.getBuffer(), false, returnState, @@ -1547,7 +1555,7 @@ public class Tokenizer implements Locator, Locator2 { } // ]NOCPP] - @SuppressWarnings("unused") private int stateLoop(int state, char c, + @SuppressWarnings("unused") @Inline private int stateLoop(int state, char c, int pos, @NoLength char[] buf, boolean reconsume, int returnState, int endPos) throws SAXException { boolean reportedConsecutiveHyphens = false; @@ -1626,7 +1634,11 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. 
+ // CPPONLY: pos += accelerateAdvancementData(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -2201,7 +2213,11 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. + // CPPONLY: pos += accelerateAdvancementAttributeValueDoubleQuoted(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -2698,7 +2714,11 @@ public class Tokenizer implements Locator, Locator2 { // CPPONLY: MOZ_FALLTHROUGH; case COMMENT: commentloop: for (;;) { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. + // CPPONLY: pos += accelerateAdvancementComment(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -3194,7 +3214,11 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. + // CPPONLY: pos += accelerateAdvancementCdataSection(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -3281,7 +3305,11 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. 
+ // CPPONLY: pos += accelerateAdvancementAttributeValueSingleQuoted(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -3893,7 +3921,11 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. + // CPPONLY: pos += accelerateAdvancementPlaintext(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -4005,7 +4037,12 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. + // RCDATA and DATA have the same set of characters that they are indifferent to, hence accelerateData. + // CPPONLY: pos += accelerateAdvancementData(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -4056,7 +4093,11 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. + // CPPONLY: pos += accelerateAdvancementRawtext(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -4340,7 +4381,12 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. 
+ // The line below advances pos by some number of code units that this state is indifferent to. + // Using `accelerateAdvancementRawtext`, because this states has the same characters of interest as RAWTEXT. + // CPPONLY: pos += accelerateAdvancementRawtext(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -4536,7 +4582,11 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. + // CPPONLY: pos += accelerateAdvancementScriptDataEscaped(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -6348,24 +6398,24 @@ public class Tokenizer implements Locator, Locator2 { forceQuirks = false; } - private void adjustDoubleHyphenAndAppendToStrBufCarriageReturn() + @Inline private void adjustDoubleHyphenAndAppendToStrBufCarriageReturn() throws SAXException { silentCarriageReturn(); adjustDoubleHyphenAndAppendToStrBufAndErr('\n', false); } - private void adjustDoubleHyphenAndAppendToStrBufLineFeed() + @Inline private void adjustDoubleHyphenAndAppendToStrBufLineFeed() throws SAXException { silentLineFeed(); adjustDoubleHyphenAndAppendToStrBufAndErr('\n', false); } - private void appendStrBufLineFeed() { + @Inline private void appendStrBufLineFeed() { silentLineFeed(); appendStrBuf('\n'); } - private void appendStrBufCarriageReturn() { + @Inline private void appendStrBufCarriageReturn() { silentCarriageReturn(); appendStrBuf('\n'); } @@ -6383,7 +6433,7 @@ public class Tokenizer implements Locator, Locator2 { // ]NOCPP] - private void emitCarriageReturn(@NoLength char[] buf, int pos) + @Inline private void emitCarriageReturn(@NoLength char[] buf, int pos) throws SAXException { silentCarriageReturn(); flushChars(buf, pos); @@ -6412,7 +6462,7 @@ public 
class Tokenizer implements Locator, Locator2 { cstart = pos + 1; } - private void setAdditionalAndRememberAmpersandLocation(char add) { + @Inline private void setAdditionalAndRememberAmpersandLocation(char add) { additional = add; // [NOCPP[ ampersandLocation = new LocatorImpl(this); @@ -7077,7 +7127,7 @@ public class Tokenizer implements Locator, Locator2 { * happened in a non-text context, this method turns that deferred suspension * request into an immediately-pending suspension request. */ - private void suspendIfRequestedAfterCurrentNonTextToken() { + @Inline private void suspendIfRequestedAfterCurrentNonTextToken() { if (suspendAfterCurrentNonTextToken) { suspendAfterCurrentNonTextToken = false; shouldSuspend = true; @@ -7221,7 +7271,7 @@ public class Tokenizer implements Locator, Locator2 { * @param val * @throws SAXException */ - private void emitOrAppendTwo(@Const @NoLength char[] val, int returnState) + @Inline private void emitOrAppendTwo(@Const @NoLength char[] val, int returnState) throws SAXException { if ((returnState & DATA_AND_RCDATA_MASK) != 0) { appendStrBuf(val[0]); @@ -7231,7 +7281,7 @@ public class Tokenizer implements Locator, Locator2 { } } - private void emitOrAppendOne(@Const @NoLength char[] val, int returnState) + @Inline private void emitOrAppendOne(@Const @NoLength char[] val, int returnState) throws SAXException { if ((returnState & DATA_AND_RCDATA_MASK) != 0) { appendStrBuf(val[0]); @@ -7268,7 +7318,7 @@ public class Tokenizer implements Locator, Locator2 { } } - public void requestSuspension() { + @Inline public void requestSuspension() { shouldSuspend = true; } @@ -7311,7 +7361,7 @@ public class Tokenizer implements Locator, Locator2 { // ]NOCPP] - public boolean isInDataState() { + @Inline public boolean isInDataState() { return (stateSave == DATA); } ===================================== src/nu/validator/htmlparser/impl/TreeBuilder.java ===================================== @@ -226,12 +226,6 @@ public abstract class TreeBuilder 
implements TokenHandler, // no fall-through - private static final int IN_SELECT_IN_TABLE = 10; - - private static final int IN_SELECT = 11; - - // no fall-through - private static final int AFTER_BODY = 12; // no fall-through @@ -952,9 +946,6 @@ public abstract class TreeBuilder implements TokenHandler, * current node. */ break charactersloop; - case IN_SELECT: - case IN_SELECT_IN_TABLE: - break charactersloop; case IN_TABLE: case IN_TABLE_BODY: case IN_ROW: @@ -1166,9 +1157,6 @@ public abstract class TreeBuilder implements TokenHandler, mode = IN_TABLE; i--; continue; - case IN_SELECT: - case IN_SELECT_IN_TABLE: - break charactersloop; case AFTER_BODY: errNonSpaceAfterBody(); fatal(); @@ -1334,8 +1322,6 @@ public abstract class TreeBuilder implements TokenHandler, case IN_TABLE_BODY: case IN_ROW: case IN_TABLE: - case IN_SELECT_IN_TABLE: - case IN_SELECT: case IN_COLUMN_GROUP: case FRAMESET_OK: case IN_CAPTION: @@ -1531,12 +1517,19 @@ public abstract class TreeBuilder implements TokenHandler, if (!(group == FONT && !(attributes.contains(AttributeName.COLOR) || attributes.contains(AttributeName.FACE) || attributes.contains(AttributeName.SIZE)))) { errHtmlStartTagInForeignContext(name); - if (!fragment) { - while (!isSpecialParentInForeign(stack[currentPtr])) { - popForeign(-1, -1); - } + // Pop until we reach an HTML namespace element, + // HTML integration point, or MathML text integration point. + // In fragment case, stop before popping the context element. 
+ while (currentPtr > 0 && !isSpecialParentInForeign(stack[currentPtr])) { + popForeign(-1, -1); + } + if (currentPtr > 0 || isSpecialParentInForeign(stack[currentPtr])) { + // Popped to an HTML element or integration point continue starttagloop; - } // else fall thru + } + // In fragment case with foreign context, fall through + // to let switch(mode) handle the token in HTML namespace + break; } // CPPONLY: MOZ_FALLTHROUGH; default: @@ -2163,6 +2156,25 @@ public abstract class TreeBuilder implements TokenHandler, break starttagloop; case HR: implicitlyCloseP(); + // https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody + // "A start tag whose tag name is "hr"" + // "If the stack of open elements has a select element in scope:" + if (findLastInScope("select") != TreeBuilder.NOT_FOUND_ON_STACK) { + // "1. Generate implied end tags." + generateImpliedEndTags(); + // "2. If the stack of open elements has an option element + // in scope or has an optgroup element in scope, then + // this is a parse error." + if (errorHandler != null + && (findLastInScope("option") != TreeBuilder.NOT_FOUND_ON_STACK + || findLastInScope("optgroup") != TreeBuilder.NOT_FOUND_ON_STACK)) { + errUnclosedElementsImplied( + findLastInScope("option") != TreeBuilder.NOT_FOUND_ON_STACK + ? 
findLastInScope("option") + : findLastInScope("optgroup"), + name); + } + } appendVoidElementToCurrentMayFoster( elementName, attributes); @@ -2177,7 +2189,31 @@ public abstract class TreeBuilder implements TokenHandler, elementName = ElementName.IMG; continue starttagloop; case IMG: + reconstructTheActiveFormattingElements(); + appendVoidElementToCurrentMayFoster( + elementName, attributes, + formPointer); + selfClosing = false; + // [NOCPP[ + voidElement = true; + // ]NOCPP] + attributes = null; // CPP + break starttagloop; case INPUT: + // https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody + // "A start tag whose tag name is "input"" + // "If the stack of open elements has a select element + // in scope:" + eltPos = findLastInScope("select"); + if (eltPos != TreeBuilder.NOT_FOUND_ON_STACK) { + // "Parse error." + errStartTagWithSelectOpen(name); + // "Pop elements until a select element has been popped." + while (currentPtr >= eltPos) { + pop(); + } + continue starttagloop; + } reconstructTheActiveFormattingElements(); appendVoidElementToCurrentMayFoster( elementName, attributes, @@ -2228,31 +2264,100 @@ public abstract class TreeBuilder implements TokenHandler, attributes = null; // CPP break starttagloop; case SELECT: + // https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody + // "A start tag whose tag name is "select"" + // "If the parser was created as part of the HTML fragment + // parsing algorithm and the context element is a select + // element:" + if (fragment && "select" == contextName) { + // "Parse error. Ignore the token." + errStartSelectWhereEndSelectExpected(); + break starttagloop; + } + // "Otherwise, if the stack of open elements has a select + // element in scope:" + eltPos = findLastInScope(name); + if (eltPos != TreeBuilder.NOT_FOUND_ON_STACK) { + // "Parse error." + errStartSelectWhereEndSelectExpected(); + // "Pop elements until a select element has been popped." 
+ while (currentPtr >= eltPos) { + pop(); + } + break starttagloop; + } + // "Otherwise:" + // "Reconstruct the active formatting elements, if any." reconstructTheActiveFormattingElements(); + // "Insert an HTML element for the token." appendToCurrentNodeAndPushElementMayFoster( elementName, attributes, formPointer); - switch (mode) { - case IN_TABLE: - case IN_CAPTION: - case IN_COLUMN_GROUP: - case IN_TABLE_BODY: - case IN_ROW: - case IN_CELL: - mode = IN_SELECT_IN_TABLE; - break; - default: - mode = IN_SELECT; - break; + // "Set the frameset-ok flag to "not ok"." + framesetOk = false; + attributes = null; // CPP + break starttagloop; + case OPTION: + // https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody + // "A start tag whose tag name is "option"" + // "If the stack of open elements has a select element in scope:" + if (findLastInScope("select") != TreeBuilder.NOT_FOUND_ON_STACK) { + // "1. Generate implied end tags except for optgroup elements." + generateImpliedEndTagsExceptFor("optgroup"); + // "2. If the stack of open elements has an option element + // in scope, then this is a parse error." + if (errorHandler != null) { + int optionPos = findLastInScope("option"); + if (optionPos != TreeBuilder.NOT_FOUND_ON_STACK) { + errUnclosedElementsImplied(optionPos, name); + } + } + } else { + // "Otherwise, if the current node is an option element, + // then pop the current node from the stack of open elements." + if (isCurrent("option")) { + pop(); + } } + // "Reconstruct the active formatting elements, if any." + reconstructTheActiveFormattingElements(); + // "Insert an HTML element for the token." 
+ appendToCurrentNodeAndPushElementMayFoster( + elementName, + attributes); attributes = null; // CPP break starttagloop; case OPTGROUP: - case OPTION: - if (isCurrent("option")) { - pop(); + // https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody + // "A start tag whose tag name is "optgroup"" + // "If the stack of open elements has a select element in scope:" + if (findLastInScope("select") != TreeBuilder.NOT_FOUND_ON_STACK) { + // "1. Generate implied end tags." + generateImpliedEndTags(); + // "2. If the stack of open elements has an option element + // in scope or has an optgroup element in scope, then + // this is a parse error." + if (errorHandler != null) { + int optionPos = findLastInScope("option"); + if (optionPos != TreeBuilder.NOT_FOUND_ON_STACK) { + errUnclosedElementsImplied(optionPos, name); + } else { + int optgroupPos = findLastInScope("optgroup"); + if (optgroupPos != TreeBuilder.NOT_FOUND_ON_STACK) { + errUnclosedElementsImplied(optgroupPos, name); + } + } + } + } else { + // "Otherwise, if the current node is an option element, + // then pop the current node from the stack of open elements." + if (isCurrent("option")) { + pop(); + } } + // "Reconstruct the active formatting elements, if any." reconstructTheActiveFormattingElements(); + // "Insert an HTML element for the token." 
appendToCurrentNodeAndPushElementMayFoster( elementName, attributes); @@ -2322,14 +2427,18 @@ public abstract class TreeBuilder implements TokenHandler, attributes = null; // CPP break starttagloop; case CAPTION: - case COL: - case COLGROUP: case TBODY_OR_THEAD_OR_TFOOT: case TR: case TD_OR_TH: + case COL: + case COLGROUP: case FRAME: case FRAMESET: case HEAD: + // https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody + // "A start tag whose tag name is one of: "caption", "col", + // "colgroup", "frame", "frameset", "head", "tbody", "td", + // "tfoot", "th", "thead", "tr"" errStrayStartTag(name); break starttagloop; case OUTPUT: @@ -2507,111 +2616,6 @@ public abstract class TreeBuilder implements TokenHandler, mode = IN_TABLE; continue; } - case IN_SELECT_IN_TABLE: - switch (group) { - case CAPTION: - case TBODY_OR_THEAD_OR_TFOOT: - case TR: - case TD_OR_TH: - case TABLE: - errStartTagWithSelectOpen(name); - eltPos = findLastInTableScope("select"); - if (eltPos == TreeBuilder.NOT_FOUND_ON_STACK) { - assert fragment; - break starttagloop; // http://www.w3.org/Bugs/Public/show_bug.cgi?id=8375 - } - while (currentPtr >= eltPos) { - pop(); - } - resetTheInsertionMode(); - continue; - default: - // fall through to IN_SELECT - } - // CPPONLY: MOZ_FALLTHROUGH; - case IN_SELECT: - switch (group) { - case HTML: - errStrayStartTag(name); - if (!fragment) { - addAttributesToHtml(attributes); - attributes = null; // CPP - } - break starttagloop; - case OPTION: - if (isCurrent("option")) { - pop(); - } - appendToCurrentNodeAndPushElement( - elementName, - attributes); - attributes = null; // CPP - break starttagloop; - case OPTGROUP: - if (isCurrent("option")) { - pop(); - } - if (isCurrent("optgroup")) { - pop(); - } - appendToCurrentNodeAndPushElement( - elementName, - attributes); - attributes = null; // CPP - break starttagloop; - case SELECT: - errStartSelectWhereEndSelectExpected(); - eltPos = findLastInTableScope(name); - if (eltPos == 
TreeBuilder.NOT_FOUND_ON_STACK) { - assert fragment; - errNoSelectInTableScope(); - break starttagloop; - } else { - while (currentPtr >= eltPos) { - pop(); - } - resetTheInsertionMode(); - break starttagloop; - } - case INPUT: - case TEXTAREA: - errStartTagWithSelectOpen(name); - eltPos = findLastInTableScope("select"); - if (eltPos == TreeBuilder.NOT_FOUND_ON_STACK) { - assert fragment; - break starttagloop; - } - while (currentPtr >= eltPos) { - pop(); - } - resetTheInsertionMode(); - continue; - case SCRIPT: - startTagScriptInHead(elementName, attributes); - attributes = null; // CPP - break starttagloop; - case TEMPLATE: - startTagTemplateInHead(elementName, attributes); - attributes = null; // CPP - break starttagloop; - case HR: - if (isCurrent("option")) { - pop(); - } - if (isCurrent("optgroup")) { - pop(); - } - appendVoidElementToCurrent(elementName, attributes); - selfClosing = false; - // [NOCPP[ - voidElement = true; - // ]NOCPP] - attributes = null; // CPP - break starttagloop; - default: - errStrayStartTag(name); - break starttagloop; - } case AFTER_BODY: switch (group) { case HTML: @@ -2992,9 +2996,11 @@ public abstract class TreeBuilder implements TokenHandler, boolean shadowRootIsClonable = attributes.contains(AttributeName.SHADOWROOTCLONABLE); boolean shadowRootIsSerializable = attributes.contains(AttributeName.SHADOWROOTSERIALIZABLE); boolean shadowRootDelegatesFocus = attributes.contains(AttributeName.SHADOWROOTDELEGATESFOCUS); + boolean shadowRootCustomElementRegistry = attributes.contains(AttributeName.SHADOWROOTCUSTOMELEMENTREGISTRY); String shadowRootReferenceTarget = attributes.getValue(AttributeName.SHADOWROOTREFERENCETARGET); + String shadowRootSlotAssignment = attributes.getValue(AttributeName.SHADOWROOTSLOTASSIGNMENT); - return getShadowRootFromHost(currentNode, templateNode, shadowRootMode, shadowRootIsClonable, shadowRootIsSerializable, shadowRootDelegatesFocus, shadowRootReferenceTarget); + return getShadowRootFromHost(currentNode, 
templateNode, shadowRootMode, shadowRootIsClonable, shadowRootIsSerializable, shadowRootDelegatesFocus, shadowRootCustomElementRegistry, shadowRootSlotAssignment, shadowRootReferenceTarget); } /** @@ -3220,6 +3226,11 @@ public abstract class TreeBuilder implements TokenHandler, for (;;) { if (eltPos == 0) { assert fragment: "We can get this close to the root of the stack in foreign content only in the fragment case."; + // For </p> and </br>
, continue to mode handling + // which will create implied start tags + if (group == P || group == BR) { + break; // break from inner loop, continue to switch(mode) + } break endtagloop; } if (stack[eltPos].name == name) { @@ -3513,7 +3524,11 @@ public abstract class TreeBuilder implements TokenHandler, case PRE_OR_LISTING: case FIELDSET: case BUTTON: + case SELECT: case ADDRESS_OR_ARTICLE_OR_ASIDE_OR_DETAILS_OR_DIALOG_OR_DIR_OR_FIGCAPTION_OR_FIGURE_OR_FOOTER_OR_HEADER_OR_HGROUP_OR_MAIN_OR_NAV_OR_SEARCH_OR_SECTION_OR_SUMMARY: + // https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody + // "An end tag whose tag name is one of: "address", "article", + // ..., "select", ..., "ul"" eltPos = findLastInScope(name); if (eltPos == TreeBuilder.NOT_FOUND_ON_STACK) { errStrayEndTag(name); @@ -3564,12 +3579,10 @@ public abstract class TreeBuilder implements TokenHandler, eltPos = findLastInButtonScope("p"); if (eltPos == TreeBuilder.NOT_FOUND_ON_STACK) { errNoElementToCloseButEndTagSeen("p"); - // XXX Can the 'in foreign' case happen anymore? if (isInForeign()) { errHtmlStartTagInForeignContext(name); - // Check for currentPtr for the fragment - // case. - while (currentPtr >= 0 && stack[currentPtr].ns != "http://www.w3.org/1999/xhtml") { + // Pop foreign elements, but keep context element in fragment case + while (currentPtr > 0 && stack[currentPtr].ns != "http://www.w3.org/1999/xhtml") { pop(); } } @@ -3650,11 +3663,9 @@ public abstract class TreeBuilder implements TokenHandler, case BR: errEndTagBr(); if (isInForeign()) { - // XXX can this happen anymore? errHtmlStartTagInForeignContext(name); - // Check for currentPtr for the fragment - // case. 
- while (currentPtr >= 0 && stack[currentPtr].ns != "http://www.w3.org/1999/xhtml") { + // Pop foreign elements, but keep context element in fragment case + while (currentPtr > 0 && stack[currentPtr].ns != "http://www.w3.org/1999/xhtml") { pop(); } } @@ -3677,7 +3688,6 @@ public abstract class TreeBuilder implements TokenHandler, case IFRAME: case NOEMBED: // XXX??? case NOFRAMES: // XXX?? - case SELECT: case TABLE: case TEXTAREA: // XXX?? errStrayEndTag(name); @@ -3787,72 +3797,6 @@ public abstract class TreeBuilder implements TokenHandler, mode = IN_TABLE; continue; } - case IN_SELECT_IN_TABLE: - switch (group) { - case CAPTION: - case TABLE: - case TBODY_OR_THEAD_OR_TFOOT: - case TR: - case TD_OR_TH: - errEndTagSeenWithSelectOpen(name); - if (findLastInTableScope(name) != TreeBuilder.NOT_FOUND_ON_STACK) { - eltPos = findLastInTableScope("select"); - if (eltPos == TreeBuilder.NOT_FOUND_ON_STACK) { - assert fragment; - break endtagloop; // http://www.w3.org/Bugs/Public/show_bug.cgi?id=8375 - } - while (currentPtr >= eltPos) { - pop(); - } - resetTheInsertionMode(); - continue; - } else { - break endtagloop; - } - default: - // fall through to IN_SELECT - } - // CPPONLY: MOZ_FALLTHROUGH; - case IN_SELECT: - switch (group) { - case OPTION: - if (isCurrent("option")) { - pop(); - break endtagloop; - } else { - errStrayEndTag(name); - break endtagloop; - } - case OPTGROUP: - if (isCurrent("option") - && "optgroup" == stack[currentPtr - 1].name) { - pop(); - } - if (isCurrent("optgroup")) { - pop(); - } else { - errStrayEndTag(name); - } - break endtagloop; - case SELECT: - eltPos = findLastInTableScope("select"); - if (eltPos == TreeBuilder.NOT_FOUND_ON_STACK) { - assert fragment; - errStrayEndTag(name); - break endtagloop; - } - while (currentPtr >= eltPos) { - pop(); - } - resetTheInsertionMode(); - break endtagloop; - case TEMPLATE: - endTagTemplateInHead(); - break endtagloop; - default: - errStrayEndTag(name); - break endtagloop; - } case AFTER_BODY: switch 
(group) { case HTML: @@ -4312,23 +4256,7 @@ public abstract class TreeBuilder implements TokenHandler, return; } } - if ("select" == name) { - int ancestorIndex = i; - while (ancestorIndex > 0) { - StackNode ancestor = stack[ancestorIndex--]; - if ("http://www.w3.org/1999/xhtml" == ancestor.ns) { - if ("template" == ancestor.name) { - break; - } - if ("table" == ancestor.name) { - mode = IN_SELECT_IN_TABLE; - return; - } - } - } - mode = IN_SELECT; - return; - } else if ("td" == name || "th" == name) { + if ("td" == name || "th" == name) { mode = IN_CELL; return; } else if ("tr" == name) { @@ -5089,6 +5017,9 @@ public abstract class TreeBuilder implements TokenHandler, private void pop() throws SAXException { StackNode node = stack[currentPtr]; assert debugOnlyClearLastStackSlot(); + if (node.getGroup() == OPTION) { + optionElementPopped(node.node); + } currentPtr--; elementPopped(node.ns, node.popName, node.node); node.release(this); @@ -5100,6 +5031,9 @@ public abstract class TreeBuilder implements TokenHandler, markMalformedIfScript(node.node); } assert debugOnlyClearLastStackSlot(); + if (node.getGroup() == OPTION) { + optionElementPopped(node.node); + } currentPtr--; elementPopped(node.ns, node.popName, node.node); node.release(this); @@ -5108,6 +5042,7 @@ public abstract class TreeBuilder implements TokenHandler, private void silentPop() throws SAXException { StackNode node = stack[currentPtr]; assert debugOnlyClearLastStackSlot(); + assert node.getGroup() != OPTION; currentPtr--; node.release(this); } @@ -5115,6 +5050,9 @@ public abstract class TreeBuilder implements TokenHandler, private void popOnEof() throws SAXException { StackNode node = stack[currentPtr]; assert debugOnlyClearLastStackSlot(); + if (node.getGroup() == OPTION) { + optionElementPopped(node.node); + } currentPtr--; markMalformedIfScript(node.node); elementPopped(node.ns, node.popName, node.node); @@ -5443,6 +5381,7 @@ public abstract class TreeBuilder implements TokenHandler, T 
getShadowRootFromHost(T host, T template, String shadowRootMode, boolean shadowRootIsClonable, boolean shadowRootIsSerializable, boolean shadowRootDelegatesFocus, + boolean shadowRootCustomElementRegistry, String shadowRootSlotAssignment, String shadowRootReferenceTarget) { return null; } @@ -5752,6 +5691,18 @@ public abstract class TreeBuilder implements TokenHandler, protected abstract void detachFromParent(T element) throws SAXException; + /** + * Called when an option element is popped from the stack. + * + * https://html.spec.whatwg.org/multipage/form-elements.html#maybe-clone-an-option-into-selectedcontent + * Implements "maybe clone an option into selectedcontent" for + * customizable select. Subclasses that support DOM operations + * should override this to perform the cloning. + */ + protected void optionElementPopped(T option) throws SAXException { + // Default: no-op (streaming/SAX mode ignores cloning) + } + protected abstract boolean hasChildren(T element) throws SAXException; protected abstract void appendElement(T child, T newParent) @@ -6499,10 +6450,6 @@ public abstract class TreeBuilder implements TokenHandler, } } - private void errNoSelectInTableScope() throws SAXException { - err("No \u201Cselect\u201D in table scope."); - } - private void errStartSelectWhereEndSelectExpected() throws SAXException { err("\u201Cselect\u201D start tag where end tag expected."); } @@ -6580,14 +6527,6 @@ public abstract class TreeBuilder implements TokenHandler, err("Saw an end tag after \u201Cbody\u201D had been closed."); } - private void errEndTagSeenWithSelectOpen(@Local String name) throws SAXException { - if (errorHandler == null) { - return; - } - errNoCheck("\u201C" + name - + "\u201D end tag with \u201Cselect\u201D open."); - } - private void errGarbageInColgroup() throws SAXException { err("Garbage in \u201Ccolgroup\u201D fragment."); } ===================================== src/nu/validator/htmlparser/sax/SAXTreeBuilder.java 
===================================== @@ -34,6 +34,7 @@ import nu.validator.saxtree.Document; import nu.validator.saxtree.DocumentFragment; import nu.validator.saxtree.Element; import nu.validator.saxtree.Node; +import nu.validator.saxtree.NodeType; import nu.validator.saxtree.ParentNode; class SAXTreeBuilder extends TreeBuilder { @@ -197,4 +198,140 @@ class SAXTreeBuilder extends TreeBuilder { throws SAXException { element.detach(); } + + @Override + // https://html.spec.whatwg.org/multipage/form-elements.html#maybe-clone-an-option-into-selectedcontent + // Implements "maybe clone an option into selectedcontent" + protected void optionElementPopped(Element option) throws SAXException { + // Find the nearest ancestor + Element selectedContent = findDescendant(select, "selectedcontent"); + if (selectedContent == null) { + return; + } + + // Check option selectedness + boolean hasSelected = option.getAttributes().getIndex("", "selected") >= 0; + if (!hasSelected && selectedContent.getFirstChild() != null) { + // Not the first option and no explicit selected attr + return; + } + + // Clear selectedcontent children and deep-clone option children + selectedContent.clearChildren(); + deepCloneChildren(option, selectedContent); + } + + private Element findAncestor(Element element, String localName) { + ParentNode parent = element.getParentNode(); + while (parent != null) { + if (parent.getNodeType() == NodeType.ELEMENT) { + Element elt = (Element) parent; + if (localName.equals(elt.getLocalName()) + && "http://www.w3.org/1999/xhtml".equals(elt.getUri())) { + return elt; + } + } + if (parent instanceof Node) { + parent = ((Node) parent).getParentNode(); + } else { + break; + } + } + return null; + } + + private Element findDescendant(Element root, String localName) { + Node current = root.getFirstChild(); + if (current == null) { + return null; + } + Node next; + for (;;) { + if (current.getNodeType() == NodeType.ELEMENT) { + Element elt = (Element) current; + if 
(localName.equals(elt.getLocalName()) + && "http://www.w3.org/1999/xhtml".equals( + elt.getUri())) { + return elt; + } + } + if ((next = current.getFirstChild()) != null) { + current = next; + continue; + } + for (;;) { + if (current.getParentNode() == root) { + if ((next = current.getNextSibling()) != null) { + current = next; + break; + } + return null; + } + if ((next = current.getNextSibling()) != null) { + current = next; + break; + } + current = (Node) current.getParentNode(); + } + } + } + + private void deepCloneChildren(Element source, Element destination) + throws SAXException { + Node current = source.getFirstChild(); + if (current == null) { + return; + } + ParentNode destParent = destination; + Node next; + outer: + for (;;) { + switch (current.getNodeType()) { + case ELEMENT: + Element srcElem = (Element) current; + Element cloneElem = new Element(null, + srcElem.getUri(), + srcElem.getLocalName(), + srcElem.getQName(), + srcElem.getAttributes(), + false, + srcElem.getPrefixMappings()); + destParent.appendChild(cloneElem); + if ((next = srcElem.getFirstChild()) != null) { + destParent = cloneElem; + current = next; + continue outer; + } + break; + case CHARACTERS: + Characters srcChars = (Characters) current; + char[] buf = srcChars.getBuffer(); + destParent.appendChild( + new Characters(null, buf, 0, buf.length)); + break; + default: + break; + } + for (;;) { + if ((next = current.getNextSibling()) != null) { + current = next; + break; + } + if (current.getParentNode() == source) { + return; + } + current = (Node) current.getParentNode(); + destParent = (ParentNode) destParent.getParentNode(); + } + } + } } ===================================== src/nu/validator/htmlparser/xom/XOMTreeBuilder.java ===================================== @@ -23,6 +23,8 @@ package nu.validator.htmlparser.xom; +import java.util.ArrayDeque; + import nu.validator.htmlparser.common.DocumentMode; import nu.validator.htmlparser.impl.CoalescingTreeBuilder; import 
nu.validator.htmlparser.impl.HtmlAttributes; @@ -348,4 +350,79 @@ class XOMTreeBuilder extends CoalescingTreeBuilder { cachedTableIndex = -1; cachedTable = null; } + + @Override + // https://html.spec.whatwg.org/multipage/form-elements.html#maybe-clone-an-option-into-selectedcontent + // Implements "maybe clone an option into selectedcontent" + protected void optionElementPopped(Element option) throws SAXException { + try { + // Find the nearest ancestor + Element selectedContent = findSelectedContent(select); + if (selectedContent == null) { + return; + } + + // Check option selectedness + boolean hasSelected = option.getAttribute("selected") != null; + if (!hasSelected && selectedContent.getChildCount() > 0) { + // Not the first option and no explicit selected attr + return; + } + + // Clear selectedcontent children and deep-clone option children + selectedContent.removeChildren(); + for (int i = 0; i < option.getChildCount(); i++) { + selectedContent.appendChild(option.getChild(i).copy()); + } + } catch (XMLException e) { + fatal(e); + } + } + + private Element findSelectedContent(Element root) { + ArrayDeque stack = new ArrayDeque<>(); + for (int i = root.getChildCount() - 1; i >= 0; i--) { + Node child = root.getChild(i); + if (child instanceof Element) { + stack.push((Element) child); + } + } + while (!stack.isEmpty()) { + Element current = stack.pop(); + if ("selectedcontent".equals(current.getLocalName()) + && "http://www.w3.org/1999/xhtml".equals( + current.getNamespaceURI())) { + return current; + } + for (int i = current.getChildCount() - 1; i >= 0; i--) { + Node child = current.getChild(i); + if (child instanceof Element) { + stack.push((Element) child); + } + } + } + return null; + } } ===================================== src/nu/validator/saxtree/CharBufferNode.java ===================================== @@ -50,6 +50,14 @@ public abstract class CharBufferNode extends Node { System.arraycopy(buf, start, buffer, 0, length); } + /** + * Returns the 
buffer. + * @return the buffer + */ + public char[] getBuffer() { + return buffer; + } + /** * Returns the wrapped buffer as a string. * ===================================== src/nu/validator/saxtree/ParentNode.java ===================================== @@ -202,7 +202,22 @@ public abstract class ParentNode extends Node { prev.setNextSibling(node.getNextSibling()); if (lastChild == node) { lastChild = prev; - } + } + } + } + + /** + * Remove all children from this node. + */ + public void clearChildren() { + Node child = firstChild; + while (child != null) { + Node next = child.getNextSibling(); + child.setParentNode(null); + child.setNextSibling(null); + child = next; } + firstChild = null; + lastChild = null; } } ===================================== translator-src/nu/validator/htmlparser/cpptranslate/CppTypes.java ===================================== @@ -81,8 +81,14 @@ public class CppTypes { reservedWords.add("unicode"); } + private static Map methodRenames = new HashMap(); + + static { + methodRenames.put("htmlaccelEnabled", "mozilla::htmlaccel::htmlaccelEnabled"); + } + private static final String[] TREE_BUILDER_INCLUDES = { "jArray", - "mozilla/ImportScanner", "mozilla/Likely", + "mozilla/ImportScanner", "nsAHtml5TreeBuilderState", "nsAtom", "nsContentUtils", "nsGkAtoms", "nsHtml5ArrayCopy", "nsHtml5AtomTable", "nsHtml5DocumentMode", "nsHtml5Highlighter", "nsHtml5OplessBuilder", "nsHtml5Parser", @@ -91,12 +97,12 @@ public class CppTypes { "nsHtml5TreeOpExecutor", "nsHtml5ViewSourceUtils", "nsIContent", "nsIContentHandle", "nsNameSpaceManager", "nsTraceRefcnt", }; - private static final String[] TOKENIZER_INCLUDES = { "jArray", + private static final String[] TOKENIZER_INCLUDES = { "jArray", "nsAHtml5TreeBuilderState", "nsAtom", "nsGkAtoms", "nsHtml5ArrayCopy", "nsHtml5AtomTable", "nsHtml5DocumentMode", "nsHtml5Highlighter", "nsHtml5Macros", "nsHtml5NamedCharacters", - "nsHtml5NamedCharactersAccel", "nsHtml5String", - "nsIContent", "nsTraceRefcnt" }; + 
"nsHtml5NamedCharactersAccel", "nsHtml5String", "nsHtml5TreeBuilder", + "nsIContent", "nsTraceRefcnt", "mozilla/htmlaccel/htmlaccelEnabled" }; private static final String[] STACK_NODE_INCLUDES = { "nsAtom", "nsHtml5AtomTable", "nsHtml5HtmlAttributes", "nsHtml5String", "nsNameSpaceManager", "nsIContent", @@ -359,6 +365,14 @@ public class CppTypes { return candidate; } + public String mapMethodName(String method) { + String mapped = methodRenames.get(method); + if (mapped == null) { + return method; + } + return mapped; + } + public String stringForLiteral(String literal) { return '"' + literal + '"'; } @@ -486,6 +500,10 @@ public class CppTypes { return "P::checkChar"; } + public String policyPrefix() { + return "P::"; + } + public String silentLineFeed() { return "P::silentLineFeed"; } @@ -537,8 +555,4 @@ public class CppTypes { public String crashMacro() { return "MOZ_CRASH"; } - - public String loopPolicyInclude() { - return "nsHtml5TokenizerLoopPolicies"; - } } ===================================== translator-src/nu/validator/htmlparser/cpptranslate/CppVisitor.java ===================================== @@ -220,7 +220,7 @@ public class CppVisitor extends AnnotationHelperVisitor { private boolean inConstructorBody = false; - private String currentMethod = null; + protected String currentMethod = null; private Set labels = null; @@ -439,16 +439,6 @@ public class CppVisitor extends AnnotationHelperVisitor { printer.print(className); printer.printLn(".h\""); printer.printLn(); - - if ("Tokenizer".equals(javaClassName)) { - String loopPolicyInclude = cppTypes.loopPolicyInclude(); - if (loopPolicyInclude != null) { - printer.print("#include \""); - printer.print(loopPolicyInclude); - printer.printLn(".h\""); - printer.printLn(); - } - } } public void visit(EmptyTypeDeclaration n, LocalSymbolTable arg) { @@ -1320,6 +1310,9 @@ public class CppVisitor extends AnnotationHelperVisitor { } else if ("checkChar".equals(n.getName()) && n.getScope() == null) { visitCheckChar(n, 
arg); + } else if (n.getName().startsWith("accelerateAdvancement") + && n.getScope() == null) { + visitAccelerateAdvancement(n, arg); } else if ("silentCarriageReturn".equals(n.getName()) && n.getScope() == null) { visitSilentCarriageReturn(n, arg); @@ -1402,7 +1395,7 @@ public class CppVisitor extends AnnotationHelperVisitor { } } printTypeArgs(n.getTypeArgs(), arg); - printer.print(n.getName()); + printer.print(cppTypes.mapMethodName(n.getName())); if ("stateLoop".equals(n.getName()) && "Tokenizer".equals(javaClassName) && cppTypes.stateLoopPolicies().length > 0) { @@ -1646,15 +1639,11 @@ public class CppVisitor extends AnnotationHelperVisitor { printModifiers(n.getModifiers()); } - if (cppTypes.requiresTemplateParameter(currentMethod) + if (!inHeader() && cppTypes.requiresTemplateParameter(currentMethod) && "Tokenizer".equals(javaClassName) && cppTypes.stateLoopPolicies().length > 0) { printer.print("template"); - if (inHeader()) { - printer.print(" "); - } else { - printer.printLn(); - } + printer.printLn(); } printTypeParameters(n.getTypeParameters(), arg); @@ -1956,6 +1945,23 @@ public class CppVisitor extends AnnotationHelperVisitor { printer.print(")"); } + private void visitAccelerateAdvancement(MethodCallExpr call, LocalSymbolTable arg) { + List args = call.getArgs(); + printer.print(cppTypes.policyPrefix()); + printer.print(call.getName()); + printer.print("(this, "); + if (call.getArgs() != null) { + for (Iterator i = call.getArgs().iterator(); i.hasNext();) { + Expression e = i.next(); + e.accept(this, arg); + if (i.hasNext()) { + printer.print(", "); + } + } + } + printer.print(")"); + } + private void visitSilentLineFeed(MethodCallExpr call, LocalSymbolTable arg) { printer.print(cppTypes.silentLineFeed()); printer.print("(this)"); ===================================== translator-src/nu/validator/htmlparser/cpptranslate/HVisitor.java ===================================== @@ -182,6 +182,12 @@ public class HVisitor extends CppVisitor { 
previousVisibility = Visibility.PUBLIC; } } + if (cppTypes.requiresTemplateParameter(currentMethod) + && "Tokenizer".equals(javaClassName) + && cppTypes.stateLoopPolicies().length > 0) { + printer.print("template"); + printer.printLn(); + } if (inline()) { printer.print("inline "); } View it on GitLab: https://salsa.debian.org/java-team/libhtml5parser-java/-/compare/01509c70330e6998604aade7075c1a82b040358e...3f5082d3f440f85d05174fb7130803ef6addab5e From gitlab at salsa.debian.org Fri May 1 13:49:16 2026 From: gitlab at salsa.debian.org (bastif (@bastif)) Date: Fri, 01 May 2026 12:49:16 +0000 Subject: [Git][java-team/libhtml5parser-java] Pushed new tag upstream/1.4+r20260416 Message-ID: <69f4a14c8a8e3_52ffdd7c4086b@godard.mail> bastif pushed new tag upstream/1.4+r20260416 at Debian Java Maintainers / libhtml5parser-java -- View it on GitLab: https://salsa.debian.org/java-team/libhtml5parser-java/-/tree/upstream/1.4+r20260416 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gitlab at salsa.debian.org Fri May 1 13:49:31 2026 From: gitlab at salsa.debian.org (bastif (@bastif)) Date: Fri, 01 May 2026 12:49:31 +0000 Subject: [Git][java-team/libhtml5parser-java][upstream] New upstream version 1.4+r20260416 Message-ID: <69f4a15ba0ca9_52ffde5841156@godard.mail> bastif pushed to branch upstream at Debian Java Maintainers / libhtml5parser-java Commits: 5c6cfbbf by Fab Stz at 2026-05-01T14:42:20+02:00 New upstream version 1.4+r20260416 - - - - - 17 changed files: - ? .github/dependabot.yml - .github/workflows/build.yml - + CONTRIBUTING.md - gwt-src/nu/validator/htmlparser/gwt/BrowserTreeBuilder.java - src/nu/validator/htmlparser/dom/DOMTreeBuilder.java - src/nu/validator/htmlparser/impl/AttributeName.java - src/nu/validator/htmlparser/impl/ElementName.java - src/nu/validator/htmlparser/impl/Portability.java - src/nu/validator/htmlparser/impl/Tokenizer.java - src/nu/validator/htmlparser/impl/TreeBuilder.java - src/nu/validator/htmlparser/sax/SAXTreeBuilder.java - src/nu/validator/htmlparser/xom/XOMTreeBuilder.java - src/nu/validator/saxtree/CharBufferNode.java - src/nu/validator/saxtree/ParentNode.java - translator-src/nu/validator/htmlparser/cpptranslate/CppTypes.java - translator-src/nu/validator/htmlparser/cpptranslate/CppVisitor.java - translator-src/nu/validator/htmlparser/cpptranslate/HVisitor.java Changes: ===================================== .github/dependabot.yml deleted ===================================== @@ -1,10 +0,0 @@ -version: 2 -updates: - - package-ecosystem: "github-actions" - directory: "/" - schedule: - interval: "weekly" - - package-ecosystem: "maven" - directory: "/" - schedule: - interval: "weekly" ===================================== .github/workflows/build.yml ===================================== @@ -11,7 +11,7 @@ jobs: runs-on: ${{ matrix.os }} strategy: matrix: - java: [24, 21, 17, 11.0.23] + java: [25, 21, 17, 11.0.23] os: [ubuntu-latest, macos-latest, windows-latest] name: Java ${{ matrix.java }} 
steps: ===================================== CONTRIBUTING.md ===================================== @@ -0,0 +1,96 @@ +# Contributing to htmlparser + +## Adding new elements + +When adding new elements to the parser, you must regenerate the element name hash tables in `src/nu/validator/htmlparser/impl/ElementName.java`. + +### Step 1: Add the new element constant + +Add a new `static final ElementName` constant for your element, following the existing pattern: + +```java +public static final ElementName MYNEWELEMENT = new ElementName( + "mynewelement", "mynewelement", + // CPPONLY: NS_NewHTMLElement, + // CPPONLY: NS_NewSVGUnknownElement, + TreeBuilder.OTHER); +``` + +The flags (like `TreeBuilder.OTHER`, `SPECIAL`, `SCOPING`, etc.) depend on how the element should be handled by the tree builder. + +### Step 2: Uncomment the code generation sections + +Uncomment three sections in `ElementName.java`: + +1. **The imports** near the top (~lines 26-39): + - `java.io.*` + - `java.util.*` + - `java.util.regex.*` + +2. **`implements Comparable`** on the class declaration (~line 49) + +3. **The code generation block** marked with: + `"START CODE ONLY USED FOR GENERATING CODE uncomment and run to regenerate"` + That includes the `main()` method and helper functions (~lines 272-659) + +### Step 3: Add case to treeBuilderGroupToName() if needed + +If your element uses a new `TreeBuilder` group constant, add a case for it in the `treeBuilderGroupToName()` method within the code generation block. 
+ +### Step 4: Compile and run + +Compile the project: + +```bash +mvn compile +``` + +Run the `ElementName` class with paths to the Gecko tag-list files: + +```bash +java -cp target/classes nu.validator.htmlparser.impl.ElementName \ + /path/to/nsHTMLTagList.h \ + /path/to/SVGTagList.h +``` + +**For Java-only builds** (not Gecko), you can use empty dummy files: + +```bash +mkdir -p /tmp/tagfiles +touch /tmp/tagfiles/nsHTMLTagList.h /tmp/tagfiles/SVGTagList.h +java -cp target/classes nu.validator.htmlparser.impl.ElementName \ + /tmp/tagfiles/nsHTMLTagList.h \ + /tmp/tagfiles/SVGTagList.h +``` + +> [!NOTE] +> Using empty files means the `CPPONLY` comments will all show `NS_NewHTMLUnknownElement`. For Gecko builds, use the actual files from moz-central: +> - `parser/htmlparser/nsHTMLTagList.h` +> - `dom/svg/SVGTagList.h` + +### Step 5: Update the generated arrays + +The program outputs: +1. All element constant definitions (with updated `CPPONLY` comments if using real Gecko tag files) +2. The `ELEMENT_NAMES` array in level-order binary search tree order +3. The `ELEMENT_HASHES` array with corresponding hash values + +Replace the existing `ELEMENT_NAMES` and `ELEMENT_HASHES` arrays in the file with the generated output. The arrays must stay in sync: element at position N in `ELEMENT_NAMES` must have its hash at position N in `ELEMENT_HASHES`. + +### Step 6: Re-comment the code generation sections + +After regeneration, comment out the sections you uncommented in Step 2 to restore the file to its normal state. + +### Step 7: Run tests + +Verify your changes work correctly: + +```bash +mvn test +``` + +### Technical Details + +The hash function (`bufToHash`) creates a unique integer for each element name using the element's length and specific character positions. The arrays are organized as a level-order binary search tree for O(log n) lookup performance. + +If you encounter a hash collision (two elements with the same hash), the regeneration will report an error. 
That would require modifying the hash function, which has not been necessary historically. ===================================== gwt-src/nu/validator/htmlparser/gwt/BrowserTreeBuilder.java ===================================== @@ -474,4 +474,107 @@ class BrowserTreeBuilder extends CoalescingTreeBuilder { fatal(e); } } + + private static native JavaScriptObject getNextSibling( + JavaScriptObject node) /*-{ + return node.nextSibling; + }-*/; + + private static native String getLocalName( + JavaScriptObject node) /*-{ + return node.localName; + }-*/; + + private static native String getNamespaceURI( + JavaScriptObject node) /*-{ + return node.namespaceURI; + }-*/; + + private static native boolean hasAttribute( + JavaScriptObject node, String name) /*-{ + return node.hasAttribute(name); + }-*/; + + @Override + // https://html.spec.whatwg.org/multipage/form-elements.html#maybe-clone-an-option-into-selectedcontent + // Implements "maybe clone an option into selectedcontent" + protected void optionElementPopped(JavaScriptObject option) + throws SAXException { + try { + // Find the nearest ancestor + JavaScriptObject selectedContent = findSelectedContent( + select); + if (selectedContent == null) { + return; + } + + // Check option selectedness + boolean hasSelectedAttr = hasAttribute(option, "selected"); + if (!hasSelectedAttr && hasChildNodes(selectedContent)) { + // Not the first option and no explicit selected attr + return; + } + + // Clear selectedcontent children and deep-clone option children + while (hasChildNodes(selectedContent)) { + removeChild(selectedContent, getFirstChild(selectedContent)); + } + for (JavaScriptObject child = getFirstChild(option); + child != null; child = getNextSibling(child)) { + appendChild(selectedContent, cloneNodeDeep(child)); + } + } catch (JavaScriptException e) { + fatal(e); + } + } + + private JavaScriptObject findSelectedContent( + JavaScriptObject root) { + JavaScriptObject current = getFirstChild(root); + if (current == null) 
{ + return null; + } + JavaScriptObject next; + for (;;) { + if (getNodeType(current) == 1 + && "selectedcontent".equals(getLocalName(current)) + && "http://www.w3.org/1999/xhtml".equals( + getNamespaceURI(current))) { + return current; + } + if ((next = getFirstChild(current)) != null) { + current = next; + continue; + } + for (;;) { + if (current == root) { + return null; + } + if ((next = getNextSibling(current)) != null) { + current = next; + break; + } + current = getParentNode(current); + } + } + } } ===================================== src/nu/validator/htmlparser/dom/DOMTreeBuilder.java ===================================== @@ -354,4 +354,89 @@ class DOMTreeBuilder extends CoalescingTreeBuilder { fatal(e); } } + + @Override + // https://html.spec.whatwg.org/multipage/form-elements.html#maybe-clone-an-option-into-selectedcontent + // Implements "maybe clone an option into selectedcontent" + protected void optionElementPopped(Element option) throws SAXException { + try { + // Find the nearest ancestor + Element selectedContent = findSelectedContent(select); + if (selectedContent == null) { + return; + } + + // Check option selectedness + boolean hasSelected = option.hasAttribute("selected"); + if (!hasSelected && selectedContent.hasChildNodes()) { + // Not the first option and no explicit selected attr + return; + } + + // Clear selectedcontent children and deep-clone option children + while (selectedContent.hasChildNodes()) { + selectedContent.removeChild(selectedContent.getFirstChild()); + } + for (Node child = option.getFirstChild(); child != null; + child = child.getNextSibling()) { + selectedContent.appendChild(child.cloneNode(true)); + } + } catch (DOMException e) { + fatal(e); + } + } + + private Element findSelectedContent(Element root) { + Node current = root.getFirstChild(); + if (current == null) { + return null; + } + Node next; + for (;;) { + if (current.getNodeType() == Node.ELEMENT_NODE) { + Element elt = (Element) current; + if 
("selectedcontent".equals(elt.getLocalName()) + && "http://www.w3.org/1999/xhtml".equals( + elt.getNamespaceURI())) { + return elt; + } + } + if ((next = current.getFirstChild()) != null) { + current = next; + continue; + } + for (;;) { + if (current == root) { + return null; + } + if ((next = current.getNextSibling()) != null) { + current = next; + break; + } + current = current.getParentNode(); + } + } + } } ===================================== src/nu/validator/htmlparser/impl/AttributeName.java ===================================== @@ -806,7 +806,9 @@ public final class AttributeName public static final AttributeName SRCDOC = new AttributeName(ALL_NO_NS, "srcdoc", "srcdoc", "srcdoc", "srcdoc", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); public static final AttributeName STDDEVIATION = new AttributeName(ALL_NO_NS, "stddeviation", "stddeviation", "stdDeviation", "stddeviation", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); public static final AttributeName SANDBOX = new AttributeName(ALL_NO_NS, "sandbox", "sandbox", "sandbox", "sandbox", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); + public static final AttributeName SHADOWROOTCUSTOMELEMENTREGISTRY = new AttributeName(ALL_NO_NS, "shadowrootcustomelementregistry", "shadowrootcustomelementregistry", "shadowrootcustomelementregistry", "shadowrootcustomelementregistry", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); public static final AttributeName SHADOWROOTDELEGATESFOCUS = new AttributeName(ALL_NO_NS, "shadowrootdelegatesfocus", "shadowrootdelegatesfocus", "shadowrootdelegatesfocus", "shadowrootdelegatesfocus", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); + public static final AttributeName SHADOWROOTSLOTASSIGNMENT = new AttributeName(ALL_NO_NS, "shadowrootslotassignment", "shadowrootslotassignment", "shadowrootslotassignment", "shadowrootslotassignment", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); public static final 
AttributeName WORD_SPACING = new AttributeName(ALL_NO_NS, "word-spacing", "word-spacing", "word-spacing", "word-spacing", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); public static final AttributeName ACCENTUNDER = new AttributeName(ALL_NO_NS, "accentunder", "accentunder", "accentunder", "accentunder", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); public static final AttributeName ACCEPT_CHARSET = new AttributeName(ALL_NO_NS, "accept-charset", "accept-charset", "accept-charset", "accept-charset", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); @@ -1199,37 +1201,37 @@ public final class AttributeName public static final AttributeName RY = new AttributeName(ALL_NO_NS, "ry", "ry", "ry", "ry", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); public static final AttributeName REFY = new AttributeName(ALL_NO_NS, "refy", "refy", "refY", "refy", ALL_NO_PREFIX, NCNAME_HTML | NCNAME_FOREIGN | NCNAME_LANG); private final static @NoLength AttributeName[] ATTRIBUTE_NAMES = { - MARKERUNITS, - BASELINE, - STOP_COLOR, + MARKERWIDTH, + BASELINE_SHIFT, + SHAPE, CLEAR, - XREF, - AUTOPLAY, - FONT_STYLE, + PROFILE, + XLINK_SHOW, + FONT_WEIGHT, ARIA_DISABLED, OPACITY, - ONBEFOREPRINT, - PATH, - ALINK, - ONMOUSEDOWN, - COLS, - COLUMNLINES, + ONMESSAGE, + ONCHANGE, + ZOOMANDPAN, + ONMOUSEOUT, + CLASSID, + ACCUMULATE, Y, ARIA_MULTISELECTABLE, ROTATE, SHADOWROOTCLONABLE, - LINEBREAK, - REPEATDUR, - ORIGIN, - RADIUS, - TABLEVALUES, - POINTSATZ, - NUMOCTAVES, - CLIPPATHUNITS, - ONDRAGEND, - ROWS, - PATTERNTRANSFORM, - VIEWTARGET, + INTERCEPT, + ROLE, + MARGINHEIGHT, + OPTIMUM, + SCALE, + POINTSATX, + FLOOD_OPACITY, + CLIP_RULE, + ONDRAGENTER, + ROWSPAN, + ONSTART, + VALUE, MIN, K3, ARIA_CHANNEL, @@ -1237,31 +1239,31 @@ public final class AttributeName LOCAL, ONABORT, HIDDEN, - ACCEPT_CHARSET, - DIRECTION, - OBJECT, - ONBEFORECUT, - SIZE, - IMAGE_RENDERING, - MATHBACKGROUND, - DIVISOR, - LINK, - FILL_OPACITY, - FORM, - OPEN, - XLINK_TITLE, - 
COLOR_INTERPOLATION, - ONZOOM, - STROKE, - LOOP, - COORDS, - STARTOFFSET, - LOWSRC, - CONTEXTMENU, - KEYTIMES, - TEXT_DECORATION, - REQUIRED, - CY, + WORD_SPACING, + DEFER, + ONBEFOREUNLOAD, + ONKEYPRESS, + SPREADMETHOD, + IMAGESIZES, + HIGH, + BEGIN, + VISIBILITY, + FILL_RULE, + FRAMESPACING, + KERNELUNITLENGTH, + WHEN, + COLOR_PROFILE, + ONFOCUSIN, + STROKE_LINEJOIN, + HTTP_EQUIV, + ATTRIBUTETYPE, + ONDRAGSTART, + KEYSYSTEM, + CONTROLS, + FONTSIZE, + SYSTEMLANGUAGE, + ONSUBMIT, + REFX, END, SRC, Y1, @@ -1276,183 +1278,183 @@ public final class AttributeName FETCHPRIORITY, BORDER, RENDERING_INTENT, - SANDBOX, - BEVELLED, - CODEBASE, - FACE, - NAME, - ONRESET, - ONSELECTSTART, - REFERRERPOLICY, - STRETCHY, - HREFLANG, - DRAGGABLE, - LONGDESC, - TARGETY, - MATHSIZE, - ACTIVE, - MANIFEST, - TABINDEX, - MASK, - CELLPADDING, - REPLACE, - FRAMEBORDER, - SUMMARY, - KERNELMATRIX, - POINTER_EVENTS, - TRANSFORM, - XMLNS, - AUTOCAPITALIZE, - EXPONENT, - ONMOUSEENTER, - ONMOUSEUP, - STROKE_DASHARRAY, - COMPACT, - GLYPH_ORIENTATION_HORIZONTAL, - SHAPE_RENDERING, - ABBR, - NOHREF, - OPERATOR, - BIAS, - CLASS, - PRESERVEALPHA, - ALTTEXT, - FILTER, - FONT_SIZE_ADJUST, - RT, - RESTART, - WRITING_MODE, - GROUPALIGN, - VALUES, - FX, - RY, - DIR, - IN2, - REL, - R, - K1, - X2, - XML_SPACE, - ARIA_LABELLEDBY, - ARIA_SELECTED, - ARIA_PRESSED, - ARIA_SECRET, - ARIA_TEMPLATEID, - ARIA_MULTILINE, - ARIA_RELEVANT, - ARIA_AUTOCOMPLETE, - ARIA_HASPOPUP, - DEFAULT, - HSPACE, - MOVABLELIMITS, - RSPACE, - SEPARATORS, - ENABLE_BACKGROUND, - CHECKED, - ONSCROLL, - SPECULAREXPONENT, - GRADIENTTRANSFORM, - LOADING, - SEED, - SRCDOC, - WORD_SPACING, + STDDEVIATION, ACCENT, - BASELINE_SHIFT, CODE, - DEFER, EDGE, - INTERCEPT, LINETHICKNESS, - ONBEFOREUNLOAD, ORDER, - ONMESSAGE, ORIENTATION, - ONKEYPRESS, ONRESIZE, - ROLE, SIZES, - SPREADMETHOD, DIFFUSECONSTANT, - PROFILE, ALIGNMENT_BASELINE, - IMAGESIZES, LANG, - MARGINHEIGHT, TARGET, - HIGH, MATHVARIANT, - ONCHANGE, ACTIONTYPE, - BEGIN, 
LIMITINGCONEANGLE, - OPTIMUM, SCRIPTSIZEMULTIPLIER, - VISIBILITY, MARKERHEIGHT, - MARKERWIDTH, AMPLITUDE, - FILL_RULE, ONCLICK, - SCALE, AZIMUTH, - FRAMESPACING, PRIMITIVEUNITS, - ZOOMANDPAN, EVENT, - KERNELUNITLENGTH, ONEND, - POINTSATX, STANDBY, - WHEN, XLINK_ARCROLE, - XLINK_SHOW, AUTOCOMPLETE, - COLOR_PROFILE, COLOR_INTERPOLATION_FILTERS, - FLOOD_OPACITY, ONLOAD, - ONFOCUSIN, ONMOUSELEAVE, - ONMOUSEOUT, RQUOTE, - STROKE_LINEJOIN, STROKE_WIDTH, - CLIP_RULE, DISPLAYSTYLE, - HTTP_EQUIV, SCOPED, - SHAPE, TEMPLATE, - ATTRIBUTETYPE, CHARSET, - ONDRAGENTER, ONDRAGDROP, - ONDRAGSTART, AS, - CLASSID, CLOSURE, - KEYSYSTEM, MINSIZE, - ROWSPAN, SUBSCRIPTSHIFT, - CONTROLS, ENCTYPE, - FONT_WEIGHT, FONT_FAMILY, - FONTSIZE, LIST, - ONSTART, PATTERNUNITS, - SYSTEMLANGUAGE, TEXTLENGTH, - ACCUMULATE, COLUMNSPACING, - ONSUBMIT, RESULT, - VALUE, CX, - REFX, FY, + DIR, + IN2, + REL, + R, + K1, + X2, + XML_SPACE, + ARIA_LABELLEDBY, + ARIA_SELECTED, + ARIA_PRESSED, + ARIA_SECRET, + ARIA_TEMPLATEID, + ARIA_MULTILINE, + ARIA_RELEVANT, + ARIA_AUTOCOMPLETE, + ARIA_HASPOPUP, + DEFAULT, + HSPACE, + MOVABLELIMITS, + RSPACE, + SEPARATORS, + ENABLE_BACKGROUND, + CHECKED, + ONSCROLL, + SPECULAREXPONENT, + GRADIENTTRANSFORM, + LOADING, + SEED, + SRCDOC, + SHADOWROOTCUSTOMELEMENTREGISTRY, + ACCEPT_CHARSET, + BEVELLED, + BASELINE, + CODEBASE, + DIRECTION, + FACE, + LINEBREAK, + NAME, + OBJECT, + ONRESET, + ONBEFOREPRINT, + ONSELECTSTART, + ONBEFORECUT, + REFERRERPOLICY, + REPEATDUR, + STRETCHY, + SIZE, + HREFLANG, + XREF, + DRAGGABLE, + IMAGE_RENDERING, + LONGDESC, + ORIGIN, + TARGETY, + MATHBACKGROUND, + MATHSIZE, + PATH, + ACTIVE, + DIVISOR, + MANIFEST, + RADIUS, + TABINDEX, + LINK, + MASK, + MARKERUNITS, + CELLPADDING, + FILL_OPACITY, + REPLACE, + TABLEVALUES, + FRAMEBORDER, + FORM, + SUMMARY, + ALINK, + KERNELMATRIX, + OPEN, + POINTER_EVENTS, + POINTSATZ, + TRANSFORM, + XLINK_TITLE, + XMLNS, + AUTOPLAY, + AUTOCAPITALIZE, + COLOR_INTERPOLATION, + EXPONENT, + NUMOCTAVES, + ONMOUSEENTER, + 
ONZOOM, + ONMOUSEUP, + ONMOUSEDOWN, + STROKE_DASHARRAY, + STROKE, + COMPACT, + CLIPPATHUNITS, + GLYPH_ORIENTATION_HORIZONTAL, + LOOP, + SHAPE_RENDERING, + STOP_COLOR, + ABBR, + COORDS, + NOHREF, + ONDRAGEND, + OPERATOR, + STARTOFFSET, + BIAS, + COLS, + CLASS, + LOWSRC, + PRESERVEALPHA, + ROWS, + ALTTEXT, + CONTEXTMENU, + FILTER, + FONT_STYLE, + FONT_SIZE_ADJUST, + KEYTIMES, + RT, + PATTERNTRANSFORM, + RESTART, + TEXT_DECORATION, + WRITING_MODE, + COLUMNLINES, + GROUPALIGN, + REQUIRED, + VALUES, + VIEWTARGET, + FX, + CY, REFY, ALT, DUR, @@ -1511,7 +1513,8 @@ public final class AttributeName SHADOWROOTMODE, SHADOWROOTREFERENCETARGET, SHADOWROOTSERIALIZABLE, - STDDEVIATION, + SHADOWROOTSLOTASSIGNMENT, + SANDBOX, SHADOWROOTDELEGATESFOCUS, ACCENTUNDER, ACCESSKEY, @@ -1707,262 +1710,263 @@ public final class AttributeName RX, BY, DY, - }; - private final static int[] ATTRIBUTE_HASHES = { - 1854497003, - 1747939528, - 1941454586, - 1681174213, - 1776114564, - 1915025672, - 2001669450, - 1680165421, - 1721347639, - 1754792749, - 1805715716, - 1898428101, - 1922699851, - 1983347764, - 2016787611, - 71827457, - 1680282148, - 1689324870, - 1740045858, - 1752985897, - 1756471625, - 1788254870, - 1823580230, - 1874698443, - 1906423097, - 1921894426, - 1933145837, - 1972863609, - 1991392548, - 2007019632, - 2060302634, - 57205395, - 911736834, - 1680181996, - 1680368221, - 1685882101, - 1704526375, - 1734182982, - 1747299630, - 1749027145, - 1754606246, - 1754907227, - 1757053236, - 1785174319, - 1804036350, - 1816144023, - 1853862084, - 1867620412, - 1884343396, - 1905628916, - 1910441627, - 1916278099, - 1922567078, - 1924585254, - 1937777860, - 1966439670, - 1974849131, - 1988132214, - 2000162011, - 2004199576, - 2009071951, - 2024616088, - 2081947650, - 53006051, - 60345635, - 885522434, - 1680095865, - 1680165533, - 1680229115, - 1680343801, - 1680437801, - 1682440540, - 1687620127, - 1692408896, - 1716623661, - 1731048742, - 1739583824, - 1740130375, - 1747792072, - 
1748552744, - 1749856356, - 1754214628, - 1754645079, - 1754858317, - 1756190926, - 1756804936, - 1767875272, - 1782518297, - 1786821704, - 1791070327, - 1804235064, - 1814656326, - 1820928104, - 1824377064, - 1854464212, - 1865910347, - 1873590471, - 1884142379, - 1891186903, - 1903612236, - 1906408542, - 1908462185, - 1910503637, - 1915394254, - 1917327080, - 1922413292, - 1922671417, - 1924462384, - 1932870919, - 1934917372, - 1941409583, - 1965349396, - 1972196486, - 1972909592, - 1982640164, - 1983461061, - 1990062797, - 1999273799, - 2001578182, - 2001814704, - 2005925890, - 2008084807, - 2010452700, - 2018908874, - 2026741958, - 2066743298, - 2089811970, - 52488851, - 55077603, - 59825747, - 68157441, - 878182402, - 901775362, - 1037879561, - 1680159327, - 1680165437, - 1680165692, - 1680198203, - 1680231247, - 1680315086, - 1680345965, - 1680413393, - 1680452349, - 1681879063, - 1683805446, - 1686731997, - 1689048326, - 1689839946, - 1699185409, - 1714763319, - 1721189160, - 1723336432, - 1733874289, - 1736416327, - 1739927860, - 1740096054, + RY, + }; + private final static int[] ATTRIBUTE_HASHES = { + 1854474395, + 1747839118, + 1941438085, + 1681174213, + 1772032615, + 1910527802, + 2001634459, + 1680165421, + 1721347639, + 1754647353, + 1804978712, + 1894552650, + 1922679386, + 1983266615, + 2015950026, + 71827457, + 1680282148, + 1689324870, + 1740045858, + 1751679545, + 1756302628, + 1787193500, + 1822002839, + 1874261045, + 1906419001, + 1917953597, + 1932986153, + 1972744939, + 1991021879, + 2006516551, + 2026975253, + 57205395, + 911736834, + 1680181996, + 1680368221, + 1685882101, + 1704526375, + 1734182982, 1742183484, + 1748869205, + 1754546894, + 1754872618, + 1756874572, + 1785051290, + 1801312388, + 1814986837, + 1825677514, + 1867448617, + 1884267068, + 1903759600, + 1909819252, + 1916210285, + 1922470745, + 1924570799, + 1935597338, + 1965561677, + 1972962123, + 1987410233, + 2000125224, + 2001898808, + 2008408414, + 2023146024, + 
2075005220, + 53006051, + 60345635, + 885522434, + 1680095865, + 1680165533, + 1680229115, + 1680343801, + 1680437801, + 1682440540, + 1687620127, + 1692408896, + 1716623661, + 1731048742, + 1739583824, + 1740119884, 1747446838, - 1747839118, 1748306996, - 1748869205, 1749399124, - 1751679545, 1753297133, - 1754546894, 1754643237, - 1754647353, 1754798923, - 1754872618, 1754958648, - 1756302628, 1756737685, - 1756874572, 1765800271, - 1772032615, 1780975314, - 1785051290, 1786740932, - 1787193500, 1790814502, - 1801312388, 1804069019, - 1804978712, 1814558026, - 1814986837, 1820262641, - 1822002839, 1823841492, - 1825677514, 1854302364, - 1854474395, 1864698185, - 1867448617, 1872034503, - 1874261045, 1881750231, - 1884267068, 1889633006, - 1894552650, 1900548965, - 1903759600, 1905754853, - 1906419001, 1907701479, - 1909819252, 1910441773, - 1910527802, 1915295948, - 1916210285, 1916337499, - 1917953597, 1922319046, - 1922470745, 1922665052, - 1922679386, 1924206934, - 1924570799, 1924738716, - 1932986153, 1933508940, - 1935597338, 1941253366, - 1941438085, 1942026440, - 1965561677, 1966454567, - 1972744939, 1972904522, - 1972962123, 1980235778, - 1983266615, 1983416119, - 1987410233, 1988788535, - 1991021879, 1991643278, - 2000125224, 2001210183, - 2001634459, 2001710299, - 2001898808, 2004957380, - 2006516551, 2007064812, - 2008408414, 2009141482, - 2015950026, 2016910397, - 2023146024, 2024763702, - 2026975253, 2065170434, - 2075005220, 2083520514, + 52488851, + 55077603, + 59825747, + 68157441, + 878182402, + 901775362, + 1037879561, + 1680159327, + 1680165437, + 1680165692, + 1680198203, + 1680231247, + 1680315086, + 1680345965, + 1680413393, + 1680452349, + 1681879063, + 1683805446, + 1686731997, + 1689048326, + 1689839946, + 1699185409, + 1714763319, + 1721189160, + 1723336432, + 1733874289, + 1736416327, + 1739927860, + 1740096054, + 1740185423, + 1747299630, + 1747792072, + 1747939528, + 1748552744, + 1749027145, + 1749856356, + 1752985897, + 1754214628, 
+ 1754606246, + 1754645079, + 1754792749, + 1754858317, + 1754907227, + 1756190926, + 1756471625, + 1756804936, + 1757053236, + 1767875272, + 1776114564, + 1782518297, + 1785174319, + 1786821704, + 1788254870, + 1791070327, + 1804036350, + 1804235064, + 1805715716, + 1814656326, + 1816144023, + 1820928104, + 1823580230, + 1824377064, + 1853862084, + 1854464212, + 1854497003, + 1865910347, + 1867620412, + 1873590471, + 1874698443, + 1884142379, + 1884343396, + 1891186903, + 1898428101, + 1903612236, + 1905628916, + 1906408542, + 1906423097, + 1908462185, + 1910441627, + 1910503637, + 1915025672, + 1915394254, + 1916278099, + 1917327080, + 1921894426, + 1922413292, + 1922567078, + 1922671417, + 1922699851, + 1924462384, + 1924585254, + 1932870919, + 1933145837, + 1934917372, + 1937777860, + 1941409583, + 1941454586, + 1965349396, + 1966439670, + 1972196486, + 1972863609, + 1972909592, + 1974849131, + 1982640164, + 1983347764, + 1983461061, + 1988132214, + 1990062797, + 1991392548, + 1999273799, + 2000162011, + 2001578182, + 2001669450, + 2001814704, + 2004199576, + 2005925890, + 2007019632, + 2008084807, + 2009071951, + 2010452700, + 2016787611, + 2018908874, + 2024616088, + 2026741958, + 2060302634, + 2066743298, + 2081947650, 2091784484, 50917059, 52489043, @@ -2021,7 +2025,8 @@ public final class AttributeName 1739914974, 1739962169, 1740045862, - 1740119884, + 1740109544, + 1740130375, 1740222216, 1747295467, 1747309881, @@ -2217,5 +2222,6 @@ public final class AttributeName 2073034754, 2081423362, 2082471938, + 2089811970, }; } ===================================== src/nu/validator/htmlparser/impl/ElementName.java ===================================== @@ -1424,7 +1424,11 @@ TreeBuilder.OTHER); public static final ElementName SELECT = new ElementName("select", "select", // CPPONLY: NS_NewHTMLSelectElement, // CPPONLY: NS_NewSVGUnknownElement, -TreeBuilder.SELECT | SPECIAL); +TreeBuilder.SELECT | SPECIAL | SCOPING); +public static final ElementName SELECTEDCONTENT 
= new ElementName("selectedcontent", "selectedcontent", +// CPPONLY: NS_NewHTMLElement, +// CPPONLY: NS_NewSVGUnknownElement, +TreeBuilder.OTHER); public static final ElementName SLOT = new ElementName("slot", "slot", // CPPONLY: NS_NewHTMLSlotElement, // CPPONLY: NS_NewSVGUnknownElement, @@ -1484,18 +1488,18 @@ TreeBuilder.TBODY_OR_THEAD_OR_TFOOT | SPECIAL | FOSTER_PARENTING | OPTIONAL_END_ private final static @NoLength ElementName[] ELEMENT_NAMES = { FIGCAPTION, CITE, -FRAMESET, +FEOFFSET, H1, CLIPPATH, METER, -RADIALGRADIENT, +SELECT, B, BGSOUND, SOURCE, DL, RP, -NOFRAMES, -MTEXT, +PROGRESS, +NOSCRIPT, VIEW, DIV, G, @@ -1507,10 +1511,10 @@ TEXTPATH, ANIMATETRANSFORM, SECTION, HR, -CANVAS, -BASEFONT, -FEDISTANTLIGHT, -OUTPUT, +DEFS, +DATALIST, +FONT, +PLAINTEXT, TFOOT, FEMORPHOLOGY, COL, @@ -1533,14 +1537,14 @@ OPTION, VIDEO, BR, FOOTER, -TR, -DETAILS, -DT, -FOREIGNOBJECT, -FESPOTLIGHT, -INPUT, -RT, -TT, +ADDRESS, +MS, +APPLET, +FIELDSET, +FEPOINTLIGHT, +LINEARGRADIENT, +OBJECT, +RECT, SLOT, MENU, FECONVOLVEMATRIX, @@ -1585,23 +1589,23 @@ SAMP, ANIMATECOLOR, FECOMPONENTTRANSFER, HEADER, -NOBR, -ADDRESS, -DEFS, -MS, -PROGRESS, -APPLET, -DATALIST, -FIELDSET, -FEOFFSET, -FEPOINTLIGHT, -FONT, -LINEARGRADIENT, -NOSCRIPT, -OBJECT, -PLAINTEXT, -RECT, -SELECT, +TR, +CANVAS, +DETAILS, +NOFRAMES, +DT, +BASEFONT, +FOREIGNOBJECT, +FRAMESET, +FESPOTLIGHT, +FEDISTANTLIGHT, +INPUT, +MTEXT, +RT, +OUTPUT, +TT, +RADIALGRADIENT, +SELECTEDCONTENT, SCRIPT, TEXT, FEDROPSHADOW, @@ -1689,22 +1693,23 @@ FEFUNCR, FILTER, FEGAUSSIANBLUR, MARKER, +NOBR, }; private final static int[] ELEMENT_HASHES = { 1900845386, 1748359220, -2001349720, +2001349736, 876609538, 1798686984, 1971465813, -2007781534, +2008125638, 59768833, 1730965751, 1756474198, 1864368130, 1938817026, -1988763672, -2005324101, +1990037800, +2005719336, 2060065124, 52490899, 62390273, @@ -1716,10 +1721,10 @@ private final static int[] ELEMENT_HASHES = { 1881498736, 1907661127, 1967128578, -1982935782, -1999397992, 
-2001392798, -2006329158, +1983533124, +2000525512, +2001495140, +2006896969, 2008851557, 2085266636, 51961587, @@ -1742,14 +1747,14 @@ private final static int[] ELEMENT_HASHES = { 1925844629, 1963982850, 1967795958, -1973420034, -1983633431, -1998585858, -2001309869, -2001392795, -2003183333, -2005925890, -2006974466, +1982173479, +1986527234, +1998724870, +2001349704, +2001392796, +2004635806, +2006028454, +2007601444, 2008325940, 2021937364, 2068523856, @@ -1794,23 +1799,23 @@ private final static int[] ELEMENT_HASHES = { 1965334268, 1967788867, 1968836118, -1971938532, -1982173479, -1983533124, -1986527234, -1990037800, -1998724870, -2000525512, -2001349704, -2001349736, -2001392796, -2001495140, -2004635806, -2005719336, -2006028454, -2006896969, -2007601444, -2008125638, +1973420034, +1982935782, +1983633431, +1988763672, +1998585858, +1999397992, +2001309869, +2001349720, +2001392795, +2001392798, +2003183333, +2005324101, +2005925890, +2006329158, +2006974466, +2007781534, +2008305999, 2008340774, 2008994116, 2051837468, @@ -1898,5 +1903,6 @@ private final static int[] ELEMENT_HASHES = { 1967795910, 1968053806, 1971461414, +1971938532, }; } ===================================== src/nu/validator/htmlparser/impl/Portability.java ===================================== @@ -31,6 +31,7 @@ import nu.validator.htmlparser.common.Interner; public final class Portability { + // [NOCPP[ public static int checkedAdd(int a, int b) throws SAXException { // This can't be translated code, because in C++ signed integer overflow is UB, so the below code would be wrong. 
assert a >= 0; @@ -41,6 +42,7 @@ public final class Portability { } return sum; } + // ]NOCPP] // Allocating methods ===================================== src/nu/validator/htmlparser/impl/Tokenizer.java ===================================== @@ -932,7 +932,7 @@ public class Tokenizer implements Locator, Locator2 { // ]NOCPP] - HtmlAttributes emptyAttributes() { + @Inline HtmlAttributes emptyAttributes() { // [NOCPP[ if (newAttributesEachTime) { return new HtmlAttributes(mappingLangToXmlLang); @@ -944,7 +944,7 @@ public class Tokenizer implements Locator, Locator2 { // ]NOCPP] } - @Inline private void appendCharRefBuf(char c) { + private void appendCharRefBuf(char c) { // CPPONLY: assert charRefBufLen < charRefBuf.length: // CPPONLY: "RELEASE: Attempted to overrun charRefBuf!"; charRefBuf[charRefBufLen++] = c; @@ -983,11 +983,8 @@ public class Tokenizer implements Locator, Locator2 { * the UTF-16 code unit to append */ @Inline private void appendStrBuf(char c) { - // CPPONLY: assert strBufLen < strBuf.length: "Previous buffer length insufficient."; // CPPONLY: if (strBufLen == strBuf.length) { - // CPPONLY: if (!EnsureBufferSpace(1)) { - // CPPONLY: assert false: "RELEASE: Unable to recover from buffer reallocation failure"; - // CPPONLY: } // TODO: Add telemetry when outer if fires but inner does not + // CPPONLY: EnsureBufferSpaceShouldNeverHappen(1); // CPPONLY: } strBuf[strBufLen++] = c; } @@ -1000,9 +997,22 @@ public class Tokenizer implements Locator, Locator2 { * * @return the buffer as a string */ - protected String strBufToString() { + @Inline protected String strBufToString() { + // CPPONLY: String digitAtom = TryAtomizeForSingleDigit(); + // CPPONLY: if (digitAtom) { + // CPPONLY: return digitAtom; + // CPPONLY: } + // CPPONLY: + // CPPONLY: boolean maybeAtomize = false; + // CPPONLY: if (!newAttributesEachTime) { + // CPPONLY: if (attributeName == AttributeName.CLASS || + // CPPONLY: attributeName == AttributeName.TYPE) { + // CPPONLY: maybeAtomize = 
true; + // CPPONLY: } + // CPPONLY: } + // CPPONLY: String str = Portability.newStringFromBuffer(strBuf, 0, strBufLen - // CPPONLY: , tokenHandler, !newAttributesEachTime && attributeName == AttributeName.CLASS + // CPPONLY: , tokenHandler, maybeAtomize ); clearStrBufAfterUse(); return str; @@ -1014,7 +1024,7 @@ public class Tokenizer implements Locator, Locator2 { * * @return the buffer as local name */ - private void strBufToDoctypeName() { + @Inline private void strBufToDoctypeName() { doctypeName = Portability.newLocalNameFromBuffer(strBuf, strBufLen, interner); clearStrBufAfterUse(); } @@ -1025,7 +1035,7 @@ public class Tokenizer implements Locator, Locator2 { * @throws SAXException * if the token handler threw */ - private void emitStrBuf() throws SAXException { + @Inline private void emitStrBuf() throws SAXException { if (strBufLen > 0) { tokenHandler.characters(strBuf, 0, strBufLen); clearStrBufAfterUse(); @@ -1094,13 +1104,12 @@ public class Tokenizer implements Locator, Locator2 { // ]NOCPP] } - private void appendStrBuf(@NoLength char[] buffer, int offset, int length) throws SAXException { - int newLen = Portability.checkedAdd(strBufLen, length); - // CPPONLY: assert newLen <= strBuf.length: "Previous buffer length insufficient."; + @Inline private void appendStrBuf(@NoLength char[] buffer, int offset, int length) throws SAXException { + // Years of crash stats have shown that the this addition doesn't overflow, as it logically + // shouldn't. 
+ int newLen = strBufLen + length; // CPPONLY: if (strBuf.length < newLen) { - // CPPONLY: if (!EnsureBufferSpace(length)) { - // CPPONLY: assert false: "RELEASE: Unable to recover from buffer reallocation failure"; - // CPPONLY: } // TODO: Add telemetry when outer if fires but inner does not + // CPPONLY: EnsureBufferSpaceShouldNeverHappen(length); // CPPONLY: } System.arraycopy(buffer, offset, strBuf, strBufLen, length); strBufLen = newLen; @@ -1455,12 +1464,6 @@ public class Tokenizer implements Locator, Locator2 { */ int pos = start - 1; - /** - * The index of the first char in buf that is - * part of a coalesced run of character tokens or - * Integer.MAX_VALUE if there is not a current run being - * coalesced. - */ switch (state) { case DATA: case RCDATA: @@ -1486,19 +1489,24 @@ public class Tokenizer implements Locator, Locator2 { break; } - /** - * The number of chars in buf that have - * meaning. (The rest of the array is garbage and should not be - * examined.) - */ // CPPONLY: if (mViewSource) { // CPPONLY: mViewSource.SetBuffer(buffer); - // CPPONLY: pos = stateLoop(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); + // CPPONLY: if (htmlaccelEnabled()) { + // CPPONLY: pos = StateLoopViewSourceSIMD(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); + // CPPONLY: } else { + // CPPONLY: pos = StateLoopViewSourceALU(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); + // CPPONLY: } // CPPONLY: mViewSource.DropBuffer((pos == buffer.getEnd()) ? 
pos : pos + 1); // CPPONLY: } else if (tokenHandler.WantsLineAndColumn()) { - // CPPONLY: pos = stateLoop(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); + // CPPONLY: if (htmlaccelEnabled()) { + // CPPONLY: pos = StateLoopLineColSIMD(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); + // CPPONLY: } else { + // CPPONLY: pos = StateLoopLineColALU(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); + // CPPONLY: } + // CPPONLY: } else if (htmlaccelEnabled()) { + // CPPONLY: pos = StateLoopFastestSIMD(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); // CPPONLY: } else { - // CPPONLY: pos = stateLoop(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); + // CPPONLY: pos = StateLoopFastestALU(state, c, pos, buffer.getBuffer(), false, returnState, buffer.getEnd()); // CPPONLY: } // [NOCPP[ pos = stateLoop(state, c, pos, buffer.getBuffer(), false, returnState, @@ -1547,7 +1555,7 @@ public class Tokenizer implements Locator, Locator2 { } // ]NOCPP] - @SuppressWarnings("unused") private int stateLoop(int state, char c, + @SuppressWarnings("unused") @Inline private int stateLoop(int state, char c, int pos, @NoLength char[] buf, boolean reconsume, int returnState, int endPos) throws SAXException { boolean reportedConsecutiveHyphens = false; @@ -1626,7 +1634,11 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. 
+ // CPPONLY: pos += accelerateAdvancementData(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -2201,7 +2213,11 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. + // CPPONLY: pos += accelerateAdvancementAttributeValueDoubleQuoted(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -2698,7 +2714,11 @@ public class Tokenizer implements Locator, Locator2 { // CPPONLY: MOZ_FALLTHROUGH; case COMMENT: commentloop: for (;;) { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. + // CPPONLY: pos += accelerateAdvancementComment(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -3194,7 +3214,11 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. + // CPPONLY: pos += accelerateAdvancementCdataSection(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -3281,7 +3305,11 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. 
+ // CPPONLY: pos += accelerateAdvancementAttributeValueSingleQuoted(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -3893,7 +3921,11 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. + // CPPONLY: pos += accelerateAdvancementPlaintext(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -4005,7 +4037,12 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. + // RCDATA and DATA have the same set of characters that they are indifferent to, hence accelerateData. + // CPPONLY: pos += accelerateAdvancementData(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -4056,7 +4093,11 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. + // CPPONLY: pos += accelerateAdvancementRawtext(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -4340,7 +4381,12 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. 
+ // The line below advances pos by some number of code units that this state is indifferent to. + // Using `accelerateAdvancementRawtext`, because this states has the same characters of interest as RAWTEXT. + // CPPONLY: pos += accelerateAdvancementRawtext(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -4536,7 +4582,11 @@ public class Tokenizer implements Locator, Locator2 { if (reconsume) { reconsume = false; } else { - if (++pos == endPos) { + ++pos; + // Perhaps at some point, it will be appropriate to do SIMD in Java, but not today. + // The line below advances pos by some number of code units that this state is indifferent to. + // CPPONLY: pos += accelerateAdvancementScriptDataEscaped(buf, pos, endPos); + if (pos == endPos) { break stateloop; } c = checkChar(buf, pos); @@ -6348,24 +6398,24 @@ public class Tokenizer implements Locator, Locator2 { forceQuirks = false; } - private void adjustDoubleHyphenAndAppendToStrBufCarriageReturn() + @Inline private void adjustDoubleHyphenAndAppendToStrBufCarriageReturn() throws SAXException { silentCarriageReturn(); adjustDoubleHyphenAndAppendToStrBufAndErr('\n', false); } - private void adjustDoubleHyphenAndAppendToStrBufLineFeed() + @Inline private void adjustDoubleHyphenAndAppendToStrBufLineFeed() throws SAXException { silentLineFeed(); adjustDoubleHyphenAndAppendToStrBufAndErr('\n', false); } - private void appendStrBufLineFeed() { + @Inline private void appendStrBufLineFeed() { silentLineFeed(); appendStrBuf('\n'); } - private void appendStrBufCarriageReturn() { + @Inline private void appendStrBufCarriageReturn() { silentCarriageReturn(); appendStrBuf('\n'); } @@ -6383,7 +6433,7 @@ public class Tokenizer implements Locator, Locator2 { // ]NOCPP] - private void emitCarriageReturn(@NoLength char[] buf, int pos) + @Inline private void emitCarriageReturn(@NoLength char[] buf, int pos) throws SAXException { silentCarriageReturn(); flushChars(buf, pos); @@ -6412,7 +6462,7 @@ public 
class Tokenizer implements Locator, Locator2 { cstart = pos + 1; } - private void setAdditionalAndRememberAmpersandLocation(char add) { + @Inline private void setAdditionalAndRememberAmpersandLocation(char add) { additional = add; // [NOCPP[ ampersandLocation = new LocatorImpl(this); @@ -7077,7 +7127,7 @@ public class Tokenizer implements Locator, Locator2 { * happened in a non-text context, this method turns that deferred suspension * request into an immediately-pending suspension request. */ - private void suspendIfRequestedAfterCurrentNonTextToken() { + @Inline private void suspendIfRequestedAfterCurrentNonTextToken() { if (suspendAfterCurrentNonTextToken) { suspendAfterCurrentNonTextToken = false; shouldSuspend = true; @@ -7221,7 +7271,7 @@ public class Tokenizer implements Locator, Locator2 { * @param val * @throws SAXException */ - private void emitOrAppendTwo(@Const @NoLength char[] val, int returnState) + @Inline private void emitOrAppendTwo(@Const @NoLength char[] val, int returnState) throws SAXException { if ((returnState & DATA_AND_RCDATA_MASK) != 0) { appendStrBuf(val[0]); @@ -7231,7 +7281,7 @@ public class Tokenizer implements Locator, Locator2 { } } - private void emitOrAppendOne(@Const @NoLength char[] val, int returnState) + @Inline private void emitOrAppendOne(@Const @NoLength char[] val, int returnState) throws SAXException { if ((returnState & DATA_AND_RCDATA_MASK) != 0) { appendStrBuf(val[0]); @@ -7268,7 +7318,7 @@ public class Tokenizer implements Locator, Locator2 { } } - public void requestSuspension() { + @Inline public void requestSuspension() { shouldSuspend = true; } @@ -7311,7 +7361,7 @@ public class Tokenizer implements Locator, Locator2 { // ]NOCPP] - public boolean isInDataState() { + @Inline public boolean isInDataState() { return (stateSave == DATA); } ===================================== src/nu/validator/htmlparser/impl/TreeBuilder.java ===================================== @@ -226,12 +226,6 @@ public abstract class TreeBuilder 
implements TokenHandler, // no fall-through - private static final int IN_SELECT_IN_TABLE = 10; - - private static final int IN_SELECT = 11; - - // no fall-through - private static final int AFTER_BODY = 12; // no fall-through @@ -952,9 +946,6 @@ public abstract class TreeBuilder implements TokenHandler, * current node. */ break charactersloop; - case IN_SELECT: - case IN_SELECT_IN_TABLE: - break charactersloop; case IN_TABLE: case IN_TABLE_BODY: case IN_ROW: @@ -1166,9 +1157,6 @@ public abstract class TreeBuilder implements TokenHandler, mode = IN_TABLE; i--; continue; - case IN_SELECT: - case IN_SELECT_IN_TABLE: - break charactersloop; case AFTER_BODY: errNonSpaceAfterBody(); fatal(); @@ -1334,8 +1322,6 @@ public abstract class TreeBuilder implements TokenHandler, case IN_TABLE_BODY: case IN_ROW: case IN_TABLE: - case IN_SELECT_IN_TABLE: - case IN_SELECT: case IN_COLUMN_GROUP: case FRAMESET_OK: case IN_CAPTION: @@ -1531,12 +1517,19 @@ public abstract class TreeBuilder implements TokenHandler, if (!(group == FONT && !(attributes.contains(AttributeName.COLOR) || attributes.contains(AttributeName.FACE) || attributes.contains(AttributeName.SIZE)))) { errHtmlStartTagInForeignContext(name); - if (!fragment) { - while (!isSpecialParentInForeign(stack[currentPtr])) { - popForeign(-1, -1); - } + // Pop until we reach an HTML namespace element, + // HTML integration point, or MathML text integration point. + // In fragment case, stop before popping the context element. 
+ while (currentPtr > 0 && !isSpecialParentInForeign(stack[currentPtr])) { + popForeign(-1, -1); + } + if (currentPtr > 0 || isSpecialParentInForeign(stack[currentPtr])) { + // Popped to an HTML element or integration point continue starttagloop; - } // else fall thru + } + // In fragment case with foreign context, fall through + // to let switch(mode) handle the token in HTML namespace + break; } // CPPONLY: MOZ_FALLTHROUGH; default: @@ -2163,6 +2156,25 @@ public abstract class TreeBuilder implements TokenHandler, break starttagloop; case HR: implicitlyCloseP(); + // https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody + // "A start tag whose tag name is "hr"" + // "If the stack of open elements has a select element in scope:" + if (findLastInScope("select") != TreeBuilder.NOT_FOUND_ON_STACK) { + // "1. Generate implied end tags." + generateImpliedEndTags(); + // "2. If the stack of open elements has an option element + // in scope or has an optgroup element in scope, then + // this is a parse error." + if (errorHandler != null + && (findLastInScope("option") != TreeBuilder.NOT_FOUND_ON_STACK + || findLastInScope("optgroup") != TreeBuilder.NOT_FOUND_ON_STACK)) { + errUnclosedElementsImplied( + findLastInScope("option") != TreeBuilder.NOT_FOUND_ON_STACK + ? 
findLastInScope("option") + : findLastInScope("optgroup"), + name); + } + } appendVoidElementToCurrentMayFoster( elementName, attributes); @@ -2177,7 +2189,31 @@ public abstract class TreeBuilder implements TokenHandler, elementName = ElementName.IMG; continue starttagloop; case IMG: + reconstructTheActiveFormattingElements(); + appendVoidElementToCurrentMayFoster( + elementName, attributes, + formPointer); + selfClosing = false; + // [NOCPP[ + voidElement = true; + // ]NOCPP] + attributes = null; // CPP + break starttagloop; case INPUT: + // https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody + // "A start tag whose tag name is "input"" + // "If the stack of open elements has a select element + // in scope:" + eltPos = findLastInScope("select"); + if (eltPos != TreeBuilder.NOT_FOUND_ON_STACK) { + // "Parse error." + errStartTagWithSelectOpen(name); + // "Pop elements until a select element has been popped." + while (currentPtr >= eltPos) { + pop(); + } + continue starttagloop; + } reconstructTheActiveFormattingElements(); appendVoidElementToCurrentMayFoster( elementName, attributes, @@ -2228,31 +2264,100 @@ public abstract class TreeBuilder implements TokenHandler, attributes = null; // CPP break starttagloop; case SELECT: + // https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody + // "A start tag whose tag name is "select"" + // "If the parser was created as part of the HTML fragment + // parsing algorithm and the context element is a select + // element:" + if (fragment && "select" == contextName) { + // "Parse error. Ignore the token." + errStartSelectWhereEndSelectExpected(); + break starttagloop; + } + // "Otherwise, if the stack of open elements has a select + // element in scope:" + eltPos = findLastInScope(name); + if (eltPos != TreeBuilder.NOT_FOUND_ON_STACK) { + // "Parse error." + errStartSelectWhereEndSelectExpected(); + // "Pop elements until a select element has been popped." 
+ while (currentPtr >= eltPos) { + pop(); + } + break starttagloop; + } + // "Otherwise:" + // "Reconstruct the active formatting elements, if any." reconstructTheActiveFormattingElements(); + // "Insert an HTML element for the token." appendToCurrentNodeAndPushElementMayFoster( elementName, attributes, formPointer); - switch (mode) { - case IN_TABLE: - case IN_CAPTION: - case IN_COLUMN_GROUP: - case IN_TABLE_BODY: - case IN_ROW: - case IN_CELL: - mode = IN_SELECT_IN_TABLE; - break; - default: - mode = IN_SELECT; - break; + // "Set the frameset-ok flag to "not ok"." + framesetOk = false; + attributes = null; // CPP + break starttagloop; + case OPTION: + // https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody + // "A start tag whose tag name is "option"" + // "If the stack of open elements has a select element in scope:" + if (findLastInScope("select") != TreeBuilder.NOT_FOUND_ON_STACK) { + // "1. Generate implied end tags except for optgroup elements." + generateImpliedEndTagsExceptFor("optgroup"); + // "2. If the stack of open elements has an option element + // in scope, then this is a parse error." + if (errorHandler != null) { + int optionPos = findLastInScope("option"); + if (optionPos != TreeBuilder.NOT_FOUND_ON_STACK) { + errUnclosedElementsImplied(optionPos, name); + } + } + } else { + // "Otherwise, if the current node is an option element, + // then pop the current node from the stack of open elements." + if (isCurrent("option")) { + pop(); + } } + // "Reconstruct the active formatting elements, if any." + reconstructTheActiveFormattingElements(); + // "Insert an HTML element for the token." 
+ appendToCurrentNodeAndPushElementMayFoster( + elementName, + attributes); attributes = null; // CPP break starttagloop; case OPTGROUP: - case OPTION: - if (isCurrent("option")) { - pop(); + // https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody + // "A start tag whose tag name is "optgroup"" + // "If the stack of open elements has a select element in scope:" + if (findLastInScope("select") != TreeBuilder.NOT_FOUND_ON_STACK) { + // "1. Generate implied end tags." + generateImpliedEndTags(); + // "2. If the stack of open elements has an option element + // in scope or has an optgroup element in scope, then + // this is a parse error." + if (errorHandler != null) { + int optionPos = findLastInScope("option"); + if (optionPos != TreeBuilder.NOT_FOUND_ON_STACK) { + errUnclosedElementsImplied(optionPos, name); + } else { + int optgroupPos = findLastInScope("optgroup"); + if (optgroupPos != TreeBuilder.NOT_FOUND_ON_STACK) { + errUnclosedElementsImplied(optgroupPos, name); + } + } + } + } else { + // "Otherwise, if the current node is an option element, + // then pop the current node from the stack of open elements." + if (isCurrent("option")) { + pop(); + } } + // "Reconstruct the active formatting elements, if any." reconstructTheActiveFormattingElements(); + // "Insert an HTML element for the token." 
appendToCurrentNodeAndPushElementMayFoster( elementName, attributes); @@ -2322,14 +2427,18 @@ public abstract class TreeBuilder implements TokenHandler, attributes = null; // CPP break starttagloop; case CAPTION: - case COL: - case COLGROUP: case TBODY_OR_THEAD_OR_TFOOT: case TR: case TD_OR_TH: + case COL: + case COLGROUP: case FRAME: case FRAMESET: case HEAD: + // https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody + // "A start tag whose tag name is one of: "caption", "col", + // "colgroup", "frame", "frameset", "head", "tbody", "td", + // "tfoot", "th", "thead", "tr"" errStrayStartTag(name); break starttagloop; case OUTPUT: @@ -2507,111 +2616,6 @@ public abstract class TreeBuilder implements TokenHandler, mode = IN_TABLE; continue; } - case IN_SELECT_IN_TABLE: - switch (group) { - case CAPTION: - case TBODY_OR_THEAD_OR_TFOOT: - case TR: - case TD_OR_TH: - case TABLE: - errStartTagWithSelectOpen(name); - eltPos = findLastInTableScope("select"); - if (eltPos == TreeBuilder.NOT_FOUND_ON_STACK) { - assert fragment; - break starttagloop; // http://www.w3.org/Bugs/Public/show_bug.cgi?id=8375 - } - while (currentPtr >= eltPos) { - pop(); - } - resetTheInsertionMode(); - continue; - default: - // fall through to IN_SELECT - } - // CPPONLY: MOZ_FALLTHROUGH; - case IN_SELECT: - switch (group) { - case HTML: - errStrayStartTag(name); - if (!fragment) { - addAttributesToHtml(attributes); - attributes = null; // CPP - } - break starttagloop; - case OPTION: - if (isCurrent("option")) { - pop(); - } - appendToCurrentNodeAndPushElement( - elementName, - attributes); - attributes = null; // CPP - break starttagloop; - case OPTGROUP: - if (isCurrent("option")) { - pop(); - } - if (isCurrent("optgroup")) { - pop(); - } - appendToCurrentNodeAndPushElement( - elementName, - attributes); - attributes = null; // CPP - break starttagloop; - case SELECT: - errStartSelectWhereEndSelectExpected(); - eltPos = findLastInTableScope(name); - if (eltPos == 
TreeBuilder.NOT_FOUND_ON_STACK) { - assert fragment; - errNoSelectInTableScope(); - break starttagloop; - } else { - while (currentPtr >= eltPos) { - pop(); - } - resetTheInsertionMode(); - break starttagloop; - } - case INPUT: - case TEXTAREA: - errStartTagWithSelectOpen(name); - eltPos = findLastInTableScope("select"); - if (eltPos == TreeBuilder.NOT_FOUND_ON_STACK) { - assert fragment; - break starttagloop; - } - while (currentPtr >= eltPos) { - pop(); - } - resetTheInsertionMode(); - continue; - case SCRIPT: - startTagScriptInHead(elementName, attributes); - attributes = null; // CPP - break starttagloop; - case TEMPLATE: - startTagTemplateInHead(elementName, attributes); - attributes = null; // CPP - break starttagloop; - case HR: - if (isCurrent("option")) { - pop(); - } - if (isCurrent("optgroup")) { - pop(); - } - appendVoidElementToCurrent(elementName, attributes); - selfClosing = false; - // [NOCPP[ - voidElement = true; - // ]NOCPP] - attributes = null; // CPP - break starttagloop; - default: - errStrayStartTag(name); - break starttagloop; - } case AFTER_BODY: switch (group) { case HTML: @@ -2992,9 +2996,11 @@ public abstract class TreeBuilder implements TokenHandler, boolean shadowRootIsClonable = attributes.contains(AttributeName.SHADOWROOTCLONABLE); boolean shadowRootIsSerializable = attributes.contains(AttributeName.SHADOWROOTSERIALIZABLE); boolean shadowRootDelegatesFocus = attributes.contains(AttributeName.SHADOWROOTDELEGATESFOCUS); + boolean shadowRootCustomElementRegistry = attributes.contains(AttributeName.SHADOWROOTCUSTOMELEMENTREGISTRY); String shadowRootReferenceTarget = attributes.getValue(AttributeName.SHADOWROOTREFERENCETARGET); + String shadowRootSlotAssignment = attributes.getValue(AttributeName.SHADOWROOTSLOTASSIGNMENT); - return getShadowRootFromHost(currentNode, templateNode, shadowRootMode, shadowRootIsClonable, shadowRootIsSerializable, shadowRootDelegatesFocus, shadowRootReferenceTarget); + return getShadowRootFromHost(currentNode, 
templateNode, shadowRootMode, shadowRootIsClonable, shadowRootIsSerializable, shadowRootDelegatesFocus, shadowRootCustomElementRegistry, shadowRootSlotAssignment, shadowRootReferenceTarget); } /** @@ -3220,6 +3226,11 @@ public abstract class TreeBuilder implements TokenHandler, for (;;) { if (eltPos == 0) { assert fragment: "We can get this close to the root of the stack in foreign content only in the fragment case."; + // For

</p> and </br>
, continue to mode handling + // which will create implied start tags + if (group == P || group == BR) { + break; // break from inner loop, continue to switch(mode) + } break endtagloop; } if (stack[eltPos].name == name) { @@ -3513,7 +3524,11 @@ public abstract class TreeBuilder implements TokenHandler, case PRE_OR_LISTING: case FIELDSET: case BUTTON: + case SELECT: case ADDRESS_OR_ARTICLE_OR_ASIDE_OR_DETAILS_OR_DIALOG_OR_DIR_OR_FIGCAPTION_OR_FIGURE_OR_FOOTER_OR_HEADER_OR_HGROUP_OR_MAIN_OR_NAV_OR_SEARCH_OR_SECTION_OR_SUMMARY: + // https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody + // "An end tag whose tag name is one of: "address", "article", + // ..., "select", ..., "ul"" eltPos = findLastInScope(name); if (eltPos == TreeBuilder.NOT_FOUND_ON_STACK) { errStrayEndTag(name); @@ -3564,12 +3579,10 @@ public abstract class TreeBuilder implements TokenHandler, eltPos = findLastInButtonScope("p"); if (eltPos == TreeBuilder.NOT_FOUND_ON_STACK) { errNoElementToCloseButEndTagSeen("p"); - // XXX Can the 'in foreign' case happen anymore? if (isInForeign()) { errHtmlStartTagInForeignContext(name); - // Check for currentPtr for the fragment - // case. - while (currentPtr >= 0 && stack[currentPtr].ns != "http://www.w3.org/1999/xhtml") { + // Pop foreign elements, but keep context element in fragment case + while (currentPtr > 0 && stack[currentPtr].ns != "http://www.w3.org/1999/xhtml") { pop(); } } @@ -3650,11 +3663,9 @@ public abstract class TreeBuilder implements TokenHandler, case BR: errEndTagBr(); if (isInForeign()) { - // XXX can this happen anymore? errHtmlStartTagInForeignContext(name); - // Check for currentPtr for the fragment - // case. 
- while (currentPtr >= 0 && stack[currentPtr].ns != "http://www.w3.org/1999/xhtml") { + // Pop foreign elements, but keep context element in fragment case + while (currentPtr > 0 && stack[currentPtr].ns != "http://www.w3.org/1999/xhtml") { pop(); } } @@ -3677,7 +3688,6 @@ public abstract class TreeBuilder implements TokenHandler, case IFRAME: case NOEMBED: // XXX??? case NOFRAMES: // XXX?? - case SELECT: case TABLE: case TEXTAREA: // XXX?? errStrayEndTag(name); @@ -3787,72 +3797,6 @@ public abstract class TreeBuilder implements TokenHandler, mode = IN_TABLE; continue; } - case IN_SELECT_IN_TABLE: - switch (group) { - case CAPTION: - case TABLE: - case TBODY_OR_THEAD_OR_TFOOT: - case TR: - case TD_OR_TH: - errEndTagSeenWithSelectOpen(name); - if (findLastInTableScope(name) != TreeBuilder.NOT_FOUND_ON_STACK) { - eltPos = findLastInTableScope("select"); - if (eltPos == TreeBuilder.NOT_FOUND_ON_STACK) { - assert fragment; - break endtagloop; // http://www.w3.org/Bugs/Public/show_bug.cgi?id=8375 - } - while (currentPtr >= eltPos) { - pop(); - } - resetTheInsertionMode(); - continue; - } else { - break endtagloop; - } - default: - // fall through to IN_SELECT - } - // CPPONLY: MOZ_FALLTHROUGH; - case IN_SELECT: - switch (group) { - case OPTION: - if (isCurrent("option")) { - pop(); - break endtagloop; - } else { - errStrayEndTag(name); - break endtagloop; - } - case OPTGROUP: - if (isCurrent("option") - && "optgroup" == stack[currentPtr - 1].name) { - pop(); - } - if (isCurrent("optgroup")) { - pop(); - } else { - errStrayEndTag(name); - } - break endtagloop; - case SELECT: - eltPos = findLastInTableScope("select"); - if (eltPos == TreeBuilder.NOT_FOUND_ON_STACK) { - assert fragment; - errStrayEndTag(name); - break endtagloop; - } - while (currentPtr >= eltPos) { - pop(); - } - resetTheInsertionMode(); - break endtagloop; - case TEMPLATE: - endTagTemplateInHead(); - break endtagloop; - default: - errStrayEndTag(name); - break endtagloop; - } case AFTER_BODY: switch 
(group) { case HTML: @@ -4312,23 +4256,7 @@ public abstract class TreeBuilder implements TokenHandler, return; } } - if ("select" == name) { - int ancestorIndex = i; - while (ancestorIndex > 0) { - StackNode ancestor = stack[ancestorIndex--]; - if ("http://www.w3.org/1999/xhtml" == ancestor.ns) { - if ("template" == ancestor.name) { - break; - } - if ("table" == ancestor.name) { - mode = IN_SELECT_IN_TABLE; - return; - } - } - } - mode = IN_SELECT; - return; - } else if ("td" == name || "th" == name) { + if ("td" == name || "th" == name) { mode = IN_CELL; return; } else if ("tr" == name) { @@ -5089,6 +5017,9 @@ public abstract class TreeBuilder implements TokenHandler, private void pop() throws SAXException { StackNode node = stack[currentPtr]; assert debugOnlyClearLastStackSlot(); + if (node.getGroup() == OPTION) { + optionElementPopped(node.node); + } currentPtr--; elementPopped(node.ns, node.popName, node.node); node.release(this); @@ -5100,6 +5031,9 @@ public abstract class TreeBuilder implements TokenHandler, markMalformedIfScript(node.node); } assert debugOnlyClearLastStackSlot(); + if (node.getGroup() == OPTION) { + optionElementPopped(node.node); + } currentPtr--; elementPopped(node.ns, node.popName, node.node); node.release(this); @@ -5108,6 +5042,7 @@ public abstract class TreeBuilder implements TokenHandler, private void silentPop() throws SAXException { StackNode node = stack[currentPtr]; assert debugOnlyClearLastStackSlot(); + assert node.getGroup() != OPTION; currentPtr--; node.release(this); } @@ -5115,6 +5050,9 @@ public abstract class TreeBuilder implements TokenHandler, private void popOnEof() throws SAXException { StackNode node = stack[currentPtr]; assert debugOnlyClearLastStackSlot(); + if (node.getGroup() == OPTION) { + optionElementPopped(node.node); + } currentPtr--; markMalformedIfScript(node.node); elementPopped(node.ns, node.popName, node.node); @@ -5443,6 +5381,7 @@ public abstract class TreeBuilder implements TokenHandler, T 
getShadowRootFromHost(T host, T template, String shadowRootMode, boolean shadowRootIsClonable, boolean shadowRootIsSerializable, boolean shadowRootDelegatesFocus, + boolean shadowRootCustomElementRegistry, String shadowRootSlotAssignment, String shadowRootReferenceTarget) { return null; } @@ -5752,6 +5691,18 @@ public abstract class TreeBuilder implements TokenHandler, protected abstract void detachFromParent(T element) throws SAXException; + /** + * Called when an option element is popped from the stack. + * + * https://html.spec.whatwg.org/multipage/form-elements.html#maybe-clone-an-option-into-selectedcontent + * Implements "maybe clone an option into selectedcontent" for + * customizable select. Subclasses that support DOM operations + * should override this to perform the cloning. + */ + protected void optionElementPopped(T option) throws SAXException { + // Default: no-op (streaming/SAX mode ignores cloning) + } + protected abstract boolean hasChildren(T element) throws SAXException; protected abstract void appendElement(T child, T newParent) @@ -6499,10 +6450,6 @@ public abstract class TreeBuilder implements TokenHandler, } } - private void errNoSelectInTableScope() throws SAXException { - err("No \u201Cselect\u201D in table scope."); - } - private void errStartSelectWhereEndSelectExpected() throws SAXException { err("\u201Cselect\u201D start tag where end tag expected."); } @@ -6580,14 +6527,6 @@ public abstract class TreeBuilder implements TokenHandler, err("Saw an end tag after \u201Cbody\u201D had been closed."); } - private void errEndTagSeenWithSelectOpen(@Local String name) throws SAXException { - if (errorHandler == null) { - return; - } - errNoCheck("\u201C" + name - + "\u201D end tag with \u201Cselect\u201D open."); - } - private void errGarbageInColgroup() throws SAXException { err("Garbage in \u201Ccolgroup\u201D fragment."); } ===================================== src/nu/validator/htmlparser/sax/SAXTreeBuilder.java 
===================================== @@ -34,6 +34,7 @@ import nu.validator.saxtree.Document; import nu.validator.saxtree.DocumentFragment; import nu.validator.saxtree.Element; import nu.validator.saxtree.Node; +import nu.validator.saxtree.NodeType; import nu.validator.saxtree.ParentNode; class SAXTreeBuilder extends TreeBuilder { @@ -197,4 +198,140 @@ class SAXTreeBuilder extends TreeBuilder { throws SAXException { element.detach(); } + + @Override + // https://html.spec.whatwg.org/multipage/form-elements.html#maybe-clone-an-option-into-selectedcontent + // Implements "maybe clone an option into selectedcontent" + protected void optionElementPopped(Element option) throws SAXException { + // Find the nearest ancestor + Element selectedContent = findDescendant(select, "selectedcontent"); + if (selectedContent == null) { + return; + } + + // Check option selectedness + boolean hasSelected = option.getAttributes().getIndex("", "selected") >= 0; + if (!hasSelected && selectedContent.getFirstChild() != null) { + // Not the first option and no explicit selected attr + return; + } + + // Clear selectedcontent children and deep-clone option children + selectedContent.clearChildren(); + deepCloneChildren(option, selectedContent); + } + + private Element findAncestor(Element element, String localName) { + ParentNode parent = element.getParentNode(); + while (parent != null) { + if (parent.getNodeType() == NodeType.ELEMENT) { + Element elt = (Element) parent; + if (localName.equals(elt.getLocalName()) + && "http://www.w3.org/1999/xhtml".equals(elt.getUri())) { + return elt; + } + } + if (parent instanceof Node) { + parent = ((Node) parent).getParentNode(); + } else { + break; + } + } + return null; + } + + private Element findDescendant(Element root, String localName) { + Node current = root.getFirstChild(); + if (current == null) { + return null; + } + Node next; + for (;;) { + if (current.getNodeType() == NodeType.ELEMENT) { + Element elt = (Element) current; + if 
(localName.equals(elt.getLocalName()) + && "http://www.w3.org/1999/xhtml".equals( + elt.getUri())) { + return elt; + } + } + if ((next = current.getFirstChild()) != null) { + current = next; + continue; + } + for (;;) { + if (current.getParentNode() == root) { + if ((next = current.getNextSibling()) != null) { + current = next; + break; + } + return null; + } + if ((next = current.getNextSibling()) != null) { + current = next; + break; + } + current = (Node) current.getParentNode(); + } + } + } + + private void deepCloneChildren(Element source, Element destination) + throws SAXException { + Node current = source.getFirstChild(); + if (current == null) { + return; + } + ParentNode destParent = destination; + Node next; + outer: + for (;;) { + switch (current.getNodeType()) { + case ELEMENT: + Element srcElem = (Element) current; + Element cloneElem = new Element(null, + srcElem.getUri(), + srcElem.getLocalName(), + srcElem.getQName(), + srcElem.getAttributes(), + false, + srcElem.getPrefixMappings()); + destParent.appendChild(cloneElem); + if ((next = srcElem.getFirstChild()) != null) { + destParent = cloneElem; + current = next; + continue outer; + } + break; + case CHARACTERS: + Characters srcChars = (Characters) current; + char[] buf = srcChars.getBuffer(); + destParent.appendChild( + new Characters(null, buf, 0, buf.length)); + break; + default: + break; + } + for (;;) { + if ((next = current.getNextSibling()) != null) { + current = next; + break; + } + if (current.getParentNode() == source) { + return; + } + current = (Node) current.getParentNode(); + destParent = (ParentNode) destParent.getParentNode(); + } + } + } } ===================================== src/nu/validator/htmlparser/xom/XOMTreeBuilder.java ===================================== @@ -23,6 +23,8 @@ package nu.validator.htmlparser.xom; +import java.util.ArrayDeque; + import nu.validator.htmlparser.common.DocumentMode; import nu.validator.htmlparser.impl.CoalescingTreeBuilder; import 
nu.validator.htmlparser.impl.HtmlAttributes; @@ -348,4 +350,79 @@ class XOMTreeBuilder extends CoalescingTreeBuilder { cachedTableIndex = -1; cachedTable = null; } + + @Override + // https://html.spec.whatwg.org/multipage/form-elements.html#maybe-clone-an-option-into-selectedcontent + // Implements "maybe clone an option into selectedcontent" + protected void optionElementPopped(Element option) throws SAXException { + try { + // Find the nearest ancestor + Element selectedContent = findSelectedContent(select); + if (selectedContent == null) { + return; + } + + // Check option selectedness + boolean hasSelected = option.getAttribute("selected") != null; + if (!hasSelected && selectedContent.getChildCount() > 0) { + // Not the first option and no explicit selected attr + return; + } + + // Clear selectedcontent children and deep-clone option children + selectedContent.removeChildren(); + for (int i = 0; i < option.getChildCount(); i++) { + selectedContent.appendChild(option.getChild(i).copy()); + } + } catch (XMLException e) { + fatal(e); + } + } + + private Element findSelectedContent(Element root) { + ArrayDeque stack = new ArrayDeque<>(); + for (int i = root.getChildCount() - 1; i >= 0; i--) { + Node child = root.getChild(i); + if (child instanceof Element) { + stack.push((Element) child); + } + } + while (!stack.isEmpty()) { + Element current = stack.pop(); + if ("selectedcontent".equals(current.getLocalName()) + && "http://www.w3.org/1999/xhtml".equals( + current.getNamespaceURI())) { + return current; + } + for (int i = current.getChildCount() - 1; i >= 0; i--) { + Node child = current.getChild(i); + if (child instanceof Element) { + stack.push((Element) child); + } + } + } + return null; + } } ===================================== src/nu/validator/saxtree/CharBufferNode.java ===================================== @@ -50,6 +50,14 @@ public abstract class CharBufferNode extends Node { System.arraycopy(buf, start, buffer, 0, length); } + /** + * Returns the 
buffer. + * @return the buffer + */ + public char[] getBuffer() { + return buffer; + } + /** * Returns the wrapped buffer as a string. * ===================================== src/nu/validator/saxtree/ParentNode.java ===================================== @@ -202,7 +202,22 @@ public abstract class ParentNode extends Node { prev.setNextSibling(node.getNextSibling()); if (lastChild == node) { lastChild = prev; - } + } + } + } + + /** + * Remove all children from this node. + */ + public void clearChildren() { + Node child = firstChild; + while (child != null) { + Node next = child.getNextSibling(); + child.setParentNode(null); + child.setNextSibling(null); + child = next; } + firstChild = null; + lastChild = null; } } ===================================== translator-src/nu/validator/htmlparser/cpptranslate/CppTypes.java ===================================== @@ -81,8 +81,14 @@ public class CppTypes { reservedWords.add("unicode"); } + private static Map methodRenames = new HashMap(); + + static { + methodRenames.put("htmlaccelEnabled", "mozilla::htmlaccel::htmlaccelEnabled"); + } + private static final String[] TREE_BUILDER_INCLUDES = { "jArray", - "mozilla/ImportScanner", "mozilla/Likely", + "mozilla/ImportScanner", "nsAHtml5TreeBuilderState", "nsAtom", "nsContentUtils", "nsGkAtoms", "nsHtml5ArrayCopy", "nsHtml5AtomTable", "nsHtml5DocumentMode", "nsHtml5Highlighter", "nsHtml5OplessBuilder", "nsHtml5Parser", @@ -91,12 +97,12 @@ public class CppTypes { "nsHtml5TreeOpExecutor", "nsHtml5ViewSourceUtils", "nsIContent", "nsIContentHandle", "nsNameSpaceManager", "nsTraceRefcnt", }; - private static final String[] TOKENIZER_INCLUDES = { "jArray", + private static final String[] TOKENIZER_INCLUDES = { "jArray", "nsAHtml5TreeBuilderState", "nsAtom", "nsGkAtoms", "nsHtml5ArrayCopy", "nsHtml5AtomTable", "nsHtml5DocumentMode", "nsHtml5Highlighter", "nsHtml5Macros", "nsHtml5NamedCharacters", - "nsHtml5NamedCharactersAccel", "nsHtml5String", - "nsIContent", "nsTraceRefcnt" }; + 
"nsHtml5NamedCharactersAccel", "nsHtml5String", "nsHtml5TreeBuilder", + "nsIContent", "nsTraceRefcnt", "mozilla/htmlaccel/htmlaccelEnabled" }; private static final String[] STACK_NODE_INCLUDES = { "nsAtom", "nsHtml5AtomTable", "nsHtml5HtmlAttributes", "nsHtml5String", "nsNameSpaceManager", "nsIContent", @@ -359,6 +365,14 @@ public class CppTypes { return candidate; } + public String mapMethodName(String method) { + String mapped = methodRenames.get(method); + if (mapped == null) { + return method; + } + return mapped; + } + public String stringForLiteral(String literal) { return '"' + literal + '"'; } @@ -486,6 +500,10 @@ public class CppTypes { return "P::checkChar"; } + public String policyPrefix() { + return "P::"; + } + public String silentLineFeed() { return "P::silentLineFeed"; } @@ -537,8 +555,4 @@ public class CppTypes { public String crashMacro() { return "MOZ_CRASH"; } - - public String loopPolicyInclude() { - return "nsHtml5TokenizerLoopPolicies"; - } } ===================================== translator-src/nu/validator/htmlparser/cpptranslate/CppVisitor.java ===================================== @@ -220,7 +220,7 @@ public class CppVisitor extends AnnotationHelperVisitor { private boolean inConstructorBody = false; - private String currentMethod = null; + protected String currentMethod = null; private Set labels = null; @@ -439,16 +439,6 @@ public class CppVisitor extends AnnotationHelperVisitor { printer.print(className); printer.printLn(".h\""); printer.printLn(); - - if ("Tokenizer".equals(javaClassName)) { - String loopPolicyInclude = cppTypes.loopPolicyInclude(); - if (loopPolicyInclude != null) { - printer.print("#include \""); - printer.print(loopPolicyInclude); - printer.printLn(".h\""); - printer.printLn(); - } - } } public void visit(EmptyTypeDeclaration n, LocalSymbolTable arg) { @@ -1320,6 +1310,9 @@ public class CppVisitor extends AnnotationHelperVisitor { } else if ("checkChar".equals(n.getName()) && n.getScope() == null) { visitCheckChar(n, 
arg); + } else if (n.getName().startsWith("accelerateAdvancement") + && n.getScope() == null) { + visitAccelerateAdvancement(n, arg); } else if ("silentCarriageReturn".equals(n.getName()) && n.getScope() == null) { visitSilentCarriageReturn(n, arg); @@ -1402,7 +1395,7 @@ public class CppVisitor extends AnnotationHelperVisitor { } } printTypeArgs(n.getTypeArgs(), arg); - printer.print(n.getName()); + printer.print(cppTypes.mapMethodName(n.getName())); if ("stateLoop".equals(n.getName()) && "Tokenizer".equals(javaClassName) && cppTypes.stateLoopPolicies().length > 0) { @@ -1646,15 +1639,11 @@ public class CppVisitor extends AnnotationHelperVisitor { printModifiers(n.getModifiers()); } - if (cppTypes.requiresTemplateParameter(currentMethod) + if (!inHeader() && cppTypes.requiresTemplateParameter(currentMethod) && "Tokenizer".equals(javaClassName) && cppTypes.stateLoopPolicies().length > 0) { printer.print("template"); - if (inHeader()) { - printer.print(" "); - } else { - printer.printLn(); - } + printer.printLn(); } printTypeParameters(n.getTypeParameters(), arg); @@ -1956,6 +1945,23 @@ public class CppVisitor extends AnnotationHelperVisitor { printer.print(")"); } + private void visitAccelerateAdvancement(MethodCallExpr call, LocalSymbolTable arg) { + List args = call.getArgs(); + printer.print(cppTypes.policyPrefix()); + printer.print(call.getName()); + printer.print("(this, "); + if (call.getArgs() != null) { + for (Iterator i = call.getArgs().iterator(); i.hasNext();) { + Expression e = i.next(); + e.accept(this, arg); + if (i.hasNext()) { + printer.print(", "); + } + } + } + printer.print(")"); + } + private void visitSilentLineFeed(MethodCallExpr call, LocalSymbolTable arg) { printer.print(cppTypes.silentLineFeed()); printer.print("(this)"); ===================================== translator-src/nu/validator/htmlparser/cpptranslate/HVisitor.java ===================================== @@ -182,6 +182,12 @@ public class HVisitor extends CppVisitor { 
previousVisibility = Visibility.PUBLIC; } } + if (cppTypes.requiresTemplateParameter(currentMethod) + && "Tokenizer".equals(javaClassName) + && cppTypes.stateLoopPolicies().length > 0) { + printer.print("template"); + printer.printLn(); + } if (inline()) { printer.print("inline "); } View it on GitLab: https://salsa.debian.org/java-team/libhtml5parser-java/-/commit/5c6cfbbfa87c52c8b048b0181f89fc0f5fbc4d42
From gitlab at salsa.debian.org Fri May 1 14:02:10 2026 From: gitlab at salsa.debian.org (bastif (@bastif)) Date: Fri, 01 May 2026 13:02:10 +0000 Subject: [Git][java-team/libhtml5parser-java][master] d/watch: use HEAD to track latest version Message-ID: <69f4a452cebc7_52ffdb9c4348d@godard.mail> bastif pushed to branch master at Debian Java Maintainers / libhtml5parser-java Commits: eeccb848 by Fab Stz at 2026-05-01T15:01:37+02:00 d/watch: use HEAD to track latest version - - - - - 1 changed file: - debian/watch Changes: ===================================== debian/watch ===================================== @@ -1,2 +1,3 @@ -version=3 -https://about.validator.nu/htmlparser/htmlparser-(.*)\.zip +version=4 +opts="mode=git, pretty=1.4+r%cd, pgpmode=none, dversionmangle=s/\+(debian|dfsg|ds|deb)(\.?\d+)?$//" \ + https://github.com/validator/htmlparser HEAD View it on GitLab: https://salsa.debian.org/java-team/libhtml5parser-java/-/commit/eeccb8485c0f4508c99dde769fd72fa79d5b2831
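For readers following the ParentNode.java hunk in the 1.4+r20260416 diff, the clearChildren() loop it adds can be sketched in isolation as below. This is only an illustration under assumptions: DemoNode, DemoParent and ClearChildrenDemo are hypothetical stand-ins, not the real nu.validator.saxtree classes, and they model just the parent/next-sibling links that the loop touches.

```java
// Hypothetical stand-in for a saxtree node: only the two links that
// the unlinking loop needs are modelled here.
class DemoNode {
    DemoParent parent;     // owning parent, or null when detached
    DemoNode nextSibling;  // next child in the parent's list, or null
    final String label;

    DemoNode(String label) {
        this.label = label;
    }
}

// Hypothetical stand-in for a parent node holding a singly linked
// list of children with head and tail pointers.
class DemoParent {
    DemoNode firstChild;
    DemoNode lastChild;

    // Append a child at the tail of the sibling list.
    void appendChild(DemoNode child) {
        child.parent = this;
        child.nextSibling = null;
        if (firstChild == null) {
            firstChild = child;
        } else {
            lastChild.nextSibling = child;
        }
        lastChild = child;
    }

    // Same shape as the loop in the diff: remember the next sibling
    // before severing the current child's links, then reset both
    // ends of the list.
    void clearChildren() {
        DemoNode child = firstChild;
        while (child != null) {
            DemoNode next = child.nextSibling;
            child.parent = null;
            child.nextSibling = null;
            child = next;
        }
        firstChild = null;
        lastChild = null;
    }

    // Count children by walking the sibling list.
    int childCount() {
        int n = 0;
        for (DemoNode c = firstChild; c != null; c = c.nextSibling) {
            n++;
        }
        return n;
    }
}

class ClearChildrenDemo {
    public static void main(String[] args) {
        DemoParent parent = new DemoParent();
        parent.appendChild(new DemoNode("a"));
        parent.appendChild(new DemoNode("b"));
        System.out.println("before: " + parent.childCount());
        parent.clearChildren();
        System.out.println("after: " + parent.childCount());
    }
}
```

Saving the next sibling before clearing the links is what lets the walk continue after each child has been detached; resetting only firstChild and lastChild would empty the list but leave the removed children still pointing at their old parent and siblings.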