XMLEventReader does not handle &apos; properly #72
Comments
* Upgrade junit-interface to 0.11 & Fixes scala#72
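Furthermore, it does not handle other HTML entities well. There are well over 1,000 HTML entities (see list), and their values are simply tossed out. Test program:

    object testEntityErr {
      import scala.io.Source
      import scala.xml.pull._

      // The first element uses four of the five predefined XML entities;
      // the second uses &apos; plus several HTML-only named entities.
      val testStr = "<text> &amp; &quot; &lt; &gt; </text>" +
        "<notext> &apos; &copy; &reg; &euro; &dollar; &cent; &pound; &yen; </notext>"
      val xml = new XMLEventReader(Source.fromString(testStr))
      for (event <- xml) {
        event match {
          case EvEntityRef(e) => println(e)     // recognized entity reference
          case EvComment(_)   => println(event) // unknown entities surface as comments
          case _              => // ignore
        }
      }
    }

Output: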
@Mark-L6n Thanks for sharing your comments. Would you be willing to create a separate issue with your concerns? It seems broader than the original issue.
It seems like Nine years later, can we put back that line for There is some commentary on the Entities representing special characters in XHTML at Wikipedia. I couldn't find an analysis of browser support for Presumably, all browsers support
* src/main/scala/scala/xml/Utility.scala: Uncomment apos in Escapes map. * src/test/scala/scala/xml/pull/XMLEventReaderTest.scala (entityRefTest): Unit test from Fehmi Can Saglam <fehmican.saglam@gmail.com>
Fix scala#72 XMLEventReader does not handle &apos; properly
It would be great if this were included in an upcoming release. I've been using #89 for a while now without issue.
@lespea Thanks for your feedback on #89. I had suggested on the Release Plans wiki page to delay merging a PR on this. Maybe I'm making too big of a deal over compatibility issues?
I'm certainly not an expert on the
Seems fixing the XMLEventReader bug is not mutually exclusive with breaking HTML support. The MarkupParser doesn't even use
Then, the Utility.escape method doesn't even use the
* src/main/scala/scala/xml/Utility.scala: Uncomment apos in Escapes map. (escape): Add case for apos. * src/test/scala/scala/xml/pull/XMLEventReaderTest.scala (entityRefTest): Unit test from Fehmi Can Saglam <fehmican.saglam@gmail.com>
* jvm/src/main/scala/scala/xml/Utility.scala: Uncomment apos in Escapes map. (escape): Add case for apos. * jvm/src/test/scala/scala/xml/pull/XMLEventReaderTest.scala (entityRefTest): Unit test from Fehmi Can Saglam <fehmican.saglam@gmail.com>
* shared/src/main/scala/scala/xml/parsing/MarkupParser.scala: Import unescMap instead of pairs. * shared/src/main/scala/scala/xml/parsing/MarkupParserCommon.scala: Import unescMap instead of pairs. * jvm/src/main/scala/scala/xml/Utility.scala: Uncomment apos in unescape map. * jvm/src/test/scala/scala/xml/pull/XMLEventReaderTest.scala (entityRefTest): Unit test from Fehmi Can Saglam <fehmican.saglam@gmail.com>
* shared/src/main/scala/scala/xml/Utility.scala: Uncomment apos in unescape map. * jvm/src/test/scala/scala/xml/pull/XMLEventReaderTest.scala (entityRefTest): Refactor unit test.
Fix #72 XMLEventReader does not handle &apos; properly
(This issue migrated from https://issues.scala-lang.org/browse/SI-7796)
Of the five required predefined entities in XML, XMLEventReader does not handle &apos;, returning an EvComment of " unknown entity apos; " instead of an EvEntityRef.
This test program:
outputs:
Also, apos does not appear in XhtmlEntities.scala (may be unrelated).
Since these five entities are predefined, I would argue that the parser should auto-replace them with their equivalents so the user doesn't have to.
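To make the auto-replacement suggestion concrete, here is a minimal sketch in plain Scala, independent of scala-xml. The names `PredefinedEntities`, `predefined`, and `resolve` are hypothetical and are not part of the library's API; the sketch only illustrates mapping the five predefined entities to their characters while leaving every other reference (such as &copy;) untouched.

```scala
import scala.util.matching.Regex

// Hypothetical helper, not part of scala-xml: illustrates the proposal only.
object PredefinedEntities {
  // The five entities that XML 1.0 predefines.
  val predefined: Map[String, Char] = Map(
    "lt"   -> '<',
    "gt"   -> '>',
    "amp"  -> '&',
    "quot" -> '"',
    "apos" -> '\''
  )

  // Matches simple named references of the form &name;
  private val EntityRef: Regex = "&([A-Za-z][A-Za-z0-9]*);".r

  // Replaces predefined references in a single pass; any other
  // reference is copied through unchanged.
  def resolve(text: String): String =
    EntityRef.replaceAllIn(text, m =>
      Regex.quoteReplacement(
        predefined.get(m.group(1)).map(_.toString).getOrElse(m.matched)))
}
```

For example, `PredefinedEntities.resolve("Tom &amp; Jerry&apos;s")` yields `Tom & Jerry's`. The replacement is done in one pass so that an escaped ampersand followed by `lt;` is not expanded a second time.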