Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing

| Summary: | Importing apostrophes from HTML fails | | |
|---|---|---|---|
| Product: | Writer | Reporter: | ronnystandtke <ronny.standtke> |
| Component: | open-import | Assignee: | AOO issues mailing list <issues> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | Minor | | |
| Priority: | P3 | CC: | damjan, elish, issues, mseidel |
| Version: | OOo 2.2.1 | | |
| Target Milestone: | 4.1.14 | | |
| Hardware: | All | | |
| OS: | All | | |
| Issue Type: | ENHANCEMENT | Latest Confirmation in: | 4.2.0-dev |
| Developer Difficulty: | Simple | | |
| Attachments: | | | |

Description
ronnystandtke
2007-08-13 18:29:51 UTC
Created attachment 47515
HTML file with apostrophes

Reassigned to ES.

"&apos" may be supported by many browsers, but it is not a valid HTML entity. See: http://www.w3.org/TR/html4/sgml/entities.html
Please use a plain text ' or ' instead.

*** This issue has been marked as a duplicate of 9457 ***

closed

The 'apos' is a standard html/xhtml entity, see http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent

XHTML yes, HTML (4.0) no. Requalifying as enhancement.

Reassigned

Now using v3.2, unfortunately, I still have to manually sed many html files before opening them with OOo because of this issue...

As given in description
AOO410m15(Build:9761) - Rev. 1583666
2014-04-01 13:50 - Linux x86_64 Debian

Still an issue in the latest Git. We just need 2 lines of code added to fix this bug. The patch below allows reading "&apos;" from HTML, but doesn't write it to HTML.

But should we unconditionally support the "&apos;" entity, or only when the HTML version is >= 5? If web browsers were supporting it in 2007, while the first draft of HTML 5 was in 2008 and HTML 5 only became a stable recommendation in October 2014, then we may have to support it in any HTML version for better compatibility.

diff --git a/main/svtools/inc/svtools/htmlkywd.hxx b/main/svtools/inc/svtools/htmlkywd.hxx
index ff11057f1a..5ec2e37c79 100644
--- a/main/svtools/inc/svtools/htmlkywd.hxx
+++ b/main/svtools/inc/svtools/htmlkywd.hxx
@@ -182,6 +182,7 @@
 #define OOO_STRING_SVTOOLS_HTML_C_lt "lt"
 #define OOO_STRING_SVTOOLS_HTML_C_gt "gt"
 #define OOO_STRING_SVTOOLS_HTML_C_amp "amp"
+#define OOO_STRING_SVTOOLS_HTML_C_apos "apos"
 #define OOO_STRING_SVTOOLS_HTML_C_quot "quot"
 #define OOO_STRING_SVTOOLS_HTML_C_Aacute "Aacute"
 #define OOO_STRING_SVTOOLS_HTML_C_Agrave "Agrave"
diff --git a/main/svtools/source/svhtml/htmlkywd.cxx b/main/svtools/source/svhtml/htmlkywd.cxx
index 24b3160009..7554343ec6 100644
--- a/main/svtools/source/svhtml/htmlkywd.cxx
+++ b/main/svtools/source/svhtml/htmlkywd.cxx
@@ -278,6 +278,7 @@ static HTML_CharEntry __FAR_DATA aHTMLCharNameTab[] = {
     {{OOO_STRING_SVTOOLS_HTML_C_lt}, 60},
     {{OOO_STRING_SVTOOLS_HTML_C_gt}, 62},
     {{OOO_STRING_SVTOOLS_HTML_C_amp}, 38},
+    {{OOO_STRING_SVTOOLS_HTML_C_apos}, 39},
     {{OOO_STRING_SVTOOLS_HTML_C_quot}, 34},
     {{OOO_STRING_SVTOOLS_HTML_C_Agrave}, 192},

Firefox imports "&apos;" as "'" even when the HTML file has version 2.0 set:
<!DOCTYPE html PUBLIC "-//IETF//DTD HTML 2.0//EN">
so I am going to commit this.

Fixed by commit 3304210c5c53f441cdb2c462fbbf6d8351380b01. Resolving FIXED. Thank you for your bug report and sample file!

Cherry-picked for AOO42X:
https://github.com/apache/openoffice/commit/48fd16e62d68bc37e55a2fea75b52824561dbb0d

Cherry-picked for AOO41X:
https://github.com/apache/openoffice/commit/3e4cee8e792e2b0ceced47872f3819d43f0c36ff
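
For readers who have not looked at the entity table the patch touches, here is a minimal standalone sketch of the idea: a table of entity names and code points, consulted regardless of the declared HTML version. It is not the AOO parser. `CharEntry`, `kCharNameTab`, `lookupEntity`, `decodeEntities` and `selfTest` are hypothetical names invented for this illustration, the table is only a small subset, and the linear scan is an assumption made for brevity rather than a statement about how the real code searches.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the HTML_CharEntry rows in htmlkywd.cxx:
// an entity name paired with the code point it decodes to.
struct CharEntry {
    const char* name;
    unsigned int codePoint;
};

// Illustrative subset of the table, including an "apos" row like the
// one the patch adds. Values match those shown in the diff.
static const CharEntry kCharNameTab[] = {
    {"lt", 60},
    {"gt", 62},
    {"amp", 38},
    {"apos", 39},
    {"quot", 34},
};

// Return the code point for a named entity, or 0 if the name is unknown.
// Linear scan for brevity (an assumption of this sketch).
unsigned int lookupEntity(const std::string& name) {
    for (const CharEntry& e : kCharNameTab) {
        if (name == e.name) return e.codePoint;
    }
    return 0;
}

// Replace every known "&name;" in the input and leave anything else as-is.
// The lookup is unconditional: no check of the document's HTML version.
// Only code points below 128 are handled, which covers this subset.
std::string decodeEntities(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] == '&') {
            std::size_t semi = in.find(';', i + 1);
            if (semi != std::string::npos) {
                unsigned int cp = lookupEntity(in.substr(i + 1, semi - i - 1));
                if (cp != 0 && cp < 128) {
                    out += static_cast<char>(cp);
                    i = semi + 1;
                    continue;
                }
            }
        }
        out += in[i];
        ++i;
    }
    return out;
}

// Small sanity check of the behaviour described above.
void selfTest() {
    assert(lookupEntity("apos") == 39);
    assert(decodeEntities("it&apos;s") == "it's");
    assert(decodeEntities("&unknown;") == "&unknown;");
}
```

With the `apos` row removed from the table, `decodeEntities("it&apos;s")` would return its input unchanged, which mirrors the behaviour reported in this issue.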