Bug 57878 - Using UTF-8 for all languages, and avoiding html-entities.
Summary: Using UTF-8 for all languages, and avoiding html-entities.
Status: NEW
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: Documentation
Version: 2.4-HEAD
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: HTTP Server Documentation List
URL:
Keywords:
Depends on: 57879
Blocks:
Reported: 2015-04-30 22:19 UTC by Tom Fredrik Blenning
Modified: 2018-08-06 13:51 UTC (History)

Attachments
Removes all HTML-entities and uses UTF-8 for all languages (88.64 KB, patch)
2015-04-30 22:26 UTC, Tom Fredrik Blenning
Script to fix this issue (879 bytes, application/x-sh)
2015-04-30 22:30 UTC, Tom Fredrik Blenning

Description Tom Fredrik Blenning 2015-04-30 22:19:04 UTC
The error documents use UTF-8 for every language except Spanish and Portuguese; those two should use UTF-8 as well. Pure HTML entities should also be avoided.
Comment 1 Tom Fredrik Blenning 2015-04-30 22:26:07 UTC
Created attachment 32707
Removes all HTML-entities and uses UTF-8 for all languages
Comment 2 Tom Fredrik Blenning 2015-04-30 22:30:12 UTC
Created attachment 32708
Script to fix this issue

This is the script used to generate the previous patch. Run it in docs/error.
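
The attached shell script is not reproduced in this thread. As a rough illustration of the kind of conversion described, here is a minimal Python sketch. It is not the attached script, and the names `replace_entities`, `convert`, and `KEEP` are hypothetical. It assumes the input file's legacy charset is known (for example ISO-8859-1) and that markup-significant entities such as `&lt;` and `&amp;` must stay escaped.

```python
import re
from html.entities import name2codepoint

# Entities that must stay escaped so the markup remains well-formed.
KEEP = {"amp", "lt", "gt", "quot", "apos"}

# Matches decimal (&#241;), hexadecimal (&#xF1;) and named (&ntilde;) entities.
ENTITY_RE = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def replace_entities(text: str) -> str:
    """Replace HTML entities with literal characters, except markup-significant ones."""

    def substitute(match: re.Match) -> str:
        body = match.group(1)
        try:
            if body.startswith(("#x", "#X")):
                char = chr(int(body[2:], 16))
            elif body.startswith("#"):
                char = chr(int(body[1:]))
            elif body in KEEP or body not in name2codepoint:
                return match.group(0)  # leave protected or unknown names untouched
            else:
                char = chr(name2codepoint[body])
        except (ValueError, OverflowError):
            return match.group(0)  # code point out of range: leave as written
        # A numeric reference to &, <, >, " or ' must also stay escaped.
        if char in "&<>\"'":
            return match.group(0)
        return char

    return ENTITY_RE.sub(substitute, text)


def convert(raw: bytes, source_charset: str) -> bytes:
    """Decode from the legacy charset, replace entities, and re-encode as UTF-8."""
    return replace_entities(raw.decode(source_charset)).encode("utf-8")
```

Because the replacement is mechanical, it cannot tell whether an entity was chosen deliberately, so the output still needs review by someone who reads the language.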
Comment 3 Tom Fredrik Blenning 2015-04-30 22:31:21 UTC
This is a rather invasive patch; I did the whole process with a script. I would advise that someone proficient in each of the languages review the changes.
Comment 4 Takashi Sato 2015-06-08 08:58:40 UTC
+1 for concept.
I don't like HTML-entities.
Comment 5 Sierk Bornemann 2018-02-26 11:55:23 UTC
Any progress on this issue?

On Tue Dec 5 11:21:21 2017 UTC (Revision 1817175), the error docs were touched for the first time in years, for the Russian translations; see:
https://svn.apache.org/viewvc?view=revision&revision=1817175
https://github.com/apache/httpd/commit/7d640c80928447b2b8c3b4cea898abf421d461b5

Why not fix the above issue (Bug 57878), which has been open and untouched for several years, at the same time or shortly after?
Please fix.
Comment 6 William A. Rowe Jr. 2018-08-06 13:51:24 UTC
Entirely agree on HTML entities for native alphabetic characters; they can be represented in any applicable ISO character set.

Entirely agree on UTF-8 for editing.

Two different charsets can coexist for one language. If we feel any desire to retain 8-bit charsets at the user agent's priority/preference, then the 8-bit variants should be generated and appended based on some map list of languages.
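
One way to picture that idea is the minimal Python sketch below. It is not an existing httpd build step. The `LEGACY_CHARSETS` map and the `legacy_variants` function are hypothetical, and the charset assignments are illustrative assumptions rather than what the error documents actually ship. Characters a legacy charset cannot represent fall back to numeric character references.

```python
# Hypothetical map of language codes to the legacy charsets to generate.
# These assignments are assumptions for illustration only.
LEGACY_CHARSETS = {
    "es": ["iso-8859-1"],
    "pt-br": ["iso-8859-1"],
    "ru": ["koi8-r", "iso-8859-5"],
    "ja": ["iso-2022-jp"],
}


def legacy_variants(utf8_source: bytes, language: str) -> dict:
    """Return the UTF-8 original plus one re-encoded copy per mapped legacy charset."""
    text = utf8_source.decode("utf-8")
    variants = {"utf-8": utf8_source}
    for charset in LEGACY_CHARSETS.get(language, []):
        # xmlcharrefreplace turns unrepresentable characters into &#NNNN; references.
        variants[charset] = text.encode(charset, errors="xmlcharrefreplace")
    return variants
```

With this layout the UTF-8 text remains the only copy that is edited by hand, and a language with no entry in the map simply gets no legacy variant.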

Question: does anyone foresee user agents in active use that have a reason to still prefer an ISO 8859 or ISO 2022 mapping in this day and age?