Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing
|Summary:||TITLE in HTML pages and charset?|
|Component:||code||Assignee:||AOO issues mailing list <issues>|
|Status:||CONFIRMED ---||QA Contact:|
|Priority:||P3||CC:||issues, t8m, xslf|
|Issue Type:||ENHANCEMENT||Latest Confirmation in:||---|
Description pavel 2002-12-31 18:00:58 UTC
Hi, the attached HTML page contains a TITLE with characters in the ISO8859-2 charset. The Content-type (META tag) contains the proper definition of the charset. When this page is imported into OOo 643C, the title is displayed using an ISO8859-1 font. When I move the TITLE tag down (below the META tag), everything is OK and the title uses the ISO8859-2 font.
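To illustrate the symptom described above, here is a minimal Python sketch of what happens when ISO8859-2 bytes are interpreted as ISO8859-1. The title string is a hypothetical stand-in, since the content of the attached page is not reproduced in this report, and the sketch only models the decoding step, not how OOo actually imports HTML.

```python
# Hypothetical Czech title; the real attachment's text is not shown in this issue.
title = "Příliš žluťoučký kůň"

# Bytes as they would be stored in a file saved as ISO8859-2.
raw = title.encode("iso-8859-2")

# Reading the TITLE before the META charset is known: bytes treated as ISO8859-1.
wrong = raw.decode("iso-8859-1")

# Reading the TITLE after the META charset has been seen.
right = raw.decode("iso-8859-2")

print("decoded as ISO8859-1:", wrong)
print("decoded as ISO8859-2:", right)
```

Both decodes succeed because every byte is valid in both charsets, which is why the wrong title shows up silently instead of raising an error.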
Comment 1 pavel 2002-12-31 18:01:25 UTC
Created attachment 4174 [details] Sample HTML page with TITLE before META
Comment 2 h.ilter 2003-01-15 13:51:36 UTC
Reassigned to ES
Comment 3 eric.savary 2003-01-15 15:22:00 UTC
ES->pjanik: please zip your html file before attaching it. IssueZilla destroys html attachments. Thanx.
Comment 5 pavel 2003-01-15 20:02:10 UTC
Here it is - just try to compare this page with the same page with only meta above title... Watch for the title of the frame in the window manager.
Comment 6 pavel 2003-02-02 18:30:53 UTC
You can now continue.
Comment 7 eric.savary 2003-03-12 15:40:21 UTC
from DEV-> It is valid to apply a character encoding only from the tag where it is specified onward. Otherwise, the source code would have to be read twice.
Comment 8 eric.savary 2003-03-12 15:40:43 UTC
Comment 9 pavel 2003-05-11 11:17:48 UTC
The HTML specification does not say so. Where did you get the information that the chosen encoding should be used only from the line where it is specified onward, and not before?
Comment 10 pavel 2003-05-11 11:19:44 UTC
Reopening to solve this issue properly.
Comment 11 dan 2003-05-12 20:48:15 UTC
From my point of view, META tags should be sent by the server in the HTTP 1.1 header, so the user expects that all information from the META tags is known to the browser before it processes the page at all. In the special case where the .html file is not interpreted by an HTTP server but by some other application, that application should process it in the same way as an HTTP server.
Comment 12 t8m 2003-05-13 09:20:57 UTC
I agree with Dan and Pavel. However, this is a can of worms. For example, Mozilla has (had?) various problems handling encodings in meta tags if they aren't sent by the HTTP server. The easiest solution would probably be to look for the META tag only in a short initial part of the file, for example the first 4K, and cache this part for later rereading.
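The prescan idea suggested above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the function names `sniff_charset` and `decode_html`, the regular expression, and the ISO8859-1 fallback are all hypothetical and are not taken from the OOo source. The regular expression is a simplification, not a full HTML parser.

```python
import re

PRESCAN_BYTES = 4096  # the "4K" figure from the comment above

# Simplified pattern for charset=... inside a META tag (assumption, not a real parser).
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)', re.IGNORECASE
)

def sniff_charset(data: bytes, default: str = "iso-8859-1") -> str:
    """Look for a META charset in the first PRESCAN_BYTES bytes only."""
    match = META_CHARSET.search(data[:PRESCAN_BYTES])
    if match is None:
        return default
    return match.group(1).decode("ascii")

def decode_html(data: bytes, default: str = "iso-8859-1") -> str:
    """Decode the whole document, including any TITLE that precedes the META tag."""
    encoding = sniff_charset(data, default)
    try:
        return data.decode(encoding)
    except LookupError:
        # Unknown charset name: fall back to the default.
        return data.decode(default)

# Hypothetical page with TITLE before META, as in the reported case.
page = (
    "<html><head><title>Příliš žluťoučký kůň</title>"
    '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-2">'
    "</head><body></body></html>"
).encode("iso-8859-2")

print(sniff_charset(page))
print(decode_html(page))
```

Because the data is already in memory here, the "second read" is just a second pass over a buffer. A streaming importer would need to keep the first 4K cached, as the comment proposes.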
Comment 13 eric.savary 2003-05-20 01:02:00 UTC
OOo is *not* a browser. It processes files from the top to the bottom. If you don't set the relevant information at the bottom, it won't be parsed.
Comment 14 eric.savary 2003-05-20 01:02:21 UTC
Comment 15 pavel 2003-05-20 06:07:28 UTC
Yes, OOo processes HTML files and as such should follow the HTML specification. It does not. Please do not close this without real reason to do so. If you do not want to implement "double scan", just say so and set target to OOo later.
Comment 16 pavel 2003-05-20 06:14:09 UTC
See the results of the validator: http://validator.w3.org/check?uri=http%3A%2F%2Fwww.janik.cz%2Ftmp%2Fq.html I am not saying that OOo is a browser, but if it tries to read files in some format, it should follow the format specification.
Comment 17 eric.savary 2003-06-12 17:06:19 UTC
I don't close issues without good reasons... Now if you want this as an enhancement (for I can't decide myself what we want or not), no problem. Reassigned to BH
Comment 18 pavel 2005-10-29 11:22:21 UTC
Comment 19 pavel 2007-08-25 14:54:17 UTC
Still happens in SRC680_m226. See attached screenshot.