70630 – wrong encoding in LaTeX export


Summary: wrong encoding in LaTeX export

Status:	CONFIRMED

Alias:	None

Product:	Writer
Classification:	Application
Component:	save-export
Version:	OOo 2.0.4
Hardware:	All All

Importance:	P3 Trivial with 1 vote
Target Milestone:	---
Assignee:	AOO issues mailing list
QA Contact:

URL:
Keywords:

Depends on:
Blocks:	71021

Reported:	2006-10-20 02:37 UTC by uwestoehr
Modified:	2013-07-30 02:39 UTC
CC List:	6 users

See Also:
Issue Type:	DEFECT
Latest Confirmation in:	---
Developer Difficulty:	---

Attachments
Writer file containing Japanese characters. (6.94 KB, application/vnd.sun.xml.writer) 2006-10-29 23:51 UTC, maho.nakata
wrong export of Japanese of former document (latex.odt) (1.25 KB, text/plain) 2006-10-29 23:52 UTC, maho.nakata
possible correct export in ISO-2022-JP (1.21 KB, text/plain) 2006-10-29 23:53 UTC, maho.nakata

Description uwestoehr 2006-10-20 02:37:27 UTC

When I export a German .odt document to LaTeX, the following LaTeX preamble
entries are created:

\usepackage[ascii]{inputenc}
\usepackage[T1]{fontenc}
\usepackage[ngerman]{babel}

The first line is not correct: the encoding "ascii" is chosen although babel
is correctly loaded with the option for German. The wrong encoding leads to
a LaTeX error for every umlaut character (ä, ö, ü) in the document. The correct
encoding for German documents is

\usepackage[latin1]{inputenc}
or
\usepackage[cp1252]{inputenc}

OOo should select the correct encoding for each document language, just as it
already selects the correct babel option.
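
A minimal sketch of the requested behaviour, written in Python for brevity. It is not Writer2LaTeX code: the table contents, the names inputenc_for_language and preamble, and the choice of languages are assumptions made for illustration. The idea is that the same language value that already drives the babel option would also select the inputenc option, with "ascii" kept as the fallback.

```python
# Hypothetical lookup from babel language option to inputenc option.
# Illustrative only; the entries are assumptions, not the exporter's real table.
INPUTENC_BY_BABEL_LANGUAGE = {
    "ngerman": "latin1",
    "german": "latin1",
    "french": "latin1",
    "danish": "latin1",
    "polish": "latin2",
    "czech": "latin2",
}

# The export currently writes "ascii", so keep it as the fallback.
DEFAULT_INPUTENC = "ascii"


def inputenc_for_language(babel_language):
    """Return the inputenc option for a babel language, or the default."""
    return INPUTENC_BY_BABEL_LANGUAGE.get(babel_language, DEFAULT_INPUTENC)


def preamble(babel_language):
    """Build the three preamble lines discussed in this report."""
    encoding = inputenc_for_language(babel_language)
    return "\n".join([
        "\\usepackage[%s]{inputenc}" % encoding,
        "\\usepackage[T1]{fontenc}",
        "\\usepackage[%s]{babel}" % babel_language,
    ])


if __name__ == "__main__":
    print(preamble("ngerman"))
```

The exported file would also have to be written in the chosen encoding, otherwise the preamble and the file contents disagree.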

Comment 1 eric.savary 2006-10-20 08:41:45 UTC

Reassigned to JW

Comment 2 maho.nakata 2006-10-29 23:51:06 UTC

Created an attachment (id=40150)
Writer file containing Japanese characters.

Comment 3 maho.nakata 2006-10-29 23:52:01 UTC

Created an attachment (id=40151)
wrong export of Japanese of former document (latex.odt)

Comment 4 maho.nakata 2006-10-29 23:53:31 UTC

Created an attachment (id=40152)
possible correct export in ISO-2022-JP

Comment 5 maho.nakata 2006-10-29 23:54:23 UTC

Hi,
this also happens for documents containing Japanese.
Example files are attached.

Comment 6 tora3 2006-11-05 00:08:35 UTC

IMHO, uwestoehr and maho address different aspects.

On uwestoehr's point: the author, Henrik Just, states on his web page
that Writer2LaTeX supports latin1.
http://www.hj-gym.dk/~hj/writer2latex/index3.html#features
The discussion of a solution would mainly be about how to tell the
Writer2LaTeX filter which language is in use. A relevant code fragment can be found in
xmerge/source/writer2latex/source/writer2latex/latex/style/I18n.java

For Japanese, the equivalent of uwestoehr's point might be (but I am not sure):
\usepackage[cp932]{inputenc} for Windows variants of Japanese
\usepackage[eucjp]{inputenc} for UNIX variants where $LANG=ja or ja_JP.eucJP or ja_JP.eucjp
\usepackage[sjis]{inputenc}  for UNIX variants where $LANG=ja_JP.PCK (PC Kanji code)
\usepackage[utf8]{inputenc}  for UNIX variants where $LANG=ja_JP.UTF-8 or ja_JP.utf8
\usepackage[iso-2022-jp]{inputenc} for some UNIX variants using JIS encoding (*1)
*1: JIS = Japanese Industrial Standard
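
The list above can be sketched as a small lookup, here in Python for brevity. The function name inputenc_for_japanese and its parameters are hypothetical. The option names are copied from the list as written; whether inputenc actually provides definitions for cp932, eucjp, sjis or iso-2022-jp is not verified here. iso-2022-jp is left out of the automatic choice because the list gives no $LANG value for it.

```python
def inputenc_for_japanese(is_windows, lang):
    """Pick an inputenc option for Japanese from the platform and $LANG.

    Hypothetical helper; option names are taken from the list above
    and are not checked against what inputenc really supports.
    Returns None when no rule in the list applies.
    """
    if is_windows:
        return "cp932"
    if lang in ("ja", "ja_JP.eucJP", "ja_JP.eucjp"):
        return "eucjp"
    if lang == "ja_JP.PCK":
        return "sjis"
    if lang in ("ja_JP.UTF-8", "ja_JP.utf8"):
        return "utf8"
    # iso-2022-jp has no $LANG rule in the list, so it is not chosen here.
    return None


if __name__ == "__main__":
    import os
    import sys

    option = inputenc_for_japanese(sys.platform.startswith("win"),
                                   os.environ.get("LANG", ""))
    print(option)
```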

On maho's point: according to the web page, the current Writer2LaTeX does
not support any Asian languages, so this would be a sort of enhancement.
For the symptom maho describes, where a Japanese character gets converted into
[xxxx?] notation, I have made a patch that hides the problem.
http://www.tora-japan.com/ooo/download/filters/Writer2LaTeX_by_Henrik_Just/OOo_2.0.4/

Comment 7 henrikjust 2006-11-06 13:11:52 UTC

Yes, these are actually two independent RFEs:

* Support for automatic choice of inputenc based on document language
(Currently ascii is chosen as default encoding to ensure that the user can 
read the LaTeX file with any text editor.)
* Support for CJK languages
(Currently only Latin, Cyrillic and Greek scripts are supported)

Comment 8 jack.warchold 2006-12-08 16:44:20 UTC

Henrik, I am reassigning this issue to you. Maho, please open a new issue for your request.

Comment 9 uwestoehr 2010-03-02 20:03:34 UTC

Any progress with this bug?

Comment 10 Rob Weir 2013-07-30 02:39:42 UTC

Reset assignee on issues not touched by assignee in more than 1000 days.