Issue 36232 - dbase-files with cp 437 can not be used
Summary: dbase-files with cp 437 can not be used
Status: CLOSED IRREPRODUCIBLE
Alias: None
Product: Base
Classification: Application
Component: code
Version: OOo 1.1.3
Hardware: All Windows NT
Importance: P3 Trivial
Target Milestone: ---
Assignee: marc.neumann
QA Contact: issues@dba
URL:
Keywords: needmoreinfo, oooqa
Depends on:
Blocks:
 
Reported: 2004-10-28 00:11 UTC by mhatheoo
Modified: 2007-05-06 10:09 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Description mhatheoo 2004-10-28 00:11:12 UTC
Data access has a glitch in how it treats the code page of a dBase file:

- DOS/OS-437: cannot be opened for write access

- ASCII/US: can be opened for writing, but OO.o translates the German
umlauts (character codes above decimal 127) incorrectly

No samples, just check it out.

These are not two problems but one: the issue is the character translation
from non-system code pages (such as DOS) to the system/OO.o code page, and
the ability to write back.

Martin
Comment 1 mhatheoo 2004-11-04 23:13:09 UTC
Just a little more on this issue:

OO.o is treating characters as ANSI code; it should not do so.

At present, it is impossible to establish a reliably working connection to a
dBase file. The OO.o built-in functions do not work.
The ODBC connection works only with the latest driver set from the
Jet/OLE-DB Service Pack 8 and MDAC-Typ version 2.8.



Someone else might look into this issue as well.

rgds
Martin
Comment 2 Frank Schönheit 2004-11-05 09:08:00 UTC
Martin, sorry, I cannot reproduce the first problem: I have write access to my
dBase files no matter which encoding I use. Can you please provide a
step-by-step description of what you're doing? I strongly assume there's a side
effect which is causing the missing write access, but since you don't describe
your workflow, I can only guess.

The fact that German umlauts are not properly treated with ASCII/US is no bug:
ASCII/US *does not contain* German umlauts, so encoding your (Unicode) input
using this character set will always fail. That's inherent in the encoding you
chose. For German umlauts, you should use something like IBM/DOS-850.

Regards
Frank
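
A minimal sketch of the encoding point made above. It uses Python's standard codecs as a stand-in for OOo's own character-set converters, which is an assumption: the issue does not show how OOo implements them, and the codec names `ascii` and `cp850` are Python's, not the labels in OOo's settings dialog. The helper `can_encode` is hypothetical.

```python
# Illustration only: Python codecs stand in for OOo's converters (assumption).
text = "äöüÄÖÜß"


def can_encode(s: str, codec: str) -> bool:
    """Return True if every character of s exists in the given codec."""
    try:
        s.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


ascii_ok = can_encode(text, "ascii")  # 7-bit ASCII has no umlauts
cp850_ok = can_encode(text, "cp850")  # IBM/DOS-850 does
print(ascii_ok, cp850_ok)
```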
Comment 3 mhatheoo 2004-11-05 12:44:20 UTC
Sorry Frank, it looks as if we may need some experienced assistance, right?
And please re-read the character list of CP 437; you may be surprised what it
contains. 850 is no option, nor is ANSI/CP 1200 etc., and the Windows
system can and will do much more than you expect.

rgds
Martin
Comment 4 Frank Schönheit 2004-11-16 08:02:12 UTC
Sorry for the delay, have been quite busy with other things.

I admit that my knowledge about character sets and encodings originates from
what others have told me, so please enlighten me :)
- Is "cp 437" the same as ASCII/US? I don't think so, but correct me if I'm
  wrong.
  If yes, please provide me with a pointer to the character table. If not, then
  I don't see your point, since all I said was:
  - I cannot reproduce your problem with DOS/OS-437, so please provide a
    step-by-step description of what you're doing.
  - German umlauts cannot be stored when using ASCII/US, since they cannot be
    encoded therein.
- Can you please elaborate why 850 is not an option?
- Somewhat unrelated, and as said my concrete knowledge about character sets is
  limited, but I seem to remember that "Windows-1252/WinLatin-1" should also be a
  suitable encoding if you're planning to use German texts.

I'd really like to nail down the problem here, but at the moment my impression
is we're talking at cross-purposes :(
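
The first question above can be checked mechanically. The sketch below assumes that Python's `cp437`, `ascii` and `cp1252` codecs model the character sets discussed here; it says nothing about how OOo labels or implements them.

```python
# Assumption: Python's codecs model CP 437, 7-bit ASCII and Windows-1252.
lower_half = bytes(range(128))

# CP 437 agrees with ASCII on the 7-bit range ...
same_lower_half = lower_half.decode("cp437") == lower_half.decode("ascii")

# ... but also defines bytes 128-255, which include the German umlauts.
umlaut_bytes = "äöüÄÖÜß".encode("cp437")

# Windows-1252 can represent the same text, using different byte values.
ansi_bytes = "äöüÄÖÜß".encode("cp1252")

print(same_lower_half, umlaut_bytes.hex(), ansi_bytes.hex())
```

Under these assumptions CP 437 is a superset of ASCII rather than the same thing, and the same German text maps to different bytes depending on which 8-bit code page is chosen.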
Comment 5 mhatheoo 2004-11-16 18:30:33 UTC
@ frank


one thing in advance:
Yes, CP 437 and ASCII/US are the same, in principle. 

a)
OO.o treats ASCII/US as the 7-bit portion of ASCII only.
This can be acceptable, for historical reasons.
However, if the data file holds data with codes above the first 128
characters, it substitutes them with Unicode characters.
That is wrong behavior.

b) 
The system functions of Windows treat and convert DOS files by default as
files with code page 437.
Using OO.o with Westeurope/DOS/CP-437 translates the data a second time, with
funny results:
DOS-Ü (DOS = 154, 0x9A) ==> ANSI-Ü (Windows = 220, 0xDC)
==> OO.o-▄ (block-graphic sign) (CP-437 = 220, 0xDC)
This filter works for "re-encoding" to a DOS look-alike for text imports, but
not for data imports.
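
The double translation described in b) can be reproduced as a sketch. Python's `cp437` and `cp1252` codecs stand in for the Windows and OO.o conversions here, which is an assumption; the actual conversion path inside OO.o is not shown in this issue.

```python
# Assumption: Python codecs stand in for the Windows and OO.o conversions.
dos_byte = bytes([154])  # Ü as stored in a CP 437 dBase file (0x9A)

# Step 1: the DOS byte is converted to ANSI (Windows-1252), where Ü is 220.
ansi_byte = dos_byte.decode("cp437").encode("cp1252")

# Step 2: the ANSI byte is decoded a second time as if it were still CP 437.
shown = ansi_byte.decode("cp437")

print(ansi_byte[0], repr(shown))
```

Byte 220 is a block-graphic character in CP 437, so a value that has already been converted to ANSI must not be run through the CP 437 decoder again.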


You may find some useful information on the code-page problem here:
http://www.ianywhere.com/developer/product_manuals/sqlanywhere/0901/en/html/dbdaen9/00000357.htm
or google "codepage ANSI 437 850" yourself.

To simplify this complex problem, you may want to deal with the CP 437 part
only, for the moment.

rgds
Martin 


Comment 6 mhatheoo 2005-11-03 19:18:02 UTC
This issue is still valid for OO2.0.

dBase files cannot be used on a German system, as the proprietary
code-page driver of OO2.0 treats DOS character codes above code no. 128
incorrectly.

Still no serious problem, as system-based character exchange via ODBC works
fine; however, it should not be like that.

Martin
Comment 7 Regina Henschel 2005-12-02 23:10:55 UTC
Have you used the character set "Westeuropa (DOS/OS2-437/US)"? You find it in
OOo2, in the database file window under Edit - Database - Properties.

I use OOo2.0.1rc2 with a dBase file generated from Calc and it works, but that
is not an original dBase file. So please provide a small dBase file that shows
the problem for you.
Comment 8 mhatheoo 2006-03-21 16:26:24 UTC
as of OO.o  2.0.2:

Well, this issue might be closed, and reopened immediately, maybe.

1. Looking at this issue from the German point of view, it works partly: the
German umlauts are opened correctly (äöüÄÖÜß).

2. But this is done by using code page 850 (DOS Latin-1) by default.
And that cannot be changed: files will be opened with CP 850 even if you set
CP 437 in the properties.
So, only a few characters vary (mostly graphical characters); all German
umlauts have the same 8-bit encoding in CP 437 and CP 850. No problem for me,
but it looks a little bit strange to make settings which are overruled by the
default settings.

3. For some reason I do not understand so far, my application will not read
fields that should contain numeric values (!); the field length seems to be
wrong.

4. Suppose you have a "non-standard character" in a Calc sheet and want to save
it as a DBF file: the data export terminates incomplete (try it with
ALT-0169 = ©). This might be one of the Unicode problems of OO.o; I guess the
translation back to 8-bit does not work properly.
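
Point 2 above can be checked with a short sketch. It assumes that Python's `cp437` and `cp850` codecs match the code pages OO.o uses; OO.o's own tables are not shown in this issue.

```python
# Assumption: Python's cp437/cp850 codecs match the code pages OO.o uses.
umlauts = "äöüÄÖÜß"

# The German umlauts have identical byte values in both code pages.
same_umlaut_bytes = umlauts.encode("cp437") == umlauts.encode("cp850")

# Byte values whose meaning differs between the two code pages.
differing = [
    b for b in range(256)
    if bytes([b]).decode("cp437") != bytes([b]).decode("cp850")
]

print(same_umlaut_bytes, len(differing))
for b in differing[:5]:
    # Show a few of the differences side by side.
    print(hex(b), bytes([b]).decode("cp437"), bytes([b]).decode("cp850"))
```

For purely German text the two settings are therefore indistinguishable; a test file would need one of the differing bytes to show which code page was really applied.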

Outcome for me: leave this issue open and have another look into it.

Martin 
Comment 9 mhatheoo 2006-03-22 17:05:53 UTC
Excuse me, I forgot:

for this issue

- please drop the "needmoreinfo" keyword
- set it at least to new
- set the related version to 2.0.2

Martin
Comment 10 mhatheoo 2006-06-29 17:38:34 UTC
Wow, great stuff:
since 680M173 the whole support for 8-bit ASCII has been dropped.
Even as an interim step, that should not be the solution for this issue,
I hope.

Martin H. 
Comment 11 Frank Schönheit 2006-06-30 08:08:39 UTC
> since 680M173 the whole support for 8-bit ASCII has been dropped.
Hmm?
Comment 12 andreschnabel 2007-05-06 09:59:01 UTC
Set to worksforme as I cannot reproduce the issue with OOo 2.2.

If the issue still exists, please feel free to reopen it, but attach an example
and give clear instructions on how to reproduce it. (The needmoreinfo keyword
is absolutely correct in this case: we cannot confirm the issue without an
example.)
Comment 13 andreschnabel 2007-05-06 10:09:46 UTC
additional info: as far as I can see, © is not part of cp437. So OOo behaves as
expected.
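
The observation about © can be verified as a sketch, again assuming that Python's codecs match the code pages in question. The helper `try_encode` is hypothetical and exists only for this illustration.

```python
# Assumption: Python's codecs match the code pages discussed in this issue.
def try_encode(ch: str, codec: str):
    """Return the encoded bytes, or None if the codec lacks the character."""
    try:
        return ch.encode(codec)
    except UnicodeEncodeError:
        return None


in_cp437 = try_encode("©", "cp437")    # not part of CP 437
in_cp850 = try_encode("©", "cp850")    # present in CP 850
in_cp1252 = try_encode("©", "cp1252")  # ALT-0169 in Windows-1252

print(in_cp437, in_cp850, in_cp1252)
```

An export to a CP 437 dBase file therefore has no byte to write for ©, whereas CP 850 and Windows-1252 both have one.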