Issue 125257

Summary: Supplementary Multilingual Plane selection in menu Insert, Special Characters
Product: Internationalization Reporter: efa <efa>
Component: ui Assignee: AOO issues mailing list <issues>
Status: UNCONFIRMED --- QA Contact:
Severity: Normal    
Priority: P3 CC: pescetti, rb.henschel
Version: 3.3.0 or older (OOo)   
Target Milestone: ---   
Hardware: All   
OS: All   
Issue Type: DEFECT Latest Confirmation in: 4.2.0-dev
Developer Difficulty: ---
Issue Depends on:    
Issue Blocks: 102943    
Attachments:
- GNOME charmap (flags: none)
- Screenshot of Insert Character dialog with plane1 (flags: none)

Description efa 2014-07-15 08:28:02 UTC
The menu Insert, Special Characters ...
does not allow inserting characters from Plane 1, the Supplementary Multilingual Plane (SMP).

On Windows, the built-in Character Map apparently cannot show these characters, and Notepad does not support inserting SMP characters even with the non-standard ALT+HEXCODE sequence (after a registry hack), so there is probably no way to do it on Windows.

But on Linux, the GNOME Character Map correctly shows all SMP characters, and the standard ISO key sequence CTRL+SHIFT+U HEXCODE correctly inserts code points > U+FFFF. Even there, OpenOffice cannot show or select those characters in the menu Insert, Special Characters; it has the same limitation as the Windows Character Map. This is not acceptable.

Expected behavior:
a new drop-down selection with:
- Plane 0: Basic Multilingual Plane for code points 0000–FFFF
- Plane 1: Supplementary Multilingual Plane for code points 10000–1FFFF
- Plane 2: Supplementary Ideographic Plane for code points 20000–2FFFF
- Plane 14: Supplementary Special-purpose Plane for code points E0000–EFFFF
- Planes 15/16: Supplementary Private Use Area for code points F0000–10FFFF
For reference, see:
http://en.wikipedia.org/wiki/Plane_(Unicode)
so that the user can show and select characters from planes other than the BMP, at least on Linux.
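As a rough illustration of the plane list above, the plane of a code point is simply its value divided by 0x10000. A minimal Python sketch follows; the names `plane_of`, `plane_label` and `PLANE_NAMES` are illustrative only and are not taken from the OpenOffice source:

```python
# Plane names as listed in the request above; planes that the request
# does not mention are deliberately left out.
PLANE_NAMES = {
    0: "Basic Multilingual Plane",
    1: "Supplementary Multilingual Plane",
    2: "Supplementary Ideographic Plane",
    14: "Supplementary Special-purpose Plane",
    15: "Supplementary Private Use Area",
    16: "Supplementary Private Use Area",
}

MAX_CODE_POINT = 0x10FFFF


def plane_of(code_point: int) -> int:
    """Return the Unicode plane number (0..16) of a code point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    return code_point >> 16  # each plane spans 0x10000 code points


def plane_label(code_point: int) -> str:
    """Return a label such as 'Plane 1: Supplementary Multilingual Plane'."""
    plane = plane_of(code_point)
    name = PLANE_NAMES.get(plane, "(plane not listed above)")
    return f"Plane {plane}: {name}"
```

A plane drop-down along these lines would only need to filter the character table by `plane_of(code_point)`.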
Comment 1 Andrea Pescetti 2014-07-15 16:45:39 UTC
Can you provide an easy way to reproduce it? In GNOME 3 I can open the character map, which shows the Cantarell font by default. On the left, some subsets (like Cuneiform, U+12000 to U+12473) seem to have code points > FFFF, but the font does not support them (I get the classic square with numbers within it).

To confirm this, we need to have a font, preferably one that comes by default with GNOME, and a specific example. I mean, I can see the problem, but we also need everything to reproduce it easily in a "real life" scenario.
Comment 2 efa 2014-07-15 17:12:34 UTC
- OpenOffice 4.1.0 on GNOME on Debian
- open the Insert menu
- select Special Characters
- select the Liberation Serif font (Liberation package from Red Hat)
- scroll down to the last symbol
- it is U+FB02

From the GNOME Character Map you can see that the Liberation font surely includes code points > U+FFFF (for example the block starting at U+1D000), but OpenOffice apparently cannot insert them via its built-in Special Characters GUI function.
Comment 3 Ariel Constenla-Haile 2014-07-15 17:34:58 UTC
Also it's evident if you compare the AOO dialog with the system's default:

Gnome:
/usr/bin/charmap
/usr/bin/gucharmap
/usr/bin/gnome-character-map
(package gucharmap on Fedora)

KDE:
/usr/bin/kcharselect

Although the way characters are grouped differs, the set of available characters is bigger than in AOO: for example, these tools include the "Private Use Area" block.
Comment 4 Andrea Pescetti 2014-07-16 21:55:36 UTC
I'm still stuck on the same problem. I mean, I do understand the issue: the OpenOffice character map does not show the full font, but only entries < FFFF in Unicode.

Still, if I run /usr/bin/gnome-character-map on my system and I open Liberation Serif and I select the Miao group (U+16F00 and above) I only see the "squares with numbers within" (see attached image), and this happens for all groups above FFFF.

So I see and I understand the limitation in OpenOffice described in this issue, but on the other hand I don't see any real glyphs above FFFF using the GNOME character map. Maybe I am missing something on my system? It would be great to reproduce this by seeing some specific real glyphs that are unavailable in OpenOffice.
Comment 5 Andrea Pescetti 2014-07-16 21:56:11 UTC
Created attachment 83699
GNOME charmap
Comment 6 efa 2014-07-16 23:32:50 UTC
Select the group All and go to code point U+1D000 BYZANTINE MUSICAL SYMBOL PSILI; Liberation has those glyphs. My gucharmap is GNOME 3.4.1.1.
Comment 7 efa 2014-07-16 23:35:03 UTC
Liberation package version 1.07.0-2
Comment 8 Regina Henschel 2014-09-04 18:13:25 UTC
With OpenOffice 4.1.1 on Windows 7 I can scroll through the character table; the only problem is that the subset drop-down list does not include the Supplementary Multilingual Plane. If you choose such a character, the "subset" field is empty.

The same happens in the symbol definition dialog of Math. The characters are available in the table, but they have no subset item.

I have tested with font Aegean from http://users.teilar.gr/~g1951d/
Comment 9 efa 2014-09-05 20:07:39 UTC
Regina, are you saying that in AOO 4.1.1, under Insert, Special Characters, you can show and select a code point like U+1D000 BYZANTINE MUSICAL SYMBOL PSILI?

Note: Unicode is divided into planes. Plane 0, named "Basic Multilingual Plane", is divided into blocks, which correspond roughly to the Subset entries you see in AOO Special Characters. What is missing is a way to select the plane number.
Comment 10 Regina Henschel 2014-09-05 22:26:23 UTC
Created attachment 83938
Screenshot of Insert Character dialog with plane1

That special character is not included in the Aegean font. But I can insert a lot of other plane 1 characters; see the screenshot.
Comment 11 efa 2014-09-05 23:21:22 UTC
I checked all my fonts (about 260); with a few of them I can show and select code points > U+FFFF. They are:

AR PL UMing *
Aegean
DejaVu Sans
FreeSans
FreeSerif
TakaoPGothic
VL Gothic
WenQuanYi Zen Hei
plus some Chinese-only fonts.

With all of them the Subset field becomes blank when a code point > U+FFFF is shown.


Strangely, compared with what the GNOME Character Map shows, the glyphs in AOO are correct, but AOO Special Characters misses a lot of the code points that the GNOME Character Map shows.
I cannot work out the rule behind this.
For example, in FreeSerif (package ttf-freefont 20100919-1), the first code point > U+FFFF shown by AOO is:
U+10143 GREEK ACROPHONIC ATTIC FIVE
while almost all of U+10000 to U+10142 is filled in the GNOME Character Map.
Comment 12 Ariel Constenla-Haile 2014-09-06 09:41:47 UTC
(In reply to efa from comment #11)
> Strangely, compared with what the GNOME Character Map shows, the glyphs in
> AOO are correct, but AOO Special Characters misses a lot of the code points
> that the GNOME Character Map shows.
> I cannot work out the rule behind this.
> For example, in FreeSerif (package ttf-freefont 20100919-1), the first
> code point > U+FFFF shown by AOO is:
> U+10143 GREEK ACROPHONIC ATTIC FIVE
> while almost all of U+10000 to U+10142 is filled in the GNOME Character Map.

In the Gnome Character Map, please check "View" - "Show only glyphs from this font".
If I do so, it shows that FreeSerif, right after U+FFFF, has U+10143 - U+10147
Can you find a glyph that is present in the font but not included in the "Special Characters" dialog?

It looks like the issue can be narrowed down to the dialog listing only Basic Multilingual Plane (BMP) blocks in the "Subset" list; in fact, if you look at the source, aAllSubsets only goes up to Subset( 0xFFF0, 0xFFFF, RID_SUBSETSTR_SPECIALS ):

void SubsetMap::InitList()
svx/source/dialog/charmap.cxx
https://svn.apache.org/viewvc/openoffice/trunk/main/svx/source/dialog/charmap.cxx?revision=1541847&view=markup#l748
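To illustrate why the Subset field goes blank, here is a small Python sketch (not the actual C++ code in charmap.cxx): a lookup over a range table that ends at 0xFFFF finds no entry for a plane 1 code point. Only the 0xFFF0..0xFFFF entry mirrors the Subset(...) call quoted above; the other entry, the label strings, and the name `find_subset` are assumptions made for the example.

```python
from typing import Optional

# (first, last, label) triples. Only the 0xFFF0..0xFFFF range mirrors the
# Subset(...) call quoted above; the other entry and all label strings
# are made up for this example.
SUBSETS = [
    (0x0020, 0x007F, "Basic Latin"),  # illustrative entry
    (0xFFF0, 0xFFFF, "Specials"),     # last range, as in the quoted source
]


def find_subset(code_point: int, subsets=SUBSETS) -> Optional[str]:
    """Return the label of the range containing code_point, or None."""
    for first, last, label in subsets:
        if first <= code_point <= last:
            return label
    return None  # no range covers it, so there is no subset name to display


# Hypothetical fix: appending a range above 0xFFFF makes the lookup succeed.
EXTENDED = SUBSETS + [(0x1D000, 0x1D0FF, "Byzantine Musical Symbols")]
```

With the original table, `find_subset(0x1D000)` returns `None`; with `EXTENDED` it returns the added label. Whether the real fix is as simple as appending ranges to aAllSubsets is not confirmed here.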
Comment 13 efa 2014-09-07 13:31:38 UTC
You are right: with this GNOME Character Map option set, only the characters really available in a font are shown. For example, Liberation Serif now has no filled code points > U+FFFF, while FreeSerif has U+10143 to U+10147, U+10330 to U+1034A, and so on.
So the remaining glyphs shown are the same as in AOO.

The only thing to add to AOO is the possibility to directly select the Supplementary Multilingual Plane (and the other planes) in the Subset field or in a dedicated drop-down combo box.

Note: What is not clear is which font the other glyphs come from when the GNOME Character Map shows them with the "Show only glyphs from this font" option not set.
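The comparison in this comment boils down to listing which ranges above U+FFFF a font really covers. A hedged Python sketch: given a set of code points that a font supports (however that set is obtained), it collapses the supplementary-plane ones into contiguous ranges. The function name is hypothetical, and the sample set only reuses the two FreeSerif ranges named in this comment; it is not a full dump of the font.

```python
def supplementary_ranges(code_points):
    """Collapse code points above U+FFFF into sorted (first, last) ranges."""
    ranges = []
    for cp in sorted(cp for cp in set(code_points) if cp > 0xFFFF):
        if ranges and cp == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], cp)  # extend the current run
        else:
            ranges.append((cp, cp))           # start a new run
    return ranges


# Sample input: U+FB02 plus the two FreeSerif ranges mentioned above.
sample = {0xFB02, *range(0x10143, 0x10148), *range(0x10330, 0x1034B)}

for first, last in supplementary_ranges(sample):
    print(f"U+{first:04X} to U+{last:04X}")
```

Running the same function on the code points listed by AOO and by the GNOME Character Map would make any difference between the two easy to spot.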