Apache OpenOffice (AOO) Bugzilla – Issue 43693
Drop caps should work on cluster boundaries
Last modified: 2013-08-07 14:38:26 UTC
Some languages (such as Thai) have the concept of a grapheme cluster: a group of characters (a combining character sequence) that should be treated as a unit in most operations, e.g. caret movement and mouse selection. See "Characters and Grapheme Clusters" in Section 2.10 of Unicode 4.0. Drop caps should work on cluster boundaries too.

With the drop-caps feature, users can specify the number of characters to drop (Format->Paragraph->Drop Caps->Number of characters), which defaults to 1. However, the first cluster of a paragraph may contain more than one character. The feature should handle this case by respecting the cluster boundary. OOo currently does not.

Test case:
- Load the attached Writer document. It has 5 paragraphs of roughly the same text. The first cluster of each paragraph is 'มุ่', which contains 3 characters.
1) The 1st paragraph has no drop-caps formatting.
2) The 2nd paragraph has drop caps of 3 characters, which is the whole first cluster. Drop caps work well.
3) The 3rd and 4th paragraphs have drop caps of 1 and 2 characters, which cover only part of the first cluster. OOo displays the first cluster as drop caps and also displays it again in the paragraph.
4) The 5th paragraph shows that whole-word drop caps (Format->Paragraph->Drop Caps->Whole word) work. OOo maintains the word boundary, which suggests that drop caps by character count should do the same, i.e. maintain the cluster boundary.

Expected result, either:
A) Case 3 should work too. Users may specify a count that covers only part of the first cluster(s) (the number of characters may be specified as part of a Style), but the feature should respect cluster boundaries and round the count up to the next cluster boundary.
B) Change the semantics of the number. The number specified should be the number of clusters (not characters) to drop.
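To make options A and B concrete, here is a minimal Python sketch. It is not OOo code. It assumes a simplified cluster rule (a base character followed by any characters whose Unicode general category is a Mark), which covers the 'มุ่' cluster from the test case but is not the full Unicode grapheme cluster algorithm. The function names and the sample string are hypothetical.

```python
import unicodedata


def cluster_boundaries(text):
    """Return the offsets where a cluster starts, plus len(text).

    Simplified assumption: a character whose general category starts
    with "M" (Mn, Mc, Me) attaches to the preceding character.
    """
    bounds = [i for i, ch in enumerate(text)
              if i == 0 or not unicodedata.category(ch).startswith("M")]
    bounds.append(len(text))
    return bounds


def round_to_cluster(text, n_chars):
    """Option A: round a character count up to the next cluster boundary."""
    for b in cluster_boundaries(text):
        if b >= n_chars:
            return b
    return len(text)


def chars_for_clusters(text, n_clusters):
    """Option B: interpret the number as a count of clusters."""
    bounds = cluster_boundaries(text)
    return bounds[min(n_clusters, len(bounds) - 1)]


# Hypothetical sample: the 3-character cluster from the test case
# (U+0E21 U+0E38 U+0E48) followed by more Thai text.
sample = "\u0e21\u0e38\u0e48\u0e07\u0e21\u0e31\u0e48\u0e19"

for n in (1, 2, 3):
    # Option A: 1, 2 and 3 characters all end at the first cluster boundary.
    print(n, "->", round_to_cluster(sample, n))

# Option B: one cluster covers 3 characters of the sample.
print(chars_for_clusters(sample, 1))
```

Under option A, existing documents and styles keep their stored character counts and only the rendering changes. Under option B, the meaning of the stored number changes.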
Created attachment 23124: Paragraphs with drop caps
The above was tested on Windows XP. On Linux, in case 3, dotted circles are displayed and mess up the following text.
MRU->SBA: please have a look.
SBA: There are two easy and intuitive workarounds: (1) set "Whole word"; (2) increase the "Number of characters".
-> Changed priority to P4, target "OOo Later".
SBA->FME: I don't know whether this requires underlying features (from i18n, break iterator, VCL, ...). If so, please get FT in.
My 2 cents: for Thai (and maybe also Khmer, Arabic, ...?), "counting letters" does not seem to make sense for drop caps. I guess further areas are affected where "cluster calculation" is desired (or can cause side effects).
As far as I know, Pladao Office 2.0 and 3.1 have code for this feature (cluster-based drop caps).
This works in Word 2003.
I think the desired result is option B, not option A. This becomes a real problem when you want to use a style: a user would want to be able to use the same style to drop the first cluster of a paragraph regardless of the number of characters in the cluster. The workarounds don't allow a user to do this, nor would option A.