Issue 66344

Summary: Title Case and More Advanced Casing
Product: Writer Reporter: discoleo <discoleo>
Component: code Assignee: AOO issues mailing list <issues>
Status: CONFIRMED --- QA Contact:
Severity: Trivial    
Priority: P3 CC: ace_dent, ajay610, frank.loehmann, issues, stp
Version: OOo 2.0.2   
Target Milestone: ---   
Hardware: All   
OS: All   
Issue Type: ENHANCEMENT Latest Confirmation in: ---
Developer Difficulty: ---
Attachments:
Description Flags
Sample c++ proof of concept program none
An alpha version of the Custom Casing Specification none

Description discoleo 2006-06-12 12:11:27 UTC
There were a number of requests for title case:
http://www.openoffice.org/issues/show_bug.cgi?id=5502
http://www.openoffice.org/issues/show_bug.cgi?id=4834
and so on,

HOWEVER, they miss the point (and were closed nevertheless). I will propose a
more comprehensive approach to this issue, allowing a very flexible casing
(because of this, I started a new request).

I will briefly describe this algorithm/feature:
In a text we may have:
 - characters starting a sentence (_Sentence)
 - characters starting a word, but not a sentence (_Word_Begin)
 - and chars inside a word (_Word_Inside)

We may have the following casing options:
 - Case_NOP (NO Operation): leave case as is
 - Case_Upper: make uppercase
 - Case_Lower: make lowercase
 - Case_Custom: some custom casing function (like reverse, weird,...)

The casing applied by OO.o would be:
 - _Sentence = Case_Upper
 - _Word_Begin = Case_Upper
 - _Word_Inside = Case_NOP

But I would prefer, e.g.:
 - _Sentence = Case_Upper
 - _Word_Begin = Case_Upper
 - _Word_Inside = Case_Lower,

while others may want:
 - _Sentence = Case_Upper
 - _Word_Begin = Case_NOP
 - _Word_Inside = Case_Lower.
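
The three character classes and the casing options above can be sketched roughly as follows. This is only a minimal illustration, not the attached program and not OOo code: the names `Case`, `CasingRules`, `apply_case` and `apply_casing` are hypothetical, Case_Custom is left out, and it assumes ASCII text where '.', '!' and '?' end a sentence and any non-alphanumeric character ends a word.

```cpp
#include <cctype>
#include <string>

// Hypothetical names mirroring the labels in the description.
enum class Case { NOP, Upper, Lower };

struct CasingRules {
    Case sentence;     // _Sentence: first character of a sentence
    Case word_begin;   // _Word_Begin: first character of any other word
    Case word_inside;  // _Word_Inside: all remaining characters of a word
};

// Applies one casing option to one character.
static char apply_case(Case c, char ch) {
    unsigned char u = static_cast<unsigned char>(ch);
    switch (c) {
        case Case::Upper: return static_cast<char>(std::toupper(u));
        case Case::Lower: return static_cast<char>(std::tolower(u));
        case Case::NOP:   break;
    }
    return ch;
}

// Assumptions: ASCII only; '.', '!' and '?' end a sentence;
// any non-alphanumeric character ends a word.
std::string apply_casing(const std::string& text, const CasingRules& rules) {
    std::string out;
    out.reserve(text.size());
    bool sentence_start = true;  // the next word begins a sentence
    bool in_word = false;        // the previous character belonged to a word
    for (char ch : text) {
        unsigned char u = static_cast<unsigned char>(ch);
        if (std::isalnum(u)) {
            Case c = in_word ? rules.word_inside
                   : sentence_start ? rules.sentence
                   : rules.word_begin;
            out += apply_case(c, ch);
            in_word = true;
            sentence_start = false;
        } else {
            if (ch == '.' || ch == '!' || ch == '?') sentence_start = true;
            in_word = false;
            out += ch;
        }
    }
    return out;
}
```

With these assumptions, the three rule sets listed above correspond to `{Case::Upper, Case::Upper, Case::NOP}`, `{Case::Upper, Case::Upper, Case::Lower}` and `{Case::Upper, Case::NOP, Case::Lower}`.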

Having the possibility to apply such diverse (custom) casing would be a great
feature. I will attach a C++ program I wrote that implements this. However, it
started as my "Hello World" program and, although it ended up with template
classes, you should strongly consider rewriting it. (I used it to format
filenames.) If anyone wishes to enhance that program or to port it to other
systems/programming languages (like Java), feel free to do so.

Sincerely,

Leonard Mada
Comment 1 discoleo 2006-06-12 12:13:13 UTC
Created attachment 37074 [details]
Sample c++ proof of concept program
Comment 2 michael.ruess 2006-06-12 12:19:30 UTC
Reassigned to SBA.
Comment 3 ace_dent 2006-06-12 19:43:12 UTC
Dupe for Issue 1601?
It would be best to aggregate these 'Change Case' issues for implementation.

Regards,
Andrew
Comment 4 discoleo 2006-08-10 16:46:36 UTC
I discovered a 4th type of character:

we may have:
  SENTENCE_BEGIN
  WORD_BEGIN
  WORD_INSIDE, but also
  TABLE_CELL_BEGIN <------  NEW !!!!

I was recently frustrated by the continuous and automatic uppercasing performed
when writing text in a table. There should be an option to turn this off.
In any case, this is indeed another type of character, which is different from
SENTENCE_BEGIN.
Comment 5 stefan.baltzer 2006-10-27 16:36:05 UTC
SBA-> discoleo: About unwanted capitalisation, see
http://oooauthors.org/en/FAQs/Writer/Automatisms/025

SBA: This is code from the community. Hence I changed the issue type to "patch".
But this is a feature with UI changes, thus it needs a specification and the
blessings from User experience (FL). Related links:
Specification project:
http://specs.openoffice.org/
Specification WIKI page:
http://wiki.services.openoffice.org/wiki/Category:Specification
Specification template:
http://specs.openoffice.org/collaterals/template/OpenOffice-org-Specification-Template.ott

SBA->FME: First, we need your estimation of the "quality" of this code. 
"Some code" that requires a total rework "from scratch" by another developer
would not be a patch but a "normal feature request". Please proceed.
Comment 6 discoleo 2006-10-29 20:08:34 UTC
[QUOTE]
> SBA-> discoleo: About unwanted capitalisation, see
> http://oooauthors.org/en/FAQs/Writer/Automatisms/025

This does NOT really apply here, because the end user is left to choose what
gets capitalised and how precisely it gets capitalised. So it's up to the user
to choose whatever suits him best.

I have begun writing a specification, though I do not know many things. Also, the
code I posted is only a proof of concept. It should be adapted (and improved)
accordingly.

Summary of Specification:
Abstract
Users wish for more advanced casing options beyond the ordinary upper case and lower
case. This specification will present an advanced method of custom casing. Users
will be allowed to select (1) separate and independent casing methods for (a)
the letter starting a sentence, (b) starting a word and (c) all other letters.
(2) Valid casing options for the previous 3 groups of letters include (a) leave
case unchanged, (b) make upper case, (c) lower case, (d) toggle case and
possibly (e) other custom casing functions. (3) Users are also allowed to
customize (a) the detection of a new sentence, (b) of a word, (c) various
special characters propagating or (d) blocking a new sentence/ word, and (e)
special characters allowing propagation of a start new word when at the
beginning of a word but not inducing a new word if positioned in the middle of a
word.

I have also begun to draft the detailed specification and will attach the
alpha-version here.
Comment 7 discoleo 2006-10-29 20:13:19 UTC
Created attachment 40147 [details]
An alpha version of the Custom Casing Specification
Comment 8 frank.meies 2006-10-30 08:08:04 UTC
fme->sba: As discoleo already stated, his code is only a proof-of-concept
standalone program which cannot be integrated into OOo as is. Therefore I change
the issue type from PATCH to RFE.

fme->discoleo: To integrate your code into OOo, SwDoc::TransliterateText() in
sw/source/core/doc/docedt.cxx would be a good starting point. One question: I
had a first glance at your spec draft and saw that e.g., 'word start' and
'sentence start' should be user-definable. Why don't we just use the
functionality provided by our i18n module, there are already functions like
'isBeginWord' and 'startOfSentence'. I guess this would work fine in most cases.
Comment 9 discoleo 2006-10-30 10:50:53 UTC
discoleo->fme

[QUOTE]
> One question: I had a first glance at your spec draft
> and saw that e.g., 'word start' and 'sentence start'
> should be user-definable. Why don't we just use the
> functionality provided by our i18n module, there are
> already functions like 'isBeginWord' and 'startOfSentence'.
> I guess this would work fine in most cases.

While this is true in many cases, there are two (common) situations where this
will probably fail.

1. Some users may wish to case some text containing abbreviations, e.g.:
   - "This is e.g. an abbr. text" => we don't want the "an" and "text" to behave
like starting a new sentence
2. In some languages, a new sentence may start with a more specialized syntax:
   - dialogues in Romanian: "- This is a dialogue."
   - questions in Spanish: "?Should that be an inverted questionmark?"

There might be other situations where a user wishes that the sentence markers
are interpreted differently in his text and this customizing ability gives the
greatest flexibility. Everything is about having different choices for greatest
flexibility. But again, if there is no time to implement this, then I can live
with the provided functions (although with my solution, the code would be only
minimally more complex).
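
A rough sketch of how a user-defined abbreviation list could keep "e.g." and "abbr." from starting a new sentence is shown below. It is an illustration under stated assumptions, not the i18n module's behaviour and not the attached program: `period_ends_sentence` and `capitalize_sentences` are hypothetical names, text is ASCII, tokens are delimited by whitespace, and only '.' is treated as a sentence terminator. Leading markers such as the "- " of a dialogue line are simply skipped until the first letter is found.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Returns true if the '.' at index `dot` ends a sentence: it must be followed
// by whitespace (or the end of the text), and the whitespace-delimited token
// ending at it must not be a user-supplied abbreviation.
bool period_ends_sentence(const std::string& text, std::size_t dot,
                          const std::set<std::string>& abbreviations) {
    // A period inside a token, like the first one in "e.g.", ends nothing.
    if (dot + 1 < text.size() &&
        !std::isspace(static_cast<unsigned char>(text[dot + 1]))) {
        return false;
    }
    // Walk back to the start of the token.
    std::size_t start = dot;
    while (start > 0 &&
           !std::isspace(static_cast<unsigned char>(text[start - 1]))) {
        --start;
    }
    // The token includes its trailing period, e.g. "e.g." or "abbr.".
    std::string token = text.substr(start, dot - start + 1);
    return abbreviations.count(token) == 0;
}

// Upper-cases the first letter of every sentence and leaves the rest as is.
std::string capitalize_sentences(const std::string& text,
                                 const std::set<std::string>& abbreviations) {
    std::string out = text;
    bool sentence_start = true;
    for (std::size_t i = 0; i < out.size(); ++i) {
        unsigned char u = static_cast<unsigned char>(out[i]);
        if (std::isalpha(u)) {
            if (sentence_start) out[i] = static_cast<char>(std::toupper(u));
            sentence_start = false;
        } else if (out[i] == '.' &&
                   period_ends_sentence(out, i, abbreviations)) {
            sentence_start = true;
        }
    }
    return out;
}
```

Passing an empty abbreviation set gives the naive behaviour, where every period followed by a space starts a new sentence; the user-editable set is what makes the difference.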

The second important point: I am NOT a programmer; it just happens that I know
some C++. My problem is that I have absolutely no idea of OOo internals and
no time to learn them (by the way, I am also involved with 3 non-writer projects).
BUT issue 1601 is 5 years old and I would be very grateful if a developer takes
one or two afternoons and implements this. There were 15 votes for the issue
(and a number of dupes).
Comment 10 frank.meies 2006-11-01 10:07:02 UTC
FME: Changing issue type to "requirements".
Comment 11 frank.loehmann 2006-11-02 09:40:30 UTC
FME asked me to comment on this issue. I think the current solution in western
languages (UPPERCASE and lowercase only) is not sufficient. I would support
extending this feature, but the question (as always) is what is really needed by
our user group. Therefore I propose that someone should create scenarios for the
proposed change case functions, so that we could decide who needs this function
in which saturation. This feature seems to be language dependent as well.

These usage scenarios could help us to identify which feature should be added to
OOo. Initially I think the Custom Case dialog is way too much, but if someone
has a real-life scenario...

Comment 12 discoleo 2006-11-02 09:59:41 UTC
discoleo->fl

CUSTOM CASE:

 - a title taken from the net:
  "High Prevalence of Ceftazidime-Resistant Klebsiella pneumoniae and
   Increase of Imipenem-Resistant Pseudomonas aeruginosa and Acinetobacter spp.
   in Korea: a KONSAR Program in 2004"
   (see http://www.eymj.org/2006/pdf/10634.pdf, but actually many journal
    articles in English have their title formatted similarly)
 - lower case words: of, and, in
 - uppercase: most other words
 - special lower case: bacterial names - second word is lowercase
   -- Acinetobacter spp. (here "." does NOT start another sentence, too)
   -- Pseudomonas aeruginosa

My casing functions do not handle this one, and I believe it is far too complex
(and language dependent) to implement right now. BUT someone may later write a
great plugin to do just that, and it would be nice to already have the mechanisms
in place to use with the general case formatter (and NOT have to hack
afterwards).

While it is this feature I would like most (beyond the customizability of the
Case Formatter as implemented in my proof of concept program), this is far too
complex even for me to devise a (simple) solution. Too many exceptions and too
many special conditions make it a tough algorithm.
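
For illustration only, the simplest part of the journal-title scenario (keeping words like "of", "and", "in" in lower case) could be sketched as below. This is a hypothetical helper, not part of the proof of concept program: `title_case` and the `minor_words` parameter are assumed names, text is ASCII, a word is a run of letters (so a hyphen separates two words), and the minor-word lookup is case-sensitive. It does NOT handle the hard cases described above: it would wrongly capitalize "pneumoniae" and "aeruginosa".

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Upper-cases the first letter of every word except the listed minor words.
// The very first word is always capitalized. Letters inside a word are left
// unchanged (Case_NOP in the terminology of this issue).
std::string title_case(const std::string& text,
                       const std::set<std::string>& minor_words) {
    std::string out = text;
    bool first_word = true;
    std::size_t i = 0;
    while (i < out.size()) {
        if (!std::isalpha(static_cast<unsigned char>(out[i]))) {
            ++i;  // skip separators such as spaces, hyphens and punctuation
            continue;
        }
        // Find the end of the current run of letters.
        std::size_t end = i;
        while (end < out.size() &&
               std::isalpha(static_cast<unsigned char>(out[end]))) {
            ++end;
        }
        std::string word = out.substr(i, end - i);
        if (first_word || minor_words.count(word) == 0) {
            out[i] = static_cast<char>(
                std::toupper(static_cast<unsigned char>(out[i])));
        }
        first_word = false;
        i = end;
    }
    return out;
}
```

A plugin for species names or other exceptions would need more context than a flat word list, which is the difficulty described in this comment.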
Comment 13 ajaygautam 2007-12-27 04:04:38 UTC
Can we please have this functionality?