Apache OpenOffice (AOO) Bugzilla – Issue 66344
Title Case and More Advanced Casing
Last modified: 2013-08-07 14:38:26 UTC
There were a number of requests for title case:
http://www.openoffice.org/issues/show_bug.cgi?id=5502
http://www.openoffice.org/issues/show_bug.cgi?id=4834
and so on. HOWEVER, they miss the point (and were closed nevertheless). I propose a more comprehensive approach to this issue that allows very flexible casing, which is why I started a new request.

I will briefly describe this algorithm/feature.

In a text we may have:
- characters starting a sentence (_Sentence)
- characters starting a word, but not a sentence (_Word_Begin)
- characters inside a word (_Word_Inside)

We may have the following casing options:
- Case_NOP (NO Operation): leave case as is
- Case_Upper: make uppercase
- Case_Lower: make lowercase
- Case_Custom: some custom casing function (like reverse, weird, ...)

The casing applied by OO.o would be:
- _Sentence = Case_Upper
- _Word_Begin = Case_Upper
- _Word_Inside = Case_NOP

But I would prefer, e.g.:
- _Sentence = Case_Upper
- _Word_Begin = Case_Upper
- _Word_Inside = Case_Lower

while others may want:
- _Sentence = Case_Upper
- _Word_Begin = Case_NOP
- _Word_Inside = Case_Lower

Having the possibility to apply such diverse (custom) casing would be a great feature.

I will attach a C++ program I wrote that implements this. However, it started as my "Hello World" program and, although it ended up with template classes, you should strongly consider rewriting it. (I used it to format filenames.) If anyone wishes to enhance that program or to port it to other systems/programming languages (like Java), feel free to do so.

Sincerely,
Leonard Mada
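For illustration only, the three character classes and the casing options from the description above could be sketched roughly as follows. This is NOT the attached proof-of-concept program: the names CaseOp, CasingPolicy and apply_casing are hypothetical, only ASCII text is handled, and a sentence is assumed to end at '.', '!' or '?'.

```cpp
#include <cctype>
#include <string>

// Casing options from the description (Case_Custom is left out of this sketch).
enum class CaseOp { Nop, Upper, Lower };

// One casing option per character class.
struct CasingPolicy {
    CaseOp sentence;     // _Sentence: first character of a sentence
    CaseOp word_begin;   // _Word_Begin: first character of a word, not of a sentence
    CaseOp word_inside;  // _Word_Inside: every other character of a word
};

static char apply_op(CaseOp op, char c) {
    unsigned char u = static_cast<unsigned char>(c);
    switch (op) {
        case CaseOp::Upper: return static_cast<char>(std::toupper(u));
        case CaseOp::Lower: return static_cast<char>(std::tolower(u));
        case CaseOp::Nop:   break;
    }
    return c;
}

// Assumptions: words are runs of ASCII letters/digits; '.', '!' and '?'
// mark the end of a sentence. Real text needs locale-aware detection.
std::string apply_casing(const std::string& text, const CasingPolicy& policy) {
    std::string out = text;
    bool sentence_start = true;  // the next word begins a sentence
    bool in_word = false;
    for (char& c : out) {
        unsigned char u = static_cast<unsigned char>(c);
        if (std::isalnum(u)) {
            if (!in_word) {
                c = apply_op(sentence_start ? policy.sentence : policy.word_begin, c);
                sentence_start = false;
                in_word = true;
            } else {
                c = apply_op(policy.word_inside, c);
            }
        } else {
            in_word = false;
            if (c == '.' || c == '!' || c == '?') sentence_start = true;
        }
    }
    return out;
}
```

With the policy {Upper, Upper, Lower}, the input "hELLO wORLD. foo bAR" becomes "Hello World. Foo Bar"; with {Upper, Upper, Nop} the characters inside each word are left untouched. Note that an apostrophe ends a word in this sketch, so "don't" would need extra handling.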
Created attachment 37074 [details] Sample c++ proof of concept program
Reassigned to SBA.
Dupe for Issue 1601? It would be best to aggregate these 'Change Case' issues for implementation. Regards, Andrew
I discovered a 4th type of character. We may have:
- SENTENCE_BEGIN
- WORD_BEGIN
- WORD_INSIDE
but also
- TABLE_CELL_BEGIN <------ NEW !!!!

I was recently frustrated by the continuous and automatic uppercasing performed when writing text in a table. There should be an option to turn this off. In any case, this is indeed another type of character, which is different from SENTENCE_BEGIN.
SBA-> discoleo: About unwanted capitalisation, see http://oooauthors.org/en/FAQs/Writer/Automatisms/025

SBA: This is code from the community. Hence I changed the issue type to "patch". But this is a feature with UI changes, thus it needs a specification and the blessing of User Experience (FL).

Related links:
Specification project: http://specs.openoffice.org/
Specification WIKI page: http://wiki.services.openoffice.org/wiki/Category:Specification
Specification template: http://specs.openoffice.org/collaterals/template/OpenOffice-org-Specification-Template.ott

SBA->FME: First, we need your estimation of the "quality" of this code. "Some code" that requires a total rework "from scratch" by another developer would not be a patch but a "normal feature request". Please proceed.
[QUOTE]
> SBA-> discoleo: About unwanted capitalisation, see
> http://oooauthors.org/en/FAQs/Writer/Automatisms/025

This does NOT really apply here, because the end user is left to choose what gets capitalised and how precisely it gets capitalised. So it's up to the user to choose whatever suits him best.

I have begun writing a specification, though there are many things I do not know. Also, the code I posted is only a proof of concept. It should be adapted (and improved) accordingly.

Summary of Specification:

Abstract
Users want more advanced casing options beyond the ordinary upper case and lower case. This specification will present an advanced method of custom casing.
(1) Users will be allowed to select separate and independent casing methods for (a) the letter starting a sentence, (b) the letter starting a word and (c) all other letters.
(2) Valid casing options for these 3 groups of letters include (a) leave case unchanged, (b) make upper case, (c) make lower case, (d) toggle case and possibly (e) other custom casing functions.
(3) Users are also allowed to customize (a) the detection of a new sentence, (b) the detection of a word, (c) various special characters propagating or (d) blocking a new sentence/word, and (e) special characters that propagate the start of a new word when at the beginning of a word but do not induce a new word when positioned in the middle of a word.

I have also begun to draft the detailed specification and will attach the alpha version here.
Created attachment 40147 [details] An alpha version of the Custom Casing Specification
fme->sba: As discoleo already stated, his code is only a proof-of-concept standalone program which cannot be integrated into OOo as is. Therefore I am changing the issue type from PATCH to RFE.

fme->discoleo: To integrate your code into OOo, SwDoc::TransliterateText() in sw/source/core/doc/docedt.cxx would be a good starting point.

One question: I had a first glance at your spec draft and saw that e.g. 'word start' and 'sentence start' should be user-definable. Why don't we just use the functionality provided by our i18n module? There are already functions like 'isBeginWord' and 'startOfSentence'. I guess this would work fine in most cases.
discoleo->fme

[QUOTE]
> One question: I had a first glance at your spec draft
> and saw that e.g., 'word start' and 'sentence start'
> should be user-definable. Why don't we just use the
> functionality provided by our i18n module, there are
> already functions like 'isBeginWord' and 'startOfSentence'.
> I guess this would work fine in most cases.

While this is true in many cases, there are two (common) situations where this will probably fail.

1. Some users may wish to case some text containing abbreviations, e.g.:
- "This is e.g. an abbr. text" => we don't want the "an" and "text" to behave like starting a new sentence

2. In some languages, a new sentence may start with a more specialized syntax:
- dialogues in Romanian: "- This is a dialogue."
- questions in Spanish: "?Should that be an inverted question mark?"

There might be other situations where a user wishes the sentence markers to be interpreted differently in his text, and this ability to customize gives the greatest flexibility. Everything is about having different choices for the greatest flexibility. But again, if there is no time to implement this, then I can live with the provided functions (although with my solution, the code would be only minimally more complex).

The second important point: I am NOT a programmer, it just happens that I know some C++. And my problem is that I have absolutely no idea of OOo internals and no time to learn them (by the way, I am also involved with 3 non-Writer projects). BUT issue 1601 is 5 years old and I would be very grateful if a developer took one or two afternoons and implemented this. There were 15 votes for the issue (and a number of dupes).
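As a rough illustration of the abbreviation case above, a customizable sentence-end check could take a user-supplied abbreviation list. This is only a sketch under stated assumptions: ends_sentence is a hypothetical helper, not part of OOo's i18n module or of the attached program, and it only looks at '.' in whitespace-separated ASCII text.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper: returns true if the '.' at index `dot` ends a sentence.
// Assumptions: a dot followed directly by a non-space character is inside a
// token (as in "e.g."), and a token listed in `abbreviations` (written with
// its trailing dot, lowercase) never ends a sentence.
bool ends_sentence(const std::string& text, std::size_t dot,
                   const std::set<std::string>& abbreviations) {
    if (dot >= text.size() || text[dot] != '.') return false;

    // A dot in the middle of a token is not a sentence end.
    if (dot + 1 < text.size() &&
        !std::isspace(static_cast<unsigned char>(text[dot + 1]))) {
        return false;
    }

    // Walk back to the start of the whitespace-delimited token.
    std::size_t start = dot;
    while (start > 0 &&
           !std::isspace(static_cast<unsigned char>(text[start - 1]))) {
        --start;
    }

    // Compare the lowercased token, including its final dot, with the list.
    std::string token = text.substr(start, dot - start + 1);
    for (char& c : token) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return abbreviations.count(token) == 0;
}
```

With the list {"e.g.", "abbr.", "spp."} and the text "This is e.g. an abbr. text.", only the final dot is reported as a sentence end, so "an" and "text" would not be treated as sentence starts.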
FME: Changing issue type to "requirements".
FME asked me to comment on this issue.

I think the current solution for western languages (UPPERCASE and lowercase only) is not sufficient. I would support extending this feature, but the question (as always) is what is really needed by our user group. Therefore I propose that someone create scenarios for the proposed change case functions, so that we can decide who needs this function in which situation. This feature seems to be language dependent as well. These usage scenarios could help us identify which features should be added to OOo.

Initially I think the Custom Case dialog is way too much, but if someone has a real-life scenario...
discoleo->fl

CUSTOM CASE:
- a title taken from the net: "High Prevalence of Ceftazidime-Resistant Klebsiella pneumoniae and Increase of Imipenem-Resistant Pseudomonas aeruginosa and Acinetobacter spp. in Korea: a KONSAR Program in 2004" (see http://www.eymj.org/2006/pdf/10634.pdf, but actually many journal articles in English have their titles formatted similarly)
- lower case words: of, and, in
- uppercase: most other words
- special lower case: bacterial names, where the second word is lowercase
-- Acinetobacter spp. (here "." does NOT start another sentence, either)
-- Pseudomonas aeruginosa

My casing functions do not handle this one, and I believe it is far too complex (and language dependent) to implement right now. BUT someone may later write a great plugin to do just that, and it would be nice to already have the mechanisms in place to use it with the general case formatter (and NOT have to hack it in afterwards).

While this is the feature I would like most (beyond the customizability of the Case Formatter as implemented in my proof-of-concept program), it is far too complex even for me to devise a (simple) solution. Too many exceptions and too many special conditions make it a tough algorithm.
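Only the simplest part of the scenario above, the list of lower case words (of, and, in), is easy to sketch. The following is an assumption-laden illustration, not the attached program: title_case is a hypothetical helper, it splits on whitespace only (collapsing repeated spaces), handles ASCII only, and does nothing about bacterial names or abbreviations such as "spp.".

```cpp
#include <cctype>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: uppercases the first character of every word except
// those listed in `minor_words`. The first word is always capitalised.
// Characters inside a word are left unchanged (_Word_Inside = Case_NOP).
std::string title_case(const std::string& text,
                       const std::set<std::string>& minor_words) {
    std::istringstream in(text);
    std::string word;
    std::string out;
    bool first = true;
    while (in >> word) {
        if (first || minor_words.count(word) == 0) {
            word[0] = static_cast<char>(
                std::toupper(static_cast<unsigned char>(word[0])));
        }
        if (!first) out += ' ';
        out += word;
        first = false;
    }
    return out;
}
```

For example, title_case("high prevalence of resistant strains in korea", {"of", "and", "in"}) yields "High Prevalence of Resistant Strains in Korea". Species names such as "Pseudomonas aeruginosa" would still need a separate exception mechanism, which is the hard part described above.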
Can we please have this functionality?