Apache OpenOffice (AOO) Bugzilla – Issue 14351
Charcasemap.titlecase wrongly defined to act only on first letter
Last modified: 2013-02-24 21:08:18 UTC
Setting the charcasemap of a word to "titlecase" only affects the first letter of a word (see API reference, which says: "const short TITLE = 3; Description: The first character of each word is put in upper case."). But this is wrong. A word is not in title case if it is all in upper case; so title case needs to set the first letter to upper case (as it presently does) and go on to set all subsequent letters to lower case as well.
Test case:
1. Use a macro for cycling through the three possible settings of a word and back to the original.
Starting with a lower case word gives you the correct sequence: lower -> LOWER -> Lower -> lower
Starting with an upper case word, however, gives: LOWER -> LOWER -> lower -> LOWER
Starting with a word where the caps lock key has been depressed along with shift lock gives: lOWER -> LOWER -> LOWER -> lower
Quick macro to show this in action:
==========
Sub Big_CaseChanger
    ' silly macro by Andrew Brown
    ' This works on words and arbitrary ranges of text,
    ' but doesn't know intelligently about title case: ie
    ' if you give it a string of words, it will set only the first one to title case.
    ' Probably doesn't work in tables, footnotes, or cells.
    ' there is a hack that would make this function, but life's too short
    '
    ' if you assign it to a key, successive key presses
    Dim oDocument, oDesktop as Object
    Dim oText, alpha, omega as Object
    Dim oVCursor, mySelection As Object
    Dim snot as string
    'error handling stolen from Paolo Mantovani
    On Error GoTo ErrH:
    oDocument = thisComponent
    oText = oDocument.Text
    oVCursor = oDocument.currentcontroller.getViewCursor()
    snot = oVCursor.getString()
    alpha = oVCursor.getStart()
    omega = oVCursor.getEnd()
    If len(snot) > 0 Then
        ' there is a selection
        mySelection = oText.createTextCursorByRange(alpha)
        mySelection.goToRange(omega, TRUE)
    else
        ' the cursor is a point; grab the word it's in.
        mySelection = oText.createTextCursorByRange(oVCursor.getstart())
        mySelection.gotoStartOfWord(FALSE)
        mySelection.gotoEndOfWord(TRUE)
    end if
    ' this version ( 29/4/03)
    ' has reverted to charcasemap for use with 644_m11 and above
    ' though it has a workaround for the bust title case
    ' call in the API, which only affects the first char but
    ' should change the subsequent ones to lowercase, too.
    if mySelection.charCaseMap = 0 then
        mySelection.charcaseMap = 1
    elseif mySelection.charcaseMap = 1 then
        mySelection.charcaseMap = 2
    elseif mySelection.charcaseMap = 2 then
        ' the next line, commented out here, is my workaround
        ' myselection.setString(lcase(myselection.getString))
        mySelection.charcaseMap = 3
    elseif myselection.charcaseMap = 3 then
        mySelection.charcaseMap = 0
    else
        mySelection.charcaseMap = 0
    end if
ExitPoint:
    Exit Sub
ErrH:
    MsgBox "Error " & err & ": " & error$ & chr(13) & _
        "In line : " & Erl & chr(13) & Now , 16 ,"error occurred"
    Resume ExitPoint
End Sub
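The difference between the documented behaviour and the requested one can be sketched outside the office suite. The following is a minimal Python model, not the actual implementation: the function names are made up for illustration, and it works on plain strings, whereas charCaseMap is a character attribute applied to a text range. The third function mirrors the commented-out workaround in the macro (lowercase the text first, then apply title case).

```python
def first_letter_upper(word: str) -> str:
    """Model of the documented TITLE behaviour: upper-case the first
    character and leave every other character untouched."""
    return word[:1].upper() + word[1:]


def strict_title(word: str) -> str:
    """Model of the requested behaviour: first character upper case,
    all following characters lower case."""
    return word[:1].upper() + word[1:].lower()


def workaround(word: str) -> str:
    """Model of the macro's workaround: lowercase everything first,
    then apply the documented first-letter rule."""
    return first_letter_upper(word.lower())


if __name__ == "__main__":
    # The three starting words from the test case in the report.
    for w in ("lower", "LOWER", "lOWER"):
        print(f"{w!r}: documented={first_letter_upper(w)!r} "
              f"requested={strict_title(w)!r} workaround={workaround(w)!r}")
```

Under this model the documented rule leaves "LOWER" unchanged and turns "lOWER" into "LOWER", while the requested rule and the workaround both give "Lower" for all three inputs.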
It works as specified, doesn't it? For compatibility reasons, we will not change existing specifications. All we could do is specify a NEW (optional) property with the behaviour you want. Thus, I am changing this to an enhancement request and forwarding it to the responsible developer.
Compatible with what, may I ask? The meaning of "title case" is quite clear in English: it is that the first letter of each word is capped up _and the rest are not_. a) This Is Title Case. b) THis IS NOt. c) nEITHER iS tHIS. That's also the way in which the function is actually used. I can't think of any time in which people would want to change only the first letter and not the rest, unless this behaviour produced my first example. I do think that an enhancement ("proper title case"?) is the wrong way to go here, because I can't believe that there is any demand for the feature as it presently works in cases (b) and (c). So I ask again, what is the compatibility we would be breaking?
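The definition given above (first letter capped up, the rest not) is simple enough to check mechanically. Here is a small Python sketch of such a check, applied to examples (a) to (c); the function names are hypothetical and words are split on whitespace only, which is an assumption that ignores locale-specific word-breaking rules.

```python
def is_title_case_word(word: str) -> bool:
    """True if the first character is upper case and no later
    character is upper case."""
    if not word or not word[0].isupper():
        return False
    rest = word[1:]
    return rest == rest.lower()


def is_title_case(phrase: str) -> bool:
    """True if every whitespace-separated word is in title case."""
    return all(is_title_case_word(w) for w in phrase.split())


if __name__ == "__main__":
    for example in ("This Is Title Case.", "THis IS NOt.", "nEITHER iS tHIS."):
        print(f"{example!r}: {is_title_case(example)}")
```

Only example (a) passes the check; (b) fails on the second upper-case letter of "THis", and (c) fails on the lower-case first letter of "nEITHER".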
Set target to OOo 2.0.
TL->FME: The behaviour in the API is the same as in the UI. Please have a look.
FME: Works as designed. FME->FT: We cannot change our current behavior here without affecting existing documents. Do we really want to have a compatibility flag here?
Can I butt in again here to say that I find it very hard to imagine that anyone would have used title case in a way that would be broken by changing it. Under the present system, THIS counts as title case. So does THIs. Both are clearly wrong, both in English and in any other language I know.
FT: Well, this is quite some issue. "TEst" is _no_ title, no doubt. On the other hand, every document that uses this will be rendered differently! This is especially dangerous for "TITEL" changing to "Titel"! FT->FL: Please give a second opinion on this. Thx.
I don't understand this point. It seems to me that "TITEL" is upper case, "Titel" is title case, and "titel" is lower case. There should be no fourth case (except for recognised exceptions in the autocorrect exception table, like CDs). Something like TITel is just wrong, unless explicitly chosen by the user.
FL: Proper names like StarOffice would be misspelled if the case of letters other than the first were changed as well. I do not see any problem with the current behavior if a correctly spelled sentence is given.
But words like StarOffice aren't title case. They are camel case. This is important, and not just semantics. The difference is that you can't make a general rule for CamelCase (which is, in any case, an illiterate abomination in English. It has no legitimate use except in some corporate names and in code). You have to know which are the two words rammed together to make the camelcased one. That kind of thing has to be dealt with by specific exceptions. Suppose we have a word like staroffice, in lower case. If I want to convert it to upper case, then the algorithm is simple. If I want to convert it to title case, then, again, the rule is to cap up the first letter and lower case the rest. If I want to convert it to camel case, no general rule is possible. The program can't, as a matter of principle, distinguish between STaroffice, StaroFFice, StaRoffice and so on. All except one of these must be wrong, and we don't know which one. We can't know which one. Title Case, lower case, and UPPER CASE, on the other hand, are rule-bound. You can look at a word at once (and so can the program) and tell which state it is in. Camel case is a necessarily undefined state, so the program should not be asked to change words to it, which is what the present implementation does. We do in fact already have a camelcase exception dictionary in the autocorrect module somewhere. If we really want to worry about being able to change, say, STarOffice to StarOffice, then the title case change routine should check that dictionary. I think myself that this is too much hassle, at least for English, and that title case should just do what it says on the tin. If people want to use camelcased words, then let them do so by hand. But I don't know if that solution can be generalised across all other languages. For all I know, there are languages in which camel case is common and legitimate.
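The claim that lower, upper and title case are rule-bound while camel case is not can be illustrated with a short Python sketch. The function name and the "mixed" label are assumptions made for this example; "mixed" simply means that none of the three rules matches.

```python
def case_state(word: str) -> str:
    """Classify a word by comparing it with its rule-generated forms.

    Returns "lower", "upper" or "title" when the word equals the output
    of the corresponding rule, and "mixed" otherwise (camel case and any
    other irregular capitalisation).
    """
    if word == word.lower():
        return "lower"
    if word == word.upper():
        return "upper"
    if word == word[:1].upper() + word[1:].lower():
        return "title"
    return "mixed"


if __name__ == "__main__":
    for w in ("staroffice", "STAROFFICE", "Staroffice",
              "StarOffice", "STaroffice", "StaroFFice", "StaRoffice"):
        print(f"{w}: {case_state(w)}")
```

Every irregular spelling lands in the same "mixed" bucket and lowercases to the same string, so string rules alone cannot tell the intended StarOffice apart from STaroffice or StaRoffice; that knowledge has to come from an exception list.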
FT: After second, third, etc. thoughts on your idea, here's the final decision: We will not change our current behaviour (at least not for 2.0). Reason: If you have a title that says "IBM and Sun love StarOffice", your suggestion would make a completely wrong "Ibm And Sun Love Staroffice" out of it. Our implementation still is not 100% OK (e.g. for UK/US): "IBM And Sun Love StarOffice", but it at least leaves the special cases intact. If you have problems with "STarOffice", then just run the spellchecker on your text beforehand. Again: Our current implementation is not 100% OK but closer to the truth than your suggestion. We will reconsider this feature with a more intelligent (locale-dependent) feature set after 2.0.
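For comparison, the headline example can be run through both rules and through the exception-dictionary idea raised earlier in the thread. This is only a Python sketch: the EXCEPTIONS table and all function names are hypothetical, and the real autocorrect exception list is not consulted.

```python
# Hypothetical exception table: lower-cased key -> required spelling.
EXCEPTIONS = {"ibm": "IBM", "staroffice": "StarOffice"}


def first_letter_upper(word: str) -> str:
    """Current behaviour as described: only the first character changes."""
    return word[:1].upper() + word[1:]


def strict_title(word: str) -> str:
    """Requested behaviour: first character upper, the rest lower."""
    return word[:1].upper() + word[1:].lower()


def title_with_exceptions(word: str) -> str:
    """Strict title case, unless the word is listed as an exception."""
    return EXCEPTIONS.get(word.lower(), strict_title(word))


def convert(phrase: str, rule) -> str:
    """Apply a per-word rule to a whitespace-separated phrase."""
    return " ".join(rule(w) for w in phrase.split())


if __name__ == "__main__":
    headline = "IBM and Sun love StarOffice"
    for rule in (first_letter_upper, strict_title, title_with_exceptions):
        print(f"{rule.__name__}: {convert(headline, rule)}")
    # A mistyped word is only repaired by the exception-aware rule.
    print(first_letter_upper("STarOffice"), title_with_exceptions("STarOffice"))
```

In this sketch the strict rule produces the "Ibm And Sun Love Staroffice" result objected to above, while the exception-aware rule keeps IBM and StarOffice intact and also repairs "STarOffice", at the cost of maintaining the exception table.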
close
I'm not going to make more than one final protest. There are more urgent problems for 2.0, no doubt, and I have a workaround for this one anyway. But I think your example makes clear that this is ultimately a linguistic problem. "Titel" means something very different from "Title", and the closeness of the two words has been misleading both of us. The example you quote is not a "title" -- it's a headline. "Title" in the English "title case" context means "Buchtitel". Just looking at the Leo online dictionary, I can see a whole range of meanings for the German "Titel" which aren't included in English "title case" -- "caption", "cover", "header", "heading", "subsection". So when I say "title case", I mean something that could appear on the spine of a book. You mean something that might appear as a heading in a long document. I don't think, for reasons that I have made clear, that yours is a logically coherent category, comparable to upper case or lower case. I don't want this to develop into one of those interminable online arguments :-) But it is worth noting that this is one of the rare moments when OOo's German heritage makes it do something that looks very odd to English (or American) eyes. Thank you for all the time and thought you've spent -- even if it led you to the wrong conclusion :)