Apache OpenOffice (AOO) Bugzilla – Issue 17169
Add automatic non breaking space before certain punctuation marks (for French)
Last modified: 2017-05-20 10:20:29 UTC
Hi, It would be great to have automatic unbreakable space before and after double signs like ';' ':' '?' '!'. There may be others. Or have the possibility to add them to Autocorrect/Autoformat. TIA Best - Sophie
Added Hervé Tanguy in cc as he is the initial reporter of this issue. Sophie
Reassigned to BH
*** Issue 24355 has been marked as a duplicate of this issue. ***
set keywords, target and reassign issue according to RFE process - Sophie
This little feature can save users a lot of time. It does not seem difficult to implement. Word has included it for a long time, so its absence is perceived as a regression when migrating, and OOo's reputation suffers.
Please note that whether a space before punctuation marks is desired depends on the typographic conventions of the country/language. German, for example, doesn't have a space before punctuation marks. And an unbreakable space after the characters seems counterproductive. And why should bla bla bla bla! bla bla break bla bla bla bla! bla bla instead of bla bla bla bla! bla bla? Also, a duplicate issue cannot be dependent on the original issue.
Hi, 1/ I have to correct the purpose of this issue: it is 'Add automatic unbreakable space BEFORE typo DOUBLE signs (;:!?)'. An unbreakable space is not required after them. 2/ I understand that it is a French-specific typographic rule, so it seems difficult to take into account; nevertheless Word does it, and it saves lots of keystrokes and prevents bad line breaks. As OOo is often presented as more 'language specific' than MSO, one doesn't take that...
*** Issue 59005 has been marked as a duplicate of this issue. ***
*** Issue 62228 has been marked as a duplicate of this issue. ***
SBA: Adjusted summary for clarity and to ease summary string queries.
*** Issue 67561 has been marked as a duplicate of this issue. ***
The easiest and most flexible way to implement this would be to enhance the AutoCorrect feature a little, i.e., to allow specifying the behavior to delete the space and, optionally, to insert special codes like hard spaces. Then we could implement generic rules like " ," -> "[Delete-space],", " !?" -> "[Delete-space]!?" or, for French " !?" -> "[Hard-space]!?". Probably this wouldn't need to change the BreakIterator, and only implement two behaviors: "delete space, if any" and "insert hard space", and two special codes which could be rendered fancifully on the screen (like <- or _ on the icon).
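The two behaviors proposed above ("delete space, if any" and "insert hard space") can be sketched roughly as follows. This is only an illustration in Python, not OOo code; the function names, the RULES table, and the language keys are hypothetical.

```python
import re

HARD_SPACE = "\u00a0"  # U+00A0 NO-BREAK SPACE


def delete_space(text, marks):
    """Behavior 1: delete ordinary spaces, if any, before the given marks."""
    pattern = " +(?=[" + re.escape(marks) + "])"
    return re.sub(pattern, "", text)


def insert_hard_space(text, marks):
    """Behavior 2: turn ordinary spaces before the given marks into one hard space."""
    pattern = " +(?=[" + re.escape(marks) + "])"
    return re.sub(pattern, HARD_SPACE, text)


# Hypothetical per-language choice of behavior for "!?" (an assumption).
RULES = {"default": delete_space, "fr": insert_hard_space}


def autocorrect(text, language):
    text = delete_space(text, ",")  # generic rule: " ," -> ","
    rule = RULES.get(language, RULES["default"])
    return rule(text, "!?")
```

With these assumptions, autocorrect("Bonjour , monde !", "fr") keeps a hard space before the "!", while any other language code simply drops the space.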
*** Issue 74151 has been marked as a duplicate of this issue. ***
Please do implement this, thank you.
Surely it's because I know nothing about it, but I can't figure out why OOo can't do with double signs what it manages to do with closing double quotes...
Indeed, it works fine for " converted to «nbsp or nbsp» so I guess that doing the same for other punctuation symbols should be possible?
not really, the problem is that a normal space is not treated the same as the punctuation mark by OOo; it is only a separator and it cannot be replaced using automatic correction but maybe changing the behavior of OOo in regards to breaking words would allow to treat " «" as a single token... I'll look at that.
By the way, the current release of LanguageTool grammar checker has a rule to deal with this problem in French, see http://www.danielnaber.de/languagetool/
> not really the problem is that a normal space is not treated > the same as the punctuation mark by OOo; it is only a separator > and it cannot be replaced using automatic correction But to implement this feature, there is no need to replace space characters. I still don't get the difference you're trying to explain. With quotes, a single non-space character " is replaced by either «[nbsp] or [nbsp]». Here, the situation is similar: a single non-space character (:;?!) should be replaced by [nbsp]: or [nbsp]; or [nbsp]? or [nbsp]!. (Of course, using non-breaking *thin*spaces would be even better.) > but maybe changing the behavior of OOo in regards to breaking > words would allow to treat " «" as a single token... I'll look > at that. Quote handling is already implemented and in my experience it works perfectly.
I'm afraid you're wrong :( See the example: C'est «vrai» ! It's without hard spaces but with normal spaces. Now, OpenOffice.org can replace « with hard space and «, but this will get appended to the existing space. So there will be a sequence: space, hard space, «. Hardly what you want.
I do not get what you mean with that example. If you type c'est "vrai" it will be displayed as c'est «[nbsp]vrai[nbsp]» by OOo. French & Belgian keyboard layouts do not have « » but only "
OK, I'm using a heavily customized keyboard with «», that's why I didn't see that. Anyway, with ":" or "!" it's much harder. If you add "!" to your autocorrection, it will be replaced only if it is already after a space, so then you can add hard space before it. But it will not erase the space that precedes that. Example: C'est vrai : ... Will get replaced: C'est vrai [nbsp]: ... Is that now clear?
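The effect described above can be reproduced with a toy model of a word-based replacement table. This is a simplified sketch, not the actual AutoCorrect code: the function name and the table are hypothetical, and "words" are simply whatever sits between ordinary spaces.

```python
NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE


def replace_table_autocorrect(text, table):
    """Toy model: replace whole space-delimited words found in the table.

    The separating spaces are never touched, only the words themselves.
    """
    words = text.split(" ")
    return " ".join(table.get(word, word) for word in words)


# Hypothetical table entry: ":" -> hard space followed by ":"
TABLE = {":": NBSP + ":"}
```

In this model "C'est vrai : ..." becomes "C'est vrai [nbsp]: ..." (the ordinary space stays in front of the hard space), and "C'est vrai: ..." is left untouched because the colon is not a word of its own.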
I will attach a test document so that you understand what this is all about. There is no space before the "!" That's the problem. Why should I type it manually? I'm not doing it for the quotes, so why should I do it for ; : ? ! That doesn't make sense. Especially since it would be wrong, because this space must be non-breaking! (Well, currently I must insert a manual non-breaking space, but that's only a workaround until OpenOffice takes care of this issue.)
Created attachment 44407 [details] test document to clarify the situation
I know exactly what this is about, after all I wrote the proper rule for LanguageTool, and you're welcome to test it. The problem is that: (1) OpenOffice.org autocorrect doesn't work for punctuation marks that are not preceded by whitespace (tabs and spaces); (2) OpenOffice.org autocorrect works for punctuation marks preceded by whitespace but it can only add another hard space to that whitespace. We should change the behavior of autocorrect so that it wouldn't start working only after a word-breaking symbol. But you can already install LanguageTool for OOo and it will correct all these cases, however this will not happen automatically.
> (1) OpenOffice.org autocorrect doesn't work for punctuation marks > that are not preceded by whitespace (tabs and spaces); *Wrong*, it *does* work for quotation marks. example: "Bonjour" -> « Bonjour » The second quote *is not* preceded by whitespace, yet it's transformed. I think I've finally understood where the confusion comes from: What I and probably others want is an implementation similar to the quote handling (FOURTH tab of the AutoCorrect dialog) and not one based on "Replace" (FIRST tab of AutoCorrect dialog). It is about custom code to handle this peculiarity in French typography. Just like there are custom algorithms to handle quotes in different languages. AutoCorrect is more than just a simple word replacement table. Anyway, thanks for your patience and caring about this issue.
Gosh, of course by "punctuation marks" I meant "punctuation marks excluding apostrophes and quotations marks" because they have their own routines in automatic correction. Anyway, please test LanguageTool. I implemented this for you :)
I've just tried LanguageTool. It doesn't handle apostrophes correctly if they are replaced by typographical ones. (Something about a missing opening or closing quote.) Since I'm using Antidote for my correction purposes, I disabled everything except space handling, and it's a nice workaround for the time being. Thanks.
That's what I meant when I talked about the way of handling the closing double quotes. When OOo replaces " by [nbsp]», it does exactly what we need. It does even more than what we want it to do, since there is no need to replace the double punctuation signs in this case. I understand that quotation marks have their own routines, but would it be such a big workload to add the same routines (or nearly so) for double signs? There is a French proverb that says "Qui peut le plus peut le moins" (whoever can do more can do less)...
ggs wrote: > There is no space before the "!" That's the problem. Why should I type it > manually? I'm not doing it for the quotes, so why should I do it for ; : ? ! > That doesn't make sense. Especially since it would be wrong, because this > space must be non-breaking! AFAIK, MS Office needs you to type a white space, and then it replaces it with a non-breaking space (I haven't used it for a long time, though). This may not be such a bad behavior; we can discuss that. Second point: standard apostrophes (U+0027 APOSTROPHE, i.e. ' ) can be replaced by AutoCorrection (fourth tab) with U+2019 RIGHT SINGLE QUOTATION MARK, i.e. ’, which "is preferred for apostrophe" (following the Unicode specification). This replacement function is primarily intended for the English single quotation marks U+2019 and U+2018 (right and left marks). But it works perfectly for apostrophes too, at least in English and French. Note that all apostrophes should be U+2019 for typographical reasons: these are smarter. This AutoCorrection rule should be the default for French too. Maybe this would be an issue for German, I don't know. I'm aware this post is partly out of the scope of this bug, but these are related issues: how to set replacement rules that can be adapted for each language automatically. We have to keep this in mind when trying to design a new typo system. Thanks for working on this *really annoying bug*.
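As a rough illustration of the apostrophe replacement mentioned above, here is a minimal Python sketch. It is not the AutoCorrect routine itself; the function name is hypothetical, and it only handles an apostrophe sitting between two word characters, which is an assumption made to keep the example short.

```python
import re

APOSTROPHE = "\u0027"                   # U+0027 APOSTROPHE
RIGHT_SINGLE_QUOTATION_MARK = "\u2019"  # U+2019, preferred for apostrophe


def typographic_apostrophes(text):
    """Replace U+0027 with U+2019 when it sits between two word characters."""
    pattern = r"(?<=\w)" + APOSTROPHE + r"(?=\w)"
    return re.sub(pattern, RIGHT_SINGLE_QUOTATION_MARK, text)
```

Quotation-style uses (an apostrophe at the very start or end of a word) are deliberately left alone here, since deciding between U+2018 and U+2019 for those needs the opening/closing logic of the quote routines.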
The punctuation rules in French require a non-breaking space before colon ( : ), semicolon ( ; ), exclamation ( ! ) and question mark ( ? ). In Microsoft Office, when the language is set to French, those non-breaking spaces are automatically inserted, but in OpenOffice that's not the case. You should add an option that automatically adds non-breaking spaces before those signs, as the user types. But contrary to what was said in the original post, non-breaking space after the punctuation sign should not be added. There's an extension for OpenOffice that offers a temporary solution : http://extensions.services.openoffice.org/project/insecable
I was surprised, when I tried Open Office, to discover that American quotation marks are transformed into French quotation marks (unbreakable space included) but that no unbreakable space is inserted before double/high punctuation. I was surprised, in part because I have been using Microsoft Word for years and never had to think about it (and because Open Office presents itself as a functional alternative to Microsoft Office), and in part because the French Ministry of Culture recently switched its thousands of computers from Microsoft Office to Open Office (which seems to be something rather significant, in regards to Open Office's official recognition as a better alternative to Microsoft Office). I know there's a patch that allows for unbreakable spaces to be inserted automatically, but that works clumsily with texts that mix French and English (which may seldom be an issue for most American users, but is not an infrequent one for many a French user).
This problem doesn't call for the same development as quote replacement, because it doesn't involve the notion of opening/closing. Once repaired to handle this case, a simple replace rule may be sufficient. There will still be a problem, though, because this replacement is language-specific and must take the paragraph language into account before triggering. This point involves harder development.
@pomcompot : See the "French Spacing (espace insécable)" extension, release 1.4.1. Try this extension to see if it answers to your requests.
I tried the patch. It worked. When I tried to disable it to type in English again, Writer crashed and I lost the (test) document I was working on. So I went back to MS Word and, this coming academic year again, I'll request from all my students that they use MS Word (because I don't have time to explain how to do this and that, with screen captures, in several different word processors).
@ sicart: Yes, OOo should not crash when you disable the "French Spacing" extension. But why disable it? That is not needed. The best method is, first, to declare the English paragraph or passage as... English, like this: 1. Select the text that is in English. 2. Select the Format character menu and choose English in the list or, more simply, choose English by clicking on the bottom status bar. Now the non-breaking spaces are automatically disabled. 3. File a new issue for the bug on this web site. If your students work only on English texts, create or modify a Standard style with all text in English. I think that is very simple, and it is the basis of every word processing program (MS Word included). See the documentation. If all users of free software did as you did when a crash occurred, free software would not exist and OOo would not have reached this level of quality. This software is free, and in exchange we try to improve it by reporting problems and other issues in the right place. Remember: this program is free and (partially) the fruit of cooperation between users.
I must be confusing this patch with another, older one, then. The one I tried applied French spacing to any text, regardless of the language selected. Alright, I'll give it a go, though it means reinstalling OOo first, so it'll have to wait a bit: I've just finished translating five children's books from French to English; I still have to translate a couple of novellas and several short stories from English to French before the academic year starts and/or my editors shoot me. My students too need to type in both English and French. By the way, if this patch can recognize the language being used, like MS Word has done for years, why is it still just a patch? Why isn't it included in OOo directly? Oh, and I should have started with: Thank you for your answer. And for the patch itself!
I have a working patch of the core. This feature is added as another AutoCorr option and works in all the applications (not only Writer). Here is a description of the patch's implementation: a non-breaking space is added in front of the following characters ';', ':', '?', '!' in French text only. The character list and the language aren't configurable, but this option can be (un)checked. I can start a CWS for it if needed. As the patch is ready, it could be integrated into 3.2 if somebody would be kind enough to do the QA.
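The behavior described in this comment can be sketched in a few lines of Python. This is not the patch itself (which lives in the OOo core), only an illustration of the description; the function name, the language-tag check, and the "enabled" flag are assumptions.

```python
import re

NBSP = "\u00a0"        # U+00A0 NO-BREAK SPACE
DOUBLE_MARKS = ";:?!"  # fixed, non-configurable character list


def add_nbsp_before_marks(text, language, enabled=True):
    """Add a non-breaking space in front of ; : ? ! in French text only."""
    if not enabled or not language.lower().startswith("fr"):
        return text
    pattern = "([" + re.escape(DOUBLE_MARKS) + "])"
    return re.sub(pattern, NBSP + r"\1", text)
```

Note that this naive version adds the hard space unconditionally, so an ordinary space typed before the mark stays in place and the hard space is appended to it.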
Fixed in CWS cbosdo01.
Set target
Cedric, thanks a lot for working on this. I've added Stefan and Oliver to the CC: list as they can help with the integration.
mba: thanks. Do you know someone who could do the QA of the CWS? I can set it as ready for QA, but need some tester.
I hope we can get sba or es as QA rep here. AFAIK es is expected to be back tomorrow; I will find out who can have a look. Did you commit your fix already?
mba: yes, the fix is already committed to cbosdo01. I have set the CWS Ready For QA. Should I set es as QA Rep ?
Please wait until tomorrow; in the meantime I will have a look.
mba: any update?
It's currently hard to find someone for QA; es is still on vacation and sba is already under heavy workload. Your CWS comes in very late, but I will see what we can do. First I will prepare a build for our QA. One question wrt. the implementation: what happens if the user already has inserted a blank before the ":", will the unbreakable space be inserted anyway? Did I get that correctly, the autocorrection will insert the unbreakable space before *and* after the sign? And it removes the blank the user needs to type to start the autocorrection?
mba: I know that it comes quite late... I just had some hope it could be integrated into 3.2... wrt the implementation question: * the hard space is inserted before the ?!:; characters even if there is already a normal white space. That may be improved to replace the existing white space? * no change is made to the white space after the :;?! character: in French typographic rules, this is a normal space. The hard space is inserted before only to keep the punctuation mark from wrapping to the next line.
I don't know if it's necessary to fix the problem with the existing space; it just came to my mind. In the end, you are French and should know better whether that is an annoyance. :-) I will come back to that when I have done the build, played with it, and had a first look at the code. My remark about the blanks after the ":" was caused by the initial description of the submitter, who asked explicitly for spaces before and after.
mba: Here is a page describing the use of the hard space in french. This is an automatic translation from a french wikipedia page: http://translate.google.fr/translate?u=http%3A%2F%2Ffr.wikipedia.org%2Fwiki%2FEspace_ins%25C3%25A9cable&sl=fr&tl=en&hl=fr&ie=UTF-8 the original page: http://fr.wikipedia.org/wiki/Espace_ins%C3%A9cable I'll have a look at MSO's behaviour on that topic, but the previous white space replacement is a pertinent remark.
In Word, the space before the punctuation mark is replaced by a non-breaking space. I think this is a good behavior. Please note that there is no non-breaking space between consecutive punctuation marks. Examples: Tu viens[NBS]!? Attends[NBS]!!! Not correct: Tu viens[NBS]![NBS]? Attends[NBS]![NBS]![NBS]! [NBS] = non-breaking space.
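The rule described here (replace the typed space, and treat a run of marks as one unit) could look like this in Python. It is a minimal sketch of the described behavior, not Word's or OOo's implementation; the function name is hypothetical.

```python
import re

NBS = "\u00a0"  # U+00A0 NO-BREAK SPACE


def word_style_spacing(text):
    """Replace the ordinary space before a run of ; : ? ! with one NBS.

    The run of marks is matched as a whole, so nothing is inserted
    between consecutive marks.
    """
    return re.sub(r" ([;:?!]+)", NBS + r"\1", text)
```

Because the hard space is not an ordinary space, running the function a second time on its own output changes nothing, so the spaces do not accumulate.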
Hello, Many thanks for this CWS. It will be a great improvement for French speaking users. You may have a look to macros included in the "French spacing" extension http://extensions.services.openoffice.org/project/insecable which treats all cases : change soft space before double punctuation, treat exceptions such as http://, treat correctly succession of double punctuation... See http://user.services.openoffice.org/fr/forum/viewtopic.php?f=26&t=10030 for a discussion (in French) about all cases.
I've done a small change in svxaccor.cxx to fix a warning (unused parameter)
jumbo444: thanks for those links... I wasn't aware of that forum topic. The http:// exceptions are properly handled because the autocorrection is fired when you type the next word separator (space, paragraph ending...). I wasn't aware of the multiple punctuation case: this has to be fixed indeed, as well as the previous non hard whitespaces replacement.
Another exception may be when paragraph starts with a double punctuation.
mba: I have fixed the exceptions mentioned by the above comments. I'm committing them to the CWS. I also commented the unused parameter to remove the warning you fixed. jumbo444: yes, I've seen that one, this is implemented in the CWS from now.
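The exceptions discussed in the preceding comments (http://, runs of several marks, an existing ordinary space, a paragraph starting with a mark) can be illustrated with one regular expression. This is a hedged sketch in Python and not the code committed to the CWS; in particular, requiring that the marks be followed by whitespace or end of text is only an approximation of "fired when the next word separator is typed".

```python
import re

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# A run of double marks that is preceded by a non-space character
# (optionally with one ordinary space in between) and followed by
# whitespace or the end of the text.
PATTERN = re.compile(r"(?<=\S) ?([;:?!]+)(?=\s|$)")


def fix_french_punctuation(text):
    """Put exactly one NBSP before a trailing run of ; : ? ! marks."""
    return PATTERN.sub(NBSP + r"\1", text)
```

The colon in "http://" is followed by "/", so it never matches; a mark at the very start of a paragraph has no non-space character before it, so it is left alone as well.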
Reassigned to me.
->cedricbosdo: I found that in sw/source/ui/utlui/utlui.src there is Text [ en-US ] = "Add non break space"; This should read "Add non breaking space"; What about tab stops? Text<tab char>: Should they be ignored instead of inserting a hard blank additionally?
os: WRT the tab behaviour, Word doesn't add the hard space in that case. It may be nice to stick to that behaviour.
I am sorry to spoil the party... :-( I have to reopen this one. This is a full-blown feature because it introduces new UI in AutoCorrect Options (2 check boxes and the respective string) => Change issue type from "Enhancement" to "Feature". The description in this issue and in the attached document describes the goal. That goal (automatically insert a non-breaking space) is achieved in general. But the "surrounding behavior" needs further clarification and improvement. Example: The non-breaking spaces "pile up" when using Backspace and entering another line break after the respective mark. Therefore I must insist on a specification.
sba: this behaviour has been fixed on 2009-09-09 in the CWS thanks to some comment from mba and jumbo444.
Change target to OOo 3.3 because 3.2 feature deadline is too close. - Spec needs being written (I volunteer for QA part in iTeam) - Feature needs some optimizations (will be specified in Spec) - Autotests need adjustment - Online help needs an update, too (therefore put UFI on cc)
Note: I propose activating this by default for French (as the insertion of non-breaking spaces for quotations already is). This would address the fact that only a tiny minority of users know how, and dare, to change the office default settings, and thus would "keep on missing this feature".
Note that correct French typography distinguishes between the no-break space and the narrow no-break space. ; ! ? : are supposed to be preceded by a narrow no-break space (U+202F); « » OTOH use the regular, full-width no-break space (U+00A0). Ref: page 149 of http://www.amazon.fr/Lexique-r%C3%A8gles-typographiques-lImprimerie-Nationale/dp/2743304820
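To make the distinction concrete, here is a small Python sketch that uses the narrow no-break space before ; ! ? : and the regular no-break space inside « ». It only illustrates the rule stated in this comment; the function name is hypothetical and the matching is deliberately simple.

```python
import re

NBSP = "\u00a0"   # U+00A0 NO-BREAK SPACE
NNBSP = "\u202f"  # U+202F NARROW NO-BREAK SPACE


def strict_french_spacing(text):
    """Narrow no-break space before ; : ? !, regular one inside « »."""
    # Marks followed by whitespace or end of text (so "http://" is skipped).
    text = re.sub(r" ?([;:?!]+)(?=\s|$)", NNBSP + r"\1", text)
    text = re.sub(r"« ?", "«" + NBSP, text)
    text = re.sub(r" ?»", NBSP + "»", text)
    return text
```

Whether U+202F renders properly depends on the font in use, which this sketch does not check.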
add me to cc.
Just to make it clear (sba's comment could be misunderstood): The issue and its CWS didn't make it into 3.2 because the autotest broken by it couldn't be fixed until nomination deadline. As we now have more time to integrate the CWS, sba wants to write a spec for this feature change before he or one of his colleagues will adjust the autotest.
Hi all! SBA asked me if I would like to take over the QA part (testing) and write the specification for this issue. I agreed, not only because I am his Writer co-worker but also because the fact that I'm French might help a bit in terms of awareness of the problem and motivation to solve it correctly ;). Now we have time to write a correct and detailed spec and really think about what we want to address and NOT to address. First, I would rather not *discuss* here what we should do and how. I'd rather put a first spec draft on the Wiki and let people comment on it on dev@fr.openoffice.org or so... Things I think about when talking about "address or not": - Differentiate the spaces or not (Thanks nmailhot for your last comment! I ordered the book! ;) ) - "French". In which country? What are the rules? (I have read that in Quebec it is recommended NOT to have a space before : ! ?...) - I don't like the idea of dropping "French stuff" in the middle of the Options, which are not really language-dependent (what if every language adds its own features there?) - The question of the automatic << >>, which is on another tab page... But as I wrote, it's better not to make this issue bigger than it is with discussions. @cedricbosdo: is that OK for you?
CCed: es
es: it's good to know that this has fallen into French hands :) Discussing it on dev@fr is also a good idea. The rule about Quebec has to be clarified, as the reporter of the equivalent issue here (https://bugzilla.novell.com/show_bug.cgi?id=278233) is a French guy working for a Quebec company. We shall make sure that fr_CA users take part in the thread on the list. We also have to take into account that this feature should also work fine for Impress / Draw and Calc (maybe less important).
@cedricbosdo: First spec draft: http://wiki.services.openoffice.org/wiki/Non_Breaking_Spaces_Before_Punctuation_In_French_(espaces_ins%C3%A9cables) Quebec and others: needs to be investigated... Not only Writer: I agree but we should focus on Writer first. For further discussions: I'll post a message on dev@fr.openoffice.org these days. Feel free to do it first if you like :)
Implemented according to the specs in cbosdo01
@cedricbosdo: as a French user, I'm glad to see this new feature integrated in OOo :-) Thanks for your job! Two questions about the implementation you recently delivered: - did you re-use the existing extension (http://extensions.services.openoffice.org/project/insecable) or is it a new implementation? - according to the wiki ("Non Breaking Spaces Before Punctuation In French"), I understand that the specification is still under discussion. Does that mean that the implementation you delivered is subject to change? And a final question: are you confident that we can expect to see this feature in OOo 3.3, as stated on the page of the issue? Thanks
reassigned
@pposc: > Thanks for your job! Many thanks :) > Two questions about the implementation you recently delivered: > - did you re-use the existing extension > (http://extensions.services.openoffice.org/project/insecable) or is it a new > implementation? This is a new implementation in the C++ core of OOo. > - according to the wiki ("Non Breaking Spaces Before Punctuation In French"), > I understand that the specification is still in discussion. Does that mean > that the implementation you delivered is subject to change? The specs have reached a stable state ATM and will certainly not change a lot.
Link to specification: http://wiki.services.openoffice.org/wiki/Non_Breaking_Spaces_Before_Punctuation_In_French_%28espaces_ins%C3%A9cables%29 Verified in CWS cbosdo01. Note: The respective AutoTests got adjusted in the same CWS (issue 108382, issue 107088, issue 108380).
It's kind of disappointing that a Finnish (!) six-year-old (2000, revised in 2004) page on the then state of the art in HTML browsers (IE6 + Netscape 4?) has been used to justify ignoring the very clear typographic rules on the narrow no-break space that even Unicode.org references (the same Finnish page claims œ is too “new” and should not be used, when Unicode has long deprecated ISO-8859-15 in software)
@nmailhot: I guess you refer to http://wiki.services.openoffice.org/wiki/Talk:Non_Breaking_Spaces_Before_Punctuation_In_French_(espaces_ins%C3%A9cables)#Exclusion_of_the_NARROW_NO-BREAK_SPACE_.28U.2B202F.29 And the old link http://www.cs.tut.fi/~jkorpela/html/french.html ? You are right, a little bit outdated... :) But don't forget the other arguments, and especially the fact that not many fonts cover this Unicode area. A lot of fonts, and especially *very common fonts*, lack this character. A small test shows: Arial, Verdana, Adobe Garamond Pro... A good reason for NOT respecting the good old French typography rules in a bright and brilliant software world which is still discovering the multiplicity of locale usages! ;) Else: where were you when we wrote the specification, discussed this on dev@fr, made the CWS and finalized it? As you are French: "la critique est aisée mais l'art est Minotaure" (a play on the saying "la critique est aisée, mais l'art est difficile": criticism is easy, but art is difficult)... :) Think about it...