Apache OpenOffice (AOO) Bugzilla – Issue 92153
spurious div/0 error
Last modified: 2013-01-29 21:41:32 UTC
Div0.ods opens without error under OO2.4.1 It opens with a div/0 error in cell B23 under OO3beta2
Created attachment 55374 [details] file with false div/0 error
The number of rows has increased from OOo2 to OOo3. Therefore now PI2 is a cell address and that cell has value 0. In a new document you are not allowed to use PI2 as cell name. I thought that there is already an issue concerning importing documents with named cells, where the name is a cell address. But I cannot find it.
Well, at some point in the future, any named (range/formula/...) will be interpreted as a column identifier. There is a simple rule to remove this ambiguity, and I hope OOo Calc and ODF implement it: RULE: ===== Named 'ranges/expressions/...' shall have precedence over column identifiers. If a column identifier needs to be used when a similarly named 'range/expression/...' does exist, the column identifier needs to be escaped. This solves elegantly the problem even for the most distant future (e.g. a spreadsheet with NO column/row limitations!) RATIONALE ========= If a 'range/expression/whatever' has been named, THEN the user most definitely wants to use this name in subsequent expressions, so it is more likely that he indeed wants this name and NOT the column ID. Escaping the name is counterproductive because the scope of the name is to ease usage.
Quote discoleo: RULE: ===== Named 'ranges/expressions/...' shall have precedence over column identifiers. RATIONALE ========= If a 'range/expression/whatever' has been named, THEN the user most definitely wants to use this name in subsequent expressions, so it is more likely that he indeed wants this name and NOT the column ID. ***This seems logical, but... A user might forget (3 months later) that she has defined VARIABLE AB6, refer to CELL AB6, and get (without warning) a wrong result using variable AB6 --not what she meant! So, variable names that can be interpreted as a cell names or ranges should be forbidden. That should upset no user --not even one allergic to restrictions; variable AB6 can be renamed AB_6, or AB6v ('v' for variable),or AB6_, or whatever form that cannot be construed as cell name. If OO3 encounters (in a file written by an earlier version) forbidden variable names, it could either flag them or convert them --eg, by appending an illegal char (like #) or a 'v' to each --provided that does not duplicate an existing name: if AB6v already exists, AB6 could be renamed AB6w, or AB6_, or...
> A user might forget (3 months later) that she has defined VARIABLE AB6, > refer to CELL AB6, and get (without warning) a wrong result using > variable AB6 --not what she meant! I do not quite understand this statement. In general named ranges should be preferred whenever feasible (and I strongly prefer them), because they make the sheet much more structured, and indeed the user can easily remember 3 months later what he was doing. I especially point here to some of my feature requests to extend the concept of named ranges/variables: - issue 66886 (to restrict names to a particular sheet) and - issue 67499 (which, besides other things, would allow organising these definitions and adding sensible comments to every definition) http://www.openoffice.org/issues/show_bug.cgi?id=66886 http://www.openoffice.org/issues/show_bug.cgi?id=67499 As I said, at some time point every 'string'-name will be a valid column (even this 'VeryVeryLongVariableName') and so would preclude the use of any sensible name. Also, I expect variable names to be as short as possible, therefore I strongly oppose artificial 'X_6v or' the like. They only increase error rates and mistakes. At the same time, it is unlikely that a particular user, which has defined PI2 will actually use column PI2. Still, most people use less than 100 columns. Why wasting so many resources for just the rare use of more columns.
Quote discoleo: I do not quite understand this statement. In general named ranges should be preferred... I agree, and I use them. But, suppose you defined variable AB6 for some formula, 3 months ago. Now, you reopen the same spreadsheet to add some some other calculation using cells AB6 and AB7, say '=AB6+AB7'. (Even if you 'strongly prefer' named ranges, you likely use also direct cell references, for their simplicity.) IF you are very careful, you will first hit ^f3 to check if AB6 or AB7 has already been defined as a variable. However, the average user will not even think of doing such a check, and believe he has added cells AB6 and AB7, when in fact he has added VARIABLE AB6 to cell AB7, without error message since the operation is legal --but not what was intended! 'Friendly' software should --within reason-- protect the 'careless' user against likely mistakes. (Toughware will tell the user: 'Haha! You should have known that AB6 is a variable name!') Forbidding variable names that would duplicate cell names is an simple way to avoid the above problem, and this restriction is very easy to satisfy: if the name 'AB6' is not a legal variable name, 'AB_6' is, as are 'AB6v', 'AB6_', ....
The problem with the proposed approach is: 1. variable names have to be typed 2. cells can be selected with the mouse I mention this, because it is easy to mistype something like AB_6 or AB6v and still end with cell [AB6] instead of the variable name. IF the user names something PI2, or to give an example that I often use APACHE2, then I still believe that the chance of an error will be dim and the chance of a calculation with a cell PI2 (and APACHE2 - when spreadsheets will support as many columns) is rather small. But as you said, it is not zero. Instead of forbidding valid names (and highly desirable names), I would adopt an overall better solution: - Calc could detect such conflicts and mark the respective cell -- mark only IF the cell is non-empty -- it is unlikely that the user would reference an empty cell - the user is then warned that a similar name and cell do exist I still believe that the liberal use of names greatly outweighs the negative consequences. If people were using more often names, then they would start both to organise better their spreadsheet and also reduce a great number of other errors. This would by far offset the rare occurrence of the mentioned naming conflict.
Quote discoleo: The problem with the proposed approach is: 1. variable names have to be typed ***What do you mean? How will Calc know I want to name a variable 'Abcd02Jimmy' unless I type it in? 2. cells can be selected with the mouse I mention this, because it is easy to mistype something like AB_6 or AB6v and still end with cell [AB6] instead of the variable name. ***There seems to be a misunderstanding. Variable AB_6 has nothing to do with cell AB6. it could be associated with cell H41. (Why I call a variable 'AB6' is my business; it could be because it contains the product AB*6, or whatever...) In fact, that's the problem: if I use 'AB6' in an expression, Calc assumes I refer to cell AB6 when I really wanted to refer to cell H41. IF the user names something PI2, or to give an example that I often use APACHE2, then I still believe that the chance of an error will be dim and the chance of a calculation with a cell PI2 (and APACHE2 - when spreadsheets will support as many columns) is rather small. But as you said, it is not zero. ***Exactly! To me, 'PI2' referred to Pi*2 (ie, 6.28...) which I had pre-calculated and stored in cell D1; Calc (ver.3) assumed I was referring to cell PI2, which is empty. But if the cell were'nt empty, Calc would not give me a div0 error, but it would quietly give a wrong result! Instead of forbidding valid names (and highly desirable names), I would adopt an overall better solution: - Calc could detect such conflicts and mark the respective cell -- mark only IF the cell is non-empty -- it is unlikely that the user would reference an empty cell ***'unlikely', 'dim' or 'small' chances (above) and 'rare occurrences' (below) are not a good basis for reliable programming. What if the non-empty cell becomes empty later? Or, if the user references an empty cell he intends to fill? I still believe that the liberal use of names greatly outweighs the negative consequences. If people were using more often names, then they would start both to organise better their spreadsheet and also reduce a great number of other errors. This would by far offset the rare occurrence of the mentioned naming conflict. ***'liberal'? There are always restrictions to variable names: reserved words like FOR, IF, ELSE in many languages, forbidden characters like space, ?, ., &, etc. So, forbidding the family of reserved words of the form AAAnnn (eg, AB6 or APACHE234) should surprise or upset no-one, and would prevent confusion with cell references once for all --no 'ifs', 'that's unlikely', 'that will seldom happen', ... Incidentally, you mention the problem of typing mistakes. Well, it's easier to type 'MS' instead of 'MonthlySalary' or 'monthly_salary', but the later forms are recognized as far superior for clarity. The meaning of 'MS' may be obvious when you write it, but will be far more obscure when you revise the spreadsheet 6 months from now. Unless you reserve a table of comments where you explain that 'MS=monthly salary'-- but that has to be typed!