92153 – spurious div/0 error

Issue 92153 - spurious div/0 error

Summary: spurious div/0 error

Status:	CONFIRMED

Alias:	None

Product:	Calc
Classification:	Application
Component:	editing (show other issues)
Version:	OOo 3.0 Beta 2
Hardware:	PC All

Importance:	P3 Trivial (vote)
Target Milestone:	---
Assignee:	AOO issues mailing list
QA Contact:

URL:
Keywords:	oooqa

Depends on:
Blocks:

Reported:	2008-07-25 19:49 UTC by tab
Modified:	2013-01-29 21:41 UTC (History)
CC List:	4 users (show)

See Also:
Issue Type:	DEFECT
Latest Confirmation in:	---
Developer Difficulty:	---

Attachments
file with false div/0 error (10.33 KB, text/plain) 2008-07-25 19:51 UTC, tab	no flags	Details
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this issue.

Description tab 2008-07-25 19:49:19 UTC

Div0.ods opens without error under OO2.4.1
It opens with a div/0 error in cell B23 under OO3beta2

Comment 1 tab 2008-07-25 19:51:35 UTC

Created attachment 55374 [details]
file with false div/0 error

Comment 2 Regina Henschel 2008-07-25 21:03:15 UTC

The number of rows has increased from OOo2 to OOo3. Therefore now PI2 is a cell
address and that cell has value 0. In a new document you are not allowed to use
PI2 as cell name.

I thought that there is already an issue concerning importing documents with
named cells, where the name is a cell address. But I cannot find it.

Comment 3 discoleo 2008-07-29 01:02:21 UTC

Well, at some point in the future, any named (range/formula/...) will be
interpreted as a column identifier.

There is a simple rule to remove this ambiguity, and I hope OOo Calc and ODF
implement it:

RULE:
=====
Named 'ranges/expressions/...' shall have precedence over column identifiers.

If a column identifier needs to be used when a similarly named
'range/expression/...' does exist, the column identifier needs to be escaped.

This solves elegantly the problem even for the most distant future (e.g. a
spreadsheet with NO column/row limitations!)

RATIONALE
=========
If a 'range/expression/whatever' has been named, THEN the user most definitely
wants to use this name in subsequent expressions, so it is more likely that he
indeed wants this name and NOT the column ID.

Escaping the name is counterproductive because the scope of the name is to ease
usage.

Comment 4 tab 2008-08-07 03:59:09 UTC

Quote discoleo:
RULE:
=====
Named 'ranges/expressions/...' shall have precedence over column identifiers.
RATIONALE
=========
If a 'range/expression/whatever' has been named, THEN the user most definitely
wants to use this name in subsequent expressions, so it is more likely that he
indeed wants this name and NOT the column ID.

***This seems logical, but... A user might forget (3 months later) that she has
defined VARIABLE AB6, refer to CELL AB6, and get (without warning) a wrong
result using variable AB6 --not what she meant!
   So, variable names that can be interpreted as a cell names or ranges should
be forbidden. That should upset no user --not even one allergic to restrictions;
variable AB6 can be renamed AB_6, or AB6v ('v' for variable),or AB6_, or
whatever form that cannot be construed as cell name.
   If OO3 encounters (in a file written by an earlier version) forbidden
variable names, it could either flag them or convert them --eg, by appending an
illegal char (like #) or a 'v' to each --provided that does not duplicate an
existing name: if AB6v already exists, AB6 could be renamed AB6w, or AB6_, or...

Comment 5 discoleo 2008-08-07 09:46:08 UTC

> A user might forget (3 months later) that she has defined VARIABLE AB6,
> refer to CELL AB6, and get (without warning) a wrong result using
> variable AB6 --not what she meant!

I do not quite understand this statement. In general named ranges should be
preferred whenever feasible (and I strongly prefer them), because they make the
sheet much more structured, and indeed the user can easily remember 3 months
later what he was doing.

I especially point here to some of my feature requests to extend the concept of
named ranges/variables:
 - issue 66886 (to restrict names to a particular sheet) and
 - issue 67499 (which, besides other things, would allow organising
   these definitions and adding sensible comments to every definition)

http://www.openoffice.org/issues/show_bug.cgi?id=66886
http://www.openoffice.org/issues/show_bug.cgi?id=67499

As I said, at some time point every 'string'-name will be a valid column (even
this 'VeryVeryLongVariableName') and so would preclude the use of any sensible
name. Also, I expect variable names to be as short as possible, therefore I
strongly oppose artificial 'X_6v or' the like. They only increase error rates
and mistakes.

At the same time, it is unlikely that a particular user, which has defined PI2
will actually use column PI2. Still, most people use less than 100 columns. Why
wasting so many resources for just the rare use of more columns.

Comment 6 tab 2008-08-08 16:13:26 UTC

Quote discoleo: I do not quite understand this statement. In general named
ranges should be preferred...

I agree, and I use them.
But, suppose you defined variable AB6 for some formula, 3 months ago.
Now, you reopen the same spreadsheet to add some some other calculation using
cells AB6 and AB7, say '=AB6+AB7'. (Even if you 'strongly prefer' named ranges,
you likely use also direct cell references, for their simplicity.) IF you are
very careful, you will first hit ^f3 to check if AB6 or AB7 has already been
defined as a variable.
   However, the average user will not even think of doing such a check, and
believe he has added cells AB6 and AB7, when in fact he has added VARIABLE AB6
to cell AB7, without error message since the operation is legal --but not what
was intended!
   'Friendly' software should --within reason-- protect the 'careless' user
against likely mistakes. (Toughware will tell the user: 'Haha! You should have
known that AB6 is a variable name!') Forbidding variable names that would
duplicate cell names is an simple way to avoid the above problem, and this
restriction is very easy to satisfy: if the name 'AB6' is not a legal variable
name, 'AB_6' is, as are 'AB6v', 'AB6_', ....

Comment 7 discoleo 2008-08-08 18:31:18 UTC

The problem with the proposed approach is:

1. variable names have to be typed
2. cells can be selected with the mouse

I mention this, because it is easy to mistype something like AB_6 or AB6v and
still end with cell [AB6] instead of the variable name.

IF the user names something PI2, or to give an example that I often use APACHE2,
then I still believe that the chance of an error will be dim and the chance of a
calculation with a cell PI2 (and APACHE2 - when spreadsheets will support as
many columns) is rather small.

But as you said, it is not zero. Instead of forbidding valid names (and highly
desirable names), I would adopt an overall better solution:
 - Calc could detect such conflicts and mark the respective cell
   -- mark only IF the cell is non-empty
   -- it is unlikely that the user would reference an empty cell
 - the user is then warned that a similar name and cell do exist

I still believe that the liberal use of names greatly outweighs the negative
consequences. If people were using more often names, then they would start both
to organise better their spreadsheet and also reduce a great number of other
errors. This would by far offset the rare occurrence of the mentioned naming
conflict.

Comment 8 tab 2008-08-12 00:19:22 UTC

Quote discoleo:
The problem with the proposed approach is:

1. variable names have to be typed
***What do you mean? How will Calc know I want to name a variable 'Abcd02Jimmy'
unless I type it in?

2. cells can be selected with the mouse
I mention this, because it is easy to mistype something like AB_6 or AB6v and
still end with cell [AB6] instead of the variable name.
	***There seems to be a misunderstanding. Variable AB_6 has nothing to do with
cell AB6. it could be associated with cell H41. (Why I call a variable 'AB6' is
my business; it could be because it contains the product AB*6, or whatever...)
In fact, that's the problem: if I use 'AB6' in an expression, Calc assumes I
refer to cell AB6 when I really wanted to refer to cell H41.

IF the user names something PI2, or to give an example that I often use APACHE2,
then I still believe that the chance of an error will be dim and the chance of a
calculation with a cell PI2 (and APACHE2 - when spreadsheets will support as
many columns) is rather small.

But as you said, it is not zero.
	***Exactly! To me, 'PI2' referred to Pi*2 (ie, 6.28...) which I had
pre-calculated and stored in cell D1; Calc (ver.3) assumed I was referring to
cell PI2, which is empty. But if the cell were'nt empty, Calc would not give me
a div0 error, but it would quietly give a wrong result!

Instead of forbidding valid names (and highly
desirable names), I would adopt an overall better solution:
 - Calc could detect such conflicts and mark the respective cell
   -- mark only IF the cell is non-empty
   -- it is unlikely that the user would reference an empty cell
	***'unlikely',  'dim' or 'small' chances (above) and 'rare occurrences' (below)
are not a good basis for reliable programming. What if the non-empty cell
becomes empty later? Or, if the user references an empty cell he intends to fill?

I still believe that the liberal use of names greatly outweighs the negative
consequences. If people were using more often names, then they would start both
to organise better their spreadsheet and also reduce a great number of other
errors. This would by far offset the rare occurrence of the mentioned naming
conflict.
	***'liberal'? There are always restrictions to variable names: reserved words
like FOR, IF, ELSE in many languages, forbidden characters like space, ?, ., &,
etc. So, forbidding the family of reserved words of the form AAAnnn (eg, AB6 or
APACHE234) should surprise or upset no-one, and would prevent confusion with
cell references once for all --no 'ifs', 'that's unlikely', 'that will seldom
happen', ...
	Incidentally, you mention the problem of typing mistakes. Well, it's easier to
type 'MS' instead of 'MonthlySalary' or 'monthly_salary', but the later forms
are recognized as far superior for clarity. The meaning of 'MS' may be obvious
when you write it, but will be far more obscure when you revise the spreadsheet
6 months from now. Unless you reserve a table of comments where you explain that
'MS=monthly salary'-- but that has to be typed!