Apache OpenOffice (AOO) Bugzilla – Issue 65221
SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and ="" empty string criteria
Last modified: 2017-05-20 11:01:01 UTC
This is a consolidation issue of several other issues such as issue 13702, issue 50331 and issue 65109. Calc interprets criteria arguments to the SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... functions different from Excel if the criteria is a literal "" empty string, or a reference to a formula cell returning an empty string. Furthermore, Excel is not even consistent in how it handles these different forms, nor how it interprets the same form in different functions like DCOUNT and DSUM, and maybe other D*() functions. This is a mess. See attached testcase document. Load in both, Calc and Excel, to have a better comparison of the results and the colored TRUE/FALSE cells. We need to come to a conclusion how we'll handle this. Simply mimicking the Excel behavior may be appealing but might not be the best idea.
Created attachment 36346 [details] testcases
*** Issue 50331 has been marked as a duplicate of this issue. ***
*** Issue 65109 has been marked as a duplicate of this issue. ***
Accepting.
Corrected typo in summary
*** Issue 11273 has been marked as a duplicate of this issue. ***
*** Issue 81198 has been marked as a duplicate of this issue. ***
*** Issue 85966 has been marked as a duplicate of this issue. ***
*** Issue 97711 has been marked as a duplicate of this issue. ***
In the other issues consolidated here there is considerable confusion about what constitutes an empty cell and whether OOo should be 'compatible' with XL. There is a further issue: in the help it clearly uses the phrase 'empty cell' to mean one with no value set, and furthermore to DEFINE equality comparison with "" and 0 to succeed against such cells. Try, for example, using A1="" and A1<>"" as cell formulae against the empty (unset) cell A1, and you will discover the semantics of comparison. Independent of the question of compatibility with XL is the necessity for compatibility with OOo. SUMIF, COUNTIF and the like, should apply the same comparison rules as simple comparison formulae -- and they don't. This note is to request of the developer looking at this to provide some sort of specification, and to ensure that self-consistency is preserved. Is there a place to discuss this?
@ztevepowell: You can get the draft spec of ODF from http://www.oasis-open.org/committees/documents.php?wg_abbrev=office-formula. You get most of the relevant places by searching for "empty cell" and "blank". A place to discuss the problem is the mailing list users@sc.openoffice.org or maybe dev@sc.openoffice.org.
*** Issue 105908 has been marked as a duplicate of this issue. ***
*** Issue 110508 has been marked as a duplicate of this issue. ***
,
For those of us who develop spreadsheet applications for anonymous users, compatibility is a huge concern, even if the solution is not technically perfect.
This defect is a major PITA. Compatibility with Excel aside, this is a MAJOR problem. Because of the way that OOo calculates empty cells with these functions, AFAIK there is NO WAY to accurately calculate the number of empty cells within a range if any of those cells contain an equation whose output is an empty string, i.e. ="". In Excel, this can be done one of TWO ways, using either COUNTBLANK(<Range>), or COUNTIF(<Range>,""). I started using COUNTBLANK originally so that spreadsheets which have to count blank/empty cells would be compatible with both Excel and OOo, since COUNTIF(<Range>,"") doesn't work at all in OOo. Now with more advanced spreadsheets which contain equations within these cell ranges, I've found that OOo is simply incapable of handling such simple equations, and I have been unable to find a way to make a spreadsheet that will calculate blank/empty cells correctly in OOo. So you see it is not simply a matter of OOo being incompatible with Excel; rather, it is a matter of OOo being absolutely incapable of performing such a (simple) calculation with its current function set. It is IMPOSSIBLE to count the number of blank cells (cells containing an empty string result ="" in addition to the truly empty cells containing absolutely nothing) using OOo. I've tried this in OOo 3.0 and 3.1, and there is NO WAY to do this as far as I can tell. If there is, please enlighten me. The earliest defect I found regarding this issue dates back to 2003. 7 years later it has yet to be addressed. COUNTIF(<Range>,"") doesn't work at all in OOo and always returns a result of 0. COUNTBLANK(<Range>) will only count cells if they have absolutely nothing in them. IMO COUNTBLANK() should also count cells where an equation returns an empty string "". In fact, it is what I would expect given the fact that by returning an empty string, you are effectively saying "This cell is empty." ISBLANK() returns FALSE for a cell containing the equation ="". I read the Help manual and understand this is the expected behavior of OOo due to the way it specifies 'blank' cells, but WHY? I'd like to know what exactly the usage case is for counting cells containing equations that return an empty string "" as NOT blank? The current behavior and specification does not make any sense whatsoever IMO. There needs to be some simple way to do this. Using a function that is already shared by Excel and OOo would be ideal, because most people use Excel and not OOo. Also, the behavior of Excel for COUNTBLANK() makes far more sense than OOo's behavior.
@rockfx01: The final OpenFormula specification for ODF 1.2 may resolve the question about empty strings versus empty cells. Try =COUNTIF(<range>;"=") Note that the "=" operation has no value. To count empty strings, it looks like you need to use =COUNTIF(<range>;"=""""") The paired "" introduce single " characters into the literal string, so the criterion is '=""'. If these functions have been updated for ODF 1.2 in Apache OpenOffice 3.4.0, you should be able to tell the difference with these two cases. PS: Check =COUNTIF(<range>;"") again too. This should find empty strings, but it might also catch empty cells (which default to empty strings when referenced in search of text values).
Created attachment 78576 [details] ODFF expect result @orcmid For =COUNTIF(<range>;""), I think it should only find empty strings. I add an attachment. In this xls file, I list the expect result for ODFF. Please comments if there is something wrong.
Hi. My problem was this: =SUMIF(<refrange>, <>"y", <numberrange>) only added cells that corresponded to non-empty reference cells that had strings which were not "y". This omitted cells that had empty reference cells, even though A1<>"y" would have been true for those cells in the <refrange>. The issue is complicated by the strange reading of 'geometry' of the reference range. This seems to remove empty cells from the <refrange> geometry before considering which cells to sum, and only then are the corresponding cells in the <numberrange> considered. This is a disaster for consistency: if I generated a new column, for example, propagate a simple test =IF((A1<>"y"),B1,0) and sum the result, I get a different answer from the "corresponding" SUMIF. This is a serious consistency issue which is nothing to do with XL or ODF, nor much to do with the way empty cells are treated in boolean expressions elsewhere. This is due to the special treatment of empty cells in reference ranges, which is inconsistent with the other semantics.
Created attachment 78581 [details] Update the ODFF expect result sample file Update the ODFF expect result sample file
*** Issue 123907 has been marked as a duplicate of this issue. ***
Reset assigne to the default "issues@openoffice.apache.org".