Apache OpenOffice (AOO) Bugzilla – Issue 90759

ZTEST not same as Excel

Last modified: 2017-05-20 11:41:49 UTC

ZTEST produces results different from Excel's. The attached file contains a review.

Created attachment 54497 [details] Description of the ZTEST problem

I have tested this some time ago. The results are wrong both in Calc and in Excel (although each program computes something different). I do not have the slightest idea what these programs compute. The issue is mentioned also on the draft OASIS document, and I commented on this issue on the OASIS mailing list, see section 5 in this message: http://lists.oasis-open.org/archives/office-comment/200805/msg00002.html

Just one notice I forgot to mention: I tested this with MS Office XP on Win2k and again with OOo-dev m19. Both results are wrong, as exemplified by comparison to R:

> x <- c(90,95,95,100)
> z.test(x,100,5)

        One Sample z-test

data:  x
z = -2, n = 4.0, Std. Dev. = 5.0, Std. Dev. of the sample mean = 2.5, p-value = 0.0455
alternative hypothesis: true mean is not equal to 100
95 percent confidence interval:
 90.10009 99.89991
sample estimates:
mean of x
       95

Thanks, Leonard. It's difficult to see exactly what R is calculating. I've tried to go back to basics to show that there is some logic/validity to Excel's method, and indeed that there is a clear bug in Calc with the 3-parameter version. I've tried to get z.test to work in R but failed after a brief struggle. What happens if you set alternative (the direction of the alternative hypothesis) to 'greater', 'less' or even 'two.sided'? I think Excel produces a 'one-sided' (not 'one-tailed') test, which might correspond to 'greater'.

Ah, I figured out how to use z.test in R (over lunch ;). Yes, it does give the same results as Excel:

z.test(x,100,5,'greater')

Even with the mean on the other side:

y <- c(110,105,105,100)
z.test(y,100,5,'greater')

I'm now convinced that 'one-sided' is the correct description for what Excel does - not one- or two-tailed.

adding me to cc.

z.test does the two-tailed test by default:
=====
> y <- c(110,105,105,100)
> z.test(y,100,5)

        One Sample z-test

data:  y
z = 2, n = 4.0, Std. Dev. = 5.0, Std. Dev. of the sample mean = 2.5, p-value = 0.0455
alternative hypothesis: true mean is not equal to 100
95 percent confidence interval:
 100.1001 109.8999
sample estimates:
mean of y
      105
=====
The p-value of 0.0455 is exactly twice what Excel gives - which is right, because when the sample mean is above the population mean Excel does the one-tailed test. In summary this all hangs together.
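The relationship between the two p-values can be checked directly against the standard normal CDF. Below is a minimal Python sketch (standard library only); the assumption that Excel's ZTEST returns the one-sided P(Z > z) is taken from this discussion, not from Excel's own documentation:

```python
from math import erfc, sqrt

def phi(z):
    # Standard normal CDF, computed via the complementary error function.
    return 0.5 * erfc(-z / sqrt(2.0))

def ztest_one_sided(sample, mu, sigma):
    # P(Z > z) for z = (mean - mu) / (sigma / sqrt(n)).
    # Per the discussion above, this matches what Excel's ZTEST returns.
    n = len(sample)
    m = sum(sample) / n
    z = (m - mu) / (sigma / sqrt(n))
    return 1.0 - phi(z)

y = [110, 105, 105, 100]                               # sample mean 105, z = +2
p_one_sided = ztest_one_sided(y, 100, 5)               # ~0.0228
p_two_tailed = 2 * min(p_one_sided, 1 - p_one_sided)   # ~0.0455, as R reports
```

For this sample the one-sided value is about 0.0228, exactly half of R's default two-tailed 0.0455, consistent with the comparison above.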

Well, I am glad you figured out the R syntax. You can access the help in R using:

> help('z.test')   [or whatever command you wish to know about]

Just to remove some confusion:
1.) one-sided and single-tailed are the same [similarly 2-sided / 2-tailed are the same]
2.) statistical tests should almost always be 2-sided [one-sided tests are necessary only in rare instances]
3.) R syntax (with default values / possible options):

z.test(x, mu = 0, stdev, alternative = c("two.sided", "less", "greater"), sd = stdev, conf.level = 0.95, ...)

e.g. (you can copy/paste these lines into R):

x <- c(90,95,95,100)
z.test(x, mu = 100, sd = 5,'greater')   # one-sided greater
z.test(x, mu = 100, sd = 5,'less')      # one-sided less
z.test(x, mu = 100, sd = 5,'two.sided') # two-sided
z.test(x, mu = 100, sd = 5)             # two-sided by default
z.test(x, mu = 100, sd = sd(x))         # compute the SD automatically

Just some words on using the z-test for people who have NOT yet figured out how to do it in R:
- you first need to install TeachingDemos: go to the menu: Packages -> Install packages -> select the mirror server -> select TeachingDemos
- you now need to load the package: go to the menu: Packages -> Load package -> select TeachingDemos
- Done.

> I'm now convinced that 'one-sided' is the correct description for
> what Excel does

Good to know, though I'll stick with R for serious statistics. ;-)

Small correction: [the parameter 'alternative' needs to be named explicitly when all the other parameters are named]

x <- c(90,95,95,100)
z.test(x, mu = 100, sd = 5, alternative = 'greater')   # one-sided greater
z.test(x, mu = 100, sd = 5, alternative = 'less')      # one-sided less
z.test(x, mu = 100, sd = 5, alternative = 'two.sided') # two-sided
z.test(x, mu = 100, sd = 5)                            # two-sided by default
z.test(x, mu = 100, sd = sd(x))                        # compute the SD automatically

Created attachment 54518 [details] ZTEST: one-sided or one-tailed; proposed bug fix

I'm on a roll on this one... ;) Just attached a defence of my view that 'one-sided' is the correct term, plus a proposed bug fix.

Looking into the code that computes the variance/SD/SE, I noticed that Calc uses a numerically unstable algorithm. Calc should implement either a two-pass computation or, alternatively, a stable one-pass algorithm.
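For reference, here is a minimal sketch of a stable one-pass algorithm (Welford's method) in Python. It is an illustration of the technique, not Calc's actual code:

```python
def sample_variance(xs):
    # Welford's one-pass algorithm: numerically stable because it
    # accumulates squared deviations from a running mean instead of
    # subtracting two large, nearly equal sums at the end.
    n = 0
    mean = 0.0
    m2 = 0.0
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / (n - 1)
```

For the sample from the R session above, [90, 95, 95, 100], this returns 50/3 ≈ 16.67, the same value a two-pass computation gives.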

My previous comment refers of course to the ZTEST code; the standalone SD code is correct as far as I know. ;)

You have a good point. http://www.coventry.ac.uk/ec/~nhunt/pottel.pdf highlights a case where Excel's STDEV falls down. In fact Calc performs correctly with that data, so it might be better to simply call STDEV within ZTEST?
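The failure mode behind that kind of case can be reproduced in a few lines of Python (the data below is illustrative, not the paper's actual data set): the textbook one-pass formula sum(x²) − (sum x)²/n cancels catastrophically when the values are large compared to their spread, while a two-pass computation stays accurate:

```python
from math import sqrt

def stdev_naive(xs):
    # Textbook one-pass formula: sqrt((sum(x^2) - (sum x)^2 / n) / (n - 1)).
    # Unstable: the final subtraction cancels almost all significant digits.
    n = len(xs)
    s = sum(xs)
    s2 = sum(x * x for x in xs)
    return sqrt((s2 - s * s / n) / (n - 1))

def stdev_two_pass(xs):
    # Stable: subtract the mean first, then sum the squared deviations.
    n = len(xs)
    m = sum(xs) / n
    return sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))

# (4, 7, 13, 16) shifted by 1e8: the true sample SD is still sqrt(30).
data = [1e8 + v for v in (4.0, 7.0, 13.0, 16.0)]
# stdev_two_pass(data) agrees with sqrt(30); stdev_naive(data) does not.
```

In double precision the two-pass result matches sqrt(30) ≈ 5.477, while the naive formula is visibly off, which is exactly the kind of error the linked paper reports for Excel's STDEV.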

Improving the statistical functions is an ongoing process. The function ZTEST should depend on NORMSDIST, so NORMSDIST has to be checked and improved first. Excel is not a good reference for comparison, because Excel often has only 7-digit accuracy; Calc should give better accuracy if possible. It would be nice to have a test document with results correct to 15 digits, to see how good a new implementation of ZTEST is. You will find information on Excel's ZTEST at http://office.microsoft.com/en-us/excel/HP052042111033.aspx?pid=CH062528191033 or the corresponding page for Excel 2007. A better place to discuss how to implement ZTEST is not this issue but the mailing list dev@sc.openoffice.org.

Hi Regina Discussion moved to dev@sc.openoffice.org

If third parameter is given, result is wrong.

Created attachment 54565 [details] patch as proposed

Created attachment 54566 [details] testcase

I've added a patch that implements the changes described by drking. I think further work on ZTEST should be postponed until the specification in ODF 1.2 is finished.

Please apply the patch. The problem here is independent of the other problems with ZTEST. OOo calculates a wrong result, and that should be corrected.

Yes, sure, sorry this patch went unnoticed.

In cws odff06: revision 275201 sc/source/core/tool/interpr3.cxx Just to be sure: would it be correct to speak of a one-sided test in the online help, and ODFF? As in "Returns the one-sided P value of a z test with standard distribution."?

I'm not sure that's exactly right. The p-value can refer to the value on the vertical axis, whereas it's the area (cumulative probability) that is returned. Because Excel has had its description wrong in the past (and may well still have it wrong), and because the function returns an unconventional result, it might be better to be *really* clear about this, with diagrams. The stuff I wrote in 2008 is here: http://wiki.services.openoffice.org/wiki/Documentation/How_Tos/Calc:_ZTEST_function and you'd be very welcome to use it in ODFF of course.

I'm no statistician, but what about: "Calculates the probability of observing a standard normal value greater than the z-core (M-µ)/(sigma/sqrt(N)), where M is the sample mean, N the sample size, and µ and sigma the known mean and standard deviation of the population." Or as a short variant: "Calculates the probability of observing a value greater than the z-core."

Hi Regina. I think you mean 'z-score' rather than 'z-core'. For your short-form version, we really should make clear that this is a sample, not the entire population. There's a real problem understanding this function, because Excel has described it incorrectly, and it also doesn't work as you might expect. So my inclination is to use more words rather than try to be succinct. I do find the diagrams helpful here, to highlight the difference between one-sided and one-/two-tailed. I'd suggest it's helpful to be very clear that this is not a one- or two-tailed test - although I'd also be happy if we simultaneously avoided the 'one-sided' label. D

"I think you mean 'z-score' rather than 'z-core'." Yes. "I do find the diagrams helpful for this, to highlight the difference between one-sided and one/two-tailed." The diagrams in the Wiki are indeed very useful, but the function description cannot contain pictures. And the proposed 'one sided' is not exact enough, because it does not say whether P(Z<z) or P(Z>z) is returned. This issue is about wrong calculation. That is fixed and therefore the issue is closed. But the current help "Returns the two-tailed P value of a z test with standard distribution." is wrong. We need a description, that is right in content and short enough to be shown in the function wizard. Can you make a suggestion in a new issue? It should not exceed two lines (see IPMT for a description which is too long).

@drking: if you come up with a new wording for the function wizard _within the next 2-3 days_, please don't create a new issue but add to this issue instead, I can easily change the resource in CWS odff06 as well where this patch was added. However, we'd need a separate issue for the online help.

Regina and I are discussing this by private email and will come up with something

Our proposal for the function wizard:

Function description: Calculates the probability of a sample mean greater than the mean of the given sample.
First parameter: The given sample, drawn from a normally distributed population.
Second parameter: The known mean of the population.
Third parameter: The known standard deviation of the population. If omitted, the standard deviation of the given sample is used.

So, no mention of one-sided at all? Ok. We probably should change the name of the 2nd parameter from "Number" to "mu" as well. The 3rd is already named "sigma".

Yes, "number" is too unspecific. For the application help I have suggested "mean", but "mu" is fine too and shorter.

In cws odff06: revision 275349 sc/source/ui/src/scfuncs.src

In http://sc.openoffice.org/servlets/ReadMsg?list=dev&msgNo=3673 Leonard Mada suggested the following description: "calculates the probability of observing a value as large or larger for the z-statistic" Would that do?

No. The comparison "larger" is incomplete, missing a "than" part.

I used this one now: Calculates the probability of observing a z-statistic greater than the one computed based on a sample. revision 275752 sc/source/ui/src/scfuncs.src

Reassigning to QA for verification.

verified in internal build cws_odff06