9403 – HSSFRow.cellIterator() returns row results in reverse order

Status:	RESOLVED FIXED

Alias:	None

Product:	POI
Classification:	Unclassified
Component:	HSSF
Version:	2.0-dev
Hardware:	PC All

Importance:	P3 normal with 8 votes
Target Milestone:	---
Assignee:	POI Developers List

Reported:	2002-05-24 17:55 UTC by Shawn Savela
Modified:	2006-01-12 18:32 UTC
CC List:	0 users

Description Shawn Savela 2002-05-24 17:55:37 UTC

When using the HSSFRow.cellIterator to traverse through a document, the column 
information is in reverse-sequential order.  For example, if iterating through 
a document with data in two rows and three columns, the data will be in this 
order:

(0,2),(0,1),(0,0),(1,2),(1,1),(1,0).

The HSSFSheet.rowIterator properly iterates through the data in forward-
sequential order.

I duplicated this bug in the 1.5 release and the 1.6 build release.

Comment 1 Andy Oliver 2002-05-24 17:57:23 UTC

There is no contract guaranteeing the order.  Furthermore, they can appear in
any order in the underlying file format.

Comment 2 Shawn Savela 2002-05-24 18:11:49 UTC

If there is an implied ordering of the cells (a number that can be retrieved 
from getCellNum()), why wouldn't the cellIterator() method return the cells in 
that order?

It seems inconsistent at best since the rowIterator does return (at least in my 
example) the rows in the implied order that they exist in the spreadsheet.

The documentation should reflect the fact that the *Iterator routines will 
return the results in random order.

Comment 3 Andy Oliver 2002-05-24 22:17:48 UTC

The implied ordering is "whatever is in the file" or some variant of "whatever
was most efficient to store".  This is where the rubber meets the road.  While I
realize it can be inconvenient for the user to reorder, it's far more efficient
than us ordering them in a particular order.  If they are precisely in reverse
because of something we're doing, feel free to submit a patch, but I'm against
enforcing any contract as to the order.  Your point about the documentation is
well taken, please submit a patch and I'll apply it against the head. (2.0)
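
For readers who need column order while the iterator makes no ordering
promise, the reordering discussed above can be done on the caller's side. The
following is only a minimal sketch: DemoCell and CellSorter are hypothetical
stand-ins invented for illustration (not POI classes), and the only assumption
carried over from the thread is that each cell exposes its column index through
a getCellNum()-style accessor.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a spreadsheet cell; only the column index matters here.
final class DemoCell {
    private final int cellNum;
    private final String value;

    DemoCell(int cellNum, String value) {
        this.cellNum = cellNum;
        this.value = value;
    }

    int getCellNum() { return cellNum; }
    String getValue() { return value; }
}

// Hypothetical helper: reorders cells without relying on the iterator's order.
final class CellSorter {
    private CellSorter() {}

    // Drains an iterator of unspecified order into a list sorted by column index.
    static List<DemoCell> inColumnOrder(Iterator<DemoCell> cells) {
        List<DemoCell> sorted = new ArrayList<>();
        while (cells.hasNext()) {
            sorted.add(cells.next());
        }
        sorted.sort(Comparator.comparingInt(DemoCell::getCellNum));
        return sorted;
    }
}
```

The cost is one extra list and a sort per row, which is the trade-off the
comment above describes: the caller pays for ordering only when it is needed.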

Comment 4 Andy Oliver 2002-05-24 22:18:10 UTC

(if patch is provided please reopen)

Comment 5 Barry Andrews 2002-12-19 15:31:06 UTC

I respectfully disagree with the decision to close this bug. It just makes 
sense to have the cellIterator() return the Iterator in the correct forward 
order. This method could be very convenient, but if the programmer has to 
reorder it, it's pretty much useless. I believe this is happening because 
HashMap was used. Couldn't a different data structure be used instead? Can we 
please keep this one open for a while and let some folks vote on it? 

thanks,

Barry

Comment 6 Andy Oliver 2002-12-19 16:21:21 UTC

Sure.  You can leave it open and please feel free to vote (if enough people feel
that way and I think they are making an INFORMED vote I/other committers may
change my/our mind).  I'm retargeting to 2.0 because there is like NO way we're
backporting such changes into 1.5.1 (behavioral/feature-oriented,etc).  However,
the fact we're using a HashMap will change in 3.0 and instead we'll probably
return them in the order you suggest just due to HOW we'll be storing it.  I
just don't want to guarantee order in this interface because it could change and
the file format itself might affect it.  Personally, I think you're suffering
from file-format API versus VBA-style API confusion.  The HSSF usermodel is to
give you access to the file format without exposing you to certain nasty details
(such as the fact that rows are completely unrelated to cells and all the little
records and intricacies).  VBA and Formula 1 make it look like you're using
Excel (and one interfaces with Excel single-threadedly, and the other is a full
implementation of Excel in Java more or less...to the tune of 10k).  It's the
difference between abstracting the file format to you and creating an
implementation of Excel.  We make this decision for performance reasons and
simplicity.  (Formula 1 and VBA APIs are simpler to conceive but harder to
master because there are just so freaking many of them...10 different ways to do
EVERYTHING... HSSF seeks a greater conceptual simplicity.  Also "convenience
functions" are by [apparent] community consensus deferred until a later release
-- we're all infected with eXtremeProgramming style thought.)

Besides.  Just because you need the cells or rows in order, doesn't mean
everyone does.  Depending on what you're doing, the reactor pattern (in your own
code) might help you here regardless of whether you're using the eventmodel:
http://www.freeroller.net/page/acoliver/20021215#the_reactor_pattern_in_reading

Comment 7 Sean Geraty 2003-04-28 08:59:07 UTC

Hi,

I'm new to this, so please excuse me if I do anything incorrectly. I've voted 
for this to be changed because of the following:
-> While no contract to order exists, there is certainly a logical expectation 
of sequence because the HSSFSheet.rowIterator() does deliver its results 
ordered from low to high, so why not HSSFRow.cellIterator?
-> It appears easy to do - I got an ordered sequence by simply changing the 
HashMap cells to TreeMap (and removing the constructor's initial capacity) in 
HSSFRow.java - only 3 lines. By the way, this will make it consistent with the 
TreeMap rows defined in HSSFSheet.java.

If the change is declined, perhaps a compromise method (e.g. 
HSSFRow.orderedCellIterator() - that converts the HashMap to a TreeMap?).

Cheers,
Sean
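
To illustrate the map swap and the compromise method suggested above, here is
a rough sketch using only java.util collections. OrderedCells and its methods
are hypothetical names chosen for this example, and the Map<Integer, V> keyed by
column index is an assumption about how cells might be stored; it is not the
actual HSSFRow code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of POI.
final class OrderedCells {
    private OrderedCells() {}

    // Copies a map of unspecified iteration order into a TreeMap, whose
    // values() view iterates in ascending key (column index) order.
    static <V> Iterator<V> orderedIterator(Map<Integer, V> cellsByColumn) {
        return new TreeMap<>(cellsByColumn).values().iterator();
    }

    // Small utility so callers can inspect the resulting order.
    static <V> List<V> toList(Iterator<V> it) {
        List<V> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }
}
```

Copying into a TreeMap on every call costs O(n log n) per iteration, whereas
storing the cells in a TreeMap from the start moves that cost to insertion
time. Either way the iteration order follows from the key ordering rather than
from hashing.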

Comment 8 Jason Height 2006-01-13 03:32:54 UTC

As a result of the recent performance change, the storage of the HSSFCell
objects was changed from a TreeMap implementation to an array-based one.

This has the beneficial side effect that the cellIterator is now in cell order.

This change is available in SVN.

Jason
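
Why array-based storage gives cell order for free can be sketched as follows.
This is an illustrative assumption about the general technique, not the code
committed to SVN: ArrayRow is a hypothetical class that stores values in an
array indexed by column and whose iterator walks the array from index 0 upward,
skipping empty slots.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical row type: slot i of the array holds the value for column i.
final class ArrayRow implements Iterable<String> {
    private String[] cells = new String[4];

    // Stores a value at the given column, growing the array when needed.
    void setCell(int column, String value) {
        if (column >= cells.length) {
            cells = Arrays.copyOf(cells, Math.max(column + 1, cells.length * 2));
        }
        cells[column] = value;
    }

    // Iterates over non-empty slots in ascending column order.
    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int nextIndex = advance(0);

            // Returns the first index at or after 'from' that holds a value,
            // or cells.length if there is none.
            private int advance(int from) {
                int i = from;
                while (i < cells.length && cells[i] == null) {
                    i++;
                }
                return i;
            }

            @Override
            public boolean hasNext() {
                return nextIndex < cells.length;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                String value = cells[nextIndex];
                nextIndex = advance(nextIndex + 1);
                return value;
            }
        };
    }
}
```

Because the array index is the column number, no sorting or key comparison is
needed, and the order does not depend on the order in which cells were added.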