Bug 58245 - [PATCH] Make Workbook interface iterable over sheets
Summary: [PATCH] Make Workbook interface iterable over sheets
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: SS Common (show other bugs)
Version: unspecified
Hardware: PC Linux
: P2 enhancement (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords: PatchAvailable
Depends on:
Blocks:
 
Reported: 2015-08-15 07:15 UTC by Javen O'Neal
Modified: 2015-09-17 11:18 UTC (History)
0 users



Attachments
patch to make Workbook interface implement Iterable<Sheet> (1.32 KB, patch)
2015-08-15 07:15 UTC, Javen O'Neal
Details | Diff
patch to make all workbooks implement Iterable<Sheet> (22.33 KB, patch)
2015-09-06 09:19 UTC, Javen O'Neal
Details | Diff
site documentation updates (2.89 KB, patch)
2015-09-06 09:24 UTC, Javen O'Neal
Details | Diff
site documentation updates (2.85 KB, patch)
2015-09-06 09:31 UTC, Javen O'Neal
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Javen O'Neal 2015-08-15 07:15:19 UTC
Created attachment 33001 [details]
patch to make Workbook interface implement Iterable<Sheet>

In one of my own projects, I would like to write code without specifying any one particular implementation (HSSFWorkbook, XSSFWorkbook).

> Workbook wb = WorkbookFactory.create(file);
> for(Sheet worksheet : workbook) {
>   // do something with worksheet
> }

Both HSSFWorkbook and XSSFWorkbook implement an iterator over the sheets, but the Workbook interface is missing this, thus I can't write the above code without defining the workbook as HSSF or XSSF.
Comment 1 Javen O'Neal 2015-08-15 10:42:03 UTC
It looks like neither HSSFWorkbook nor XSSFWorkbook have sheet iterators defined for them.

I'm not sure what's preferred in this case, upcasting all iterators to Iterator<Sheet> or leaving the iterators in their more specific sheet type. The former is how XSSFWorkbook.iterator is currently implemented. The latter is similar to how XSSFSheet.rowIterator is currently implemented.

POI devs, what's preferred here?


============

ss.usermodel.Workbook:
public interface Workbook extends Closeable, Iterable {
    Iterator<? extends Sheet> iterator();
}

xssf.usermodel.XSSFWorkbook:
public class XSSFWorkbook extends POIXMLDocument implements Workbook {
    @Override
    public Iterator<XSSFSheet> iterator() {
        return sheets.iterator();
    }

hssf.usermodel.HSSFWorkbook:
public final class HSSFWorkbook extends POIDocument implements Workbook {
    @Override
    public Iterator<HSSFSheet> iterator() {
        return _sheets.iterator();
    }

xssf.streaming.SXSSFWorkbook:
public class SXSSFWorkbook implements Workbook {
    @Override
    public Iterator<SXSSFSheet> iterator() {
        return new SXSSFSheetIterator(_wb.iterator());
    }
}

=================

ss.usermodel.Workbook:
public interface Workbook extends Closeable, Iterable {
    Iterator<Sheet> iterator();
}

xssf.usermodel.XSSFWorkbook:
public class XSSFWorkbook extends POIXMLDocument implements Workbook {
    @Override
    public Iterator<Sheet> iterator() {
        return (Iterator<Sheet>)(Iterator<? extends Sheet>) sheets.iterator();
    }

hssf.usermodel.HSSFWorkbook:
public final class HSSFWorkbook extends POIDocument implements Workbook {
    @Override
    public Iterator<Sheet> iterator() {
        return (Iterator<Sheet>)(Iterator<? extends Sheet>) _sheets.iterator();
    }

xssf.streaming.SXSSFWorkbook:
public class SXSSFWorkbook implements Workbook {
    @Override
    public Iterator<Sheet> iterator() {
        Iterator<SXSSFSheet> it = new SXSSFSheetIterator(_wb.iterator());
        return (Iterator<Sheet>)(Iterator<? extends Sheet>) it;
    }
}
Comment 2 Andreas Beeker 2015-08-15 11:08:01 UTC
I had the same intention with X/HSLF [1], but this ends in some interesting generics definitions ... it works now, but I still think that some declarations could be simpler ..

so I guess, I would do it as in org.apache.poi.sl.usermodel.ShapeContainer with a parameterized interface

[1] http://stackoverflow.com/questions/29440337/java-generics-parameterized-class-vs-typed-method
Comment 3 Javen O'Neal 2015-08-16 04:08:25 UTC
Mimicking what you wrote in r1692593:
org.apache.poi.sl.usermodel.ShapeContainer:  https://svn.apache.org/viewvc/poi/trunk/src/java/org/apache/poi/sl/usermodel/ShapeContainer.java?limit_changes=0&r1=1692593&r2=1692592&pathrev=1692593
org.apache.poi.xslf.usermodel.XSLFShapeContainer: https://svn.apache.org/viewvc/poi/trunk/src/ooxml/java/org/apache/poi/xslf/usermodel/XSLFShapeContainer.java?r1=1410315&r2=1692593&pathrev=1692593

Is this what you were referring to?


ss.usermodel.Workbook:
public interface Workbook<T extends Sheet> extends Closeable, Iterable<T> {
    Iterator<T> iterator();
}

xssf.usermodel.XSSFWorkbook:
public class XSSFWorkbook extends POIXMLDocument implements Workbook<XSSFSheet> {
    @Override
    public Iterator<XSSFSheet> iterator() {
        return sheets.iterator();
    }

hssf.usermodel.HSSFWorkbook:
public final class HSSFWorkbook extends POIDocument implements Workbook<HSSFSheet> {
    @Override
    public Iterator<HSSFSheet> iterator() {
        return _sheets.iterator();
    }

xssf.streaming.SXSSFWorkbook:
public class SXSSFWorkbook implements Workbook<SXSSFSheet> {
    @Override
    public Iterator<SXSSFSheet> iterator() {
        return new SXSSFSheetIterator(_wb.iterator());
    }
}

I suppose I could go as far as writing a new interface:
public interface SheetContainer<T extends Sheet> extends Iterable<T>, but I currently don't have any additional methods that a SheetContainer would need to implement beyond what an Iterable requires.

This might be getting a bit out of my Java comfort zone here--I'm used to Python where generics/templates aren't needed because everything is duck typed.

Suggestions?
Comment 4 Nick Burch 2015-08-16 04:13:23 UTC
I don't think we can do the Workbook<T> trick. I had a play with that quite recently. The killer is that Eclipse (+friends) all start complaining when you write something like "Workbook wb = WorkbookFactory.create(file)", claiming that you should be giving it a generics type. We don't want to do that, as it defeats the whole point of telling everyone to use the SS interfaces!

We certainly could do 
Workbook:  Iterator<? extends Sheet> getSheetIterator();
HSSFWorkbook:  Iterator<HSSFSheet> getSheetIterator();

That works fine, and mirrors what we have for rows and cells. It's the iterable version that I think will have to remain "Workbook implements Iterable<Sheet>" and "HSSFWorkbook implements Workbook, Iterable<Sheet>", unless someone can come up with a cunning workaround!
Comment 5 Javen O'Neal 2015-09-06 09:19:53 UTC
Created attachment 33066 [details]
patch to make all workbooks implement Iterable<Sheet>

The XSSFWorkbook.iterator breaks that pattern used by Sheet.iterator and Row.iterator

Unfortunately, XSSFWorkbook has had this method for nearly 7 years (r700472 https://svn.apache.org/viewvc?view=revision&revision=700472).
Nonetheless, I think deprecating the old iterator is a step in the right direction, allowing developers to write for loops like

for (Sheet sh : workbook) {
    for (Row row : sh) {
        for (Cell cell : row) {
            System.out.println(cell);
        }
    }
}

I've provided test cases and documentation to help users transition their code to the new Iterator<Sheet> iterator() interface (previously Iterator<XSSFSheet> iterator().

Note: this patch will break backwards compatibility with existing code.
Comment 6 Javen O'Neal 2015-09-06 09:24:21 UTC
Created attachment 33067 [details]
site documentation updates

Updated quick-guide to include for-each sheet iteration in the attached patch.

These are the changes that developers would need to make with attachment 33066 [details]


final XSSFWorkbook wb = new XSSFWorkbook();
wb.createSheet();

// =====================================================================
// Case 1: Existing code uses XSSFSheet for-each loop
// =====================================================================
// Original code (no longer valid)
for (XSSFSheet sh : wb) {
    sh.createRow(0);
}

// Option A:
for (XSSFSheet sh : (Iterable<XSSFSheet>) (Iterable<? extends Sheet>) wb) {
    sh.createRow(0);
}

// Option B (preferred for new code):
for (Sheet sh : wb) {
    sh.createRow(0);
}

// =====================================================================
// Case 2: Existing code creates an iterator variable
// =====================================================================
// Original code (no longer valid)
Iterator<XSSFSheet> it = wb.iterator();
XSSFSheet sh = it.next();
sh.createRow(0);

// Option A:
Iterator<XSSFSheet> it = (Iterator<XSSFSheet>) (Iterator<? extends Sheet>) wb.iterator();
XSSFSheet sh = it.next();
sh.createRow(0);

// Option B (deprecated, but a quick-fix)
@SuppressWarnings("deprecation")
Iterator<XSSFSheet> it = wb.xssfSheetIterator();
XSSFSheet sh = it.next();
sh.createRow(0);

// Option C (preferred for new code):
Iterator<Sheet> it = wb.iterator();
Sheet sh = it.next();
sh.createRow(0);
Comment 7 Javen O'Neal 2015-09-06 09:31:44 UTC
Created attachment 33068 [details]
site documentation updates

fix quick-guide whitespace
Comment 8 Nick Burch 2015-09-07 15:47:26 UTC
Looks good to me, thanks for this

I think that once we've got the build green again from the forbidden APIs check changes, we should be fine to apply this as-is. Please remind us in 1-2 weeks if we forget!
Comment 9 Javen O'Neal 2015-09-11 00:37:26 UTC
As of r1702321, build 841 <https://builds.apache.org/job/POI/841/>, the build server is green again.
Comment 10 Nick Burch 2015-09-17 11:18:55 UTC
Thanks for this, and for your patience!

Code change applied in r1703573, site in r1703574.