Bug 50868 - Performance of sheet.setDefaultColumnStyle(col, style) degrades as more columns are added
Summary: Performance of sheet.setDefaultColumnStyle(col, style) degrades as more colum...
Status: NEEDINFO
Alias: None
Product: POI
Classification: Unclassified
Component: XSSF (show other bugs)
Version: 3.7-FINAL
Hardware: All All
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-03-04 06:04 UTC by ajit_bhagvat
Modified: 2015-10-29 09:07 UTC (History)
2 users (show)



Attachments
Java program to test performance of sheet.setDefaultColumnStyle(col, style) for different no. of columns (3.11 KB, application/octet-stream)
2011-03-04 06:04 UTC, ajit_bhagvat
Details

Note You need to log in before you can comment on or make changes to this bug.
Description ajit_bhagvat 2011-03-04 06:04:42 UTC
Created attachment 26729 [details]
Java program to test performance of sheet.setDefaultColumnStyle(col, style) for different no. of columns

When creating an Excel sheet with thousands of columns, the performance of sheet.setDefaultColumnStyle(col, style) keeps degrading as more and more columns are added. For every call to this method with a new column, the time taken by the method keeps on increasing.

Please execute the attached test program with 1000 columns or more, and notice the time taken by the method for every 100th column. The CPU consumption also reaches 100% during execution of the program.

Please note: Though the test program sets only one default style for all the columns, in practice, I need to create an Excel sheet with different default styles for different columns.

Because of the performance issue, creating large Excel sheets with thousands of columns is time-consuming and it is also consuming a lot of CPU for the entire duration.
Comment 1 Nick Burch 2011-03-04 06:11:17 UTC
I suspect that ColumnHelper.getColumn1Based is the source of the slowdown - because a CTCol entry can cover more than one column we currently need to loop over all of them to find the appropriate spot

Can you try making a call to getColumn1Based for your columns instead of the set style call, and see if that shows the same problem?
Comment 2 ajit_bhagvat 2011-03-09 07:21:35 UTC
Nick, you are right. I modified my code as per your suggestion and as you suspected ColumnHelper.getColumn1Based is the source of the slowdown.

Here is the modified for loop:

    ColumnHelper helper = sheet.getColumnHelper();
    for(int i = 0; i < numCols; i++) {
      long startTime = System.currentTimeMillis();
//      sheet.setDefaultColumnStyle(i, style);
      CTCol col = helper.getColumn1Based(i+1, true);
      if ((i+1)%100 == 0) {
        long endTime = System.currentTimeMillis();
        System.out.printf("Time for setting default column style on %4dth column (ms): %10d%n", (i+1), (endTime-startTime));
      }
      if (col == null) {
            col = sheet.getCTWorksheet().getColsArray(0).addNewCol();
            col.setMin(i+1);
            col.setMax(i+1);
      }
      col.setStyle(style.getIndex());
    }
Comment 3 Nick Burch 2011-03-14 17:34:23 UTC
I have a feeling that we might need to re-write the search as an xpath, rather than the current method of iterating to find the correct one

(We probably don't want to try to maintain a cache, as other code could easily change the CTCol entries)

Can you try seeing how an xpath would perform? You need to get the CTSheet off the XSSFSheet, then get the CTCols, and finally do a simple xpath query (eg to get the child with a known attribute). If that looks to perform ok as you add the columns, then we know it's a good bet for switching too.