Created attachment 26729 [details] Java program to test performance of sheet.setDefaultColumnStyle(col, style) for different no. of columns When creating an Excel sheet with thousands of columns, the performance of sheet.setDefaultColumnStyle(col, style) keeps degrading as more and more columns are added. For every call to this method with a new column, the time taken by the method keeps on increasing. Please execute the attached test program with 1000 columns or more, and notice the time taken by the method for every 100th column. The CPU consumption also reaches 100% during execution of the program. Please note: Though the test program sets only one default style for all the columns, in practice, I need to create an Excel sheet with different default styles for different columns. Because of the performance issue, creating large Excel sheets with thousands of columns is time-consuming and it is also consuming a lot of CPU for the entire duration.
I suspect that ColumnHelper.getColumn1Based is the source of the slowdown - because a CTCol entry can cover more than one column we currently need to loop over all of them to find the appropriate spot Can you try making a call to getColumn1Based for your columns instead of the set style call, and see if that shows the same problem?
Nick, you are right. I modified my code as per your suggestion and as you suspected ColumnHelper.getColumn1Based is the source of the slowdown. Here is the modified for loop: ColumnHelper helper = sheet.getColumnHelper(); for(int i = 0; i < numCols; i++) { long startTime = System.currentTimeMillis(); // sheet.setDefaultColumnStyle(i, style); CTCol col = helper.getColumn1Based(i+1, true); if ((i+1)%100 == 0) { long endTime = System.currentTimeMillis(); System.out.printf("Time for setting default column style on %4dth column (ms): %10d%n", (i+1), (endTime-startTime)); } if (col == null) { col = sheet.getCTWorksheet().getColsArray(0).addNewCol(); col.setMin(i+1); col.setMax(i+1); } col.setStyle(style.getIndex()); }
I have a feeling that we might need to re-write the search as an xpath, rather than the current method of iterating to find the correct one (We probably don't want to try to maintain a cache, as other code could easily change the CTCol entries) Can you try seeing how an xpath would perform? You need to get the CTSheet off the XSSFSheet, then get the CTCols, and finally do a simple xpath query (eg to get the child with a known attribute). If that looks to perform ok as you add the columns, then we know it's a good bet for switching too.