I have an Excel file with several sheets, each containing one or more attachments (embedded files). I want to extract one of the sheets and save it to a new workbook. My approach is to remove all the other sheets from the original workbook and save the result to a new file. This works, but the extracted file is basically the same size as the original. I want the new workbook to keep only the attachments that belong to the sheet I need. Can anyone help me?
Can you attach a sample workbook and a reproducible test case that shows the problem, so that others can try to help?
Created attachment 38095: this is the original Excel template.
Created attachment 38096: this is the whole sheet named "attachment1", extracted from the original Excel file.

This is the test case:

```
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.commons.io.FileUtils;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

@Test
public void testExtractSheet() throws IOException {
    String sheetNameToExtract = "attachment1";
    // try-with-resources closes the workbook exactly once, even on failure
    try (Workbook workbook = WorkbookFactory.create(new File("template - 副本.xls"))) {
        boolean found = false;
        // Walk backwards so that removing a sheet does not shift the indexes still to be visited
        for (int i = workbook.getNumberOfSheets() - 1; i >= 0; i--) {
            if (workbook.getSheetAt(i).getSheetName().equalsIgnoreCase(sheetNameToExtract)) {
                found = true;
            } else {
                workbook.removeSheetAt(i);
            }
        }
        if (!found) {
            throw new FileNotFoundException("can not find sheet: " + sheetNameToExtract);
        }
        File outputFile = new File(System.currentTimeMillis() + ".xls");
        FileUtils.createParentDirectories(outputFile);
        try (FileOutputStream stream = new FileOutputStream(outputFile)) {
            workbook.write(stream);
        }
    }
}
```

The result: the original Excel file is 847K, and the sheet named "attachment1" contains only one simple Excel attachment, but the file extracted from it is still 845K. So I guess POI does not delete the attachments that belonged to the removed sheets.
Maybe you could write custom code to remove the attachments yourself. The EmbeddedExtractor class at https://github.com/apache/poi/blob/trunk/poi/src/main/java/org/apache/poi/ss/extractor/EmbeddedExtractor.java should give you an idea of how to iterate over the attachments.
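As a starting point, here is a minimal sketch of that iteration. It only lists the embedded OLE objects anchored directly on a sheet and does not remove anything. The class name `AttachmentInventory` and the method `describeAttachments` are made up for illustration and are not part of POI. It assumes the attachments show up as `ObjectData` shapes in the sheet's drawing, and it does not descend into grouped shapes, which `EmbeddedExtractor` does handle.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.ss.usermodel.Drawing;
import org.apache.poi.ss.usermodel.ObjectData;
import org.apache.poi.ss.usermodel.Shape;
import org.apache.poi.ss.usermodel.Sheet;

// Hypothetical helper for illustration only, not a POI class.
public class AttachmentInventory {

    /** Describes the embedded OLE objects anchored directly on one sheet. */
    public static List<String> describeAttachments(Sheet sheet) {
        List<String> found = new ArrayList<>();
        Drawing<?> drawing = sheet.getDrawingPatriarch();
        if (drawing == null) {
            return found; // no drawing layer, so nothing is embedded on this sheet
        }
        for (Shape shape : drawing) {
            if (!(shape instanceof ObjectData)) {
                continue; // skip plain pictures, text boxes and other shapes
            }
            ObjectData object = (ObjectData) shape;
            String storage = "(no directory entry)";
            if (object.hasDirectoryEntry()) {
                try {
                    // Name of the storage entry that holds the embedded file
                    storage = object.getDirectory().getName();
                } catch (IOException e) {
                    storage = "(unreadable: " + e.getMessage() + ")";
                }
            }
            found.add(sheet.getSheetName() + ": " + object.getOLE2ClassName()
                    + " stored in " + storage);
        }
        return found;
    }
}
```

Running this for every sheet before calling `removeSheetAt` would show which storage entries belong to the sheets being dropped. Actually deleting those entries is left out of the sketch.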