Bug 65490

Summary: Can't remove Hyperlinks that are used in multiple cells
Product: POI Reporter: Joan Pujol <joanpujol>
Component: XSSFAssignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P2    
Version: 5.0.0-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: File to reproduce the error

Description Joan Pujol 2021-08-09 12:50:17 UTC
In case a hyperlink is used in a cell range (for example B2:B4) it can't be deleted.
Because although it can be retrieved with `sheet.getHyperlinkList()` it can't be deleted because currently available method org.apache.poi.xssf.usermodel.XSSFSheet#removeHyperlink` only match a single cell.

You can easily create one of that cells with copy-paste using a cell with a hyperlink and paste it into a selection of several cells.

If you have one of those cells with hyperlinks, operations like `sheet.shiftRows` fail without being able to process the file and not being able to work around by deleting the links.
Comment 1 PJ Fanning 2021-08-09 13:08:56 UTC
can you provide a reproducible use case? the current code assumes that each XSSFHyperlink is for one cell and one cell only (see it is has a cell ref)
Comment 2 PJ Fanning 2021-08-09 13:15:47 UTC
PS org.apache.poi.xssf.usermodel.XSSFSheet#removeHyperlink is annotated as an internal API - removing hyperlinks (in theory) should be done using the Cell removeHyperlink method.
Comment 3 Joan Pujol 2021-08-09 14:00:23 UTC
Yes, I've used the cell method, but as it didn't work I looked at the code and I've seen, as you say, that it doesn't work because it expects a single cell.

I will try to submit a minimal reproducible example.
Comment 4 Joan Pujol 2021-08-09 14:22:21 UTC
Created attachment 37978 [details]
File to reproduce the error
Comment 5 Joan Pujol 2021-08-09 14:23:15 UTC
Code to reproduce de error:

            try (InputStream fileStream = new FileInputStream("c:\\temp\\multiplehyperlinks.xlsx")) {
                XSSFWorkbook workbook = new XSSFWorkbook(fileStream);  //I
                Streams.stream(workbook.getSheetAt(0).rowIterator())
                        .flatMap(row -> Streams.stream(row.cellIterator()))
                        .filter(cell -> cell.getHyperlink() != null)
                        .forEach(Cell::removeHyperlink);

                assert  workbook.getSheetAt(0).getHyperlinkList().size() == 0;
           }
Comment 6 PJ Fanning 2021-08-09 14:35:56 UTC
Current code doesn't support this (from sheet1.xml)


<hyperlink ref="A3:A5" r:id="rId2" display="http://www.apache.org" xr:uid="{DAEE6718-A7F1-4532-A628-07B4379C0500}"/>

It expects that the ref is a single cell.
Comment 7 PJ Fanning 2021-08-09 18:17:24 UTC
Implementing this could be tricky.
* need to fix the getHyperlink logic to be aware the hyperlink cell ref can be a range
* removeHyperlink logic needs to handle case where you are removing a shared hyperlink from just one cell. If you remove from the middle of the range of cells then you need to create extra hyperlinks to replace the one you remove - it might be easier initially to create single cell hyperlinks on the cells that are part of the original shared range but that we have not received a request to remove the shared hyperlink from
* XSSFRow removeCell does not seem to try to handle removing related hyperlinks
* the row shifter logic will need to take into account the fact the hyperlink might be shared by cells on separate rows - like with removeHyperlink if might be tidier to make the hyperlink on the destination a non-shared hyperlink
Comment 8 PJ Fanning 2021-08-09 22:29:21 UTC
Joan - this will be fixed eventually but if you want to remove hyperlinks that have cell refs that span many cells - as a workaround - you could get the hyperlinks from the sheet and iterate over them to find the ones you want to remove and change the cell ref to be one that is just a simple single-cell cell ref and then you can remove that hyperlink using that sheet.removeHyperlink or the cell.removeHyperlink (the sheet.removeHyperlink may need to be changed in the next release to accommodate the fix for this issue.
Comment 9 Joan Pujol 2021-08-10 05:15:18 UTC
PJ Fanning - Thank you very much for your support and the provided workaround.
Comment 10 PJ Fanning 2021-08-10 19:40:28 UTC
I've added a partial fix - r1892189
Comment 11 PJ Fanning 2021-08-10 19:41:55 UTC
I've already updated POI trunk so that XSSFCell#getHyperlink will find a shared hyperlink if there is one
Comment 12 PJ Fanning 2021-08-11 21:15:05 UTC
I added support for removing a hyperlink from one cell and keeping the hyperlink on other cells (when the original hyperlink is a shared one).
Comment 13 PJ Fanning 2021-08-11 21:15:17 UTC
r1892246