Bug 55904

Summary: [PATCH] Cell getHyperlink returns null when the link is drag copied in excel
Product: POI Reporter: Praveen <macpraveen>
Component: XSSFAssignee: POI Developers List <dev>
Status: NEW ---    
Severity: normal CC: guarale, lucamartini
Priority: P2 Keywords: PatchAvailable
Version: 3.9-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Bug Depends on:    
Bug Blocks: 60277    
Attachments: Different solution not using Regex
two test cases
Cumulative patch for Hyperlinks issues
rebased attachment 33669: search for multi-cell hyperlinks
enable multi-cell XSSFHyperlinks, find multi-cell hyperlinks
MultiCellHyperlink.xlsx

Description Praveen 2013-12-18 12:46:17 UTC
When a cell containing hyperlink is drag copied the hyperlink cell reference is stored in range, for example, ref="B3:B10" and so the equals fails in the below getHyperLink code of XSSFSheet


public XSSFHyperlink getHyperlink(int row, int column) {
        String ref = new CellReference(row, column).formatAsString();
        for(XSSFHyperlink hyperlink : hyperlinks) {
            if(hyperlink.getCellRef().equals(ref)) {
                return hyperlink;
            }
        }
        return null;
    }


My solution,

public XSSFHyperlink getHyperlink(int row, int column) {
	String ref = new CellReference(row, column).formatAsString();
	for (XSSFHyperlink hyperlink : hyperlinks) {
		if (hyperlink.getCellRef().contains(":")){
			if(checkWithInRange(hyperlink.getCellRef(), ref)) {
				return hyperlink;
			}
		} else if (hyperlink.getCellRef().equals(ref)) {
			return hyperlink;
		}
	}
	return null;
}

public static final String RANGE_PATTERN = "(.+)(\\d)+:(.+)(\\d)+";
public static final String INPUT_PATTERN = "(.+)(\\d)+";
public static final Pattern P1 = Pattern.compile(RANGE_PATTERN);
public static final Pattern P2 = Pattern.compile(INPUT_PATTERN);

/*
* range - eg, "B3:B8"
* input - B4
*/
public static boolean checkWithInRange(String range, String input) {
	Matcher m1 = P1.matcher(range);
	Matcher m2 = P2.matcher(input);
	if (m1.find() && m2.find()) {
		String prefix1 = m1.group(1);
		String num1 = m1.group(2);
		String prefix2 = m1.group(3);
		String num2 = m1.group(4);

		String inputPrefix = m2.group(1);
		String inputNum = m2.group(2);

		if (prefix1 != null && prefix2 != null && inputPrefix != null			&& prefix1.equalsIgnoreCase(prefix2) && prefix2.equalsIgnoreCase(inputPrefix)		&& num1 != null && num2 != null && inputNum != null) {
			int n1 = Integer.parseInt(num1);
			int n2 = Integer.parseInt(num2);
			int n3 = Integer.parseInt(inputNum);
			if (n3 >= n1 && n3 <= n2) {
				return true;
			}
		}
	}
	return false;
}
Comment 1 Praveen 2013-12-18 13:02:20 UTC
My updated solution code

public static final String RANGE_PATTERN = "(\\D+)(\\d+):(\\D+)(\\d+)";
	public static final String INPUT_PATTERN = "(\\D+)(\\d+)";
	public static final Pattern P1 = Pattern.compile(RANGE_PATTERN);
	public static final Pattern P2 = Pattern.compile(INPUT_PATTERN);
	

	public static boolean checkWithInRange(String range, String input) {
		Matcher m1 = P1.matcher(range);
		Matcher m2 = P2.matcher(input);
		if (m1.find() && m2.find()) {
			String prefix1 = m1.group(1);
			String num1 = m1.group(2);
			String prefix2 = m1.group(3);
			String num2 = m1.group(4);
			String inputPrefix = m2.group(1);
			String inputNum = m2.group(2);

			if (prefix1.equalsIgnoreCase(prefix2) && prefix2.equalsIgnoreCase(inputPrefix)) {
				int n1 = Integer.parseInt(num1);
				int n2 = Integer.parseInt(num2);
				int n3 = Integer.parseInt(inputNum);
				if (n3 >= n1 && n3 <= n2) {
					return true;
				}
			}
		}
		return false;
	}
Comment 2 Alessandro Guarascio 2014-05-15 07:52:44 UTC
Created attachment 31623 [details]
Different solution not using Regex

Here is a different solution that uses CellRangeAddress instead of pattern matching.
Comment 3 Dominik Stadler 2015-03-15 21:50:39 UTC
Can you provide a small sample file which contains such a hyperlink?
Comment 4 Luca Martini 2016-01-22 17:36:12 UTC
Created attachment 33479 [details]
two test cases

Hi all,
You can find enclosed a small Eclipse project containing two tests that highlight the problem.
One of them creates programmatically a multi-cell hyperlink, while the other uses a pre-created Excel file.

I am a colleague of Alessandro (the guy who proposed the second patch, the one not using Regex).
With the patch suggested by Alessandro the two tests succeed, while they fail also with the current (trunk) version of the POI.

Please let me know if there is anything that is not clear.
Comment 5 Luca Martini 2016-01-22 17:37:12 UTC
I change the status of the ticket, because additional information has been provided.
Comment 6 Dominik Stadler 2016-03-12 21:11:51 UTC
I get failures when I try to run the unit-tests with the patch applied, can you take a look?


java.lang.NumberFormatException: For input string: "3:D4"

	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
	at java.lang.Integer.parseInt(Integer.java:580)
	at java.lang.Integer.parseInt(Integer.java:615)
	at org.apache.poi.ss.util.CellReference.<init>(CellReference.java:123)
	at org.apache.poi.xssf.usermodel.XSSFHyperlink.buildCellReference(XSSFHyperlink.java:266)
	at org.apache.poi.xssf.usermodel.XSSFHyperlink.getFirstRow(XSSFHyperlink.java:298)
	at org.apache.poi.xssf.usermodel.TestXSSFHyperlink.testGetHyperlink(TestXSSFHyperlink.java:329)

And 

org.junit.ComparisonFailure: 
Expected :B3:D4
Actual   :B3

	at org.junit.Assert.assertEquals(Assert.java:115)
	at org.junit.Assert.assertEquals(Assert.java:144)
	at org.apache.poi.xssf.usermodel.TestXSSFHyperlink.doChecks(TestXSSFHyperlink.java:441)
	at org.apache.poi.xssf.usermodel.TestXSSFHyperlink.testCreateMuliCellHyperlink4(TestXSSFHyperlink.java:429)


and

java.lang.AssertionError: Atteso hyperlink in (2,1)

	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.assertTrue(Assert.java:41)
	at org.junit.Assert.assertNotNull(Assert.java:712)
	at org.apache.poi.xssf.usermodel.TestXSSFHyperlink.doChecks(TestXSSFHyperlink.java:439)
	at org.apache.poi.xssf.usermodel.TestXSSFHyperlink.testCreateMuliCellHyperlink1(TestXSSFHyperlink.java:399)
Comment 7 Luca Martini 2016-03-14 10:46:51 UTC
Created attachment 33669 [details]
Cumulative patch for Hyperlinks issues

Cumulative patch for fixing 52903, 55904 56527, 46742
Comment 8 Luca Martini 2016-03-14 10:54:44 UTC
(In reply to Dominik Stadler from comment #6)
> I get failures when I try to run the unit-tests with the patch applied, can
> you take a look?
> 

Dominik, you are right. The tests I attached to this bug additionally depend on some other modifications we did in our branch to deal with a number of issue regarding Hyperlinks:

- Bug 56527
- Bug 46742

Please note that Bug 56527 is marked as resolved, but the correction erroneusly assume that an hyperlink is only related to a single cell.

I attached the cumulative patch to this bug.
Best regards,
    Luca
Comment 9 Javen O'Neal 2016-09-11 04:54:20 UTC
Created attachment 34232 [details]
rebased attachment 33669 [details]: search for multi-cell hyperlinks

I rebased attachment 33669 [details] to the latest trunk (circa 3.15 beta 3).

This needs to be merged with the unit tests TestCreateMultiCellHyperlink.java and TestMultiCellHyperlink.java using TestMultiCellHyperlink.xlsx from attachment 33479 [details].
Comment 10 Javen O'Neal 2016-09-11 06:51:53 UTC
Created attachment 34233 [details]
enable multi-cell XSSFHyperlinks, find multi-cell hyperlinks

Merge unit tests from attachment 33479 [details] with XSSFSheet changes from attachment 34232 [details].

The XSSFHyperlink class assumes that hyperlinks belong to a single cell.
 * XSSFHyperlink.setFirstRow and setLastRow do the same thing, same as setFirst/LastColumn.
 * XSSFHyperlink.setCellAddress(String ref) breaks if ref is an area reference.

This patch fixes the above problems. I need to verify that XSSFHyperlinks are really objects within an XSSFSheet rather than an XSSFCell (only one hyperlink object exists for a multi-cell hyperlink), and then I will commit these changes.
Note that this patch breaks backwards compatibility: before it was sufficient to set first row and column on a hyperlink and it would move the single-cell hyperlink. With the attached code, both the first and last cell will need to be set to avoid expanding a single-cell hyperlink to an area spanned by the upper left and lower right cells. Additionally, it is inefficient to set these cells individually, both in terms of CPU instructions and lines of code. Additional work will need to be done on the Hyperlink class to allow multi-cell hyperlinks.
Comment 11 Javen O'Neal 2016-09-11 06:54:10 UTC
Created attachment 34234 [details]
MultiCellHyperlink.xlsx

This excel file was extracted from attachment 33479 [details], submitted by Luca Martini.