Bug 51815

Summary: Unable to construct record instance
Product: POI
Reporter: m.mongia24
Component: HSSF
Assignee: POI Developers List <dev>
Status: RESOLVED LATER
Severity: blocker
CC: naveenchandravp, pedro.t.garcia
Priority: P2    
Version: 3.8-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Attachments: MS BiffValidator result, BiffViewer output

Description m.mongia24 2011-09-15 06:02:11 UTC
Hi,

I am getting an exception when I try to read an Excel 2003 workbook. The workbook is password protected at the workbook level (it can't be opened without a password) and write protected (it can't be modified without a password) as well.

Error Details:

org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance
	at org.apache.poi.hssf.record.RecordFactory$ReflectionRecordCreator.create(RecordFactory.java:64)
	at org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:263)
	at org.apache.poi.hssf.record.RecordFactoryInputStream.readNextRecord(RecordFactoryInputStream.java:270)
	at org.apache.poi.hssf.record.RecordFactoryInputStream.nextRecord(RecordFactoryInputStream.java:236)
	at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:392)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:276)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:201)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:183)
	at test.ExcelReader3.read(ExcelReader3.java:35)
	at test.ExcelReader3.main(ExcelReader3.java:85)
Caused by: java.lang.IllegalArgumentException: Name is too long: ü?Cl5‹¸…s."‡L^»ô’‘?˜yêââøýù&—JÖö6²¸øÄùjÿËL+?­ÊäÓ'›¿RV¤?ÐqЪ$E}½då
L~íL £¨Î*€=‚ÿ\Ì ë¼%âñð$yCüˆä¿3¿l4
	at org.apache.poi.hssf.record.WriteAccessRecord.setUsername(WriteAccessRecord.java:107)
	at org.apache.poi.hssf.record.WriteAccessRecord.<init>(WriteAccessRecord.java:74)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
	at java.lang.reflect.Constructor.newInstance(Unknown Source)
	at org.apache.poi.hssf.record.RecordFactory$ReflectionRecordCreator.create(RecordFactory.java:56)
	... 9 more
Picked up JAVA_TOOL_OPTIONS: -agentlib:jvmhook
Picked up _JAVA_OPTIONS: -Xrunjvmhook -Xbootclasspath/a:C:\PROGRA~1\HP\QUICKT~1\bin\JAVA_S~1\classes;C:\PROGRA~1\HP\QUICKT~1\bin\JAVA_S~1\classes\jasmine.jar


My Java code is ->


package test;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;

import org.apache.poi.hssf.record.crypto.Biff8EncryptionKey;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

public class ExcelReader3 
{
	public static final String PATH = "C:\\Monika\\2003\\";
	// This is a password-protected file (protected by both options, i.e. the workbook can't be opened without a password and can't be modified without a password).
	public static final String FILE_NAME = PATH + "Both.xls";
	
	public void read(String password) throws Exception
	{
		
		
		Biff8EncryptionKey.setCurrentUserPassword(password); 		
		
		POIFSFileSystem pois = new POIFSFileSystem(new FileInputStream(FILE_NAME));		
		
		//THIS LINE FAILS
		HSSFWorkbook book = new HSSFWorkbook(pois);
		
		HSSFSheet sheet = book.getSheetAt(0);
		
		System.out.println(sheet.getSheetName());
		
		
		ByteArrayOutputStream bos = new ByteArrayOutputStream();
		
		book.write(bos);		
		
		FileOutputStream fs=new FileOutputStream("C:\\testfos.xls");
		int count;
		ByteArrayInputStream bis=new ByteArrayInputStream(bos.toByteArray());
		while((count = bis.read()) != -1) 
		{
			fs.write(count);
		}	
		// Close the output stream so the copied bytes are flushed to disk
		fs.close();
	}
	
	public static void main(String[] args) 
	{
		try 
		{
			String password = "abc";
			
			ExcelReader3 reader = new ExcelReader3();	
			reader.read(password);
		} 
		catch (Exception e) 
		{
			e.printStackTrace();
		}
	}
}


The same code works fine if the workbook is only password protected at the workbook level (it can't be opened without a password), in which case it gives me an unprotected workbook, or only write protected (it can't be modified without a password), in which case it gives me the file as is, i.e. write protected.

But if the workbook is protected in both ways, it fails.

I am using POI version 3.6. I have also tried with 3.7, but it does not work either.

Please help!!! :(

Thanks!!!
Comment 1 Nick Burch 2011-09-15 11:08:59 UTC
Any chance you could share the problem file?

Also, does the problem file pass the Microsoft validator? See the FAQ http://poi.apache.org/faq.html#faq-N10109 for details
Comment 2 Triqui 2012-05-29 10:18:54 UTC
Created attachment 28851 [details]
MS BiffValidator result
Comment 3 Triqui 2012-05-29 10:29:04 UTC
I'm adding this comment because I've found this problem in a number of documents, and I think POI could handle this error a bit better. And I think the same error is showing in the document attached for bug 50833.

The issue seems to be that when WriteAccessRecord finds a corrupt header, it tries to reconstruct the data for the last document modification and then tries to set the username, but this sometimes fails. I've looked at the code and found a comment which states that this information is optional:
// String header looks wrong (probably missing)
// OOO doc says this is optional anyway.
// reconstruct data

So I wonder if it could be possible to catch the exception thrown by the setUsername method and just clear the username in that case. I've been testing with LibreOffice and Excel 2003 and they ignore that information in any document with this problem.

Something like this could solve the problem:

public WriteAccessRecord(RecordInputStream in) {
	...
	if (nChars > DATA_SIZE || (is16BitFlag & 0xFE) != 0) {
		// String header looks wrong (probably missing)
		// OOO doc says this is optional anyway.
		// reconstruct data
		byte[] data = new byte[3 + in.remaining()];
		LittleEndian.putUShort(data, 0, nChars);
		LittleEndian.putByte(data, 2, is16BitFlag);
		in.readFully(data, 3, data.length-3);
		String rawValue = new String(data);
		try {
			setUsername(rawValue.trim());
		} catch (IllegalArgumentException e) {
			setUsername(""); // or just log a warning and skip it
		}
		return;
	}
	...
}


I don't know if there is a better way of handling corrupted data, but if that bit of info is optional, I think no exception should be thrown.
Please, let me know what you think.
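
To make the idea concrete without depending on POI itself, here is a minimal, self-contained sketch of the fallback. Everything in it is a hypothetical stand-in: checkFits only imitates a strict setter, under the assumption that the limit is 112 bytes for 3 header bytes plus one byte per character (two bytes per character once any character is above 0xFF). The real check in WriteAccessRecord may differ.

```java
// Hypothetical sketch, not POI code: the names and the size limit are assumptions.
class LenientUsernameSketch {
    // Assumed size of the record body in bytes.
    static final int DATA_SIZE = 112;

    // Builds a string of n copies of c (helper for the demo only).
    static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    // Imitates a strict setter: rejects names whose assumed encoded size
    // (3 header bytes + 1 or 2 bytes per character) exceeds DATA_SIZE.
    static void checkFits(String username) {
        boolean is16bit = false;
        for (int i = 0; i < username.length(); i++) {
            if (username.charAt(i) > 0xFF) {
                is16bit = true;
                break;
            }
        }
        int encodedByteCount = 3 + username.length() * (is16bit ? 2 : 1);
        if (encodedByteCount > DATA_SIZE) {
            throw new IllegalArgumentException(
                    "Name is too long (" + encodedByteCount + " bytes)");
        }
    }

    // The proposed behaviour: keep the trimmed name if it fits,
    // otherwise fall back to an empty name instead of failing.
    static String lenientUsername(String rawValue) {
        String candidate = rawValue.trim();
        try {
            checkFits(candidate);
            return candidate;
        } catch (IllegalArgumentException e) {
            return "";
        }
    }

    public static void main(String[] args) {
        System.out.println("[" + lenientUsername("  alice  ") + "]");
        System.out.println("[" + lenientUsername(repeat('a', 110)) + "]");
        System.out.println("[" + lenientUsername(repeat('\u2039', 55)) + "]");
    }
}
```

Under these assumptions, a trimmed name that fits is kept, and anything longer is replaced by an empty string instead of aborting the whole workbook load.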

* I attached results from MS BiffValidator and POI BiffViewer. I cannot attach a real document since it contains sensitive information, and when I try to modify it and save, the problem goes away. But let me know if you really need it and I will see what I can do.
* Just in case, the "data" byte array final value is:
[5, 63, 63, 63, -122, ... (till index 55), 32, ... (till index 111)]
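
As a worked check of those first three bytes: read the way the patch above writes them (a little-endian unsigned short for nChars, then one byte for is16BitFlag), they give values that fail the header test. The sketch below redoes that arithmetic in plain Java. readUShortLE and headerLooksWrong are hypothetical helper names, and DATA_SIZE = 112 is an assumption inferred from the dump ending at index 111.

```java
// Hypothetical sketch, not POI code: redoes the header arithmetic on the dumped bytes.
class HeaderCheckSketch {
    // Assumed record body size, inferred from the dump ending at index 111.
    static final int DATA_SIZE = 112;

    // Reads an unsigned 16-bit little-endian value from data at offset.
    static int readUShortLE(byte[] data, int offset) {
        return (data[offset] & 0xFF) | ((data[offset + 1] & 0xFF) << 8);
    }

    // Same condition as in the patch: the length is implausible,
    // or the flag byte has bits set other than bit 0.
    static boolean headerLooksWrong(int nChars, int is16BitFlag) {
        return nChars > DATA_SIZE || (is16BitFlag & 0xFE) != 0;
    }

    public static void main(String[] args) {
        // First three bytes of the "data" array quoted above.
        byte[] firstBytes = {5, 63, 63};
        int nChars = readUShortLE(firstBytes, 0);   // 5 + 63 * 256
        int is16BitFlag = firstBytes[2] & 0xFF;

        System.out.println("nChars = " + nChars);
        System.out.println("is16BitFlag & 0xFE = " + (is16BitFlag & 0xFE));
        System.out.println("header looks wrong: " + headerLooksWrong(nChars, is16BitFlag));
    }
}
```

Both parts of the condition hold for these bytes (nChars is 16133 and the masked flag is 62), which would be consistent with the record taking the "reconstruct data" branch before setUsername is called.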
Comment 4 Triqui 2012-05-29 10:29:49 UTC
Created attachment 28852 [details]
BiffViewer output
Comment 5 Naveenchandra Patil 2014-05-21 10:11:01 UTC
Dear Team,
  Was any solution found for the problem stated above? I am facing the same issue too.

Thanks!

Regards,
Naveen
Comment 6 Nick Burch 2014-06-11 17:21:28 UTC
We need a test file which shows the problem

Comment 3 suggests it could be triggered with the file from bug #50833, but that one loads fine with the current version of Apache POI. If the problem still remains, we need a file that shows it so we can check, investigate and fix
Comment 7 Triqui 2014-07-31 14:53:19 UTC
You are right. I don't know if POI handles the file stream a bit better now and this error is not showing anymore or I was wrong when I said the attachment in bug 50833 had this problem too. Maybe I only tested with BiffViewer and didn't try to create a workbook the normal way.
But, I would like to know what you think about what I said before about the comments in the code stating that the information is optional, shouldn't we avoid throwing exceptions from that piece of code?
Also, in a case like this where a file works with the current version, but the BiffViewer from that same version throws an exception, what should be done? Fix BiffViewer? Since BiffViewer uses the same WriteAccessRecord class, how can this be fixed?
Moreover, do you think it would be possible to write some tests for BiffViewer in order to validate this bug?
I really think a try-catch block surrounding setUsername() would be a nice addition, given that in that place the data is already assumed to be corrupted and poi is just trying its best to reconstruct it. Throwing an exception if something fails while doing so seems very strange.
Comment 8 Nick Burch 2014-07-31 15:11:04 UTC
With a failing test file, we can review exactly what is and isn't optional by comparing the file against the file format docs

We do have tests which run BiffViewer

Since the supplied file passes just fine with trunk, I'm closing this bug. If you do find another one that does still fail on trunk, please re-open the bug and upload the problematic file
Comment 9 Naveenchandra Patil 2014-08-01 07:39:17 UTC
Hello Nick,
  I do have a sample file which POI fails to read. But unfortunately I cannot upload it here as it is against our company policies. 

  But I felt this would be interesting for you. The sample file is created by some tool, and POI throws the error only when the file is read directly after it is created. If the created file is opened once by the user manually and then fed to POI, it reads fine :)

  After debugging I could see that the erroneous username which came through on the first feed to POI was set properly once the file had been opened!!

Regards,
Naveen
Comment 10 Triqui 2014-08-01 07:56:53 UTC
Could you create a sample file with fake data? I cannot do it myself because the files I had with this problem were sent to us by some of our customers and I lost track of which ones were sending those (I applied the proposed patch and they are going through without issues).
That would get this bug fixed.