When signing or verifying a bigger (about 0.5 GB) file, I got an out-of-memory error in org.apache.xml.security.signature.XMLSignatureInput, at line 1275:

   bytes = JavaUtils.getBytesFromStream(_inputOctetStreamProxy);
There is only one case where the library can avoid reading the whole referenced file into memory: when the only transformation to be applied is Base64 decoding/encoding. And this optimization is already implemented. What is your case? Can you post your signature, with the references and transformations? Thanks.

p.s. I understand that all issues are critical to the people suffering from them. But this one seems important to me, not critical. Keep this in mind the next time you report a bug to other projects.
Created attachment 16366 [details]
Modified CreateSignature.java for testing OutOfMemoryError

New lines at line 151:

   // add a bigger file (size: 45 635 523)
   File bigfile = new File("jdk-1_5_0-doc.zip");
   sig.addDocument(bigfile.toURI().toString());
If you don't apply any transformation to a reference, inclusive C14N is implicit, so the file needs to be read into memory. The only transformation that avoids this conversion is Base64 decoding. So if you want to sign raw data, Base64-encode it beforehand and handle it like this:

   {
      // create the transforms object for the Document/Reference
      Transforms transforms = new Transforms(doc);
      transforms.addTransform(Transforms.TRANSFORM_BASE64_DECODE);
      // add a bigger file (size: 62 163 072 > 45 635 523)
      // coming from base64 /usr/portage/distfiles/xorg-x11-6.8.99.15.tar.bz2
      File bigfile = new File("/tmp/tmp.b64");
      sig.addDocument(bigfile.toURI().toString(), transforms,
                      Constants.ALGO_ID_DIGEST_SHA1);
   }

This way the test passes without any out-of-memory error (I have tested with just plain Java). I will mark this bug as invalid.
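As a side note, the Base64 pre-encoding step this workaround requires can itself be done without loading the file into memory. A minimal sketch using the modern java.util.Base64 API (which did not exist when this thread was written; at the time you would have used a library such as Apache Commons Codec instead):

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Base64;

public class Base64EncodeFile {

    // Stream-encode a file to Base64, holding at most one buffer in memory.
    public static void encode(File src, File dest) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(src));
             // Base64.Encoder.wrap() encodes bytes on the fly as they are written.
             OutputStream out = Base64.getEncoder().wrap(
                     new BufferedOutputStream(new FileOutputStream(dest)))) {
            byte[] buf = new byte[32 * 1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
    }
}
```

The resulting file can then be referenced with a TRANSFORM_BASE64_DECODE transform as shown above.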
(In reply to comment #3)
> If you don't put any transformation to a reference, the inclusive C14N is
> implicit, so the file need to be readed in memory.
> The only transformation that is not going to convert is base64 decoding.
> So if you want to sign rawdata base64 encoding it before, and handle like this:
> {
>    //create the transforms object for the Document/Reference
>    Transforms transforms = new Transforms(doc);
>    transforms.addTransform(Transforms.TRANSFORM_BASE64_DECODE);
>    // add a bigger file (size: 62 163 072 > 45 635 523)
>    // comming from base64 /usr/portage/distfiles/xorg-x11-6.8.99.15.tar.bz2
>    File bigfile = new File("/tmp/tmp.b64");
>    sig.addDocument(bigfile.toURI().toString(),transforms,
>                    Constants.ALGO_ID_DIGEST_SHA1);
> }
> In this way as the test pass without any out of memory error (I have test with
> just plain java).
> I will mark this bug as invalid.

Thank you for your answer, but Base64 encoding is not a solution in my case. We have a Protection Profile and Security Target that mandate "What You See Is What You Sign" (WYSIWYS). We have to sign the original file selected and seen by the user, not a Base64 encoding of it. The application will be Common Criteria evaluated and certified, so I must satisfy the WYSIWYS security objective. I realized that a resettable FileInputStream could be the solution here. Could you give me instructions on how and where to add it?

Thanks,
Agnes
Created attachment 16449 [details]
Solution idea

I have an idea for signing and verifying big files without a Base64 transformation and without an OutOfMemoryError.

1) Add a new member to XMLSignatureInput.java:

   File InputFile = null;

2) Add a new constructor to XMLSignatureInput.java:

   public XMLSignatureInput(File inputfile) {
      InputFile = inputfile;
   }

3) Modify the updateOutputStream method in XMLSignatureInput.java:

   /**
    * @param diOs
    * @throws IOException
    * @throws CanonicalizationException
    */
   public void updateOutputStream(OutputStream diOs)
           throws CanonicalizationException, IOException {
      if (diOs == outputStream) {
         return;
      }
      if (bytes != null) {
         diOs.write(bytes);
         return;
      } else if (this.isElement()) {
         Canonicalizer20010315OmitComments c14nizer =
            new Canonicalizer20010315OmitComments();
         c14nizer.setWriter(diOs);
         c14nizer.engineCanonicalizeSubTree(this._subNode, this.excludeNode);
         return;
      } else if (this.isNodeSet()) {
         /* If we have a node set but an octet stream is needed, we MUST
          * c14nize without any comments.
          *
          * We don't use the factory because direct instantiation should be a
          * little bit faster...
          */
         Canonicalizer20010315OmitComments c14nizer =
            new Canonicalizer20010315OmitComments();
         c14nizer.setWriter(diOs);
         if (this._inputNodeSet.size() == 0) {
            // empty nodeset
            return;
         }
         c14nizer.engineCanonicalizeXPathNodeSet(this._inputNodeSet);
         return;
      } else {
         /* --> begin of new lines */
         if (InputFile != null) {
            byte[] buffer = new byte[32 * 1024];
            BufferedInputStream bis = null;
            try {
               bis = new BufferedInputStream(new FileInputStream(InputFile));
               int bytesread = 0;
               while ((bytesread = bis.read(buffer)) != -1) {
                  diOs.write(buffer, 0, bytesread);
               }
            } finally {
               if (bis != null) {
                  bis.close();
                  bis = null;
               }
            }
            return;
         }
         /* <-- end of new lines */
         InputStream is = getResetableInputStream();
         if (bytes != null) {
            // already read, write it
            diOs.write(bytes, 0, bytes.length);
            return;
         }
         is.reset();
         int num;
         byte[] bytesT = new byte[1024];
         while ((num = is.read(bytesT)) > 0) {
            diOs.write(bytesT, 0, num);
         }
      }
   }

4) Modify lines 65-66 of ResolverLocalFileSystem.java:

   /* original, lines 65-66:
   FileInputStream inputStream = new FileInputStream(fileName);
   XMLSignatureInput result = new XMLSignatureInput(inputStream);
   */
   /* new, line 65: */
   XMLSignatureInput result = new XMLSignatureInput(new File(fileName));

It is only an idea, not a real patch. I tried it with signing and verifying two big files (jdk-1_5_0-doc.zip, size=45635523, and lnx_920_disk2.cpio.gz, size=588798999) and it works fine in this case:

   <ds:SignedInfo Id="BigFile_2_Signature-1__SignedInfo-1">
      <ds:CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315" />
      <ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1" />
      <ds:Reference Id="BigFile_2_Signature-1__Reference-1"
                    Type="http://uri.etsi.org/01903/v1.2.2#SignedProperties"
                    URI="#BigFile_2_Signature-1__SignedProperties-1">
         <ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
         <ds:DigestValue>NmZTo8j7D61PP1opL1HTs2YphN0=</ds:DigestValue>
      </ds:Reference>
      <ds:Reference Id="BigFile_2_Signature-1__Reference-2" URI="file:jdk-1_5_0-doc.zip">
         <ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
         <ds:DigestValue>HZqRglwK5G0XFQ36wdGzx00w+kQ=</ds:DigestValue>
      </ds:Reference>
      <ds:Reference Id="BigFile_2_Signature-1__Reference-3" URI="file:lnx_920_disk2.cpio.gz">
         <ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
         <ds:DigestValue>gaHz/04SteF1VCpuLY/+fkWksT8=</ds:DigestValue>
      </ds:Reference>
   </ds:SignedInfo>

Maybe something like this idea would be useful for other people or could be applied in a new version of the library.
I have the same problem as Agnes. We're using a Manifest in the signature to refer to very large external files. We expect to encounter multi-gigabyte files in some situations, and we'd also prefer to have them signed as-is, not as a Base64 encoding of the original. Unless I'm misunderstanding the workaround, this suggestion does NOT constitute a useful resolution to the problem. Please provide a direct programmatic mechanism to add References to binary files without reading them into memory in their entirety.
To be more precise, I am experiencing this error in a call to Manifest.generateDigestValues(), not in Signature (Agnes appears not to have specified how the error occurred).
Agnes' idea sounds good, but please post a patch if you want to discuss it further. If not, sadly, it will end up in the TODO list. Sorry.
I did find a nice workaround for the problem myself. At issue, as I currently understand the code, is that XMLSignatureInput assumes it is dealing with a relatively small piece of input (true for internal references, not so for external ones), and further that the transforms will or may scan the input several times (also true for internal references, probably not so for external). As Agnes found, the crux of the problem starts with updateOutputStream, but extends into the retrieval of the resettable stream (which has far more side effects than I'm comfortable with). For my situation, I already had a custom resolver in place. Once I had a better grasp of the problem, it was simple enough to create a custom XMLSignatureInput implementation that overrides updateOutputStream to open a FileInputStream, copy it to diOs, and close the input stream. I suspect a better solution than either mine or Agnes' would present itself if I could understand exactly what's going on with some of those member variables in XMLSignatureInput. The code is just peculiar enough that I'm not confident I understand everything it's trying to accomplish. I can observe, however, that XMLSignatureInput has changed somewhat since Agnes' proposal, so some rework would be necessary to apply it.
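Both Agnes' patch and the override described above reduce to the same chunked file-to-stream copy. A standalone sketch of that core loop (the class and method names here are illustrative helpers, not part of the xml-security API):

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopy {

    // Copy a file to an output stream in fixed-size chunks so that
    // at most one buffer (not the whole file) is held in memory.
    public static long copy(File src, OutputStream out) throws IOException {
        long total = 0;
        try (InputStream in = new BufferedInputStream(new FileInputStream(src))) {
            byte[] buf = new byte[32 * 1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```

In an overriding updateOutputStream, `out` would be the diOs argument; the point is simply that the digest stream consumes the bytes as they arrive rather than after a full in-memory read.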
This bug is not critical, so I am changing it to an enhancement. But before we apply the patch, I think we need to refactor XMLSignatureInput with the strategy pattern.
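A minimal sketch of what such a strategy-pattern refactoring might look like; all names here are hypothetical and do not reflect the refactoring that was actually done:

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical strategy interface: each kind of input source knows
// how to stream itself to an OutputStream.
interface SignatureInputSource {
    void writeTo(OutputStream out) throws IOException;
}

// In-memory source, matching the existing byte-array behavior.
class BytesSource implements SignatureInputSource {
    private final byte[] bytes;

    BytesSource(byte[] bytes) {
        this.bytes = bytes;
    }

    public void writeTo(OutputStream out) throws IOException {
        out.write(bytes);
    }
}

// File-backed source: streams in chunks, never buffering the whole file.
class FileSource implements SignatureInputSource {
    private final File file;

    FileSource(File file) {
        this.file = file;
    }

    public void writeTo(OutputStream out) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
            byte[] buf = new byte[32 * 1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
    }
}
```

updateOutputStream would then delegate to the strategy, and a node-set-backed source could canonicalize on the fly, keeping the large-file case out of memory without touching the other branches.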
I took another look at this bug and was able to fix it. The fix I putback was very similar to Agnes' proposed patch, with a few small differences. This should fix the OOM problems if you are simply trying to sign the bytes of a large file without any transformations that cause intermediate node-sets to be created. It doesn't address those other use cases, but it does address the use case of this bug, which I believe is the one most users have complained about.