Bug 67774 - Support empty string ContentType in OPC package
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: OPC
Version: 5.3.x-dev
Hardware: PC All
Importance: P2 enhancement
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-10-16 17:29 UTC by Olivier Schmitt
Modified: 2024-02-26 12:27 UTC



Attachments
C#.Net test program (1.88 KB, text/plain)
2023-10-16 17:39 UTC, Olivier Schmitt

Description Olivier Schmitt 2023-10-16 17:29:17 UTC
The Microsoft Power BI PBIX file format is an OPC package file type.
It contains a [Content_Types].xml file, which declares the content types of the package parts.
Unfortunately, the content types are declared as empty strings.
Apache POI throws an exception when trying to open such a file.
Is it possible to support an empty string as the content type of a part?
Here is sample Java code and the resulting exception:

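import java.io.File;
import java.util.List;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackagePart;
import org.apache.poi.openxml4j.opc.PackagePartName;

void test(String filePath) throws InvalidFormatException {
  File file = new File(filePath);
  // The exception below is thrown by this call, while the parts are registered.
  OPCPackage opcPackage = OPCPackage.open(file);
  List<PackagePart> parts = opcPackage.getParts();
  for (PackagePart part : parts) {
    PackagePartName name = part.getPartName();
    System.out.println(name.getName());
  }
  // Close the package without writing anything back to the file.
  opcPackage.revert();
}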

org.apache.poi.openxml4j.exceptions.InvalidFormatException: The specified content type '' is not compliant with RFC 2616: malformed content type.
at org.apache.poi.openxml4j.opc.internal.ContentType.<init>(ContentType.java:152)
at org.apache.poi.openxml4j.opc.ZipPackagePart.<init>(ZipPackagePart.java:82)
at org.apache.poi.openxml4j.opc.ZipPackage$EntryTriple.register(ZipPackage.java:377)
at org.apache.poi.openxml4j.opc.ZipPackage.getPartsImpl(ZipPackage.java:326)
at org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:749)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:288)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:192)
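
To see which entries trigger this exception, the content types can be inspected without POI. Below is a minimal sketch that uses only the JDK. It assumes the package is a regular ZIP file with [Content_Types].xml at its root. The class and method names (EmptyContentTypeScan, findEmpty) are illustrative only and not part of POI.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative helper, not part of POI.
public class EmptyContentTypeScan {

    // Lists every Default/Override entry whose ContentType attribute is
    // empty or missing (getAttribute returns "" in both cases).
    public static List<String> findEmpty(InputStream contentTypesXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        Document doc = factory.newDocumentBuilder().parse(contentTypesXml);

        List<String> result = new ArrayList<>();
        NodeList nodes = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            if (!(nodes.item(i) instanceof Element)) {
                continue; // skip whitespace and comments
            }
            Element e = (Element) nodes.item(i);
            String kind = e.getLocalName();
            boolean isDefault = "Default".equals(kind);
            if (!isDefault && !"Override".equals(kind)) {
                continue;
            }
            if (e.getAttribute("ContentType").trim().isEmpty()) {
                // Default entries are keyed by Extension, Override entries by PartName.
                String key = e.getAttribute(isDefault ? "Extension" : "PartName");
                result.add(kind + " " + key);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        try (ZipFile zip = new ZipFile(args[0])) {
            ZipEntry entry = zip.getEntry("[Content_Types].xml");
            if (entry == null) {
                System.out.println("No [Content_Types].xml found at the package root");
                return;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                findEmpty(in).forEach(System.out::println);
            }
        }
    }
}
```

Running it with the path of the PBIX file as the only argument prints one line per affected entry.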

The equivalent test program, written in C# (.NET) with Visual Studio 2019, works fine.

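using System;
using System.IO;
using System.IO.Packaging;

void test(string filePath) {
  // Dispose the package when done so the file handle is released.
  using (Package pack = Package.Open(filePath, FileMode.Open)) {
    PackagePartCollection parts = pack.GetParts();
    foreach (PackagePart part in parts) {
      Console.WriteLine(part.Uri);
    }
  }
}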

The .NET source code supports empty content types:
https://github.com/dotnet/runtime/blob/main/src/libraries/System.IO.Packaging/src/System/IO/Packaging/ZipPackage.cs
Comment 1 Olivier Schmitt 2023-10-16 17:39:54 UTC
Created attachment 39148
C#.Net test program

Sample C# (.NET) test program for Visual Studio 2019. It works fine.