Bug 60741

Summary: NPOIFS move from read-only flag to 3-state enum
Product: POI Reporter: Nick Burch <apache>
Component: POIFSAssignee: POI Developers List <dev>
Status: NEW ---    
Severity: enhancement    
Priority: P2    
Version: 3.16-dev   
Target Milestone: ---   
Hardware: PC   
OS: All   

Description Nick Burch 2017-02-17 13:23:59 UTC
Currently, in NPOIFS (NPOIFSFileSystem + POIFSFileSystem, but not OPOIFSFileSystem), you can either load from a File or from a Stream. With a
Stream, everything gets buffered into memory

When loading from a File, you can load read-write or read-only. In
read-write mode, we use NIO to mmap the file so we can quickly + low memory
read + write.

The other case, File + read-only, isn't so ideal. We need to support both
people doing true read-only cases, eg reading or text extraction, plus
people doing read-template + write or read-change-save-as cases. To support
that, we buffer into memory.

Instead, if we changed the read-write state to an enum with 3 states, we could have:
 * Read/Write - unchanged
 * Read Only - mmap in read-only mode, text extraction etc only
 * Read as Template - mmap in private mode, changes buffered in memory, able to write out to a new file

That'd mean lower memory use, and more consistent code.

Naming TBC
Comment 1 Nick Burch 2017-02-17 13:24:21 UTC
Help needed with naming the enum and the enum values...
Comment 2 Greg Woolsey 2017-02-17 17:42:54 UTC
(In reply to Nick Burch from comment #1)
> Help needed with naming the enum and the enum values...

Perhaps

public enum WriteState {
    READ_ONLY,
    IN_PLACE,
    BUFFERED,
    ;
}

with appropriate JavaDocs indicating restrictions, performance implications, and example use cases.  The Enum and API method JavaDocs would note that this only applies to paths that read from files, not streams.
Comment 3 Andreas Beeker 2017-02-18 14:04:11 UTC
why invent something new, when there's already an enum [1]?
I'm not saying, to exactly use FileChannel.MapMode, but something look-a-like. I think the term "read_write" is more intuitive - not so sure about "private" ... but "buffered" could be also associated with something else.


[1] https://docs.oracle.com/javase/7/docs/api/java/nio/channels/FileChannel.MapMode.html