Bug 65946 - POIXMLPropertiesTextExtractor returns duplicate key for Core properties
Alias: None
Product: POI
Classification: Unclassified
Component: XSSF
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2022-03-09 20:22 UTC by Dhaval Sonawane
Modified: 2022-03-12 10:09 UTC

You can use this pptx to reproduce (31.93 KB, application/vnd.openxmlformats-officedocument.presentationml.presentation)
2022-03-09 20:22 UTC, Dhaval Sonawane

Description Dhaval Sonawane 2022-03-09 20:22:18 UTC
Created attachment 38224
You can use this pptx to reproduce

## Issue

The POIXMLPropertiesTextExtractor::getCorePropertiesText() method returns a duplicate key "Category".

See this line of code: https://github.com/apache/poi/blob/b6aee1ef6d3e92a28ffd4b5c03e677b63b43747f/poi-ooxml/src/main/java/org/apache/poi/ooxml/extractor/POIXMLPropertiesTextExtractor.java#L115

## To reproduce

1. Create a new pptx file in PowerPoint.
2. Add a category to the pptx metadata in PowerPoint (File > Properties > Summary > enter a value in Category).
3. Read the pptx in POI and print the metadata, for example:

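        // Open the presentation saved in the steps above
        FileInputStream stream = new FileInputStream(new File("myfile.pptx"));

        XMLSlideShow doc = new XMLSlideShow(stream);
        SlideShowExtractor<XSLFShape, XSLFTextParagraph> extractor = new SlideShowExtractor<>(doc);
        // Extract the document properties as text and print them
        String metaText = extractor.getMetadataTextExtractor().getText();
        System.out.println(metaText);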
4. Observe the duplicate "Category" value
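
To check for the duplicate without reading the output by eye, something like the following could be run on `metaText`. This is only a rough sketch: it assumes the extractor emits one property per line in the form `Key = value`, and the class name `DuplicateKeyCheck`, the helper `findDuplicateKeys`, and the sample string are hypothetical, not part of POI or taken from the attached file.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DuplicateKeyCheck {

    // Returns the keys that appear on more than one line, in order of first appearance.
    // Assumption: each property is printed on its own line as "Key = value".
    static List<String> findDuplicateKeys(String metaText) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : metaText.split("\\R")) {
            int sep = line.indexOf(" = ");
            if (sep < 0) {
                continue; // not a "Key = value" line, ignore it
            }
            counts.merge(line.substring(0, sep), 1, Integer::sum);
        }
        List<String> duplicates = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                duplicates.add(entry.getKey());
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Hypothetical sample standing in for the extractor output from step 3
        String sample = "Category = Demo\nCategory = Demo\nCreator = Someone\n";
        System.out.println(findDuplicateKeys(sample)); // prints [Category]
    }
}
```

With the bug present, passing the `metaText` from step 3 to `findDuplicateKeys` would be expected to return a list containing "Category"; after the fix it should return an empty list.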
Comment 1 PJ Fanning 2022-03-09 20:44:09 UTC
thanks - I added r1898804 - will add a test later
Comment 2 PJ Fanning 2022-03-09 20:57:15 UTC
added test with r1898805
Comment 3 Dhaval Sonawane 2022-03-09 22:32:50 UTC
Thank you for the quick fix!