Hi there, We have users uploading PDF files, and for some of them the extracted text loses doubled letters: a word like 'Mississauga' comes out as 'Misisauga'. This seems to occur only on some PDFs. Also, the issue is not present when the text is extracted from the original Word file.
For Apache PDFBox bugs, please use the PDFBox JIRA instance at https://issues.apache.org/jira/browse/PDFBOX
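When filing the report, it may help to confirm that the affected words really match the "doubled letter dropped" pattern rather than some other extraction problem. The following is a minimal sketch, not part of PDFBox: the class `DoubleLetterCheck` and its methods are hypothetical names used for illustration. It assumes you have the expected text (for example from the Word file) and the text extracted from the PDF as plain strings.

```java
// Hypothetical diagnostic helper; not a PDFBox API.
final class DoubleLetterCheck {
    private DoubleLetterCheck() {}

    /** Collapses every run of identical adjacent characters to one character. */
    static String collapseRuns(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Keep a character only if it differs from the one before it.
            if (i == 0 || c != s.charAt(i - 1)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    /**
     * Returns true when the extracted word differs from the expected word
     * and equals the expected word with all doubled letters collapsed.
     */
    static boolean looksCollapsed(String expected, String extracted) {
        return !expected.equals(extracted)
                && collapseRuns(expected).equals(extracted);
    }
}
```

For the example in this report, `DoubleLetterCheck.looksCollapsed("Mississauga", "Misisauga")` returns `true`. The check only detects the case where every doubled letter in the word was dropped; a word that lost only one of several doubles would need a looser comparison.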