Details
Improvement
Status: Closed
Major
Resolution: Invalid
2.0.26
None
OS: Ubuntu
Java: 16
Description
Hello,
I am running into the "No Unicode Mapping" warning shown by the PDFBox debugger. Similar to Apache DebugBar, I save the font glyphs to disk as images and then use an AI to detect which character each glyph represents. My goal is to update the font's Unicode mapping based on the AI results and then save the PDF.
Here is my plan: save an image of each glyph that has no Unicode mapping to disk, send each image to the AI for detection, and then update the font's Unicode mapping with the results. I found a helpful example on Stack Overflow (link: https://stackoverflow.com/questions/39485920/how-to-add-unicode-in-truetype0font-on-pdfbox-2-0-0), where the answer creates a COSStream to update the font's ToUnicode mapping. This approach seems suitable for my needs.
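To make the last step of this plan concrete, here is a minimal sketch in plain Java of how the text of a new ToUnicode CMap could be generated from the AI results. `ToUnicodeCMapBuilder` and its methods are hypothetical helpers, not part of PDFBox, and the sketch assumes a font with 2-byte character codes (for example a Type 0 font with Identity-H encoding); the keys of the map are character codes, the values are the detected text.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not a PDFBox class): builds the text of a ToUnicode CMap.
class ToUnicodeCMapBuilder {

    // Assumption: 2-byte character codes, so the code space is <0000>..<FFFF>.
    static String build(Map<Integer, String> codeToText) {
        // Sort by character code so the output is deterministic.
        TreeMap<Integer, String> sorted = new TreeMap<>(codeToText);
        List<Map.Entry<Integer, String>> entries = new ArrayList<>(sorted.entrySet());

        StringBuilder sb = new StringBuilder();
        sb.append("/CIDInit /ProcSet findresource begin\n");
        sb.append("12 dict begin\n");
        sb.append("begincmap\n");
        sb.append("/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def\n");
        sb.append("/CMapName /Adobe-Identity-UCS def\n");
        sb.append("/CMapType 2 def\n");
        sb.append("1 begincodespacerange\n<0000> <FFFF>\nendcodespacerange\n");

        // bfchar blocks are conventionally limited to 100 entries, so emit chunks.
        for (int start = 0; start < entries.size(); start += 100) {
            int end = Math.min(start + 100, entries.size());
            sb.append(end - start).append(" beginbfchar\n");
            for (Map.Entry<Integer, String> e : entries.subList(start, end)) {
                sb.append(String.format("<%04X> <%s>\n", e.getKey(), utf16Hex(e.getValue())));
            }
            sb.append("endbfchar\n");
        }

        sb.append("endcmap\n");
        sb.append("CMapName currentdict /CMap defineresource pop\n");
        sb.append("end\nend\n");
        return sb.toString();
    }

    // Encodes text as UTF-16BE hex digits, the form used for bfchar destinations.
    static String utf16Hex(String text) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_16BE)) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        Map<Integer, String> detected = new HashMap<>();
        detected.put(0x0003, " "); // example values only, not taken from the attached PDF
        detected.put(0x0024, "A");
        System.out.print(build(detected));
    }
}
```

The resulting text would then have to be written into a stream and stored under the font dictionary's /ToUnicode key. In PDFBox 2.0.x that presumably means writing the bytes through `COSStream.createOutputStream()` and calling `setItem(COSName.TO_UNICODE, stream)` on `font.getCOSObject()`, along the lines of the linked answer.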
I want to retrieve the ToUnicode text as shown in that question, modify it to fix the font's Unicode mapping, and then update the font. However, I am unsure how to obtain the ToUnicode text programmatically (the same text view that the PDF debugger shows).
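For illustration, the "modify the text" step could look roughly like the plain-Java sketch below, once the ToUnicode text is available as a string. In PDFBox 2.0.x that text presumably comes from the stream stored under `COSName.TO_UNICODE` in `font.getCOSObject()`, decoded via `COSStream.createInputStream()`. `ToUnicodeCMapPatcher` and its methods are hypothetical names, and the sketch only understands `bfchar` entries (it ignores `bfrange` blocks).

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a PDFBox class): inspects and patches ToUnicode CMap text.
class ToUnicodeCMapPatcher {

    private static final Pattern BFCHAR_BLOCK =
            Pattern.compile("beginbfchar(.*?)endbfchar", Pattern.DOTALL);
    private static final Pattern BFCHAR_ENTRY =
            Pattern.compile("<([0-9A-Fa-f]+)>\\s*<([0-9A-Fa-f]+)>");

    // Returns the existing bfchar mappings as upper-case hex: code -> UTF-16BE value.
    static Map<String, String> readBfChars(String cmapText) {
        Map<String, String> mappings = new LinkedHashMap<>();
        Matcher block = BFCHAR_BLOCK.matcher(cmapText);
        while (block.find()) {
            Matcher entry = BFCHAR_ENTRY.matcher(block.group(1));
            while (entry.find()) {
                mappings.put(entry.group(1).toUpperCase(Locale.ROOT),
                             entry.group(2).toUpperCase(Locale.ROOT));
            }
        }
        return mappings;
    }

    // Adds one mapping as its own bfchar block just before the final "endcmap".
    // Intended for codes that have no mapping yet; duplicates are not resolved here.
    static String addBfChar(String cmapText, String codeHex, String unicodeHex) {
        int idx = cmapText.lastIndexOf("endcmap");
        if (idx < 0) {
            throw new IllegalArgumentException("no endcmap keyword found");
        }
        String block = "1 beginbfchar\n<" + codeHex + "> <" + unicodeHex + ">\nendbfchar\n";
        return cmapText.substring(0, idx) + block + cmapText.substring(idx);
    }

    public static void main(String[] args) {
        // Shortened example text, not taken from the attached PDF.
        String cmap = "1 beginbfchar\n<0003> <0020>\nendbfchar\nendcmap\n";
        String patched = addBfChar(cmap, "0024", "0041");
        System.out.println(readBfChars(patched));
    }
}
```

Appending a separate block keeps the existing entries untouched, which seems the least invasive option when only the unmapped glyphs need fixing.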
Can anyone provide assistance on how to address this issue? Any help would be greatly appreciated.
A sample PDF file is attached.