Description
A long-standing issue is the performance of the existing default TagSoupParser.java. A number of open issues also stem from limitations in the way NekoHTML parses HTML5, for example ANY23-317, ANY23-273 and ANY23-267; there are several others.
I propose to mark the TagSoupParser.java implementation as @Deprecated for the next release (possibly making its use configurable via default-configuration.properties) and to replace it with jsoup (https://jsoup.org/). AFAIK, Apache Tika also did this several years ago.
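To make the "configurable" part of the proposal concrete, here is a minimal sketch of how a flag in default-configuration.properties could select the parser. Everything named here is an assumption for illustration: the property key any23.tagsoup.legacy.enabled, the ParserSelection class and the HtmlParser enum do not exist in Any23, and defaulting to jsoup when the flag is absent is just one possible choice.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class ParserSelection {

    /** Hypothetical property key; not an existing Any23 setting. */
    public static final String LEGACY_KEY = "any23.tagsoup.legacy.enabled";

    /** Hypothetical labels for the two candidate HTML parsing back ends. */
    public enum HtmlParser { TAGSOUP_NEKOHTML, JSOUP }

    /**
     * Returns the deprecated NekoHTML-based parser only when the flag is
     * explicitly set to "true"; otherwise falls back to jsoup (assumed default).
     */
    public static HtmlParser selectParser(Properties config) {
        boolean legacy = Boolean.parseBoolean(config.getProperty(LEGACY_KEY, "false"));
        return legacy ? HtmlParser.TAGSOUP_NEKOHTML : HtmlParser.JSOUP;
    }

    /** Parses properties-file text, standing in for default-configuration.properties. */
    public static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Flag set: keep using the deprecated implementation.
        System.out.println(selectParser(load("any23.tagsoup.legacy.enabled=true")));
        // Flag absent: use the proposed replacement.
        System.out.println(selectParser(load("")));
    }
}
```

With a switch like this, users who depend on the current behaviour could opt back in for one release while the jsoup-based implementation becomes the default.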