The best solution I've found is to parse your string with cyberneko and do some "clever" SAX event handling.
cyberneko will parse your HTML even if it's invalid, which is the case for the vast majority of the HTML you're likely to encounter in the wild.
If you register a custom ContentHandler that ignores everything but the character events and appends those to a string builder, you'll get a good first approximation, with one annoying flaw: words separated only by a block element end up concatenated (the words "for" and "example" on either side of a block tag come out as forexample).
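Here is a minimal sketch of that first approximation, using only the standard org.xml.sax types. The class name CharactersOnlyHandler and the extract helper are made up for illustration. I'm assuming cyberneko's org.cyberneko.html.parsers.SAXParser is what you'd pass in as the XMLReader, so check that against the version you use.

```java
import java.io.IOException;
import java.io.StringReader;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical name: keeps character data and ignores every other SAX event.
class CharactersOnlyHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    String getText() {
        return text.toString();
    }

    // Works with any SAX XMLReader. Assumption: with cyberneko on the classpath
    // you would pass new org.cyberneko.html.parsers.SAXParser() as the reader.
    static String extract(XMLReader reader, String html) throws IOException, SAXException {
        CharactersOnlyHandler handler = new CharactersOnlyHandler();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(html)));
        return handler.getText();
    }
}
```

Since element events are dropped entirely, nothing ends up between the text of one element and the text of the next, which is exactly where the concatenation comes from.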
A better solution is to get a list of all block elements, and have your ContentHandler listen to startElement events. If the element is a block one, just append a space character to your string builder.
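A rough sketch of the block-aware handler could look like the following. The class name BlockAwareTextHandler and the element list are my own assumptions: the list is only an illustrative subset of HTML block elements, so extend it to whatever you need. Names are compared in lower case because, as far as I know, cyberneko reports element names in upper case by default. The trim() in getText is also my addition, to drop the space the first block element produces.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical name: collects text and inserts a space at each block element.
class BlockAwareTextHandler extends DefaultHandler {
    // Illustrative subset of block-level elements, not an exhaustive list.
    private static final Set<String> BLOCK_ELEMENTS = new HashSet<>(Arrays.asList(
        "address", "blockquote", "div", "dl", "dt", "dd", "fieldset", "form",
        "h1", "h2", "h3", "h4", "h5", "h6", "hr", "li", "ol", "p", "pre",
        "table", "tr", "td", "th", "ul"));

    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // Fall back to qName when the parser does not supply a local name.
        String name = (localName == null || localName.isEmpty()) ? qName : localName;
        if (BLOCK_ELEMENTS.contains(name.toLowerCase(Locale.ROOT))) {
            text.append(' ');
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    String getText() {
        // Trim the space added by the very first block element.
        return text.toString().trim();
    }
}
```

You register it the same way as any other ContentHandler, via setContentHandler on the parser, and read the result with getText() once parsing is done. Inline elements such as b or span are not in the set, so text inside them stays glued to its neighbours.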
Note that while this seems to work fine, it might not be perfect for your use case.
A <br> element is not, for example, turned into a line break. It shouldn't be too much work to add this if it's required, though.