I'm downloading HTML from a website. The file can be quite large, so while it is still downloading I want to parse the chunks of HTML that have already arrived, so that the process appears faster to the end user of my program. I don't have control over how the chunks are generated, so a chunk can begin or end in the middle of a tag or a word, e.g. like so:
chunk 1 --->
XKCD
...and so on.
I have seen examples where libxml2 was used to parse XML chunks exactly as I described. Can libxml2 parse HTML chunks in the same way, even if the HTML is not perfectly clean? I have run tidy on the HTML files I'm going to be downloading; it reports warnings but no errors.
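
To make the intended flow concrete, here is a minimal sketch of the chunk-by-chunk feeding I have in mind. It uses Python's standard `html.parser` purely for illustration, not libxml2. The chunk contents, the `TagAndTextCollector` class, and the `parse_in_chunks` helper are all made up for this example.

```python
from html.parser import HTMLParser


class TagAndTextCollector(HTMLParser):
    """Hypothetical handler: records start tags and text as they are parsed."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        # Text may arrive in several pieces when a chunk ends mid-word.
        self.text_parts.append(data)


def parse_in_chunks(chunks):
    """Feed each chunk to the parser as soon as it is available."""
    parser = TagAndTextCollector()
    for chunk in chunks:
        parser.feed(chunk)  # incomplete tags are buffered until the next chunk
    parser.close()          # flush whatever is still buffered
    return parser.tags, "".join(parser.text_parts)


# Made-up chunks whose boundaries fall inside a tag and inside words.
chunks = ["<html><bo", "dy><h1>XK", "CD</h1><p>Hel", "lo</p></body></html>"]
tags, text = parse_in_chunks(chunks)
print(tags)
print(text)
```

What I'm after is the same result as if the whole document had been handed to the parser in one piece, regardless of where the chunk boundaries fall.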