The QWebPage cannot know the final result of loading the page HTML until it has loaded all the external scripts, CSS, and images. Since your code never returns to the event loop, the QWebPage never gets a chance to start loading these. I suspect, therefore, that the HTML document remains empty until control returns to the event loop and the load completes.
Do your parsing work in a slot connected to the loadFinished() signal. If you need to, use QWebSettings to disable JavaScript, image loading, and so on.
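
A rough sketch of that approach, in case it helps. It assumes Qt 5 with the QtWebKitWidgets module (with Qt 4 you would need a QObject subclass with a real slot instead of a lambda), and the URL is only a placeholder. I have not seen your code, so treat the structure as illustrative rather than a drop-in fix.

```cpp
// Sketch only: assumes Qt 5 with the QtWebKitWidgets module.
// The URL below is a placeholder, not taken from the question.
#include <QApplication>
#include <QDebug>
#include <QString>
#include <QUrl>
#include <QWebFrame>
#include <QWebPage>
#include <QWebSettings>

int main(int argc, char *argv[])
{
    QApplication app(argc, argv);  // QWebPage needs a QApplication

    QWebPage page;

    // Optional: turn off features you do not need before starting the load.
    page.settings()->setAttribute(QWebSettings::JavascriptEnabled, false);
    page.settings()->setAttribute(QWebSettings::AutoLoadImages, false);

    // Do the parsing only once the page reports that loading has finished.
    QObject::connect(&page, &QWebPage::loadFinished, [&page, &app](bool ok) {
        if (ok) {
            const QString html = page.mainFrame()->toHtml();
            qDebug() << "Loaded" << html.size() << "characters of HTML";
            // ... parse html here ...
        } else {
            qWarning() << "Page failed to load";
        }
        app.quit();  // leave the event loop once we are done
    });

    // load() only starts the request and returns immediately.
    page.mainFrame()->load(QUrl(QStringLiteral("http://example.com/")));

    // The actual loading happens while the event loop runs.
    return app.exec();
}
```

The key point is that load() returns straight away and nothing is fetched until app.exec() is running, which is why reading toHtml() right after load() gives you an empty document.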