2025, Dec 26 05:00

Stop QWebEngineView caching issues: handle setHtml asynchronously with loadFinished or setUrl data URLs

Troubleshoot PyQt QWebEngineView showing outdated content: it's timing, not caching. Use loadFinished, setUrl data URLs, and disable cache and cookies.

Switching a PyQt application from QTextBrowser to QtWebEngine often feels like a drop-in replacement—until the content on screen starts lagging behind your data. If you construct HTML dynamically and feed it to QWebEngineView with setHtml(), you might see pages randomly showing the wrong text. It looks like caching. It smells like caching. But most of the time, it’s timing.

Repro in miniature

Below is a minimal example that renders inline HTML with QWebEngineView via setHtml(). It’s intentionally straightforward to illustrate the common setup that can mislead you into thinking the engine is caching stale content.

import sys
from PyQt5.QtWidgets import QApplication, QMainWindow
from PyQt5.QtWebEngineWidgets import QWebEngineView

class AppFrame(QMainWindow):
    def __init__(self):
        super().__init__()
        self.viewer = QWebEngineView()
        html_doc = """
        <html><body><h1>Inline HTML</h1><p>Rendered via setHtml().</p></body></html>
        """
        self.viewer.setHtml(html_doc)
        self.setCentralWidget(self.viewer)

if __name__ == "__main__":
    app = QApplication(sys.argv)
    wnd = AppFrame()
    wnd.show()
    sys.exit(app.exec_())

What’s actually going on

QTextBrowser loads HTML synchronously, so calling setHtml() means the content is immediately ready to render. QWebEngineView is different. When setHtml() is used, the engine treats the HTML document itself as loaded immediately, while external resources are handled asynchronously. As the documentation explains:

The HTML document is loaded immediately, whereas external objects are loaded asynchronously.

This is where the mismatch appears. If your UI logic assumes the new content is ready right after setHtml() (for example, by issuing refreshes or chaining UI updates), the page may not have completed its load lifecycle yet. To make things trickier, loadFinished() can emit with success = false when the HTML passed to setHtml() is too large: per the Qt documentation, setHtml() percent-encodes the document into a data: URL, and Chromium rejects URLs over 2 MB. Meanwhile, QTextBrowser continues to behave predictably because it is not asynchronous in this way.
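Since the Qt documentation describes setHtml() as turning the document into a percent-encoded data: URL, a size check before the call can flag an oversized page early. The following is a rough sketch, not Qt's own logic: the helper names are hypothetical, the prefix string is illustrative, and urllib.parse.quote only approximates whatever encoding Qt applies internally.

```python
from urllib.parse import quote

# Assumption: the 2 MB URL limit described in the Qt documentation for setHtml().
MAX_URL_CHARS = 2 * 1024 * 1024

# Assumption: the exact prefix Qt prepends may differ; this one is illustrative.
DATA_URL_PREFIX = "data:text/html;charset=UTF-8,"


def estimate_sethtml_url_length(html: str) -> int:
    """Approximate the length of the data: URL that setHtml() would navigate to."""
    # Percent-encode every byte that is not an unreserved URL character.
    encoded = quote(html.encode("utf-8"), safe="")
    return len(DATA_URL_PREFIX) + len(encoded)


def fits_sethtml_limit(html: str) -> bool:
    """Return True if the estimated URL stays under the assumed limit."""
    return estimate_sethtml_url_length(html) < MAX_URL_CHARS
```

Markup-heavy documents grow quickly under this encoding, because characters such as <, >, quotes, and spaces each become three characters. A page well under 2 MB on disk can therefore still fail.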

Start with the obvious: disable caching properly

Before addressing timing, it’s still worth ensuring HTTP cache and cookies don’t get in the way. QtWebEngine exposes knobs for both. Use the named enums rather than raw integers, clear the cache manually, and drop cookies if needed.

from PyQt5.QtWebEngineWidgets import QWebEngineView, QWebEngineProfile

# somewhere in your widget setup
engine_prof = self.viewer.page().profile()
engine_prof.setHttpCacheType(QWebEngineProfile.NoCache)
engine_prof.setPersistentCookiesPolicy(QWebEngineProfile.NoPersistentCookies)

# extra safety: clear both cache and cookies
engine_prof.clearHttpCache()
engine_prof.cookieStore().deleteAllCookies()

This ensures you are not fighting stale HTTP state. If problems persist, you are almost certainly looking at an async sequencing issue rather than caching.

Make async explicit with loadFinished

When using QWebEngineView, wire your logic to loadFinished so that subsequent actions occur only after the engine signals the page is ready. Listening on the view works reliably in practice.

import sys
from PyQt5.QtWidgets import QApplication, QMainWindow
from PyQt5.QtWebEngineWidgets import QWebEngineView

class MainWin(QMainWindow):
    def __init__(self):
        super().__init__()

        self.html_view = QWebEngineView()
        self.html_view.loadFinished.connect(self.on_render_ready)

        payload = """
        <html><body><p>Ready when loadFinished fires</p></body></html>
        """
        self.html_view.setHtml(payload)
        self.setCentralWidget(self.html_view)

    def on_render_ready(self, ok):
        print("Load completed:", ok)
        # trigger next steps here, not right after setHtml()

if __name__ == "__main__":
    app = QApplication(sys.argv)
    win = MainWin()
    win.show()
    sys.exit(app.exec_())

This simple change prevents race conditions where a refresh or follow-up rendering step runs before the engine is done.
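When content changes faster than pages load, one way to apply the same idea is to hold back new HTML until loadFinished has fired for the previous load. The sketch below is one possible arrangement, not an API that Qt provides: LatestWinsLoader and its methods are hypothetical names, and the class is deliberately free of Qt imports so it takes any callable that accepts an HTML string (for example, the view's setHtml).

```python
from typing import Callable, Optional


class LatestWinsLoader:
    """Serialize HTML updates so a new load starts only after the previous one finishes."""

    def __init__(self, set_html: Callable[[str], None]) -> None:
        self._set_html = set_html            # e.g. view.setHtml
        self._loading = False                # True while a load is in flight
        self._pending: Optional[str] = None  # newest HTML waiting for its turn

    def show(self, html: str) -> None:
        """Request that html be displayed; defer it if a load is in progress."""
        if self._loading:
            self._pending = html  # overwrite: only the newest request matters
            return
        self._loading = True
        self._set_html(html)

    def on_load_finished(self, ok: bool) -> None:
        """Connect this to loadFinished; it starts the deferred load, if any."""
        self._loading = False
        if self._pending is not None:
            html, self._pending = self._pending, None
            self.show(html)


# Hypothetical wiring inside the window class from the example above:
#   self.loader = LatestWinsLoader(self.html_view.setHtml)
#   self.html_view.loadFinished.connect(self.loader.on_load_finished)
#   self.loader.show(payload)
```

Intermediate updates are dropped on purpose: if three documents arrive while one is loading, only the last is shown next. That suits a view that should always reflect the latest data, but not one that must display every step.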

Prefer setUrl() with a data URL when you need stronger guarantees

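Another approach is to avoid setHtml() and feed the content as a data URL instead. The data is the same, but in practice setUrl() tends to be a more reliable trigger for loadFinished(), which helps sequencing. Keep in mind that base64 grows the payload by about a third, so Chromium's 2 MB URL length limit is reached sooner with large documents. The example below demonstrates inline CSS and JavaScript shipped via a base64-encoded data URL.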

import sys
import base64
from PyQt5.QtWidgets import QApplication, QMainWindow
from PyQt5.QtWebEngineWidgets import QWebEngineView
from PyQt5.QtCore import QUrl

class Shell(QMainWindow):
    def __init__(self):
        super().__init__()

        self.viewer = QWebEngineView()
        self.viewer.loadFinished.connect(self.after_load)

        html_src = """
        <!DOCTYPE html>
        <html lang="en">
        <head>
            <meta charset="UTF-8">
            <title>Data URL demo</title>
            <style>
                #toggle { background: #c00; color: #fff; padding: 10px 20px; border: 0; }
                #toggle.blue { background: #06c; }
            </style>
        </head>
        <body>
            <button id="toggle">color toggle</button>
            <script>
                const btn = document.getElementById('toggle');
                btn.addEventListener('click', () => {
                    btn.classList.toggle('blue');
                });
            </script>
        </body>
        </html>
        """

        b64 = base64.b64encode(html_src.encode("utf-8")).decode("utf-8")
        url = QUrl(f"data:text/html;base64,{b64}")

        # setUrl() is a more reliable trigger for loadFinished than setHtml()
        self.viewer.setUrl(url)
        self.setCentralWidget(self.viewer)

    def after_load(self, ok):
        print("Load finished:", ok)

if __name__ == "__main__":
    app = QApplication(sys.argv)
    wnd = Shell()
    wnd.show()
    sys.exit(app.exec_())

This keeps your application self-contained while giving you predictable load signaling for post-render work.
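If several views need this conversion, the encoding step can be pulled into a small helper that also guards against overly long URLs. This is a minimal sketch under stated assumptions: html_to_data_url is a hypothetical name, the limit constant assumes Chromium's 2 MB URL cap applies to data URLs passed to setUrl(), and the charset parameter in the prefix is an addition not present in the example above.

```python
import base64

# Assumption: Chromium's 2 MB URL cap also applies to data: URLs given to setUrl().
MAX_URL_CHARS = 2 * 1024 * 1024


def html_to_data_url(html: str) -> str:
    """Encode html as a base64 data: URL string, or raise if it is too long."""
    b64 = base64.b64encode(html.encode("utf-8")).decode("ascii")
    url = f"data:text/html;charset=utf-8;base64,{b64}"
    if len(url) >= MAX_URL_CHARS:
        raise ValueError(
            f"data URL is {len(url)} characters; assumed limit is {MAX_URL_CHARS}"
        )
    return url


# Hypothetical usage with the view from the example above:
#   self.viewer.setUrl(QUrl(html_to_data_url(html_src)))
```

Because base64 emits four characters for every three input bytes, roughly 1.5 MB of HTML is enough to reach the assumed limit. For larger documents, writing the HTML to a temporary file and loading it by file URL may be a better fit.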

When upgrading helps

If you have the option, consider moving from PyQt5 to PyQt6. Support for the 5.x line is ending at the end of May, and Qt 6 provides callbacks and additional techniques that make handling WebEngine lifecycles more ergonomic. The functional ideas above still apply, but you get a more modern surface to implement them.

Why this matters

Rendering pipelines that appear to “randomly” show outdated content are often victims of asynchronous boundaries. In desktop UI code, it’s easy to carry over synchronous assumptions from widgets like QTextBrowser to QWebEngineView. Once you explicitly align your logic with loadFinished(), issues that look like caching bugs disappear. If you additionally harden your profile with NoCache and NoPersistentCookies and clear cache and cookies when appropriate, you avoid masking timing issues with real HTTP state.

Takeaways

Do not assume that calling setHtml() means the page is ready. Treat QWebEngineView as asynchronous, wire to loadFinished(), and only then perform follow-up work. If you need a stricter signal path, encode the HTML as a data URL and use setUrl(). Keep caching off with QWebEngineProfile.NoCache and QWebEngineProfile.NoPersistentCookies, and clear cache and cookies when you want a perfectly clean slate. With these practices in place, any dynamic HTML view will render the right content at the right time.