Wednesday, August 15, 2012

When a QThread isn't a thread...

The thread that doesn't thread...

Let's say you have some background work you need to get done. The work is pretty time-intensive, so you don't want to block the main UI thread while it's executing. To do this, you implement a worker thread like the one below that does the work once a second:
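# Assumption: PyQt4 bindings; with PySide, import from PySide.QtCore instead.
from PyQt4.QtCore import QThread, QTimer

class MyWorkerThread(QThread):
    def __init__(self):
        super(MyWorkerThread, self).__init__()

    def run(self):
        # Runs in the new thread: create a repeating 1-second timer here.
        self.timer = QTimer()
        print(self.timer.thread())
        self.timer.setSingleShot(False)
        self.timer.timeout.connect(self.doWork)
        self.timer.start(1000)
        self.exec_()  # enter this thread's event loop

    def doWork(self):
        print("Work!")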
On the surface this code seems sane. When the thread starts executing, we set up a QTimer that's going to run in the current thread's event loop. We connect our "doWork" function to the timeout signal. Then we do work in the worker thread's context... right?

If you try this out with any serious work, though, you'll find the main thread is actually getting blocked by the work that's being done. The GUI running in the main thread will not be responsive. It's almost as if threading itself has stopped working.

So what's happening? Well, there are two pieces of information we need to learn to figure this out. First is thread affinity. Thread affinity is described by Qt as the "thread the QObject lives in". According to the Qt docs, when a QObject is created, its "thread" pointer is set to the currently executing thread. Now here's the important question -- what's the currently executing thread when we create our worker thread?

Well, it can't be the worker thread, as it's not really created yet. The __init__ for the thread is run in the thread creating it. The QThread doesn't actually become, well, a thread until you call start() on it and it begins to run().

So a QThread object's affinity is the thread that creates it (unless you explicitly move it), not the thread it manages.
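
The same rule is easy to see outside Qt. Here's a minimal sketch using Python 3's standard threading module rather than QThread, so it illustrates the idea and is not Qt's machinery (RecordingThread is a made-up name). The constructor runs in whichever thread builds the object, and only run() executes in the new thread.

```python
import threading

class RecordingThread(threading.Thread):
    """Records which thread runs __init__ and which thread runs run()."""

    def __init__(self):
        super().__init__(name="worker")
        # Executed by whichever thread constructs this object.
        self.init_thread = threading.current_thread().name
        self.run_thread = None

    def run(self):
        # Only this method executes inside the newly started thread.
        self.run_thread = threading.current_thread().name

creator = threading.current_thread().name  # the thread building the object
t = RecordingThread()
t.start()
t.join()
print(t.init_thread, t.run_thread)
```

The two names differ: the object was set up by its creator, and only run() saw the new thread.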

OK, the next piece of information is what happens when we connect the signal. By default, the connection is what's known as an AutoConnection. When the signal is emitted, an AutoConnection compares the thread the signal is emitted from with the thread affinity of the receiving QObject. So which thread does the QTimer fire its signal from? Well, it's created while the worker thread is running, so it lives and fires in the worker thread. OK, but what about the receiver's affinity? Well, the receiver is the QThread itself. We just established that the QThread's affinity is the thread that creates it. So we have a signal going from the QTimer living in the worker thread to the QThread living in the main thread.

Because the two threads differ, the AutoConnection acts as a queued connection: it posts the signal as an event to the receiving thread's event queue. So the timeout signal is posted from the worker thread to the main thread. The main thread eventually gets to this event and figures out the slot to call, which in this case is doWork on our MyWorkerThread object. The main thread then calls "doWork" from its own event loop in response to receiving the signal.

The work is intensive. It takes a lot of time. And the main thread's event loop can't run because it's doing the work the worker thread should be doing. So none of the GUI events get processed, and the main thread effectively becomes starved.
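
To make the queued hand-off concrete, here's a toy model in plain Python 3, using only the standard library. It is not how Qt implements connections: ToyObject, emit and main_loop are names made up for illustration, and the rule inside emit is a simplified take on the AutoConnection behaviour described above. A "worker" thread emits toward an object that lives in the creating thread, and the slot ends up running in the creating thread.

```python
import queue
import threading

class ToyObject:
    """Remembers the thread that constructed it, mimicking QObject affinity."""

    def __init__(self, loop):
        self.home = threading.get_ident()  # id of the constructing thread
        self.loop = loop                   # queue drained by the home thread
        self.ran_in = []                   # names of threads that ran do_work

    def do_work(self):
        self.ran_in.append(threading.current_thread().name)

def emit(receiver, slot):
    """Simplified AutoConnection: direct if same thread, queued otherwise."""
    if threading.get_ident() == receiver.home:
        slot()
    else:
        receiver.loop.put(slot)

main_loop = queue.Queue()        # stands in for the creating thread's event queue
receiver = ToyObject(main_loop)  # built here, so it "lives" in this thread

# The worker thread plays the part of the QTimer firing inside run().
worker = threading.Thread(target=emit, args=(receiver, receiver.do_work),
                          name="worker")
worker.start()
worker.join()
ran_before_drain = list(receiver.ran_in)  # still empty: the call is only queued

# The creating thread "processes events" and ends up doing the work itself.
main_loop.get(timeout=5)()
print(ran_before_drain, receiver.ran_in)
```

Nothing runs while the worker is alive. The work only happens once the creating thread drains its queue, which is exactly the starvation described above.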

The solution? 

We need to make sure that the QObject doing the work lives in the worker thread. So it can't be the QThread itself. We need something like this:
class Worker(QObject):
    """ I'll do the work and live in the worker thread!"""
    def __init__(self):
        super(Worker, self).__init__()

    def doWork(self):
        print("Work")

class MyWorkerThread(QThread):
    """ I'm just going to set up the event loop and do
        nothing else..."""
    def __init__(self):
        super(MyWorkerThread, self).__init__()

    def run(self):
        self.timer = QTimer()
        self.worker = Worker()  # created here, so it lives in the worker thread
        print(self.timer.thread())
        self.timer.setSingleShot(False)
        self.timer.timeout.connect(self.worker.doWork)
        self.timer.start(1000)
        self.exec_()
So great, notice we created the Worker while the worker thread is executing. Our Worker and the QTimer have the same thread affinity (that is, they both live in the worker thread). Therefore, when timeout is fired, doWork is called directly in the worker thread's context.

Problem solved!

Some caveats...

You might want to control how the worker is created before you create the worker thread. Maybe you want to pass the worker into the worker thread, for example. But then wait, if we create it outside of this thread, it's not living in the worker thread, and we have the same problem again! Aaaah!

Luckily, QObject has a method, moveToThread, which takes a QThread and changes the object's affinity to the passed-in thread. This method is intended to let you push an object from the thread it currently lives in into another thread. It's important to note that pulling an object out of some other thread into yours is not thread-safe. So we can do this:
class MyWorkerThread(QThread):
    """ I'm just going to set up the event loop and do
        nothing else..."""
    def __init__(self, worker):
        super(MyWorkerThread, self).__init__()
        worker.moveToThread(self)  # push the worker into this thread
        self.worker = worker

    def run(self):
        self.timer = QTimer()
        print(self.timer.thread())
        self.timer.setSingleShot(False)
        self.timer.timeout.connect(self.worker.doWork)
        self.timer.start(1000)
        self.exec_()
Of course, you probably don't want to keep a reference to the passed-in worker in the code that creates the thread. You might call a method on it from outside the worker thread while it's doing work in the worker thread's context. Most likely that operation will not be thread-safe, and subtle bugs will ensue. Ideally, you don't want to keep the worker around in your worker thread's client code.

This is something that's easy to forget and hard to enforce, so another solution is to pass in a factory function that creates the worker for you instead of passing the worker itself. This removes the ability for the worker thread's owner/client to monkey with the worker in an unsafe way. So we'd then have something like this:
def createWorker():
    return Worker()

class MyWorkerThread(QThread):
    """ I'm just going to set up the event loop and do
        nothing else..."""
    def __init__(self, workerFactory):
        super(MyWorkerThread, self).__init__()
        self.workerFactory = workerFactory  # store the callable; build nothing yet

    def run(self):
        self.timer = QTimer()
        print(self.timer.thread())
        self.worker = self.workerFactory()  # created in the worker thread
        self.timer.setSingleShot(False)
        self.timer.timeout.connect(self.worker.doWork)
        self.timer.start(1000)
        self.exec_()
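
If you want to convince yourself of why the factory helps without firing up Qt, here's a small stand-in using Python 3's standard threading module. ToyWorker and FactoryThread are hypothetical names, and the "home" attribute only imitates QObject affinity by recording the constructing thread. An object built by the caller belongs to the caller's thread, while one built by the factory inside run() is born in the worker thread.

```python
import threading

class ToyWorker:
    """Records the thread that constructed it (a stand-in for affinity)."""

    def __init__(self):
        self.home = threading.current_thread().name

class FactoryThread(threading.Thread):
    def __init__(self, worker_factory):
        super().__init__(name="worker")
        self.worker_factory = worker_factory  # keep the callable, build nothing
        self.worker = None

    def run(self):
        # The factory is invoked here, so the worker is born in this thread.
        self.worker = self.worker_factory()

prebuilt = ToyWorker()             # built by the caller: lives with the caller
thread = FactoryThread(ToyWorker)  # a class works as a zero-argument factory
thread.start()
thread.join()
print(prebuilt.home, thread.worker.home)
```

The sketch reads thread.worker afterwards only to inspect it. In real client code the point of the factory is that you never touch the worker from outside.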

Final thoughts

It's interesting that when researching all this I discovered that for Qt 5, "Subclassing QThread is no longer recommended way of using QThread". Well, that's not surprising. When you think about it, since QThread lives in the thread that creates it, methods of QThread itself are a pretty unsafe place to do any work. We know run is executing in another context, so we can set up our workers and whatnot there, but other methods of the QThread run in whichever thread calls them, which is usually not the worker thread.

This is a little weird if you've ever dealt with other frameworks like MFC. For example, for CWinThread, you'd be used to attaching thread message handlers to methods of CWinThread for direct handling in that CWinThread's context. You get trained to expect that methods of your derived CWinThread are being executed in the CWinThread's context, not in some external context. Sure, you can always call into a CWinThread from another thread's context, but MFC developers know it's better to post to the other thread and have it deal with the event in its own context.

Classes inheriting from QThread don't work that way. And Qt signals/slots are different from Windows thread messages. So beware and keep your work out of QThread, delegating it to other objects that live in that QThread!

3 comments:

  1. I think the general information about the thread affinity of the QThread is great. But I can't help thinking how much the example you use doesn't make sense. Why would you use a QTimer and an event loop in the thread to do work at intervals? Why wouldn't you just loop, call doWork() and then self.usleep(1000)?
    Maybe if the goal was to show a signal slot example from a thread, you might have used a more realistic use case?
    Just something I kept thinking as I read through.

    Replies
    1. usleep will block and not reenter the event queue for the thread. Thus other signals won't get fired for the thread, so it's frequently not realistic to call self.usleep(1000) while processing another signal.

  2. I am new to threading and struggling to find a *real* example that actually does something (I am trying to move a directory of files).

    It would be -excellent- if you could give a complete working example.

    Please?
