I dislike optimizing without measuring, so I whipped up a quick test of QRunnable and QThreadPool that created and ran increasingly large batches of jobs. I figured others using QRunnable might be interested in these numbers as well, so here is the result of a typical run on my Thinkpad X201 (~3 years old):
============== Batch size: 1000
Run 1 : 15 milliS total @ 15 microS
Run 2 : 10 milliS total @ 10 microS
Run 3 : 10 milliS total @ 10 microS
Run 4 : 6 milliS total @ 6 microS
============== Batch size: 10000
Run 1 : 111 milliS total @ 11.1 microS
Run 2 : 111 milliS total @ 11.1 microS
Run 3 : 111 milliS total @ 11.1 microS
Run 4 : 110 milliS total @ 11 microS
============== Batch size: 100000
Run 1 : 1137 milliS total @ 11.37 microS
Run 2 : 1094 milliS total @ 10.94 microS
Run 3 : 1031 milliS total @ 10.31 microS
Run 4 : 1072 milliS total @ 10.72 microS
============== Batch size: 1000000
Run 1 : 11108 milliS total @ 11.108 microS
Run 2 : 10784 milliS total @ 10.784 microS
Run 3 : 11030 milliS total @ 11.03 microS
Run 4 : 11102 milliS total @ 11.102 microS
That is wonderfully linear performance as the batch size grows, which is exactly what we'd hope for. Each job takes ~10 microseconds to be created, run in a thread from the pool, and then deleted. This means that preventing job creation is going to give us a savings of ~10 microseconds per job. Put another way, the cost is ~1 second per 100,000 jobs on my laptop. Given that the overhead scales linearly at least to one million jobs, this kind of overhead simply gets lost in the noise and isn't worth being concerned about.
I tried a few variations in how the runs were handled, none of which made any significant difference. For instance, moving the deref of the QAtomicInt counter from the run() method to the destructor of the QRunnable subclass had no real effect, as one would expect. Excluding construction and destruction of the QRunnable objects did not significantly change the per-job numbers either; if you launch a swarm of a million jobs in sequence as in this test, you might consider keeping a pool of job objects to re-use, but that's a pretty specific use case. I also tried code slightly more analogous to what Sprinter actually does: creating the jobs and putting them in a QVector for bookkeeping purposes before running them. QVector's performance is such that this had no real impact on the numbers at all. Even looping once to insert the jobs into the QVector and a second time to run them wasn't significant.
In Sprinter, checking whether or not a job should be created is currently fast (~0.5 microseconds). This check is about to gain some additional complexity for more aggressive pre-match screening of runners, which will slow it down, though I don't yet know what the exact overhead will be; the code is not yet written, after all. I expect a few microseconds at least, but doubt it will be significant. Even if kept outside the jobs, this code would run in the main worker thread and not block the UI. Still, given the low cost of a QRunnable, I will be moving this check into the QRunnable job, as this simplifies the code in question and allows Sprinter to parallelize these checks.