A Line in the Sand
First, I'd like to discuss what constitutes an acceptable experience with KDE 4. There will always be some things, like desktop effects used in KWin, Plasma and KRunner, that simply require a decent video card. However, all of these programs should be runnable without those effects and retain an acceptable user experience even on older (within reason; e.g. not 15 years old ;) hardware. The good news is that they do, given decently supported hardware.
It is not acceptable if KDE 4 feels slower than KDE 3, and it is not acceptable if KDE 4 requires brand new hardware. At the same time, if the drivers for older hardware are simply not up to the task and those drivers aren't updated ... there's not much we can do about it and that hardware, which would otherwise be capable of better things, shouldn't be part of our target. It's also absolutely acceptable if certain features only work if the hardware can support them; this is mostly applicable to features reliant on more advanced graphics techniques.
I've seen the KDE 4 Plasma workspace as well as KDE 4 apps run smoothly on devices as small as the N810, on netbooks like the EEE PC, on older desktops and on new bling-bling laptops. The code base does scale well, but unfortunately it doesn't scale well everywhere ... yet. What gives?
Troubleshooting 101: Note the Variances
Have you ever noticed in these discussions how some people say KDE 4.x is "totally unusably slow" for them while others say the performance is just grand? Others say "it sucks!" and then tweak their graphics system (x.org, driver versions, driver config, kernel modules ...) and say "wow, waaay better!"?
Even if we normalize for differing expectations (one woman's "fast enough" might be another's "unacceptable"), we're still left with variance even with the same code base. This is a critical observation, because it means we can search for these variances and then work backwards from them towards possible solutions.
So let's go through a handful of these variances ...
Build Time Optimizations
There are some code paths which are god awful slow in a full debug build of some parts of KDE 4. The code that causes the slowness is great for debugging and troubleshooting, but can suck for performance. It's really hard to do proper "seat of your pants" performance measurements when we're dealing with debug builds. This has bitten me in the past, and I know it's bitten others as well.
It's one of those occasional and unfortunate trade-offs: a bit of the user experience is diminished by running a full debug build that's better for development.
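To make the trade-off concrete, here is a hedged sketch of what switching between build types can look like with CMake (the source and install paths are placeholders; `Debugfull` is a build type added by KDE's own CMake modules, while `RelWithDebInfo` is standard CMake):

```shell
# Hypothetical paths; adjust to your own checkout and install prefix.

# Full debug build: no optimization, all debug code enabled.
# Great for development, but misleading for performance impressions.
cmake ../kdebase -DCMAKE_BUILD_TYPE=Debugfull \
      -DCMAKE_INSTALL_PREFIX=$HOME/kde4

# Optimized build that still carries debug symbols, so profiler output
# stays readable while performance is much closer to a release build.
cmake ../kdebase -DCMAKE_BUILD_TYPE=RelWithDebInfo \
      -DCMAKE_INSTALL_PREFIX=$HOME/kde4

make && make install
```

The point is simply that the same source tree can behave very differently depending on which of these you configured, so it's worth knowing which one you're running before drawing performance conclusions.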
This is also why "seat of your pants" performance measurements are not overly useful. What do I mean by that? Here are some examples: "I ran this program and it just feels slow"; "When I scroll in Konqueror using the arrow keys, scrolling seems really jerky." These observations can be useful as starting points: they let us know what might actually be useful to work on as far as improving the user's experience goes.
But to really know what is going on we need to profile the application. Often it's not what you think, and profiling can also let you know when it's debug code that's causing the problem without having to keep a separate release build around. So don't jump to performance conclusions based on "seat of your pants" observations; use them only as clues for where to start.
Valgrind and KCachegrind make profiling stupidly simple and very powerful. Even I can do it. ;)
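As a minimal sketch of such a profiling session (the application name is just a placeholder):

```shell
# Run the application under callgrind. Execution will be much slower
# than normal, but every function call and its cost gets recorded.
valgrind --tool=callgrind myapp      # 'myapp' is a placeholder

# Callgrind writes a callgrind.out.<pid> file in the current directory.
# Open it in KCachegrind to browse the call tree and per-function costs.
kcachegrind callgrind.out.*

# Or get a quick text summary without the GUI:
callgrind_annotate callgrind.out.*
```

A few minutes in KCachegrind's call tree usually tells you far more than hours of "it feels slow" observation.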
Areas of Performance
Ok, this is probably stupidly obvious for many people, but there are several parts of the computer that can impact performance negatively.
Main memory (RAM) usage can be one: if main memory fills up for whatever reason (applications using it legitimately, applications leaking, etc.) then things tend to fall back to disk swap, which is painfully slow; worse, the file cache is lost and all file access hits the disk, which is also painfully slow.
CPU over-utilization can also degrade performance (to state the obvious ;). Unlike memory usage, this usually means one of three things: the application is simply trying to do things the CPU is not capable of (rare these days, outside of truly high end applications); the algorithms used in the application are either buggy or plain inappropriate for the task and should be fixed or changed completely (most likely); there is something happening in software that should be happening in hardware (this is a typical problem with graphics). Again, profiling really helps here because it's often next to impossible to tell which it is without peering into the runtime execution paths. Sometimes something you think is happening on the graphics hardware isn't, because of unexpected behavior in the toolkit or other pieces of software being used, for instance.
GPU under-utilization can degrade painting performance. This is probably one of our biggest weak points right now. Unlike immersive 3D games, which we expect to pin the CPU and/or GPU at 100% to get us the best framerates possible, that is taboo in productivity applications. But to get a smooth experience without hogging the CPU, hardware acceleration is required. Doing graphics on the CPU can all too quickly eat up most of the cycles available, leaving apps little room to do actual work.
The State of the Art Event Horizon Problem
One of the problems we are facing in KDE 4 right now is that we're trying to deliver a modern user experience. We aren't defining "modern" by "what $OTHER_COMPANY is doing today" but rather by "what matches with our vision of kick ass". There's certainly inspiration and lessons we draw here and there from various companies and other FOSS projects ... but we aren't limiting ourselves to that.
In the process of creating this modern user experience, we are doing things that just haven't been done in the FOSS desktop arena, or in some cases in any production user interface arena. This pushes us into areas we haven't gone before.
The Plasma performance issues are a great example of that. There have been a number of graphics drivers out there that simply have issues with applications that use translucency (sometimes referred to as "argb visuals"). They work fine with compositing window managers, but that's because those use translucency in a rather different way. x.org itself has/had numerous straight out bugs when it came to argb visuals. I demo'd these issues for various x.org developers and every single one of them I showed it to was surprised about the problems.
Why is this? Well, when Plasma came onto the scene there weren't any other production apps doing to the graphics stack in x.org what Plasma was. Untested code equals unfound bugs and unknown performance boundaries.
Plasma isn't the only area we're running into this kind of "event horizon" issue.
x.org and Graphics, or How I Hope It Won't Do Graphics In the Future
We also deal with things right now like XRender, which is supposed to be a way to accelerate certain common rendering activities. Things like text rendering, which is used a lot in applications like web browsers, mail readers, word processors, etc. The problem with XRender is that it is designed in a way that doesn't map all that wonderfully to how graphics hardware tends to work. So while there is decent acceleration for some or most of XRender in some graphics drivers, it's abysmal in others and never really reaches the full potential we should be seeing, due to design issues.
This isn't the only issue in x.org, but it sort of highlights one of the big ones: x.org has some pretty big issues when it comes to doing graphics. That's why nVidia includes in their driver a rewrite of pretty much every bit of x.org that touches graphics. This in turn causes havoc of a new variety: does nVidia's twinview map nicely to xrandr/xinerama or does it get screwed up? (Answer: often the latter.) Issues that get addressed in x.org need to also be fixed in the nVidia driver if they exist there too, and vice versa. It's just not pretty.
This is one of the primary reasons why I'm very excited about Gallium3D: it's a modern graphics stack done by graphics gurus and designed for the real world of hardware. I've seen it in action, and it's impressive.
If the FOSS desktop world is smart, the future of x.org will be as an application windowing protocol (window management, for instance) and an event system. All hardware support will move into kernel drivers where it belongs (this is already happening / has happened), and graphics itself will be handled by Gallium3D or something like it.
In the meantime, we're a bit stuck working around and within the limitations of today's x.org reality.
Qt4: Room For Improvement
Qt4 itself has room for improvement. There are graphics paths that do things in software that should be done either smarter in software or in hardware. A proper OpenGL paint engine for Qt4 would also rock the house considerably. Note that Qt works very well with OpenGL-in-a-widget, but that's not the same thing. I'm talking about all QPainter operations happening in hardware where available, which is very nearly everywhere these days; even today's mobile devices can manage this, something driven largely by power usage. We don't have that yet, but when we do things will be nicer.
Before that happens, though, there is still lots of room for improvement in Qt4. I've talked with many of the engineers at Qt Software about these things and they have been working a lot on performance in Qt 4.3 and 4.4 with even more to come in 4.5. Performance is not an easy (or fun) issue, and with all the new code and approaches taken in Qt4 it is taking time to really get it slick-as-snot fast.
This isn't to say that it's slow right now, mind you. It's rather impressive right now given all the features it supports, but it's just not as fast as it could be. Thankfully this is a known issue and one that Qt Software people care about enough to actually put substantial and continuous effort into. As these efforts come to fruition, KDE 4 will also "magically" improve in performance.
And Where is KDE's Responsibility In All This?
So I've picked on the whole stack underneath KDE 4 by this point in the blog entry. I don't want to give the false impression, however, that KDE has no culpability in the current situation.
There are a few truths to keep in mind that impact KDE 4's performance:
- For a lot of us, this is the first time we've used these new techniques and we have more to learn about how to use them best.
- There is a lot of very new code in KDE 4, and new code means both new bugs as well as new unoptimized paths.
- We can work around some limitations lower in the stack better than we do now, though sometimes we need to be patient and let the stack catch up with us.
Just as with Qt4, and perhaps even more so in places, we have a good amount of optimization work ahead of us. We worked for years on optimizing KDE 3 in various ways, and while we carry much of that work with us into KDE 4 we also have huge new sections of code to repeat this effort on.
Bringing the KDE 4 codebase to maturation will help improve performance, and that's something that can only be accomplished over time. Thankfully, we're doing just that. How fast are we doing that? That's something we can only measure in performance deltas between releases.
So ... what does all this mean?!
It means that there is work that needs to be done at every level in the stack. We've taken the FOSS desktop to a whole new level with KDE 4, and we're pushing it even further with each subsequent release. By setting that bar higher than it has been in the past, we've created new areas of work for ourselves.
But because that work is spread out across the stack, it means that there will be vast variance in user experience right now: something as small as a driver revision upgrade can make all the difference in some cases. It also means that we can't just look at the user experience and point a finger at any one thing, e.g. "KDE 4 is slow, so fix it KDE team!" Profiling is critical, so we can pinpoint whether the problem exists in KDE code, Qt code, x.org, drivers ... or elsewhere. Then we need to figure out whether it's best to wait for the stack to catch up (and communicate these pain points to the owners of those parts of the stack) or if it is better to change something in KDE's code itself.
Also remember that there are any number of possible issues: text rendering speed, image manipulation performance, graphics in software vs hardware ...
If that sounds like it's not an easy problem, that's because it's not. It's a wicked problem, one with no single cause and no single solution. We need to track down these issues one by one, by profiling code and exercising the software on different hardware/software configurations, and improve them. It will be an incremental process, but it's doable.
It Works For Me
If you think that I'm displaying a lot or maybe even too much confidence, here's why I have that confidence:
Right now, KDE 4 flies on my laptop, and it's hardly a screamer by today's standards. So I know it's possible for KDE 4 to perform very well.
I also keep close tabs on work going on elsewhere in the stack and am very happy with its direction. Work is being done in pretty much all the areas where we need it in order to improve the event horizon issues.
I wish I could wave a magic wand and make KDE 4 work like a speed demon for everyone, but that's just not realistic right now. At least I know there is light at the end of this tunnel, and that things work well for a large number of people already as it is.
The alternative would be to turn around now and go back to a 1990s era desktop as we had in KDE 3. It was solid and stable, but had an extremely limited future for the general user base. Going back to something that has acceptable performance today for everyone but which is a dead end tomorrow isn't really an option, especially when we can get what we have now working as well or better than what we had. We should also avoid rose coloured glasses and remember that performance in KDE 3.0 also sucked, with noticeable improvements in each release; the same is happening with KDE 4.
So it's not a perfect world, but one that's getting incrementally better. I doubt that's news to anyone, and I hope that the above helps to at least add some substance to what is often a rather shallow discussion.