Starting with version 3.0, the .NET Framework provides two incompatible and unrelated graphics APIs, both aimed at general GUI application development: Windows Forms, which is based on GDI+, and Windows Presentation Foundation (WPF), which is based on DirectX.
WPF was developed for Windows Vista, whose new Desktop Window Manager (DWM) is likewise based on DirectX rather than GDI. The DWM is enabled by switching to the Aero desktop theme (the default on most editions), and disabled by switching to the Basic theme, which emulates Windows XP.
Rumor has it that Vista was originally intended to use WPF for its entire GUI, but the performance of the new API was not up to the task. Certainly, developers outside Microsoft have frequently criticized WPF for its sluggish performance, especially compared to GDI/GDI+ on Windows XP.
On this page I attempt to measure the performance of simple drawing operations in both WPF and Windows Forms (i.e. GDI+) under a variety of conditions. For comparison, I implemented the same operations in Java’s AWT (Abstract Window Toolkit). Hopefully the results will prove useful to other developers. The test application and its source code are available for download, so you can run your own tests and modify the test cases as desired.
Before moving on, I’d like to recommend Jeremiah Morrill’s Critical Deep Dive into the WPF Rendering System. This post is largely unrelated to the following discussion, but it’s a fascinating examination of WPF performance at the lowest level.
Attempting to measure the time WPF takes to fully render a window is surprisingly difficult. This is because two different threads collaborate in this task. To explain what that means, here’s a quick overview of how WPF shows things on the screen.
An immediate mode API can simulate a primitive sort of retained mode by drawing to a memory buffer which is later copied to the screen. Such buffering is frequently used for better performance or smoother animations. Our drawing test application provides test windows for both direct and buffered GDI+.
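As a rough illustration of that buffering idea, here is a minimal sketch in Java's AWT (the comparison API used on this page) rather than GDI+. It draws a triangle into an off-screen BufferedImage and then copies the finished buffer to a target surface in one step. The class and method names are hypothetical and not taken from the test application, and the "screen" is simply a second image so that the sketch runs without a display.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

class BufferedDrawingSketch {

    /** Draws one filled triangle into a new off-screen buffer (hypothetical helper). */
    static BufferedImage drawToBuffer(int width, int height) {
        BufferedImage buffer = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = buffer.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
            g.setColor(Color.BLUE);
            // All drawing goes to the memory buffer, not to the visible surface.
            g.fillPolygon(new int[] {200, 300, 100}, new int[] {100, 300, 300}, 3);
        } finally {
            g.dispose();
        }
        return buffer;
    }

    /** Copies the finished buffer to the target surface in a single operation. */
    static void present(BufferedImage buffer, Graphics2D target) {
        target.drawImage(buffer, 0, 0, null);
    }

    public static void main(String[] args) {
        BufferedImage buffer = drawToBuffer(400, 400);
        // Stand-in for the window surface so the sketch needs no display.
        BufferedImage screen = new BufferedImage(400, 400, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = screen.createGraphics();
        present(buffer, g);
        g.dispose();
        System.out.println("Center pixel: " + Integer.toHexString(screen.getRGB(200, 200)));
    }
}
```

In a real window, present would be called from the paint handler with the window's own graphics object, so the user only ever sees the completed image.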
Since WPF uses retained mode, all new display content passes through two stages before appearing on the screen: first internal preparation, then the actual rendering. WPF implements these stages as follows:
1. Preparation: This includes computing the sizes and layout of all WPF objects to be rendered, as well as recording the actual drawing operations. Any WPF methods that you call explicitly, for example within an OnRender override, are part of this stage.
All preparations are handled synchronously by the message loop running on the (usually single) GUI thread, which is accessible through the Dispatcher property. This is the same mechanism that transmits user input to Windows applications, and it’s the reason why WPF won’t update the display until your topmost event handler has returned. The GUI thread cannot process any drawing operations while it’s in your code – it must return to the message loop first.
(There’s a dangerous trick to get around this, known as DoEvents after the eponymous Windows Forms method, which tells the Dispatcher to immediately work through all pending messages. The drawing test application uses this trick to clear the message queue before the test timer starts.)
2. Rendering: When all preparations are complete, a separate background thread eventually renders the prepared content to the screen. Unfortunately, this thread is completely hidden from user code, and WPF offers no (direct) way to tell when an object has finished rendering. This is a rather big problem for responsive GUI design, as several users have discovered…
(This mechanism also explains why WPF, an API based on DirectX, doesn’t expose a DirectX interface for user drawings. Only the background render thread interacts with DirectX, so any user-supplied DirectX code would have to somehow insert itself into this thread. It’s difficult to see how that could work without messing up existing functionality.)
Measuring Windows Forms is easy: we start a timer before showing a test window, and stop it at the end of the window’s OnPaint handler. Since Windows Forms operates in immediate mode, the entire window has been fully rendered to the screen at that point. Java’s AWT likewise operates in immediate mode, and can be measured in the same way.
Measuring WPF is more difficult. Once again, we start a timer before showing a test window. But now we need to measure both stages of WPF’s retained mode to find the total time until the window has actually been rendered to screen.
The preparation stage is complete when the UI thread’s message loop has processed all pending messages, which (we assume and hope) all originated from our drawing operations. WPF exposes a Window.ContentRendered event that is perfect for this purpose. Despite its name, this event fires after all window contents have been prepared for rendering for the first time. We react by setting a flag in our test application that activates measurement of the second stage.
We cannot directly access the rendering thread, but WPF does offer one indirect point of access, namely through the CompositionTarget.Rendering event. This event usually fires at the monitor refresh rate (typically 60 times per second), whether there’s any new content to render or not. It is primarily intended for custom animations that need to generate display updates as quickly as the monitor can show them.
The Rendering event is tied to the render thread in a way we can exploit: the event is not raised as long as the render thread is busy! It will be raised again at some point during the next refresh interval after the render thread has gone idle. Since we set a flag immediately after the preparations stage was complete, we can now examine that flag in our Rendering handler. If the preparations flag is still set, we know that a test window has just been rendered and we can record the elapsed time.
This trick is not foolproof. Sometimes the Rendering event fires just after a test window has been prepared, but before the render thread has actually started working on it. We circumvent this problem by comparing the event time to the time when the preparations flag was set. If the difference is less than 100 msec (a value tailored to our benchmark), we assume that rendering has not yet happened and wait for the next event to arrive.
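The two-stage measurement can be modeled in a few lines. The following is only a sketch of the decision logic in plain Java, not the WPF code from the test application: the class, its method names, and the simulated timestamps are hypothetical, and clearing the flag after a successful measurement is my assumption about how repeated events are ignored.

```java
class RenderTimingSketch {

    /** Minimum gap between "prepared" and a Rendering event that we trust (100 msec, as above). */
    static final long MIN_GAP_MS = 100;

    private final long startMs;      // when the test timer was started
    private long preparedAtMs = -1;  // -1 means the preparations flag is not set

    RenderTimingSketch(long startMs) {
        this.startMs = startMs;
    }

    /** Stage 1 finished: models the ContentRendered handler setting the flag. */
    void onContentPrepared(long nowMs) {
        preparedAtMs = nowMs;
    }

    /**
     * Models the Rendering handler. Returns the total elapsed time in msec
     * once rendering is assumed to be complete, or -1 if we must keep waiting.
     */
    long onRenderingEvent(long nowMs) {
        if (preparedAtMs < 0) {
            return -1;  // flag not set: no test window is pending
        }
        if (nowMs - preparedAtMs < MIN_GAP_MS) {
            return -1;  // fired too soon: render thread probably has not started yet
        }
        preparedAtMs = -1;  // assumption: clear the flag so later events are ignored
        return nowMs - startMs;
    }

    public static void main(String[] args) {
        RenderTimingSketch timer = new RenderTimingSketch(0);
        timer.onContentPrepared(2000);                      // simulated timestamps
        System.out.println(timer.onRenderingEvent(2016));   // too early, keep waiting
        System.out.println(timer.onRenderingEvent(5300));   // accepted as the render time
    }
}
```

The threshold trades accuracy for robustness: any window that really renders in under 100 msec would be measured one event too late, which is why the value has to be tailored to the benchmark.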
All results shown below were obtained with a small test application. The download package DrawingTest.zip (31.8 KB, ZIP archive) comprises the precompiled application for the .NET Framework 4.0 and the complete source code for Visual Studio 2010, as well as a version for Java’s AWT library.
The test application draws 10,000 triangles to a window’s client area, sized 400×400 screen pixels (for GDI+ and AWT) or device-independent units (for WPF). Each triangle is rotated 1° clockwise compared to the previous one. Triangles are drawn either as outlines using pens (“Pens Only”), filled shapes using brushes (“Brushes Only”), or both with different colors (“Pens & Brushes”). All colors are solid, with no patterns, shading, or animation effects of any kind.
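To make the geometry concrete, here is a small sketch of how such a sequence of rotated triangles could be generated. It is written in Java and is not the code from the download package: the equilateral shape, the radius, and the center of rotation are assumptions chosen for illustration, since the text above only fixes the count, the client area, and the 1° step.

```java
class TriangleFanSketch {

    static final double CENTER = 200.0;  // middle of the 400×400 client area
    static final double RADIUS = 150.0;  // hypothetical size, not from the test application

    /** Returns the three {x, y} vertices of the triangle with the given 0-based index. */
    static double[][] triangle(int index) {
        double rotation = Math.toRadians(index);  // 1° per triangle
        double[][] vertices = new double[3][2];
        for (int k = 0; k < 3; k++) {
            // Screen y grows downward, so a larger angle turns the shape clockwise.
            double angle = rotation + Math.toRadians(120.0 * k);
            vertices[k][0] = CENTER + RADIUS * Math.cos(angle);
            vertices[k][1] = CENTER + RADIUS * Math.sin(angle);
        }
        return vertices;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10_000; i += 2_500) {
            double[][] t = triangle(i);
            System.out.printf("triangle %d starts at (%.1f, %.1f)%n", i, t[0][0], t[0][1]);
        }
    }
}
```

With a 1° step the pattern repeats every 360 triangles, so 10,000 triangles mostly overdraw earlier ones. That suits a benchmark, where the amount of drawing work matters more than the final picture.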
The test application for Java’s AWT library is located in a separate folder and run from the command line – please see the enclosed ReadMe file for instructions. The test application for GDI+ and WPF provides a GUI with five buttons on the left that start each test window, as follows:
Direct GDI+: The OnPaint handler calls Graphics.FillPolygon to draw the triangles directly to the window. We enable alpha blending (SourceOver) and high-quality compositing to replicate WPF behavior, but testing with SourceCopy and high-speed compositing showed no measurable difference.
Buffered GDI+: The OnPaint handler creates a BufferedGraphics object covering the entire client area, then calls Graphics.FillPolygon to draw the triangles to that buffer, and finally renders the buffer to the window. (This is equivalent to setting the ControlStyles.OptimizedDoubleBuffer flag.)
WPF DrawLine: The OnRender handler calls DrawingContext.DrawLine three times for each triangle. The DrawingContext class does not expose a method to fill arbitrary polygons, so this test supports only the “Pens Only” option.
WPF PathGeometry: The OnRender handler creates a PathFigure for each triangle, then a PathGeometry containing the figure, and finally calls DrawingContext.DrawGeometry to draw that geometry.
WPF StreamGeometry: The OnRender handler creates a StreamGeometry for each triangle, which is once again drawn by DrawingContext.DrawGeometry.
To minimize interference with the test timer, I recommend that you move the mouse cursor away from the application and test windows, and start all tests with keyboard shortcuts rather than mouse clicks. If you use high DPI mode, you’ll notice that the Windows Forms and AWT windows appear smaller than the WPF windows. This is correct and due to the fact that WPF automatically scales all coordinates by the current DPI setting, whereas Windows Forms and AWT do not.
Anti-aliasing, i.e. smoothing the edges of diagonal lines, turns out to have a huge performance impact on all measured drawing APIs. Anti-aliasing is disabled by default for GDI+ and AWT, and enabled by default for WPF. Use “Anti-Aliasing On/Off” to change this setting, which is implemented as follows:
GDI+: the SmoothingMode property of the current Graphics object.
WPF: the RenderOptions.EdgeMode attached property.
AWT: the RenderingHints.KEY_ANTIALIASING hint on the current Graphics2D object.
Note that direct (unbuffered) GDI+ on Windows XP does not support anti-aliasing at all; the corresponding SmoothingMode flag is simply ignored. This is likely a limitation of GDI hardware acceleration on that platform.
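For the AWT case, the switch looks roughly like the following sketch. It is not the code from the download package; the class and helper names are hypothetical, and it draws into an off-screen image so that it runs without a display. Counting the distinct pixel colors is just a quick way to see that the hint took effect: blended edge pixels only appear when anti-aliasing is on.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.util.HashSet;
import java.util.Set;

class AntiAliasingSketch {

    /** Fills one black triangle on white and returns the number of distinct pixel colors. */
    static int distinctColors(boolean antiAlias) {
        BufferedImage image = new BufferedImage(400, 400, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            // The one setting that the "Anti-Aliasing On/Off" option corresponds to in AWT.
            g.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                    antiAlias ? RenderingHints.VALUE_ANTIALIAS_ON
                              : RenderingHints.VALUE_ANTIALIAS_OFF);
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, 400, 400);
            g.setColor(Color.BLACK);
            g.fillPolygon(new int[] {200, 330, 70}, new int[] {50, 350, 350}, 3);
        } finally {
            g.dispose();
        }
        Set<Integer> colors = new HashSet<>();
        for (int y = 0; y < 400; y++) {
            for (int x = 0; x < 400; x++) {
                colors.add(image.getRGB(x, y));
            }
        }
        return colors.size();
    }

    public static void main(String[] args) {
        System.out.println("Aliased colors:      " + distinctColors(false));
        System.out.println("Anti-aliased colors: " + distinctColors(true));
    }
}
```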
The figures and geometries created by the WPF Path & Stream tests are always frozen. Testing showed that leaving them unfrozen makes no discernable difference. However, freezing the pens and brushes used by the three WPF windows makes a very big difference, so this feature is controlled by one last option. All WPF pens and brushes are initially unfrozen until you click the “Freeze WPF” button, at which point they remain frozen until the application is closed.
The application tests exactly one thing: drawing the outlines and/or interiors of many triangles in solid colors. It does not test anything else, including the following:
If you are interested in the performance of some specific drawing operation that is not covered by the application, you should modify its source code to run your own customized tests on your target system. This is ultimately the only way to find reliable answers. Should you discover any results that contradict my own or are otherwise surprising, please let me know.
The following tables show sample test results on my system, comprising an Intel DX58SO motherboard with an Intel Core i7 920 CPU (2.67 GHz), 6 GB RAM (DDR3-1333), and an AMD Radeon HD 6970 (2 GB) graphics card, with current AMD and DirectX drivers. The tests were not conducted with any kind of scientific rigor; I simply ran each test several times and picked a nice round median value. In each table, the first three rows were measured with anti-aliasing disabled and the last three rows (“AA +”) with anti-aliasing enabled.
The first table shows the test results, in milliseconds, for Windows XP SP3 (32 bit, 96 dpi, DirectX 9.0c) running in Virtual PC on Windows 7 SP1 (64 bit).
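|Windows XP||GDI+ Direct||GDI+ Buffered||WPF Unfrozen DrawLine||WPF Unfrozen PathGeometry||WPF Unfrozen StreamGeometry||WPF Frozen DrawLine||WPF Frozen PathGeometry||WPF Frozen StreamGeometry|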
|Pens & Brushes||460||1,470||—||3,350||3,150||—||1,950||1,720|
|AA + Pens Only||—||4,760||13,100||4,800||4,500||3,150||3,600||3,400|
|AA + Brushes Only||—||3,930||—||3,000||2,750||—||2,650||2,450|
|AA + Pens & Brushes||—||8,750||—||7,400||7,250||—||6,000||5,800|
The second table shows the test results, in milliseconds, for Windows 7 SP1 (64 bit, 120 dpi) with the Desktop Window Manager disabled (Windows 7 Basic scheme).
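|Windows 7 Basic||GDI+ Direct||GDI+ Buffered||WPF Unfrozen DrawLine||WPF Unfrozen PathGeometry||WPF Unfrozen StreamGeometry||WPF Frozen DrawLine||WPF Frozen PathGeometry||WPF Frozen StreamGeometry|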
|Pens & Brushes||6,900||910||—||2,250||2,050||—||880||680|
|AA + Pens Only||7,400||4,550||14,000||6,300||6,100||4,050||5,200||5,000|
|AA + Brushes Only||7,400||3,850||—||680||480||—||400||200|
|AA + Pens & Brushes||14,800||8,450||—||6,700||6,480||—||5,300||5,100|
The third table shows the test results, in milliseconds, for Windows 7 SP1 (64 bit, 120 dpi) with the Desktop Window Manager enabled (Windows 7 Aero scheme).
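|Windows 7 Aero||GDI+ Direct||GDI+ Buffered||WPF Unfrozen DrawLine||WPF Unfrozen PathGeometry||WPF Unfrozen StreamGeometry||WPF Frozen DrawLine||WPF Frozen PathGeometry||WPF Frozen StreamGeometry||Java AWT|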
|Pens & Brushes||36,000||920||—||2,300||2,080||—||880||670||590|
|AA + Pens Only||25,600||4,500||13,800||6,200||6,000||4,050||5,150||4,950||5,500|
|AA + Brushes Only||27,800||3,800||—||680||480||—||400||190||4,800|
|AA + Pens & Brushes||55,400||8,400||—||6,700||6,450||—||5,300||5,070||10,200|
This table also shows test results for Java, using the standard AWT library from the Sun/Oracle JDK 1.6u26. As you can see, AWT’s performance is roughly comparable to buffered GDI+.
I make two assumptions in my following attempt to interpret these results:
Once again, I encourage you to download the test application and try it on your own system(s), modifying the test code to your own requirements if necessary. Still, based on my test results as they stand, I’m inclined to draw the following conclusions:
Direct GDI+ is extremely system-dependent — The architectural changes between XP and Vista slowed direct GDI+ operations by two orders of magnitude, whereas WPF shows a significant but smaller variance only in fill rates (see below). On Windows XP, unbuffered GDI+ is 3–67 times faster than WPF; on Windows 7 Basic, between 3 times faster and 37 times slower; and on Windows 7 Aero, never faster and up to 146 times slower!
Conclusion: Unbuffered GDI+ is a great choice for custom drawing on XP (if you don’t need anti-aliasing), but it’s completely useless on newer systems where WPF is usually faster. Buffered GDI+, on the other hand, delivers consistent and competitive performance across all systems – and also supports anti-aliasing on Windows XP.
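The two "times slower" figures quoted above can be recomputed from the “AA + Brushes Only” rows of the Windows 7 tables. The short Java sketch below does the arithmetic. It assumes that the first numeric column is direct GDI+ and that the last WPF column is the frozen StreamGeometry test; the class and method names are hypothetical.

```java
class SpeedRatioSketch {

    /** How many times longer the slower measurement took than the faster one. */
    static double ratio(double slowerMs, double fasterMs) {
        return slowerMs / fasterMs;
    }

    public static void main(String[] args) {
        // Windows 7 Basic, AA + Brushes Only: direct GDI+ 7,400 ms vs. frozen WPF 200 ms
        System.out.println(Math.round(ratio(7_400, 200)));   // 37
        // Windows 7 Aero, AA + Brushes Only: direct GDI+ 27,800 ms vs. frozen WPF 190 ms
        System.out.println(Math.round(ratio(27_800, 190)));  // 146
    }
}
```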
Anti-aliasing is (almost) always extremely slow — Surprisingly, the fact that WPF enables anti-aliasing by default is the single biggest factor in its apparent slowness compared to other APIs. Turning off AA improves performance in most tests by an order of magnitude, and using identical AA settings dramatically shrinks the performance difference between all three APIs.
Conclusion: Good drawing performance in any of the tested APIs requires disabling anti-aliasing. Once you do that, the choice of API is nearly irrelevant. If you require good performance with AA enabled, however, you’ll need to write raw DirectX or OpenGL code that utilizes your video card’s hardware AA (but see below on WPF brushes).
Freezing WPF pens & brushes is always a good idea: The basic DrawLine method is highly sensitive to this simple optimization and runs 3–20 times faster with frozen pens. One reason for this large speedup is that DrawLine is called three times per triangle, evaluating the current pen each time. The Geometry methods are less sensitive but freezing pens & brushes still yields a speedup of 10–300%, depending on the system and operation.
Conclusion: Always immediately call Freeze on any freezable WPF object that you don’t want to animate or otherwise change in the future.
More complex WPF APIs are not necessarily faster: The complex “low-level” APIs PathGeometry and StreamGeometry beat equivalent DrawLine calls only when using unfrozen pens, and StreamGeometry significantly outperforms PathGeometry only at hardware-accelerated fill rates. However, note that I only tested very small geometries (i.e. triangles). Larger collections of geometric primitives should improve the relative performance of the Geometry APIs, especially when reused in multiple drawings.
Conclusion: Don’t expect miracles from the complex Geometry APIs. Unless the created geometries are reused, disabling anti-aliasing and freezing all possible WPF objects should yield a much greater speedup.
WPF brushes can be much faster than WPF pens — In most tests, filling triangles with brushes is about as fast as drawing their outlines. This changes dramatically for WPF on Windows 7: using the same drawing technique, brushes are always 2–26 times faster than pens. Even more intriguing, the usual anti-aliasing penalty vanishes completely for WPF brushes – but not for pens! I believe that we observe here the fabled “DirectX acceleration” of WPF, so lamentably unnoticeable in most operations, and that the slower fill rates on XP are due to the graphics card emulation provided by Virtual PC.
Conclusion: When running on modern systems with fast graphics cards, try using WPF brushes instead of WPF pens where possible. This may even allow you to keep anti-aliasing enabled. Unfortunately, this trick is system-dependent and probably won’t work on cheap laptops or old office desktops.
Why does WPF have a reputation for being slow? As far as drawing geometric objects is concerned, the apparent reason is that its designers chose two unusual default values: all objects are drawn with anti-aliasing, and most object data is retrieved from expensive mutable dependency properties.
There are good reasons for both choices. Enabling AA by default is necessary since WPF supports automatic display scaling, but its enormous performance impact is virtually unknown and should have received more publicity. WPF objects must remain mutable until all properties have been initialized, but most objects are never animated or otherwise changed afterward. Perhaps pens & brushes created by parameterized constructors should be frozen by default – or perhaps WPF would have been better off without the elegant but slow dependency property mechanism.
Fortunately, once these two big performance stumbling blocks are known they are easy to work around. Calling Freeze on all eligible WPF objects is tedious but trivial, and the single line RenderOptions.SetEdgeMode(this, EdgeMode.Aliased); in a control’s constructor disables anti-aliasing for all its contents.