
Optimizing Wine on OS X

I've been doing some performance analysis of EVE running under Wine on OS X. My main test cases are a series of scenes run with the EVE Probe - our internal benchmarking tool. This is far more convenient than running the full EVE client, as it focuses purely on the graphics performance and does not require any user input.

Wine Staging

One thing I tried was building Wine Staging. On its own, that did not really change anything. Turning on CSMT, on the other hand, made quite a difference, taking the average frame time down by 30% for the test scene I used. While the performance boost was significant, there were also noticeable glitches in the rendering, with parts of the scene flickering in and out. Too bad - it means I can't consider this for EVE yet, but I will keep an eye on its progress.
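For anyone who thinks in frames per second rather than frame time, a 30% cut in frame time works out to a larger relative gain in frame rate. A tiny sketch of that conversion follows; the function name is made up for illustration and is not part of Wine or the EVE Probe.

```c
/* Illustrative helper (hypothetical name): convert a fractional reduction
 * in frame time into the matching fractional gain in frame rate. */
static double fps_gain_from_frame_time_cut(double cut)
{
    /* new_time = old_time * (1 - cut), and fps = 1 / time,
     * so new_fps / old_fps = 1 / (1 - cut). */
    return 1.0 / (1.0 - cut) - 1.0;
}
```

Calling it with 0.30 gives roughly 0.43, so a 30% shorter frame time corresponds to about 43% more frames per second.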

OpenGL Profiler

Apple has the very useful OpenGL profiler available for download. I tried running one of the simpler scenes under the profiler to capture statistics on the OpenGL calls made.

One thing that stood out to me was the high number of glGetError calls. Disabling them turned out to be easy enough, by recompiling Wine with -DWINE_NO_DEBUG_MSGS=1 added to CFLAGS. Running that same scene again took the number of glGetError calls down from 7.4 million to 18 thousand. The overall impact on performance was not significant, though.
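To illustrate why a compile-time flag can remove most of those calls, here is a simplified sketch. It assumes the error checks sit behind a debug-only macro, which is my reading of the behaviour and not a copy of Wine's code. Everything in it is hypothetical: fake_glGetError stands in for the real glGetError, and CHECK_GL_CALL and SKETCH_NO_DEBUG_MSGS are illustrative names.

```c
#include <stdio.h>

/* Hypothetical stand-in for glGetError(): counts how often it is called. */
static unsigned long get_error_calls = 0;

static int fake_glGetError(void)
{
    ++get_error_calls;
    return 0; /* 0 plays the role of GL_NO_ERROR */
}

/* Hypothetical debug-only check. Building with -DSKETCH_NO_DEBUG_MSGS=1
 * compiles the check away, so no error query is made at all. */
#ifdef SKETCH_NO_DEBUG_MSGS
#define CHECK_GL_CALL(what) do { } while (0)
#else
#define CHECK_GL_CALL(what)                                   \
    do {                                                      \
        int err_ = fake_glGetError();                         \
        if (err_ != 0)                                        \
            fprintf(stderr, "%s failed: %d\n", (what), err_); \
    } while (0)
#endif

/* Pretend to issue n draw calls, each followed by a check.
 * Returns how many error queries were actually made. */
static unsigned long issue_draw_calls(unsigned long n)
{
    unsigned long i;

    get_error_calls = 0;
    for (i = 0; i < n; ++i) {
        /* a real renderer would call into OpenGL here */
        CHECK_GL_CALL("draw");
    }
    return get_error_calls;
}
```

In a default build, issue_draw_calls(1000) reports 1000 error queries; with the flag defined it reports none. Calls that remain after such a change would be ones made outside the debug-only path.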

The profiler can also capture a full trace of all OpenGL calls made by an application. Looking at the full trace can be interesting, if somewhat daunting. A minute-long capture of this fairly simple scene yields 5.8M function calls. I plan to do further tests later on, with ultra-simple scenes, to better understand how DirectX calls get translated to OpenGL calls.

Multithreaded OpenGL engine

Apple's OpenGL implementation has something called the multithreaded OpenGL engine. This needs to be enabled explicitly. The multithreaded OpenGL engine then creates a worker thread and transfers some of its calculations to that thread. On a multicore system (as Macs are today), this allows internal OpenGL calculations performed on the CPU to run in parallel with Wine, improving performance.

It took me a little while to figure out where to put the code to enable this, and it didn't help that the CGLEnable function did not complain when I accidentally passed in an invalid context. Eventually I added the following lines in dlls/winemac.drv/opengl.c, in the function create_context:
    TRACE("Enabling the multithreaded OpenGL engine\n");
    err = CGLEnable(context->cglcontext, kCGLCEMPEngine);
    if (err != kCGLNoError)
        WARN("Enabling the multithreaded OpenGL engine failed\n");

Enabling the multithreaded engine did help with performance, although initial tests indicated the opposite. It wasn't until I tested with the glGetError calls disabled and the multithreaded engine enabled that I saw some positive results. The boost in performance was more significant for the heavier scenes, such as our deathcube of 1000 ships.

The future

I'm just starting to get to know the Wine code base, and it's hard to say what optimization opportunities are in there. One thing is certain, though: I will spend a lot more time analyzing it and trying out different things, both in our codebase and in Wine.
