Performance – A tale of two halves
We’ve been looking at performance over the last few days. It was obvious that the application would not scale beyond its current meagre limits.
We’ve concentrated on the following:
- Network and downloading
- Map and UI
The application sources all its data from the internet: Google Maps for the map and route data, and the Mapcial server running on Google’s App Engine for user configuration. Downloading the route data was taking around 30s over a Wi-Fi link, and longer over 3G. Obviously this isn’t ideal, not to mention the network charges users could incur downloading 1MB of data every time they started the app.
We’d already changed the application to cache the image data, and have now extended this to the route data too, which avoids continually re-parsing the KML XML from My Maps with a DOM parser. Performance has improved significantly and the application now starts in under 4 seconds. This is even more impressive when you realise that it is creating 671 Points of Interest (POIs) and over 25,000 route points if you load all 13 of the current public routes.
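The parse-once idea can be sketched in a few lines. This is a minimal in-memory version (our real cache is an SQLite database on the device, and the class and method names here are purely illustrative): the expensive KML/DOM parse runs only on a cache miss, and subsequent loads of the same route return the stored points.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Sketch of parse-once caching. The expensive parse step (a KML DOM parse in
// the real app) runs only on a cache miss; repeat loads hit the map instead.
// All names are hypothetical; the real cache persists to SQLite.
class RouteCache {
    private final Map<String, List<long[]>> parsedRoutes = new HashMap<>();
    private final Function<String, List<long[]>> parser; // stand-in for the KML parse
    int parseCount = 0; // exposed so the saving is visible

    RouteCache(Function<String, List<long[]>> parser) {
        this.parser = parser;
    }

    List<long[]> getRoute(String routeId) {
        return parsedRoutes.computeIfAbsent(routeId, id -> {
            parseCount++;                 // expensive step, once per route
            return parser.apply(id);
        });
    }
}
```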
The only minor hiccup was that, after the initial database work, reading the data back was actually slower. Oh yes, mustn’t forget to add indexes to the database 😉
The Maps API uses Overlays to let you add POIs and other route data onto the map. When all 13 public routes were loaded and the user zoomed out to the full extent, scrolling was very slow, taking around 2s to respond to any touch input and redraw the screen.
This was tackled in two stages:
1. UI route drawing
The initial improvement was to look at the route drawing code.
With over 25,000 points making up the lines on the map, there was scope for improvement. The main change was to drop points that, when the map is zoomed out, would just be drawn onto a pixel that had already been painted. The other change was to skip points that fall outside the visible viewport when zoomed in. The key in both cases was to write code that is not computationally intensive, as the filtering needs to be quicker than the drawing it replaces. We think we’ve managed just that, but there may be room for further improvement.
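Both filters can be sketched in plain Java. Here `int[]{x, y}` stands in for a point already projected to screen pixels, and the viewport is `{left, top, right, bottom}`; the class name and the exact tests are illustrative, not lifted from the real code. A proper implementation would also keep the last off-screen point so line segments crossing the screen edge still draw; that refinement is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the two cheap filters described above: drop a point that lands on
// the same pixel as the previously kept point (zoomed out), and drop points
// outside the visible viewport (zoomed in). The loop is deliberately a single
// pass with integer compares, so it stays cheaper than the drawing it avoids.
class RouteThinner {
    static List<int[]> thin(List<int[]> screenPoints, int[] viewport) {
        List<int[]> kept = new ArrayList<>();
        int lastX = Integer.MIN_VALUE, lastY = Integer.MIN_VALUE;
        for (int[] p : screenPoints) {
            if (p[0] == lastX && p[1] == lastY) continue; // same pixel as last kept point
            boolean visible = p[0] >= viewport[0] && p[0] <= viewport[2]
                           && p[1] >= viewport[1] && p[1] <= viewport[3];
            if (!visible) continue;                        // off-screen when zoomed in
            kept.add(p);
            lastX = p[0];
            lastY = p[1];
        }
        return kept;
    }
}
```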
2. UI POI drawing
In a similar vein, we investigated not drawing the POIs if they would be obscured by another POI. We implemented a simple grid system that allowed 3 POIs per grid and ignored the others. As the user zooms in, the POIs spread out into adjacent grids to become visible and eventually fall off the screen.
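The grid idea looks roughly like this in plain Java. Screen space is divided into fixed-size cells, each cell accepts at most three POIs, and the rest are simply not drawn; zooming in spreads the POIs over more cells, so hidden ones reappear. The cell size, names, and the first-three-wins policy are all illustrative, not from the real code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the 3-POIs-per-grid-cell filter described above. Each POI is an
// int[]{x, y} already in screen pixels; the cell key packs the cell row and
// column into one long so a single HashMap lookup decides whether to draw.
class PoiGrid {
    static final int MAX_PER_CELL = 3;

    static List<int[]> visiblePois(List<int[]> screenPois, int cellSize) {
        Map<Long, Integer> counts = new HashMap<>();
        List<int[]> drawn = new ArrayList<>();
        for (int[] p : screenPois) {
            long cell = (((long) (p[0] / cellSize)) << 32)
                      | ((p[1] / cellSize) & 0xffffffffL);
            int n = counts.merge(cell, 1, Integer::sum); // occupancy after this POI
            if (n <= MAX_PER_CELL) drawn.add(p);         // cell not yet full
        }
        return drawn;
    }
}
```

At a smaller cell size (i.e. more zoomed in) the same POIs fall into different cells, so `visiblePois` returns more of them, which is the spreading-out behaviour described above.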
The initial attempt at this didn’t work as well as expected, falling foul of the ‘it takes longer to calculate than to draw’ problem. Whilst looking into this, we noticed that we were using a different overlay per icon type per route, so some overlays only had one or two POIs in them. As an experiment, we put all the POIs into one overlay per route and saw a good performance improvement in this alone. Taking this further, we created one global POI overlay for all the routes and saw another improvement.
When we’re zoomed into the map, performance is great and there’s hardly any lag. Because of this we’ve left the performance tuning there for now, but we may need to revisit the grid idea at a later stage.
The downside of closed-source development
It’s interesting that moving all the POIs to one overlay gave such a good performance improvement; this was not expected, and we’ve not seen it documented anywhere. We speculate that Google do some bounds checking in their maps code, based on the visible map viewport, at certain zoom levels. However, because the source code is closed we cannot see this and so can’t check. We can only speculate further that it’s a significantly expensive calculation to do across more than a couple of overlays (we previously had around 60).
Another problem with closed source is that you’re continually having to second-guess how the code works and what the best practice is when using it. We’ve had to make some significant code ‘enhancements’ to work around the mapping API, improve its performance and avoid problems. Among them:
- Only call populate() on an Overlay once it has been completely populated with data. With 671 POIs it’s too expensive to call every time you add an item
- Don’t make changes to the MapView’s Overlays unless you’re on the UI thread. This was a real pain, as the documentation clearly states that the ItemizedOverlay contents are wrapped in a synchronized collection and you can add/remove items at will. The only proviso they give is that if you’re going to iterate over the list, you should wrap the iteration in a synchronized block using the list as the lock. Having noticed the occasional ConcurrentModificationException in our app, we spent a few hours dutifully wrapping everything in a synchronized block wherever we thought there was the slightest possibility of it causing a concurrent modification problem. Needless to say this didn’t work because, we suspect, Google don’t wrap their own code in synchronized blocks. Once we had sussed this, we removed all our blocks and pushed everything onto the UI thread. (Closed source, grumble, grumble.) To be fair to Google, I also suspect that we were not just adding/removing overlays, but in some instances changing the contents of the overlays too.
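The populate() rule amounts to a simple batching pattern. The sketch below mimics it in plain Java, since the real Overlay classes are closed source: `PoiOverlay` is a stand-in whose `populate()` does per-item work (a sort here, standing in for whatever the Maps library really does internally), so adding all the POIs first and calling `populate()` once pays that cost a single time instead of once per add.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-in for an ItemizedOverlay-like class. The cost model is illustrative:
// addItem() is cheap, populate() is the expensive rebuild. The batching rule
// is simply: bulk-add, then call populate() exactly once.
class PoiOverlay {
    private final List<Integer> items = new ArrayList<>();
    int populateCalls = 0; // exposed so the batching is visible

    void addItem(int poi) {
        items.add(poi);      // cheap: no rebuild here
    }

    void populate() {
        populateCalls++;
        Collections.sort(items); // stand-in for the library's internal work
    }

    int size() {
        return items.size();
    }
}
```

In the app itself, any later mutation of a live overlay is then posted to the UI thread (e.g. via the Activity’s runOnUiThread) rather than guarded with our own synchronized blocks, per the second bullet above.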
We now have a 5x improvement in app startup time, reduced network traffic and a significantly more responsive UI. This is at the expense of a database that is 1.1MB in size. A small price to pay, we think.
Our next task is to focus on the server. We hold some user configuration on the server and query it every minute to find all the users subscribed to the same events as the application user. This takes around 0.5s, which is OK, but of more concern is the CPU cost Google allocate to it. Since we are constrained to 6.5 hours of free CPU per day, we could burn through this very quickly on this query alone.
We think the major problem is that the Google App Engine datastore is not relational (it doesn’t support joins), yet we are treating it as a relational table, with three independent queries to get our data. Over the next day or two we’ll denormalise this aspect of the data and create a horrible mess of duplicated data and IDs in one table. However, it should be extremely fast.
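The shape of that denormalisation looks something like the sketch below, with an in-memory list standing in for the datastore and every name hypothetical: instead of three entities stitched together in code, one flattened row duplicates the user and event fields that the per-minute query needs, so a single filter over a single “table” replaces three round trips.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the denormalised row: user and event fields are duplicated on
// purpose so the subscriber query never has to join. All names hypothetical.
class SubscriptionRow {
    final String userId, userName, eventId, eventName;

    SubscriptionRow(String userId, String userName, String eventId, String eventName) {
        this.userId = userId;
        this.userName = userName;
        this.eventId = eventId;
        this.eventName = eventName;
    }
}

class SubscriberQuery {
    // One pass over one "table": who is subscribed to this event?
    static List<String> subscribersTo(List<SubscriptionRow> rows, String eventId) {
        List<String> names = new ArrayList<>();
        for (SubscriptionRow r : rows) {
            if (r.eventId.equals(eventId)) names.add(r.userName);
        }
        return names;
    }
}
```

The cost is exactly the “horrible mess” mentioned above: whenever a user or event is renamed, every duplicated row has to be updated too.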
If this doesn’t work, we’ll look at junking the JPA code we’ve got and seeing if we can hit the database directly.
The Mapcial and Walk2012 versions are functionally identical, with Walk2012 having some additional branding and icon changes. You can now pick up either version from a link on the original announcement post.
Feel free to leave comments. Did I mention it’s nearly beta, less buggy and even more awesome? 😛