Slowville: std::map

Many people don’t realise how slow the  map that comes with the standard template library is.

Reciently, did some performance comparisons of std::map versus boost::unordered_map.

These are the benchmarks that I tested:

fmap<int, int> mapTest;
fmap<int, int>::iterator iter;
int time1 = GetMilliSeconds();
int n;
int randNum, rand2;
rand2 = 0;
for (n = 0; n < 100000; n++)
 randNum = rand()%1000;
 iter = mapTest.find(randNum);
 if (iter != mapTest.end())
 rand2 = iter->second;
 mapTest[randNum] = rand2;
 rand2 = n;
int time2 = GetMilliSeconds();
int timeElapsed = abs(time2-time1);
cout << "map test1 time:" << timeElapsed << "ms" << endl;
time1 = time2;

fmap<int, int> mapTest2;
fmap<int, int>::iterator iterLowest;
int f;
for (n = 0; n < 10000; n++)
 for (f = 0; f < 10; f++)
 randNum = rand()%1000;
 iter = mapTest2.find(randNum);
 if (iter != mapTest2.end())
 rand2 = iter->second+1;
 mapTest2[randNum] = rand2;
 rand2 = n;
 // find lowest
 iterLowest = mapTest2.begin();
 for (iter = mapTest2.begin(); iter != mapTest2.end(); iter++)
 if (iter->second > iterLowest->second)
 iterLowest = iter;
time2 = GetMilliSeconds();
timeElapsed = abs(time2-time1);
cout << "map test2 time:" << timeElapsed << "ms" << endl;

(Unfortunately, this looks awful because stupid word press strips out my code indentation 😦 )

And these are the results running on 1st gen iPod touch:

map test1 time:115ms
map test2 time:2251ms

map test1 time:200ms
map test2 time:3940ms

As you can see it’s nearly twice as slow.

std::map is an ordered map. In other words, when iterating the values are ordered according to the key. Normally, you don’t need this functionality, so using a hash map like boost::unordered_map is a no-brainer.

Reducing game start up time

A lot game developers consider the start-up time of a game to be of little importance. Unfortunately, the truth is that users do not like looking at the hourglass. So, here are two tips to reduce start-up time:

1. Use Targa instead of PNG.

PNGs take several times longer to decode than compressed Targa. I think it’s around 5 times slower. (I once made some measurements on the iPhone, but I lost the figures <doh>).  On the other hand, compressed Targa files are about 40% bigger, but disk space usually isn’t that critical. You can use the code here to decode targa files:

An interesting side effect of this optimization is that it also reduces development time. Everytime you start up the debugger it has to decode all those 1024×1024 textures, before you can begin.

2. Read files by blocks, not word by word

Preferably, read the whole file in one go. This is how you do it with C code:

FILE* pFile;
pFile = fopen(fileName.c_str(), "rb");
if (pFile == NULL) return;
fseek(pFile, 0, 2 /* SEEK_END */);
fileSize = (int)ftell(pFile);
fseek(pFile, 0, 0 /* SEEK_SET */);
pBuffer = new char[filesize];
int bytesRead = (int)fread(pBuffer, fileSize, 1, pFile);