Unique values in C++

I was profiling a particular piece of code in one of our products at Coherent Labs and noticed some peculiar results. Basically, each frame we have to keep track of a certain number of elements that have changed, and at the end of the frame we have to do an operation on all of them. The elements are tracked with pointers that are put in an std::unordered_set because, due to different logic paths, we might flag the same element more than once.
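The original pattern looks roughly like this (Element, MarkChanged and Update are illustrative names, not the actual product code):

```cpp
#include <unordered_set>

struct Element { /* ... */ };
void Update(Element* e); // the per-element operation, defined elsewhere

std::unordered_set<Element*> changedThisFrame;

void MarkChanged(Element* e)
{
    // Different logic paths may flag the same element more than once;
    // the set silently absorbs the duplicates.
    changedThisFrame.insert(e);
}

void EndOfFrame()
{
    for (Element* e : changedThisFrame)
        Update(e);
    changedThisFrame.clear();
}
```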

I noticed that both adding many elements to the set and then walking it took more time than I’d like. Putting aside the operations on the pointers themselves (and the need to achieve better cache locality in that domain), I concentrated on the set itself.

The VS 2015 implementation seems to keep the different “buckets” of the set in a list. At that point I was already pretty worried: the profiler showed substantial time going into an std::list iterator – not exactly the high-performance code I’d like.

I decided to remove some indirection and improve cache hits by replacing the unordered_set with a vector. I blindly push_back elements into the vector. The “uniqueness” of the pointers is required only when we operate on them, so when needed I run std::sort on the vector and then erase the duplicates that std::unique leaves at the end. This gives me all the elements without duplicates. At this point I have to note that when pushing elements, duplicates happen very rarely. The results I got after changing the std::unordered_set to an std::vector with sort & erase were promising, so I made a larger test.
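In sketch form, again with illustrative names, the replacement is the classic sort/unique/erase idiom:

```cpp
#include <algorithm>
#include <vector>

struct Element { /* ... */ };
void Update(Element* e); // the per-element operation, defined elsewhere

std::vector<Element*> changedThisFrame;

void MarkChanged(Element* e)
{
    // Blindly push; duplicates are tolerated until we actually need uniqueness.
    changedThisFrame.push_back(e);
}

void EndOfFrame()
{
    // Deduplicate only when the elements are about to be processed:
    // sort groups duplicates together, unique compacts them,
    // erase trims the leftover tail.
    std::sort(changedThisFrame.begin(), changedThisFrame.end());
    changedThisFrame.erase(
        std::unique(changedThisFrame.begin(), changedThisFrame.end()),
        changedThisFrame.end());

    for (Element* e : changedThisFrame)
        Update(e);
    changedThisFrame.clear();
}
```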

You can see the whole test here: https://gist.github.com/stoyannk/9df2df9d25e1acba7747

I test three cases – an std::unordered_set, an std::vector, and Google’s dense_hash_set, which is regarded as very fast. A simple sqrt on each element simulates some “work” done on it. I also allow for a “duplicates” multiplier. The results are interesting.
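The gist has the full code; as a rough, non-authoritative sketch of its shape (the parameter names and values here are made up, and the dense_hash_set case is omitted), the set and vector cases look something like this:

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <unordered_set>
#include <vector>

int main()
{
    const std::size_t TOTAL = 100000;   // illustrative element count
    const std::size_t DUP_FACTOR = 1;   // 1 = no duplicates, N = every pointer appears N times
    const std::size_t UNIQUE = TOTAL / DUP_FACTOR;

    std::vector<double> storage(UNIQUE, 42.0);
    std::vector<double*> input;
    input.reserve(TOTAL);
    for (std::size_t i = 0; i < TOTAL; ++i)
        input.push_back(&storage[i % UNIQUE]);

    volatile double sink = 0.0; // keeps the "work" from being optimized away

    // Case 1: std::unordered_set
    const auto t0 = std::chrono::high_resolution_clock::now();
    {
        std::unordered_set<double*> set;
        for (double* p : input) set.insert(p);
        for (double* p : set) sink = sink + std::sqrt(*p); // simulated per-element work
    }
    const auto t1 = std::chrono::high_resolution_clock::now();

    // Case 2: std::vector + sort/unique/erase
    {
        std::vector<double*> vec;
        for (double* p : input) vec.push_back(p);
        std::sort(vec.begin(), vec.end());
        vec.erase(std::unique(vec.begin(), vec.end()), vec.end());
        for (double* p : vec) sink = sink + std::sqrt(*p);
    }
    const auto t2 = std::chrono::high_resolution_clock::now();

    std::printf("set: %.3f ms, vector: %.3f ms\n",
        std::chrono::duration<double, std::milli>(t1 - t0).count(),
        std::chrono::duration<double, std::milli>(t2 - t1).count());
}
```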

No duplicates:

Time for set: 236.609
Time for vector: 133.017
Time for google set: 234.986

2 duplicates factor:

Time for set: 133.731
Time for vector: 129.611
Time for google set: 131.001

4 duplicates factor:

Time for set: 71.606
Time for vector: 116.688
Time for google set: 77.66

When duplicates are rare, the std::vector substantially outperforms both sets. Things quickly degrade, however, if we introduce many duplicates. This makes sense, as it puts much more pressure on the std::sort and std::unique calls. At that moment I had an idea. In my use-case the work done on each element is relatively light, and doing it twice on the same element has no side effects – f(f(x)) == f(x) in this case. I have little duplication anyway, so why not ditch the uniqueness requirement altogether?

I added a “UNIQUE_VEC” define that controls whether we do the sort & erase on the vector, and re-ran the tests.
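In sketch form (again non-authoritative; the gist’s exact macro handling may differ), the switch just gates the deduplication step:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

#define UNIQUE_VEC 0 // 1: sort & erase duplicates before processing, 0: skip it

void ProcessVector(std::vector<double*>& vec, volatile double& sink)
{
#if UNIQUE_VEC
    std::sort(vec.begin(), vec.end());
    vec.erase(std::unique(vec.begin(), vec.end()), vec.end());
#endif
    // Without UNIQUE_VEC a duplicated pointer is simply processed twice;
    // that is fine when the real per-element operation is idempotent
    // (f(f(x)) == f(x)) and duplicates are rare.
    for (double* p : vec)
        sink = sink + std::sqrt(*p); // the benchmark's simulated work
}
```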

No duplicates:

Time for set: 248.518
Time for vector: 66.078
Time for google set: 234.395

2 duplicates factor:

Time for set: 125.65
Time for vector: 58.664
Time for google set: 138.612

4 duplicates factor:

Time for set: 71.207
Time for vector: 51.306
Time for google set: 77.191

Now the vector is the clear winner even in the no-duplicates case. Of course, we are much more dependent here on how ‘heavy’ the per-element work is, but with duplication factors close to 1 that should be irrelevant, especially when we have many elements.

In my experience, duplication is actually rare in the game-code cases where we need this kind of tracking. In those cases it is almost always better to use an array (vector) and handle the uniqueness yourself. Even better, if the operation is fast and the logic supports it – ditch the uniqueness altogether.
