Raster statistics + SIMD === winning
Recently, the .NET framework 4.6 enabled features that would allow us to leverage SIMD (Single Instruction Multiple Data) capabilities that have been present in our CPUs since the 2000s and deliver faster performance for calculating raster statistics.
We were using the .NET bindings for GDAL to calculate a few simple raster statistics for reporting needs. I got curious as to the kind of performance boost that we could be seeing if we leveraged SIMD for this purpose. So, I created a little console app to explore the performance gains. Here is what i observed.
To start off, we need to check if our CPU supports SIMD instructions. Although, I expect that everyone’s computer will support SIMD, here is simple way to check. Download and run CPU-Z. Keep an eye out for the SSE instructions as shown below.
Clone the git repo or just download the executable from here. And run the SIMD.exe file with a GeoTIFF file as a parameter. Needless to say, the console app is pretty rudimentary and was written solely to observe SIMD performance gains. I would highly encourage you to clone the repo and explore further if you are interested. The app currently just reads all the raster image into memory and sums all the values to produce a total sum. Not very useful, but i am hoping it can get you started in the right direction to explore further. The amount of code required or in other words to modify your current app to leverage should be manageable and shouldn’t require a complete rewrite. Also, currently the SIMD support in the .NET framework only works under 64-bit and in release mode. That had me stumped for a bit. But, there is very useful “IsHardwareAccelerated” Flag that helps you find out if your vectorized code is actually reaping the benefits of the SIMD hardware.
Raster X size: 28555
Raster Y size: 17939
Raster cell size: 512248145
Total is: 146097271
Time taken: 4568
SIMD accelerated: True
Another Total is: 146097271
Time taken: 1431
Here is the result from a run on a sample raster image. With the SIMD vectors, the execution went from 4568 to 1431 milliseconds. That’s almost 1/3 of the original time. Pretty tempting isn’t it?
The performance measure doesn’t really prove anything and is purely meant to satisfy my own curiosity. Your mileage may vary. To eek out any gains out of using SIMD vectors, you will need to be processing a large enough array that will justify the overhead of creating the SIMD vectors.