Anyway, this week I have been focusing on trying to learn OpenCL.
It's a bit like only using compute shaders, though it's pretty different from your other types of shader. There are devices (hey, I know what those are), contexts (not too difficult a concept), programs, buffers, and kernels (the shader programs). Some of these I am already familiar with from Vulkan and OpenGL.
The hardest parts for me to get my head around were the work items and work groups. It's easier to think of the game 7 Billion Humans at this point. Each worker has a work item, and the workers are in a work group (like a room). A worker is essentially a thread. The weirdest part of this is that you - the programmer - have to say how many work items you want and how many work groups you want. Each work group has the same number of work items (I think). So you have to divide up what you want to do. For my GPU device I have 4100 workers available (just over 64x64 = 4096). So to optimise, I chopped a bit off my 20 million pixel image - only doing 20246528 pixels (4096 x 4943). I set the width and height properties arbitrarily at 4096 and 4943.
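To make the work item and work group arithmetic concrete, here is a minimal CPU-only sketch in plain C++. It is not real OpenCL host code. It assumes the OpenCL 1.x rule that the global work size has to be a whole multiple of the work-group size, so the item count is rounded up and the spare work items are skipped with a bounds check. `WorkSizes`, `planWorkSizes`, and `countOddSimulated` are hypothetical names made up for illustration, and the group size of 64 in the note below is an assumption, not a value queried from any device.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for the sizes an OpenCL-style launch needs.
struct WorkSizes {
    std::size_t global;  // total work items, padded to a multiple of local
    std::size_t local;   // work items per work group
    std::size_t groups;  // number of work groups
};

// Rounds itemCount up so that every work group is full.
// Assumes localSize > 0.
WorkSizes planWorkSizes(std::size_t itemCount, std::size_t localSize) {
    const std::size_t groups = (itemCount + localSize - 1) / localSize;
    return { groups * localSize, localSize, groups };
}

// CPU stand-in for a kernel launch: each (group, local) pair plays the role
// of one work item. Counting odd values is just an example workload.
std::size_t countOddSimulated(const std::vector<int>& data, std::size_t localSize) {
    const WorkSizes sizes = planWorkSizes(data.size(), localSize);
    std::size_t odd = 0;
    for (std::size_t group = 0; group < sizes.groups; ++group) {
        for (std::size_t local = 0; local < sizes.local; ++local) {
            // Equivalent of get_global_id(0) inside a real kernel.
            const std::size_t gid = group * sizes.local + local;
            if (gid >= data.size()) {
                continue;  // padding work item: nothing to do
            }
            if (data[gid] & 1) {
                ++odd;
            }
        }
    }
    return odd;
}
```

With 64 work items per group, the 4096 x 4943 image splits into exactly 316352 groups with no padding, while 100 items would be padded up to 128 work items in 2 groups. The bounds check is what stops the padding workers from touching data that is not theirs.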
In order to get it to work, some numbers had to be fudged. Workers may not have been working on what they were supposed to in the end. I WILL COME BACK TO THIS AND DO IT RIGHT. But for now, I wanted to share my timing value. It took 0.62 seconds. That is the slowest yet. I will share the code later, once it is doing what it is supposed to be doing.
In my lambda attempt I used the following code:
oddNumber = static_cast<int>(std::count_if(std::begin(data), std::end(data), [](const int val) { return val & 1; }));
evenNumber = COUNT - oddNumber;
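For anyone who wants to try the lambda version on its own, here is a self-contained sketch of the same idea. The function name `countOddEven` and the use of `std::vector<int>` are my assumptions for illustration; the original `data` container and `COUNT` constant are not shown here, so the even count is derived from the container size instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Returns {odd count, even count} for the given values.
std::pair<std::size_t, std::size_t> countOddEven(const std::vector<int>& data) {
    // The lambda is the predicate: true when the lowest bit is set (odd).
    const auto odd = static_cast<std::size_t>(
        std::count_if(data.begin(), data.end(),
                      [](const int val) { return (val & 1) != 0; }));
    // Everything that is not odd must be even.
    return { odd, data.size() - odd };
}
```

Calling `countOddEven` on the values 1 to 5 gives 3 odd and 2 even.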
In conclusion, lambda expressions are pretty cool. OpenCL is a little confusing for my graphics-orientated brain, but I will get to know it better, and again - parallelism is not always the solution.
Stay tuned for a boring update on my game on Monday - progress has been slow.