A Wayfire user decided to port weston smoke into Wayfire, and while it worked, it was quite slow, even at modest sizes. I patched weston smoke to be resizable, but this only showed how slow the CPU-bound algorithm is. So I decided to try my hand at making it faster using a compute shader. At first there was a single shader, but I realized that each loop needed its own shader. It was also slow at first, but after tweaking the number of workgroups per dispatch and the workgroup size in each shader, the speedup was multi-fold. It took some time to implement and optimize, but it worked out pretty well. The code has been uploaded here. The performance improvement is astounding, and there isn't much else to do but watch some comparison videos:
Original Weston Smoke Client:
Implementation in a Decorator:
Side by Side Standalone Clients:
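
For anyone curious about the workgroup tuning mentioned above, here is a minimal sketch of how the number of workgroups per dispatch can be derived from the grid size and the workgroup size. This is not the code from the upload: the 16x16 local size and the names `kLocalSizeX`, `kLocalSizeY`, `ceil_div` and `dispatch_size_for` are all made up for illustration, and the real shaders may use different values.

```cpp
#include <cstdint>

// Hypothetical workgroup size. It would have to match the
// local_size_x / local_size_y declared in the compute shader.
constexpr std::uint32_t kLocalSizeX = 16;
constexpr std::uint32_t kLocalSizeY = 16;

// Number of workgroups to launch along each axis.
struct DispatchSize {
    std::uint32_t groups_x;
    std::uint32_t groups_y;
};

// Integer division rounded up, so partially filled edge tiles still get a group.
constexpr std::uint32_t ceil_div(std::uint32_t n, std::uint32_t d) {
    return (n + d - 1) / d;
}

// Workgroup counts needed to cover a width x height simulation grid.
constexpr DispatchSize dispatch_size_for(std::uint32_t width, std::uint32_t height) {
    return {ceil_div(width, kLocalSizeX), ceil_div(height, kLocalSizeY)};
}
```

With OpenGL, the two counts would go to `glDispatchCompute(groups_x, groups_y, 1)`, once per pass. Because the counts are rounded up, the shader has to skip invocations that fall outside the grid. A larger workgroup size means fewer workgroups per dispatch, and which combination is fastest depends on the GPU, which is why it takes some tweaking.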