Some dav1dplay features
The --zerocopy mode currently gives a slight reduction in CPU usage, but an overall performance loss when benchmarking in untimed mode. This is because of lock contention between the decoder thread and the player thread, which both need to access the same gpu object, even where it's technically unnecessary (e.g. in the swapchain block). The best solution would be to make either dav1dplay or (preferably) dav1d smart enough to reuse pictures instead of thrashing them like this.
(Alternatively, libplacebo could be made thread-safe with more granular internal locking, to avoid holding locks while blocked.)
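
The difference between holding a lock while blocked and locking more granularly can be sketched in plain C with POSIX threads. This is only an illustration of the general pattern, not code from dav1dplay or libplacebo: SharedGpu, decoder_probe, blocking_swapchain_wait, present_coarse and present_granular are all hypothetical names, and the "swapchain wait" is simulated by running a short-lived decoder thread that checks whether it could have taken the lock in the meantime.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared gpu object; not a libplacebo type. */
typedef struct {
    pthread_mutex_t lock;
    int presented;          /* state that genuinely needs the lock */
    bool decoder_got_lock;  /* result of the decoder thread's probe */
} SharedGpu;

/* Decoder thread: tries to take the gpu lock without waiting. */
static void *decoder_probe(void *arg)
{
    SharedGpu *gpu = arg;
    if (pthread_mutex_trylock(&gpu->lock) == 0) {
        gpu->decoder_got_lock = true;
        pthread_mutex_unlock(&gpu->lock);
    } else {
        gpu->decoder_got_lock = false;
    }
    return NULL;
}

/* Simulated blocking swapchain wait (an assumption for this sketch):
 * while the player is "blocked", a decoder thread runs and records
 * whether the gpu lock was available to it. */
static void blocking_swapchain_wait(SharedGpu *gpu)
{
    pthread_t decoder;
    int rc = pthread_create(&decoder, NULL, decoder_probe, gpu);
    assert(rc == 0);
    pthread_join(decoder, NULL);
}

/* Coarse locking: the lock is held across the blocking wait,
 * so the decoder thread cannot make progress. */
static bool present_coarse(SharedGpu *gpu)
{
    pthread_mutex_lock(&gpu->lock);
    blocking_swapchain_wait(gpu);
    gpu->presented++;
    pthread_mutex_unlock(&gpu->lock);
    return gpu->decoder_got_lock;
}

/* Granular locking: block first, then take the lock only for the
 * short update of shared state. */
static bool present_granular(SharedGpu *gpu)
{
    blocking_swapchain_wait(gpu);
    pthread_mutex_lock(&gpu->lock);
    gpu->presented++;
    pthread_mutex_unlock(&gpu->lock);
    return gpu->decoder_got_lock;
}
```

With a SharedGpu initialized using PTHREAD_MUTEX_INITIALIZER, present_coarse reports that the decoder thread was locked out during the wait, while present_granular reports that it could acquire the lock. The probe uses pthread_mutex_trylock so that the sketch stays deterministic and never deadlocks.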