Rework the CDEF top edge handling

Avoids some pointer chasing and simplifies the DSP code, at the cost of a slightly more complicated initialization.
Also slightly reduces memory usage by sizing the buffers for the actual pixel layout instead of always allocating enough space for 4:4:4.
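
To illustrate the buffer sizing point, here is a minimal sketch of how the number of samples needed for the saved edge rows could depend on the pixel layout. The enum, the function name and the `rows` parameter are hypothetical and are not taken from the actual code; the sketch only assumes that chroma rows are half as wide when the layout is horizontally subsampled.

```c
#include <stddef.h>

/* Hypothetical layout enum, for illustration only. */
enum PixelLayout { LAYOUT_I400, LAYOUT_I420, LAYOUT_I422, LAYOUT_I444 };

/*
 * Number of samples needed to store `rows` saved rows for all planes.
 * Assumption: only horizontal subsampling changes the width of a chroma
 * row; 4:0:0 has no chroma planes at all.
 */
static size_t edge_buf_samples(enum PixelLayout layout,
                               size_t luma_width, size_t rows)
{
    const size_t luma = luma_width * rows;
    const int ss_hor = layout != LAYOUT_I444; /* 1 for 4:2:0 and 4:2:2 */
    const size_t chroma_width = (luma_width + ss_hor) >> ss_hor;

    if (layout == LAYOUT_I400)
        return luma;

    /* One luma plane plus two chroma planes. */
    return luma + 2 * chroma_width * rows;
}
```

With a 1920 sample wide frame and 2 saved rows, this gives 7680 samples for 4:2:0 versus 11520 for 4:4:4, which is the kind of saving the sizing change is after.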