Use some 8-bit arithmetic in AVX2 CDEF filter

Before:
cdef_filter_8x8_8bpc_avx2: 252.3
cdef_filter_4x8_8bpc_avx2: 182.1
cdef_filter_4x4_8bpc_avx2: 105.7
After:
cdef_filter_8x8_8bpc_avx2: 235.5
cdef_filter_4x8_8bpc_avx2: 174.8
cdef_filter_4x4_8bpc_avx2: 101.8