x86: Rewrite sgr 8-bit SSSE3 asm
sgr_8bpc_ssse3 rewrite:
Old:
sgr_3x3_8bpc_ssse3: 140121.1
sgr_3x3_8bpc_avx2: 72965.4
sgr_5x5_8bpc_ssse3: 89859.1
sgr_5x5_8bpc_avx2: 48881.9
sgr_mix_8bpc_ssse3: 236626.5
sgr_mix_8bpc_avx2: 110552.6
New:
sgr_3x3_8bpc_ssse3: 117294.4
sgr_3x3_8bpc_avx2: 72243.5
sgr_5x5_8bpc_ssse3: 79929.6
sgr_5x5_8bpc_avx2: 49798.4
sgr_mix_8bpc_ssse3: 184183.9
sgr_mix_8bpc_avx2: 109771.7
Also includes some minor optimizations for 16bpc:
Old:
sgr_5x5_10bpc_ssse3: 87026.6
sgr_5x5_10bpc_avx2: 51864.5
sgr_mix_10bpc_ssse3: 205460.2
sgr_mix_10bpc_avx2: 122199.7
New:
sgr_5x5_10bpc_ssse3: 84786.5
sgr_5x5_10bpc_avx2: 51651.3
sgr_mix_10bpc_ssse3: 202722.2
sgr_mix_10bpc_avx2: 122340.0
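
For a quick read of the SSSE3 numbers above, the relative speedup of each function can be computed as old / new. The Python sketch below is not part of this change. It assumes the figures are checkasm timings where lower is better, and the `speedup` helper and dictionary names are made up for illustration.

```python
# SSSE3 timings copied from the listings above (assumed: lower is better).
old = {
    "sgr_3x3_8bpc_ssse3": 140121.1,
    "sgr_5x5_8bpc_ssse3": 89859.1,
    "sgr_mix_8bpc_ssse3": 236626.5,
    "sgr_5x5_10bpc_ssse3": 87026.6,
    "sgr_mix_10bpc_ssse3": 205460.2,
}
new = {
    "sgr_3x3_8bpc_ssse3": 117294.4,
    "sgr_5x5_8bpc_ssse3": 79929.6,
    "sgr_mix_8bpc_ssse3": 184183.9,
    "sgr_5x5_10bpc_ssse3": 84786.5,
    "sgr_mix_10bpc_ssse3": 202722.2,
}


def speedup(old_time, new_time):
    """Hypothetical helper: how many times faster the new code is."""
    return old_time / new_time


# Ratio above 1.0 means the new code is faster.
speedups = {name: speedup(old[name], new[name]) for name in old}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.3f}x")
```

With these inputs the 8bpc SSSE3 ratios come out to roughly 1.19x (3x3), 1.12x (5x5), and 1.28x (mix), while the 10bpc SSSE3 ratios stay within a few percent of 1.0.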