aarch64: ipred: Use the right fill width loop in ipred_z3_fill_padding_neon
This makes the code behave as intended when filling a rectangle of arbitrary width: each step fills a stripe of the largest power-of-two width that still fits, until the whole rectangle is filled. Previously, it accidentally fell back to writing 4 pixel wide stripes immediately.
This has no measurable effect on checkasm benchmarks, though.
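
For reference, the intended fill strategy can be sketched in C roughly as follows. This is only an illustration of the idea, not a transcription of the assembly: the names fill_padding and largest_pow2_stripe, the MAX_STRIPE cap of 32, and the 8 bit pixel type are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cap on the stripe width; the real assembly may use a
 * different limit. */
#define MAX_STRIPE 32

/* Largest power of two that is <= w, capped at MAX_STRIPE. Requires w >= 1. */
static int largest_pow2_stripe(int w) {
    int s = 1;
    while (s * 2 <= w && s * 2 <= MAX_STRIPE)
        s *= 2;
    return s;
}

/* Fill a w x h rectangle starting at dst with the value px, one vertical
 * stripe at a time. Each stripe is as wide as the largest power of two that
 * still fits in the remaining width. Returns the number of stripes written. */
static int fill_padding(uint8_t *dst, ptrdiff_t stride, uint8_t px,
                        int w, int h) {
    int stripes = 0;
    while (w > 0) {
        const int s = largest_pow2_stripe(w);
        for (int y = 0; y < h; y++)
            memset(dst + y * stride, px, (size_t)s);
        dst += s;   /* advance past the stripe just written */
        w -= s;
        stripes++;
    }
    return stripes;
}
```

Under these assumptions, a 44 pixel wide rectangle is covered by three stripes (32 + 8 + 4), whereas going straight to 4 pixel wide stripes takes eleven.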