arm64: itx: Fix overflow/clipping in negation in idct16

Don't do a clipped negation in 16 bit before the multiplication, as the
clipping can change the end result. Instead, do the multiplication first
and negate in 32 bit, just like in the reference.
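
For illustration, here is a minimal C sketch of the two orderings. It is
not the assembly touched by this commit: it assumes the clipped negation
behaves like a saturating 16 bit negate (as sqneg does per lane), and the
helper names and the coefficient value used in the note below are
hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: saturating 16 bit negation. -32768 has no positive
 * 16 bit counterpart, so it is clipped to 32767. */
static int16_t sat_neg16(int16_t x)
{
    return x == INT16_MIN ? INT16_MAX : (int16_t)-x;
}

/* Old order: clipped negation in 16 bit, then a widening multiply. */
static int32_t neg_then_mul(int16_t in, int16_t coef)
{
    return (int32_t)sat_neg16(in) * coef;
}

/* New order: widening multiply first, then negate in 32 bit. The product
 * of two 16 bit values is at most 2^30 in magnitude, so the 32 bit
 * negation cannot overflow. */
static int32_t mul_then_neg(int16_t in, int16_t coef)
{
    return -((int32_t)in * coef);
}
```

The two orderings agree for every input except -32768. With in = -32768
and coef = 1000, neg_then_mul() gives 32767000 while mul_then_neg() gives
32768000, so the result is off by one coefficient's worth before rounding.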