itx: clip according to spec, fixes #103
This does not adjust the AVX2 asm. The asm clips to the required range (signed 16-bit) in many places for performance reasons. No mismatches were observed in 10,000 checkasm runs with coefficients generated by the forward transform.
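
For reference, a minimal sketch of what clipping to the signed 16-bit range means in C. The helper names clip_range and clip_int16 are hypothetical and are not taken from the patch. The sketch assumes the 16-bit range is the one that applies here, as stated above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): saturate v to [min, max]. */
static inline int clip_range(const int v, const int min, const int max) {
    assert(min <= max);
    return v < min ? min : v > max ? max : v;
}

/* Clip an intermediate transform value to the signed 16-bit range,
 * i.e. [-32768, 32767]. Assumption: this is the range required here. */
static inline int clip_int16(const int v) {
    return clip_range(v, INT16_MIN, INT16_MAX);
}
```

Values already inside the range pass through unchanged, so adding such a clip only affects out-of-range intermediates.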