Control the stack size of spawned threads
This deals with #249 (closed) in a different way: the stack size is now set explicitly for the threads we spawn. We can't do anything about the main thread, so handling that is left to the user.
Actual stack usage appears to be somewhere between 256 KiB and 512 KiB: Fuchsia defaults to the former and has issues, while macOS defaults to the latter and works fine. For reference, Windows uses 1024 KiB and Linux 8192 KiB (on x86-64 at least; no idea whether the numbers differ on other architectures).
After this change, virtual memory usage on Linux is reduced by around half a gigabyte when decoding a video with --framethreads 8 --tilethreads 8 (the reduction scales linearly with thread count).
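For readers unfamiliar with the mechanism, here is a minimal sketch of spawning a pthread with an explicit stack size instead of the platform default. It is not the code from this change: the names spawn_with_stack_size, worker and WORKER_STACK_SIZE are hypothetical, and the 1 MiB value is only an illustrative assumption chosen to sit above the 256-512 KiB usage estimated above.

```c
#include <pthread.h>
#include <stddef.h>

/* Illustrative value only (assumption): comfortably above the estimated
 * 256-512 KiB of real usage, well below the 8192 KiB Linux default. */
#define WORKER_STACK_SIZE ((size_t)1024 * 1024)

/* Spawn a thread whose stack size is set explicitly through a thread
 * attribute object. Returns 0 on success or a pthread error code, e.g.
 * EINVAL if stack_size is below PTHREAD_STACK_MIN. */
static int spawn_with_stack_size(pthread_t *thread, size_t stack_size,
                                 void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    int ret = pthread_attr_init(&attr);
    if (ret)
        return ret;

    ret = pthread_attr_setstacksize(&attr, stack_size);
    if (!ret)
        ret = pthread_create(thread, &attr, fn, arg);

    /* The attribute object is no longer needed once the thread exists. */
    pthread_attr_destroy(&attr);
    return ret;
}

/* Hypothetical worker: touches 256 KiB of stack, one byte per 4 KiB page,
 * to stand in for a stack-hungry decoding task. */
static void *worker(void *arg)
{
    volatile unsigned char scratch[256 * 1024];
    for (size_t i = 0; i < sizeof(scratch); i += 4096)
        scratch[i] = 1;
    *(int *)arg = scratch[0];
    return NULL;
}
```

A caller would pass WORKER_STACK_SIZE and worker to spawn_with_stack_size and then pthread_join the thread as usual. Since each thread reserves its own stack in virtual memory, the saving relative to the default grows with the number of threads spawned. The main thread's stack is set up before any of this code runs, which is why it cannot be adjusted the same way.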
Edited by Henrik Gramner