url: Optimize vlc_url_decode
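This MR avoids calling strtoul on a temporary string, which gives a measurable speedup for each percent-encoded character in the URL being decoded. In the benchmark below, the mean time drops from 29.0 ns to 24.7 ns (about 15%) for an input with one escape sequence, and from 79.2 ns to 57.5 ns (about 27%) for an input with seven.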
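For readers who want to see the idea, here is a minimal sketch of decoding the two hex digits directly instead of copying them into a temporary buffer and calling strtoul. It is not the code from this MR: `hex_digit_value` and `sketch_url_decode` are hypothetical names, and returning NULL on a malformed escape is an assumption about the desired error handling.

```c
#include <stddef.h>

/* Hypothetical helper: numeric value of one hex digit, or -1 if the
 * character is not a hex digit (this includes the terminating NUL). */
static int hex_digit_value(unsigned char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/* Hypothetical in-place decoder (illustration only, not the VLC function).
 * Returns str on success, or NULL if a '%' is not followed by two hex
 * digits; in that case the buffer contents are unspecified. */
static char *sketch_url_decode(char *str)
{
    const char *in = str;
    char *out = str;

    while (*in != '\0') {
        if (*in == '%') {
            /* Check the first digit before touching the second one, so
             * we never read past the end of a truncated escape. */
            int hi = hex_digit_value((unsigned char)in[1]);
            if (hi < 0)
                return NULL;
            int lo = hex_digit_value((unsigned char)in[2]);
            if (lo < 0)
                return NULL;
            *out++ = (char)((hi << 4) | lo);
            in += 3;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
    return str;
}
```

Called on a writable buffer holding "simple%20test", the sketch rewrites it in place to "simple test". Each escape then costs a few comparisons and a shift, with no buffer copy and no strtoul parsing, which is the kind of per-character saving described above.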
Here's a quick benchmark on my system:
input: "simple%20test"
current version:
----------------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------------
BenchUrlDecode 28.4 ns 28.3 ns 24731904
BenchUrlDecode 28.6 ns 28.5 ns 24731904
BenchUrlDecode 28.6 ns 28.6 ns 24731904
BenchUrlDecode 28.6 ns 28.4 ns 24731904
BenchUrlDecode 29.9 ns 29.8 ns 24731904
BenchUrlDecode 30.7 ns 30.6 ns 24731904
BenchUrlDecode 29.9 ns 29.8 ns 24731904
BenchUrlDecode 28.6 ns 28.5 ns 24731904
BenchUrlDecode 28.5 ns 28.4 ns 24731904
BenchUrlDecode 28.6 ns 28.5 ns 24731904
BenchUrlDecode_mean 29.0 ns 28.9 ns 10
BenchUrlDecode_median 28.6 ns 28.5 ns 10
BenchUrlDecode_stddev 0.803 ns 0.792 ns 10
BenchUrlDecode_cv 2.77 % 2.74 % 10
optimized version:
----------------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------------
BenchUrlDecode 25.9 ns 25.9 ns 28513422
BenchUrlDecode 24.5 ns 24.5 ns 28513422
BenchUrlDecode 24.4 ns 24.4 ns 28513422
BenchUrlDecode 24.3 ns 24.3 ns 28513422
BenchUrlDecode 24.5 ns 24.5 ns 28513422
BenchUrlDecode 24.5 ns 24.5 ns 28513422
BenchUrlDecode 24.7 ns 24.7 ns 28513422
BenchUrlDecode 24.7 ns 24.7 ns 28513422
BenchUrlDecode 24.7 ns 24.7 ns 28513422
BenchUrlDecode 24.6 ns 24.6 ns 28513422
BenchUrlDecode_mean 24.7 ns 24.7 ns 10
BenchUrlDecode_median 24.6 ns 24.6 ns 10
BenchUrlDecode_stddev 0.447 ns 0.447 ns 10
BenchUrlDecode_cv 1.81 % 1.81 % 10
input: "T%C3%a9l%c3%A9vision %e2%82%Ac"
current version:
----------------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------------
BenchUrlDecode 79.1 ns 79.1 ns 8865286
BenchUrlDecode 79.3 ns 79.3 ns 8865286
BenchUrlDecode 79.2 ns 79.2 ns 8865286
BenchUrlDecode 79.4 ns 79.4 ns 8865286
BenchUrlDecode 79.1 ns 79.1 ns 8865286
BenchUrlDecode 79.1 ns 79.1 ns 8865286
BenchUrlDecode 79.2 ns 79.2 ns 8865286
BenchUrlDecode 79.1 ns 79.1 ns 8865286
BenchUrlDecode 79.0 ns 79.0 ns 8865286
BenchUrlDecode 79.2 ns 79.2 ns 8865286
BenchUrlDecode_mean 79.2 ns 79.2 ns 10
BenchUrlDecode_median 79.1 ns 79.1 ns 10
BenchUrlDecode_stddev 0.117 ns 0.111 ns 10
BenchUrlDecode_cv 0.15 % 0.14 % 10
optimized version:
----------------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------------
BenchUrlDecode 57.5 ns 57.5 ns 12235591
BenchUrlDecode 57.2 ns 57.2 ns 12235591
BenchUrlDecode 57.6 ns 57.6 ns 12235591
BenchUrlDecode 57.5 ns 57.5 ns 12235591
BenchUrlDecode 57.8 ns 57.8 ns 12235591
BenchUrlDecode 57.6 ns 57.5 ns 12235591
BenchUrlDecode 57.6 ns 57.6 ns 12235591
BenchUrlDecode 57.6 ns 57.6 ns 12235591
BenchUrlDecode 57.6 ns 57.6 ns 12235591
BenchUrlDecode 57.4 ns 57.4 ns 12235591
BenchUrlDecode_mean 57.5 ns 57.5 ns 10
BenchUrlDecode_median 57.6 ns 57.6 ns 10
BenchUrlDecode_stddev 0.149 ns 0.149 ns 10
BenchUrlDecode_cv 0.26 % 0.26 % 10