[Discussion] OpenGL upscalers
Hi,
I am attempting to continue the work Louis Régnier did during his GSoC to integrate upscalers from libplacebo into our OpenGL vout (patchset, in particular "[PATCH 5/5] opengl: WIP introduce libplacebo sampling filters"):
I am not sure I'm going in the right direction, so I feel the need to discuss it.
Context
Here is a quick reminder of how the OpenGL vout/filters work.
The OpenGL vout basically runs a specific OpenGL filter (a renderer), so the goal is to add support for upscalers to OpenGL filters.
An OpenGL filter executes an OpenGL program, compiled from a vertex shader and a fragment shader. Since the input picture comes from a picture_t, it could be in any format (I420, RGB, etc.), with any orientation or offset, may be stored in hardware (opaque), etc. In order to avoid handling all these details in every filter, the API exposes a sampler to abstract the input picture from its storage, so that filters may access the input picture uniformly.
Concretely, the sampler generates a piece of fragment shader code to expose a specific GLSL function:
vec4 vlc_texture(vec2 pic_coords)
which handles all the details to return the RGBA values for the given picture coordinates (independently of the concrete texture coordinates). In addition, it exposes ops to interact with the generated GLSL code in the OpenGL program (to fetch the locations of uniform or attribute variables and load the data before drawing).
Thus, to access the input image, an OpenGL filter concatenates GLSL code to create its fragment shader. For illustration purposes, here is such a possible fragment shader:
#version 120
// generated by the sampler
uniform mat4 ConvMatrix;
...
vec4 vlc_texture(vec2 pic_coords) {
...
}
// filter code
varying vec2 pic_coords;
vec4 my_awesome_filter(vec4 pix) {
return …
}
void main() {
vec4 pix = vlc_texture(pic_coords);
gl_FragColor = my_awesome_filter(pix);
}
To support libplacebo upscalers, the idea is to provide a specific implementation of vec4 vlc_texture(vec2 pic_coords), which in turn would use the code generated by libplacebo.
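For illustration, such an implementation could boil down to a simple delegation. This is only a sketch: the sample_scaler() function name and the uniforms are placeholders, not the actual GLSL that libplacebo generates:

```glsl
#version 120
// Sketch only: sample_scaler() and the uniform names are placeholders
// for the GLSL actually generated by libplacebo.
uniform sampler2D Texture0;      // intermediate texture (see the first pass below)
// ... uniforms emitted by libplacebo (kernel LUT, scaling ratio, etc.) ...
vec4 sample_scaler(vec2 coords); // body generated by libplacebo

// The sampler-provided entry point just delegates to the
// libplacebo-generated code, which reads Texture0 directly.
vec4 vlc_texture(vec2 pic_coords) {
    return sample_scaler(pic_coords);
}
```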
In addition to abstracting the picture storage, a sampler already naturally applies linear scaling, so it makes a lot of sense to extend it to be in charge of more complex upscaling algorithms. As a side effect, this would also allow any filter to benefit from these upscaling algorithms.
Libplacebo input
The first difficulty is that the libplacebo-generated code accesses the input texture directly (via texture2D() for example), so it cannot sample arbitrary VLC input (with orientation, offset, etc.). Besides, even if it could (by calling some VLC-provided GLSL function, for example), we probably would not want that for performance reasons: each fragment (pixel) requires several tens of texture accesses (typically to the neighboring pixels), and we do not want to apply the transformation matrices on every access.
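To make the cost concrete, here is a rough sketch of a 4x4-tap scaler (16 texture reads per fragment); the uniform names are assumptions and the equal weights stand in for a real kernel such as Lanczos:

```glsl
#version 120
// Sketch of a hypothetical 4x4-tap upscaler: 16 texture reads per fragment.
// If every tap went through a VLC-provided vlc_texture(), the
// orientation/offset matrices would be applied 16 times per pixel.
uniform sampler2D Tex;
uniform vec2 TexelSize;   // assumed: 1.0 / texture dimensions
varying vec2 tex_coords;

void main() {
    vec4 acc = vec4(0.0);
    for (int i = -1; i <= 2; ++i)
        for (int j = -1; j <= 2; ++j)
            // a real scaler would weight each tap with its kernel value;
            // equal weights are used here just for illustration
            acc += texture2D(Tex, tex_coords + vec2(j, i) * TexelSize);
    gl_FragColor = acc / 16.0;
}
```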
So to make it work properly, we must first render the input picture to intermediate textures, with the correct orientation and without offset, then use these intermediate textures as the input of a second pass: an OpenGL program whose fragment shader includes the libplacebo-generated upscaler code.
First rendering pass
Here comes the second difficulty: the sampler provides a piece of a fragment shader, to be embedded into the client OpenGL program; it is not designed to be multipass at all (i.e. to run a separate OpenGL program that draws the input picture in a first pass).
Since this first pass is basically a draw filter (but applied to each plane, without chroma conversion), it is possible to cobble together a solution which inserts a separate (modified) draw filter in front of the client filter. It is a bit tricky/hacky due to the initialization order (the sampler must know which scaling algorithm to use on creation, but it is created differently depending on its input, in particular whether a draw filter is inserted in front), but I managed to make something pretty much work.
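For one plane, such a first pass is essentially an identity draw through the sampler. A sketch, assuming a per-plane sampler that applies orientation/offset but no chroma conversion (the uniform names are hypothetical):

```glsl
#version 120
// generated by a per-plane sampler: applies orientation/offset,
// no chroma conversion (hypothetical uniform names)
uniform sampler2D PlaneTexture;
uniform mat3 OrientMatrix;   // orientation + offset, applied once per pixel
varying vec2 pic_coords;

vec4 vlc_texture(vec2 coords) {
    return texture2D(PlaneTexture, (OrientMatrix * vec3(coords, 1.0)).xy);
}

// the "draw" part: copy the plane into the intermediate texture
void main() {
    gl_FragColor = vlc_texture(pic_coords);
}
```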
Doubts
Now that you have more context, here is what bothers me.
From the beginning, I assumed that an upscaler filter must necessarily be executed on each plane separately (for example on an I420 input) before chroma conversion. The rationale is that if we convert to RGBA first (or at least scale the U and V textures to the Y plane size), then the real upscaler will operate on linearly-upscaled chroma planes, giving suboptimal results.
However, after discussions with @haasn, there is a performance trade-off to consider. In particular:
- upscaling planar textures is less efficient than packed textures (and we must do the heavy work 2 or 3 times, once per texture)
- the chroma conversion to RGBA will happen at higher resolution
Moreover, some scalers might want to operate on RGB input (e.g. a neural network trained on RGB data), so they cannot operate on individual planes separately.
Therefore, the solution I'm trying to implement, in addition to being a bit hacky (inserting a filter from a sampler, whose initialization depends on that inserted filter), is very restrictive: it cannot be adapted to implement slight variations.
Here is what libplacebo does (quoting @haasn):
- merge U and V planes into a single UV texture
- upscale UV to the same size as Y (using the target upscaler, so the UV planes end up being upscaled twice, both times with the proper upscaler)
- convert to RGB
- upscale this to target resolution
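The first step above, for instance, is a trivial extra pass; merging the U and V planes could look like this (a sketch, with assumed uniform names):

```glsl
#version 120
// pack separate U and V planes into a single two-component UV texture
// (hypothetical uniform names)
uniform sampler2D TexU;
uniform sampler2D TexV;
varying vec2 tex_coords;

void main() {
    gl_FragColor = vec4(texture2D(TexU, tex_coords).r,
                        texture2D(TexV, tex_coords).r,
                        0.0, 1.0);
}
```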
This behavior does not fit our sampler design at all:
- in our OpenGL filters API, filters are either "normal" (they output RGBA) or "planes" (they output the same format as their input; this was added later to support deinterlacing filters): we could not take I420 input and output NV12, for example (by adding an additional filter in front)
- a sampler does not support multiple rendering passes (by the way, each pass would require its own sampler)
As a "plan B", we could decide to implement the upscalers directly in the renderer (an OpenGL filter). But that does not work either: the sampler exposes the input picture independently of its format/storage (on purpose, so that it can do all the heavy conversion work), so the UV planes are not exposed and cannot be upscaled before the chroma conversion to RGB.
I don't have any reasonable solution in mind.