ffplay Source Code Analysis 5: Image Format Conversion

This article is the author's original work. Please credit the source when reposting: https://www.cnblogs.com/leisure_chn/p/10311376.html

ffplay is the simple player that ships with the FFmpeg project. It plays video using FFmpeg's decoders and the SDL library. This article analyzes FFmpeg 8.0; the ffplay source is here:
https://github.com/FFmpeg/FFmpeg/blob/n8.0/fftools/ffplay.c

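Before diving into the source, the following reference articles are useful background:
[1]. Lei Xiaohua, "A Zero-Basics Study Guide to Audio/Video Codec Technology"
[2]. Basic Concepts of Video Coding
[3]. Color Spaces and Pixel Formats
[4]. Audio Parameters Explained
[5]. FFmpeg Basic Concepts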

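The articles in the "ffplay Source Code Analysis" series are:
[1]. ffplay Source Code Analysis 1: Overview
[2]. ffplay Source Code Analysis 2: Data Structures
[3]. ffplay Source Code Analysis 3: Code Framework
[4]. ffplay Source Code Analysis 4: Audio/Video Synchronization
[5]. ffplay Source Code Analysis 5: Image Format Conversion
[6]. ffplay Source Code Analysis 6: Audio Resampling
[7]. ffplay Source Code Analysis 7: Playback Control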

5. Image Format Conversion

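Image format conversion is preparation for video playback. FFmpeg does the decoding and SDL does the display, but the format of a frame decoded by FFmpeg is not necessarily supported by SDL. When it is not, the frame must be converted to a format SDL supports, otherwise it cannot be displayed correctly.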

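There are two ways to do the conversion: through a video filter, or by calling sws_scale() directly. In the first approach, the filter itself relies on libswscale, the library that provides sws_scale(), to do the actual conversion.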

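In older versions of ffplay, filters were optional, and the CONFIG_AVFILTER macro decided whether they were used. Those versions contained both conversion paths: filter-based and sws_scale()-based. Newer ffplay (after commit c29e5ab5) removed the CONFIG_AVFILTER macro, made filters mandatory, and removed the sws_scale() conversion code. See the following commit: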

2023-03-10 c29e5ab5 fftools/ffplay: depend on avfilter

Making lavfi optional adds a lot of complexity for very questionable
gain.

This section covers both conversion methods.

5.1 FFmpeg and SDL Pixel Formats

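A pixel format describes how the pixel data of one frame is laid out in memory. The same pixel format has different names in different systems: FFmpeg, SDL, V4L2 and others each define their own set of pixel formats. The "image format" in "image format conversion" really means the pixel format of a video frame.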

For details on pixel formats, see "Color Spaces and Pixel Formats" (reference [3] above).
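To make "layout in memory" concrete, the sketch below computes how many bytes one frame occupies in two common layouts: planar YUV 4:2:0 (three planes, chroma subsampled by 2 in both directions) and packed YUYV 4:2:2 (one plane, two pixels per 4-byte group). The function names are hypothetical, and the sketch assumes tightly packed rows with no padding, which real FFmpeg frames do not guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: ceil(x / 2), the size of a dimension subsampled by 2. */
static size_t half_up(size_t x)
{
    return (x + 1) / 2;
}

/* Planar YUV 4:2:0: a full-size Y plane plus U and V planes that are
 * subsampled by 2 horizontally and vertically (assumes no row padding). */
static size_t yuv420p_frame_bytes(size_t width, size_t height)
{
    assert(width > 0 && height > 0);
    return width * height + 2 * half_up(width) * half_up(height);
}

/* Packed YUYV 4:2:2: a single plane in which every pair of pixels is
 * stored as Y0 U Y1 V (4 bytes), so an odd width rounds up to a whole pair. */
static size_t yuyv422_frame_bytes(size_t width, size_t height)
{
    assert(width > 0 && height > 0);
    return half_up(width) * 4 * height;
}
```

For a 4x4 image the planar layout needs 24 bytes and the packed layout 32 bytes, even though both describe the same picture dimensions.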

5.1.1 FFmpeg-to-SDL Pixel Format Map

ffplay.c defines a table, sdl_texture_format_map[], that maps FFmpeg pixel formats to SDL pixel formats:

static const struct TextureFormatEntry {
    enum AVPixelFormat format;
    int texture_fmt;
} sdl_texture_format_map[] = {
    { AV_PIX_FMT_RGB8,           SDL_PIXELFORMAT_RGB332 },
    { AV_PIX_FMT_RGB444,         SDL_PIXELFORMAT_RGB444 },
    { AV_PIX_FMT_RGB555,         SDL_PIXELFORMAT_RGB555 },
    { AV_PIX_FMT_BGR555,         SDL_PIXELFORMAT_BGR555 },
    { AV_PIX_FMT_RGB565,         SDL_PIXELFORMAT_RGB565 },
    { AV_PIX_FMT_BGR565,         SDL_PIXELFORMAT_BGR565 },
    { AV_PIX_FMT_RGB24,          SDL_PIXELFORMAT_RGB24 },
    { AV_PIX_FMT_BGR24,          SDL_PIXELFORMAT_BGR24 },
    { AV_PIX_FMT_0RGB32,         SDL_PIXELFORMAT_RGB888 },
    { AV_PIX_FMT_0BGR32,         SDL_PIXELFORMAT_BGR888 },
    { AV_PIX_FMT_NE(RGB0, 0BGR), SDL_PIXELFORMAT_RGBX8888 },
    { AV_PIX_FMT_NE(BGR0, 0RGB), SDL_PIXELFORMAT_BGRX8888 },
    { AV_PIX_FMT_RGB32,          SDL_PIXELFORMAT_ARGB8888 },
    { AV_PIX_FMT_RGB32_1,        SDL_PIXELFORMAT_RGBA8888 },
    { AV_PIX_FMT_BGR32,          SDL_PIXELFORMAT_ABGR8888 },
    { AV_PIX_FMT_BGR32_1,        SDL_PIXELFORMAT_BGRA8888 },
    { AV_PIX_FMT_YUV420P,        SDL_PIXELFORMAT_IYUV },
    { AV_PIX_FMT_YUYV422,        SDL_PIXELFORMAT_YUY2 },
    { AV_PIX_FMT_UYVY422,        SDL_PIXELFORMAT_UYVY },
};

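The formats in this table are the ones SDL supports; frames in these formats can be handed to SDL without conversion. Note that a format supported by SDL is not necessarily supported by the SDL renderer (the renderer's supported formats depend on the playback device hardware). In that case SDL converts internally so that the renderer can display the frame; the details are not covered here.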

5.1.2 get_sdl_pix_fmt_and_blendmode()

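This function returns the SDL pixel format that corresponds to an FFmpeg pixel format. The input parameter format is the FFmpeg pixel format. The output parameter sdl_pix_fmt receives the SDL pixel format, or SDL_PIXELFORMAT_UNKNOWN if the table has no match. The output parameter sdl_blendmode is set to SDL_BLENDMODE_BLEND for the four 32-bit formats that carry an alpha channel and to SDL_BLENDMODE_NONE otherwise: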

static void get_sdl_pix_fmt_and_blendmode(int format, Uint32 *sdl_pix_fmt, SDL_BlendMode *sdl_blendmode)
{
    int i;
    *sdl_blendmode = SDL_BLENDMODE_NONE;
    *sdl_pix_fmt = SDL_PIXELFORMAT_UNKNOWN;
    if (format == AV_PIX_FMT_RGB32   ||
        format == AV_PIX_FMT_RGB32_1 ||
        format == AV_PIX_FMT_BGR32   ||
        format == AV_PIX_FMT_BGR32_1)
        *sdl_blendmode = SDL_BLENDMODE_BLEND;
    for (i = 0; i < FF_ARRAY_ELEMS(sdl_texture_format_map); i++) {
        if (format == sdl_texture_format_map[i].format) {
            *sdl_pix_fmt = sdl_texture_format_map[i].texture_fmt;
            return;
        }
    }
}
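The lookup can be tried without FFmpeg or SDL headers. The sketch below is a simplified model with stand-in enums (src_fmt, dst_fmt and blend are hypothetical names, not real FFmpeg or SDL types) and a three-entry table; it keeps the same shape as the function above: default outputs first, then a linear search.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for enum AVPixelFormat, SDL pixel format codes and SDL_BlendMode. */
enum src_fmt { SRC_FMT_YUV420P, SRC_FMT_RGB24, SRC_FMT_RGB32, SRC_FMT_NV21 };
enum dst_fmt { DST_FMT_UNKNOWN = 0, DST_FMT_IYUV, DST_FMT_RGB24, DST_FMT_ARGB8888 };
enum blend   { BLEND_NONE = 0, BLEND_BLEND };

static const struct {
    enum src_fmt src;
    enum dst_fmt dst;
} fmt_map[] = {
    { SRC_FMT_YUV420P, DST_FMT_IYUV     },
    { SRC_FMT_RGB24,   DST_FMT_RGB24    },
    { SRC_FMT_RGB32,   DST_FMT_ARGB8888 },
};

/* Sets *dst to the mapped format, or DST_FMT_UNKNOWN when src is not in the
 * table. Only the format with an alpha channel asks for blending. */
static void lookup_fmt(enum src_fmt src, enum dst_fmt *dst, enum blend *blend)
{
    size_t i;

    assert(dst && blend);
    *blend = (src == SRC_FMT_RGB32) ? BLEND_BLEND : BLEND_NONE;
    *dst = DST_FMT_UNKNOWN;
    for (i = 0; i < sizeof(fmt_map) / sizeof(fmt_map[0]); i++) {
        if (fmt_map[i].src == src) {
            *dst = fmt_map[i].dst;
            return;
        }
    }
}
```

A source format missing from the table (SRC_FMT_NV21 here) comes back as DST_FMT_UNKNOWN, which is the signal the caller uses to decide that a conversion is required.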

5.2 Format Conversion with Filters

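With filters, format conversion happens in the video decoding thread. The decoding thread converts each frame before storing it in the video frame queue, so every queued frame is already in a format SDL supports, and the video playback thread can simply take frames from the queue and hand them to SDL.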

5.2.1 video_thread()

Here is video_thread() again:

static int video_thread(void *arg)
{
    ...
    AVFilterGraph *graph = NULL;
    AVFilterContext *filt_out = NULL, *filt_in = NULL;
    int last_w = 0;
    int last_h = 0;
    enum AVPixelFormat last_format = -2;
    int last_serial = -1;
    int last_vfilter_idx = 0;
    ...

    for (;;) {
        // Take a packet from the packet queue and decode it into a frame
        ret = get_video_frame(is, frame);
        ...

        // Configure the video filters
        if (   last_w != frame->width
            || last_h != frame->height
            || last_format != frame->format
            || last_serial != is->viddec.pkt_serial
            || last_vfilter_idx != is->vfilter_idx) {
            ...
            avfilter_graph_free(&graph);
            graph = avfilter_graph_alloc();
            ...
            graph->nb_threads = filter_nbthreads;
            // Configure the filter graph on first use or after a change; vfilters_list holds filter description strings such as "transpose=cclock,pad=iw+20:ih"
            if ((ret = configure_video_filters(graph, is, vfilters_list ? vfilters_list[is->vfilter_idx] : NULL, frame)) < 0) {
                ...
                goto the_end;
            }
            filt_in  = is->in_video_filter;
            filt_out = is->out_video_filter;
            last_w = frame->width;
            last_h = frame->height;
            last_format = frame->format;
            last_serial = is->viddec.pkt_serial;
            last_vfilter_idx = is->vfilter_idx;
            frame_rate = av_buffersink_get_frame_rate(filt_out);
        }

        // Push the frame into the filter graph input
        ret = av_buffersrc_add_frame(filt_in, frame);
        ...

        while (ret >= 0) {
            ...
            // Pull a frame from the filter graph output
            ret = av_buffersink_get_frame_flags(filt_out, frame, 0);
            ...
            // Push the current frame into the frame queue
            ret = queue_picture(is, frame, pts, duration, fd ? fd->pkt_pos : -1, is->viddec.pkt_serial);
            ...
        }
        ...
    }
    ...
    return 0;
}

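The basic flow is: 1) decode a video frame, 2) pass the frame through the filters for format conversion and other processing, 3) store the filtered frame in the frame queue.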

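At the first if statement above, the filter graph is reconfigured whenever the video stream's properties change. Because last_format and the other last_* variables are initialized to values that no real frame matches (last_format = -2, last_serial = -1, last_w = last_h = 0), the first iteration of the for loop after the decoding thread starts always enters the if branch and calls configure_video_filters(). The filter graph configured there performs the format conversion.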
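The "initialize to invalid values so the first frame always triggers configuration" pattern can be isolated in a few lines. The sketch below is a simplified model, not ffplay code: graph_state and graph_needs_rebuild() are hypothetical names, and the struct only mirrors the five values that video_thread() compares.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the properties the filter graph was last built for. */
struct graph_state {
    int width, height, format, serial, vfilter_idx;
};

/* Same initial values as in video_thread(): a real frame never has
 * width 0, height 0, format -2 or serial -1. */
static const struct graph_state GRAPH_STATE_INIT = { 0, 0, -2, -1, 0 };

/* Returns true, and records the new properties, when the graph must be
 * rebuilt. In ffplay the recording happens only after
 * configure_video_filters() has succeeded. */
static bool graph_needs_rebuild(struct graph_state *last,
                                const struct graph_state *cur)
{
    assert(last && cur);
    if (last->width == cur->width && last->height == cur->height &&
        last->format == cur->format && last->serial == cur->serial &&
        last->vfilter_idx == cur->vfilter_idx)
        return false;
    *last = *cur;
    return true;
}
```

Starting from GRAPH_STATE_INIT, the first frame always reports a rebuild, an identical second frame does not, and a frame with a new resolution does again.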

5.2.2 configure_video_filters()

configure_video_filters() builds the filter graph, which then performs a series of image processing operations.

5.2.2.1 Filters and Filter Graphs

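A filter operates on raw frame data. Different video filters perform different image processing operations, such as rotation or scaling. Several filters can be connected in series to form a filter chain (filter_chain) or filter graph (filter_graph). A configured filter graph looks like this: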

[Figure: filter graph]

A filter graph has two special endpoints: the buffer filter is the input endpoint and the buffersink filter is the output endpoint.

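Working with filters has two phases: configure first, then use. To configure, set up the buffer filter and the buffersink filter, then call configure_filtergraph() to connect the input endpoint (buffer), the output endpoint (buffersink) and the filters in between; see the code in the next subsection. To use the graph, push each frame into the buffer filter with av_buffersrc_add_frame(), let the filters process it, and pull the result from the buffersink filter with av_buffersink_get_frame_flags().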

For details on filters, see "FFmpeg Raw Frame Processing: Filter API Usage".

5.2.2.2 configure_video_filters() Listing

configure_video_filters() configures and builds the filter graph (filter_graph). The implementation is:

static int configure_video_filters(AVFilterGraph *graph, VideoState *is, const char *vfilters, AVFrame *frame)
{
    enum AVPixelFormat pix_fmts[FF_ARRAY_ELEMS(sdl_texture_format_map)];
    char sws_flags_str[512] = "";
    int ret;
    AVFilterContext *filt_src = NULL, *filt_out = NULL, *last_filter = NULL;
    AVCodecParameters *codecpar = is->video_st->codecpar;
    AVRational fr = av_guess_frame_rate(is->ic, is->video_st, NULL);
    const AVDictionaryEntry *e = NULL;
    int nb_pix_fmts = 0;
    int i, j;
    AVBufferSrcParameters *par = av_buffersrc_parameters_alloc();

    if (!par)
        return AVERROR(ENOMEM);

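    // Build the list of target pixel formats (the formats the current renderer supports):
    // sdl_texture_format_map[] maps FFmpeg pixel formats to SDL pixel formats
    // renderer_info.texture_formats lists the texture formats (SDL pixel formats) the renderer supports
    // Each of those is looked up in the map to get the corresponding FFmpeg pixel format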
    for (i = 0; i < renderer_info.num_texture_formats; i++) {
        for (j = 0; j < FF_ARRAY_ELEMS(sdl_texture_format_map); j++) {
            if (renderer_info.texture_formats[i] == sdl_texture_format_map[j].texture_fmt) {
                pix_fmts[nb_pix_fmts++] = sdl_texture_format_map[j].format;
                break;
            }
        }
    }

    while ((e = av_dict_iterate(sws_dict, e))) {
        if (!strcmp(e->key, "sws_flags")) {
            av_strlcatf(sws_flags_str, sizeof(sws_flags_str), "%s=%s:", "flags", e->value);
        } else
            av_strlcatf(sws_flags_str, sizeof(sws_flags_str), "%s=%s:", e->key, e->value);
    }
    if (strlen(sws_flags_str))
        sws_flags_str[strlen(sws_flags_str)-1] = '\0';

    graph->scale_sws_opts = av_strdup(sws_flags_str);

    // Create an instance of the buffer filter, filt_src (allocates the filter context only; no initialization yet)
    filt_src = avfilter_graph_alloc_filter(graph, avfilter_get_by_name("buffer"),
                                           "ffplay_buffer");
    if (!filt_src) {
        ret = AVERROR(ENOMEM);
        goto fail;
    }

    // Set up the buffer filter: pixel format, width, height and other key parameters come from the frame
    par->format              = frame->format;
    par->time_base           = is->video_st->time_base;
    par->width               = frame->width;
    par->height              = frame->height;
    par->sample_aspect_ratio = codecpar->sample_aspect_ratio;
    par->color_space         = frame->colorspace;
    par->color_range         = frame->color_range;
    par->frame_rate          = fr;
    par->hw_frames_ctx = frame->hw_frames_ctx;
    ret = av_buffersrc_parameters_set(filt_src, par);
    if (ret < 0)
        goto fail;

    // Initialize the buffer filter; it is fully initialized once this call returns.
    // Filter parameters should be set before calling avfilter_init_dict() or avfilter_init_str().
    // avfilter_graph_create_filter() calls both avfilter_graph_alloc_filter() and
    // avfilter_init_str(). If it were used above instead of avfilter_graph_alloc_filter(),
    // the filter would already be initialized, and calling av_buffersrc_parameters_set()
    // afterwards would be incorrect usage. Older ffplay versions had this problem;
    // it is fixed in 8.0.
    ret = avfilter_init_dict(filt_src, NULL);
    if (ret < 0)
        goto fail;

    // Create an instance of the buffersink filter, filt_out (allocates the filter context only; no initialization yet)
    filt_out = avfilter_graph_alloc_filter(graph, avfilter_get_by_name("buffersink"),
                                           "ffplay_buffersink");
    if (!filt_out) {
        ret = AVERROR(ENOMEM);
        goto fail;
    }

    // Set up the buffersink filter: set the pixel_formats and colorspaces options
    if ((ret = av_opt_set_array(filt_out, "pixel_formats", AV_OPT_SEARCH_CHILDREN,
                                0, nb_pix_fmts, AV_OPT_TYPE_PIXEL_FMT, pix_fmts)) < 0)
        goto fail;
    if (!vk_renderer &&
        (ret = av_opt_set_array(filt_out, "colorspaces", AV_OPT_SEARCH_CHILDREN,
                                0, FF_ARRAY_ELEMS(sdl_supported_color_spaces),
                                AV_OPT_TYPE_INT, sdl_supported_color_spaces)) < 0)
        goto fail;

    // Initialize the buffersink filter; it is fully initialized once this call returns
    ret = avfilter_init_dict(filt_out, NULL);
    if (ret < 0)
        goto fail;

    last_filter = filt_out;

/* Note: this macro adds a filter before the lastly added filter, so the
 * processing order of the filters is in reverse */
// Creates the filter `name` with arguments `arg`, links its output to the input of last_filter (so it runs before last_filter), then makes it the new last_filter
#define INSERT_FILT(name, arg) do {                                          \
    AVFilterContext *filt_ctx;                                               \
                                                                             \
    ret = avfilter_graph_create_filter(&filt_ctx,                            \
                                       avfilter_get_by_name(name),           \
                                       "ffplay_" name, arg, NULL, graph);    \
    if (ret < 0)                                                             \
        goto fail;                                                           \
                                                                             \
    ret = avfilter_link(filt_ctx, 0, last_filter, 0);                        \
    if (ret < 0)                                                             \
        goto fail;                                                           \
                                                                             \
    last_filter = filt_ctx;                                                  \
} while (0)

    if (autorotate) {   // auto-rotate
        double theta = 0.0;
        int32_t *displaymatrix = NULL;
        AVFrameSideData *sd = av_frame_get_side_data(frame, AV_FRAME_DATA_DISPLAYMATRIX);
        if (sd)
            displaymatrix = (int32_t *)sd->data;
        if (!displaymatrix) {
            const AVPacketSideData *psd = av_packet_side_data_get(is->video_st->codecpar->coded_side_data,
                                                                  is->video_st->codecpar->nb_coded_side_data,
                                                                  AV_PKT_DATA_DISPLAYMATRIX);
            if (psd)
                displaymatrix = (int32_t *)psd->data;
        }
        theta = get_rotation(displaymatrix);

        if (fabs(theta - 90) < 1.0) {
            INSERT_FILT("transpose", displaymatrix[3] > 0 ? "cclock_flip" : "clock");
        } else if (fabs(theta - 180) < 1.0) {
            if (displaymatrix[0] < 0)
                INSERT_FILT("hflip", NULL);
            if (displaymatrix[4] < 0)
                INSERT_FILT("vflip", NULL);
        } else if (fabs(theta - 270) < 1.0) {
            INSERT_FILT("transpose", displaymatrix[3] < 0 ? "clock_flip" : "cclock");
        } else if (fabs(theta) > 1.0) {
            char rotate_buf[64];
            snprintf(rotate_buf, sizeof(rotate_buf), "%f*PI/180", theta);
            INSERT_FILT("rotate", rotate_buf);
        } else {
            if (displaymatrix && displaymatrix[4] < 0)
                INSERT_FILT("vflip", NULL);
        }
    }

    // Configure and build the filter graph, connecting all filters (buffer, buffersink, and any filters given in vfilters)
    if ((ret = configure_filtergraph(graph, vfilters, filt_src, last_filter)) < 0)
        goto fail;

    is->in_video_filter  = filt_src;
    is->out_video_filter = filt_out;

fail:
    av_freep(&par);
    return ret;
}

The code follows the standard filter configuration flow:

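1) Create and initialize the buffer filter (input endpoint of the filter graph): allocate the filter context, set parameters, initialize
2) Create and initialize the buffersink filter (output endpoint of the filter graph): allocate the filter context, set parameters, initialize
3) Configure and build the filter graph, connecting all filters (the buffer filter, the buffersink filter, and any filters given in the vfilters parameter)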
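One detail in the listing is easy to misread: INSERT_FILT links each new filter in front of last_filter, so filters run in the reverse of the order in which they were inserted. The sketch below models this with a plain linked list; struct node, insert_before() and chain_to_string() are hypothetical stand-ins, not libavfilter API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for AVFilterContext: a name and the next filter downstream. */
struct node {
    const char *name;
    struct node *next;
};

/* Models the effect of INSERT_FILT: the new node's output feeds the current
 * head of the partial chain, and the new node becomes the head. */
static struct node *insert_before(struct node *head, struct node *n)
{
    assert(n);
    n->next = head;
    return n;
}

/* Writes the processing order, upstream to downstream, as "a>b>c". */
static void chain_to_string(const struct node *head, char *buf, size_t size)
{
    assert(buf && size > 0);
    buf[0] = '\0';
    for (; head; head = head->next) {
        strncat(buf, head->name, size - strlen(buf) - 1);
        if (head->next)
            strncat(buf, ">", size - strlen(buf) - 1);
    }
}
```

Inserting "hflip" and then "vflip" in front of the sink, as the 180 degree branch may do, yields the chain vflip>hflip>buffersink: the filter inserted last runs first.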

5.2.2.3 Formats Supported by the SDL Renderer

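The first for loop in configure_video_filters() builds the list of target formats for the conversion filter: it translates the list of SDL formats the renderer supports into a list of FFmpeg formats and stores it in the local array pix_fmts[].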

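A few SDL concepts are involved here. A window is the window that pops up when a video plays. A texture holds the pixel data of one image, similar to a frame in FFmpeg. A renderer draws textures into the window, which is what actually puts the image on screen. The renderer_info variable used in configure_video_filters() is a static variable defined by ffplay that stores information about the current renderer. Its texture_formats[] array is the list of pixel formats the current renderer supports. This list is a subset of SDL's pixel formats because a renderer is tied to the underlying software and hardware: for example, different kinds of display devices support different image formats, and any one device supports only some of them. With these concepts in mind, the for loop is easy to follow. Here are the SDL-related variables that ffplay defines: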

static SDL_Window *window;
static SDL_Renderer *renderer;
static SDL_RendererInfo renderer_info = {0};
static SDL_AudioDeviceID audio_dev;

int main(int argc, char **argv)
{
    ...
    flags = SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER;
    ...
    if (SDL_Init (flags)) {
        ...
        exit(1);
    }
    ...

    if (!display_disable) {
        ...
        window = SDL_CreateWindow(program_name, SDL_WINDOWPOS_UNDEFINED, SDL_WINDOWPOS_UNDEFINED, default_width, default_height, flags);
        SDL_SetHint(SDL_HINT_RENDER_SCALE_QUALITY, "linear");
        if (window) {
            renderer = SDL_CreateRenderer(window, -1, SDL_RENDERER_ACCELERATED | SDL_RENDERER_PRESENTVSYNC);
            ...
            if (renderer) {
                if (!SDL_GetRendererInfo(renderer, &renderer_info))
                    av_log(NULL, AV_LOG_VERBOSE, "Initialized %s renderer.\n", renderer_info.name);
            }
        }
        ...
    }

    is = stream_open(input_filename, file_iformat);
    ...
}

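As shown above, after the main thread creates the SDL window and SDL renderer, it calls SDL_GetRendererInfo() to store the renderer's information in renderer_info. That information includes renderer_info.texture_formats[], which configure_video_filters() then uses in the video decoding thread.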
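The first for loop of configure_video_filters() is essentially an ordered intersection: walk the renderer's format list and keep the FFmpeg format of every entry the map knows. The sketch below is a minimal model with plain integers in place of the real enums; fmt_pair and collect_target_formats() are hypothetical names, and the explicit capacity check is an addition of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical map entry: an FFmpeg-side code and the matching SDL-side code. */
struct fmt_pair {
    int ff_fmt;
    unsigned sdl_fmt;
};

/* For each renderer-supported SDL format, in renderer order, append the
 * mapped FFmpeg format to out[]. Formats the map does not know are skipped.
 * Returns the number of entries written, never more than cap. */
static size_t collect_target_formats(const unsigned *renderer_fmts, size_t nb_renderer,
                                     const struct fmt_pair *map, size_t nb_map,
                                     int *out, size_t cap)
{
    size_t i, j, n = 0;

    assert(out || cap == 0);
    for (i = 0; i < nb_renderer && n < cap; i++) {
        for (j = 0; j < nb_map; j++) {
            if (renderer_fmts[i] == map[j].sdl_fmt) {
                out[n++] = map[j].ff_fmt;
                break;
            }
        }
    }
    return n;
}
```

The output keeps the renderer's ordering, and the resulting list plays the role of pix_fmts[] passed to the buffersink filter.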

5.2.2.4 Source and Target Formats of the Conversion

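In the conversion, the source format is the pixel format of the source frame, and the target format is one of the formats the SDL renderer supports (stored in the local array pix_fmts[]).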

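Looking at configure_video_filters() again: when configuring the buffer filter, frame->format is set as the input format of the filter graph; when configuring the buffersink filter, av_opt_set_array(filt_out, "pixel_formats", ..., pix_fmts) sets the local array pix_fmts[] as the list of acceptable output formats. Finally, configure_filtergraph() connects the filters into a graph. While building the graph, FFmpeg automatically inserts a format conversion filter where one is needed, and that filter performs the image format conversion.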

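FFmpeg inserts format conversion filters automatically quite often. Because they are not specified on the command line or in code, they are implicit filters: they do their work silently and users usually never notice them. Tracing the internals shows that the conversion is performed by the scale video filter (vf_scale), which in turn is built on libswscale, the library behind sws_scale(). The details are not covered here.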

5.2.3 Overview of Video Decoding and Playback

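With filter-based conversion, the conversion happens in the decoding thread. The video decoding thread does: decode, convert, write to queue. The video playback thread does: read from queue, audio/video synchronization, SDL display.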

The call sequence in the video decoding thread is:

video_thread()
|-> get_video_frame()                // decode
|-> av_buffersrc_add_frame()         // format conversion input
|-> av_buffersink_get_frame_flags()  // format conversion output
|-> queue_picture()                  // write to queue

The call sequence in the video playback thread is:

main()
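|-> event_loop()
    |-> refresh_loop_wait_event()
        |-> video_refresh()           // read one frame from the queue and display it
            |-> compute_target_delay()    // audio/video sync timing correction
            |-> video_display()           // display the video frame
                |-> SDL_RenderClear()         // clear the SDL rendering target
                |-> video_image_display()     // update the SDL rendering target
                    |-> upload_texture()         // update the SDL texture data
                        |-> SDL_UpdateYUVTexture()/SDL_UpdateTexture()
                    |-> SDL_RenderCopyEx()       // copy the SDL texture to the rendering target
                |-> SDL_RenderPresent()       // present the result and update the screen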

5.3 Format Conversion with sws_scale()

Newer ffplay versions have removed the sws_scale() conversion code, so this section can be skimmed. The code in this section is mostly taken from ffplay.c in FFmpeg 4.4.

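In older ffplay, if CONFIG_AVFILTER was enabled, video filters were enabled and both conversion paths existed (filter-based and sws_scale()-based). If CONFIG_AVFILTER was not enabled, only the sws_scale() path existed. For simplicity, this section considers only the case where CONFIG_AVFILTER is not enabled. In that case the conversion is done in upload_texture(), which runs in the video playback thread (the main thread) and calls sws_scale() internally to convert the pixel format.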

5.3.1 upload_texture()

The source of upload_texture() is:

static int upload_texture(SDL_Texture **tex, AVFrame *frame, struct SwsContext **img_convert_ctx) {
    int ret = 0;
    Uint32 sdl_pix_fmt;
    SDL_BlendMode sdl_blendmode;
    // Get the SDL pixel format that corresponds to the frame's (FFmpeg) pixel format
    get_sdl_pix_fmt_and_blendmode(frame->format, &sdl_pix_fmt, &sdl_blendmode);
    // tex is actually &is->vid_texture; (re)allocate that texture for the SDL pixel format just obtained
    if (realloc_texture(tex, sdl_pix_fmt == SDL_PIXELFORMAT_UNKNOWN ? SDL_PIXELFORMAT_ARGB8888 : sdl_pix_fmt, frame->width, frame->height, sdl_blendmode, 0) < 0)
        return -1;
    switch (sdl_pix_fmt) {
        // The frame format is not supported by SDL, so convert it to AV_PIX_FMT_BGRA, which corresponds to SDL_PIXELFORMAT_BGRA32
        case SDL_PIXELFORMAT_UNKNOWN:
            /* This should only happen if we are not using avfilter... */
            *img_convert_ctx = sws_getCachedContext(*img_convert_ctx,
                frame->width, frame->height, frame->format, frame->width, frame->height,
                AV_PIX_FMT_BGRA, sws_flags, NULL, NULL, NULL);
            if (*img_convert_ctx != NULL) {
                uint8_t *pixels[4];
                int pitch[4];
                if (!SDL_LockTexture(*tex, NULL, (void **)pixels, pitch)) {
                    sws_scale(*img_convert_ctx, (const uint8_t * const *)frame->data, frame->linesize,
                              0, frame->height, pixels, pitch);
                    SDL_UnlockTexture(*tex);
                }
            } else {
                av_log(NULL, AV_LOG_FATAL, "Cannot initialize the conversion context\n");
                ret = -1;
            }
            break;
        // The frame format maps to SDL_PIXELFORMAT_IYUV: no conversion needed, update the SDL texture with SDL_UpdateYUVTexture()
        case SDL_PIXELFORMAT_IYUV:
            if (frame->linesize[0] > 0 && frame->linesize[1] > 0 && frame->linesize[2] > 0) {
                ret = SDL_UpdateYUVTexture(*tex, NULL, frame->data[0], frame->linesize[0],
                                                       frame->data[1], frame->linesize[1],
                                                       frame->data[2], frame->linesize[2]);
            } else if (frame->linesize[0] < 0 && frame->linesize[1] < 0 && frame->linesize[2] < 0) {
                ret = SDL_UpdateYUVTexture(*tex, NULL, frame->data[0] + frame->linesize[0] * (frame->height                    - 1), -frame->linesize[0],
                                                       frame->data[1] + frame->linesize[1] * (AV_CEIL_RSHIFT(frame->height, 1) - 1), -frame->linesize[1],
                                                       frame->data[2] + frame->linesize[2] * (AV_CEIL_RSHIFT(frame->height, 1) - 1), -frame->linesize[2]);
            } else {
                av_log(NULL, AV_LOG_ERROR, "Mixed negative and positive linesizes are not supported.\n");
                return -1;
            }
            break;
        // The frame format maps to another SDL pixel format: no conversion needed, update the SDL texture with SDL_UpdateTexture()
        default:
            if (frame->linesize[0] < 0) {
                ret = SDL_UpdateTexture(*tex, NULL, frame->data[0] + frame->linesize[0] * (frame->height - 1), -frame->linesize[0]);
            } else {
                ret = SDL_UpdateTexture(*tex, NULL, frame->data[0], frame->linesize[0]);
            }
            break;
    }
    return ret;
}

The pixel format in a frame is one defined by FFmpeg. Many pixel formats defined by FFmpeg and by SDL are in fact the same format under different names.

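Depending on how the frame's pixel format matches the formats SDL supports, upload_texture() handles three cases, which correspond to the three branches of the switch statement: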

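1) If the frame's format maps to SDL_PIXELFORMAT_IYUV, no conversion is done; SDL_UpdateYUVTexture() copies the image data into is->vid_texture.
2) If the frame's format maps to another SDL-supported format (such as SDL_PIXELFORMAT_RGB24 or SDL_PIXELFORMAT_YUY2), no conversion is done either; SDL_UpdateTexture() copies the image data into is->vid_texture.
3) If the frame's format is not supported by SDL (it maps to SDL_PIXELFORMAT_UNKNOWN), the image format must be converted.

Cases 1) and 2) involve no conversion. The rest of this section looks at case 3).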
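The two no-conversion branches also handle frames with a negative linesize, where data[] points at the first displayed row and each following row sits at a lower address. The sketch below isolates the pointer arithmetic used in those branches: it returns the lowest-address row together with a positive pitch, which is the form the SDL update functions receive. plane_view and normalize_plane() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of one plane: the row at the lowest address and a positive pitch. */
struct plane_view {
    const uint8_t *base;
    int pitch;
};

/* With a negative linesize, data points at the first displayed row, which is
 * the row at the highest address. Stepping (rows - 1) times by linesize
 * reaches the row at the lowest address; the pitch is then -linesize. */
static struct plane_view normalize_plane(const uint8_t *data, int linesize, int rows)
{
    struct plane_view v;

    assert(data && rows > 0);
    if (linesize < 0) {
        v.base = data + linesize * (rows - 1);
        v.pitch = -linesize;
    } else {
        v.base = data;
        v.pitch = linesize;
    }
    return v;
}
```

Read this way, the rows arrive in bottom-up order. ffplay compensates elsewhere by flipping the texture vertically when rendering, which is not shown in the listing above. For the chroma planes of YUV420P the row count is AV_CEIL_RSHIFT(frame->height, 1) rather than the full height.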

5.3.2 get_sdl_pix_fmt_and_blendmode()

get_sdl_pix_fmt_and_blendmode() returns the corresponding SDL pixel format; see section 5.1.2 for its implementation.

5.3.3 realloc_texture()

realloc_texture() allocates the texture. The implementation is:

static int realloc_texture(SDL_Texture **texture, Uint32 new_format, int new_width, int new_height, SDL_BlendMode blendmode, int init_texture)
{
    Uint32 format;
    int access, w, h;
    if (!*texture || SDL_QueryTexture(*texture, &format, &access, &w, &h) < 0 || new_width != w || new_height != h || new_format != format) {
        void *pixels;
        int pitch;
        if (*texture)
            SDL_DestroyTexture(*texture);
        if (!(*texture = SDL_CreateTexture(renderer, new_format, SDL_TEXTUREACCESS_STREAMING, new_width, new_height)))
            return -1;
        if (SDL_SetTextureBlendMode(*texture, blendmode) < 0)
            return -1;
        if (init_texture) {
            if (SDL_LockTexture(*texture, NULL, &pixels, &pitch) < 0)
                return -1;
            memset(pixels, 0, pitch * new_height);
            SDL_UnlockTexture(*texture);
        }
        av_log(NULL, AV_LOG_VERBOSE, "Created %dx%d texture with %s.\n", new_width, new_height, SDL_GetPixelFormatName(new_format));
    }
    return 0;
}

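If is->vid_texture has not been created yet, or the image width, height or pixel format differs from that of the existing is->vid_texture, the texture is recreated: SDL_DestroyTexture() destroys the old one and SDL_CreateTexture() creates the new one.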

5.3.4 sws_getCachedContext()

sws_getCachedContext() reuses an existing SwsContext or allocates a new one. Its prototype is:

/**
 * Check if context can be reused, otherwise reallocate a new one.
 *
 * If context is NULL, just calls sws_getContext() to get a new
 * context. Otherwise, checks if the parameters are the ones already
 * saved in context. If that is the case, returns the current
 * context. Otherwise, frees context and gets a new context with
 * the new parameters.
 *
 * Be warned that srcFilter and dstFilter are not checked, they
 * are assumed to remain the same.
 */
SwsContext *sws_getCachedContext(SwsContext *context, int srcW, int srcH,
                                 enum AVPixelFormat srcFormat, int dstW, int dstH,
                                 enum AVPixelFormat dstFormat, int flags,
                                 SwsFilter *srcFilter, SwsFilter *dstFilter,
                                 const double *param);

sws_getCachedContext() is called as follows:

*img_convert_ctx = sws_getCachedContext(*img_convert_ctx,
    frame->width, frame->height, frame->format, frame->width, frame->height,
    AV_PIX_FMT_BGRA, sws_flags, NULL, NULL, NULL);

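The first argument, *img_convert_ctx, corresponds to the parameter SwsContext *context. If context is NULL, sws_getContext() is called to create a new context. If context is not NULL, the other arguments are compared with the parameters stored in context: if they differ, context is freed and a new one is allocated with the new parameters; if they match, the existing context is used as is.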
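The reuse-or-reallocate rule described above can be modeled in plain C. The sketch below is not libswscale code: conv_ctx and get_cached_ctx() are hypothetical stand-ins that only remember the geometry and formats, and the flags and param arguments of the real function are left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for SwsContext: remembers the parameters it was built for. */
struct conv_ctx {
    int src_w, src_h, src_fmt;
    int dst_w, dst_h, dst_fmt;
};

/* Returns ctx unchanged when it already matches the requested parameters.
 * Otherwise frees it (a NULL ctx is fine) and allocates a fresh one.
 * Returns NULL if the allocation fails. */
static struct conv_ctx *get_cached_ctx(struct conv_ctx *ctx,
                                       int src_w, int src_h, int src_fmt,
                                       int dst_w, int dst_h, int dst_fmt)
{
    assert(src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);
    if (ctx &&
        ctx->src_w == src_w && ctx->src_h == src_h && ctx->src_fmt == src_fmt &&
        ctx->dst_w == dst_w && ctx->dst_h == dst_h && ctx->dst_fmt == dst_fmt)
        return ctx;                 /* parameters unchanged: reuse */

    free(ctx);                      /* free(NULL) is a no-op */
    ctx = malloc(sizeof(*ctx));
    if (!ctx)
        return NULL;
    ctx->src_w = src_w; ctx->src_h = src_h; ctx->src_fmt = src_fmt;
    ctx->dst_w = dst_w; ctx->dst_h = dst_h; ctx->dst_fmt = dst_fmt;
    return ctx;
}
```

As in upload_texture(), the caller always stores the return value back into the same variable, so a steady stream of same-sized frames pays the setup cost only once.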

5.3.5 sws_scale()

sws_scale() performs the image format conversion. Its prototype is:

/**
 * Scale the image slice in srcSlice and put the resulting scaled
 * slice in the image in dst. A slice is a sequence of consecutive
 * rows in an image.
 *
 * Slices have to be provided in sequential order, either in
 * top-bottom or bottom-top order. If slices are provided in
 * non-sequential order the behavior of the function is undefined.
 *
 * @param c         the scaling context previously created with
 *                  sws_getContext()
 * @param srcSlice  the array containing the pointers to the planes of
 *                  the source slice
 * @param srcStride the array containing the strides for each plane of
 *                  the source image
 * @param srcSliceY the position in the source image of the slice to
 *                  process, that is the number (counted starting from
 *                  zero) in the image of the first row of the slice
 * @param srcSliceH the height of the source slice, that is the number
 *                  of rows in the slice
 * @param dst       the array containing the pointers to the planes of
 *                  the destination image
 * @param dstStride the array containing the strides for each plane of
 *                  the destination image
 * @return          the height of the output slice
 */
int sws_scale(struct SwsContext *c, const uint8_t *const srcSlice[],
              const int srcStride[], int srcSliceY, int srcSliceH,
              uint8_t *const dst[], const int dstStride[]);

sws_scale() is called as follows:

if (*img_convert_ctx != NULL) {
    uint8_t *pixels[4];
    int pitch[4];
    if (!SDL_LockTexture(*tex, NULL, (void **)pixels, pitch)) {
        sws_scale(*img_convert_ctx, (const uint8_t * const *)frame->data, frame->linesize,
                  0, frame->height, pixels, pitch);
        SDL_UnlockTexture(*tex);
    }
}

The code above has three steps:

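1) SDL_LockTexture() locks a rect of the texture (NULL as the second argument locks the whole texture). The locked area is write-only and is used to update the image data. SDL_LockTexture() returns a single pointer and a single pitch, which land in pixels[0] and pitch[0]. The arrays have 4 entries only because sws_scale() takes per-plane arrays; the packed BGRA destination uses just plane 0.
2) sws_scale() converts the image and writes the result into the area that pixels points to.
3) SDL_UnlockTexture() unlocks the area and applies the changes to the texture.

After these three steps the texture holds the converted image data and is ready for display.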
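What "writing into the locked area" involves can be shown without SDL or libswscale. The sketch below is a hand-written stand-in for one specific conversion, packed RGB24 to packed BGRA without scaling; rgb24_to_bgra() is a hypothetical function, not what sws_scale() does internally. The point is that source and destination rows are addressed through their own stride and pitch, which may be larger than the visible row.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converts packed RGB24 to packed BGRA (alpha forced to 255), row by row.
 * src_stride and dst_pitch are in bytes and may exceed the visible row
 * size; padding bytes in the destination are left untouched. */
static void rgb24_to_bgra(const uint8_t *src, int src_stride,
                          uint8_t *dst, int dst_pitch,
                          int width, int height)
{
    int x, y;

    assert(src && dst && width >= 0 && height >= 0);
    for (y = 0; y < height; y++) {
        const uint8_t *s = src + (ptrdiff_t)y * src_stride;
        uint8_t *d = dst + (ptrdiff_t)y * dst_pitch;
        for (x = 0; x < width; x++) {
            d[4 * x + 0] = s[3 * x + 2];    /* B */
            d[4 * x + 1] = s[3 * x + 1];    /* G */
            d[4 * x + 2] = s[3 * x + 0];    /* R */
            d[4 * x + 3] = 255;             /* A */
        }
    }
}
```

In the real code path, dst and dst_pitch would be the pixels[0] and pitch[0] values filled in by SDL_LockTexture(), and the write would be followed by SDL_UnlockTexture().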

5.3.6 Overview of Video Decoding and Playback

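With sws_scale()-based conversion, the conversion happens in the playback thread. The video decoding thread does: decode, write to queue. The video playback thread does: read from queue, audio/video synchronization, format conversion, SDL display.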