c++ – How do I get the pixel format of an NVDEC video frame decoded by FFmpeg, and how do I use it in CUDA structures?

I am using code almost identical to github.com/FFmpeg/FFmpeg/blob/master/doc/examples/hw_decode.c, except that I do not transfer the decoded data to the host. I keep it on the device for later use with CUDA code (my goal is to work with RGB data at 8 to 12 bits per channel).

The interesting part is:

//after decoding/encoding
status = avcodec_receive_frame(avctx, frame);
if (status != AVERROR(EAGAIN) && status != AVERROR_EOF)
    LogMessage(L"frame timestamp: %i\n", (int)frame->best_effort_timestamp);
if (status == AVERROR(EAGAIN) || status == AVERROR_EOF)
{
    av_frame_free(&frame);
    return 0;
}
if (status < 0)
{
    LogMessage(L"Error while decoding\n");
    goto finish;
}
{
    auto descr = AVPixelFormatMap[(AVPixelFormat)frame->format];
    std::wstring dsdescr(descr.begin(), descr.end());
    LogMessage(L"Pixelformat: %s", dsdescr.c_str());

    cudaPitchedPtr CUdeviceptr0{}; // !! hypothetical usage of cudaPitchedPtr !!
    CUdeviceptr0.ptr = frame->data[0];
    CUdeviceptr0.pitch = frame->linesize[0]; //2048 == pitch ?
    CUdeviceptr0.ysize = frame->height; //1080
    CUdeviceptr0.xsize = frame->width; //1920
    cudaPitchedPtr CUdeviceptr1{};
    CUdeviceptr1.ptr = frame->data[1];
    CUdeviceptr1.pitch = frame->linesize[1];
    CUdeviceptr1.ysize = frame->height;
    CUdeviceptr1.xsize = frame->width;
}

I could not find a way to get the actual data format: (AVPixelFormat)frame->format maps to the description “HW acceleration through CUDA. data[i] contain CUdeviceptr pointers exactly as for system memory frames.”, which does not tell me the pixel layout.

I understand that I have two arrays: frame->data[0] and frame->data[1].

  1. What does each of them contain?

I would like to translate the data into a CUDA-native structure such as:

cudaPitchedPtr CUdeviceptr{};
CUdeviceptr.ptr = frame->data[0];
CUdeviceptr.ysize = frame->height;
CUdeviceptr.xsize = frame->width;

This is probably wrong, since width == 1920 and height == 1080 describe only a single plane, while the frame should contain color (RGB) or YUV data.
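To make the pitch question concrete, here is a minimal sketch of how a pitched plane is addressed. It assumes one plane with a row stride of 2048 bytes, as observed in frame->linesize[0]. PitchedPlane is a hypothetical stand-in whose fields mirror cudaPitchedPtr, so the snippet compiles without the CUDA headers; describePlane, sampleOffset and planeBytes are illustrative helpers, not FFmpeg or CUDA functions.

```cpp
#include <cstddef>

// Hypothetical stand-in for cudaPitchedPtr (same field names), so that the
// sketch builds without the CUDA toolkit.
struct PitchedPlane {
    void*       ptr;    // base address of the plane (a device pointer in practice)
    std::size_t pitch;  // bytes from the start of one row to the start of the next
    std::size_t xsize;  // logical row width in bytes (without padding)
    std::size_t ysize;  // number of rows
};

// Describe one plane: xsize is expressed in bytes, so it depends on the
// sample size, while pitch is taken as reported (frame->linesize[i]).
constexpr PitchedPlane describePlane(void* base, std::size_t pitch,
                                     std::size_t widthSamples, std::size_t rows,
                                     std::size_t bytesPerSample = 1) {
    return PitchedPlane{base, pitch, widthSamples * bytesPerSample, rows};
}

// Byte offset of the sample at column x, row y inside a pitched plane.
constexpr std::size_t sampleOffset(std::size_t pitch, std::size_t x, std::size_t y,
                                   std::size_t bytesPerSample = 1) {
    return y * pitch + x * bytesPerSample;
}

// Bytes spanned by the whole plane, row padding included.
constexpr std::size_t planeBytes(std::size_t pitch, std::size_t rows) {
    return pitch * rows;
}
```

With the values from the question, describePlane(frame->data[0], 2048, 1920, 1080) would describe the first plane, and the padding of 128 bytes per row is why linesize[0] is larger than the width.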

  2. How can I find out the data format (RGB, BGR, YUV, 8/10/12 bits per channel, etc.) and the size of the array?
  3. If it is not RGB8 (or 10 to 12 bits), can I obtain that through FFmpeg, perhaps with an NVDEC-based function?
  4. What is the correct CUDA-native structure? cudaPitchedPtr (I think so, since frame->linesize[0] is larger than frame->width and is a power of 2), or something else? If so, how do I access the individual channels?
  5. frame->data[0] is declared as uint8_t*, but in the general case the array probably holds something else; what is it (this overlaps with, or extends, “what is the data format”)?
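As background for these questions: FFmpeg records the underlying layout of a hardware frame in the sw_format field of the AVHWFramesContext that frame->hw_frames_ctx->data points to, and for 8-bit content decoded on NVIDIA hardware that layout is commonly NV12 (with P010/P016 used for higher bit depths). The sketch below assumes NV12, which is an assumption and not something the code above confirms. Under that assumption data[0] is a full-resolution Y plane and data[1] is a half-height plane of interleaved U,V byte pairs. The helper names are hypothetical.

```cpp
#include <cstddef>

// Size of one plane: bytes of payload per row, and number of rows.
struct PlaneShape {
    std::size_t widthBytes;
    std::size_t rows;
};

// NV12 luma plane (data[0] under the NV12 assumption): one byte per pixel.
constexpr PlaneShape nv12LumaShape(std::size_t width, std::size_t height) {
    return PlaneShape{width, height};
}

// NV12 chroma plane (data[1]): one U,V byte pair per 2x2 block of pixels,
// so each row holds ceil(width/2) pairs and there are ceil(height/2) rows.
constexpr PlaneShape nv12ChromaShape(std::size_t width, std::size_t height) {
    return PlaneShape{((width + 1) / 2) * 2, (height + 1) / 2};
}

// Byte offset, inside the chroma plane, of the U sample shared by pixel (x, y).
// chromaPitch is the row stride of that plane (frame->linesize[1]).
constexpr std::size_t nv12UOffset(std::size_t chromaPitch, std::size_t x, std::size_t y) {
    return (y / 2) * chromaPitch + (x / 2) * 2;
}

// In NV12 the V sample is stored directly after its U sample.
constexpr std::size_t nv12VOffset(std::size_t chromaPitch, std::size_t x, std::size_t y) {
    return nv12UOffset(chromaPitch, x, y) + 1;
}
```

For a 1920x1080 frame this gives a chroma plane of 1920 bytes by 540 rows, so setting ysize of the second cudaPitchedPtr to frame->height would overstate it by a factor of two if the format really is NV12.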

Notes

  • frame->data[0] contains data corresponding to Y or to one of the RGB channels, and it is pitched. When I save it as a black-and-white image, I recognize the picture, except that it has only one channel.
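If the recognizable single-channel image is indeed the luma plane of a YUV frame, getting RGB means combining each Y sample with its U and V samples. The following is a minimal per-pixel sketch that assumes 8-bit samples, limited ("video") range and the BT.709 matrix; the actual matrix and range depend on the stream, so these are assumptions. Rgb8, clampToByte and yuvToRgbBt709Limited are hypothetical names. The same arithmetic could be placed in a CUDA kernel that reads the two planes directly on the device.

```cpp
#include <cstdint>

struct Rgb8 {
    std::uint8_t r, g, b;
};

// Round to nearest and clamp to the 0..255 range of an 8-bit channel.
constexpr std::uint8_t clampToByte(double v) {
    return v <= 0.0 ? 0 : (v >= 255.0 ? 255 : static_cast<std::uint8_t>(v + 0.5));
}

// Convert one 8-bit YUV sample triple to RGB.
// Assumes limited range (Y in 16..235, U/V centered on 128) and BT.709
// coefficients; other streams may need BT.601 or full-range scaling.
constexpr Rgb8 yuvToRgbBt709Limited(std::uint8_t y, std::uint8_t u, std::uint8_t v) {
    return Rgb8{
        clampToByte(1.1644 * (y - 16) + 1.7927 * (v - 128)),
        clampToByte(1.1644 * (y - 16) - 0.2132 * (u - 128) - 0.5329 * (v - 128)),
        clampToByte(1.1644 * (y - 16) + 2.1124 * (u - 128))};
}
```

With neutral chroma (U == V == 128) all three channels come out equal, which matches what is seen when the first plane alone is saved as a grayscale image.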
