ffmpeg – MP4 file starts from a non-key frame
I have used the following ffprobe command to analyse a .mp4 file.
ffprobe -i <input> -show_frames -select_streams v:0 -print_format flat &> save_to_file.text
It produces the following output.
ffprobe version 5.1.3 Copyright (c) 2007-2022 the FFmpeg developers
  built with gcc 13 (GCC)
  configuration: --prefix=/home/thanuja/ffmpeg_build --pkg-config-flags=--static --extra-cflags=-I/home/thanuja/ffmpeg_build/include --extra-ldflags=-L/home/thanuja/ffmpeg_build/lib --extra-libs=-lpthread --extra-libs=-lm --bindir=/home/thanuja/bin --enable-gpl --enable-libfdk_aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvpx --enable-libx264 --enable-libx265 --enable-nonfree --enable-openssl --enable-demuxer=spdif --enable-decoder=dolby_e --enable-decoder=ac3 --enable-decoder=eac3 --enable-indev=alsa --enable-outdev=alsa --enable-shared
  libavutil      57. 28.100 / 57. 28.100
  libavcodec     59. 37.100 / 59. 37.100
  libavformat    59. 27.100 / 59. 27.100
  libavdevice    59.  7.100 / 59.  7.100
  libavfilter     8. 44.100 /  8. 44.100
  libswscale      6.  7.100 /  6.  7.100
  libswresample   4.  7.100 /  4.  7.100
  libpostproc    56.  6.100 / 56.  6.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'FTV267StoS.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 1
    compatible_brands: isommp41mp42
    creation_time   : 2023-04-02T23:52:12.000000Z
  Duration: 00:04:47.84, start: 0.000000, bitrate: 7374 kb/s
  Stream #0:0[0x1](und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709/unknown/bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 7198 kb/s, 59.94 fps, 59.94 tbr, 90k tbn (default)
    Metadata:
      creation_time   : 2023-04-02T23:52:12.000000Z
      handler_name    : Core Media Video
      vendor_id       : [0][0][0][0]
  Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 125 kb/s (default)
    Metadata:
      creation_time   : 2023-04-02T23:52:12.000000Z
      handler_name    : Core Media Audio
      vendor_id       : [0][0][0][0]
frames.frame.0.media_type="video"
frames.frame.0.stream_index=0
frames.frame.0.key_frame=0
frames.frame.0.pts=34536
frames.frame.0.pts_time="0.383733"
frames.frame.0.pkt_dts=34536
frames.frame.0.pkt_dts_time="0.383733"
frames.frame.0.best_effort_timestamp=34536
frames.frame.0.best_effort_timestamp_time="0.383733"
frames.frame.0.pkt_duration=1501
frames.frame.0.pkt_duration_time="0.016678"
frames.frame.0.pkt_pos="1834827"
frames.frame.0.pkt_size="14917"
frames.frame.0.width=1920
frames.frame.0.height=1080
frames.frame.0.pix_fmt="yuv420p"
frames.frame.0.sample_aspect_ratio="1:1"
frames.frame.0.pict_type="P"
frames.frame.0.coded_picture_number=120
frames.frame.0.display_picture_number=0
frames.frame.0.interlaced_frame=0
frames.frame.0.top_field_first=0
frames.frame.0.repeat_pict=0
frames.frame.0.color_range="tv"
frames.frame.0.color_space="bt709"
frames.frame.0.color_primaries="unknown"
frames.frame.0.color_transfer="bt709"
frames.frame.0.chroma_location="left"
frames.frame.0.tags.timecode="20:18:26:50"
frames.frame.0.side_data_list.side_data.0.side_data_type="H.26[45] User Data Unregistered SEI message"
frames.frame.0.side_data_list.side_data.1.side_data_type="H.26[45] User Data Unregistered SEI message"
frames.frame.0.side_data_list.side_data.2.side_data_type="SMPTE 12-1 timecode"
frames.frame.0.side_data_list.side_data.2.timecodes.timecode.0.value="20:18:26:50"
frames.frame.1.media_type="video"
frames.frame.1.stream_index=0
frames.frame.1.key_frame=0
frames.frame.1.pts=36036
frames.frame.1.pts_time="0.400400"
frames.frame.1.pkt_dts=36036
frames.frame.1.pkt_dts_time="0.400400"
frames.frame.1.best_effort_timestamp=36036
frames.frame.1.best_effort_timestamp_time="0.400400"
frames.frame.1.pkt_duration=1501
frames.frame.1.pkt_duration_time="0.016678"
frames.frame.1.pkt_pos="1857434"
frames.frame.1.pkt_size="14472"
frames.frame.1.width=1920
frames.frame.1.height=1080
frames.frame.1.pix_fmt="yuv420p"
frames.frame.1.sample_aspect_ratio="1:1"
frames.frame.1.pict_type="P"
As seen in this output, the first frame reported (frame 0, with key_frame=0 and pict_type="P") appears to be a P frame. I'm curious how the decoder can decode this frame without a preceding I frame (key frame). As I understand it, a P frame only carries residual data relative to a reference frame, so on its own it should not contain enough information to reconstruct the image.
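To check where the first key frame actually sits in a listing like the one above, the flat output can be filtered for the first key_frame=1 entry. The following is only a minimal sketch: the function name first_key_frame is a hypothetical helper I made up, and it assumes the frames.frame.N.key_frame=V line format shown in the output above.

```shell
# Hypothetical helper (not part of ffmpeg): reads ffprobe "flat" output on
# stdin and prints the index N of the first frames.frame.N.key_frame=1 line.
# Prints nothing if no key frame is present in the input.
first_key_frame() {
  # Split each line on "." and "=" so that
  #   frames.frame.12.key_frame=1
  # becomes: $1=frames $2=frame $3=12 $4=key_frame $5=1
  awk -F'[.=]' '
    $1 == "frames" && $2 == "frame" && $4 == "key_frame" && $5 == "1" {
      print $3   # frame index as numbered by ffprobe
      exit       # stop at the first match
    }'
}
```

It could be used on the saved file, e.g. first_key_frame < save_to_file.text, or directly on a pipe from the ffprobe command. If it prints an index greater than 0, the frames before that index were reported ahead of any key frame.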