FFmpeg and SDL Tutorial - Synching Audio


buntalk.com

분석톡톡 2023. 3. 19. 18:17

2019/04/08

Tutorial 06: Synching Audio

http://dranger.com/ffmpeg/tutorial06.html

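Last time we glossed over one part of synchronization: syncing the audio to the video clock rather than the other way around. We do it the same way as for video: make an internal video clock that tracks how far along the video thread is, and sync the audio to it. Afterwards we look at how to generalize this so that both audio and video sync to an external clock.

Implementing the video clock

We want to implement a video clock like the audio clock from last time: an internal value that gives the current time offset of the video being played. At first you might think this is as simple as updating a timer with the PTS of the last frame shown. However, the time between video frames can be quite long when measured in milliseconds. The solution is to keep one more value: the time at which we set the video clock to the PTS of the last frame. The current value of the video clock is then PTS_of_last_frame + (current_time - time_elapsed_since_PTS_value_was_set), that is, the last PTS plus the time that has passed since it was recorded. This is very similar to what we did in get_audio_clock.

So, in our big struct, we add a double video_current_pts and an int64_t video_current_pts_time. The clock is updated in video_refresh_timer: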

void video_refresh_timer(void *userdata)
{
  /* .. */
  if (is->video_st)
  {
    if (is->pictq_size == 0)
    {
      schedule_refresh(is, 1);
    }
    else
    {
      vp = &is->pictq[is->pictq_rindex];
      is->video_current_pts = vp->pts;
      is->video_current_pts_time = av_gettime();

Don't forget to initialize it in stream_component_open:

is->video_current_pts_time = av_gettime();

Now all we need is a way to get at the information:

double get_video_clock(VideoState *is)
{
  double delta;
  delta = (av_gettime() - is->video_current_pts_time) / 1000000.0;
  return is->video_current_pts + delta;
}

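Abstracting the clock

But why force ourselves to use the video clock? We would have to go and alter our video sync code so that the audio and video don't try to sync to each other. Imagine the mess if we tried to make that a command-line option, as it is in ffplay. So let's abstract things: we make a new wrapper function, get_master_clock, which checks an av_sync_type variable and then calls get_audio_clock, get_video_clock, or whatever other clock we want to use. We could even use the computer's clock, which we'll call get_external_clock.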

enum
{
  AV_SYNC_AUDIO_MASTER,
  AV_SYNC_VIDEO_MASTER,
  AV_SYNC_EXTERNAL_MASTER
};

#define DEFAULT_AV_SYNC_TYPE AV_SYNC_VIDEO_MASTER

double get_master_clock(VideoState *is)
{
  if (is->av_sync_type == AV_SYNC_VIDEO_MASTER)
  {
    return get_video_clock(is);
  }
  else if (is->av_sync_type == AV_SYNC_AUDIO_MASTER)
  {
    return get_audio_clock(is);
  }
  else
  {
    return get_external_clock(is);
  }
}
main()
{
  ...
  is->av_sync_type = DEFAULT_AV_SYNC_TYPE;
  ...
}
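As a rough illustration of the command-line idea, the sketch below maps an option string to a sync type and then picks the matching clock. The helper names (parse_sync_type, pick_master_clock), the Clocks struct, and the option values "audio", "video" and "ext" are assumptions made for this example; they are modeled loosely on ffplay's sync option and are not taken from the tutorial source.

```c
#include <assert.h>
#include <string.h>

enum {
  AV_SYNC_AUDIO_MASTER,
  AV_SYNC_VIDEO_MASTER,
  AV_SYNC_EXTERNAL_MASTER
};

#define DEFAULT_AV_SYNC_TYPE AV_SYNC_VIDEO_MASTER

/* Hypothetical snapshot of the three clock values, in seconds. */
typedef struct Clocks {
  double audio, video, external;
} Clocks;

/* Hypothetical helper: map an option value to a sync type.
 * Unknown or missing values fall back to the default. */
static int parse_sync_type(const char *arg)
{
  if (arg == NULL)
    return DEFAULT_AV_SYNC_TYPE;
  if (strcmp(arg, "audio") == 0)
    return AV_SYNC_AUDIO_MASTER;
  if (strcmp(arg, "video") == 0)
    return AV_SYNC_VIDEO_MASTER;
  if (strcmp(arg, "ext") == 0)
    return AV_SYNC_EXTERNAL_MASTER;
  return DEFAULT_AV_SYNC_TYPE;
}

/* Same dispatch as get_master_clock, but on plain values. */
static double pick_master_clock(const Clocks *c, int sync_type)
{
  assert(c != NULL);
  switch (sync_type) {
  case AV_SYNC_VIDEO_MASTER:
    return c->video;
  case AV_SYNC_AUDIO_MASTER:
    return c->audio;
  default:
    return c->external;
  }
}
```

Because every caller goes through one dispatch function, switching the master clock becomes a matter of changing a single integer rather than editing the sync code.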

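Synchronizing the audio

Now the hard part: syncing the audio to the video clock. Our strategy is to measure where the audio is, compare it to the video clock, and then figure out how many samples we need to adjust by. In other words, do we need to speed up by dropping samples, or slow down by adding them?

We run a synchronize_audio function each time we process a set of audio samples, so that the set can be shrunk or expanded as needed. However, we don't want to correct every single time the audio is off, because audio is processed far more often than video packets. So we set a minimum number of consecutive calls to synchronize_audio that must be out of sync before we bother doing anything. As before, "out of sync" means that the audio clock and the video clock differ by more than our sync threshold.

Note: what is going on here? This equation looks like magic. It is basically a weighted mean that uses a geometric series as its weights. I don't know whether it has a name, but an explanation is included below for more information. Say we have seen N sets of audio samples that were out of sync. The amount by which we are out of sync can vary a good deal, so we average how far off each of them was. For example, the first call might show we were off by 40 ms, the next by 50 ms, and so on. A simple average won't do, though, because the most recent values matter more than older ones. So we use a fractional coefficient, c, and sum the differences like this: diff_sum = new_diff + diff_sum*c. When we are ready to find the average difference, we simply calculate avg_diff = diff_sum*(1-c).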

weightedmean

***

The function we are using is a weighted mean using a geometric series as its weights. A weighted mean is defined like this:

$$\frac{w_0 x_0 + w_1 x_1 + \cdots + w_n x_n}{w_0 + w_1 + \cdots + w_n}$$

If you substitute 1 in for each `w_n`, you get your normal everyday arithmetic mean (a.k.a. an _average_).

Our function is basically a repetition of:

total = d_x + c*total;

However, you can also look at it like this:

$$\text{total} = c^n d_0 + c^{n-1} d_1 + \cdots + c\,d_{n-1} + d_n$$

in which case, this is just the top part of a weighted mean with `c^n, c^(n-1)...` as the weights. That means the bottom half is `c^n+c^(n-1)...`, which, as you may have guessed, is a simple geometric sum which works out to `1/(1-c)` as n approaches infinity.

So, by approximation, the weighted mean of our sequence of diffs is simply:

$$\frac{\text{total}}{\;\dfrac{1}{1-c}\;} = \text{total} \cdot (1-c)$$

So when we get the final total and want to know the average, we just multiply it by 1-c and get the answer! There is probably a name for this way of taking the mean of a sequence, but I'm pretty ignorant and I don't know it. If you know it, please email me.

***

So far our function looks like this:

/* Add or subtract samples to get a better sync, return new audio buffer size */
int synchronize_audio(VideoState *is, short *samples, int samples_size, double pts)
{
  int n;
  double ref_clock;
  n = 2 * is->audio_st->codec->channels;
  if (is->av_sync_type != AV_SYNC_AUDIO_MASTER)
  {
    double diff, avg_diff;
    int wanted_size, min_size, max_size, nb_samples;
    ref_clock = get_master_clock(is);
    diff = get_audio_clock(is) - ref_clock;
    if (fabs(diff) < AV_NOSYNC_THRESHOLD)
    {
      // accumulate the diffs
      is->audio_diff_cum = diff + is->audio_diff_avg_coef * is->audio_diff_cum;
      if (is->audio_diff_avg_count < AUDIO_DIFF_AVG_NB)
      {
        is->audio_diff_avg_count++;
      }
      else
      {
        avg_diff = is->audio_diff_cum * (1.0 - is->audio_diff_avg_coef);
        /* Shrinking/expanding buffer code */
      }
    }
    else
    {
      /* difference is TOO big; reset diff stuff */
      is->audio_diff_avg_count = 0;
      is->audio_diff_cum = 0;
    }
  }
  return samples_size;
}

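We now know approximately how far the audio is off from the video. Let's calculate how many samples we need to add or remove by putting the following code where the placeholder comment for the shrinking/expanding buffer code sits in the function above: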

if (fabs(avg_diff) >= is->audio_diff_threshold)
{
  wanted_size = samples_size + ((int)(diff * is->audio_st->codec->sample_rate) * n);
  /* multiply before dividing: (100 - 10) / 100 on its own truncates to 0 */
  min_size = samples_size * (100 - SAMPLE_CORRECTION_PERCENT_MAX) / 100;
  max_size = samples_size * (100 + SAMPLE_CORRECTION_PERCENT_MAX) / 100;
  if (wanted_size < min_size)
  {
    wanted_size = min_size;
  }
  else if (wanted_size > max_size)
  {
    wanted_size = max_size;
  }

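Remember that audio_length * (sample_rate * # of channels * 2) is the size, in bytes, of audio_length seconds of 16-bit audio (the tutorial loosely calls these "samples"). The size we want is therefore the size we already have, plus or minus the amount that corresponds to the time by which the audio has drifted. We also limit how big or small the correction can be, because if we change the buffer too much, it will be too jarring for the listener.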
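The size calculation and its clamp can be pulled out into a standalone function for experimenting. This is a sketch under the assumption of 16-bit samples; compute_wanted_size is a hypothetical name and the function is not part of the tutorial source.

```c
#include <assert.h>

#define SAMPLE_CORRECTION_PERCENT_MAX 10

/* Hypothetical helper: target buffer size in bytes for 16-bit audio.
 * diff is audio clock minus master clock, in seconds. */
static int compute_wanted_size(int samples_size, double diff,
                               int sample_rate, int channels)
{
  assert(samples_size >= 0 && sample_rate > 0 && channels > 0);

  int n = 2 * channels;  /* bytes per sample frame */
  int wanted_size = samples_size + (int)(diff * sample_rate) * n;

  /* Multiply before dividing; (100 - 10) / 100 alone truncates to 0. */
  int min_size = samples_size * (100 - SAMPLE_CORRECTION_PERCENT_MAX) / 100;
  int max_size = samples_size * (100 + SAMPLE_CORRECTION_PERCENT_MAX) / 100;

  if (wanted_size < min_size)
    wanted_size = min_size;
  else if (wanted_size > max_size)
    wanted_size = max_size;
  return wanted_size;
}
```

For a 4096-byte stereo buffer at 44100 Hz, a drift of 1 ms asks for 44 extra frames (176 bytes), while a drift of half a second is clamped to the 10% limit. Note that the clamped sizes are not necessarily a whole number of sample frames; rounding to a multiple of n would be a reasonable refinement.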

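Correcting the number of samples

Now we have to actually correct the audio. The synchronize_audio function returns a sample size, which tells us how many bytes to send to the stream, so we only need to adjust the sample size to wanted_size. That works for making the sample size smaller. If we want to make it bigger, however, we can't simply enlarge the sample size, because there is no more data in the buffer. We have to add some. But what should we add? Trying to extrapolate audio would be foolish, so we use the audio we already have and pad the buffer with the value of the last sample.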

if (wanted_size < samples_size)
{
  /* remove samples */
  samples_size = wanted_size;
}
else if (wanted_size > samples_size)
{
  uint8_t *samples_end, *q;
  int nb;
  /* add samples by copying the final sample */
  nb = (wanted_size - samples_size); /* bytes to add; must be positive for the loop to run */
  samples_end = (uint8_t *)samples + samples_size - n;
  q = samples_end + n;
  while (nb > 0)
  {
    memcpy(q, samples_end, n);
    q += n;
    nb -= n;
  }
  samples_size = wanted_size;
}
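The padding step can also be exercised on a small byte buffer. The sketch below is a variation rather than a copy of the tutorial code: pad_with_last_frame is a hypothetical helper that only copies whole frames and returns the resulting size, and it assumes the caller has reserved room for wanted_size bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: grow a buffer from samples_size towards wanted_size
 * bytes by repeating its last sample frame. n is the frame size in bytes.
 * The caller must guarantee the buffer has room for wanted_size bytes. */
static int pad_with_last_frame(uint8_t *buf, int samples_size,
                               int wanted_size, int n)
{
  assert(buf != NULL && n > 0);
  assert(samples_size >= n && wanted_size >= samples_size);

  uint8_t *samples_end = buf + samples_size - n;  /* start of last frame */
  uint8_t *q = samples_end + n;                   /* first free byte */
  int nb = wanted_size - samples_size;            /* bytes still to add */

  while (nb >= n) {
    memcpy(q, samples_end, n);
    q += n;
    nb -= n;
  }
  return (int)(q - buf);  /* bytes now valid, a whole number of frames */
}
```

Copying only while a full frame still fits keeps the write inside wanted_size, at the cost of returning slightly less than requested when the difference is not a multiple of n.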

Now we return the sample size, and we're done with that function. All that remains is to use it:

void audio_callback(void *userdata, Uint8 *stream, int len)
{
  VideoState *is = (VideoState *)userdata;
  int len1, audio_size;
  double pts;
  while (len > 0)
  {
    if (is->audio_buf_index >= is->audio_buf_size)
    {
      /* We have already sent all our data; get more */
      audio_size = audio_decode_frame(is, is->audio_buf, sizeof(is->audio_buf), &pts);
      if (audio_size < 0)
      {
        /* If error, output silence */
        is->audio_buf_size = 1024;
        memset(is->audio_buf, 0, is->audio_buf_size);
      }
      else
      {
        audio_size = synchronize_audio(is, (int16_t *)is->audio_buf, audio_size, pts);
        is->audio_buf_size = audio_size;

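All we did was insert the call to synchronize_audio. (Also, make sure to check the source code, where the variables above are initialized.)

One last thing before we finish: we need an if clause so that we don't sync the video when it is the master clock: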

if (is->av_sync_type != AV_SYNC_VIDEO_MASTER)
{
  ref_clock = get_master_clock(is);
  diff = vp->pts - ref_clock;
  /* Skip or repeat the frame. Take delay into account.
     FFPlay still doesn't "know if this is the best guess." */
  sync_threshold = (delay > AV_SYNC_THRESHOLD) ? delay : AV_SYNC_THRESHOLD;
  if (fabs(diff) < AV_NOSYNC_THRESHOLD)
  {
    if (diff <= -sync_threshold)
    {
      delay = 0;
    }
    else if (diff >= sync_threshold)
    {
      delay = 2 * delay;
    }
  }
}
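The delay adjustment is easier to reason about as a pure function of the nominal delay and the clock difference. This is a minimal sketch; adjust_video_delay is a hypothetical name, and the thresholds are the ones defined in the full listing.

```c
#include <assert.h>
#include <math.h>

#define AV_SYNC_THRESHOLD 0.01
#define AV_NOSYNC_THRESHOLD 10.0

/* Hypothetical helper: given the nominal delay to the next frame and
 * diff = frame PTS minus master clock (both in seconds), return the
 * delay to actually use. */
static double adjust_video_delay(double delay, double diff)
{
  assert(delay >= 0.0);
  double sync_threshold =
      (delay > AV_SYNC_THRESHOLD) ? delay : AV_SYNC_THRESHOLD;

  if (fabs(diff) < AV_NOSYNC_THRESHOLD) {
    if (diff <= -sync_threshold)
      delay = 0.0;          /* video is late: show the next frame at once */
    else if (diff >= sync_threshold)
      delay = 2.0 * delay;  /* video is early: hold this frame longer */
  }
  return delay;
}
```

With a 40 ms frame delay, a frame that is 100 ms behind the master clock is shown immediately, one that is 100 ms ahead waits 80 ms, and a difference of 20 s is considered hopeless and left uncorrected.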

And that does it. Check through the source file to make sure every variable is initialized.

gcc -o tutorial06 tutorial06.c -lavutil -lavformat -lavcodec -lswscale -lz -lm `sdl-config --cflags --libs`

Next time we'll make it possible to rewind and fast-forward the movie.

// tutorial06.c
// A pedagogical video player that really works!
//
// Code based on FFplay, Copyright (c) 2003 Fabrice Bellard,
// and a tutorial by Martin Bohme (boehme@inb.uni-luebeckREMOVETHIS.de)
// Tested on Gentoo, CVS version 5/01/07 compiled with GCC 4.1.1
// With updates from https://github.com/chelyaev/ffmpeg-tutorial
// Updates tested on:
// LAVC 54.59.100, LAVF 54.29.104, LSWS 2.1.101, SDL 1.2.15
// on GCC 4.7.2 in Debian February 2015
// Use
//
// gcc -o tutorial06 tutorial06.c -lavutil -lavformat -lavcodec -lswscale -lz -lm `sdl-config --cflags --libs`
// to build (assuming libavformat and libavcodec are correctly installed,
// and assuming you have sdl-config. Please refer to SDL docs for your installation.)
//
// Run using
// tutorial06 myvideofile.mpg
//
// to play the video stream on your screen.

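#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
#include <libavutil/avstring.h> /* av_strlcpy */
#include <libavutil/time.h>     /* av_gettime */

#include <SDL.h>
#include <SDL_thread.h>

#ifdef __MINGW32__
#undef main /* Prevents SDL from overriding main() */
#endif

#include <stdio.h>
#include <stdlib.h> /* exit */
#include <string.h> /* memcpy, memset */
#include <assert.h>
#include <math.h>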

// compatibility with newer API
#if LIBAVCODEC_VERSION_INT < AV_VERSION_INT(55, 28, 1)
#define av_frame_alloc avcodec_alloc_frame
#define av_frame_free avcodec_free_frame
#endif

#define SDL_AUDIO_BUFFER_SIZE 1024
#define MAX_AUDIO_FRAME_SIZE 192000

#define MAX_AUDIOQ_SIZE (5 * 16 * 1024)
#define MAX_VIDEOQ_SIZE (5 * 256 * 1024)

#define AV_SYNC_THRESHOLD 0.01
#define AV_NOSYNC_THRESHOLD 10.0

#define SAMPLE_CORRECTION_PERCENT_MAX 10
#define AUDIO_DIFF_AVG_NB 20

#define FF_REFRESH_EVENT (SDL_USEREVENT)
#define FF_QUIT_EVENT (SDL_USEREVENT + 1)

#define VIDEO_PICTURE_QUEUE_SIZE 1

#define DEFAULT_AV_SYNC_TYPE AV_SYNC_VIDEO_MASTER

typedef struct PacketQueue
{
  AVPacketList *first_pkt, *last_pkt;
  int nb_packets;
  int size;
  SDL_mutex *mutex;
  SDL_cond *cond;
} PacketQueue;

typedef struct VideoPicture
{
  SDL_Overlay *bmp;
  int width, height; /* source height & width */
  int allocated;
  double pts;
} VideoPicture;

typedef struct VideoState
{

  AVFormatContext *pFormatCtx;
  int videoStream, audioStream;

  int av_sync_type;
  double external_clock; /* external clock base */
  int64_t external_clock_time;

  double audio_clock;
  AVStream *audio_st;
  AVCodecContext *audio_ctx;
  PacketQueue audioq;
  uint8_t audio_buf[(MAX_AUDIO_FRAME_SIZE * 3) / 2];
  unsigned int audio_buf_size;
  unsigned int audio_buf_index;
  AVFrame audio_frame;
  AVPacket audio_pkt;
  uint8_t *audio_pkt_data;
  int audio_pkt_size;
  int audio_hw_buf_size;
  double audio_diff_cum; /* used for AV difference average computation */
  double audio_diff_avg_coef;
  double audio_diff_threshold;
  int audio_diff_avg_count;
  double frame_timer;
  double frame_last_pts;
  double frame_last_delay;
  double video_clock;             ///< pts of last decoded frame / predicted pts of next decoded frame
  double video_current_pts;       ///< current displayed pts (different from video_clock if frame fifos are used)
  int64_t video_current_pts_time; ///< time (av_gettime) at which we updated video_current_pts - used to have running video pts
  AVStream *video_st;
  AVCodecContext *video_ctx;
  PacketQueue videoq;
  struct SwsContext *sws_ctx;

  VideoPicture pictq[VIDEO_PICTURE_QUEUE_SIZE];
  int pictq_size, pictq_rindex, pictq_windex;
  SDL_mutex *pictq_mutex;
  SDL_cond *pictq_cond;

  SDL_Thread *parse_tid;
  SDL_Thread *video_tid;

  char filename[1024];
  int quit;
} VideoState;

enum
{
  AV_SYNC_AUDIO_MASTER,
  AV_SYNC_VIDEO_MASTER,
  AV_SYNC_EXTERNAL_MASTER,
};

SDL_Surface *screen;
SDL_mutex *screen_mutex;

/* Since we only have one decoding thread, the Big Struct
can be global in case we need it. */
VideoState *global_video_state;

void packet_queue_init(PacketQueue *q)
{
  memset(q, 0, sizeof(PacketQueue));
  q->mutex = SDL_CreateMutex();
  q->cond = SDL_CreateCond();
}
int packet_queue_put(PacketQueue *q, AVPacket *pkt)
{

  AVPacketList *pkt1;
  if (av_dup_packet(pkt) < 0)
  {
    return -1;
  }
  pkt1 = av_malloc(sizeof(AVPacketList));
  if (!pkt1)
    return -1;
  pkt1->pkt = *pkt;
  pkt1->next = NULL;

  SDL_LockMutex(q->mutex);

  if (!q->last_pkt)
    q->first_pkt = pkt1;
  else
    q->last_pkt->next = pkt1;
  q->last_pkt = pkt1;
  q->nb_packets++;
  q->size += pkt1->pkt.size;
  SDL_CondSignal(q->cond);

  SDL_UnlockMutex(q->mutex);
  return 0;
}
static int packet_queue_get(PacketQueue *q, AVPacket *pkt, int block)
{
  AVPacketList *pkt1;
  int ret;

  SDL_LockMutex(q->mutex);

  for (;;)
  {

    if (global_video_state->quit)
    {
      ret = -1;
      break;
    }

    pkt1 = q->first_pkt;
    if (pkt1)
    {
      q->first_pkt = pkt1->next;
      if (!q->first_pkt)
        q->last_pkt = NULL;
      q->nb_packets--;
      q->size -= pkt1->pkt.size;
      *pkt = pkt1->pkt;
      av_free(pkt1);
      ret = 1;
      break;
    }
    else if (!block)
    {
      ret = 0;
      break;
    }
    else
    {
      SDL_CondWait(q->cond, q->mutex);
    }
  }
  SDL_UnlockMutex(q->mutex);
  return ret;
}

double get_audio_clock(VideoState *is)
{
  double pts;
  int hw_buf_size, bytes_per_sec, n;

  pts = is->audio_clock; /* maintained in the audio thread */
  hw_buf_size = is->audio_buf_size - is->audio_buf_index;
  bytes_per_sec = 0;
  if (is->audio_st)
  {
    /* only touch audio_ctx once the audio stream is known to be open */
    n = is->audio_ctx->channels * 2;
    bytes_per_sec = is->audio_ctx->sample_rate * n;
  }
  if (bytes_per_sec)
  {
    pts -= (double)hw_buf_size / bytes_per_sec;
  }
  return pts;
}
double get_video_clock(VideoState *is)
{
  double delta;

  delta = (av_gettime() - is->video_current_pts_time) / 1000000.0;
  return is->video_current_pts + delta;
}
double get_external_clock(VideoState *is)
{
  return av_gettime() / 1000000.0;
}

double get_master_clock(VideoState *is)
{
  if (is->av_sync_type == AV_SYNC_VIDEO_MASTER)
  {
    return get_video_clock(is);
  }
  else if (is->av_sync_type == AV_SYNC_AUDIO_MASTER)
  {
    return get_audio_clock(is);
  }
  else
  {
    return get_external_clock(is);
  }
}

/* Add or subtract samples to get a better sync, return new
audio buffer size */
int synchronize_audio(VideoState *is, short *samples,
                      int samples_size, double pts)
{
  int n;
  double ref_clock;

  n = 2 * is->audio_ctx->channels;

  if (is->av_sync_type != AV_SYNC_AUDIO_MASTER)
  {
    double diff, avg_diff;
    int wanted_size, min_size, max_size /*, nb_samples */;

    ref_clock = get_master_clock(is);
    diff = get_audio_clock(is) - ref_clock;

    if (fabs(diff) < AV_NOSYNC_THRESHOLD)
    {
      // accumulate the diffs
      is->audio_diff_cum = diff + is->audio_diff_avg_coef * is->audio_diff_cum;
      if (is->audio_diff_avg_count < AUDIO_DIFF_AVG_NB)
      {
        is->audio_diff_avg_count++;
      }
      else
      {
        avg_diff = is->audio_diff_cum * (1.0 - is->audio_diff_avg_coef);
        if (fabs(avg_diff) >= is->audio_diff_threshold)
        {
          wanted_size = samples_size + ((int)(diff * is->audio_ctx->sample_rate) * n);
          /* multiply before dividing: (100 - 10) / 100 on its own truncates to 0 */
          min_size = samples_size * (100 - SAMPLE_CORRECTION_PERCENT_MAX) / 100;
          max_size = samples_size * (100 + SAMPLE_CORRECTION_PERCENT_MAX) / 100;
          if (wanted_size < min_size)
          {
            wanted_size = min_size;
          }
          else if (wanted_size > max_size)
          {
            wanted_size = max_size;
          }
          if (wanted_size < samples_size)
          {
            /* remove samples */
            samples_size = wanted_size;
          }
          else if (wanted_size > samples_size)
          {
            uint8_t *samples_end, *q;
            int nb;

            /* add samples by copying final sample*/
            nb = (wanted_size - samples_size); /* bytes to add; must be positive for the loop to run */
            samples_end = (uint8_t *)samples + samples_size - n;
            q = samples_end + n;
            while (nb > 0)
            {
              memcpy(q, samples_end, n);
              q += n;
              nb -= n;
            }
            samples_size = wanted_size;
          }
        }
      }
    }
    else
    {
      /* difference is TOO big; reset diff stuff */
      is->audio_diff_avg_count = 0;
      is->audio_diff_cum = 0;
    }
  }
  return samples_size;
}

int audio_decode_frame(VideoState *is, uint8_t *audio_buf, int buf_size, double *pts_ptr)
{

  int len1, data_size = 0;
  AVPacket *pkt = &is->audio_pkt;
  double pts;
  int n;

  for (;;)
  {
    while (is->audio_pkt_size > 0)
    {
      int got_frame = 0;
      len1 = avcodec_decode_audio4(is->audio_ctx, &is->audio_frame, &got_frame, pkt);
      if (len1 < 0)
      {
        /* if error, skip frame */
        is->audio_pkt_size = 0;
        break;
      }
      data_size = 0;
      if (got_frame)
      {
        data_size = av_samples_get_buffer_size(NULL,
                                               is->audio_ctx->channels,
                                               is->audio_frame.nb_samples,
                                               is->audio_ctx->sample_fmt,
                                               1);
        assert(data_size <= buf_size);
        memcpy(audio_buf, is->audio_frame.data[0], data_size);
      }
      is->audio_pkt_data += len1;
      is->audio_pkt_size -= len1;
      if (data_size <= 0)
      {
        /* No data yet, get more frames */
        continue;
      }
      pts = is->audio_clock;
      *pts_ptr = pts;
      n = 2 * is->audio_ctx->channels;
      is->audio_clock += (double)data_size /
                         (double)(n * is->audio_ctx->sample_rate);
      /* We have data, return it and come back for more later */
      return data_size;
    }
    if (pkt->data)
      av_free_packet(pkt);

    if (is->quit)
    {
      return -1;
    }
    /* next packet */
    if (packet_queue_get(&is->audioq, pkt, 1) < 0)
    {
      return -1;
    }
    is->audio_pkt_data = pkt->data;
    is->audio_pkt_size = pkt->size;
    /* if update, update the audio clock w/pts */
    if (pkt->pts != AV_NOPTS_VALUE)
    {
      is->audio_clock = av_q2d(is->audio_st->time_base) * pkt->pts;
    }
  }
}

void audio_callback(void *userdata, Uint8 *stream, int len)
{

  VideoState *is = (VideoState *)userdata;
  int len1, audio_size;
  double pts;

  while (len > 0)
  {
    if (is->audio_buf_index >= is->audio_buf_size)
    {
      /* We have already sent all our data; get more */
      audio_size = audio_decode_frame(is, is->audio_buf, sizeof(is->audio_buf), &pts);
      if (audio_size < 0)
      {
        /* If error, output silence */
        is->audio_buf_size = 1024;
        memset(is->audio_buf, 0, is->audio_buf_size);
      }
      else
      {
        audio_size = synchronize_audio(is, (int16_t *)is->audio_buf,
                                       audio_size, pts);
        is->audio_buf_size = audio_size;
      }
      is->audio_buf_index = 0;
    }
    len1 = is->audio_buf_size - is->audio_buf_index;
    if (len1 > len)
      len1 = len;
    memcpy(stream, (uint8_t *)is->audio_buf + is->audio_buf_index, len1);
    len -= len1;
    stream += len1;
    is->audio_buf_index += len1;
  }
}

static Uint32 sdl_refresh_timer_cb(Uint32 interval, void *opaque)
{
  SDL_Event event;
  event.type = FF_REFRESH_EVENT;
  event.user.data1 = opaque;
  SDL_PushEvent(&event);
  return 0; /* 0 means stop timer */
}

/* schedule a video refresh in 'delay' ms */
static void schedule_refresh(VideoState *is, int delay)
{
  SDL_AddTimer(delay, sdl_refresh_timer_cb, is);
}

void video_display(VideoState *is)
{

  SDL_Rect rect;
  VideoPicture *vp;
  float aspect_ratio;
  int w, h, x, y;
  int i;

  vp = &is->pictq[is->pictq_rindex];
  if (vp->bmp)
  {
    if (is->video_ctx->sample_aspect_ratio.num == 0)
    {
      aspect_ratio = 0;
    }
    else
    {
      aspect_ratio = av_q2d(is->video_ctx->sample_aspect_ratio) *
                     is->video_ctx->width / is->video_ctx->height;
    }
    if (aspect_ratio <= 0.0)
    {
      aspect_ratio = (float)is->video_ctx->width /
                     (float)is->video_ctx->height;
    }
    h = screen->h;
    w = ((int)rint(h * aspect_ratio)) & -3;
    if (w > screen->w)
    {
      w = screen->w;
      h = ((int)rint(w / aspect_ratio)) & -3;
    }
    x = (screen->w - w) / 2;
    y = (screen->h - h) / 2;

    rect.x = x;
    rect.y = y;
    rect.w = w;
    rect.h = h;
    SDL_LockMutex(screen_mutex);
    SDL_DisplayYUVOverlay(vp->bmp, &rect);
    SDL_UnlockMutex(screen_mutex);
  }
}

void video_refresh_timer(void *userdata)
{

  VideoState *is = (VideoState *)userdata;
  VideoPicture *vp;
  double actual_delay, delay, sync_threshold, ref_clock, diff;

  if (is->video_st)
  {
    if (is->pictq_size == 0)
    {
      schedule_refresh(is, 1);
    }
    else
    {
      vp = &is->pictq[is->pictq_rindex];

      is->video_current_pts = vp->pts;
      is->video_current_pts_time = av_gettime();
      delay = vp->pts - is->frame_last_pts; /* the pts from last time */
      if (delay <= 0 || delay >= 1.0)
      {
        /* if incorrect delay, use previous one */
        delay = is->frame_last_delay;
      }
      /* save for next time */
      is->frame_last_delay = delay;
      is->frame_last_pts = vp->pts;

      /* update delay to sync to audio if not master source */
      if (is->av_sync_type != AV_SYNC_VIDEO_MASTER)
      {
        ref_clock = get_master_clock(is);
        diff = vp->pts - ref_clock;

        /* Skip or repeat the frame. Take delay into account
        FFPlay still doesn't "know if this is the best guess." */
        sync_threshold = (delay > AV_SYNC_THRESHOLD) ? delay : AV_SYNC_THRESHOLD;
        if (fabs(diff) < AV_NOSYNC_THRESHOLD)
        {
          if (diff <= -sync_threshold)
          {
            delay = 0;
          }
          else if (diff >= sync_threshold)
          {
            delay = 2 * delay;
          }
        }
      }
      is->frame_timer += delay;
      /* compute the REAL delay */
      actual_delay = is->frame_timer - (av_gettime() / 1000000.0);
      if (actual_delay < 0.010)
      {
        /* Really it should skip the picture instead */
        actual_delay = 0.010;
      }
      schedule_refresh(is, (int)(actual_delay * 1000 + 0.5));

      /* show the picture! */
      video_display(is);

      /* update queue for next picture! */
      if (++is->pictq_rindex == VIDEO_PICTURE_QUEUE_SIZE)
      {
        is->pictq_rindex = 0;
      }
      SDL_LockMutex(is->pictq_mutex);
      is->pictq_size--;
      SDL_CondSignal(is->pictq_cond);
      SDL_UnlockMutex(is->pictq_mutex);
    }
  }
  else
  {
    schedule_refresh(is, 100);
  }
}

void alloc_picture(void *userdata)
{

  VideoState *is = (VideoState *)userdata;
  VideoPicture *vp;

  vp = &is->pictq[is->pictq_windex];
  if (vp->bmp)
  {
    // we already have one make another, bigger/smaller
    SDL_FreeYUVOverlay(vp->bmp);
  }
  // Allocate a place to put our YUV image on that screen
  SDL_LockMutex(screen_mutex);
  vp->bmp = SDL_CreateYUVOverlay(is->video_ctx->width,
                                 is->video_ctx->height,
                                 SDL_YV12_OVERLAY,
                                 screen);
  SDL_UnlockMutex(screen_mutex);

  vp->width = is->video_ctx->width;
  vp->height = is->video_ctx->height;
  vp->allocated = 1;
}

int queue_picture(VideoState *is, AVFrame *pFrame, double pts)
{

  VideoPicture *vp;
  int dst_pix_fmt;
  AVPicture pict;

  /* wait until we have space for a new pic */
  SDL_LockMutex(is->pictq_mutex);
  while (is->pictq_size >= VIDEO_PICTURE_QUEUE_SIZE &&
         !is->quit)
  {
    SDL_CondWait(is->pictq_cond, is->pictq_mutex);
  }
  SDL_UnlockMutex(is->pictq_mutex);

  if (is->quit)
    return -1;

  // windex is set to 0 initially
  vp = &is->pictq[is->pictq_windex];

  /* allocate or resize the buffer! */
  if (!vp->bmp ||
      vp->width != is->video_ctx->width ||
      vp->height != is->video_ctx->height)
  {
    SDL_Event event;

    vp->allocated = 0;
    alloc_picture(is);
    if (is->quit)
    {
      return -1;
    }
  }

  /* We have a place to put our picture on the queue */

  if (vp->bmp)
  {

    SDL_LockYUVOverlay(vp->bmp);
    vp->pts = pts;

    dst_pix_fmt = PIX_FMT_YUV420P;
    /* point pict at the queue */

    pict.data[0] = vp->bmp->pixels[0];
    pict.data[1] = vp->bmp->pixels[2];
    pict.data[2] = vp->bmp->pixels[1];

    pict.linesize[0] = vp->bmp->pitches[0];
    pict.linesize[1] = vp->bmp->pitches[2];
    pict.linesize[2] = vp->bmp->pitches[1];

    // Convert the image into YUV format that SDL uses
    sws_scale(is->sws_ctx, (uint8_t const *const *)pFrame->data,
              pFrame->linesize, 0, is->video_ctx->height,
              pict.data, pict.linesize);

    SDL_UnlockYUVOverlay(vp->bmp);
    /* now we inform our display thread that we have a pic ready */
    if (++is->pictq_windex == VIDEO_PICTURE_QUEUE_SIZE)
    {
      is->pictq_windex = 0;
    }
    SDL_LockMutex(is->pictq_mutex);
    is->pictq_size++;
    SDL_UnlockMutex(is->pictq_mutex);
  }
  return 0;
}

double synchronize_video(VideoState *is, AVFrame *src_frame, double pts)
{

  double frame_delay;

  if (pts != 0)
  {
    /* if we have pts, set video clock to it */
    is->video_clock = pts;
  }
  else
  {
    /* if we aren't given a pts, set it to the clock */
    pts = is->video_clock;
  }
  /* update the video clock */
  frame_delay = av_q2d(is->video_ctx->time_base);
  /* if we are repeating a frame, adjust clock accordingly */
  frame_delay += src_frame->repeat_pict * (frame_delay * 0.5);
  is->video_clock += frame_delay;
  return pts;
}

int video_thread(void *arg)
{
  VideoState *is = (VideoState *)arg;
  AVPacket pkt1, *packet = &pkt1;
  int frameFinished;
  AVFrame *pFrame;
  double pts;

  pFrame = av_frame_alloc();

  for (;;)
  {
    /* fetch exactly one packet per iteration; a second fetch here would
       drop (and leak) every other packet */
    if (packet_queue_get(&is->videoq, packet, 1) < 0)
    {
      // means we quit getting packets
      break;
    }
    pts = 0;

    // Decode video frame
    avcodec_decode_video2(is->video_ctx, pFrame, &frameFinished, packet);

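    /* use the decoder's best-effort timestamp; fall back to 0 only when none is available */
    int64_t best_ts = av_frame_get_best_effort_timestamp(pFrame);
    pts = (best_ts == AV_NOPTS_VALUE) ? 0 : (double)best_ts;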
    pts *= av_q2d(is->video_st->time_base);

    // Did we get a video frame?
    if (frameFinished)
    {
      pts = synchronize_video(is, pFrame, pts);
      if (queue_picture(is, pFrame, pts) < 0)
      {
        break;
      }
    }
    av_free_packet(packet);
  }
  av_frame_free(&pFrame);
  return 0;
}

int stream_component_open(VideoState *is, int stream_index)
{

  AVFormatContext *pFormatCtx = is->pFormatCtx;
  AVCodecContext *codecCtx = NULL;
  AVCodec *codec = NULL;
  SDL_AudioSpec wanted_spec, spec;

  if (stream_index < 0 || stream_index >= pFormatCtx->nb_streams)
  {
    return -1;
  }

  codec = avcodec_find_decoder(pFormatCtx->streams[stream_index]->codec->codec_id);
  if (!codec)
  {
    fprintf(stderr, "Unsupported codec!\n");
    return -1;
  }

  codecCtx = avcodec_alloc_context3(codec);
  if (avcodec_copy_context(codecCtx, pFormatCtx->streams[stream_index]->codec) != 0)
  {
    fprintf(stderr, "Couldn't copy codec context");
    return -1; // Error copying codec context
  }

  if (codecCtx->codec_type == AVMEDIA_TYPE_AUDIO)
  {
    // Set audio settings from codec info
    wanted_spec.freq = codecCtx->sample_rate;
    wanted_spec.format = AUDIO_S16SYS;
    wanted_spec.channels = codecCtx->channels;
    wanted_spec.silence = 0;
    wanted_spec.samples = SDL_AUDIO_BUFFER_SIZE;
    wanted_spec.callback = audio_callback;
    wanted_spec.userdata = is;

    if (SDL_OpenAudio(&wanted_spec, &spec) < 0)
    {
      fprintf(stderr, "SDL_OpenAudio: %s\n", SDL_GetError());
      return -1;
    }
    is->audio_hw_buf_size = spec.size;
  }
  if (avcodec_open2(codecCtx, codec, NULL) < 0)
  {
    fprintf(stderr, "Unsupported codec!\n");
    return -1;
  }

  switch (codecCtx->codec_type)
  {
  case AVMEDIA_TYPE_AUDIO:
    is->audioStream = stream_index;
    is->audio_st = pFormatCtx->streams[stream_index];
    is->audio_ctx = codecCtx;
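    is->audio_buf_size = 0;
    is->audio_buf_index = 0;
    /* averaging filter for audio sync; these values follow ffplay and are an
       assumption here, since the listing otherwise leaves the fields zeroed */
    is->audio_diff_avg_coef = exp(log(0.01) / AUDIO_DIFF_AVG_NB);
    is->audio_diff_avg_count = 0;
    /* correct audio only if the error is larger than this */
    is->audio_diff_threshold = 2.0 * SDL_AUDIO_BUFFER_SIZE / codecCtx->sample_rate;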
    memset(&is->audio_pkt, 0, sizeof(is->audio_pkt));
    packet_queue_init(&is->audioq);
    SDL_PauseAudio(0);
    break;
  case AVMEDIA_TYPE_VIDEO:
    is->videoStream = stream_index;
    is->video_st = pFormatCtx->streams[stream_index];
    is->video_ctx = codecCtx;

    is->frame_timer = (double)av_gettime() / 1000000.0;
    is->frame_last_delay = 40e-3;
    is->video_current_pts_time = av_gettime();

    packet_queue_init(&is->videoq);
    is->video_tid = SDL_CreateThread(video_thread, is);
    is->sws_ctx = sws_getContext(is->video_ctx->width, is->video_ctx->height,
                                 is->video_ctx->pix_fmt, is->video_ctx->width,
                                 is->video_ctx->height, PIX_FMT_YUV420P,
                                 SWS_BILINEAR, NULL, NULL, NULL);
    break;
  default:
    break;
  }
  return 0;
}

int decode_thread(void *arg)
{

  VideoState *is = (VideoState *)arg;
  AVFormatContext *pFormatCtx = NULL; /* avformat_open_input expects NULL or an allocated context */
  AVPacket pkt1, *packet = &pkt1;

  int video_index = -1;
  int audio_index = -1;
  int i;

  is->videoStream = -1;
  is->audioStream = -1;

  global_video_state = is;

  // Open video file
  if (avformat_open_input(&pFormatCtx, is->filename, NULL, NULL) != 0)
    return -1; // Couldn't open file

  is->pFormatCtx = pFormatCtx;

  // Retrieve stream information
  if (avformat_find_stream_info(pFormatCtx, NULL) < 0)
    return -1; // Couldn't find stream information

  // Dump information about file onto standard error
  av_dump_format(pFormatCtx, 0, is->filename, 0);

  // Find the first video stream

  for (i = 0; i < pFormatCtx->nb_streams; i++)
  {
    if (pFormatCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO &&
        video_index < 0)
    {
      video_index = i;
    }
    if (pFormatCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_AUDIO &&
        audio_index < 0)
    {
      audio_index = i;
    }
  }
  if (audio_index >= 0)
  {
    stream_component_open(is, audio_index);
  }
  if (video_index >= 0)
  {
    stream_component_open(is, video_index);
  }

  if (is->videoStream < 0 || is->audioStream < 0)
  {
    fprintf(stderr, "%s: could not open codecs\n", is->filename);
    goto fail;
  }

  // main decode loop

  for (;;)
  {
    if (is->quit)
    {
      break;
    }
    // seek stuff goes here
    if (is->audioq.size > MAX_AUDIOQ_SIZE ||
        is->videoq.size > MAX_VIDEOQ_SIZE)
    {
      SDL_Delay(10);
      continue;
    }
    if (av_read_frame(is->pFormatCtx, packet) < 0)
    {
      if (is->pFormatCtx->pb->error == 0)
      {
        SDL_Delay(100); /* no error; wait for user input */
        continue;
      }
      else
      {
        break;
      }
    }
    // Is this a packet from the video stream?
    if (packet->stream_index == is->videoStream)
    {
      packet_queue_put(&is->videoq, packet);
    }
    else if (packet->stream_index == is->audioStream)
    {
      packet_queue_put(&is->audioq, packet);
    }
    else
    {
      av_free_packet(packet);
    }
  }
  /* all done - wait for it */
  while (!is->quit)
  {
    SDL_Delay(100);
  }

fail:
  if (1)
  {
    SDL_Event event;
    event.type = FF_QUIT_EVENT;
    event.user.data1 = is;
    SDL_PushEvent(&event);
  }
  return 0;
}

int main(int argc, char *argv[])
{

  SDL_Event event;

  VideoState *is;

  is = av_mallocz(sizeof(VideoState));

  if (argc < 2)
  {
    fprintf(stderr, "Usage: test <file>\n");
    exit(1);
  }
  // Register all formats and codecs
  av_register_all();

  if (SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER))
  {
    fprintf(stderr, "Could not initialize SDL - %s\n", SDL_GetError());
    exit(1);
  }

// Make a screen to put our video
#ifndef __DARWIN__
  screen = SDL_SetVideoMode(640, 480, 0, 0);
#else
  screen = SDL_SetVideoMode(640, 480, 24, 0);
#endif
  if (!screen)
  {
    fprintf(stderr, "SDL: could not set video mode - exiting\n");
    exit(1);
  }

  screen_mutex = SDL_CreateMutex();

  av_strlcpy(is->filename, argv[1], sizeof(is->filename));

  is->pictq_mutex = SDL_CreateMutex();
  is->pictq_cond = SDL_CreateCond();

  schedule_refresh(is, 40);

  is->av_sync_type = DEFAULT_AV_SYNC_TYPE;
  is->parse_tid = SDL_CreateThread(decode_thread, is);
  if (!is->parse_tid)
  {
    av_free(is);
    return -1;
  }
  for (;;)
  {

    SDL_WaitEvent(&event);
    switch (event.type)
    {
    case FF_QUIT_EVENT:
    case SDL_QUIT:
      is->quit = 1;
      SDL_Quit();
      return 0;
      break;
    case FF_REFRESH_EVENT:
      video_refresh_timer(event.user.data1);
      break;
    default:
      break;
    }
  }
  return 0;
}

Source: FFmpeg and SDL Tutorial - Synching Audio
