for issue #4418, #4151, #4076 .DVR Missing First Few Seconds of
Audio/Video
### Root Cause
When recording WebRTC streams to FLV files using DVR, the first 4-6
seconds of audio/video are missing. This occurs because:
1. **Packets are discarded before A/V sync is available**: The
RTC-to-RTMP conversion pipeline actively discards all RTP packets when
avsync_time <= 0.
2. **Original algorithm requires 2 RTCP SR packets**: The previous
implementation needed to receive two RTCP Sender Report (SR) packets
before it could calculate the rate for audio/video synchronization
timestamp conversion.
3. **Delay causes packet loss**: Since RTCP SR packets typically arrive
every 2-3 seconds, waiting for 2 SRs means 4-6 seconds of packets are
discarded before A/V sync becomes available.
4. **Audio SR arrives slower than video SR**: As reported in the issue,
video RTCP SR packets arrive much faster than audio SR packets. This
asymmetry causes audio packets to be discarded for a longer period,
resulting in the audio loss observed in DVR recordings.
### Solution
1. **Initialize rate from SDP**: Use the sample rate from SDP (Session
Description Protocol) to calculate the initial rate immediately when the
track is created.
Audio (Opus): 48000 Hz → rate = 48 (RTP units per millisecond)
Video (H.264/H.265): 90000 Hz → rate = 90 (RTP units per millisecond)
2. **Enable immediate A/V sync:** With the SDP rate available,
cal_avsync_time() can calculate valid timestamps from the very first RTP
packet, eliminating packet loss.
3. **Smooth transition to precise rate**: After receiving the 2nd RTCP
SR, update to the precisely calculated rate based on actual RTP/NTP
timestamp mapping.
## Configuration
Added new configuration option `init_rate_from_sdp` in the RTC vhost
section:
```nginx
vhost rtc.vhost.srs.com {
rtc {
# Whether initialize RTP rate from SDP sample rate for immediate A/V sync.
# When enabled, the RTP rate (units per millisecond) is initialized from the SDP
# sample rate (e.g., 90 for video 90kHz, 48 for audio 48kHz) before receiving
# 2 RTCP SR packets. This allows immediate audio/video synchronization.
# The rate will be updated to a more precise value after receiving the 2nd SR.
# Overwrite by env SRS_VHOST_RTC_INIT_RATE_FROM_SDP for all vhosts.
# Default: off
init_rate_from_sdp off;
}
}
```
**⚠️ Important Note**: This config defaults to **off** because:
- ✅ When **enabled**: Fixes the audio loss problem (no missing first 4-6
seconds)
- ❌ When **enabled**: VLC on macOS cannot play the video properly
- ✅ Other platforms work fine (Windows, Linux)
- ✅ FFplay works fine on all platforms
Users experiencing audio loss in DVR recordings can enable this option
if they don't need VLC macOS compatibility. We're investigating the VLC
macOS issue to make this feature safe to enable by default in the
future.
---------
Co-authored-by: winlin <winlinvip@gmail.com>
Co-authored-by: OSSRS-AI <winlinam@gmail.com>
This PR introduces a major refactoring to replace `SrsSharedPtrMessage`
with `SrsMediaPacket` throughout the SRS codebase, providing a more
unified and cleaner approach to media packet handling.
---------
Co-authored-by: OSSRS-AI <winlinam@gmail.com>
The `ffmpeg-opus` tool allows you to control the delay using the
`opus_delay` option. The minimum delay can be set to 2.5ms. However, in
practice, you cannot set it this low. You need to set at least 10 frames
to allow the audio encoder to lookahead. Otherwise, the sound will be
distorted.
---------
Co-authored-by: chundonglinlin <chundonglinlin@163.com>
1. After enabling FFmpeg opus, the transcoding time for each opus packet
is around 4ms.
2. To speed up case execution, our test publisher sends 400 opus packets
at intervals of 1ms.
3. After the publisher starts, wait for 30ms, then the player starts.
4. Due to the lengthy processing time for each opus packet, SRS
continuously receives packets from the publisher, so it doesn't switch
coroutines and can't accept the player's connection.
5. Only after all opus packets are processed will it accept the player
connection. Therefore, the player doesn't receive any data, leading to
the failure of the case.
---------
Co-authored-by: winlin <winlinvip@gmail.com>
### Description
When converting between AAC and Opus formats (aac2opus or opus2aac), the
`av_frame_get_buffer` API is frequently called.
### Objective
The goal is to optimize the code logic and reduce the frequent
allocation and deallocation of memory.
In the case of aac2opus, av_frame_get_buffer is still frequently called.
In the case of opus2aac, the goal is to avoid calling
av_frame_get_buffer and reduce memory allocations.
### Additional Note
Before calling the `av_audio_fifo_read` API, use
`av_frame_make_writable` to check if the frame is writable. If it is not
writable, create a new frame.
---------
Co-authored-by: john <hondaxiao@tencent.com>
Follow the example in FFmpeg's doc, before calling the API
`avcodec_send_frame`, always use `av_frame_alloc` to create a new frame.
---------
Co-authored-by: Haibo Chen <495810242@qq.com>