Introduction
In the rapidly evolving landscape of real-time communication and streaming, integrating different protocols to leverage their unique strengths is crucial. This project presents an RTMP server inspired by the LiveKit Ingress Service. It receives an RTMP stream published by a user in a room, transcodes the audio from AAC to Opus (making it WebRTC compatible), ensures the video is H264, and pushes both to WebRTC tracks connected to clients. The server acts as a peer, maintaining a peer-to-peer (P2P) connection with each client.
Why RTMP and WebRTC?
RTMP: A Proven Protocol for Live Streaming
Real-Time Messaging Protocol (RTMP) is a mature and robust protocol widely used for live streaming. It provides low-latency transmission of audio, video, and data over the Internet. RTMP is favored for its ability to handle high-quality streams with minimal buffering and its support for a variety of codecs and formats. This makes it an excellent choice for ingesting live video streams.
WebRTC: Real-Time Communication in the Browser
Web Real-Time Communication (WebRTC) is a cutting-edge technology that enables real-time audio, video, and data sharing directly between browsers without the need for plugins. WebRTC is designed for low-latency communication, making it ideal for video conferencing, live streaming, and interactive applications. Its peer-to-peer architecture ensures efficient data transmission and scalability.
Integrating RTMP and WebRTC: The Best of Both Worlds
By integrating RTMP for stream ingestion and WebRTC for stream delivery, we can create a powerful real-time streaming solution. RTMP handles the initial high-quality stream intake, and WebRTC ensures efficient, low-latency distribution to end-users. This combination provides a seamless streaming experience with the reliability of RTMP and the real-time capabilities of WebRTC.
Features
RTMP to WebRTC: Receives RTMP streams and delivers them to WebRTC clients.
Audio Transcoding: Transcodes AAC audio to Opus for WebRTC compatibility.
Video Transcoding: Ensures video is encoded in H264 for WebRTC delivery.
Webhook Notifications: Uses webhooks to notify the publishing state of the stream to different rooms.
WebSocket Signaling: Establishes WebRTC connections using WebSockets for offer/answer exchange.
Concurrency for Performance: Utilizes Go’s concurrency patterns and channels to enhance streaming performance and reduce latency.
Core Libraries and Packages
Pion WebRTC: Used for handling WebRTC connections.
yutopp/go-rtmp: Used for handling RTMP streams.
fdkaac: For AAC decoding.
gopkg.in/hraban/opus.v2: For Opus encoding.
go-chi/chi: Lightweight, idiomatic, and composable router for building Go HTTP services.
logrus: For logging.
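As a point of reference, here is a minimal skeleton showing how the chi router and logrus fit together on the HTTP side of the server. The chi/v5 module path, the port, and the /healthz route are illustrative assumptions rather than details taken from the project.

```go
package main

import (
	"net/http"

	"github.com/go-chi/chi/v5"
	"github.com/sirupsen/logrus"
)

func main() {
	r := chi.NewRouter()
	// Placeholder route; the real server registers its signaling and webhook endpoints here.
	r.Get("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})

	logrus.Info("HTTP server listening on :8080")
	logrus.Fatal(http.ListenAndServe(":8080", r))
}
```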
How It Works
RTMP Server
The RTMP server listens for incoming RTMP streams. When a stream is published:
Audio Processing: Decodes AAC audio and encodes it into Opus format using fdkaac and opus.
Video Processing: Ensures the video stream is in H264 format.
WebRTC Integration: Sends processed audio and video to WebRTC tracks connected to clients (see the sketch below).
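To make the flow concrete, here is a rough sketch of how incoming media could be fanned out to the transcoding and delivery goroutines. The mediaHub type, the channel names, and the non-blocking sends are illustrative assumptions, not the project's actual types, and the yutopp/go-rtmp callback signatures are deliberately left out rather than guessed at.

```go
// mediaHub fans incoming RTMP media out to the transcoder and the WebRTC writers.
// The layout here is illustrative; the real server may organize this differently.
type mediaHub struct {
	audioCh chan []byte // raw AAC frames from the RTMP demuxer
	videoCh chan []byte // H264 access units from the RTMP demuxer
}

// onAudio and onVideo would be wired into the RTMP library's publish callbacks.
func (h *mediaHub) onAudio(aacFrame []byte) {
	select {
	case h.audioCh <- aacFrame: // hand off to the AAC-to-Opus transcoding goroutine
	default: // drop the frame rather than block the RTMP read loop
	}
}

func (h *mediaHub) onVideo(accessUnit []byte) {
	select {
	case h.videoCh <- accessUnit: // hand off to the WebRTC video track writer
	default:
	}
}
```

The non-blocking sends keep the RTMP read loop from stalling if a consumer falls behind, at the cost of dropping frames under pressure.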
WebRTC Connection
The WebRTC connection is established via WebSockets:
WebSocket Handler: Manages WebRTC signaling (offer/answer exchange) over WebSockets (sketched after this list).
Peer Connection: Each client establishes a peer connection with the server.
Track Delivery: Delivers audio and video tracks to clients via WebRTC.
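Below is a condensed sketch of that signaling exchange using Pion, assuming gorilla/websocket for the upgrade and a simple offer-in/answer-out message flow; the project may structure its messages differently, and ICE candidate handling plus most error handling are trimmed for brevity.

```go
package signaling

import (
	"net/http"

	"github.com/gorilla/websocket"
	"github.com/pion/webrtc/v3"
)

var upgrader = websocket.Upgrader{
	CheckOrigin: func(*http.Request) bool { return true }, // tighten this in production
}

// handleSignaling upgrades the request to a WebSocket, reads the client's SDP offer,
// attaches the shared audio/video tracks, and replies with the server's answer.
func handleSignaling(w http.ResponseWriter, r *http.Request, audioTrack, videoTrack webrtc.TrackLocal) error {
	ws, err := upgrader.Upgrade(w, r, nil)
	if err != nil {
		return err
	}
	defer ws.Close()

	pc, err := webrtc.NewPeerConnection(webrtc.Configuration{})
	if err != nil {
		return err
	}
	if _, err := pc.AddTrack(audioTrack); err != nil {
		return err
	}
	if _, err := pc.AddTrack(videoTrack); err != nil {
		return err
	}

	// Offer in, answer out.
	var offer webrtc.SessionDescription
	if err := ws.ReadJSON(&offer); err != nil {
		return err
	}
	if err := pc.SetRemoteDescription(offer); err != nil {
		return err
	}
	answer, err := pc.CreateAnswer(nil)
	if err != nil {
		return err
	}
	if err := pc.SetLocalDescription(answer); err != nil {
		return err
	}
	return ws.WriteJSON(pc.LocalDescription())
}
```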
Webhooks
A webhook manager watches the audio and video channels and notifies subscribers of stream state changes:
Notifications: Sent by the webhook manager when a stream starts or stops (see the sketch below).
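As a rough illustration, a notification can be a small JSON POST to each subscriber whenever a stream's state changes. The payload fields, the subscriber list, and the fire-and-forget delivery below are assumptions made for the sketch, not the project's actual webhook format.

```go
package webhook

import (
	"bytes"
	"encoding/json"
	"net/http"

	"github.com/sirupsen/logrus"
)

// StreamEvent is an illustrative payload; the real field names may differ.
type StreamEvent struct {
	Room  string `json:"room"`
	State string `json:"state"` // e.g. "published" or "stopped"
}

// notifySubscribers posts the event to every registered webhook URL.
func notifySubscribers(urls []string, ev StreamEvent) {
	body, err := json.Marshal(ev)
	if err != nil {
		return
	}
	for _, u := range urls {
		resp, err := http.Post(u, "application/json", bytes.NewReader(body))
		if err != nil {
			logrus.WithError(err).Warnf("webhook delivery to %s failed", u)
			continue
		}
		resp.Body.Close()
	}
}
```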
Challenges Faced
One of the primary challenges in this project was the lack of Opus audio support in RTMP. RTMP streams published from OBS (Open Broadcaster Software) carry AAC audio, and RTMP has no standard way to carry Opus. This posed a problem, since browser WebRTC stacks expect Opus and do not accept AAC. Here's how I tackled this issue:
Initial Approach: External Pipeline
My first solution was to use a separate GStreamer or FFmpeg pipeline to convert the AAC encoded audio. This pipeline would process the audio and pass it to an RTP channel, which would then ingest the audio packets directly into WebRTC. However, this approach increased CPU utilization by 70%, significantly impacting performance when handling multiple streams.
Optimized Solution: In-Memory Encoding
After further research, I discovered a more efficient method. By encoding the audio buffer in memory and writing it straight to a Go channel, I could feed the WebRTC tracks with Opus directly. I used the gopkg.in/hraban/opus.v2 package, which provides Go (cgo) bindings for the C libraries libopus and libopusfile, the actual encoder and decoder implementations.
This approach allowed for in-memory translation of the audio layer from AAC to Opus, drastically reducing the performance cost compared to the initial solution. The overhead was minimal, making it almost as efficient as streaming without encoding.
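Below is a minimal sketch of that encoding step with gopkg.in/hraban/opus.v2, assuming the AAC decoder has already produced a 20 ms frame of 48 kHz stereo PCM (1920 interleaved int16 samples, the layout WebRTC audio expects). The helper names and buffer size are illustrative.

```go
package transcode

import (
	"time"

	"github.com/pion/webrtc/v3"
	"github.com/pion/webrtc/v3/pkg/media"
	"gopkg.in/hraban/opus.v2"
)

// newOpusEncoder configures a 48 kHz stereo encoder, the setup WebRTC expects.
func newOpusEncoder() (*opus.Encoder, error) {
	return opus.NewEncoder(48000, 2, opus.AppAudio)
}

// encodeAndWrite turns one 20 ms frame of decoded PCM into a single Opus packet
// and writes it straight to the WebRTC audio track, with no external pipeline.
func encodeAndWrite(enc *opus.Encoder, pcm []int16, track *webrtc.TrackLocalStaticSample) error {
	buf := make([]byte, 1400) // comfortably larger than any single Opus packet
	n, err := enc.Encode(pcm, buf)
	if err != nil {
		return err
	}
	return track.WriteSample(media.Sample{
		Data:     buf[:n],
		Duration: 20 * time.Millisecond,
	})
}
```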
Performance Enhancements
Concurrency: Utilizes Go’s concurrency patterns to efficiently handle multiple streams.
Channels: Uses channels for buffering video and audio data, ensuring smooth delivery to WebRTC tracks (see the sketch below).
Optimized Transcoding: Efficiently transcodes audio and video to minimize latency.
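The channel pattern looks roughly like this: a buffered channel decouples the RTMP ingest loop from the WebRTC writers, and one goroutine per track drains it. The channel size, the fixed frame duration, and the function name are assumptions for the sketch; real code should derive sample durations from RTMP timestamps.

```go
package delivery

import (
	"time"

	"github.com/pion/webrtc/v3"
	"github.com/pion/webrtc/v3/pkg/media"
	"github.com/sirupsen/logrus"
)

// pumpVideo drains buffered H264 access units and writes them to the shared video
// track, so the RTMP handler never blocks on a slow WebRTC peer.
func pumpVideo(videoCh <-chan []byte, track *webrtc.TrackLocalStaticSample) {
	for au := range videoCh {
		if err := track.WriteSample(media.Sample{
			Data:     au,
			Duration: time.Second / 30, // assumes ~30 fps
		}); err != nil {
			logrus.WithError(err).Warn("dropping video sample")
		}
	}
}

// Typical wiring:
//   videoCh := make(chan []byte, 128) // buffer absorbs short bursts without blocking ingest
//   go pumpVideo(videoCh, videoTrack)
```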
Conclusion
This project demonstrates the power of combining RTMP and WebRTC to create a real-time streaming solution that is both robust and efficient. By leveraging the strengths of each protocol, we can deliver high-quality, low-latency streams to users seamlessly. Whether you’re building a live streaming platform, a video conferencing tool, or any other real-time application, this RTMP server provides a solid foundation for your needs.
Stay tuned for further updates and enhancements to this project, and feel free to contribute!
Source Code