In this overview, I will start with a bird’s-eye view of the WebRTC data channel. In the following posts, I will dive deep into implementation details and use cases. This overview and the follow-up posts are based on our direct experience with the technology and will be rather technical in nature.
An important disclaimer
Before discussing WebRTC, I always make this (three-fold) disclaimer: it is a relatively young technology and the technical specification is changing and will keep changing. Everything described in this series of posts applies only to the current state of affairs in WebRTC, as of December 2014 (Chrome 39, Firefox 33).
Moreover, as with any complex piece of software, you will find bugs in WebRTC. These bugs can be various degrees of annoying: from introducing unnecessary levels of complexity in the application to blocking the development of a project altogether. I'll talk about a few such bugs later.
Lastly, as an open-source technology, WebRTC evolves through contributions from many developers the world over. Sometimes they change some of the parameters, and this tweaking can affect the dynamic behavior of a system even though the API is not violated in any way. Usually, one does not have to worry about this, but tweaks become important when one tries to squeeze every last bit of performance out of the data channel.
What is the WebRTC data channel?
The data channel (DC for short) is a communication pipe, much like a socket. WebRTC has media channels for handling audio and video, and it has the data channel for everything else. It is not bound to a particular problem domain and can be used for any purpose that requires communication, e.g. between browsers. The data channel could even be used for audio and video. The reason it is not is that media channels use protocols optimized for real-time media, while the data channel uses a protocol that tries to be suitable for a wide range of tasks. But more on this protocol later.
The data channel was added to WebRTC pretty early on, in 2012, so it is mature and widely supported. However, the initial version, which, incidentally, was the only version until one year ago, was very different from the current one. It was based on RTP, a very simple protocol that does not do much at all. Pretty much the only thing it does, as far as the data channel is concerned, is to count messages: It says this is the first message and this is the second one and so on and so forth.
Crucially, RTP lacks any form of traffic control, and congestion control in particular. This made the WebRTC developers quite uneasy that malicious hackers could use the data channel to flood existing networks and render them unusable. As a result, they decided to limit the maximum throughput to just about 30 Kbit/s. This was very, very slow - not much one could do with such a slow connection.
Fortunately, somebody found a simple hack to increase the limit to 1 Mbit/s. The first time we encountered this workaround was in an article by Hadar Weiss, Peer5 CTO. This hack quickly spread throughout the community and all applications at the time started using it. Even though one could do a lot more with 1 Mbit/s, it was still too slow, especially for file sharing or for what we at Viblast do - video delivery.
Another important limitation of DC’s initial version was that it could only transmit text messages. I see this as a clear contradiction to the "unbound problem domain" characteristic I mentioned earlier. If one had to send binary data over the DC, one had to convert it to text first. This was usually done through Base64 encoding, which was ugly and silly. One reason I say so is that Base64 turns every 3 bytes into 4 characters, so the encoded data is about 33% larger and only three quarters of what goes over the wire is actual payload. The maximum useful throughput therefore went down to about 750 Kbit/s.
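To make the arithmetic concrete, here is a minimal sketch of the Base64 overhead calculation. It assumes every Base64 character costs one byte on the wire (true for ASCII-compatible encodings); the function and constant names are made up for illustration.

```python
import base64
import os


def base64_payload_fraction(num_bytes: int) -> float:
    """Return the share of transmitted characters that is real payload."""
    raw = os.urandom(num_bytes)
    encoded = base64.b64encode(raw)
    return len(raw) / len(encoded)


RAW_SIZE = 3000            # bytes; a multiple of 3, so no padding is added
CHANNEL_LIMIT_KBIT = 1000  # the 1 Mbit/s limit after the workaround

fraction = base64_payload_fraction(RAW_SIZE)
useful_kbit = CHANNEL_LIMIT_KBIT * fraction

print(f"payload fraction: {fraction:.2f}")
print(f"useful throughput: {useful_kbit:.0f} Kbit/s")
```

With 3000 raw bytes the encoded form is 4000 characters, so the payload fraction is 0.75 and the useful throughput comes out at 750 Kbit/s.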
A third way in which the DC’s initial version needed improvement was that back then there was no reliable mode of operation. That is, if a message was lost during delivery for some reason, the DC would not care; it would not do anything. Lost message detection and retransmission had to be implemented in the application layer.
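As an illustration of what "in the application layer" could look like, here is a rough sketch of loss detection with sequence numbers. This is not how any particular application did it; the class names and the JSON framing are assumptions chosen for the example, and the actual network transport is left out entirely.

```python
import json
from typing import Dict, List


class SequencedSender:
    """Numbers outgoing messages and keeps them for retransmission."""

    def __init__(self) -> None:
        self.next_seq = 0
        self.sent: Dict[int, str] = {}

    def wrap(self, payload: str) -> str:
        seq = self.next_seq
        self.next_seq += 1
        self.sent[seq] = payload
        return json.dumps({"seq": seq, "data": payload})

    def retransmit(self, seq: int) -> str:
        return json.dumps({"seq": seq, "data": self.sent[seq]})


class GapDetectingReceiver:
    """Tracks received sequence numbers and reports the gaps."""

    def __init__(self) -> None:
        self.received: Dict[int, str] = {}
        self.highest_seq = -1

    def on_message(self, text: str) -> List[int]:
        msg = json.loads(text)
        self.received[msg["seq"]] = msg["data"]
        self.highest_seq = max(self.highest_seq, msg["seq"])
        return self.missing()

    def missing(self) -> List[int]:
        return [s for s in range(self.highest_seq + 1)
                if s not in self.received]


sender = SequencedSender()
receiver = GapDetectingReceiver()
frames = [sender.wrap(p) for p in ("first", "second", "third")]

receiver.on_message(frames[0])
lost = receiver.on_message(frames[2])   # frames[1] never arrives
print("missing after loss:", lost)

for seq in lost:                        # receiver asks, sender resends
    receiver.on_message(sender.retransmit(seq))
print("missing after retransmit:", receiver.missing())
```

Note that this sketch only notices a gap once a later message arrives; detecting the loss of the very last message would additionally need a timeout or an acknowledgement scheme.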
There is one last limitation worth mentioning: the maximum size of a message depended on the network. It had to be less than the maximum transmission unit (MTU) of the network. While 1 Kbyte was usually a safe value, WebRTC itself did not offer any way of MTU detection.
The new implementation, based on the Stream Control Transmission Protocol (SCTP), emerged in December 2013 and changed these limitations for the better. You can read more about how it did so, and about the current capabilities of the DC, in my next post.
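In practice, a size limit like this means larger messages have to be split up by the application. Below is a minimal sketch of such chunking, assuming ASCII text (e.g. Base64) so that one character equals one byte. The "index/total|" header format, the header budget, and the function names are all invented for this example.

```python
from typing import List

SAFE_FRAME_SIZE = 1024  # the "usually safe" 1 Kbyte value
HEADER_BUDGET = 24      # room reserved for the "index/total|" header


def split_message(text: str, max_frame: int = SAFE_FRAME_SIZE) -> List[str]:
    """Split text into frames that each fit within max_frame characters."""
    piece_len = max_frame - HEADER_BUDGET
    pieces = [text[i:i + piece_len]
              for i in range(0, len(text), piece_len)] or [""]
    total = len(pieces)
    return [f"{i}/{total}|{piece}" for i, piece in enumerate(pieces)]


def join_frames(frames: List[str]) -> str:
    """Reassemble frames, in any arrival order, into the original text."""
    parts = {}
    total = None
    for frame in frames:
        header, piece = frame.split("|", 1)
        index, total_str = header.split("/")
        parts[int(index)] = piece
        total = int(total_str)
    if total is None or len(parts) != total:
        raise ValueError("some chunks are missing")
    return "".join(parts[i] for i in range(total))


message = "A" * 5000
frames = split_message(message)
print("frames:", len(frames), "largest:", max(len(f) for f in frames))

# Frames may arrive out of order; the index in the header restores it.
restored = join_frames(list(reversed(frames)))
print("restored correctly:", restored == message)
```

The reassembly side raises an error when chunks are missing, which is where a loss detection and retransmission scheme would have to plug in.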
[Go to Post #2: WebRTC’s jump forward: SCTP-based implementation]
Author: Svetlin Mladenov, Chief Architect