Digital Fountains
-prepared by Anand Patwardhan
CSE Dept., OGI School of Science and Engineering
Oregon Health & Science University
Main ideas: distribution of bulk data –
reliable multicast/broadcast – digital fountain – erasure codes – tornado codes
– RMDP protocol
Summary of Papers
Reliable distribution of bulk data:
Unicasting employs Automatic
Retransmission Request (ARQ) for recovering lost packets. Retransmitted/repair
packets do not benefit everyone. Also in case of a large number of receivers there
is feedback implosion at the source. Thus, this process is neither efficient
nor scalable.
In the multicasting
approach, techniques like - limited ARQ (NACK suppression), local recovery are
used. But now, there is added complexity of maintaining this extra state
information (group heirarchy, neighbor info., etc.). Also, in both these cases,
a feedback channel is required.
Ideal Digital Fountain:
A Server (data source)
serving a universe of clients. ‘k’ original source data packets encoded to ‘n’
packets, n = c.k (c >1). Server carousels(cycles) through these ‘n’ packets,
using multicast. Clients (receivers) drink their fill, any ‘k’ packets of total
‘n’ packets will quench the client, i.e. original data can be reconstructed
from any k packets received of n transmitted.
Erasure Codes (Forward Error Correcting codes):
Reed-Solomon codes have this
exact property, but high computational complexity. Tornado codes have a much
lower computational complexity (much faster encoding-decoding), but slightly more
than k packets are needed to reconstruct data at receiver (approx. 5% more).
Space requirements at client are indeterminate (due to random graphs used in
encoding).
[1] and [2] approximate the
Digital Fountain using Tornado codes, favoring the low computational overhead
at the cost of a small reception inefficiency (extra packets). [1] claims that
Tornado codes fare better than Reed-Solomon codes by varying computational time
and no. of file blocks ( µ filesize). [2] proposes
using multiple “mirror’ servers (data sources) to speed up downloads. (1+ e).k
packets are required to fill a bucket. Buckets will fill up quickly when
receiving these packets from “out-of-step” servers, since duplicate packets are
minimized. By using mirrors, all available bandwidth will be utilized to
speed-up download (assuming “disjoint bottlenecks” in paths to each mirror). In
this approach, no feedback is required and hence no feedback channel required.
[3] favors Reed-Solomon
codes to Tornado codes, citing that (mobile devices) clients may not have
enough memory for the indeterminate space requirements of Tornado codes. [3]
proposes the Reliable Multicast data Distribution Protocol (RMDP). This uses
limited ARQ – feedback suppression using rules for timeouts.
|
Advantages |
Disadvantages |
|
¨ Reliability with little or
no feedback (ideal for wireless, satellite, cable) ¨ High Scalability ¨ Clients can join
asynchronously ¨ Resilient to bursty losses ¨ Using mirrors can speedup
downloads |
¨ No feedback at the cost of
bandwidth, buffers, time ¨ Encoding and decoding
overhead ¨ Reception inefficiency
(Tornado) ¨ Data not available till at
least k packets received ¨ Congestion Control not
considered |
Class Discussion
Congestion
Control not considered. Carouseling data (unrestricted bandwidth usage) will
not be friendly to other traffic. Also, this approach will unnecessarilyy use
up bandwidth when the group of listeners is small and/or listens in for small intervals
of time.
References for the papers covered
6.
“Fine-grained
Layered Multicast”, John Byers, Michael Luby, Michael Mitzenmacher