CoGeNT


Description: CoGeNT is a tool for the Controlled Generation of Network Traffic: a networked synthetic workload generator for evaluating system and network performance in a controlled fashion.  It is based on a simple client-server model and allows the user to evaluate various aspects of network and distributed computing.  The tool generates the code for all specified clients and servers, as well as the Makefiles and distribution scripts needed to compile and run the distributed workload.  The initial focus of the tool is network performance evaluation (i.e., developing realistic source/sink models for Internet applications).  However, we plan to address CPU issues as well, in order to model common activities such as database operations on a server and running Java applets on a client.


Motivation:  Synthetic workload generation overcomes many of the problems of simulation.  One problem with simulation is scalability.  For example, simulating the performance of various queue management and/or scheduling algorithms on a large scale can take a substantial amount of CPU time and become impractical.  Another problem is that a simulation is only as accurate as its underlying model: whether it reflects the behavior that will be observed upon implementation depends directly on the model being simulated.  Finally, simulation doesn't give the user a feel for real-time complexity or for the overhead involved in different mechanisms.  Synthetic workloads also have an advantage over real workloads: they can be parameterized to study a range of mixes, and they let the researcher repeat experiments easily.

CoGeNT will initially focus on workloads for studying the effects of traffic loads and traffic mixes on the performance of different design features in routers and switches.  For example, recent studies have shown that long-range dependent (LRD) traffic has a significant influence on network performance and the allocation of network resources.  Using CoGeNT in an isolated testbed, we can quantify this influence in a controlled environment.  Another envisioned use is to evaluate Random Early Detection (RED) implementations over a range of traffic generated by CoGeNT; simulation studies evaluating RED have been limited in the scale of their experiments.


What we support:  The initial implementation of CoGeNT provides a toolbox for modeling the behavior of various Internet applications.  Studies in the literature have provided significant information on the nature and behavior of these applications.  In particular, they have addressed modeling WWW, Telnet, FTP, X11, SMTP, and NNTP traffic.  This research is summarized below.

Internet Application Characteristics

Application      Arrivals                       Request Sizes            Response Sizes
---------------  -----------------------------  -----------------------  -------------------------------
WWW              Pareto [7]; heavy-tailed, but  ?                        Pareto for file sizes > 1K [6];
                 not as much as transfers                                heavy-tailed
Telnet Session   Poisson [1]                    ?                        Fairly fixed?
Telnet Data      Pareto [1]; heavy-tailed       Packets: log-normal [1]  ?
                 (think time)                   Bytes: log-extreme [2]
FTP Session      Poisson [1]                    Negligible               Fairly fixed?
FTP Data         Bursty [1]; network-dependent  ?                        Pareto [1]; very heavy upper
                 (mget operations)                                       tail [1]
SMTP             Not Poisson [1]                ?                        ?
                 (mailing lists)
NNTP             Not Poisson [1]                ?                        ?
                 (flooding)
X11 Session      Poisson? [1]                   Negligible               ?
X11 Connection   Not Poisson [1]                ?                        ?
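
As an illustration only (this is not code generated by CoGeNT), the Python sketch below shows how two entries from the summary above might drive a generator: Poisson session arrivals combined with Pareto response sizes, as reported for FTP in [1].  The function names and the rate, shape, and minimum-size values are assumptions chosen for the example, not fitted parameters from the literature.

```python
import random


def poisson_arrival_times(rate_per_sec, duration_sec, rng):
    """Start times of a Poisson process: exponential gaps between arrivals."""
    times = []
    t = rng.expovariate(rate_per_sec)
    while t < duration_sec:
        times.append(t)
        t += rng.expovariate(rate_per_sec)
    return times


def pareto_size(shape, min_bytes, rng):
    """Heavy-tailed size in bytes: Pareto with the given shape and minimum."""
    # paretovariate() returns a value >= 1, so scale it by the minimum size.
    return int(min_bytes * rng.paretovariate(shape))


def ftp_like_workload(rate_per_sec, duration_sec, shape, min_bytes, seed=0):
    """Return (start_time, response_bytes) pairs for a hypothetical FTP-like mix."""
    rng = random.Random(seed)  # fixed seed so an experiment can be repeated
    arrivals = poisson_arrival_times(rate_per_sec, duration_sec, rng)
    return [(t, pareto_size(shape, min_bytes, rng)) for t in arrivals]


if __name__ == "__main__":
    # Illustrative values only: 0.5 sessions/s for 20 s, shape 1.2, 1 KB minimum.
    for start, size in ftp_like_workload(0.5, 20.0, shape=1.2, min_bytes=1024):
        print(f"{start:8.3f} s  {size:>12d} bytes")
```

Seeding the generator keeps a run repeatable, which is the property that motivates synthetic workloads in the first place; changing the shape parameter changes how heavy the tail of the size distribution is.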


Miscellaneous Sources

Source                  Distribution
----------------------  ------------------------------------------
"Self-similar" sources  ON-OFF times with heavy tails [5,6];
                        inter-renewal times with infinite variance
VBR video               Gamma/Pareto [3]
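
The ON-OFF source described above can be sketched as follows.  This is a minimal example under stated assumptions, not CoGeNT's implementation: ON and OFF durations are both drawn from a Pareto distribution (a shape between 1 and 2 gives a finite mean but infinite variance), the source sends at a constant rate while ON, and all names and parameter values are hypothetical.

```python
import random


def on_off_periods(alpha_on, alpha_off, min_on, min_off, duration, seed=0):
    """Return (start, end) ON intervals over [0, duration] for one source."""
    rng = random.Random(seed)
    periods = []
    t = 0.0
    while t < duration:
        # paretovariate() returns a value >= 1; scale it to the minimum period.
        end = min(t + min_on * rng.paretovariate(alpha_on), duration)
        periods.append((t, end))
        # Stay silent for a heavy-tailed OFF period before the next burst.
        t = end + min_off * rng.paretovariate(alpha_off)
    return periods


def bytes_per_slot(periods, rate, slot, duration):
    """Bytes sent in each time slot, at a constant rate (bytes/s) while ON."""
    n_slots = int(duration / slot)
    counts = [0.0] * n_slots
    for start, end in periods:
        first = int(start / slot)
        last = min(int(end / slot), n_slots - 1)
        for i in range(first, last + 1):
            # Portion of this ON period that falls inside slot i.
            overlap = min(end, (i + 1) * slot) - max(start, i * slot)
            if overlap > 0:
                counts[i] += overlap * rate
    return counts


if __name__ == "__main__":
    periods = on_off_periods(1.5, 1.5, 0.1, 0.1, duration=30.0, seed=7)
    for i, count in enumerate(bytes_per_slot(periods, 1000.0, 1.0, 30.0)):
        print(f"slot {i:2d}: {count:8.1f} bytes")
```

Summing the per-slot counts of many such sources, each with its own seed, gives an aggregate series of the kind studied in [5,6].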


Characteristics of Aggregate Traffic

Characteristic           Distribution
-----------------------  -----------------------------------------------
Aggregate traffic        Fractional Gaussian noise [5,6];
(self-similar)           fractional ARIMA [5,6]
Sources generating       Pareto [5,6]; ON-OFF sources with inter-renewal
self-similar traffic     times with infinite variance (ON and/or OFF
                         times have heavy tails)
Observed WWW throughput  Log-normal [4]

Notes: Intensity of self-similarity increases as the aggregate traffic level increases.
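
One common way to quantify the intensity of self-similarity is the Hurst parameter H: values near 0.5 indicate no long-range dependence, and values approaching 1 indicate stronger self-similarity.  The sketch below estimates H with the variance-time (aggregated variance) method, one of the estimators applied in [5].  It is an illustrative helper, not part of CoGeNT; the function names and block sizes are assumptions, and the block sizes passed in must be distinct.

```python
import math
import random
import statistics


def block_means(series, m):
    """Average the series over consecutive, non-overlapping blocks of size m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def hurst_variance_time(series, block_sizes):
    """Estimate H from the slope of log(var of block means) vs. log(block size)."""
    xs, ys = [], []
    for m in block_sizes:
        means = block_means(series, m)
        if len(means) < 2:
            continue  # too few blocks to compute a variance
        var = statistics.pvariance(means)
        if var > 0:
            xs.append(math.log(m))
            ys.append(math.log(var))
    if len(xs) < 2:
        raise ValueError("need at least two usable block sizes")
    # Least-squares slope; for a self-similar series the slope is 2H - 2.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return 1.0 + slope / 2.0


if __name__ == "__main__":
    rng = random.Random(1)
    noise = [rng.random() for _ in range(4096)]
    h = hurst_variance_time(noise, [1, 2, 4, 8, 16, 32])
    print(f"estimated H for uncorrelated noise: {h:.2f}")
```

Applied to per-slot byte counts from a testbed run, an estimate like this gives a single number that can be compared across traffic levels and mixes.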

As such modeling matures, we plan to implement additional models to keep up with the current literature.  Another important feature we plan to add is a trace-driven mode, in which interarrival times, file transfer sizes, etc. are taken from a logfile captured from real use (e.g., server logs and/or tcpdump logs).
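
A trace-driven mode could work roughly as sketched below.  Since no trace format is specified here, the example assumes a hypothetical two-column text log ("timestamp_seconds size_bytes" per line, with "#" comments); real server or tcpdump logs would need their own parsers.  The function names are likewise illustrative.

```python
def read_trace(lines):
    """Parse 'timestamp_seconds size_bytes' records (hypothetical format)."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        timestamp, size = line.split()[:2]
        records.append((float(timestamp), int(size)))
    records.sort()  # replay in timestamp order even if the log is unordered
    return records


def replay_schedule(records):
    """Convert absolute timestamps into (wait_before_send, size_bytes) pairs."""
    schedule = []
    previous = records[0][0] if records else 0.0
    for timestamp, size in records:
        schedule.append((timestamp - previous, size))
        previous = timestamp
    return schedule


if __name__ == "__main__":
    sample_log = [
        "# timestamp size",
        "10.0 512",
        "10.5 2048",
        "12.0 100",
    ]
    for wait, size in replay_schedule(read_trace(sample_log)):
        print(f"wait {wait:.2f} s, then transfer {size} bytes")
```

A client driven by such a schedule would reproduce the interarrival times and transfer sizes of the captured session instead of drawing them from an analytic distribution.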


Related Work:  

Tool          What they lack
------------  ------------------------------------------------------------------
Raghavan [8]  Focus on single server, multiple clients; not an aggregation of
              both.  Exponential source models.  Adapting to new applications?
Netperf       Only measures peak throughput of communication between a pair of
              machines.
ttcp          Lacks multiple composable tasks.  Measures raw performance between
              machines.
SPEC SFS      Mainly for measuring and modeling NFS performance over the network.
Byte93        Limited network support.
WPI           Single server, multiple clients.  "FTP" of large files.  X-windows.
              Fixed job mixes.
lmbench       Measures aggregate throughput of end-host communication subsystem.
Nettest       Developed by Cray (not much info on this one).
SPECweb96     Measures maximum throughput of a server.  Only generates artificial
              network traffic.  Can't be parameterized.  Doesn't capture a wide
              range of application classes.  Can't evolve to new applications.
AIM           Models regular as well as networked (telnet, ftp, etc.)
              applications.  Proprietary.  Unclear how configurable or realistic
              the workloads are.



Features:



References
[1] Vern Paxson and Sally Floyd, "Wide-Area Traffic:  The Failure of Poisson Modeling"
[2] Vern Paxson, "Empirically-Derived Analytic Models of Wide-Area TCP Connections"
[3] Mark Garrett and Walter Willinger, "Analysis, Modeling and Generation of Self-Similar VBR Video Traffic"
[4] Hari Balakrishnan, Srinivasan Seshan, Mark Stemm, and Randy Katz, "Analyzing Stability in Wide-Area Network Performance"
[5] Will Leland, Murad S. Taqqu, Walter Willinger, and Daniel Wilson, "On the Self-Similar Nature of Ethernet Traffic (Extended Version)"
[6] Mark Crovella and Azer Bestavros, "Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes"
[7] Carlos R. Cunha, Azer Bestavros, and Mark E. Crovella, "Characteristics of WWW client-based traces"
[8] S. Raghavan, D. Vasukiammaiyar, and G. Haring, "Generative Networkload Models for a Single Server Environment"