CoGeNT
Description: CoGeNT is a tool for the Controlled Generation of Network Traffic. It is a networked synthetic workload generator for evaluating system and network performance in a controlled fashion. It is based on a simple client-server model and lets the user evaluate various aspects of networked and distributed computing. The tool generates the code for all specified clients and servers, as well as the Makefiles and distribution scripts needed to compile and run the distributed workload. The tool will initially focus on network performance evaluation (i.e., developing realistic source/sink models for Internet applications). However, we plan to address CPU issues as well, in order to model common activities such as database operations on a server and running Java applets on a client.
Motivation: Synthetic workload generation overcomes many of the problems of simulation techniques. One problem with simulation is scalability. For example, simulating the performance of various queue management and/or scheduling algorithms on a large scale can take a substantial amount of CPU time and become impractical. Another problem is that a simulation is only as accurate as its underlying model: whether simulated behavior matches what will be observed upon implementation depends directly on the model being simulated. Finally, simulation doesn't give the user a feel for the real-time complexity and overhead of different mechanisms. Synthetic workloads also have an advantage over real workloads: they can be parameterized to study a range of mixes, and they let the researcher repeat experiments easily.
CoGeNT will initially focus on workloads for studying the effects of traffic loads and traffic mixes on the performance of different design features in routers and switches. For example, recent studies have shown that long-range dependent (LRD) traffic has a significant influence on network performance and the allocation of network resources. Using CoGeNT in an isolated testbed, we can quantify this influence in a controlled environment. Another envisioned use is to evaluate Random Early Detection (RED) implementations over a range of traffic generated by CoGeNT; simulation studies evaluating RED have been limited in the scale of their experiments.
What we support: The initial implementation of CoGeNT provides a toolbox for modeling the behavior of various Internet applications. Studies in the literature have provided significant information on the nature and behavior of these applications. In particular, they have addressed modeling WWW, Telnet, FTP, X11, SMTP, and NNTP traffic. A summary of this research is listed below.
| Internet Application Characteristics | |||
| Application | Arrivals | Request Sizes | Response Sizes |
| WWW | Pareto [7]; heavy-tailed, but not as much as transfers | ? | Pareto for file sizes > 1K [6]; heavy-tailed |
| Telnet Session | Poisson [1] | ? | Fairly fixed? |
| Telnet Data | Pareto [1]; heavy-tailed (think time) | Packets: Log-normal [1]; Bytes: Log-extreme [2] | ? |
| FTP Session | Poisson [1] | Negligible | Fairly fixed? |
| FTP Data | Bursty [1]; network-dependent (mget operations) | ? | Pareto [1]; very heavy upper tail [1] |
| SMTP | Not Poisson [1] (mailing lists) | ? | ? |
| NNTP | Not Poisson [1] (flooding) | ? | ? |
| X11 Session | Poisson? [1] | Negligible | ? |
| X11 Connection | Not Poisson [1] | ? | ? |
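As an illustration of how two rows of the table could be turned into a source model, the following is a minimal Python sketch (not CoGeNT's generated code) that draws Poisson session arrivals and Pareto-distributed transfer sizes, as listed for FTP sessions and FTP data. The function names, the arrival rate, the Pareto shape, and the minimum size are all hypothetical values chosen for the example; they are not taken from the cited studies.

```python
import random

def poisson_arrival_times(rate, duration, rng):
    """Arrival times of a Poisson process: exponential gaps with mean 1/rate."""
    times = []
    t = rng.expovariate(rate)
    while t < duration:
        times.append(t)
        t += rng.expovariate(rate)
    return times

def pareto_size(shape, minimum, rng):
    """Pareto-distributed size with the given shape and minimum value."""
    # paretovariate() returns values >= 1, so scale by the minimum size.
    return minimum * rng.paretovariate(shape)

# All parameter values below are assumptions made for this sketch.
rng = random.Random(1)  # fixed seed so the experiment can be repeated
sessions = poisson_arrival_times(rate=0.5, duration=60.0, rng=rng)
sizes = [pareto_size(shape=1.2, minimum=1024, rng=rng) for _ in sessions]

for t, size in zip(sessions, sizes):
    print(f"t={t:8.2f}s  transfer={size:12.0f} bytes")
```

Seeding the generator reflects the repeatability argument made in the motivation: the same seed reproduces the same workload.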
| Miscellaneous Sources | |
| Source | Distribution |
| "Self-similar" sources | ON-OFF times with heavy tails [5,6]; inter-renewal times with infinite variance |
| VBR video | Gamma/Pareto [3] |
| Characteristics of Aggregate Traffic | |
| Characteristic | Distribution |
| Aggregate Traffic (Self-Similar) | Fractional Gaussian Noise [5,6]; Fractional ARIMA [5,6] |
| Sources generating self-similar traffic | Pareto [5,6]; ON-OFF sources with inter-renewal times with infinite variance (ON and/or OFF times have heavy tails) |
| Observed WWW Throughput | Log-normal [4] |
Notes: Intensity of self-similarity increases as the aggregate traffic level increases.
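The ON-OFF source model from the tables can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not CoGeNT's implementation: the function names, the number of sources, the Pareto shape of 1.5 (which gives a finite mean but infinite variance), and the one-second sampling slot are all hypothetical choices.

```python
import math
import random

def on_off_source(shape, min_period, horizon, rng):
    """Return the (start, end) ON periods of one source up to `horizon` seconds."""
    periods = []
    t = 0.0
    on = rng.random() < 0.5  # start each source in a random state
    while t < horizon:
        # Heavy-tailed period length: Pareto with the given shape and minimum.
        length = min_period * rng.paretovariate(shape)
        if on:
            periods.append((t, min(t + length, horizon)))
        t += length
        on = not on
    return periods

def aggregate_load(sources, horizon, slot=1.0):
    """Count how many sources are ON at the start of each time slot."""
    slots = int(horizon / slot)
    load = [0] * slots
    for periods in sources:
        for start, end in periods:
            i = math.ceil(start / slot)  # first sample point inside the period
            while i < slots and i * slot < end:
                load[i] += 1
                i += 1
    return load

# Hypothetical experiment: 20 sources observed for 300 seconds.
rng = random.Random(7)
n_sources, horizon = 20, 300.0
sources = [on_off_source(1.5, 1.0, horizon, rng) for _ in range(n_sources)]
load = aggregate_load(sources, horizon)
print(load[:30])
```

Each entry of `load` is the number of sources transmitting at that instant; varying `n_sources` is one way to change the aggregate traffic level in an experiment.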
As such modeling becomes more mature, we plan to implement additional models to keep up with the current literature. Another important feature we plan to add is a trace-driven mode, in which interarrival times, file transfer sizes, etc. are taken from a logfile captured from real use (e.g., server logs and/or tcpdump logs).
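To make the idea of a trace-driven mode concrete, here is a small Python sketch of how a logfile might be turned into interarrival times and transfer sizes. The two-column "timestamp size" format and the `load_trace` helper are assumptions invented for this example; they are not CoGeNT's input format, nor the format of tcpdump or any particular server log.

```python
import io

def load_trace(lines):
    """Parse 'timestamp size' records into (interarrival, size) pairs."""
    events = []
    previous = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        timestamp_text, size_text = line.split()
        timestamp, size = float(timestamp_text), int(size_text)
        # The first record has no predecessor, so its gap is taken as zero.
        gap = 0.0 if previous is None else timestamp - previous
        events.append((gap, size))
        previous = timestamp
    return events

# Hypothetical trace; a real run would read a captured logfile instead.
sample_log = io.StringIO("""\
# seconds  bytes
0.00  512
0.25  2048
1.75  128
""")
trace = load_trace(sample_log)
print(trace)
```

A generator could then replay `trace` by waiting for each gap before issuing a request of the recorded size, in place of drawing both values from a distribution.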
Related Work:
| Tool | What they lack |
| Raghavan [8] | Focus on single server, multiple client. Not an aggregation of both. Exponential source models. Adapting to new applications? |
| Netperf | Only measures peak throughput of communication between a pair of machines |
| ttcp | Lacks multiple composable tasks. Measures raw performance between machines. |
| Spec SFS | Mainly for measuring and modeling NFS performance over the network |
| Byte93 | Limited network support |
| WPI | Single server, multiple clients. "FTP" of large files. X-windows. Fixed job mixes. |
| lmbench | Measures aggregate throughput of end-host communication subsystem |
| Nettest | Developed by Cray (Not much info on this one) |
| SPECweb96 | Measures maximum throughput of a server. Only generates artificial network traffic. Can't be parameterized. Doesn't capture a wide range of application classes. Can't evolve to new applications. |
| AIM | Models regular as well as networked (telnet, ftp, etc.) applications. Proprietary. Unclear how configurable or realistic the workloads are. |