Under Construction

Available data sets and associated codes

The following is a list of data sets and related codes available to Systems students:
| Title | Source languages | Description | Contact person |
| --- | --- | --- | --- |
| Delta Data | IXP, C, E-code | Some FAA feeds plus other data that represent an Operational Information System (OIS) | Ada Gavrilovska (ada@cc) |
| Molecular Data | IXP, C, E-code | Output from a molecular dynamics simulation | Matt Wolf (mwolf@cc) |
| Hydrology Data | Java, C, E-code | don't know | still open |
| OpenGL/PPM | IXP, Kernel, C, E-code | Streaming images from visualizing the MD or Hydro data | still open |
| Atmospheric data | | Weather simulation data | still open |
| Linux mirror site traces | | Analysis of traffic at the CERCS Linux mirror site | Mohamed Mansour |

Delta Data



Source Code Location

Data available

Filters available

Molecular Data

This is an evolving system at this time.


This is a streaming application in which data from a simulation engine is streamed to a compute-intensive engine; the output is then fed into a visualization server.

Information provided by ztcai@cc.


Matt Wolf, Zhongtang Cai, Weiyun Huang, and Karsten Schwan, "Smart Pointers: Personalized Scientific Data Portals in Your Hand", Supercomputing 2002, ACM, November 2002 (doc)

Data available

Codes available



OpenGL/PPM

There are two data types here:
1- Data for a list of triangular meshes generated from a molecular dynamics visualization through OpenGL commands plus some encoding logic
2- PPM image data generated from the above.
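For anyone who wants to inspect the PPM images directly, the following is a minimal sketch of a reader. It assumes the images are in the binary P6 variant of PPM with one byte per sample, which this page does not confirm; the function names read_ppm_p6 and pixel_at are made up for illustration and are not part of the pipegl codes.

```python
def read_ppm_p6(data):
    """Parse an in-memory binary PPM (magic number P6).

    Returns (width, height, maxval, pixels), where pixels is a bytes object
    holding width * height RGB triples in row-major order.
    """
    tokens = []
    pos = 0
    # The header is four whitespace-separated tokens: magic, width, height,
    # maxval. A '#' starts a comment that runs to the end of the line.
    while len(tokens) < 4:
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        if start == pos:
            raise ValueError("truncated PPM header")
        tokens.append(data[start:pos])
    pos += 1  # exactly one whitespace byte separates maxval from the pixels

    # Assumption: only the P6 (binary RGB) variant is handled here.
    if tokens[0] != b"P6":
        raise ValueError("only binary P6 PPM is handled in this sketch")
    width, height, maxval = (int(t) for t in tokens[1:])
    if not 0 < maxval < 256:
        raise ValueError("only one byte per sample is handled in this sketch")
    size = width * height * 3
    pixels = data[pos:pos + size]
    if len(pixels) != size:
        raise ValueError("pixel data is shorter than the header promises")
    return width, height, maxval, pixels


def pixel_at(width, pixels, x, y):
    """Return the (r, g, b) triple at column x, row y."""
    i = (y * width + x) * 3
    return tuple(pixels[i:i + 3])
```

To use it, read a file in binary mode and pass the bytes to read_ppm_p6. If the images turn out to be plain-text P3 files, the header logic still applies but the pixel section would need to be parsed as decimal numbers instead.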

Ziru (ziru@cc) contributed a detailed description of the pipegl filters.


Data available

/users/c/chaos/cercs_data/openGL/mesh*.txt (provided by mansour@cc)

Codes available

IOField list
GS application GS entity name

Hydrology Data




Source Code Location

Data available

Filters available

Atmospheric Data



(contributed by V. Martin vernard@cc)

The atmospheric model data was gathered originally from the UKMO (United
Kingdom Meteorological Office) group. It was created for use by the
"climate" group under Karsten's leadership. There should still be some data
under ~climate if you have permissions to look at it.

The data is the output of the transport model that Thomas Kindler wrote for
his PhD thesis. This is a parallel spectral transport model that simulates
the transport of about 7 different species (i.e. chemicals) throughout the
earth's atmosphere at the various pressure levels, starting at the earth's
surface and going all the way to open space. The idea is that if you
have some empirical data from a certain date, then you can write a simulator
that mimics it. You can then put some initial measured conditions into
your simulator and see whether it reproduces the empirical data that was
gathered. So basically, you need to know the winds on the planet for every
day and the resultant measured concentrations of the trace chemicals that
you are watching. You then input those winds and see if the output chemicals
that you calculate have the same values and locations as the ones that were
already measured.

We have both the input wind files and the output UKMO data files used for
comparison. That data is often called observational data since it was
obtained by observation.
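The comparison described above, simulated concentrations against observed ones, can be summarized with a single error figure. The sketch below uses root-mean-square error over matching grid points. It is only an illustration: the choice of RMSE, the function name rmse, and the flat-list layout are assumptions, and nothing here reflects the actual file format or the comparison method used with the transport model.

```python
import math


def rmse(simulated, observed):
    """Root-mean-square error between two equally long sequences.

    Each sequence holds the concentration of one species at the same grid
    points, in the same order (an assumed layout, chosen for this sketch).
    """
    if len(simulated) != len(observed) or not simulated:
        raise ValueError("sequences must be non-empty and of equal length")
    total = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    return math.sqrt(total / len(simulated))
```

A value of zero means the simulator matched the observations exactly at every point; larger values mean a worse match. Computing it per species and per day would show where the model drifts from the observational data.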

Source Code Location

Data available

Filters available

  • Linux mirror site traces


    In 2003 we analyzed the traffic traces of the GA TECH/CERCS Linux mirror site. The analysis aimed at reconstructing user session information from low-level request traces. The original traces (xferlog format) are available on request. Please contact Neil Bright for location and access permissions.
    More information is available here; this is also where you will find a link that describes the format of the files.
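As a rough illustration of what reconstructing sessions from request traces can look like, the sketch below parses xferlog records and groups each host's transfers into sessions separated by idle gaps. It assumes the standard wu-ftpd xferlog field order (a five-token timestamp, transfer time, remote host, file size, file name, then nine trailing fields). The 30-minute idle gap and the function names are assumptions made for this example; they are not the heuristic used in the 2003 analysis.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def parse_xferlog_line(line):
    """Extract time, host, size and file name from one xferlog record."""
    parts = line.split()
    if len(parts) < 18:
        raise ValueError("not a complete xferlog record")
    # The timestamp spans the first five tokens, e.g. "Mon May 17 10:15:02 2004".
    when = datetime.strptime(" ".join(parts[:5]), "%a %b %d %H:%M:%S %Y")
    return {
        "time": when,
        "host": parts[6],
        "bytes": int(parts[7]),
        # Nine fixed fields follow the file name, so slice from both ends
        # in case the name itself contains spaces.
        "filename": " ".join(parts[8:-9]),
    }


def sessions_by_host(records, idle_gap=timedelta(minutes=30)):
    """Group records per host; a gap longer than idle_gap starts a new session.

    The 30-minute default is an assumption chosen for this sketch.
    """
    by_host = defaultdict(list)
    for rec in sorted(records, key=lambda r: r["time"]):
        host_sessions = by_host[rec["host"]]
        if host_sessions and rec["time"] - host_sessions[-1][-1]["time"] <= idle_gap:
            host_sessions[-1].append(rec)
        else:
            host_sessions.append([rec])
    return dict(by_host)
```

Grouping by remote host is the simplest choice; several users behind one proxy would be merged into a single session, so a real analysis may need a finer key.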

    A companion tool, StreamGen, was developed to generate workloads based on these traces. The tool is available in CVS.


    Source Code Location

    Data available

    Filters available

    For questions or comments: mansour@cc
    Last updated: May 17th 2004