Under Construction
Available data sets and associated codes
The following is a list of data sets and related codes available to
Systems students:
Title | Source languages | Description | Contact person |
Delta | IXP, C, E-code | Some FAA feeds plus other data that represents an Operational Information System (OIS) | Ada Gavrilovska (ada@cc) |
Molecular Data | IXP, C, E-code | Output from a molecular dynamics simulation | Matt Wolf (mwolf@cc) |
Hydrology Data | Java, C, E-code | don't know | still open |
OpenGL/PPM | IXP, Kernel, C, E-code | Streaming images from visualizing the MD or Hydro data | still open |
Atmospheric data | | Weather simulation data | still open |
Linux mirror site traces | | Analysis of traffic at the CERCS Linux mirror site | Mohamed Mansour (mansour@cc) |
Delta Data
Description
Papers
- Ada Gavrilovska, Karsten Schwan, and Van Oleson, ``A
Practical Approach for `Zero' Downtime in Operational Information
Systems'', International Conference on Distributed Computing
Systems (ICDCS-22),
IEEE, July 2002.
PostScript
[167K]
- Ada Gavrilovska, Karsten Schwan, and Van Oleson, ``Adaptable
Mirroring in Cluster Servers'', High Performance Distributed
Computing (HPDC-10), San Francisco, CA, August 2001.
PostScript [183K] | PDF Format [81K]
Source Code Location
Data available
- Delta flight data
/users/c/chaos/cercs_data/delta/flight.dat (provided by pmw@cc)
/users/c/chaos/cercs_data/delta/ticket.xml (provided by wiseman@cc)
- FAA data streams
- Some other data: here is one example
of the pbio representation of the Delta XML data unrolled, with all
offsets for each field in the data (useful when working directly on
Ethernet frames on the IXP, and probably in the kernel too).
[information provided by ada@cc]
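As a minimal sketch of why per-field offsets are useful: once the offset and type of each field are known, a single field can be read straight out of a raw buffer without parsing the whole record. The layout below (field names, offsets, sizes, byte order) is purely hypothetical and is not the actual PBIO layout of the Delta data.

```python
import struct

# Hypothetical layout: field name -> (byte offset, struct format).
# These names and offsets are illustrative only, NOT the real Delta/PBIO layout.
FIELD_LAYOUT = {
    "flight_number": (0, "<I"),  # 4-byte unsigned int at offset 0
    "gate":          (4, "<H"),  # 2-byte unsigned int at offset 4
    "delay_minutes": (6, "<h"),  # 2-byte signed int at offset 6
}


def read_field(payload: bytes, name: str) -> int:
    """Read one field directly from the payload using its known offset."""
    offset, fmt = FIELD_LAYOUT[name]
    (value,) = struct.unpack_from(fmt, payload, offset)
    return value


# Build a sample payload with the same hypothetical layout.
sample = struct.pack("<IHh", 1234, 17, -5)
```

With this layout, `read_field(sample, "gate")` touches only two bytes of the buffer, which is the kind of access a filter working on raw frames would want.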
Filters available
Molecular Data
This is an evolving system at this time.
Description
This is a streaming application in which data from a simulation engine is
streamed to a compute-intensive engine; the output is then fed into a
visualization server.
(Information provided by ztcai@cc.)
Papers
- Matt Wolf, Zhongtang Cai, Weiyun Huang, and Karsten Schwan, ``Smart
Pointers: Personalized Scientific Data Portals in Your Hand'',
Supercomputing 2002, ACM, November 2002.
(doc)
Data available
- Atom data
/users/c/chaos/cercs_data/molecular/large.pbio (provided by mwolf@cc)
/users/c/chaos/cercs_data/molecular/data.pbio (copied from CVS
smartp/sample_data)
/users/c/chaos/cercs_data/molecular/atoms.xml (provided by pmw@cc)
Codes available
- The structure of the atom event (bond_record), along with
its PBIO field list, is defined in smartP/include/common.h
- smartP application is saved in CVS under the smartP project
:pserver:uid@cvs.cc.gatech.edu/net/cvs/chaos
Please contact Greg Eisen for a CVS uid.
- cutting plane (z-coord.) cutting_plane.h
- cutting region region_filter.h
- atom type atom_type.h
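To give a rough idea of what the three filters named above do, here is a minimal sketch in Python. It is not the code from cutting_plane.h, region_filter.h, or atom_type.h: the `Atom` record, the function names, and the choice of which side of the plane is kept are all assumptions made for illustration. The real record is the bond_record defined in smartP/include/common.h.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Set, Tuple


@dataclass(frozen=True)
class Atom:
    # Hypothetical record, standing in for the real bond_record.
    atom_type: int
    x: float
    y: float
    z: float


def cutting_plane(atoms: Iterable[Atom], z_max: float) -> Iterator[Atom]:
    """Keep atoms at or below the plane z = z_max (side kept is an assumption)."""
    return (a for a in atoms if a.z <= z_max)


def cutting_region(atoms: Iterable[Atom],
                   lo: Tuple[float, float, float],
                   hi: Tuple[float, float, float]) -> Iterator[Atom]:
    """Keep atoms inside the axis-aligned box [lo, hi]."""
    return (a for a in atoms
            if lo[0] <= a.x <= hi[0]
            and lo[1] <= a.y <= hi[1]
            and lo[2] <= a.z <= hi[2])


def by_atom_type(atoms: Iterable[Atom], wanted: Set[int]) -> Iterator[Atom]:
    """Keep atoms whose type is in the wanted set."""
    return (a for a in atoms if a.atom_type in wanted)
```

Because each filter takes and returns a stream of records, they can be chained, for example `by_atom_type(cutting_plane(atoms, 5.0), {1})`.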
OpenGL/PPM
Description
There are two data types here:
1- Data for a list of triangular meshes generated from a molecular
dynamics visualization through OpenGL commands plus some encoding logic
2- PPM image data generated from the above.
Ziru (ziru@cc) contributed a detailed description
of the pipegl filters.
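For readers unfamiliar with PPM, the sketch below writes and reads a binary (P6) PPM image. It only illustrates the general file format; how ppm_server actually packs image data into its events is not described here, so the function names and the restriction to 8-bit images without comment lines are assumptions.

```python
def encode_ppm(width: int, height: int, rgb: bytes) -> bytes:
    """Build a binary (P6) PPM image from raw 8-bit RGB pixel data."""
    if len(rgb) != width * height * 3:
        raise ValueError("expected width * height * 3 bytes of RGB data")
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    return header + rgb


def decode_ppm(data: bytes):
    """Minimal P6 reader: assumes no comment lines and a maxval of 255."""
    fields = []
    pos = 0
    while len(fields) < 4:
        # Skip whitespace in front of the next header token.
        while pos < len(data) and data[pos:pos + 1].isspace():
            pos += 1
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        fields.append(data[start:pos])
    pos += 1  # exactly one whitespace byte separates the header from the pixels
    if fields[0] != b"P6" or fields[3] != b"255":
        raise ValueError("not an 8-bit binary PPM")
    width, height = int(fields[1]), int(fields[2])
    pixels = data[pos:pos + width * height * 3]
    return width, height, pixels
```

The header is parsed token by token rather than with a single split so that pixel data beginning with a whitespace-valued byte is not swallowed.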
Papers
Data available
/users/c/chaos/cercs_data/openGL/mesh*.txt (provided by mansour@cc)
Codes available
- smartP application is saved in CVS under the smartP project
:pserver:uid@cvs.cc.gatech.edu/net/cvs/chaos
Please contact Greg Eisen for a CVS uid.
- All code is in the smartP/pipegl directory; some headers are in
smartP/include
- OpenGL data is generated by the navigate_server process
- PPM data is generated by the ppm_server process
- Some filters are available in pipegl/pipegl_filter.[ch]
Channel | IOField list | GS application | GS entity name |
OpenGL | triangle_field | navigate_server | "output0" |
PPM | media_data_field_list | ppm_server | "ppm0" |
Hydrology Data
Description
/net/hj1/chaos/demos/README.hydrology
Papers
Source Code Location
Data available
- /net/hj5/chaos/demos/atmos/data
Filters available
Atmospheric Data
Description
/net/hj1/chaos/demos/README.atmos
(contributed by V. Martin vernard@cc)
The atmospheric model data was originally gathered from the UKMO (United
Kingdom Meteorological Office) group. It was created for use by the
"climate" group under Karsten's leadership. There should still be some data
under ~climate if you have permission to look at it.
The data is the output of the transport model that Thomas Kindler wrote for
his PhD thesis. This is a parallel spectral transport model that simulates
the transport of about 7 different species (i.e. chemicals) throughout the
earth's atmosphere at the various pressure levels, starting at the earth's
surface and going all the way to open space. The idea is that if you have
empirical data from a certain date, you can write a simulator that mimics
it: you feed measured initial conditions into the simulator and check
whether its results match the empirical data that was gathered. So
basically, you need to know the winds on the planet for every day and the
resulting measured concentrations of the trace chemicals that you are
watching. You then input those winds and see whether the chemicals you
calculate have the same values and locations as the ones already measured.
We have both the input wind files and the output "UKMO" data files used for
comparison. That data is often called observational data, since it was
obtained by observation.
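The comparison step described above can be sketched as a simple error measure between simulated and observed concentrations at matching locations. Root-mean-square error is used here only as an example metric; the page does not say how the original validation was done, and the function name and flat-list input are assumptions.

```python
import math
from typing import Sequence


def rmse(simulated: Sequence[float], observed: Sequence[float]) -> float:
    """Root-mean-square error between simulated and observed concentrations.

    Both sequences must list values for the same grid points, in the same order.
    """
    if len(simulated) != len(observed) or not simulated:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    return math.sqrt(squared / len(simulated))
```

A value of 0 means the simulator reproduced the observational data exactly at every point; larger values mean the model output drifts further from what was measured.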
Source Code Location
Data available
- /net/hj5/chaos/demos/hydrology/data
Filters available
Linux mirror site traces
Description
In 2003 we analyzed the traffic traces of the Georgia Tech/CERCS Linux
mirror site. The analysis aimed at reconstructing user session
information from low-level request traces. The original traces (xferlog
format) are available on request. Please
contact Neil Bright for location and access permissions.
More information is available here;
this is also where you will find a link that describes the format of
the files.
A companion tool, StreamGen, was developed to generate workloads based on
these traces. The tool is available in CVS.
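To illustrate what reconstructing sessions from low-level request traces can look like, here is a minimal sketch that parses xferlog lines and groups requests into sessions. The session rule used (same remote host, at most 30 minutes between consecutive requests) is an assumption for illustration, not the method used in the analysis, and the parser assumes file names without spaces and English day and month names.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def parse_xferlog_line(line: str) -> dict:
    """Parse the leading fields of one xferlog line."""
    parts = line.split()
    # The first five tokens form the timestamp, e.g. "Thu Sep  2 09:52:00 2004".
    when = datetime.strptime(" ".join(parts[:5]), "%a %b %d %H:%M:%S %Y")
    return {
        "time": when,
        "transfer_secs": int(parts[5]),
        "host": parts[6],
        "bytes": int(parts[7]),
        "file": parts[8],
    }


def build_sessions(records, gap=timedelta(minutes=30)):
    """Group records per host; a silence longer than `gap` starts a new session."""
    by_host = defaultdict(list)
    for rec in records:
        by_host[rec["host"]].append(rec)

    sessions = []
    for recs in by_host.values():
        recs.sort(key=lambda r: r["time"])
        current = [recs[0]]
        for rec in recs[1:]:
            if rec["time"] - current[-1]["time"] > gap:
                sessions.append(current)
                current = [rec]
            else:
                current.append(rec)
        sessions.append(current)
    return sessions
```

Grouping by host alone will merge users behind a shared proxy or NAT; a real analysis would likely need additional fields or heuristics to separate them.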
Papers
- Mohamed Mansour, Matthew Wolf, and Karsten Schwan, "Dynamic Data
Access to the GT/CERCS Linux Mirror Site," High Performance Grid
Computing workshop of IPDPS '04 (pdf)
- Mohamed Mansour, Matthew Wolf, and Karsten Schwan, "A Workload
Generation Tool for Distributed Information Flow Applications," International Conference on
Parallel Processing (ICPP-04), 2004
(accepted)
Source Code Location
- StreamGen application is saved in CVS under the streamperf
project
:pserver:uid@cvs.cc.gatech.edu/net/cvs/chaos
Please contact Greg Eisen for a CVS uid.
Data available
- /users/c/chaos/cercs_data/linux_mirror/*
Filters available
For questions or comments:
mansour@cc
Last updated: May 17th 2004