GIT-CERCS-04-04
Jiantao Kong, Karsten Schwan
KStreams: Kernel Support for Efficient End-to-End Data Streaming
Technology advances are enabling increasingly data-intensive applications,
ranging from peer-to-peer file sharing, to multimedia, to remote graphics and
data visualization. One outcome is the considerable memory pressure imposed on
the machines involved, caused by application-specific data movements and by
repeated crossings of user/kernel boundaries. We address this
problem with KStreams, a novel system service that provides a general facility
for manipulating data, without intermediate buffers, as it moves across
multiple kernel objects such as files or sockets. KStreams may be used to implement
kernel-level services that range from application-specific implementations
of sendfile commands, to data mirroring or proxy functions, to fast path data
conversions and transformations for data streaming. The KStreams API permits
individual applications to define fast path operations, which will then execute
at kernel level and, if desired, without further application involvement.
Placing application-specific data manipulations into data movement fast
paths avoids user/kernel boundary crossings. Operating on data streams
'in-flight' makes data buffering unnecessary, further reducing
the memory pressure imposed on machines.
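
The cost being avoided can be illustrated with the stock sendfile call that
the text mentions. The sketch below is not the KStreams API; it only contrasts
a user-level copy loop with the existing Linux sendfile(2) system call, reached
here through Python's os.sendfile. The function names and the chunk size are
assumptions made for illustration, and passing None as the offset is
Linux-specific.

```python
import os

CHUNK = 64 * 1024  # bytes per call; an arbitrary choice for this sketch


def copy_user_level(out_fd, in_fd, count):
    """Copy up to count bytes through a user-space buffer.

    Each chunk costs a read() and at least one write(), and the data
    crosses the user/kernel boundary twice.
    """
    total = 0
    while total < count:
        data = os.read(in_fd, min(CHUNK, count - total))
        if not data:  # end of input
            break
        view = memoryview(data)
        while view:  # os.write may accept only part of the buffer
            written = os.write(out_fd, view)
            view = view[written:]
        total += len(data)
    return total


def copy_in_kernel(out_fd, in_fd, count):
    """Copy up to count bytes with sendfile(2).

    The data moves between the two descriptors inside the kernel and is
    never copied into this process. With offset None (Linux only), the
    kernel reads from and advances the current position of in_fd.
    """
    total = 0
    while total < count:
        sent = os.sendfile(out_fd, in_fd, None, min(CHUNK, count - total))
        if sent == 0:  # end of input
            break
        total += sent
    return total
```

Both functions take the same arguments and return the number of bytes moved,
so one can be swapped for the other. Stock sendfile only moves bytes unchanged;
the application-specific transformations described above are what KStreams
adds, and they are not shown here.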
KStreams is implemented on Linux kernel version 2.4.22.
Its evaluation uses data-intensive tasks performed in conjunction with modern
web services, such as proxy functions, remote media streaming, and data
visualization. Initial experiences with the KStreams implementation are
encouraging. Fast path data transformation via KStreams increases
throughput by 20-50% compared to user-level data manipulations.
Future work will apply KStreams to complex multi-machine web services,
evaluated with representative user loads and applications.