SCHEDULE: NOV 11-17, 2006
Estimating Query Result Sizes for Proxy Caching in Scientific Database Federations
Session:
Data Management and Query
Event Type:
Paper, Best Student Paper Finalist
Time:
11:30am - 12:00pm
Session Chair:
Beth Plale
Author(s):
Tanu Malik, Randal Burns, Nitesh V Chawla, Alex Szalay
Location:
22-23
Abstract:
In a proxy cache for federations of scientific databases, it is important to estimate the size of a query result before making a caching decision. With accurate estimates, near-optimal cache performance can be obtained. At the other extreme, inaccurate estimates can render the cache totally ineffective.
We present classification and regression over templates (CAROT), a general method for estimating query result sizes, which is suited to the resource-limited environment of proxy caches and the distributed nature of database federations. CAROT estimates query result sizes by learning the distribution of query results, not by examining or sampling data, but from observing workload. We have integrated CAROT into the proxy cache of the National Virtual Observatory (NVO) federation of astronomy databases. Experiments conducted in the NVO show that CAROT dramatically outperforms conventional estimation techniques and provides near-optimal cache performance.
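The idea of estimating result sizes from observed workload rather than from the data can be illustrated with a minimal sketch. This is not CAROT itself: the abstract does not give the algorithm, so the template extraction (masking numeric literals), the nearest-neighbour averaging, and the names `to_template` and `WorkloadEstimator` are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Numeric literals that are not part of an identifier (assumption: only
# numeric parameters vary between queries that share a template).
_NUMBER = re.compile(r"(?<![\w.])\d+(?:\.\d+)?")


def to_template(query):
    """Split a query into (template, parameters) by masking numeric literals."""
    params = [float(x) for x in _NUMBER.findall(query)]
    template = _NUMBER.sub("?", query)
    return template, params


class WorkloadEstimator:
    """Hypothetical estimator: learns result sizes per template from past queries."""

    def __init__(self, k=3):
        self.k = k
        self.history = defaultdict(list)  # template -> [(params, result_size)]

    def observe(self, query, result_size):
        """Record the actual result size of a query the proxy has seen."""
        template, params = to_template(query)
        self.history[template].append((params, result_size))

    def estimate(self, query):
        """Average the sizes of the k past queries with the closest parameters.

        Returns None when the template has never been observed.
        """
        template, params = to_template(query)
        seen = self.history.get(template)
        if not seen:
            return None

        def distance(entry):
            return sum((a - b) ** 2 for a, b in zip(entry[0], params))

        nearest = sorted(seen, key=distance)[: self.k]
        return sum(size for _, size in nearest) / len(nearest)
```

A proxy built along these lines would call `observe` after each query completes and `estimate` before deciding whether to cache a new result, without ever touching the underlying data.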
This paper can be found in the ACM and IEEE Digital Libraries.
Chair/Author Details:
Beth Plale (Chair)
Indiana University
Tanu Malik
Johns Hopkins University
Randal Burns
Johns Hopkins University
Nitesh V Chawla
University of Notre Dame
Alex Szalay
Johns Hopkins University