SC|06 Powerful Beyond Imagination
SC06 is the International Conference for High Performance Computing, Networking and Storage

SCHEDULE: NOV 11-17, 2006

Estimating Query Result Sizes for Proxy Caching in Scientific Database Federations

Session: Data Management and Query

Event Type: Paper, Best Student Paper Finalist

Time: 11:30am - 12:00pm

Session Chair: Beth Plale

Author(s): Tanu Malik, Randal Burns, Nitesh V Chawla, Alex Szalay

Location: 22-23

In a proxy cache for federations of scientific databases, it is important to estimate the result size of a query before making a caching decision. With accurate estimates, near-optimal cache performance can be obtained. At the other extreme, inaccurate estimates can render the cache totally ineffective.

We present classification and regression over templates (CAROT), a general method for estimating query result sizes that is suited to the resource-limited environment of proxy caches and the distributed nature of database federations. CAROT estimates query result sizes by learning the distribution of query results, not by examining or sampling data, but by observing the workload. We have integrated CAROT into the proxy cache of the National Virtual Observatory (NVO) federation of astronomy databases. Experiments conducted in the NVO show that CAROT dramatically outperforms conventional estimation techniques and provides near-optimal cache performance.
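
The idea of estimating result sizes from the observed workload, rather than from the data itself, can be illustrated with a small sketch. This is not the CAROT algorithm from the paper: it is a simplified stand-in that groups queries by template (the query text with numeric literals replaced by placeholders) and fits a one-variable least-squares line per template. The class name `TemplateSizeEstimator`, the range-width feature, and the example table name `PhotoObj` are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Numeric literals that are not part of an identifier (e.g. skip the "2" in "ra2").
_NUMBER = re.compile(r"(?<![\w.])-?\d+(?:\.\d+)?(?![\w.])")


def to_template(sql):
    """Return (template, params): the query with literals replaced by '?'."""
    params = [float(m) for m in _NUMBER.findall(sql)]
    return _NUMBER.sub("?", sql), params


class TemplateSizeEstimator:
    """Hypothetical estimator: learns result sizes per template from past queries."""

    def __init__(self):
        # template -> list of (feature, observed result size)
        self._history = defaultdict(list)

    @staticmethod
    def _feature(params):
        # Assumed feature: spread of the numeric literals (width of a range predicate).
        return max(params) - min(params) if params else 0.0

    def observe(self, sql, result_size):
        """Record the actual result size of a query that has already run."""
        template, params = to_template(sql)
        self._history[template].append((self._feature(params), float(result_size)))

    def estimate(self, sql):
        """Estimate a result size, or return None for a template never seen."""
        template, params = to_template(sql)
        history = self._history.get(template)
        if not history:
            return None
        n = len(history)
        mean_x = sum(f for f, _ in history) / n
        mean_y = sum(s for _, s in history) / n
        var_x = sum((f - mean_x) ** 2 for f, _ in history)
        if var_x == 0:
            return mean_y  # no variation in the feature: fall back to the mean
        cov = sum((f - mean_x) * (s - mean_y) for f, s in history)
        slope = cov / var_x
        return max(0.0, mean_y + slope * (self._feature(params) - mean_x))


est = TemplateSizeEstimator()
est.observe("SELECT * FROM PhotoObj WHERE ra BETWEEN 10 AND 20", 1000)
est.observe("SELECT * FROM PhotoObj WHERE ra BETWEEN 10 AND 30", 2000)
print(est.estimate("SELECT * FROM PhotoObj WHERE ra BETWEEN 50 AND 65"))  # 1500.0
```

The sketch never touches the underlying tables: everything it knows comes from queries the cache has already seen, which is the property the abstract emphasizes for resource-limited proxy caches.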

This paper can be found in the ACM and IEEE Digital Libraries.

Chair/Author Details:

Beth Plale (Chair)
Indiana University

Tanu Malik
Johns Hopkins University

Randal Burns
Johns Hopkins University

Nitesh V Chawla
University of Notre Dame

Alex Szalay
Johns Hopkins University

IEEE Computer Society | ACM