Towards a Parallel User Tool (ParDP) for Automatic Data Partitioning of Relational Databases
Time: Tuesday, July 24, 6:30pm - 8:30pm
Description: In the era of big data, the diversity of data sets and the complexity of workloads make it necessary to improve the execution performance of queries against relational databases. The performance of an SQL query often depends on the size of the table being queried: queries on very large relations are often slow. One of several strategies for improving SQL query performance is partitioning, which splits a large table into smaller pieces; querying a partition is likely to be faster than querying the whole table. We therefore developed an interactive user tool that supports automatic partitioning of legacy relational databases residing in open-source data management systems. The tool provides a front-end, developed in Java, that interacts with the user, and a back-end, developed in SQL, that performs the automated partitioning.
In this research work, we investigate the benefits of employing the tool to improve the query execution performance of database management systems. Test cases are established using U.S. Department of Agriculture (USDA) and astronomical Sloan Digital Sky Survey (SDSS) data, and a sensitivity study is carried out on query performance under several partitioning strategies, such as range, hash, and key. Details of the preliminary prototype, along with experimental results showing how the tool can be used to improve query performance on large open-source datasets, will be presented in the poster session.
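
To make the range, hash, and key strategies concrete, the following is a minimal sketch of how a Java front-end might assemble the partitioning statement that an SQL back-end would then run. It is not the ParDP implementation. It assumes MySQL-style `ALTER TABLE ... PARTITION BY` syntax, and the class name `PartitionDdl`, the table `observations`, and the column `obj_id` are hypothetical names chosen only for illustration.

```java
import java.util.List;
import java.util.StringJoiner;

// Illustrative helper (hypothetical, not part of ParDP): builds MySQL-style
// partitioning DDL strings for the three strategies named in the abstract.
public final class PartitionDdl {
    private PartitionDdl() {}

    // HASH: rows are assigned to one of `count` partitions based on an
    // integer column or expression.
    public static String hash(String table, String column, int count) {
        return "ALTER TABLE " + table
                + " PARTITION BY HASH(" + column + ") PARTITIONS " + count;
    }

    // KEY: like HASH, but the server supplies the hashing function.
    public static String key(String table, String column, int count) {
        return "ALTER TABLE " + table
                + " PARTITION BY KEY(" + column + ") PARTITIONS " + count;
    }

    // RANGE: each partition holds rows whose column value is below the given
    // upper bound; a final catch-all partition takes everything above.
    public static String range(String table, String column, List<Long> upperBounds) {
        StringJoiner parts = new StringJoiner(", ", "(", ")");
        int i = 0;
        for (long bound : upperBounds) {
            parts.add("PARTITION p" + i + " VALUES LESS THAN (" + bound + ")");
            i++;
        }
        parts.add("PARTITION p" + i + " VALUES LESS THAN MAXVALUE");
        return "ALTER TABLE " + table + " PARTITION BY RANGE(" + column + ") " + parts;
    }

    public static void main(String[] args) {
        // Hypothetical table and column names, for demonstration only.
        System.out.println(range("observations", "obj_id", List.of(1000L, 2000L)));
        System.out.println(hash("observations", "obj_id", 4));
        System.out.println(key("observations", "obj_id", 4));
    }
}
```

A real tool would also need to validate table and column names before concatenating them into SQL, and to choose range boundaries from the data distribution; both are omitted here to keep the sketch short.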