Gagan Agrawal

Xiaogang Li

Swarup Sahoo

The Problem

XML has emerged as a standard data exchange format that is widely accepted in a variety of scientific and engineering areas. It is highly likely that more and more datasets in these areas will be exposed through an XML interface for data exchange. Typically, an application that processes such an interface will be developed in a high-level XML query language, which eases the programmer's task. This has created the need for compiler support for efficient optimization and/or parallelization of XML queries over generic scientific datasets.

To deal with this challenge, we are developing a compilation framework that integrates optimization and parallelization of XML queries. Overall, the issues involved are: 1) optimizing XQuery by using compiler optimization techniques; 2) identifying an optimal parallelization scheme for a query by examining the underlying cost models; 3) providing a high-level abstraction of a dataset to an application developer through XML Schemas; and 4) translating XQuery processing to an imperative language like C/C++, which is required for applying the low-level APIs provided with a scientific dataset.

Our work differs from traditional database query optimization in that we are dealing with arbitrary scientific datasets, which normally lack the physical support that a DBMS provides. Also, we are considering parallelization of XML queries in a shared-nothing environment, to which the database community has devoted little attention so far.
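To make the shared-nothing setting concrete, the following is a minimal sketch of how an aggregate query (here, the average of all values above a threshold) could be evaluated when each node holds a disjoint partition of the data. It is an illustration only, not the code our framework generates: the record layout, the round-robin partitioning, and all function names are assumptions introduced for this example, and the nodes are simulated sequentially within one process.

```python
# Hypothetical records: (x, y, value) tuples standing in for elements
# of a scientific dataset exposed through an XML view (assumed layout).

def partition(records, num_nodes):
    """Split records round-robin into num_nodes disjoint partitions."""
    return [records[i::num_nodes] for i in range(num_nodes)]

def local_reduce(chunk, threshold):
    """Runs on one node: filter and aggregate using only that node's data."""
    total, count = 0.0, 0
    for _x, _y, value in chunk:
        if value > threshold:
            total += value
            count += 1
    return total, count

def global_combine(partials):
    """Merge per-node partial results; only these small tuples are exchanged."""
    total = sum(t for t, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None

def parallel_average(records, num_nodes, threshold):
    """Simulate the nodes one after another (no real communication here)."""
    chunks = partition(records, num_nodes)
    partials = [local_reduce(chunk, threshold) for chunk in chunks]
    return global_combine(partials)
```

The point of the sketch is that the result does not depend on how the records are partitioned, because the per-node work reduces to a pair (sum, count) that can be combined associatively. A query whose processing can be restructured into this local-reduction and global-combination form is a natural candidate for parallelization without shared state.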




* Xiaogang Li, Renato Ferreira and Gagan Agrawal, "Compiler Support for Efficient Processing of XML Datasets", in Proceedings of the 17th ACM International Conference on Supercomputing (ICS), 2003.

* Xiaogang Li and Gagan Agrawal, "Supporting High-level Abstractions through XML Technology", in Proceedings of Languages and Compilers for Parallel Computing (LCPC), 2003.

* Xiaogang Li, Swarup Kumar Sahoo and Gagan Agrawal, "XQuery Perspective: Using XML/XQuery for Scientific Applications and Applying Scientific Compilation Techniques", in Proceedings of the First Workshop on XQuery Implementation, Experiences, and Perspectives, held in conjunction with SIGMOD 2004, June 2004.

* Xiaogang Li and Gagan Agrawal, "Using XQuery for Flat-File Based Scientific Datasets", in Proceedings of the 9th International Workshop on Data Base Programming Languages (DBPL), Potsdam, Germany, September 2003.

* Xiaogang Li and Gagan Agrawal, "A Framework for Optimizing and Parallelizing Scientific XML Queries", submitted for publication.

* Xiaogang Li, Ruoming Jin and Gagan Agrawal, "A Compilation Framework for Distributed Memory Parallelization of Data Mining Algorithms", in Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), Nice, France, April 2003.




