NSF Gives Funding for Collaborative Effort


The Ohio State University has been awarded $1.4 million by the National Science Foundation (NSF) to develop and evaluate a cyberinfrastructure component for environmental applications. The project is led by Prof. Gagan Agrawal of Computer Science and Engineering (CSE). Prof. Hakan Ferhatosmanoglu (CSE), Prof. Keith Bedford (Civil and Environmental Engineering and Geodetic Science, CEEGS), and Prof. Ron Li (CEEGS) are the three co-Principal Investigators.

This project focuses on addressing the following concerns associated with the current ocean observation systems:
  1. The current systems are very tightly coupled. There is hardly any reuse of algorithm implementations across different systems. It is also extremely hard to test or incorporate new analysis algorithms.
  2. The implementations are closely tied to the available resources.
  3. The existing systems cannot adapt the granularity of analysis to the resource availability and time constraints.

The underlying premise of the project is that the emerging trend toward the closely related concepts of service-oriented architectures and grid computing can alleviate the above problems. Together, they enable the development of services that are not tied to specific datasets or end applications, and the implementation of applications built from these services.

We view the cyberinfrastructure software support for environmental applications as comprising four layers. At the lowest level, we have basic grid middleware services: Globus (which provides resource monitoring and security) and related middleware standards and services, including Grid Data Access and Integration (DAI) standards. This work has been developed and supported by programs like the NSF Middleware Initiative (NMI). At the second level, we have three advanced data-intensive middleware services developed at Ohio State. The particular components will be:

a. GATES (Grid-based Adaptive Execution on Streams)
This middleware allows development of grid-based streaming applications that can adapt their processing granularity to meet real-time constraints. Continuous sensor-based data is available for most environmental applications, and there are many situations where one needs to react in real time, for example, when there is an oil spill in a lake.
b. Data Virtualization and Wrapper Generation Middleware
The goal here is to make applications or application components independent of specific data formats. Our work on data virtualization allows an application to query or process a complex dataset through a simpler or more abstract view, e.g., a relational table-based view. Our work on wrapper generation allows data and tools with different formats to be integrated.
c. FREERIDE-G (Framework for Rapid Implementation of Datamining Engines in a Grid)
This system allows parallel implementation of data mining or data-intensive scientific applications that involve data in a remote repository.
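
To make the idea of adapting processing granularity more concrete, here is a minimal sketch in Python. It is not the GATES API: the function names, the halving/doubling policy, and the sampling-rate parameter are all assumptions chosen for illustration. The sketch processes a smaller fraction of each window of sensor readings when the previous window overran its deadline, and a larger fraction when there was slack.

```python
def adapt_sampling_rate(rate, elapsed, deadline, floor=0.125):
    """Return a new sampling rate in [floor, 1.0] (hypothetical policy).

    Halve the rate if the last window took longer than the deadline,
    double it if the window used less than half of the deadline,
    and otherwise leave it unchanged.
    """
    if elapsed > deadline:
        rate = rate / 2
    elif elapsed < deadline / 2:
        rate = rate * 2
    return min(1.0, max(floor, rate))


def process_window(readings, rate):
    """Average roughly a `rate` fraction of the readings in one window.

    Coarser granularity is approximated by taking every k-th reading.
    Returns None for an empty window.
    """
    stride = max(1, round(1 / rate))
    sample = readings[::stride]
    if not sample:
        return None
    return sum(sample) / len(sample)
```

In a streaming loop, one would time each call to `process_window` and feed the elapsed time back into `adapt_sampling_rate` before handling the next window.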


At the third level, specific algorithms and data analysis techniques will be implemented as grid services, i.e., in a way that lets them be accessed by different application developers, applied to different data sources, and executed on a variety of resources. Finally, at the topmost level, we have the end applications, which will be developed using the services from the previous layers. The two applications we will target are real-time coastal forecasting/nowcasting and long-term coastal analysis and prediction.
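
The separation between data formats and analysis services can be illustrated with a small Python sketch. Everything in it is hypothetical: the two input formats, the field names, and the analysis function are assumptions made for illustration and do not reflect the project's actual middleware. Two wrappers expose differently formatted water-level data as the same table-like rows, and one analysis routine runs unchanged on either source.

```python
def csv_wrapper(lines):
    """Expose 'station,hour,level_in_metres' text lines as rows."""
    for line in lines:
        station, hour, level = line.strip().split(",")
        yield {"station": station, "hour": int(hour), "level_m": float(level)}


def tuple_wrapper(records):
    """Expose (hour, station, level_in_cm) tuples as rows, converting to metres."""
    for hour, station, level_cm in records:
        yield {"station": station, "hour": hour, "level_m": level_cm / 100.0}


def max_level_by_station(rows):
    """Analysis step that sees only the common row view, not the raw format."""
    result = {}
    for row in rows:
        station = row["station"]
        level = row["level_m"]
        result[station] = max(result.get(station, level), level)
    return result
```

For example, `max_level_by_station(csv_wrapper(["A,0,1.5", "A,1,2.0"]))` and `max_level_by_station(tuple_wrapper([(0, "A", 150), (1, "A", 200)]))` both go through the same analysis code, even though the underlying data formats differ.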

Our implementation and evaluation will be in the context of the Great Lakes Observing System (GLOS), and will be done jointly with the National Oceanic and Atmospheric Administration (NOAA).