The underlying premise of the project is that the emerging, closely related trends toward service-oriented architectures and grid computing can alleviate the above problems. They enable the development of services that are not tied to specific datasets or end applications, and the implementation of applications on top of these services.
We view the cyberinfrastructure software support for environmental applications as comprising four layers. At the lowest level, we have basic grid middleware services: Globus (which provides resource monitoring and security) and related middleware standards and services, including Grid Data Access and Integration (DAI) standards. This work has been developed and supported by programs like the NSF Middleware Initiative (NMI). At the second level, we have three advanced data-intensive middleware services developed at Ohio State. The particular components will be:
a. GATES (Grid-based Adaptive Execution on Streams)
This middleware supports the development of grid-based streaming applications that can adapt their processing granularity to meet real-time constraints. Continuous sensor data is available for most environmental applications, and in many situations one needs to react in real time, for example, when there is an oil spill in a lake.
b. Data Virtualization and Wrapper Generation Middleware
The goal here is to make applications and application components independent of specific data formats. Our work on data virtualization allows an application to query or process a complex dataset through a simpler, abstract view, e.g., a relational table-based view. Our work on wrapper generation allows data and tools that use different formats to be integrated.
c. FREERIDE-G (Framework for Rapid Implementation of Datamining Engines in a Grid)
This system allows parallel implementation of data mining or data-intensive scientific applications that involve data in a remote repository.
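To illustrate what adapting processing granularity to a real-time constraint (component a above) could look like, here is a minimal sketch. It is not the GATES API: the `AdaptiveSampler` class, the `select` helper, and all parameter values are hypothetical. It assumes that granularity is expressed as the fraction of stream items processed per batch.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveSampler:
    """Hypothetical controller that tunes the fraction of items processed."""
    deadline_s: float        # assumed real-time budget per batch, in seconds
    rate: float = 1.0        # fraction of items processed, in (0, 1]
    min_rate: float = 0.05   # never drop below this fraction
    step: float = 0.8        # multiplicative adjustment factor

    def update(self, observed_s: float) -> float:
        """Lower the rate after a deadline overrun; raise it otherwise."""
        if observed_s > self.deadline_s:
            self.rate = max(self.min_rate, self.rate * self.step)
        else:
            self.rate = min(1.0, self.rate / self.step)
        return self.rate


def select(batch, rate):
    """Keep roughly `rate` of the items by taking evenly spaced samples."""
    if not batch:
        return []
    keep = max(1, round(len(batch) * rate))
    stride = len(batch) / keep
    return [batch[int(i * stride)] for i in range(keep)]
```

A processing loop would time each batch, call `update` with the observed time, and pass the returned rate to `select` for the next batch, trading accuracy for timeliness when the system falls behind.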
At the third level, specific algorithms and data analysis techniques will be implemented as grid services, i.e., they will be implemented so that they can be accessed by different application developers, applied to different data sources, and executed on a variety of resources. Finally, at the top-most level, we have the end applications. These will be developed using the services from the previous layers. The two applications we will target are real-time coastal forecasting/nowcasting and long-term coastal analysis and prediction.
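The separation between format-specific wrappers (second level) and format-independent analysis services (third level) can be sketched as follows. This is an illustrative assumption, not the project's actual middleware: the row layout `(station_id, water_level)`, both wrapper functions, and the binary record format are hypothetical.

```python
import csv
import io
import struct
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

Row = Tuple[str, float]  # assumed relational view: (station_id, water_level)


def rows_from_csv(text: str) -> Iterator[Row]:
    """Hypothetical wrapper: expose CSV text as (station, level) rows."""
    for station, level in csv.reader(io.StringIO(text)):
        yield station, float(level)


def rows_from_binary(blob: bytes) -> Iterator[Row]:
    """Hypothetical wrapper: fixed-width records (4-byte id, float64)."""
    record = struct.Struct("<4sd")
    for station, level in record.iter_unpack(blob):
        yield station.decode("ascii").strip(), level


def mean_level(rows: Iterable[Row]) -> Dict[str, float]:
    """Analysis written once against the row view, not the storage format."""
    total, count = defaultdict(float), defaultdict(int)
    for station, level in rows:
        total[station] += level
        count[station] += 1
    return {s: total[s] / count[s] for s in total}
```

Because `mean_level` sees only rows, the same analysis can run over either source by swapping the wrapper, which is the kind of independence from data formats and data sources that the layering aims for.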
Our implementation and evaluation will be in the context of the Great Lakes Observing System (GLOS), and will be done jointly with the National Oceanic and Atmospheric Administration (NOAA).