2010 - 2011 Undergraduate Research Experience (URE) [Website]
Elizabeth City State University :: Elizabeth City, NC
A Comparison of Job Duration Utilizing High Performance Computing on a Distributed Grid
Mentor: Mr. Jeff Wood
Abstract
The Center of Excellence in Remote Sensing Education and Research (CERSER) on the campus of Elizabeth City State University is currently responsible for receiving remotely sensed Advanced Very High Resolution Radiometer (AVHRR) data from orbiting National Oceanic and Atmospheric Administration (NOAA) satellites. This data is collected by the SeaSpace TeraScan® system installed in the CERSER labs in Dixon-Patterson Hall.
When this system was initially installed in 2005, the data was collected, processed, annotated, and transformed into images in the Tagged Image File Format (TIFF) on a Windows®-based server. These TIFF images were then uploaded to the CERSER archive library server located at http://cerser.ecsu.edu. Once uploaded, they were converted into various resolutions and their information was added to a tracking database maintained in Microsoft Access. This database provided a searchable means of retrieving satellite image data by various parameters.
Since that time, the CERSER server has been replaced with a Macintosh-based server that cannot interact with the Microsoft Access database or the Visual Basic scripting previously used to maintain it. The goal of this project was to redesign the database and the code required to process the images. PHP, Structured Query Language (SQL), and command-line instructions to the ImageMagick® software package were utilized to complete these tasks.
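The sketch below illustrates the general shape of the redesigned processing step, assuming an ImageMagick resize followed by an SQL insert into the tracking database. It is written in Java purely for illustration (the project itself used PHP), and the file names, JDBC URL, credentials, and table layout are hypothetical placeholders.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class ResizeAndCatalog {
        public static void main(String[] args) throws Exception {
            // Hypothetical AVHRR image names; the real CERSER file names differ.
            String source = "avhrr_pass.tiff";
            String thumb  = "avhrr_pass_thumb.jpg";

            // Produce a lower-resolution copy using ImageMagick's convert utility,
            // the same kind of command-line call the PHP code would issue.
            new ProcessBuilder("convert", source, "-resize", "200x200", thumb)
                    .inheritIO()
                    .start()
                    .waitFor();

            // Record the image in the tracking database so it can be searched later.
            // Connection details and table structure are assumptions for this sketch.
            try (Connection db = DriverManager.getConnection(
                         "jdbc:mysql://localhost/cerser", "user", "password");
                 PreparedStatement stmt = db.prepareStatement(
                         "INSERT INTO images (filename, thumbnail) VALUES (?, ?)")) {
                stmt.setString(1, source);
                stmt.setString(2, thumb);
                stmt.executeUpdate();
            }
        }
    }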
Undergraduate Research Experience [URE] - Summer Scholars Institute - Summer 2011 [Poster]
REU IU Bloomington; School of Informatics :: Pervasive Technology Institute Building
Analyzing MapReduce Frameworks Hadoop and Twister
Mentors: Thilina Gunarathne, Stephen Tak-Lon Wu, Bingjing Zhang
Abstract
The primary focus of this research project was to analyze the attributes of MapReduce frameworks for data-intensive computing and to compare two different MapReduce frameworks, Hadoop and Twister. MapReduce is a data processing framework that allows developers to write applications that can process large sets of data in a timely manner with the use of distributed computing resources. One of its main features is the ability to partition a large computation into a set of discrete tasks, enabling parallel processing of the computation. Google, the most popular search engine on the internet, uses MapReduce to simplify data processing on its large clusters. We analyze the performance of Hadoop and Twister using the WordCount application and compare the scalability and efficiency of the two frameworks for this particular application.
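As a point of reference, the sketch below shows what WordCount looks like as a Hadoop MapReduce job: the map step emits a (word, 1) pair for every token, and the reduce step sums the counts for each word. It follows the standard Hadoop WordCount pattern rather than the exact code used in these experiments.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Map: split each input line into words and emit (word, 1).
        public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
            private final static IntWritable one = new IntWritable(1);
            private final Text word = new Text();

            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, one);
                }
            }
        }

        // Reduce: sum the counts emitted for each word.
        public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            private final IntWritable result = new IntWritable();

            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

A job like this is typically packaged as a jar and launched with the hadoop jar command, passing the input and output directories as arguments; the same word-counting logic can be expressed as a Twister application for comparison.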