FINAL PROJECT: Abstract and Reader's Response > A Data Scientist's Approach to Analyzing Maritime Data using Scikit-Learn

ABSTRACT (WC 193): This article introduces some basic methods for processing surface maritime data, using the International Comprehensive Ocean-Atmosphere Data Set (ICOADS). In recent centuries, the maritime sector has contributed greatly to the socioeconomic development of nations and society. With the increasing need for discovery and transportation, the medium of maritime shipping was exploited. Ships also facilitated merchant trading throughout this time and served as lifelines for many nations and people. This article serves as a tutorial for exploratory analysis of changing shipping trends over three centuries of data, using many types of data visualization and including observations that span the evolution of measurement technology. The code samples and visualizations cover summary statistics, trending time series, linear regressions, geospatial analysis, and temporal analysis. As the most comprehensive open-source maritime dataset available, ICOADS allows an exploration of how certain properties correlate with global location; what spatial patterns and trends can be observed; how air and sea temperatures have trended over three centuries; and how significantly ship capabilities have evolved over time. By covering topics regarding weather and climate, the tutorial closes with a further understanding of climate change. The analysis is performed using Python 3 and pandas.
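To make the abstract's "linear regressions" and temperature-trend questions a little more concrete, here is a minimal sketch of the idea: average the readings for each year, then fit a straight line through the yearly means. The records below are synthetic values made up for illustration, not real ICOADS observations, and the names `observations`, `yearly_means`, and `fit_line` are hypothetical. The article itself uses pandas; this sketch sticks to the Python standard library so it runs without extra installs.

```python
from collections import defaultdict

# Hypothetical records: (year, sea surface temperature in degrees C).
# Synthetic values for illustration only; NOT real ICOADS observations.
observations = [
    (1900, 15.0), (1900, 15.2),
    (1950, 15.4), (1950, 15.6),
    (2000, 15.9), (2000, 16.1),
]


def yearly_means(records):
    """Average all readings that share a year."""
    by_year = defaultdict(list)
    for year, temp in records:
        by_year[year].append(temp)
    return {year: sum(vals) / len(vals) for year, vals in sorted(by_year.items())}


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


means = yearly_means(observations)
slope, intercept = fit_line(list(means.keys()), list(means.values()))

# The slope is in degrees per year; scale by 100 to report a per-century trend.
print(f"Trend: {slope * 100:.2f} degrees C per century")
```

With real data, the same two steps (group by year, then fit a line) would presumably be applied to the dataset's temperature fields after loading it into pandas.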

READER'S PROFILE: A reader who does not see the benefit in studying maritime data. Perhaps they do not believe in climate change, or they would rather we used our data science technologies to study and analyze a different data set. We might also be dealing with an older reader who is already well established in their field. This reader might be opposed to change and likely does not understand the benefits of data science. Perhaps they are not aware of the implications and relevance of data science protocols and practices.

READER'S RESPONSE: I don’t see the benefits of studying maritime data. How do visualizations, trending time series, linear regressions, and geospatial analysis help with the analysis of maritime data? What relevant information would these analysis methods reveal? I think our time and resources would be better spent on analyzing other data sets. I am also skeptical of “data science” itself. How do we know these protocols are accurate and reliable in the first place? For years, we have been successfully analyzing data by simply conducting interviews and collecting data by hand. I do not see the benefits of bringing a data scientist on board. It would be nice if the author explained why data science is the most beneficial field for analyzing large data sets.
May 5, 2017 | Unregistered Commenter MS
M and A,
If this data is already available, then the costs are relatively low. Can that condition help the reluctant reader in part? Also, will the reader need to start new data collection procedures?

Looks like a good plan; glad this work supports you both with Dr. D.
May 7, 2017 | Registered Commenter Marybeth Shea