Purple Maiʻa

Intern Update: Kylyn takes on machine learning for fish ID

Hi, my name is Kylyn Fernandez, and this past summer I worked with Purple Maiʻa, Maui Ocean Center, Kamehameha Schools, and AlgorithmHub Inc. as a Data Scientist Intern. I am currently a senior at Dixie State University in St. George, UT, majoring in Software Development. Before moving to Utah, I graduated from Kamehameha Schools Maui and attended University of Hawaiʻi Maui College, where I earned my associate degree. In my free time I enjoy hiking, traveling, cooking, and going to Disneyland.

This summer changed the course of my future when I jumped into the unknown world of Data Science. My mentor for this project was Alex Cabello from AlgorithmHub Inc., a Maui-based startup focused on developing a machine learning software platform. Alex and Dale Nahoʻolewa won the 2016 Purple Prize; they built a device that uses underwater technology to count fish in the loko iʻa (Hawaiian fishpond). After I talked with Alex and Dale, we were all inspired. We contacted the Maui Ocean Center about the idea and met Lily Solano, a marine biologist, who suggested focusing on identifying invasive fish because they compete with native fish for resources in Hawaiian waters. We decided to use machine learning to detect invasive fish in video so that their populations can be tracked in real time.

Before starting this daunting project, I had to learn three things. First, I needed some background knowledge of what Data Science is, so I took a crash course in Machine Learning and Neural Networks on Coursera, which gave me a better understanding of what I would be doing. Second, I learned how to use Scrum, an Agile process for managing product development. In a nutshell, it works like a fancy planner built around a fifteen-minute daily meeting that helped me stay on track. At these meetings I answered three questions: what did I work on in the past 24 hours, what am I going to work on in the next 24 hours, and are there any roadblocks preventing me from making progress? Finally, I learned how to use Python and Jupyter Notebook to develop and test my models.

After picking up those three essentials, I began researching the feasibility of using an open source, state-of-the-art, real-time object detection system called You Only Look Once (YOLO). It applies a single neural network to an entire image; the network divides the image into a grid of smaller regions and predicts bounding boxes and class probabilities for each region, which tells us both what is in the image and where it is.
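To make the grid idea concrete, here is a minimal sketch (not the code used in this project) of how an image can be split into a grid and how a fish's center point maps to one grid cell. The function name `grid_cell`, the 7x7 grid size (the value used in the original YOLO paper), and the example frame size are all assumptions chosen for illustration.

```python
def grid_cell(x_center, y_center, img_w, img_h, grid_size=7):
    """Return the (row, col) of the grid cell that contains a point.

    The image is treated as a grid_size x grid_size grid. In YOLO-style
    detectors, the cell containing an object's center is the one
    responsible for predicting that object.
    """
    # Scale the pixel coordinate to grid units, then clamp so that a point
    # on the right or bottom edge still falls inside the last cell.
    col = min(int(x_center / img_w * grid_size), grid_size - 1)
    row = min(int(y_center / img_h * grid_size), grid_size - 1)
    return row, col


# Hypothetical example: a fish centered at pixel (320, 100) in a 640x480 frame.
print(grid_cell(320, 100, 640, 480))  # (1, 3)
```

A real YOLO network does far more than this (each cell also predicts box sizes and class probabilities), but the cell lookup is the part that "divides the image into smaller sections".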

The first step to using YOLO is collecting images and preparing them for training. I collected images of two invasive species: the toau (blacktail snapper) and the taʻape (bluestripe snapper). Both species live in various tanks at the Maui Ocean Center. After collecting the data, I used a labeling tool to mark and label where the fish were in each image.
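For readers curious about what a label looks like, the sketch below converts a pixel bounding box into the text format commonly used to train YOLO with Darknet: one line per object containing a class index followed by the box center, width, and height, each normalized to the range 0 to 1. The post does not say which labeling tool or class order was used, so the class list, the function name `to_yolo_label`, and the example box are assumptions.

```python
# Hypothetical class order; the actual order used in the project is not stated.
CLASSES = ["toau", "taape"]


def to_yolo_label(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel bounding box to a Darknet-style YOLO label line."""
    # YOLO labels store the box center and size as fractions of the image.
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Made-up example: a taʻape boxed from (100, 200) to (300, 400) in a 640x480 image.
print(to_yolo_label(CLASSES.index("taape"), 100, 200, 300, 400, 640, 480))
# 1 0.312500 0.625000 0.312500 0.416667
```

Negative images (those with no invasive fish) would typically just have an empty label file.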

The second step is to train YOLO using the images I labeled. I used 500 images total in my training set: 250 negative images (images without any invasive fish), 125 images of the toau from different angles and in different lighting, and 125 images of the taʻape from different angles and in different lighting. The images and labeled data were uploaded to the AlgorithmHub platform for training. We configured a workspace on AlgorithmHub with the YOLO software installed and compiled to use graphics processing unit (GPU) hardware. After much research and trial and error, I successfully trained prediction models that detect invasive fish. I learned that the GPU was critical in the training process because it offered a 500x speedup over a typical CPU. We were able to train YOLO in 12 hours on a GPU rather than weeks on a CPU.
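The numbers in this step can be checked with a little arithmetic. The sketch below tallies the training set described above and, taking the 500x figure at face value, estimates how long the same 12-hour training run might take on a CPU. The variable and function names are made up for illustration, and the CPU estimate is only as good as the rough speedup figure it is based on.

```python
# Training set composition as described in the post.
training_set = {"negative": 250, "toau": 125, "taape": 125}


def class_fractions(counts):
    """Return each category's share of the whole data set."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


total_images = sum(training_set.values())
fractions = class_fractions(training_set)
print(total_images)  # 500
print(fractions)     # {'negative': 0.5, 'toau': 0.25, 'taape': 0.25}

# Rough CPU estimate, assuming the 500x speedup applied to the whole run.
gpu_hours = 12
speedup = 500
cpu_hours = gpu_hours * speedup
cpu_weeks = cpu_hours / 24 / 7
print(cpu_hours)            # 6000
print(round(cpu_weeks, 1))  # 35.7
```

The set is balanced: half the images contain no invasive fish, and the other half is split evenly between the two species.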

The last step is evaluating the prediction models to see how accurate they are. My preliminary analysis showed that the neural networks are fairly accurate: on 10 unseen test images that I selected, the model was 70% accurate.
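As an illustration of how an accuracy figure like this is computed, here is a minimal sketch that compares the expected label for each test image with the model's prediction. The labels below are invented to show the calculation; they are not the actual test results from this project, and the function name `accuracy` is an assumption.

```python
def accuracy(expected, predicted):
    """Return the fraction of images whose predicted label matches the expected one."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        raise ValueError("at least one test image is required")
    correct = sum(e == p for e, p in zip(expected, predicted))
    return correct / len(expected)


# Made-up labels for 10 test images ("none" means no invasive fish present).
expected = ["toau", "taape", "none", "toau", "taape",
            "none", "toau", "taape", "none", "toau"]
predicted = ["toau", "taape", "none", "taape", "taape",
             "none", "none", "taape", "none", "none"]

print(f"{accuracy(expected, predicted):.0%}")  # 70%
```

With only 10 images, a single wrong prediction moves the result by 10 percentage points, which is one reason a larger test set gives a more reliable number.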

Next steps for the project include developing tools to automate the evaluation of the prediction models. This will allow us to evaluate more test images and get a better understanding of model accuracy. The tools will also allow us to choose the most accurate model and to retrain and evaluate new models faster. I will also look into deploying the prediction models on an Apple iPhone and/or a Raspberry Pi to determine the feasibility of running them on a low-power device.
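One building block such an evaluation tool might use is intersection over union (IoU), a standard way to score how well a predicted box overlaps a hand-labeled one. The sketch below is an assumption about how automated scoring could work, not a description of the planned tools; the function names, the (x_min, y_min, x_max, y_max) box format, and the 0.5 threshold (a common convention in object detection benchmarks) are all illustrative choices.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlapping region (zero if the boxes do not touch).
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_correct_detection(predicted_box, labeled_box, threshold=0.5):
    """Count a detection as correct when its overlap with the label meets the threshold."""
    return iou(predicted_box, labeled_box) >= threshold


# Made-up boxes: a labeled fish and two candidate predictions.
labeled = (100, 100, 200, 200)
print(is_correct_detection((110, 110, 210, 210), labeled))  # True
print(is_correct_detection((180, 180, 280, 280), labeled))  # False
```

Running a check like this over every test image for every trained model would give a per-model score, which is what makes it possible to pick the most accurate model automatically.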

Overall, in two months I learned about a world that I never even knew existed and managed to develop a really cool project that will be useful to Hawaiians in the future. Because of this project and my brief introduction to Data Science, I am considering getting my master's degree in Data Science once I graduate in the spring of 2019. None of this would have been possible without Purple Maiʻa, Maui Ocean Center, Kamehameha Schools, and AlgorithmHub Inc., all of whom I am extremely grateful to.