The Position
The Research and Innovation team at Global Fishing Watch (GFW) connects data science and machine learning experts with leaders in the scientific community to produce new datasets, publish impactful research, and empower others to use our data. Our work aims to harness satellite imagery, computer vision, and big data technology to address some of the most pressing issues facing the marine environment.
We aim to reveal the activity of all ocean-going vessels in the world. Global Fishing Watch maintains a database spanning more than 10 years and containing over 100 billion GPS positions from nearly half a million vessels. This data consists mostly of messages from the Automatic Identification System (AIS), but it also includes a few tens of thousands of vessels from private vessel monitoring systems (VMS), as well as a small but increasing number of other types of tracking devices.
This database serves as the foundation for all of Global Fishing Watch’s work. Using this dataset, which is updated daily as new data arrives, GFW applies numerous algorithms, including machine learning algorithms that determine each vessel’s type and activity. In addition, many types of filters must be applied to eliminate noise and properly combine positions into coherent vessel tracks.
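To make the filtering step concrete, the sketch below shows one simple way such a filter can work: flagging positions whose implied speed from the last accepted point is physically implausible for a surface vessel. This is an illustrative simplification, not GFW’s production pipeline; the 50-knot threshold and the `Position`, `haversine_km`, and `flag_noise` names are assumptions for this example only.

```python
# Minimal illustrative sketch (not GFW's actual pipeline): flag implausibly
# fast jumps between consecutive AIS positions of a single vessel.
from dataclasses import dataclass
from datetime import datetime
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_SPEED_KNOTS = 50.0  # assumed cutoff for a surface vessel

@dataclass
class Position:
    timestamp: datetime
    lat: float
    lon: float

def haversine_km(a: Position, b: Position) -> float:
    """Great-circle distance between two positions in kilometres."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))

def flag_noise(track: list[Position]) -> list[bool]:
    """Mark positions whose implied speed from the previous accepted position
    exceeds the plausible maximum; these are likely spoofed or corrupted."""
    flags = [False] * len(track)
    last_good = 0
    for i in range(1, len(track)):
        hours = (track[i].timestamp - track[last_good].timestamp).total_seconds() / 3600
        if hours <= 0:
            flags[i] = True
            continue
        speed_knots = haversine_km(track[last_good], track[i]) / 1.852 / hours
        if speed_knots > MAX_PLAUSIBLE_SPEED_KNOTS:
            flags[i] = True
        else:
            last_good = i
    return flags
```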
The data scientist will play a key role supporting the Research and Innovation team’s work to maintain, update, and improve the key algorithms underpinning how GFW processes this database. Working closely with the GFW engineering and product teams, this person will gain a deep understanding of how to process and combine GPS data at scale to reveal insights about human activity at sea. The role also involves data fusion: combining different GPS datasets with one another and with detections of vessels from satellite imagery. There is also an opportunity to work closely with GFW’s machine learning engineers to review and update our key behavioral algorithms. This dataset flows directly into GFW’s core products, facilitates GFW’s direct engagement with governments, and supports the numerous scientific publications produced by our research program.
The successful candidate will have an interest in environmental issues and a versatile skill set in geospatial analysis, statistics, and programming. The incumbent will build diverse technical skills in programming, big data, and cloud computing and gain a variety of experience working for a globally diverse, fully distributed, and growing organization. The successful candidate will also have an enthusiasm for inspecting datasets, visualizing them, and digging deeply into model results.
Principal Duties and Responsibilities
Geospatial data processing, fusion, and analysis:
- Advance GFW’s core GPS processing algorithms and datasets. Possible tasks include:
- Improving how we identify false AIS positions due to noise in the dataset.
- Developing methods to interpolate activity between known GPS positions.
- Modifying key algorithms for different data sources (largely AIS and VMS).
- Improve and implement behavioral algorithms to detect key activities of vessels at sea, including:
- When vessels visit ports and anchorages.
- When vessels meet up at sea (see Miller et al. 2018; a simplified sketch of this heuristic appears after this list).
- When vessels are fishing, as well as other key behaviors in our dataset.
- Share updates to these algorithms with the wider GFW staff and partners through clear documentation and communication.
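As a flavor of the logic these behavioral algorithms involve, below is a minimal sketch of an encounter ("meet up at sea") heuristic in the spirit of Miller et al. 2018: two vessels remaining close together at low speed for an extended period. The thresholds, the `find_encounters` function, and the assumption that the two tracks have already been resampled to shared timestamps are illustrative simplifications, not GFW’s actual method.

```python
# Minimal illustrative sketch of an encounter heuristic: two vessels staying
# within ~500 m of each other at low speed for at least two hours.
from datetime import datetime, timedelta

PROXIMITY_KM = 0.5       # assumed "close enough" distance
MAX_SPEED_KNOTS = 2.0    # assumed "drifting/loitering" speed
MIN_DURATION = timedelta(hours=2)

def find_encounters(times, dist_km, speed_a, speed_b):
    """Return (start, end) windows where both vessels are close and slow.

    times     : list[datetime], shared timestamps of the two resampled tracks
    dist_km   : list[float], distance between the vessels at each timestamp
    speed_a/b : list[float], each vessel's speed in knots at each timestamp
    """
    encounters, start = [], None
    for i, t in enumerate(times):
        together = (dist_km[i] <= PROXIMITY_KM
                    and speed_a[i] <= MAX_SPEED_KNOTS
                    and speed_b[i] <= MAX_SPEED_KNOTS)
        if together and start is None:
            start = t                       # candidate encounter begins
        elif not together and start is not None:
            if t - start >= MIN_DURATION:   # long enough to count
                encounters.append((start, t))
            start = None
    if start is not None and times and times[-1] - start >= MIN_DURATION:
        encounters.append((start, times[-1]))
    return encounters
```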
Additional tasks may include:
- Provide technical support to the senior data scientist(s) responsible for developing and advancing other Global Fishing Watch datasets.
- Maintain and improve internal Python tools, such as modules and template repositories, to assist with migrating research projects from proof-of-concepts to automated prototypes.
- Work with GFW’s research partners to publish high-impact science.
- Support GFW’s team of analysts in their efforts to advance better governance.
Candidate description:
Qualifications you should have:
- Bachelor’s degree and four years of professional experience, or an equivalent combination of education and experience, in physical/earth sciences, computational science, statistics, fisheries, quantitative ecology, or engineering.
- Experience developing analysis methods with Python or R.
- Strong foundation in mathematics and statistics.
- Ability to work with large datasets and visualize data effectively.
- Experience with version control software and collaboration tools such as git and GitHub.
- Highly organized, analytical, detail-oriented, and self-motivated, with the ability to work efficiently.
- Willingness to take ownership of projects and effectively communicate updates in a transparent and proactive manner.
- Ability to manage multiple priorities while performing in a fast-paced, collaborative environment.
- Some experience with SQL.
Also great:
- Experience engaging with academic research and the peer-review process.
- An understanding of how to collaborate in a global and remote organization.
- Experience analyzing tracking or spatiotemporal data.
Application deadline: May 3, 2024