SemiLake aims to support regular inspection and analysis of algae blooms in lakes, which not only produce unpleasant smells but also harm public health and tourism. The main current practice for harmful algae monitoring is to dispatch experts to collect and analyse algae samples manually, which is too costly for frequent monitoring. Sensor-based approaches are promising, but existing technologies require expensive equipment installation and maintenance. To address these issues, we have turned to satellite multispectral imaging to make algae bloom monitoring more affordable. Using Sentinel-2 imagery, we have developed a novel machine learning pipeline for urban lake and algae monitoring. The pipeline is based on state-of-the-art semi-supervised representation learning, which allows us to make the most of a limited labelled dataset by also exploiting large unlabelled datasets. SemiLake can recognise lakes and estimate the algae bloom coverage within them.
To build our solution, we first manually collated more than 100 lake coordinates using the Copernicus SciHub, since no urban lake coordinate corpus is publicly available. We then implemented our lake multispectral patch collection and preprocessing pipeline, which builds on the WEkEO and AI4Copernicus pre-processing services. Specifically, we developed image enhancement functions and helper indices such as NDVI, because atmospheric conditions can make algae blooms difficult to identify visually in the existing bands. These greatly helped the SemiLake full annotation task, which resulted in a small, fully labelled SemiLake dataset. Finally, we implemented the SemiLake model based on one of the latest contrastive learning methods for semi-supervised learning. Our experimental results demonstrate SemiLake's robustness in inferring algae blooms (IoU-Algae: 0.856) as well as lake shapes (IoU-Lake: 0.989). The SemiLake model is dockerised and readily available (rrtai/semilake_server:v1.3) for any future collaboration.
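As an illustration of two quantities mentioned above, the following is a minimal sketch of how an NDVI helper index and the IoU metric are commonly computed. It is not the SemiLake implementation: the function names `ndvi` and `iou` are hypothetical, and the sketch assumes that the near-infrared and red inputs are the Sentinel-2 B8 and B4 bands (the usual choice for NDVI) and that masks are binary arrays of the same shape.

```python
import numpy as np


def ndvi(nir, red):
    """Normalised Difference Vegetation Index: (NIR - Red) / (NIR + Red).

    Assumption: `nir` and `red` are reflectance arrays of equal shape,
    e.g. Sentinel-2 bands B8 and B4. Pixels where NIR + Red is zero
    are assigned an NDVI of 0 instead of raising a division warning.
    """
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    denom = nir + red
    out = np.zeros_like(denom)
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out


def iou(pred, target):
    """Intersection over Union between two binary masks.

    Assumption: if both masks are empty, the prediction is treated
    as a perfect match and 1.0 is returned.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection / union)
```

A per-class score such as IoU-Algae or IoU-Lake would, under these assumptions, be obtained by calling `iou` on the predicted and annotated masks for that class.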
* This sub-project has received funding from the European Union’s Horizon 2020 research and innovation programme within the framework of the AI4Copernicus Project, funded under grant agreement No 101016798.