Zero-shot classification of salmon lice images by siamese neural networks

Lian, Kristian Mølbach

Master thesis

Permanent link

https://hdl.handle.net/11250/3084674

Publication date

2023-06-22

Collections

Master theses [120]

Abstract

Deep learning models, such as neural networks and their variants, have proven exceptionally useful in today's society. However, achieving competitive performance requires large amounts of training data, which is especially true for classification. For the scarce salmon lice image datasets used in this thesis, this issue can be addressed by recasting the question "which class does this image belong to?" as a question of image similarity, i.e. "is image i similar to image j?". Accordingly, siamese neural networks are employed to distinguish images rather than to classify them explicitly, which has the effect of producing more data points for training. Exactly how many training data points this yields (specifically, the triplet cardinality) is worked out in this thesis. Furthermore, the thesis extensively compares the performance measures F1-score and TAR@FAR(p) for siamese neural networks, and finds that they differ in prediction strictness and in which elements of the confusion matrix they focus on. Specifically, TAR@FAR is the stricter measure because it places a bound p on the permitted percentage of false accepts, whereas the F1-score also accounts for false rejects. The thesis is also the first work to cover cylindrical convolution in siamese neural networks, and shows that it does help address the problem of rotated images. Additionally, cylindrical convolution appears to solve the problem of inconsistently distributed data. In conclusion, the best model at predicting image similarity on the synthetic dataset was Siamese_LeNet5_var with cylindrical convolutions. With this dataset augmented 100 times, it achieved a testing F1-score of 72.5 ± 2.6% and a testing TAR of 72.8 ± 3.0% (mean ± std). For the real dataset, testing performance could not be calculated due to dataset scarcity. Nevertheless, the model that performed best on the validation dataset was also Siamese_LeNet5_var with cylindrical convolutions. With this dataset augmented 100 times, it achieved a median validation F1-score of 60.9% and a median TAR@FAR(0.01) of 46.7%.
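
To make the contrast between the two measures concrete, the following is a minimal sketch of how TAR@FAR(p) and the F1-score might be computed from pairwise embedding distances. It is not taken from the thesis: the function names `tar_at_far` and `f1_at_threshold`, the convention that a pair is accepted when its distance is strictly below a threshold, and the toy data are all assumptions made for illustration, and the thesis's exact definitions may differ.

```python
import numpy as np


def tar_at_far(distances, is_same, p):
    """Return (TAR, threshold) at the loosest threshold whose FAR is at most p.

    Assumption: a pair is "accepted" as similar when distance < threshold.
    """
    distances = np.asarray(distances, dtype=float)
    is_same = np.asarray(is_same, dtype=bool)
    negatives = np.sort(distances[~is_same])
    # Number of false accepts the bound p allows among the negative pairs.
    k = int(np.floor(p * negatives.size + 1e-9))
    # Accepting strictly below the (k+1)-th smallest negative distance
    # admits at most k negative pairs, so FAR <= p.
    threshold = negatives[k] if k < negatives.size else np.inf
    tar = float(np.mean(distances[is_same] < threshold))
    return tar, float(threshold)


def f1_at_threshold(distances, is_same, threshold):
    """F1-score of the 'similar' decision at a given distance threshold."""
    distances = np.asarray(distances, dtype=float)
    is_same = np.asarray(is_same, dtype=bool)
    accepted = distances < threshold
    tp = int(np.sum(accepted & is_same))    # true accepts
    fp = int(np.sum(accepted & ~is_same))   # false accepts
    fn = int(np.sum(~accepted & is_same))   # false rejects
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical): four similar pairs followed by four dissimilar pairs.
toy_distances = [0.1, 0.2, 0.6, 0.9, 0.3, 0.7, 0.8, 1.0]
toy_is_same = [1, 1, 1, 1, 0, 0, 0, 0]

tar_strict, thr_strict = tar_at_far(toy_distances, toy_is_same, p=0.0)
tar_loose, thr_loose = tar_at_far(toy_distances, toy_is_same, p=0.25)
f1_loose = f1_at_threshold(toy_distances, toy_is_same, thr_loose)
```

In this sketch, tightening p only moves the threshold according to the false accepts; the false rejects that result are ignored when choosing it, whereas the F1-score penalises both kinds of error at whatever threshold is supplied.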