Representing (Dis)Similarities Between Prediction and Fixation Maps Using Intersection-over-Union Features
by Jaime Maldonado, Christoph Zetzsche
Abstract:
A classic evaluation of the quality of gaze prediction models consists in comparing a set of ground truth fixation maps against a set of predictions. The quality of the prediction depends on the spatial similarity between the predicted and the observed fixated and non-fixated areas. Typically, (dis)similarity is evaluated by computing distribution-based metrics. However, the shortcoming of the metric scores is that they provide no information about the different types of (dis)similarities present in the prediction, for example, to determine whether the prediction fails wholly or partially to account for all the fixations. In this paper, we propose a set of features for representing the spatial (dis)similarities using intersection-over-union features, which provide helpful information that cannot be retrieved with traditional metrics for analyzing and evaluating prediction maps. We exemplify the usage of the features by analyzing the performance of different prediction models on a saliency benchmark dataset.
Reference:
Representing (Dis)Similarities Between Prediction and Fixation Maps Using Intersection-over-Union Features (Jaime Maldonado, Christoph Zetzsche), In Proceedings of the 2023 Symposium on Eye Tracking Research and Applications, Association for Computing Machinery, 2023.
Bibtex Entry:
@inproceedings{maldonado2023etra,
author = {Maldonado, Jaime and Zetzsche, Christoph},
title = {Representing (Dis)Similarities Between Prediction and Fixation Maps Using Intersection-over-Union Features},
year = {2023},
isbn = {9798400701504},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3588015.3589843},
doi = {10.1145/3588015.3589843},
abstract = {A classic evaluation of the quality of gaze prediction models consists in comparing a set of ground truth fixation maps against a set of predictions. The quality of the prediction depends on the spatial similarity between the predicted and the observed fixated and non-fixated areas. Typically, (dis)similarity is evaluated by computing distribution-based metrics. However, the shortcoming of the metric scores is that they provide no information about the different types of (dis)similarities present in the prediction, for example, to determine whether the prediction fails wholly or partially to account for all the fixations. In this paper, we propose a set of features for representing the spatial (dis)similarities using intersection-over-union features, which provide helpful information that cannot be retrieved with traditional metrics for analyzing and evaluating prediction maps. We exemplify the usage of the features by analyzing the performance of different prediction models on a saliency benchmark dataset.},
booktitle = {Proceedings of the 2023 Symposium on Eye Tracking Research and Applications},
articleno = {68},
numpages = {8},
location = {T{\"u}bingen, Germany},
series = {ETRA '23},
keywords = {EASE-H1},
}