SAPIENS

Saliency and Attention: rePresentation, Interpretation and EmergeNce

Attention is a complex cognitive function, essential for explaining human behaviour, that allows us to select the most relevant events or items in our environment and focus our sensory and cognitive resources on them. It can be modulated either by bottom-up, sensory-driven factors or by top-down, task-specific goals. In the former case it is also referred to as salience or saliency.

Understanding saliency and attention is a highly challenging scientific endeavour, and creating artificial machines capable of imitating them would be a remarkable technological step forward. Despite substantial advances in the field, including our own under our previous MinEco-funded project SAMURAI (Saliency and Attention: MUltimodality, context-awaReness, self-Adaptation and bio-Inspiration) and those of others, the challenge is still far from being overcome.

In the meantime, two prominent technologies spanning this and several other disciplines have had a profound impact on this research agenda: the deep learning paradigm in machine learning and the maturing of sensor devices. Our research group has accumulated significant expertise in both. In this light, we have identified the following key directions for advancing this technology:

1. Representation. Measuring and describing attention at various levels of detail is a hard problem, owing to inter-subject variability and the limited ability of measuring devices to capture the phenomenon. From the data modeling perspective, the problem is two-fold: first, the target labels are underspecified and, second, the input features are poorly suited to the task, especially for modeling the dynamics and multimodality of the phenomena. Along this line, our proposal with SAPIENS is to explore unsupervised, weakly supervised and multilabeling methods, with an emphasis on the representation of multimodal streams to take advantage of cross-modal synergies. This is fully aligned with the open lines of research in the machine learning community, and current methods are therefore likely to evolve in this direction.

2. Interpretation. With the advent and spread of deep learning techniques, a growing concern is their lack of interpretability: in essence, they are conceived as black boxes. This undermines their applicability, on the one hand, as a tool for the scientific understanding of phenomena and, on the other, as an aid for experts, where they are likely to be used as modules inside a more complex system, possibly with human interaction in the loop. SAPIENS will adopt the Exploratory Data Analysis framework, explore information-theoretical evaluation methods for characterizing these techniques, and search for bioinspired mathematical models behind current state-of-the-art machine learning methods.

3. Emergence. The emergence of order and organization in systems composed of many autonomous entities is a very basic process, yet one that is very difficult to model or explain. In the phenomenon of attention we observe how a complex task-driven response arises from low-level sensory inputs. In SAMURAI we demonstrated the capability to acquire knowledge through the discovery of latent classes or topics in top-down visual systems, but not with aural or multimodal data streams. Within this wider framework, SAPIENS aims to provide a more generally applicable solution, making use of the wealth of scientific results available on this matter.

SAPIENS considers several real applications to test the theoretical advances in these three directions.

Publications

(Open access in our Institutional Repository: https://e-archivo.uc3m.es/handle/10016/1591)

  • [DOI] F. J. Valverde-Albacete and C. Peláez-Moreno, “A framework for supervised classification performance analysis with information-theoretic methods,” IEEE Transactions on Knowledge and Data Engineering, pp. 1-1, 2020.
    [Bibtex]
    @ARTICLE{val:pel:20,
    author={F. J. {Valverde-Albacete} and C. {Peláez-Moreno}},
    journal={IEEE Transactions on Knowledge and Data Engineering},
    title={A framework for supervised classification performance analysis with information-theoretic methods},
    year={2020},
    volume={},
    number={},
    pages={1-1},
    keywords={Task analysis;Entropy;Mutual information;Proposals;Tools;Performance analysis;Performance evaluation;classification algorithms;information entropy;mutual information;formal concept analysis},
    doi={10.1109/TKDE.2019.2915643},
    ISSN={2326-3865},
    month={},}
  • [DOI] E. Pla-Sacristán, I. González-Díaz, T. Martínez-Cortés, and F. Díaz-de-María, “Finding landmarks within settled areas using hierarchical density-based clustering and meta-data from publicly available images,” Expert Systems with Applications, vol. 123, pp. 315-327, 2019.
    [Bibtex]
    @article{pla:gon:mar:dia:19,
    title = "Finding landmarks within settled areas using hierarchical density-based clustering and meta-data from publicly available images",
    journal = "Expert Systems with Applications",
    volume = "123",
    pages = "315 - 327",
    year = "2019",
    issn = "0957-4174",
    doi = "10.1016/j.eswa.2019.01.046",
    url = "http://www.sciencedirect.com/science/article/pii/S0957417419300521",
    author = "Eduardo Pla-Sacristán and Iván González-Díaz and Tomás Martínez-Cortés and Fernando Díaz-de-María",
    keywords = "Density-based clustering, K-DBSCAN, V-DBSCAN, Hierarchical clustering, Landmark detection, Tourism",
    abstract = "The process of determining relevant landmarks within a certain region is a challenging task, mainly due to its subjective nature. Many of the current lines of work include the use of density-based clustering algorithms as the base tool for such a task, as they permit the generation of clusters of different shapes and sizes. However, there are still important challenges, such as the variability in scale and density. In this paper, we present two novel density-based clustering algorithms that can be applied to solve this: K-DBSCAN, a clustering algorithm based on Gaussian Kernels used to detect individual inhabited cores within regions; and V-DBSCAN, a hierarchical algorithm suitable for sample spaces with variable density, which is used to attempt the discovery of relevant landmarks in cities or regions. The obtained results are outstanding, since the system properly identifies most of the main touristic attractions within a certain region under analysis. A comparison with respect to the state-of-the-art show that the presented method clearly outperforms the current methods devoted to solve this problem."
    }
  • [DOI] M. Fernández-Torres, I. González-Díaz, and F. Díaz-de-María, “Probabilistic Topic Model for Context-Driven Visual Attention Understanding,” IEEE Transactions on Circuits and Systems for Video Technology, pp. 1-1, 2019.
    [Bibtex]
    @ARTICLE{tor:gon:dia:19,
    author={M. {Fernández-Torres} and I. {González-Díaz} and F. {Díaz-de-María}},
    journal={IEEE Transactions on Circuits and Systems for Video Technology},
    title={Probabilistic Topic Model for Context-Driven Visual Attention Understanding},
    year={2019},
    volume={},
    number={},
    pages={1-1},
    keywords={Visualization;Task analysis;Adaptation models;Feature extraction;Computational modeling;Probabilistic logic;Context modeling;Top-down visual attention;hierarchical probabilistic framework;context-aware model;latent topic models},
    doi={10.1109/TCSVT.2019.2909427},
    ISSN={1558-2205},
    month={},}
  • [DOI] M. Molina-Moreno, I. González-Díaz, and F. Díaz-de-María, “Efficient Scale-Adaptive License Plate Detection System,” IEEE Transactions on Intelligent Transportation Systems, pp. 1-13, 2018.
    [Bibtex]
    @ARTICLE{Molina-Moreno2018, 
    author={M. Molina-Moreno and I. González-Díaz and F. Díaz-de-María}, 
    journal={IEEE Transactions on Intelligent Transportation Systems}, 
    title={Efficient Scale-Adaptive License Plate Detection System}, 
    year={2018}, 
    volume={}, 
    number={}, 
    pages={1-13}, 
    keywords={Licenses;Detectors;Feature extraction;Deformable models;Lighting;Robustness;Image edge detection;License plate detection;GentleBoost;scale-adaptive part-based model;video surveillance}, 
    doi={10.1109/TITS.2018.2859035}, 
    ISSN={1524-9050}, 
    month={},}
  • [DOI] F. Fernández-Martínez, A. Hernández-García, M. A. Fernández-Torres, I. González-Díaz, Á. García-Faura, and F. Díaz-de-María, “Exploiting visual saliency for assessing the impact of car commercials upon viewers,” Multimedia Tools and Applications, vol. 77, iss. 15, pp. 18903-18933, 2018.
    [Bibtex]
    @ARTICLE{FernandezMartinez2018, 
    author="Fern{\'a}ndez-Mart{\'i}nez, F.
    and Hern{\'a}ndez-Garc{\'i}a, A.
    and Fern{\'a}ndez-Torres, M. A.
    and Gonz{\'a}lez-D{\'i}az, I.
    and Garc{\'i}a-Faura, {\'A}.
    and  D{\'i}az-de-Mar{\'i}a, F.",
    journal={Multimedia Tools and Applications}, 
    title={Exploiting visual saliency for assessing the impact of car commercials upon viewers}, 
    year={2018}, 
    volume={77}, 
    number={15}, 
    pages={18903-18933}, 
    keywords={Visual attention, Saliency, Scene analysis, Aesthetics assessment, Feature extraction, Video impact assessment}, 
    doi={10.1007/s11042-017-4879-3}, 
    ISSN={1573-7721}, 
    month={August},}
  • [DOI] F. J. Valverde-Albacete and C. Peláez-Moreno, “The Case for Shifting the Rényi Entropy,” Entropy, vol. 21, iss. 1, 2019.
    [Bibtex]
    @Article{val:pel:19,
    AUTHOR = {Valverde-Albacete, Francisco J. and Peláez-Moreno, Carmen},
    TITLE = {The Case for Shifting the Rényi Entropy},
    JOURNAL = {Entropy},
    VOLUME = {21},
    YEAR = {2019},
    NUMBER = {1},
    ARTICLE-NUMBER = {46},
    URL = {http://www.mdpi.com/1099-4300/21/1/46},
    ISSN = {1099-4300},
    ABSTRACT = {We introduce a variant of the R\'enyi entropy definition that aligns it with the well-known H\"older mean: in the new formulation, the r-th order R\'enyi entropy is the logarithm of the inverse of the r-th order H\"older mean. This brings about new insights into the relationship of the R\'enyi entropy to quantities close to it, like the information potential and the partition function of statistical mechanics. We also provide expressions that allow us to calculate the R\'enyi entropies from the Shannon cross-entropy and the escort probabilities. Finally, we discuss why shifting the R\'enyi entropy is fruitful in some applications.},
    DOI = {10.3390/e21010046}
    }
  • [DOI] F. J. Valverde-Albacete and C. Peláez-Moreno, “K-Formal Concept Analysis as linear algebra over idempotent semifields,” Information Sciences, vol. 467, pp. 579-603, 2018.
    [Bibtex]
    @article{val:pel:18c,
    title = "K-Formal Concept Analysis as linear algebra over idempotent semifields",
    journal = "Information Sciences",
    volume = "467",
    pages = "579 - 603",
    year = "2018",
    issn = "0020-0255",
    doi = "10.1016/j.ins.2018.07.067",
    url = "http://www.sciencedirect.com/science/article/pii/S0020025516312051",
    author = "Francisco J. Valverde-Albacete and Carmen Pel\'aez-Moreno",
    keywords = "Generalised Formal Concept Analysis, Concept lattice, Neighborhood lattice, Idempotent semiring, Dioid, Confusion matrix",
    abstract = "We report on progress in characterizing K-valued FCA in algebraic terms, where K is an idempotent semifield. In this data mining-inspired approach, incidences are matrices and sets of objects and attributes are vectors. The algebraization allows us to write matrix-calculus formulae describing the polars and the fixpoint equations for extents and intents. Adopting also the point of view of the theory of linear operators between vector spaces we explore the similarities and differences of the idempotent semimodules of extents and intents with the subspaces related to a linear operator in standard algebra. This allows us to shed some light into Formal Concept Analysis from the point of view of the theory of linear operators over idempotent semimodules. In the opposite direction, we state the importance of FCA-related concepts for dual order homomorphisms of linear spaces over idempotent semifields, specially congruences, the lattices of extents, intents and formal concepts."
    }
  • [DOI] A. Rodríguez-Hidalgo, C. Peláez-Moreno, and A. Gallardo-Antolín, “The Robustness of Echoic Log-Surprise Auditory Saliency Detection,” IEEE Access, vol. 6, pp. 72083-72093, 2018.
    [Bibtex]
    @ARTICLE{rod:pel:gal:18b,
    author={A. Rodr\'iguez-Hidalgo and C. Pel\'aez-Moreno and A. Gallardo-Antol\'in},
    journal={IEEE Access},
    title={The Robustness of Echoic Log-Surprise Auditory Saliency Detection},
    year={2018},
    volume={6},
    number={},
    pages={72083-72093},
    keywords={Acoustics;Robustness;Task analysis;Saliency detection;Signal processing algorithms;Bayes methods;Spectrogram;Acoustic saliency;echoic memory;multi-scale;statistical divergence;Jensen-Shannon;acoustic event detection},
    doi={10.1109/ACCESS.2018.2882055},
    ISSN={2169-3536},
    month={},}

Conferences

  • [DOI] F. J. V. Albacete, C. Peláez-Moreno, P. Cordero, and M. Ojeda-Aciego, “Formal Equivalence Analysis,” in 2019 Conference of the International Fuzzy Systems Association and the European Society for Fuzzy Logic and Technology (EUSFLAT 2019), 2019.
    [Bibtex]
    @inproceedings{val:pel:cor:oje:19,
      title={Formal Equivalence Analysis},
      author={Francisco José Valverde Albacete and Carmen Peláez-Moreno and Pablo Cordero and Manuel Ojeda-Aciego},
      year={2019},
      booktitle={2019 Conference of the International Fuzzy Systems Association and the European Society for Fuzzy Logic and Technology (EUSFLAT 2019)},
      issn={2589-6644},
      isbn={978-94-6252-770-6},
      url={https://doi.org/10.2991/eusflat-19.2019.109},
      doi={10.2991/eusflat-19.2019.109},
      publisher={Atlantis Press}
    }
  • [DOI] A. Gallardo-Antolín and J. M. Montero, “A Saliency-Based Attention LSTM Model for Cognitive Load Classification from Speech,” in Proc. Interspeech 2019, 2019, pp. 216-220.
    [Bibtex]
    @inproceedings{gal:mon:19,
      author={Ascensión Gallardo-Antolín and Juan Manuel Montero},
      title={{A Saliency-Based Attention LSTM Model for Cognitive Load Classification from Speech}},
      year=2019,
      booktitle={Proc. Interspeech 2019},
      pages={216--220},
      doi={10.21437/Interspeech.2019-1603},
      url={http://dx.doi.org/10.21437/Interspeech.2019-1603}
    }
  • I. González-Díaz, J. Benois-Pineau, J. Domenger, and A. de Rugy, “Perceptually-guided Understanding of Egocentric Video Content: Recognition of Objects to Grasp,” in ACM International Conference on Multimedia Retrieval, ICMR, 2018, pp. 434-441.
    [Bibtex]
    @inproceedings{Gonzalez2018,
      author    = {Iv{\'{a}}n Gonz{\'{a}}lez{-}D{\'{i}}az and
                   Jenny Benois{-}Pineau and
                   Jean{-}Philippe Domenger and
                   Aymar de Rugy},
      title     = {Perceptually-guided Understanding of Egocentric Video Content: Recognition
                   of Objects to Grasp},
      booktitle = {ACM International Conference on Multimedia Retrieval, {ICMR}},
      pages     = {434--441},
      year      = {2018},
      timestamp = {Mon, 11 Jun 2018 09:27:11 +0200},
      bibsource = {dblp computer science bibliography, https://dblp.org}
    }
  • F. J. Valverde-Albacete, C. Peláez-Moreno, I. P. Cabrera, P. Cordero, and M. Ojeda-Aciego, “A Data Analysis Application of Formal Independence Analysis,” in Concept Lattices and their Applications (CLA 2018), 2018, pp. 1-12.
    [Bibtex]
    @incollection{val:pel:cab:cor:oje:18b,
      Author = {Valverde-Albacete, Francisco J and Pel{\'a}ez-Moreno, Carmen and Cabrera, Inma P and Cordero, P and Ojeda-Aciego, Manuel},
      Booktitle = {Concept Lattices and their Applications (CLA 2018)},
      Date-Added = {2018-05-08 07:39:20 +0000},
      Date-Modified = {2018-05-08 07:39:20 +0000},
      Pages = {1--12},
      Title = {{A Data Analysis Application of Formal Independence Analysis}},
      Year = {2018}}

PhD Theses

  • M. Á. Fernández-Torres, “Hierarchical representations for spatio-temporal visual attention modeling and understanding,” PhD Thesis, 2019.
    [Bibtex]
    @phdthesis{tesis6,
      author       = {Miguel Ángel Fernández-Torres}, 
      title        = {Hierarchical representations for spatio-temporal visual attention modeling and understanding},
      school       = {Escuela Politécnica Superior, Universidad Carlos III de Madrid.},
      year         = 2019,
      month        = 2,
    }
  • A. Rodríguez-Hidalgo, “Bayesian and Echoic Log-surprise for auditory saliency detection,” PhD Thesis, 2019.
    [Bibtex]
    @phdthesis{tesis7,
      author       = {Antonio Rodríguez-Hidalgo}, 
      title        = {Bayesian and Echoic Log-surprise for auditory saliency detection},
      school       = {Escuela Politécnica Superior, Universidad Carlos III de Madrid.},
      year         = 2019,
      month        = 2,
    }

Execution dates: 01/01/2018-31/12/2020

Funded by: FEDER/Ministerio de Ciencia, Innovación y Universidades – Agencia Estatal de Investigación/TEC2017-84395-P
