Demos & Tools

WebKFCA

An EDA (Exploratory Data Analysis) framework based on K-FCA has been developed as an aid for scientific discovery. A more ad hoc tool, specifically designed for Gene Expression Analysis, is also available at (url_WebGeneKFCA4GPM). https://webgenekfca.com/webgenekfca/general/changetype/webkfca

The Entropy Triangle

New implementations of the set of information-theoretic tools for the assessment of multi-class classifiers, including the Entropy Triangle, the NIT and the EMA.

Implementations

Use case vignettes in R

If you really want to get your hands dirty, these are the use cases we use to illustrate the affordances of the Entropy Triangle in Rmd: Analysis of Confusion Matrices and Simple Use Case for the CBET on classification. You will be able to analyse different classifiers and find out for yourself what the Entropy Triangle is doing. In this case, we recommend having RStudio installed.

For those who just want to peruse the illustration cases: Analysis of Confusion Matrices and Simple Use Case for the CBET on classification.
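To get a feel for what those vignettes compute before opening them, here is a minimal, self-contained sketch of the CBET coordinates of a confusion matrix. It follows the balance equation of [1], but note: it is written in Python rather than the R of the vignettes, it is our own illustration (not the authors' code), and the function name `cbet_coordinates` is ours.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a sequence of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def cbet_coordinates(confusion):
    """Normalized (DeltaH', 2MI', VI') coordinates of a confusion matrix.

    `confusion` is a list of rows of counts: rows = true class X,
    columns = predicted class Y. Our sketch of the CBET balance
    equation H_U = DeltaH + 2*MI + VI from [1].
    """
    n = sum(sum(row) for row in confusion)
    pxy = [[c / n for c in row] for row in confusion]   # joint P_XY
    px = [sum(row) for row in pxy]                      # marginal P_X
    py = [sum(col) for col in zip(*pxy)]                # marginal P_Y
    hx, hy = entropy(px), entropy(py)
    hxy = entropy(p for row in pxy for p in row)
    hux, huy = math.log2(len(px)), math.log2(len(py))   # uniform (maximal) entropies
    mi = hx + hy - hxy                                  # mutual information
    vi = hxy - mi                                       # variation of information H(X|Y)+H(Y|X)
    dh = (hux - hx) + (huy - hy)                        # divergence from uniformity
    total = hux + huy                                   # = dh + 2*mi + vi (balance equation)
    return dh / total, 2 * mi / total, vi / total

# A perfect 2-class classifier on balanced data transfers all information:
print(cbet_coordinates([[8, 0], [0, 8]]))   # -> (0.0, 1.0, 0.0)
```

A classifier guessing at random on the same data lands at the opposite vertex, `(0.0, 0.0, 1.0)`, which is the kind of contrast the vignettes let you explore visually.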

Our project in ResearchGate

Web page at ResearchGate where updates to the project are posted.

The main papers for the theory

The first introduction to the Entropy Triangle, the CBET [1], and the related metrics EMA & NIT [2]; the source multivariate extension SMET [3]; and the channel multivariate extension CMET [4].
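In brief (our summary, loosely following the notation of [1]; the papers are the authoritative reference), the CBET rests on a balance equation splitting the maximal joint entropy of the true labels $X$ and the predictions $Y$ into three non-negative parts:

```latex
% Balance equation behind the CBET, for the joint distribution P_XY
% (U_X, U_Y denote the uniform distributions on the class labels):
H_{U_X U_Y} = \Delta H_{P_{XY}} + 2\, MI_{P_{XY}} + VI_{P_{XY}},
\qquad
\Delta H_{P_{XY}} = H_{U_X U_Y} - H_{P_X} - H_{P_Y},
\quad
VI_{P_{XY}} = H_{P_{X|Y}} + H_{P_{Y|X}}.
```

Dividing through by $H_{U_X U_Y}$ yields three coordinates summing to one, which place each classifier as a point in a de Finetti (barycentric) diagram, the Entropy Triangle.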

[1] [doi] F. J. Valverde-Albacete and C. Peláez-Moreno, “Two information-theoretic tools to assess the performance of multi-class classifiers,” Pattern Recognition Letters, vol. 31, iss. 12, pp. 1665-1671, 2010.
[Bibtex]
@article{val:pel:10b,
  Author = {Francisco J. Valverde-Albacete and Carmen Pel\'aez-Moreno},
  Title = {Two information-theoretic tools to assess the performance of multi-class classifiers},
  Journal = {Pattern Recognition Letters},
  Volume = {31},
  Number = {12},
  Pages = {1665--1671},
  Year = {2010},
  Doi = {10.1016/j.patrec.2010.05.017}}
[2] [doi] F. J. Valverde-Albacete and C. Peláez-Moreno, “100% classification accuracy considered harmful: the normalized information transfer factor explains the accuracy paradox,” PLOS ONE, pp. 1-10, 2014.
[Bibtex]
@article{val:pel:14a,
  Author = {Francisco J. Valverde-Albacete and Carmen Pel\'aez-Moreno},
  Title = {100\% classification accuracy considered harmful: the normalized information transfer factor explains the accuracy paradox},
  Journal = {PLOS ONE},
  Month = {January},
  Pages = {1--10},
  Year = {2014},
  Doi = {10.1371/journal.pone.0084217}}
[3] [doi] F. J. Valverde-Albacete and C. Peláez-Moreno, “The Evaluation of Data Sources using Multivariate Entropy Tools,” Expert Systems with Applications, vol. 78, pp. 145-157, 2017.
[Bibtex]
@article{val:pel:17b,
  Author = {Valverde-Albacete, Francisco J. and Pel\'aez-Moreno, Carmen},
  Title = {The Evaluation of Data Sources using Multivariate Entropy Tools},
  Journal = {Expert Systems with Applications},
  Volume = {78},
  Pages = {145--157},
  Year = {2017},
  Doi = {10.1016/j.eswa.2017.02.010}}
[4] [doi] F. J. Valverde-Albacete and C. Peláez-Moreno, “Assessing Information Transmission in Data Transformations with the Channel Multivariate Entropy Triangle,” Entropy, vol. 20, iss. 7, 2018.
[Bibtex]
@article{val:pel:18c,
  Author = {Valverde-Albacete, Francisco J. and Pel\'aez-Moreno, Carmen},
  Title = {Assessing Information Transmission in Data Transformations with the Channel Multivariate Entropy Triangle},
  Journal = {Entropy},
  Volume = {20},
  Number = {7},
  Year = {2018},
  Issn = {1099-4300},
  Doi = {10.3390/e20070498},
  Abstract = {Data transformation, e.g., feature transformation and selection, is an integral part of any machine learning procedure. In this paper, we introduce an information-theoretic model and tools to assess the quality of data transformations in machine learning tasks. In an unsupervised fashion, we analyze the transformation of a discrete, multivariate source of information $\bar{X}$ into a discrete, multivariate sink of information $\bar{Y}$ related by a distribution $P_{\bar{X}\bar{Y}}$. The first contribution is a decomposition of the maximal potential entropy of $(\bar{X},\bar{Y})$, which we call a balance equation, into its (a) non-transferable, (b) transferable, but not transferred, and (c) transferred parts. Such balance equations can be represented in (de Finetti) entropy diagrams, our second set of contributions. The most important of these, the aggregate channel multivariate entropy triangle, is a visual exploratory tool to assess the effectiveness of multivariate data transformations in transferring information from input to output variables. We also show how these decomposition and balance equations also apply to the entropies of $\bar{X}$ and $\bar{Y}$, respectively, and generate entropy triangles for them. As an example, we present the application of these tools to the assessment of information transfer efficiency for Principal Component Analysis and Independent Component Analysis as unsupervised feature transformation and selection procedures in supervised classification tasks.}}

Our tutorial at WCCI18

Slides for IJCNN-04 tutorial: IJCNN18-EntropyTriangle

See more here.
