Knowledge Management and Discovery Lab
Otto-von-Guericke-University Magdeburg, Germany
daniel.kottke@ovgu.de
www.daniel.kottke.eu/talks/2016_ECAI
\(\Rightarrow\) Active Learning
Selects label candidates at random.
Each label candidate is equally likely to be selected.
Prefers label candidates near the decision boundary.
Candidates in the darker green area (near the decision boundary) are preferred.
Simulates acquiring each candidate's label for each possible class and estimates the expected error reduction on the data.
Candidates in the darker green area (determined by a computationally expensive calculation) are preferred.
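For illustration (a sketch of my own, not code from the poster), the first two strategies can be contrasted in a few lines; the candidate posteriors below are hypothetical:

```python
import numpy as np

def random_sampling(n_candidates, rng):
    """Pick a label candidate uniformly at random (baseline strategy)."""
    return int(rng.integers(n_candidates))

def uncertainty_sampling(posteriors):
    """Pick the candidate whose prediction is least certain,
    i.e. whose maximal class posterior is smallest
    (closest to the decision boundary)."""
    posteriors = np.asarray(posteriors)
    return int(np.argmin(posteriors.max(axis=1)))

# Hypothetical posterior estimates for four candidates (binary task).
post = [[0.95, 0.05], [0.60, 0.40], [0.51, 0.49], [0.80, 0.20]]
print(uncertainty_sampling(post))  # -> 2, the candidate nearest the boundary
```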
Basic idea and main tools
[1] G. Krempl, D. Kottke, V. Lemaire: "Optimised probabilistic active learning". Machine Learning, 2014.
Given the true posterior, the observed labels are Multinomially distributed.
The normalized likelihood function is:
\[P(\vec{p} \mid \vec{k}) = \frac{ \Gamma\left(\sum_i (k_i + 1)\right) }{ \prod_i \Gamma(k_i+1) } \cdot \prod_i p_i^{k_i}\]
Example: In the binary case, the normalized likelihood is a Beta-distribution (see figure).
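As a numerical sketch of this binary case (my own illustration): the normalized likelihood is the Beta(\(k_1+1, k_2+1\)) density, and we can check that it integrates to one.

```python
import math

def normalized_likelihood(p, k):
    """P(p | k) for a binary label count vector k = (k1, k2):
    the Beta(k1 + 1, k2 + 1) density."""
    k1, k2 = k
    norm = math.gamma(k1 + k2 + 2) / (math.gamma(k1 + 1) * math.gamma(k2 + 1))
    return norm * p**k1 * (1 - p)**k2

# Numerical check that the density integrates to 1 over [0, 1].
k = (3, 1)
n = 10_000
integral = sum(normalized_likelihood(i / n, k) for i in range(n + 1)) / n
print(round(integral, 3))  # approximately 1.0
```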
We describe all possible combinations of labels that could appear in that region, given that \(m\) labels are acquired.
Three-Class Example (\(m=2\)): \[\vec{l} \in \{(2,0,0), (0,2,0), (0,0,2), (1,1,0), (1,0,1), (0,1,1)\} \]
This labeling vector is Multinomially distributed:\[P(\vec{l} \mid \vec{p}) = \textrm{Multinomial}_{\vec{p}}(\vec{l}) = \frac{ \Gamma\big((\sum_i l_i) + 1\big) }{ \prod_i \Gamma(l_i+1) } \cdot \prod_i p_i^{l_i}\]
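The six label vectors from the three-class example and their probabilities can be enumerated directly; a minimal sketch (helper names are my own):

```python
import math
from itertools import product

def label_vectors(n_classes, m):
    """All label count vectors l with sum(l) == m."""
    return [l for l in product(range(m + 1), repeat=n_classes) if sum(l) == m]

def multinomial_pmf(l, p):
    """P(l | p) = Gamma(sum(l)+1) / prod(Gamma(l_i+1)) * prod(p_i^l_i)."""
    coef = math.gamma(sum(l) + 1)
    for li in l:
        coef /= math.gamma(li + 1)
    return coef * math.prod(pi**li for pi, li in zip(p, l))

vecs = label_vectors(3, 2)
print(len(vecs))  # -> 6, matching the three-class example above
p = (0.5, 0.3, 0.2)
print(sum(multinomial_pmf(l, p) for l in vecs))  # probabilities sum to 1.0
```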
The main contribution is to have a fast closed-form solution for the calculation of a candidate's usefulness.
\[\begin{aligned} \operatorname{expPerf}(\vec{k}, m) &= \mathbb{E}_{\vec{p}}\big[ \mathbb{E}_{\vec{l}}\big[ \operatorname{acc}(\vec{k}+\vec{l} \mid \vec{p}) \big]\big] \\ &= \sum_{\vec{l}} \left(\prod_{j = \sum(k_i + 1)}^{\big(\sum(k_i + l_i + d_i + 1) \big) - 1} \frac{1}{j} \right) \cdot \prod_i \left( \prod_{j=k_i+1}^{k_i + l_i + d_i} j \right) \cdot \frac{ \Gamma\big((\sum_i l_i) + 1\big) }{ \prod_i \Gamma(l_i+1) } \end{aligned}\]
(details in the paper)
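As a plausibility sketch of the double expectation (my own construction, not the paper's closed form; it assumes \(\operatorname{acc}(\vec{k}+\vec{l} \mid \vec{p})\) is the true posterior probability of the class predicted from the updated counts), the quantity can be approximated by Monte Carlo:

```python
import numpy as np

def exp_perf_mc(k, m, n_samples=20_000, seed=0):
    """Monte Carlo estimate of E_p[ E_l[ acc(k + l | p) ] ]:
    draw the true posterior p from Dirichlet(k + 1), draw m future
    labels l ~ Multinomial(m, p), then score the accuracy of
    predicting the majority class of the updated counts k + l."""
    rng = np.random.default_rng(seed)
    k = np.asarray(k, dtype=float)
    ps = rng.dirichlet(k + 1, size=n_samples)            # p | k
    ls = np.array([rng.multinomial(m, p) for p in ps])   # l | p
    pred = np.argmax(k + ls, axis=1)                     # predicted class
    return float(ps[np.arange(n_samples), pred].mean())

est = exp_perf_mc(k=(3, 1), m=2)
print(round(est, 2))  # expected accuracy after acquiring 2 more labels
```

The closed form above computes this sum exactly and cheaply, which is what makes the per-candidate usefulness fast to evaluate.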
McPAL combines the well-known advantages of these strategies by using local statistics.
McPAL outperforms competing strategies in our experimental evaluation.
Code is available: https://kmd.cs.ovgu.de/res/mcpal/
(ready to use in practice; collaborations welcome)
Slides, Paper, Bibtex:
www.daniel.kottke.eu/talks/2016_ECAI
Supplemental material:
kmd.cs.ovgu.de/res/mcpal/
Workshop at iKNOW (Oct 18, 2016):
Active Learning: Applications, Foundations and Emerging Trends
http://i-know.tugraz.at/
Multi-Class Probabilistic Active Learning
Daniel Kottke, Georg Krempl, Dominik Lang, Johannes Teschner, Myra Spiliopoulou
European Conference on Artificial Intelligence (ECAI)
The Hague, Netherlands, 2016.