Direct Approximation of Quadratic Mutual Information and Its Application to Dependence-Maximization Clustering
Abstract
Mutual information (MI) is a standard measure of the statistical dependence between random variables. However, because MI involves the logarithm of a ratio of probability densities, it is sensitive to outliers. In contrast, its L^2-distance variant, called quadratic MI (QMI), tends to be robust against outliers because QMI is simply the integral of the squared difference between the joint density and the product of the marginals. In this paper, we propose a kernel-based QMI estimator called least-squares QMI (LSQMI) that directly estimates this density difference without estimating each density separately. A notable advantage of LSQMI is that its solution can be computed analytically and efficiently just by solving a system of linear equations. We then apply LSQMI to dependence-maximization clustering and demonstrate its usefulness experimentally.
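As a rough illustration of the idea in the abstract, the sketch below fits a Gaussian kernel model to the density difference p(x,y) - p(x)p(y) by least squares and plugs the analytic solution (obtained from one regularized linear system) into a QMI estimate. The bandwidth, regularization, center-selection scheme, and the plug-in formula `2 h'θ - θ'Hθ` are illustrative assumptions in the spirit of the paper, not its exact formulation.

```python
import numpy as np

def lsqmi(x, y, sigma=0.5, lam=1e-3, n_centers=100, seed=0):
    """Hedged sketch of least-squares QMI estimation.

    Models f_theta(x, y) ~= p(x, y) - p(x) p(y) with Gaussian kernels
    centered at a subset of the paired samples, solves the least-squares
    problem analytically, and returns a plug-in QMI estimate.
    """
    x = x.reshape(len(x), -1)
    y = y.reshape(len(y), -1)
    n = len(x)
    rng = np.random.default_rng(seed)
    idx = rng.choice(n, size=min(n_centers, n), replace=False)
    cx, cy = x[idx], y[idx]          # kernel centers on the joint space
    b = len(idx)

    def gauss(a, c):                 # (len(a), len(c)) Gaussian kernel matrix
        d2 = ((a[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2 * sigma ** 2))

    Kx, Ky = gauss(x, cx), gauss(y, cy)
    # h_l = E_joint[k] - E_marginals[k]; the second expectation factorizes
    # over x and y because the kernel is separable.
    h = (Kx * Ky).mean(axis=0) - Kx.mean(axis=0) * Ky.mean(axis=0)
    # H_{lm} = integral of k(z, c_l) k(z, c_m) dz has a closed Gaussian form.
    d = x.shape[1] + y.shape[1]
    c = np.hstack([cx, cy])
    d2 = ((c[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
    H = (np.sqrt(np.pi) * sigma) ** d * np.exp(-d2 / (4 * sigma ** 2))
    # Analytic solution: one regularized linear system, as the paper promises.
    theta = np.linalg.solve(H + lam * np.eye(b), h)
    return 2 * h @ theta - theta @ H @ theta  # plug-in QMI estimate
```

On strongly dependent pairs (e.g. y = x + small noise) the estimate comes out clearly positive, while for independent samples h is close to zero and so is the estimate; this contrast is what dependence-maximization clustering exploits.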
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2013-07-11
Authors
- Sugiyama Masashi (Department of Computer Science, Tokyo Institute of Technology)
- SAINUI Janya (Department of Computer Science, Tokyo Institute of Technology)
Related papers
- Statistical active learning for efficient value function approximation in reinforcement learning (Neurocomputing)
- Improving the Accuracy of Least-Squares Probabilistic Classifiers
- Approximating the Best Linear Unbiased Estimator of Non-Gaussian Signals with Gaussian Noise
- Adaptive importance sampling with automatic model selection in value function approximation (Neurocomputing)
- Recent Advances and Trends in Large-Scale Kernel Methods
- Syntheses of New Artificial Zinc Finger Proteins Containing Trisbipyridine-ruthenium Amino Acid at The N-or C-terminus as Fluorescent Probes
- Analytic Optimization of Shrinkage Parameters Based on Regularized Subspace Information Criterion(Neural Networks and Bioengineering)
- Constructing Kernel Functions for Binary Regression(Pattern Recognition)
- Optimal design of regularization term and regularization parameter by subspace information criterion
- Information-maximization clustering: analytic solution and model selection (Information-Theoretic Learning Theory and Machine Learning)
- Least Absolute Policy Iteration-A Robust Approach to Value Function Approximation
- Adaptive importance sampling with automatic model selection in reward weighted regression (Neurocomputing)
- SERAPH: semi-supervised metric learning paradigm with hyper sparsity (Information-Theoretic Learning Theory and Machine Learning)
- Analysis and improvement of policy gradient estimation (Information-Theoretic Learning Theory and Machine Learning)
- Output divergence criterion for active learning in collaborative settings (Mathematical Modeling and Problem Solving / Bioinformatics)
- Dependence minimizing regression with model selection for non-linear causal inference under non-Gaussian noise (Information-Theoretic Learning Theory and Machine Learning)
- Canonical dependency analysis based on squared-loss mutual information (Information-Theoretic Learning Theory and Machine Learning)
- Artist agent A[2]: stroke painterly rendering based on reinforcement learning (Pattern Recognition and Media Understanding)
- Artist agent A[2]: stroke painterly rendering based on reinforcement learning (Information-Theoretic Learning Theory and Machine Learning)
- Modified Newton Approach to Policy Search (Information-Theoretic Learning Theory and Machine Learning)
- Relative Density-Ratio Estimation for Robust Distribution Comparison (Information-Theoretic Learning Theory and Machine Learning)
- Squared-loss Mutual Information Regularization
- Computationally Efficient Multi-Label Classification by Least-Squares Probabilistic Classifier
- Winning the Kaggle Algorithmic Trading Challenge with the Composition of Many Models and Feature Engineering (Information-Theoretic Learning Theory and Machine Learning)
- Direct Density Ratio Estimation for Large-scale Covariate Shift Adaptation
- Early Stopping Heuristics in Pool-Based Incremental Active Learning for Least-Squares Probabilistic Classifier (Information-Theoretic Learning Theory and Machine Learning)
- Efficient Sample Reuse in Policy Gradients with Parameter-based Exploration (Information-Theoretic Learning Theory and Machine Learning)
- Output Divergence Criterion for Active Learning in Collaborative Settings
- Photochromism of benzylviologens containing methyl groups on pyridinium rings and embedded in solid poly(N-vinyl-2-pyrrolidone) matrix.
- Clustering Unclustered Data : Unsupervised Binary Labeling of Two Datasets Having Different Class Balances
- Direct Learning of Sparse Changes in Markov Networks by Density Ratio Estimation
- Improving Importance Estimation in Pool-based Batch Active Learning for Approximate Linear Regression