Information-maximization clustering: analytic solution and model selection (Information-Based Induction Sciences and Machine Learning)
Abstract
A recently proposed information-maximization clustering method (Gomes et al., NIPS 2010) learns a kernel logistic regression classifier in an unsupervised manner so that the mutual information between feature vectors and cluster assignments is maximized. A notable advantage of this approach is that it involves only continuous optimization of a logistic model, which is substantially easier than discrete optimization of cluster assignments. However, the method still suffers from two weaknesses: (i) kernel parameters must be tuned manually, and (ii) finding a good local optimum is not straightforward because logistic-regression learning is strongly non-convex. In this paper, we first show that the kernel parameters can be systematically optimized by maximizing mutual information estimates. We then propose an alternative information-maximization clustering approach based on a squared-loss variant of mutual information. This novel approach yields clustering solutions analytically and in a computationally efficient way. Through experiments, we demonstrate the usefulness of the proposed approaches.
- 2011-03-21
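For orientation, the two dependence measures contrasted in the abstract are commonly written as below. This is a sketch in assumed notation, not text from this record: x denotes a feature vector, y a cluster label in {1, ..., c}, and p(.) the corresponding densities. Ordinary mutual information (MI) is the Kullback-Leibler divergence from p(x, y) to p(x)p(y), while the squared-loss variant (SMI) is the corresponding Pearson divergence. Both are non-negative and vanish exactly when x and y are independent.

```latex
\begin{align}
  \mathrm{MI}  &= \int \sum_{y=1}^{c} p(x, y)\,
                  \log \frac{p(x, y)}{p(x)\,p(y)} \,\mathrm{d}x, \\
  \mathrm{SMI} &= \frac{1}{2} \int \sum_{y=1}^{c} p(x)\,p(y)
                  \left( \frac{p(x, y)}{p(x)\,p(y)} - 1 \right)^{2} \mathrm{d}x.
\end{align}
```

Because SMI replaces the logarithm with a squared term, it is quadratic in the density ratio, which is consistent with the abstract's claim that a clustering solution can be obtained analytically.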
Authors
-
Hachiya Hirotaka
Department of Computer Science Tokyo Institute of Technology
-
Yamada Makoto
Department of Chemistry Faculty of Science Okayama University of Science
-
Kimura Manabu
Department of Materials Science and Engineering Metal Section Nagoya Institute of Technology
-
Sugiyama Masashi
Department of Computer Science Tokyo Institute of Technology
-
YAMADA Makoto
Tokyo Institute of Technology
-
Hachiya Hirotaka
Tokyo Inst. of Technol.
-
Kimura Manabu
Department of Computer Science Tokyo Institute of Technology
-
Yamada Makoto
Tokyo Inst. of Technol.
-
Yamada Makoto
Department of Chemistry and Biomolecular Science Toho University
-
Sugiyama Masashi
Department of Chemistry Faculty of Science Tokyo University of Science
-
Sugiyama Masashi
Department of Applied Chemistry, Yamanashi University
Related papers
- Statistical active learning for efficient value function approximation in reinforcement learning (Neurocomputing)
- Improving the Accuracy of Least-Squares Probabilistic Classifiers
- Least Absolute Policy Iteration — A Robust Approach to Value Function Approximation
- A New Meta-Criterion for Regularized Subspace Information Criterion
- Approximating the Best Linear Unbiased Estimator of Non-Gaussian Signals with Gaussian Noise
- A new algorithm of non-Gaussian component analysis with radial kernel functions (Special issue: Information geometry and its applications)
- Spontaneous Rupture of the Iliac Vein : Report of a Case
- Methods of cross-domain object matching (Information-Based Induction Sciences and Machine Learning)
- Conjugation of Laminin Derived Cell Adhesive Peptides on a Chitosan Membrane and Their Biological Activity
- Screening of Amyloidogenic Peptides in Laminin-1
- Multifunctional Peptide-Chitosan Membrane : New Biomedical Materials
- γ-Functional Prolines Based on Naturally Occurred Hydroxyproline. III
- Accumulation-Exclusion Combined System for the DNA-Binding Harmful Chemicals with Insolubilized DNA
- UV-Irradiated DNA Matrix Selectively Accumulates Heavy Metal Ions
- Effect of Nucleoplasmin on a Nucleosome Structure
- Neurite Outgrowth Promoting Sites on the Laminin Alpha 3 Chain
- Cell Adhesive and Heparin Binding Sites on the Laminin Alpha Chain G Domains
- γ-Functional Prolines Based on Naturally Occurring Hydroxyproline. II
- Identification of Biologically Active Sequences on the Laminin Alpha Chain G Domains
- Identification of Cell Adhesive Sites on the Laminin Alpha Chain Domain VI
- Multi-task learning with least-squares probabilistic classifiers (Pattern Recognition and Media Understanding)
- Multi-task learning with least-squares probabilistic classifiers (Information-Based Induction Sciences and Machine Learning)
- Adaptive importance sampling with automatic model selection in value function approximation (Neurocomputing)
- Analytic Optimization of Adaptive Ridge Parameters Based on Regularized Subspace Information Criterion (Neural Networks and Bioengineering)
- Adaptive Ridge Learning in Kernel Eigenspace and Its Model Selection
- Mossbauer Study on the Phase Separation of Fe-Co-Si Alloys
- On Computational Issues of Semi-Supervised Local Fisher Discriminant Analysis
- Myonase is Localized in Skeletal Muscle Myofibrils
- Recent Advances and Trends in Large-Scale Kernel Methods
- Covered Stent Treatment for Traumatic Cervical Carotid Artery Aneurysms : Two Case Reports
- Syntheses of New Artificial Zinc Finger Proteins Containing Trisbipyridine-ruthenium Amino Acid at The N-or C-terminus as Fluorescent Probes
- Practical Synthesis of 4-cis Hydroxy-L-Proline
- Improving Model-based Reinforcement Learning with Multitask Learning
- Ovarian Histological Findings in an Adult Patient with the Steroidogenic Acute Regulatory Protein (StAR) Deficiency Reveal the Impairment of Steroidogenesis by Lipoid Deposition
- Analytic Optimization of Shrinkage Parameters Based on Regularized Subspace Information Criterion (Neural Networks and Bioengineering)
- Least-Squares Conditional Density Estimation
- Direct Importance Estimation with a Mixture of Probabilistic Principal Component Analyzers
- Constructing Kernel Functions for Binary Regression (Pattern Recognition)
- Optimal design of regularization term and regularization parameter by subspace information criterion
- Solid-papillary carcinoma of the breast : Clinicopathological study of 20 cases
- A fast sequence kernel for sequential data classification (Speech) -- (International workshop "Asian workshop on speech science and technology")
- Information-maximization clustering: analytic solution and model selection (Information-Based Induction Sciences and Machine Learning)
- Conditional Density Estimation Based on Density Ratio Estimation
- New feature selection method for reinforcement learning: conditional mutual information reveals implicit state-reward dependency (Information-Based Induction Sciences and Machine Learning)
- Least Absolute Policy Iteration-A Robust Approach to Value Function Approximation
- Independent component analysis by direct density-ratio estimation (Neurocomputing)
- A New Meta-Criterion for Regularized Subspace Information Criterion (Pattern Recognition)
- Spectral Methods for Thesaurus Construction
- Adaptive importance sampling with automatic model selection in reward weighted regression (Neurocomputing)
- SERAPH: semi-supervised metric learning paradigm with hyper sparsity (Information-Based Induction Sciences and Machine Learning)
- Analysis and improvement of policy gradient estimation (Information-Based Induction Sciences and Machine Learning)
- Direct density-ratio estimation with dimensionality reduction via hetero-distributional subspace analysis (Information-Based Induction Sciences and Machine Learning)
- Output divergence criterion for active learning in collaborative settings (Mathematical Modeling and Problem Solving / Bioinformatics)
- Estimation of squared-loss mutual information from paired and unpaired samples (Information-Based Induction Sciences and Machine Learning)
- Direct Importance Estimation with Gaussian Mixture Models
- Dependence minimizing regression with model selection for non-linear causal inference under non-Gaussian noise (Information-Based Induction Sciences and Machine Learning)
- Canonical dependency analysis based on squared-loss mutual information (Information-Based Induction Sciences and Machine Learning)
- Artist agent A²: stroke painterly rendering based on reinforcement learning (Pattern Recognition and Media Understanding)
- Artist agent A²: stroke painterly rendering based on reinforcement learning (Information-Based Induction Sciences and Machine Learning)
- Efficient Sample Reuse in Policy Gradients with Parameter-based Exploration (Information-Based Induction Sciences and Machine Learning)
- Generalization Error Estimation for Non-linear Learning Methods (Neural Networks and Bioengineering)
- Improving Precision of the Subspace Information Criterion (Neural Networks and Bioengineering)
- Canonical dependency analysis based on squared-loss mutual information (Pattern Recognition and Media Understanding)
- Change-Point Detection in Time-Series Data by Relative Density-Ratio Estimation (Information-Based Induction Sciences and Machine Learning)
- Modified Newton Approach to Policy Search (Information-Based Induction Sciences and Machine Learning)
- Computationally Efficient Multi-Label Classification by Least-Squares Probabilistic Classifier (Information-Based Induction Sciences and Machine Learning)
- Relative Density-Ratio Estimation for Robust Distribution Comparison (Information-Based Induction Sciences and Machine Learning)
- Change-Point Detection in Time-Series Data by Relative Density-Ratio Estimation
- Modified Newton Approach to Policy Search
- Squared-loss Mutual Information Regularization
- Computationally Efficient Multi-Label Classification by Least-Squares Probabilistic Classifier
- Feature Selection via l_1-Penalized Squared-Loss Mutual Information
- Semi-Supervised Learning of Class Balance under Class-Prior Change by Distribution Matching (Information-Based Induction Sciences and Machine Learning)
- Relative Density-Ratio Estimation for Robust Distribution Comparison
- Artist Agent: A Reinforcement Learning Approach to Automatic Stroke Generation in Oriental Ink Painting
- Prospective Assessment of Pain and Functional Status After Percutaneous Vertebral Body-Perforation Procedure for Treatment of Vertebral Compression Fractures
- Winning the Kaggle Algorithmic Trading Challenge with the Composition of Many Models and Feature Engineering (Information-Based Induction Sciences and Machine Learning)
- Computationally Efficient Multi-Label Classification by Least-Squares Probabilistic Classifiers
- Direct Density Ratio Estimation for Large-scale Covariate Shift Adaptation
- Early Stopping Heuristics in Pool-Based Incremental Active Learning for Least-Squares Probabilistic Classifier (Information-Based Induction Sciences and Machine Learning)
- Multi-Task Approach to Reinforcement Learning for Factored-State Markov Decision Problems
- Output Divergence Criterion for Active Learning in Collaborative Settings
- Photochromism of benzylviologens containing methyl groups on pyridinium rings and embedded in solid poly(N-vinyl-2-pyrrolidone) matrix.
- Clustering Unclustered Data : Unsupervised Binary Labeling of Two Datasets Having Different Class Balances
- Direct Approximation of Quadratic Mutual Information and Its Application to Dependence-Maximization Clustering
- Direct Learning of Sparse Changes in Markov Networks by Density Ratio Estimation
- On Kernel Parameter Selection in Hilbert-Schmidt Independence Criterion
- Early Stopping Heuristics in Pool-Based Incremental Active Learning for Least-Squares Probabilistic Classifier
- Winning the Kaggle Algorithmic Trading Challenge with the Composition of Many Models and Feature Engineering
- Improving Importance Estimation in Pool-based Batch Active Learning for Approximate Linear Regression