Empirical Evaluation of Similarity-Based Missing Data Imputation for Effort Estimation
Abstract
Multivariate regression models are commonly used to estimate software development effort in support of project planning and management. Because the project data sets used for model construction often contain missing values, a complete data set with no missing values must first be built, either by applying an imputation method or by removing the projects and metrics that have missing values (the removing method). Although there are several ways to build such a complete data set, it is unclear which is most suitable for project data. In this paper, using project data for 706 cases (47% missing value rate) collected from several companies, we applied four imputation methods (mean imputation, pair-wise deletion, the k-NN method, and the applied CF method) and the removing method to build regression models. Then, using project data for 143 cases with no missing values, we evaluated the estimation performance of the models built after applying each imputation method and the removing method. The results showed that the similarity-based imputation methods (the k-NN method and the applied CF method) performed best.
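To make the contrast between mean imputation and similarity-based imputation concrete, the following is a minimal sketch in Python with NumPy. It is not the implementation evaluated in the paper: the function names, the distance measure (mean squared difference over the metrics both projects have observed), the choice of k, and the small `projects` array are all assumptions made for illustration. The metrics are also left unscaled here, whereas in practice they would probably need to be normalized before distances are computed.

```python
import numpy as np


def mean_impute(X):
    """Replace each missing value (NaN) with the mean of the observed values of that metric."""
    X = np.asarray(X, dtype=float)
    col_means = np.nanmean(X, axis=0)
    return np.where(np.isnan(X), col_means, X)


def knn_impute(X, k=2):
    """Replace each missing value with the mean of that metric over the k most similar projects.

    Similarity is an assumption of this sketch: the mean squared difference
    over the metrics that both projects have observed.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    observed = ~np.isnan(X)
    n_projects = X.shape[0]
    for i in range(n_projects):
        for j in np.where(~observed[i])[0]:
            candidates = []
            for other in range(n_projects):
                # A neighbor must be a different project with metric j observed.
                if other == i or not observed[other, j]:
                    continue
                shared = observed[i] & observed[other]
                if not shared.any():
                    continue
                dist = np.mean((X[i, shared] - X[other, shared]) ** 2)
                candidates.append((dist, X[other, j]))
            if not candidates:
                continue  # no usable neighbor: leave the value missing
            candidates.sort(key=lambda c: c[0])
            filled[i, j] = np.mean([value for _, value in candidates[:k]])
    return filled


# Hypothetical project data: columns are size, team size, duration (months).
projects = np.array([
    [100.0, 5.0, 6.0],
    [110.0, 6.0, np.nan],   # duration is missing for this project
    [900.0, 40.0, 30.0],
    [120.0, 5.0, 7.0],
    [950.0, 42.0, 32.0],
])

print(mean_impute(projects)[1, 2])        # 18.75
print(knn_impute(projects, k=2)[1, 2])    # 6.5
```

In this toy data the small project with the missing duration receives 18.75 months under mean imputation, because the two large projects pull the mean up, but 6.5 months under the k-NN sketch, which borrows only from the two projects of similar size and team size.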
Japan Society for Software Science and Technology | Papers
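- User Interface Using an LCD and the Photoelasticity of a Transparent Elastic Body (Special Issue: Interactive Systems and Software)
- Position Detection Using Bluetooth
- SIMD Parallelization in COINS (Latest Compiler Technology and Practice with COINS)
- Automatic Generation of Lightweight XML Document Processors That Take Data Types into Account (Foundational Technologies Supporting Software Development)
- NF/CAL: A Natural Framework for Computation and Logic (Science and Technology of System Verification)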