Solving Missing Values: A Case Study

Abstract

One of the most important data-related issues in information theory, in both databases (DB) and data warehouses (DW), is missing values (unknown, not available, or required). Missing values pose a great challenge to the analysis process. Features, or data attributes (fields or columns in a relational DB), are the core of any analytical process in OLAP (Online Analytical Processing) and OLTP (Online Transaction Processing), so these attributes must be studied and processed. Many papers have been published that address this problem with different goals and algorithms. The aim of this research proposal is to improve the algorithms applied to missing values so as to ensure data consistency, correctness, and completeness while improving time and space complexity. Different algorithms and techniques, including Most Common Value, overall average, and classification, were applied to more than 20,000 records collected from hospitals and clinics across Iraq to study the effectiveness of the proposed algorithms.
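As a rough illustration of two of the techniques named above, the following minimal Python sketch fills missing entries with the most common value (for categorical attributes) and with the overall average (for numeric attributes). It is not the implementation used in this study: the function names `impute_most_common` and `impute_mean`, the use of `None` to mark a missing value, and the sample columns are all assumptions made for the example. Classification-based imputation is not sketched here.

```python
from collections import Counter
from statistics import mean


def impute_most_common(values):
    """Replace each None with the most frequent observed value (hypothetical helper)."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # no observed values to learn from
    most_common_value = Counter(observed).most_common(1)[0][0]
    return [most_common_value if v is None else v for v in values]


def impute_mean(values):
    """Replace each None with the average of the observed values (hypothetical helper)."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # no observed values to average
    average = mean(observed)
    return [average if v is None else v for v in values]


# Made-up sample columns, for illustration only (not data from the study).
blood_groups = ["A", "O", None, "O", "B", None]
ages = [30, None, 50, 40]

print(impute_most_common(blood_groups))
print(impute_mean(ages))
```

Both helpers make one pass to collect the observed values and one pass to fill the gaps, so their time and space costs grow linearly with the number of records in the column.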

Keywords

DB, DW, OLTP, OLAP, Missing Values