A Proposed Framework for Analyzing Crime Data Set Using Decision Tree and Simple K-Means Mining Algorithms

Abstract

This paper proposes a framework for analyzing crime and criminal data using decision tree algorithms for classification and the Simple K-Means algorithm for clustering. The framework is intended to help specialists discover patterns and trends, make forecasts, find relationships and possible explanations, map criminal networks, and identify possible suspects. Classification groups crimes mainly by type, location, time, and other attributes; clustering finds relationships among crime and criminal attributes that share previously unknown common characteristics. The results of both classification and clustering are used to predict trends and behavior of the given objects (crimes and criminals). To create and test the proposed framework, data on both crimes and criminals were collected from police department datasets freely available on the Internet. These data were then preprocessed (cleaning, handling missing values, and removing inconsistencies) to obtain clean and accurate data. The preprocessed data were used to identify crime and criminal trends and behaviors, and crimes and criminals were grouped into clusters according to their important attributes. WEKA mining software and Microsoft Excel were used to analyze the data.
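
To make the clustering step concrete, the following is a minimal sketch of a basic k-means procedure in plain Python. It is not WEKA's SimpleKMeans implementation and not the authors' code; the function name `simple_k_means`, the seeding rule (first k points), and the toy records (hour of day, district code) are all assumptions chosen purely for illustration.

```python
from math import dist

def simple_k_means(points, k, max_iter=100):
    """Basic k-means. Seeding with the first k points is an
    illustrative assumption that keeps the run deterministic."""
    centroids = [tuple(p) for p in points[:k]]
    assignments = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignments, centroids

# Hypothetical toy records: (hour of day, district code). Not real crime data.
records = [(1, 1), (2, 1), (3, 2), (21, 8), (22, 9), (23, 9)]
labels, centers = simple_k_means(records, k=2)
```

On these toy records the early-hour and late-hour incidents fall into separate clusters. In practice the attributes would need encoding and scaling before distances between records are meaningful.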