A Survey on Arabic Text Classification Using Deep and Machine Learning Algorithms

Abstract

Text categorization is the process of grouping texts or documents into classes or categories according to their content. The process consists of three phases: preprocessing, feature extraction, and classification. Compared with English, only a few studies have addressed the categorization and classification of Arabic text. Arabic text representation is a difficult task for applications such as text classification and clustering, because the Arabic language is noted for its richness, diversity, and complicated morphology. This paper presents a comprehensive analysis and comparison of research from the last five years, based on the dataset, year, algorithms, and reported accuracy. The surveyed studies used Deep Learning (DL) and Machine Learning (ML) models to enhance text classification for the Arabic language. The paper concludes with remarks for future work.