Statistics, Optimization & Information Computing

A Deep Inverse Weibull Network

Ola Abuelamayem — Fri, 01 Nov 2024 00:00:00 +0800

Survival analysis is widely used in fields such as economics, engineering and medicine. The core of the analysis is to understand the relationship between the covariates and the survival function. The analysis can be performed using traditional statistical models or neural networks. Recently, neural networks have attracted attention in analyzing lifetime data due to their flexibility in handling complex covariates. The networks introduced in the literature have some restrictions, such as the proportional hazards assumption, data discretization, monotonicity of hazard rates and heavy-tailed assumptions. In this paper, a novel neural network is introduced based on the inverse Weibull distribution and random censoring that removes some of these restrictions. The network does not impose monotonicity, proportionality or heavy-tailed assumptions on the hazard function, and it does not require data discretization. To test its applicability, the network is applied to both simulated and real datasets, and the numerical results show that the model outperforms some other methods in the literature.
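
As a rough illustration of how an inverse Weibull model can handle random right censoring, the sketch below evaluates a censored negative log-likelihood. The parametrization $F(t)=\exp(-\alpha t^{-\beta})$ and the function names are assumptions made for this example, not taken from the paper; in a network, $\alpha$ and $\beta$ would presumably be outputs computed from the covariates rather than fixed numbers.

```python
import math

def inverse_weibull_logpdf(t, alpha, beta):
    # Assumed density: f(t) = alpha * beta * t^(-beta-1) * exp(-alpha * t^(-beta)), t > 0
    return (math.log(alpha) + math.log(beta)
            - (beta + 1.0) * math.log(t) - alpha * t ** (-beta))

def inverse_weibull_logsf(t, alpha, beta):
    # Survival function: S(t) = 1 - exp(-alpha * t^(-beta));
    # -expm1(-x) computes 1 - exp(-x) accurately for small x.
    return math.log(-math.expm1(-alpha * t ** (-beta)))

def censored_nll(times, events, alpha, beta):
    # events[i] = 1 for an observed failure, 0 for a right-censored time.
    # Observed units contribute log f(t); censored units contribute log S(t).
    total = 0.0
    for t, d in zip(times, events):
        if d:
            total += inverse_weibull_logpdf(t, alpha, beta)
        else:
            total += inverse_weibull_logsf(t, alpha, beta)
    return -total

# Hypothetical data: two observed failures and one censored unit.
print(censored_nll([1.2, 0.7, 2.5], [1, 1, 0], alpha=1.0, beta=1.5))
```

Minimizing this quantity over the parameters (or over network weights that produce them) is one standard way to fit such a model; no discretization of the time axis is needed.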

Prediction of New Lifetimes of a Step-Stress Test Using Cumulative Exposure Model with Censored Gompertz Data

Mohammad A. Amleh, Israa F. Al-Freihat — Mon, 18 Nov 2024 00:00:00 +0800

In this paper, we address the problem of predicting the time until failure for censored units, following a Gompertz distribution. This prediction is carried out within a simple step-stress strategy operating under a cumulative exposure model. We explore various prediction techniques, including the maximum likelihood predictor, conditional median predictor, and best unbiased predictor. Additionally, we delve into the prediction interval estimation for the future lifetimes of these censored units. We discuss methods such as the pivotal quantity, highest conditional density, and shortest-length approaches to achieve this. To assess the performance of the proposed prediction methods, we conduct Monte Carlo simulations. Furthermore, we utilize a real dataset for illustrative purposes and comparative analysis.

Exploring the Shift in Symmetry Phenomenon in Exponentially Weighted Moving Average Quality Charts for Statistics Derived from Beta Distribution

Mohammad Hamasha, Ghada Shawaheen, Ahmad Mayyas — Thu, 26 Dec 2024 00:00:00 +0800

The Exponentially Weighted Moving Average (EWMA) is a statistical method that creates moving averages assigning greater weight to more recent data, and it is frequently used in quality control. This approach, which mitigates asymmetry via the central limit theorem, still encounters skewness issues that are strongly influenced by the smoothing parameter ($\lambda$). This research investigates the impact of various EWMA smoothing factors on skewness reduction, utilizing the beta distribution, which can replicate diverse real-world distributions from heavily skewed to nearly symmetric, for data generation. Using Matlab, random beta-distributed data were analyzed with the EWMA to observe changes in skewness and kurtosis. This study aids in comprehending data distributions in new or significantly modified processes, assisting in the adjustment of control chart parameters. It identifies a gap in existing literature regarding indeterminate distributions and underscores the need for further investigation in this field.
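
The kind of experiment described here can be sketched in a few lines. The example below uses the standard EWMA recursion $z_t = \lambda x_t + (1-\lambda) z_{t-1}$ on beta-distributed data and reports the sample skewness for several values of $\lambda$. It is written in Python rather than Matlab, and the shape parameters, sample size and seed are arbitrary choices for illustration, not the settings of the study.

```python
import random

def ewma(xs, lam, z0):
    # Standard recursion: z_t = lam * x_t + (1 - lam) * z_{t-1}
    z = z0
    out = []
    for x in xs:
        z = lam * x + (1.0 - lam) * z
        out.append(z)
    return out

def skewness(xs):
    # Moment-based sample skewness: m3 / m2^(3/2)
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

rng = random.Random(0)           # fixed seed, illustrative only
a, b = 2.0, 8.0                  # assumed shape parameters (right-skewed)
data = [rng.betavariate(a, b) for _ in range(20000)]
start = a / (a + b)              # start the EWMA at the process mean

for lam in (1.0, 0.5, 0.2, 0.05):
    z = ewma(data, lam, start)
    print(f"lambda={lam:.2f}  skewness of EWMA statistic={skewness(z):.3f}")
```

With $\lambda = 1$ the EWMA reproduces the raw data, so the first row shows the skewness of the beta sample itself; smaller $\lambda$ averages over more observations.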

Density-Adaptive Clustering of Multivariate Angular Data Using Dirichlet Process Mixture Models with Circular Normal Distribution for Artificial Intelligence Applications

Said Benlakhdar, Saralees Nadarajah, Mohammed Rziza, Rachid Oulad Haj Thami — Thu, 02 Jan 2025 00:00:00 +0800

Data clustering is an essential technique for organizing unsupervised data, extracting subjects automatically, and swiftly retrieving or filtering information. In this study, we approach the task of clustering multivariate angular distributions using nonparametric Bayesian mixture models featuring von Mises distributions. Our approach operates within a nonparametric Bayesian framework, specifically leveraging the Dirichlet process. Unlike finite mixture models, our approach assumes an infinite number of clusters initially, inferring the optimal number automatically from the data. Moreover, our paper introduces a unified approach, leveraging Ward's algorithm, the Dirichlet process, and von Mises mixture distributions (DPM-MvM), to effectively capture both the structure and variability inherent in the data. We have developed a variational inference algorithm for DPM-MvM that enables automatic determination of the number of clusters. Our experimental results showcase the efficiency and accuracy of our method for analyzing multivariate angular data in comparison with state-of-the-art approaches.

On Testing the Adequacy of the Logistic Model Based on Negative Cumulative Extropy

Hadi Alizadeh Noughabi, Mohammad Shafaei Noughabi — Tue, 07 Jan 2025 00:00:00 +0800

The logistic distribution has been used for various growth models, and is used in a certain type of regression, known appropriately as logistic regression. In this article, we propose a new goodness-of-fit test for the logistic distribution based on the negative cumulative residual extropy introduced by Tahmasebi and Toomaj (2022). The mean, variance and other properties of the test statistic are presented. Percentage points of the test statistic are obtained, and the power of the test against different alternatives is reported. The results of a simulation study show that the test is competitive in terms of power. The proposed statistic is easy to compute, and a real data set is used to illustrate the application of the proposed test.

Moment Properties of Generalized Order Statistics From Lindley Pareto Distribution

Abu Bakar, Haseeb Athar, Yousef F. Alharbi, Mohamad A. Fawzy — Thu, 09 Jan 2025 00:00:00 +0800

The Lindley Pareto distribution is a more flexible model for analyzing lifetime data. In this paper, the moment properties of generalized order statistics from the Lindley Pareto distribution are studied in terms of exact expressions and recurrence relations. The results for order statistics, record values, and progressive type-II right censored order statistics are discussed as particular cases of generalized order statistics. Further, a characterization of the distribution through recurrence relations between moments of generalized order statistics is presented. Finally, some statistical measures of order statistics and record values for the Lindley Pareto distribution are computed.

Some Results of Generalized Extropy Measure and Its application

Vikas Kumar, Salook Sharma, Ritu Goel — Fri, 10 Jan 2025 00:00:00 +0800

Taking into account the importance of extropy (see Lad et al. 2015) and its various generalizations, in the present communication we consider and study the generalized extropy of order alpha and type beta based on Varma's (Varma, 1966) information measure for both discrete and continuous random variables. The dynamic versions (both residual and past) of the proposed generalized extropy measure are also presented. Finally, the interval generalized extropy measure and an application of the proposed generalized extropy measure are presented.

A Method to Classify Shape Data using Multinomial Logistic Regression Model

Meisam Moghimbeygi — Fri, 24 Jan 2025 00:00:00 +0800

We introduce a multinomial logistic regression model to classify labeled configurations. In this modeling, we use a power-divergence test to find an estimator for the probability of belonging to each category. The estimator is introduced based on different distances. Since the estimator is biased, we modify the belonging probability by multinomial logistic regression. We evaluate the performance of the proposed technique in a comprehensive simulation study. We also classify five real data sets using our multinomial logistic model.

Optimizing the Arrangement of Goods in Box Van Using the Tabu Search Algorithm

Kiswara Agung Santoso, Inas Mustafidatul Ilmiyah, Agustina Pradjaningsih — Fri, 06 Dec 2024 00:00:00 +0800

A country's economic progress can be seen from the industrial sector's contribution to its economic growth. Transportation plays an important role in the distribution of products to consumers, where the smooth flow of goods can reduce costs and optimize company profits. Distribution problems often occur because the arrangement of goods is not optimal, which increases costs and labor. Optimizing the placement of goods in expedition vans has not been widely studied until now. Therefore, the Tabu Search algorithm is used to optimize the arrangement of goods (items). The Tabu Search algorithm is a metaheuristic algorithm that aims to find the optimal solution among various possible solutions. In this study, all goods sent were packaged in cubes or blocks, and the vehicles used to send them were also box-shaped. The essence of this article is arranging goods (packed in boxes) into a van so that its contents are maximized. This research also discusses how to place items that cannot be reversed (fragile items), along with a visualization.

Optimal Excess-of-Loss Reinsurance Contract in a Dynamic Risk Model

Abouzar Bazyari — Fri, 10 Jan 2025 00:00:00 +0800

This paper studies the optimal excess-of-loss reinsurance contract between an insurer and a reinsurer in a dynamic risk model. The risk process is assumed to be a diffusion approximation of the classical Cramér-Lundberg model perturbed by a Brownian motion. In addition to reinsurance, we assume that the insurer is allowed to invest his/her surplus in a financial market containing one risk-free rate of return and determines the reinsurance strategy by a self-reinsurance function. Our aim is to obtain the simultaneous equilibrium strategy in this dynamic reinsurance risk setting using the objective functions of the insurer and the reinsurer. By employing the dynamic programming approach, we minimize the insurer’s ruin probability and maximize the reinsurer’s expected aggregate discounted net profits to obtain the optimal portfolio for the two parties’ treaties in a fixed-term insurance contract. In order to provide a more explicit reinsurance contract and to facilitate our quantitative analysis, we study the case when the reinsurance premium function is based on the standard-deviation principle from the integro-differential equations. A numerical example is given to investigate the effects of model parameters on the equilibrium strategy.

Supply Chain Networks Optimization under Uncertain Environment with Dhouib-Matrix-TP1 heuristic

Souhail Dhouib, Manel Kammoun, Saima Dhouib, Taicir Loukil — Fri, 31 Jan 2025 00:00:00 +0800

The transportation problem (TP) is a critical component of the supply chain network that involves determining the most efficient way to move goods from one location to another. TP is a generic name given to a whole class of problems in which diverse types of transportation modes are used to supply a product from sources to destinations. The TP is a common challenge in supply chain networks, where it aims to minimize the total cost of transportation to satisfy both supply and demand constraints. In this paper, the constructive heuristic Dhouib-Matrix-TP1 (DM-TP1) is adapted in order to solve the balanced and unbalanced TP with heptagonal fuzzy numbers. DM-TP1 needs a reduced number of iterations in order to generate a good initial basic feasible solution and uses a novel metric based on (Average-Min). Several numerical examples (balanced and unbalanced) are used to prove the performance of DM-TP1.

Distance measures for hidden Markov models based on Hilbert space embeddings for time series classification

Fri, 10 Jan 2025 00:00:00 +0800

In order to build a classification scheme for sequences based on hidden Markov models (HMMs), the design of an appropriate distance is critical in both theory and practice. The Kullback-Leibler (KL) and Hidden Markov Stationary Distance (HSD) measures have been used to build classification schemes for sequences based on HMMs. However, it is well known that the KL measure is not a true metric, and the HSD metric applies only to univariate data. Inspired by the recent emergence of metrics on probability measures in Reproducing Kernel Hilbert Spaces (RKHS), we introduce two new metrics between two stationary HMMs. Unlike the HSD metric, our RKHS-based metrics can be calculated analytically and can be used for multivariate data. We evaluate the performance of the two metrics in the task of time-series classification, using the metrics within a K-Nearest Neighbor (KNN) classifier. The performance of the two metrics is evaluated on the Massachusetts Eye and Ear Infirmary Disordered Voice Database from the Kay Elemetrics company. Results show that the proposed metrics provide competitive classification accuracies when compared to the KL, HSD and DTW measures.

Aerial Remote Sensing Object detection using Unsupervised Domain Adaptive

Youssef BEN YOUSSEF, Soufiane Lyaqini, Khaled Fakhar, Elhassane Abdelmounim — Thu, 22 Aug 2024 00:00:00 +0800

Object recognition and localization in Aerial Remote Sensing Images (ARSI) are critical and demanding tasks for further processing of object-related data in civil and military applications. To train a Deep Learning (DL) model for visual recognition and localization, a huge number of annotated images are needed. However, data categorization and annotation are hard and time-consuming tasks. When labeled training data are scarce, Unsupervised Domain Adaptation (UDA) offers an alternative solution. In this paper, UDA is used to detect and localize objects in ARSI, treated as an unlabeled target domain. We compare the effectiveness of the Faster Region-based Convolutional Neural Network (Faster R-CNN) as a two-stage detector and RetinaNet as a one-stage detector. Both algorithms use the same ResNet50 model as the backbone. This study uses the natural image dataset MSCOCO as the source domain. We assess the proposed approach on two unlabeled datasets, UC Merced and MTRASI. The proposed method significantly improves object detection and localization performance, according to both qualitative and quantitative results. Extensive experiments show that the RetinaNet detector is better than the Faster R-CNN detector in terms of mAP.

On Local Antimagic b-Coloring of Graphs: New Notion

Mon, 02 Dec 2024 00:00:00 +0800

Let $G=(V,E)$ be a simple, connected and undirected graph. Given a bijection $f: E(G) \longrightarrow \{1,2,3, \dots, |E(G)|\}$, we define the weight of a vertex $v\in V$ as $w(v)=\sum_{e\in E(v)}f(e)$, where $E(v)$ is the set of edges incident to $v$. The bijection $f$ is said to be a local antimagic labeling if any two adjacent vertices have distinct vertex weights. Furthermore, a $b$-coloring of a graph is a proper $k$-coloring of the vertices of $G$ such that in each color class there exists a vertex having neighbors in all other $k-1$ color classes. If we color each vertex by its vertex weight $w(v)$ and the induced coloring satisfies the $b$-coloring property, then the labeling is called a local antimagic $b$-coloring of the graph. The local antimagic $b$-chromatic number, denoted by $\varphi_{la}(G)$, is the maximum number of colors over all colorings generated by local antimagic $b$-colorings of $G$. In this study, we initiate the study of the $b$-chromatic number of $G$ and of the exact values of $\varphi_{la}(G)$ for certain classes of graph families.
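
The definitions above can be checked mechanically on small graphs. The following sketch, with function names chosen for this example only, computes the vertex weights induced by an edge labeling, tests the local antimagic condition, and tests whether the induced coloring is a $b$-coloring.

```python
def vertex_weights(edges, labels):
    # w(v) = sum of the labels of the edges incident to v
    w = {}
    for (u, v), lab in zip(edges, labels):
        w[u] = w.get(u, 0) + lab
        w[v] = w.get(v, 0) + lab
    return w

def is_local_antimagic(edges, labels):
    # The labeling must be a bijection onto {1, ..., |E|} ...
    if sorted(labels) != list(range(1, len(edges) + 1)):
        return False
    # ... and adjacent vertices must receive distinct weights.
    w = vertex_weights(edges, labels)
    return all(w[u] != w[v] for u, v in edges)

def is_b_coloring(edges, color):
    # Proper coloring in which every color class contains a vertex
    # that sees all the other colors among its neighbors.
    if any(color[u] == color[v] for u, v in edges):
        return False
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    colors = set(color.values())
    for c in colors:
        others = colors - {c}
        if not any(color[x] == c and {color[y] for y in adj[x]} >= others
                   for x in adj):
            return False
    return True

# Triangle C3 with edge labels 1, 2, 3: weights are 4, 3, 5.
triangle = [(0, 1), (1, 2), (0, 2)]
w = vertex_weights(triangle, [1, 2, 3])
print(w, is_local_antimagic(triangle, [1, 2, 3]), is_b_coloring(triangle, w))

# Path P4 with labels 1, 2, 3: local antimagic, but not a b-coloring.
path = [(0, 1), (1, 2), (2, 3)]
wp = vertex_weights(path, [1, 2, 3])
print(wp, is_local_antimagic(path, [1, 2, 3]), is_b_coloring(path, wp))
```

For the triangle, every vertex is adjacent to both other colors, so the labeling yields a local antimagic $b$-coloring with three colors. For the path, the end vertex of weight 1 has no neighbor of weight 5, so the $b$-coloring property fails for this particular labeling.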

Novel SR-RNN Classifier for Accurate Emotion Detection in Facial Analysis

Jyoti S. Bedre, P. Lakshmi Prasanna — Thu, 05 Dec 2024 00:00:00 +0800

Facial Expression Recognition (FER) is crucial for understanding human emotions in fields like human-computer interaction and psychology. Despite advances in deep learning (DL), existing FER methods often struggle with noise, lighting variations, and inter-subject variability, leading to inaccurate emotion classification. This paper addresses these challenges by proposing a novel SwikyRelu Recurrent Neural Network (SR-RNN) classifier. The aim is to enhance FER accuracy while reducing computational complexity. The methodology involves a multi-step process starting with image pre-processing using an Adaptive Mode Guided Filter (AMGF) and Contrast Limited Adaptive Histogram Equalization (CLAHE). Key facial features are extracted using the Generative Additive Active Shape Model (GAASM) and clustered into subgraphs using Radial Basis K-Medoids Clustering (RBKMC). Feature selection is optimized through the Chaotic Ternary Remora Optimization (CTRO) algorithm, with the selected features fed into the SR-RNN classifier for emotion categorization. Results from extensive testing on the CK+, FER-2013, and RAF-DB datasets show that the proposed SR-RNN classifier significantly outperforms conventional models, achieving 98.85\%, 91.79\%, and 89.28\% accuracy, respectively. The conclusion highlights the model's ability to enhance FER performance by effectively handling noise, illumination differences, and inter-subject variability.

Short-Term Load Forecasting Method for Renewable Energy Integration and Grid Stability Using CNN, LSTM, and Transformer Models

Khaoula Boumais, Fayçal Messaoudi — Thu, 19 Dec 2024 00:00:00 +0800

This study examines the feasibility of combining Morocco's renewable energy plan with artificial intelligence to improve energy management in the industrial sector. Based on Moroccan Law 82-21, which promotes the self-consumption of renewable energy, the study addresses the fundamental difficulty of accurately estimating energy consumption in dynamic industrial environments. This difficulty is addressed using advanced machine learning models such as convolutional neural networks (CNNs), long short-term memory networks (LSTMs) and transformers. The results show that deep learning models outperform classical methods such as ARIMA, with transformers and LSTM models excelling at handling erratic and steady energy consumption patterns, respectively.

In particular, hybrid CNN-LSTM architectures provide the highest level of accuracy, with prediction accuracy improved by up to 20\%. While improving grid stability and renewable energy integration, this development has the potential to reduce operational costs by up to 30\%. This analysis not only supports Morocco's ambitious goal of generating 52\% of its electricity from renewables by 2030 but also highlights the critical role of AI-based solutions in creating a sustainable energy future.

A Metaheuristic for Fuzzy Density Based SVM and Confidence SMOTE for Early Prediction of Diabetes

Asma Driouich, ABDELLATIF EL OUISSARI, Karim EL MOUTAOUAKIL, Ismail Akharraz — Fri, 27 Dec 2024 00:00:00 +0800

Early detection of diabetes, based on observable features, plays a crucial role in preventing serious complications in diabetic patients. In this study, we propose a classification model called SMOTE Density Based Fuzzy Support Vector Machine (SMOTE-DB-FSVM), based on FSVM, to better detect diabetes. Our approach is based on five main steps: data cleaning, density-based filtering, feature selection to identify the most important attributes, calculation of a confidence score for each point in the minority class, and use of SMOTE to balance the data. In addition, we compare different versions of the kernel functions in the SVM model to optimize classification results, using metaheuristics to estimate the parameters of these kernels. The proposed SMOTE-DB-FSVM algorithm has been evaluated in diabetes datasets, including the PIMA diabetes database, and the results show a clear improvement in the early detection of diabetes with this method.

Abnormal Behavior Detection in Surveillance Systems Using a Hybrid EfficientNet-Transformer Model

Hesham A. Alberry, M. E. Khalifa, Ahmed Taha — Thu, 09 Jan 2025 00:00:00 +0800

Anomaly detection in video surveillance is vital for public safety, but challenges arise from the unpredictability of abnormal behaviors and the scale of surveillance systems. We propose a hybrid architecture combining EfficientNetV2S for efficient feature extraction with a transformer encoder to capture long-range dependencies through self-attention. This model robustly detects abnormal events by modeling local and global patterns in video frames. Evaluated on the UCSD Ped1, UCSD Ped2, and Avenue datasets, our approach achieved accuracies of 99.51, 99.80, and 94.82, respectively, outperforming existing methods and proving its suitability for real-time smart surveillance applications.

Poverty prediction using machine learning models: Insights from HICES survey in Egypt

Israa Lewaaelhamd, Maged George Iskander — Sat, 11 Jan 2025 00:00:00 +0800

This study focuses on the poverty problem in Egypt. Data from household expenditure and income surveys are used to determine the poverty status of Egyptian households. However, conducting these types of surveys is challenging, costly, and time consuming. This procedure might be revolutionized by machine learning. This work contributes to the field by utilizing machine learning techniques to evaluate and track the poverty levels of Egyptian households. This method brings poverty detection closer to real time while lowering costs and improving accuracy. A significant portion of this work involves data preparation and the handling of unbalanced data. Eleven machine learning classification models are applied. The Gradient Boosting Machine and support vector machine classifiers achieved the best performance. The final machine learning classification model could transform efforts to track and target poverty across the country. This work demonstrates how powerful and versatile machine learning can be and, hence, promotes its adoption across many domains in both the private sector and government.

Statistical methods for inflation forecasting in Morocco: Insights from Google trends data

Mariem Bikourne, Sokaina EL KHAMLICHI, Adil Ez-Zetouni, Khadija Akdim — Sun, 12 Jan 2025 00:00:00 +0800

Accurate inflation forecasting is essential for effective economic planning and policy-making. The increasing use of the internet enables user-generated content to capture people’s expectations and perspectives on economic issues. This study investigates the power of Google Trends data as an effective alternative data source for forecasting inflation in Morocco. By identifying keywords that exhibit Granger causality with the inflation rate, we examined the linear effect of public interest on inflation forecasting using a principal component index as an exogenous factor to enhance outcomes. The selected SARIMA model, coupled with the resulting index, presents an optimal trajectory for the inflation rate. The results of this study demonstrate that the model incorporating Google Trends data yielded the best performance based on evaluation measures such as AIC, RMSE, and log-likelihood. This highlights that the Google index is a significant factor for accurately explaining and forecasting inflation rate movements, contributing substantially to inflation modeling. The adaptive features of our approach make it well suited to describing inflation uncertainty when the economy is subject to constantly changing monetary institutions and policies.

Machine Learning-Based Prediction and Multispectral Analysis for Precision Irrigation Management

Sun, 12 Jan 2025 00:00:00 +0800

The objective of this work is to build a prediction system for normalized indices such as NDVI (Normalized Difference Vegetation Index), NDRE (Normalized Difference RedEdge index) and NDWI (Normalized Difference Water Index). Based on machine learning techniques, this prediction will allow us to compare various methods. Additionally, this prediction will allow us to precisely comprehend these three indices with a small amount of data. Multiple machine learning algorithms were trained and evaluated using appropriate parameters. For NDRE and NDWI prediction, the Support Vector Machine approach produced good results with Mean Squared Errors (MSE) of 0.0006 and 0.0012, respectively. On the other hand, the Random Forest approach performed better with a lower MSE of 0.0033 for predicting NDVI. Furthermore, patterns and trends in crop health, nutrient needs and water requirements were found by clustering analysis. The process of calculating and importing indices from TIFF data was made easier with the creation of a Graphical User Interface (GUI). The system provides an innovative approach for irrigation management that supports farmers in making well-informed decisions regarding irrigation and crop health.
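
For reference, the three indices share the same normalized-difference form. The sketch below uses the commonly cited band combinations; the abstract does not state which NDWI variant was used, so the green/NIR form here is an assumption, and the reflectance values are made up for illustration.

```python
def normalized_difference(a, b):
    # (a - b) / (a + b), with 0.0 returned when both bands are zero
    denom = a + b
    return 0.0 if denom == 0 else (a - b) / denom

def vegetation_indices(nir, red, red_edge, green):
    return {
        "NDVI": normalized_difference(nir, red),       # (NIR - Red) / (NIR + Red)
        "NDRE": normalized_difference(nir, red_edge),  # (NIR - RedEdge) / (NIR + RedEdge)
        # Assumed variant: (Green - NIR) / (Green + NIR)
        "NDWI": normalized_difference(green, nir),
    }

# Hypothetical per-pixel reflectances in [0, 1]
print(vegetation_indices(nir=0.5, red=0.1, red_edge=0.3, green=0.2))
```

Applied pixel by pixel to the bands of a multispectral TIFF, the same function would produce the index maps that the prediction models take as targets.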

Periodic Exponential Autoregressive Models for Rainfall Forecasting in Algeria

Sabah BECILA, Mouna MERZOUGUI — Mon, 20 Jan 2025 00:00:00 +0800

This study examines the utilization of periodic exponential autoregressive (PEXPAR) models in analyzing rainfall time series data from Algeria. The method of Gaussian quasi maximum likelihood for parameter estimation is used. By comparing its forecasting performance with SARIMA models, we observe a slight improvement with PEXPAR₁₂(1), suggesting its potential efficacy in capturing seasonal variations and nonlinear behavior in precipitation data.

Dynamics of a Fractional Order Harvested Predator-Prey Model Incorporating Fear Effect and Refuge

Siti Nurul Afiyah, Fatmawati, Windarto, Afeez Abidemi — Sun, 19 Jan 2025 00:00:00 +0800

This study presents a fractional-order predator-prey model that considers the impact of fear, refuge, and harvesting on the populations. The proposed model uses the Caputo fractional derivative to capture the memory effects in the interaction between predators and prey. We prove the existence and uniqueness of the solution, together with the non-negativity and boundedness of the system, which are indispensable for maintaining biologically feasible populations. The stability of the equilibrium points is analyzed at both local and global levels, giving the conditions under which these points are stable or give rise to periodic dynamics through a Hopf bifurcation. To support the analytical results, numerical simulations are provided, which demonstrate the essential roles played by fear, refuge, and harvesting in the survival of prey and the overall dynamics of the system.

Hybrid Deep Learning Model: LSTM and 2BiGRU for Predicting Coronavirus (COVID-19)

Thanaa Moustafa, Hossam Refaat, Mohamed Makhlouf — Thu, 30 Jan 2025 00:00:00 +0800

The COVID-19 pandemic has had a major global health impact, highlighting the urgent need for accurate predictive models to forecast the virus's spread. This research explores the use of deep learning techniques to improve the accuracy of COVID-19 case predictions. Traditional machine learning methods often struggle with the complexities of time-series data inherent in pandemic forecasting, which motivates the use of advanced deep learning models. This study employs the LSTM-2BiGRU model, a sophisticated deep learning architecture, to predict new COVID-19 cases using two datasets: historical data from OurWorldInData and medical data with historical disease records. The model was trained to leverage time-dependent factors and achieve high prediction performance. The LSTM-2BiGRU model achieved a significant improvement over traditional machine learning models, with an accuracy of 76% and a Mean Absolute Error (MAE) of 8% for the historical dataset within a 7-day forecast window. When applied to the epidemiology dataset, the model demonstrated even higher accuracy, ranging from 80% to 90% across different prediction periods (1 to 14 days), with a Mean Absolute Percentage Error (MAPE) between 10% and 15%. These findings demonstrate the potential of deep learning models like LSTM-2BiGRU to provide more accurate and timely forecasts for COVID-19, with a substantial reduction in Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) compared to previous studies. This underscores the model's improved performance and supports better-informed public health decisions.

Dynamic Pricing and Service Quality in Ride-Sharing: A Statistical Analysis

Daniel Sanin-Villa, Cristian Mateo Hernandez, Vanessa Botero-Gomez — Tue, 11 Feb 2025 00:00:00 +0800

This study presents a comprehensive statistical analysis of factors influencing dynamic pricing and service quality in ride-sharing. Leveraging historical data, we employ regression models, including simple and multiple linear regressions as well as logistic regression, to examine the effects of trip duration, passenger count, driver availability, and customer loyalty on ride costs and service ratings. Results reveal that trip duration significantly predicts ride costs, while customer loyalty and location are key determinants of service quality. These findings provide actionable insights for improving dynamic pricing strategies and service quality in ride-sharing, supporting data-driven decision-making in a competitive market.