Statistics, Optimization & Information Computing <p><em><strong>Statistics, Optimization and Information Computing</strong></em> (SOIC) is an international refereed journal dedicated to the latest advances in statistics, optimization and applications in information sciences. Topics of interest include (but are not limited to):</p> <p>Statistical theory and applications</p> <ul> <li class="show">Statistical computing, Simulation and Monte Carlo methods, Bootstrap, Resampling methods, Spatial Statistics, Survival Analysis, Nonparametric and semiparametric methods, Asymptotics, Bayesian inference and Bayesian optimization</li> <li class="show">Stochastic processes, Probability, Statistics and applications</li> <li class="show">Statistical methods and modeling in life sciences including biomedical sciences, environmental sciences and agriculture</li> <li class="show">Decision Theory, Time series analysis, High-dimensional multivariate integrals, statistical analysis in market, business, finance, insurance, economic and social science, etc.</li> </ul> <p>Optimization methods and applications</p> <ul> <li class="show">Linear and nonlinear optimization</li> <li class="show">Stochastic optimization, Statistical optimization and Markov chains, etc.</li> <li class="show">Game theory, Network optimization and combinatorial optimization</li> <li class="show">Variational analysis, Convex optimization and nonsmooth optimization</li> <li class="show">Global optimization and semidefinite programming</li> <li class="show">Complementarity problems and variational inequalities</li> <li class="show"><span lang="EN-US">Optimal control: theory and applications</span></li> <li class="show">Operations research, Optimization and applications in management science and engineering</li> </ul> <p>Information computing and machine intelligence</p> <ul> <li class="show">Machine learning, Statistical learning, Deep learning</li> <li 
class="show">Artificial intelligence,&nbsp;Intelligence computation, Intelligent control and optimization</li> <li class="show">Data mining, Data&nbsp;analysis, Cluster computing, Classification</li> <li class="show">Pattern recognition, Computer vision</li> <li class="show">Compressive sensing and sparse reconstruction</li> <li class="show">Signal and image processing, Medical imaging and analysis, Inverse problem and imaging sciences</li> <li class="show">Genetic algorithm, Natural language processing, Expert systems, Robotics,&nbsp;Information retrieval and computing</li> <li class="show">Numerical analysis and algorithms with applications in computer science and engineering</li> </ul> International Academic Press en-US Statistics, Optimization & Information Computing 2311-004X <span>Authors who publish with this journal agree to the following terms:</span><br /><br /><ol type="a"><ol type="a"><li>Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a <a href="" target="_new">Creative Commons Attribution License</a> that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.</li><li>Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.</li><li>Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See <a href="" target="_new">The Effect of Open Access</a>).</li></ol></ol> Discrete Chebyshev Polynomials for Solving Fractional Variational Problems <p>In ‎the current study, a‎ general 
formulation of the discrete Chebyshev polynomials is given. The operational matrix of fractional integration for these discrete polynomials is also derived. Then, a numerical scheme based on the discrete Chebyshev polynomials and their operational matrix is developed to solve fractional variational problems. This method eliminates the need for Lagrange multipliers during the solution procedure. The performance of the proposed scheme is validated through illustrative examples. Moreover, the numerical results are compared with those previously obtained using the classical Chebyshev polynomials. Finally, a comparison of the required CPU time is presented, which indicates that the proposed method is more efficient and less complex.</p> Fakhrodin Mohammadi Leila Moradi Dajana Conte Copyright (c) 2021 Statistics, Optimization & Information Computing 2021-07-10 2021-07-10 9 3 502 515 10.19139/soic-2310-5070-991 Generalized Odd Power Cauchy Family and Its Associated Heteroscedastic Regression Model <pre>This study introduces a generalization of the odd power Cauchy family by adding one more shape parameter to gain more flexibility in modeling complex data structures. Linear representations for the density, moments, quantile, and generating functions are derived. The model parameters are estimated by the maximum likelihood method. Monte Carlo simulations are performed under different parameter settings and sample sizes for the proposed models. In addition, we introduce a new heteroscedastic regression model based on a special member of the proposed family. 
Three data sets are analyzed with the competing and proposed models.</pre> Emrah Altun EA Morad Alizadeh Thiago Ramires Edwin Ortega Copyright (c) 2021 Statistics, Optimization & Information Computing 2021-07-10 2021-07-10 9 3 516 528 10.19139/soic-2310-5070-765 A Bayesian Inference Approach for Bivariate Weibull Distributions Derived from Roy and Morgenstern Methods <p>Bivariate lifetime distributions are of great importance in studies related to interdependent components, especially in engineering applications. In this paper, we introduce two bivariate lifetime distributions assuming three-parameter Weibull marginal distributions. Some characteristics of the proposed distributions, such as the joint survival function, hazard rate function, cross factorial moment and stress-strength parameter, are also derived. Inferences for the parameters, and for functions of the parameters, of the models are obtained under a Bayesian approach. An extensive numerical study using simulated data is carried out to evaluate the accuracy of the obtained estimators. To illustrate the usefulness of the proposed model, we also include an example with real data, which shows that the proposed model fits the data well.</p> Ricardo Puziol de Oliveira Marcos Vinicius de Oliveira Peres Milene Regina dos Santos Edson Zangiacomi Martinez Jorge Aberto Achcar Copyright (c) 2021 Statistics, Optimization & Information Computing 2021-07-12 2021-07-12 9 3 529 554 10.19139/soic-2310-5070-1240 A New Probability Distribution for Modeling Failure and Service Times: Properties, Copulas and Various Estimation Methods <p>In this paper, a new generalization of the Pareto type II model is introduced and studied. The new density can be “right skewed” with a heavy tail, and its corresponding failure rate can be “J-shape”, “decreasing” or “upside down (or increasing-constant-decreasing)”. 
The new model may be used as an “under-dispersed” or “over-dispersed” model. Bayesian and non-Bayesian estimation methods are considered. We assess the performance of all methods via a simulation study. Bayesian and non-Bayesian estimation methods are compared in modeling real data via two applications. In modeling real data, the maximum likelihood method performs best, so we use it to compare the competing models. Before using the maximum likelihood method, we performed simulation experiments to assess its finite sample behavior using biases and mean squared errors.</p> Hanaa Elgohari Mohamed Ibrahim Haitham M. Yousof Copyright (c) 2021 Statistics, Optimization & Information Computing 2021-07-10 2021-07-10 9 3 555 586 10.19139/soic-2310-5070-1101 Feature Selection Based on Divergence Functions: A Comparative Classification Study <p>Due to the extensive use of high-dimensional data and its application in a wide range of scientific fields of research, dimensionality reduction has become a major part of the preprocessing step in machine learning. Feature selection is one procedure for reducing dimensionality. In this process, instead of using the whole set of features, a subset is selected to be used in the learning model. Feature selection (FS) methods are divided into three main categories: filters, wrappers, and embedded approaches. Filter methods depend only on the characteristics of the data and do not rely on the learning model at hand. Divergence functions, as measures of the differences between probability distribution functions, can be used as filter methods of feature selection. In this paper, the performances of a few divergence functions, such as Jensen-Shannon (JS) divergence and Exponential divergence (EXP), are compared with those of some of the best-known filter feature selection methods, such as Information Gain (IG) and Chi-Squared (CHI). 
This comparison is based on the accuracy rate and F1-score of classification models after applying these feature selection methods.</p> Saeid Pourmand Ashkan Shabbak Mojtaba Ganjali Copyright (c) 2021 Statistics, Optimization & Information Computing 2021-07-10 2021-07-10 9 3 587 606 10.19139/soic-2310-5070-1092 Robust Liu-Type Estimator for SUR Model <p>The Liu-type estimator is one of the shrinkage estimators used to remedy the problem of multicollinearity in the SUR model, but it is sensitive to outliers. In this paper, we introduce the S Liu-type (SLiu-type) and MM Liu-type (MMLiu-type) estimators for the SUR model. These estimators combine the Liu-type estimator with the S-estimator and the MM-estimator, which gives them high robustness at different levels of efficiency while preventing the adverse effects of multicollinearity. Moreover, to gain more robustness, we modify the Liu-type estimator so that it depends on the MM-estimator instead of the GLS estimator. The asymptotic properties of the suggested estimators are discussed, and the fast and robust bootstrap (FRB) is used to obtain the suggested robust estimators. Furthermore, we run a simulation study to show how well the suggested robust estimators perform relative to the other estimators under many factors.</p> Tarek Omara Copyright (c) 2021 Statistics, Optimization & Information Computing 2021-07-26 2021-07-26 9 3 607 617 10.19139/soic-2310-5070-985 Similarity Technique Effectiveness of Optimized Fuzzy C-means Clustering Based on Fuzzy Support Vector Machine for Noisy Data <p>Fuzzy VIKOR C-means (FVCM) is an unsupervised fuzzy clustering algorithm that improves the accuracy and computational speed of Fuzzy C-means (FCM). It reduces the sensitivity to noisy and outlier data and enhances the performance and quality of the clusters. 
Since FVCM allocates some data to a specific cluster based on a similarity technique, reducing the effect of noisy data increases the quality of the clusters. This paper presents a new approach, called FVCM-FSVM, for accurately allocating noisy data to the clusters; it overcomes the constraints of noisy points through a fuzzy support vector machine (FSVM), so that at each stage samples with a high degree of membership are selected to train the FSVM classifier. Then, the labels of the remaining samples are predicted, and the process continues until FVCM-FSVM converges. The numerical experiments show that the proposed approach performs better than FVCM and achieves high accuracy.</p> Hoda Khanali Babak Vaziri Copyright (c) 2021 Statistics, Optimization & Information Computing 2021-07-10 2021-07-10 9 3 618 629 10.19139/soic-2310-5070-1035 A New Hybrid Optimizer for Global Optimization Based on a Comparative Study Remarks of Classical Gradient Descent Variants <p>In this paper, we present an empirical comparison of some Gradient Descent variants used to solve global optimization problems for large search domains. The aim is to identify which of them is most suitable for solving an optimization problem regardless of the features of the test function used. Five variants of Gradient Descent were implemented in the R language and tested on a benchmark of five test functions. We established the dependence between the choice of variant and the obtained performance using the chi-squared test on a sample of 120 experiments. The test functions vary in convexity and in the number of local minima, and are classified according to several criteria. We chose a range of values for each algorithm parameter. Results are compared in terms of accuracy and convergence speed. Based on the obtained results, we define a priority of usage for those variants and contribute a new hybrid optimizer. 
The new optimizer is tested on a benchmark of well-known test functions, and two real applications are proposed. Except for the classical gradient descent algorithm, only stochastic versions of those variants are considered in this paper.</p> Mouad Touarsi Driss Gretete Abdelmajid Elouadi Copyright (c) 2021 Statistics, Optimization & Information Computing 2021-07-10 2021-07-10 9 3 630 664 10.19139/soic-2310-5070-1005 Automated Noise Detection in a Database Based on a Combined Method <p><span class="fontstyle0">Data quality has diverse dimensions, of which accuracy is the most important. Data cleaning is one of the preprocessing steps in data mining and consists of detecting errors and repairing them. Noise is a common type of error that occurs in databases. This paper proposes an automated method based on </span><span class="fontstyle2">k</span><span class="fontstyle3">-</span><span class="fontstyle0">means clustering for noise detection. First, each attribute (</span><span class="fontstyle2">A</span><span class="fontstyle4">j</span><span class="fontstyle0">) is temporarily removed from the data and </span><span class="fontstyle2">k</span><span class="fontstyle3">-</span><span class="fontstyle0">means clustering is applied to the other attributes. Thereafter, the </span><span class="fontstyle2">k</span><span class="fontstyle3">-</span><span class="fontstyle0">nearest neighbors method is used in each cluster. After that, a value is predicted for </span><span class="fontstyle2">A</span><span class="fontstyle4">j </span><span class="fontstyle0">in each record by the nearest neighbors. The proposed method detects noisy attributes using the predicted values. Our method is able to identify several noisy values in a record. In addition, it can detect noise in fields with different data types. Experiments show that this method detects, on average, 92% of the noise in the data. 
The proposed method is compared with a noise detection method using association rules. The results indicate that the proposed method improves noise detection by 13% on average.</span></p> Mahdieh Ataeyan Negin Daneshpour Copyright (c) 2021 Statistics, Optimization & Information Computing 2021-06-09 2021-06-09 9 3 665 680 10.19139/soic-2310-5070-879 An Optimal Adaptive Variable Sample Size Scheme for the Multivariate Coefficient of Variation <p>The development of efficient process monitoring systems has always received great attention. Previous studies revealed that the coefficient of variation (CV) is important in ensuring process quality, especially for monitoring a process whose mean and variance are highly correlated. Because almost all industrial process monitoring involves two or more related quality characteristics monitored simultaneously, this paper incorporates the salient feature of the adaptive variable sample size (VSS) scheme into the standard multivariate CV (MCV) chart, yielding the VSS MCV chart. A Markov chain model is developed for the derivation of the chart’s performance measures, i.e., the average run length (ARL), the standard deviation of the run length (SDRL), the average sample size (ASS), the average number of observations to signal (ANOS) and the expected average run length (EARL). 
The numerical comparison shows that the proposed chart outperforms the existing standard MCV chart in detecting small and moderate upward and downward MCV shifts.</p> KHAI WAH KHAW XINYING CHEW MING HA LEE WAI CHUNG YEONG Copyright (c) 2021 Statistics, Optimization & Information Computing 2021-07-11 2021-07-11 9 3 681 693 10.19139/soic-2310-5070-996 Sequential Monte Carlo Filters with Parameters Learning for Commodity Pricing Models <p>In this article, an estimation methodology based on the sequential Monte Carlo algorithm is proposed that jointly estimates the states and parameters. The relationship between the prices of futures contracts and the spot prices of primary products is determined, and the evolution of prices and the volatility of the historical data of the primary market (gold and soybean) are analyzed. Two stochastic models for estimating the states and parameters are considered. The parameters and states describe the physical measure (associated with the price) and the risk-neutral measure (associated with the futures markets); the short-term price dynamics are determined through mean reversion and volatility, and the long-term dynamics through the futures markets. Other characteristics such as seasonal patterns, price spikes, market-dependent volatilities, and non-seasonality can also be observed. The methodology uses parameter learning; specifically, three sequential Monte Carlo (SMC) estimation algorithms for state space models with unknown parameters are proposed. The first method is a particle filter based on the sequential importance sampling with resampling (SISR) algorithm. The second method is the Storvik algorithm [19], in which the states and parameters of the posterior distribution, which has support in low-dimensional spaces, are estimated, and sufficient statistics from the sample of the filtered distribution are considered. 
The third method is Carvalho's Particle Learning and Smoothing (PLS) algorithm [31]. The cash prices of contracts with future delivery dates are analyzed. The results indicate postponement of payment, and the futures prices at different maturity dates are highly correlated with the spot price. Likewise, for contracts with a delivery date in the last periods of 2017, the spot price is found to be lower than the prices of contracts expiring in 12 and 24 months; the opposite occurs for contracts expiring in 1 and 6 months.</p> Saba Infante Luis Sánchez Aracelis Hernández José Marcano Copyright (c) 2021 Statistics, Optimization & Information Computing 2021-06-22 2021-06-22 9 3 694 716 10.19139/soic-2310-5070-814 GH Biplot in Reduced-Rank Regression based on Partial Least Squares <p>One of the challenges facing statisticians is to provide tools that enable researchers to interpret and present their data and conclusions in ways easily understood by the scientific community. One of the tools available for this purpose is a multivariate graphical representation called the reduced rank regression biplot. This biplot describes how to construct a graphical representation in nonsymmetric contexts such as approximations by least squares in multivariate linear regression models of reduced rank. However, multicollinearity invalidates the interpretation of a regression coefficient as the conditional effect of a regressor, given the values of the other regressors, and hence makes biplots of regression coefficients useless. To overcome this problem, Alvarez and Griffin presented a procedure for coefficient estimation in a multivariate regression model of reduced rank in the presence of multicollinearity, based on PLS (Partial Least Squares) and generalized singular value decomposition. 
Based on these same procedures, a biplot construction is now presented for a multivariate regression model of reduced rank in the presence of multicollinearity. This procedure, called the PLSSVD GH biplot, provides a useful data analysis tool that allows visual appraisal of the structure of the dependent and independent variables. This paper defines the procedure and shows several of its properties. It also provides an implementation of the routines in R and presents a real-life application involving data from the FAO food database to illustrate the procedure and its properties.</p> Wilin Alvarez Victor John Griffin Copyright (c) 2021 Statistics, Optimization & Information Computing 2021-07-10 2021-07-10 9 3 717 734 10.19139/soic-2310-5070-1112 Expectation Properties of Generalized Order Statistics from Poisson Lomax Distribution <p>The Poisson Lomax distribution was proposed in [3] as a useful model for analyzing lifetime data. In this paper, we derive recurrence relations for single and product moments of generalized order statistics for this distribution. Further, a characterization of the distribution is carried out. Some deductions and particular cases are also discussed.</p> Haseeb Athar Zubdahe Noor Saima Zarrin Hanadi N.S. Almutairi Copyright (c) 2020 Statistics, Optimization & Information Computing 2020-05-29 2020-05-29 9 3 735 747 10.19139/soic-2310-5070-614 A New Family of Continuous Distributions: Properties, Copulas and Real Life Data Modeling <p>A new family of distributions called the Kumaraswamy Rayleigh family is defined and studied. Some of its relevant statistical properties are derived. Many new bivariate type G families are derived using the Farlie-Gumbel-Morgenstern copula, the modified Farlie-Gumbel-Morgenstern copula, the Clayton copula and Renyi’s entropy copula. The maximum likelihood method is used for estimation. 
Some special models based on the log-logistic, exponential, Weibull, Rayleigh, Pareto type II, Burr type X and Lindley distributions are presented and studied. Three-dimensional skewness and kurtosis plots are presented. A graphical assessment is performed. Two real-life applications are presented to illustrate the flexibility, potential and importance of the new family.</p> Mohamed Refaie Copyright (c) 2021 Statistics, Optimization & Information Computing 2021-07-26 2021-07-26 9 3 748 768 10.19139/soic-2310-5070-1130