The Best Model of the Swiss Banknote Data -Validation by the 95% CI of coefficients and t-test of discriminant scores

  • Shuichi Shinmura Seikei Univ.
Keywords: Best Model, Fisher’s Linear Discriminant Function (Fisher’s LDF), Logistic Regression, Support Vector Machine (SVM), Revised IP-OLDF.

Abstract

Discriminant analysis is not an inferential statistical method, because no formulas based on the normal distribution exist for the standard errors (SE) of error rates and discriminant coefficients. In this paper, we propose the “k-fold cross-validation for small samples” method, which yields the 95% confidence intervals (CI) of error rates and discriminant coefficients. The method is a computer-intensive approach that relies on statistical and mathematical programming (MP) software such as JMP and LINGO. With it, we can choose as the best model the one with the minimum mean error rate in the validation samples, M2 (Minimum M2 Standard). In this research, we examine sixteen linearly separable models of the Swiss banknote data using eight linear discriminant functions (LDFs). The M2 of the best model of Revised IP-OLDF is the smallest among all models. In six of the sixteen models, all coefficients of Revised IP-OLDF are rejected by the 95% CI of the discriminant coefficients (Discriminant Coefficient Standard). We also compare the t-values of the discriminant scores: the best model has the largest t-value among the sixteen models (Maximum t-value Standard). We conclude that all three standards support the best model of Revised IP-OLDF.
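
The three standards named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' JMP/LINGO procedure. Several pieces are assumptions made only for illustration: the training sets are bootstrap-style resamples and the original sample serves as the validation set, because the abstract does not spell out the resampling scheme; a least-squares linear discriminant stands in for Revised IP-OLDF; the data are synthetic rather than the Swiss banknote data; the t-value is taken to be a pooled two-sample t-statistic of the scores; and all function names (fit_ldf, resampled_validation, score_t_value, make_demo_data) are hypothetical.

```python
import numpy as np

def add_intercept(X):
    """Append a column of ones so the last coefficient is the intercept."""
    return np.column_stack([X, np.ones(len(X))])

def fit_ldf(X, y):
    """Least-squares linear discriminant (a stand-in, NOT Revised IP-OLDF)."""
    coef, *_ = np.linalg.lstsq(add_intercept(X), y, rcond=None)
    return coef

def error_rate(coef, X, y):
    """Share of cases whose score sign disagrees with the label (+1 / -1)."""
    scores = add_intercept(X) @ coef
    return float(np.mean(np.sign(scores) != y))

def resampled_validation(X, y, k=100, seed=0):
    """Return (M2, ci): the mean validation error rate over k fits and the
    2.5% / 97.5% percentiles of each coefficient across those fits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    coefs, val_errors = [], []
    for _ in range(k):
        idx = rng.integers(0, n, size=n)            # assumed: bootstrap-style training set
        coef = fit_ldf(X[idx], y[idx])
        coefs.append(coef)
        val_errors.append(error_rate(coef, X, y))   # assumed: original sample as validation
    ci = np.percentile(np.array(coefs), [2.5, 97.5], axis=0)
    return float(np.mean(val_errors)), ci

def score_t_value(coef, X, y):
    """Pooled two-sample t-statistic comparing the scores of the two classes."""
    s = add_intercept(X) @ coef
    a, b = s[y == 1], s[y == -1]
    pooled = ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)) / (len(a) + len(b) - 2)
    return float((a.mean() - b.mean()) / np.sqrt(pooled * (1 / len(a) + 1 / len(b))))

def make_demo_data(seed=0):
    """Synthetic two-class data (100 cases per class, 2 variables)."""
    rng = np.random.default_rng(seed)
    X = np.vstack([rng.normal(2.0, 1.0, (100, 2)), rng.normal(-2.0, 1.0, (100, 2))])
    y = np.concatenate([np.ones(100), -np.ones(100)])
    return X, y

if __name__ == "__main__":
    X, y = make_demo_data()
    m2, ci = resampled_validation(X, y)
    print("M2 (mean validation error rate):", m2)
    print("95% percentile CI of coefficients (rows: lower, upper):")
    print(ci)
    print("t-value of discriminant scores:", score_t_value(fit_ldf(X, y), X, y))
```

Under these assumptions, a coefficient whose percentile interval excludes zero would count as rejected, and competing models could be ranked by the smallest M2 and the largest t-value.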

Author Biography

Shuichi Shinmura, Seikei Univ.
Faculty of Economics, Professor

Published
2016-06-01
How to Cite
Shinmura, S. (2016). The Best Model of the Swiss Banknote Data -Validation by the 95% CI of coefficients and t-test of discriminant scores. Statistics, Optimization & Information Computing, 4(2), 118-131. https://doi.org/10.19139/soic.v4i2.178
Section
Research Articles