A Trivial Linear Discriminant Function

Shuichi Shinmura

Abstract


In this paper, we focus on a new model selection procedure for discriminant analysis. Combining a re-sampling technique with k-fold cross-validation, we develop a k-fold cross-validation method for small samples. With this approach, we obtain the mean error rate in the validation samples (M2) and the 95% confidence interval (CI) of the discriminant coefficients. Moreover, we propose a model selection procedure in which the model with the minimum M2 is chosen as the best model. We apply this new method and procedure to the pass/fail determination of exam scores. In this case, we fix the constant = 1 for seven linear discriminant functions (LDFs) and obtain several good results: 1) The M2 of Fisher's LDF is over 4.6% worse than that of Revised IP-OLDF. 2) A soft-margin SVM with penalty c=1 (SVM1) is worse than the other mathematical programming (MP) based LDFs and logistic regression. 3) The 95% CI of the best discriminant coefficients is obtained. The seven LDFs, with the exception of Fisher's LDF, are almost the same as a trivial LDF for the linearly separable model. Furthermore, if we take the median of the coefficients of the seven LDFs except for Fisher's LDF, the medians are almost the same as the trivial LDF for the linearly separable model.
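The procedure summarized above can be illustrated with a minimal Python sketch. It is only one possible reading of the abstract, not the paper's implementation: the function names (`fit_fisher_ldf`, `resampled_kfold`, `select_model`), the number of resampled copies, the percentile-based CI, and the rescaling of each coefficient vector by its constant are all assumptions made for illustration. A plain Fisher's LDF with equal priors stands in for the seven LDFs compared in the paper.

```python
import numpy as np

def fit_fisher_ldf(X, y):
    """Fisher's LDF with pooled covariance and equal priors; classes are coded -1 / +1."""
    X1, X0 = X[y == 1], X[y == -1]
    m1, m0 = X1.mean(axis=0), X0.mean(axis=0)
    # Pooled within-class covariance matrix.
    S = ((len(X1) - 1) * np.cov(X1, rowvar=False)
         + (len(X0) - 1) * np.cov(X0, rowvar=False)) / (len(X) - 2)
    S = np.atleast_2d(S)  # keeps the single-feature case two-dimensional
    w = np.linalg.pinv(S) @ (m1 - m0)
    b = -0.5 * w @ (m1 + m0)
    return w, b

def resampled_kfold(X, y, k=10, copies=10, seed=0):
    """Resample a small sample with replacement, then run k-fold cross-validation.

    Returns M2 (mean validation error rate) and a 95% percentile interval
    of the coefficients. `copies` and `seed` are hypothetical parameters.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    idx = rng.integers(0, n, size=copies * n)  # enlarged sample drawn with replacement
    Xr, yr = X[idx], y[idx]
    folds = np.array_split(rng.permutation(len(yr)), k)
    errors, coefs = [], []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w, b = fit_fisher_ldf(Xr[train], yr[train])
        pred = np.where(Xr[val] @ w + b >= 0, 1, -1)
        errors.append(np.mean(pred != yr[val]))
        # Assumption: "fixing the constant = 1" is read as dividing by the constant.
        coefs.append(w / b)
    m2 = float(np.mean(errors))
    ci = np.percentile(np.array(coefs), [2.5, 97.5], axis=0)
    return m2, ci

def select_model(X, y, candidate_columns, **kwargs):
    """Choose the feature subset whose model has the minimum M2."""
    scores = {cols: resampled_kfold(X[:, list(cols)], y, **kwargs)[0]
              for cols in candidate_columns}
    best = min(scores, key=scores.get)
    return best, scores
```

In this sketch the same observation can appear in both the training and validation folds because resampling precedes the split, so M2 here may be optimistic; whether the paper's procedure shares this property is not stated in the abstract. Dividing by the constant also assumes the constant is not close to zero.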


Keywords


Fisher’s Linear Discriminant Function (Fisher’s LDF); Logistic Regression; Quadratic Discriminant Function (QDF); Regularized Discriminant Analysis (RDA); Support Vector Machine (SVM); Number of Misclassifications (NM); Minimum NM (MNM); Revised IP-OLDF




DOI: 10.19139/soic.v3i4.151
