Learning Unknown Structure in CRFs via Adaptive Gradient Projection Method

  • Wei Xue Nanjing University of Science and Technology
  • Wensheng Zhang Chinese Academy of Sciences
Keywords: Conditional random field, Structure learning, Constrained optimization, Two-point stepsize, Gradient projection, Nonmonotone line search

Abstract

We study the problem of fitting probabilistic graphical models to the given data when the structure is not known. More specifically, we focus on learning unknown structure in conditional random fields, especially learning both the structure and parameters of a conditional random field model simultaneously. To do this, we first formulate the learning problem as a convex minimization problem by adding an l_2-regularization to the node parameters and a group l_1-regularization to the edge parameters, and then a gradient-based projection method is proposed to solve it which combines an adaptive stepsize selection strategy with a nonmonotone line search. Extensive simulation experiments are presented to show the performance of our approach in solving unknown structure learning problems.

Author Biographies

Wei Xue, Nanjing University of Science and Technology
Wensheng Zhang, Chinese Academy of Sciences

References

J. Barzilai and J. M. Borwein, Two-point step size gradient methods, IMA J. Numer. Anal., vol. 8, no. 1, pp. 141-148, 1988.

J. Besag, Efficiency of pseudo-likelihood estimation for simple Gaussian fields, Biometrika, vol. 64, pp. 616-618, 1977

F. Biglari, M. A. Hassan and W. J. Leong, New quasi-Newton methods via higher order tensor models, J. Comput. Appl. Math., vol. 235, pp. 2412-2422, 2011.

F. Biglari and M. Solimanpur, Scaling on the spectral gradient method, J. Optim. Theory Appl., vol. 158, no. 2, pp. 626-635, 2013.

E. G. Birgin, J. M. Mart´ ınez and M. Raydan, Nonmonotone spectral projected gradient methods on convex sets, SIAM J. Optim., vol.10, no. 4, pp. 1196-1211, 2000.

S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Found. Trends Mach. Learn., vol. 3, no. 1, pp. 1-122, 2011.

Y. Dai and R. Fletcher, Projected Barzilai-Borwein methods for large-scale box-constrained quadratic programming, Numer. Math., vol. 100, no. 1, pp. 21-47, 2005.

Y. Dai, J. Yuan and Y. Yuan, Modified two-point stepsize gradient methods for unconstrained optimization, Comput. Optim. Appl., vol.22, no. 1, pp. 103-109, 2002.

L. Grippo, F. Lampariello and S. Lucidi, A nonmonotone line search technique for Newton’s method, SIAM J. Numer. Anal., vol. 23, pp. 707-716, 1986.

S. Huang, Z. Wan and X. Chen, A new nonmonotone line search technique for unconstrained optimization, Numer. Algorithms, vol. 68, no.4, pp. 671-689, 2015.

J. Lafferty, A. McCallum and F. Pereira, Conditional random fields: probabilistic models for segmenting and labeling sequence data, in ICML, pp. 282-289, 2011.

J. Nocedal and S. J. Wright, Convex Optimization, New York: Springer Science & Business Media, 2006.

M. Raydan, The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem, SIAM J. Optim., vol.7, no. 1, pp. 26-33, 1997.

M. Schmidt, K. Murphy, G. Fung and R. Rosales, Structure learning in random fields for heart motion abnormality detection, in CVPR, 2008.

M. Schmidt, E. van den Berg, M. P. Friedlander and K. Murphy, Optimizing costly functions with simple constraints: a limitedmemory projected quasi-Newton algorithm, in AISTATS, 2009.

B. A. Turlach, W. N. Venables and S. J. Wright, Simultaneous variable selection, Technometrics, vol. 47, no. 3, pp. 349-363, 2005.

W. Wang and Q. Wang, Approximated function based spectral gradient algorithm for sparse signal recovery, Stat., Optim. Inf. Comput., vol. 2, no. 1, pp. 10-20, 2014.

Z. Wei, G. Yu, G. Yuan and Z. Lian, The superlinear convergence of a modified BFGS-type method for unconstrained optimization, Comput. Optim. Appl., vol. 29, no. 3, pp. 315-332, 2004.

Y. Xiao, Q. Wang and D. Wang, Notes on the Dai-Yuan-Yuan modified spectral gradient method, J. Comput. Appl. Math., vol. 234, no. 10, pp. 2986-2992, 2010.

G. Yu, L. Qi, Y. Sun and Y. Zhou, Impulse noise removal by a nonmonotone adaptive gradient method, Signal Process., vol. 90, pp. 2891-2897, 2010.

H. Zhang and W. W. Hager, A nonmonotone line search technique and its application to unconstrained optimization, SIAM J. Optim., vol. 14, pp. 1043-1056, 2004.

J. Zhang, N. Deng and L. Chen, New quasi-Newton equation and related methods for unconstrained optimization, J. Optim. Theory Appl., vol. 102, pp. 147-167, 1999.

B. Zhou, L. Gao and Y. Dai, Gradient methods with adaptive stepsizes, Comput. Optim. Appl., vol. 35, pp. 69-86, 2006.

Published
2016-08-30
How to Cite
Xue, W., & Zhang, W. (2016). Learning Unknown Structure in CRFs via Adaptive Gradient Projection Method. Statistics, Optimization & Information Computing, 4(3), 243-251. https://doi.org/10.19139/soic.v4i3.228
Section
Research Articles