The neural network pushdown automaton: Architecture, dynamics and training

  • Chapter
  • First Online:
Adaptive Processing of Sequences and Data Structures (NN 1997)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1387)

Editor information

Editors: C. Lee Giles, Marco Gori

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Sun, G.Z., Giles, C.L., Chen, H.H. (1998). The neural network pushdown automaton: Architecture, dynamics and training. In: Giles, C.L., Gori, M. (eds) Adaptive Processing of Sequences and Data Structures. NN 1997. Lecture Notes in Computer Science, vol 1387. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0054003

  • DOI: https://doi.org/10.1007/BFb0054003

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64341-8

  • Online ISBN: 978-3-540-69752-7

  • eBook Packages: Springer Book Archive
