Before reading a paper, it is really worth spending a little time choosing a good one and gathering enough cited references

Article link: https://blog.csdn.net/qq_30807319/article/details/128515739

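Translation notes:
A work cited in On the Practical Computational Power of Finite Precision RNNs for Language Recognition:
Hava Siegelmann. 1999. Neural Networks and Analog Computation: Beyond the Turing Limit
The original text is included because my translation is rough; these notes are for my own study only, as an aid to reading the original more accurately (will be removed on request if this infringes).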

Chapter 1
Computational Complexity

Although neural networks are based on continuous operations, we still analyze their computational power using the standard framework of computational complexity. In this chapter we provide the background material required for the search of the computational fundamentals of neural network and analog computational models. Our presentation starts with elementary definitions of computational theory, but gradually builds to advanced topics; each computational term introduced is immediately related to neural models.

[Translator's note: I think what is really being used here are properties that continuity provides during the search, such as ordering at arbitrary precision and measures of convergence and divergence; the strong notion of continuity itself is not necessarily required.]

The science of computing deals with characterizing problems and functions. A function, in the mathematical sense, is a mapping between the elements of one set called the domain, and the elements of another called the range; consequently, a function can be thought of as a set of domain-range ordered pairs. If the function is defined for each domain element, then it is said to be total (also called a map), otherwise it is partial. Unless otherwise specified, in this book we consider partial functions only. Two partial functions with the same domain and range are said to be equal if, for every domain element, either the two functions are undefined or they are of equal value.
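The equality rule for partial functions can be illustrated with a minimal sketch. It assumes a partial function on a finite domain is modeled as a Python dict whose missing keys stand for "undefined"; the names `partial_equal`, `half`, `also_half`, and `total_half` are hypothetical and not from the book.

```python
def partial_equal(f, g, domain):
    """Return True if, for every domain element, f and g are either
    both undefined or both defined with the same value."""
    for x in domain:
        if (x in f) != (x in g):
            return False          # defined in one, undefined in the other
        if x in f and f[x] != g[x]:
            return False          # both defined, but values differ
    return True


domain = range(5)
half = {x: x // 2 for x in domain if x % 2 == 0}  # defined on even inputs only
also_half = {0: 0, 2: 1, 4: 2}                    # same pairs, written out
total_half = {x: x // 2 for x in domain}          # total: defined everywhere

print(partial_equal(half, also_half, domain))
print(partial_equal(half, total_half, domain))
```

The second comparison fails because `total_half` is defined on odd inputs where `half` is not, even though the two agree wherever both are defined.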

A computable function is a unifying rule between "computable objects" (i.e., objects that can be explicitly presented by finite means, like natural, integer or rational numbers) that specifies how to get the second element from the first. A noncomputable function is an infinite set of ordered pairs (of computable objects) for which no "reasonable" rule can be provided. The classical theory of computation is focused on both domains and ranges that are discrete (rather than continuous). The finiteness (discrete domain) of the input and output is a crucial requirement in the theory of computing, as it ensures that the power of a model is based purely on its internal structure, rather than on the precision of the environment. When the range is binary, the functions are called characteristic functions or indicator functions of predicates. In this case, the collection of all domain elements that are mapped to "1" constitutes a language. For all computational purposes, functions and languages are equivalent, as will be explained in Remark 1.8.1 below; we thus use one or the other interchangeably.

[Translator's note on the finiteness requirement: I do not agree here. Both the structure and the precision of the computation constrain the dynamical properties of the model, so precision must have some influence; requiring a result to hold for most precisions would necessarily reduce the feasibility of the model.]

[Translator's note on the definition of a language: isn't this just convergence?]
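As an illustration of how a characteristic function determines a language, here is a small sketch. The predicate (an even number of 1s) and the names `chi_even_ones` and `language_up_to` are assumptions chosen for the example; since a language is generally infinite, only the words up to a chosen length are enumerated.

```python
from itertools import product


def chi_even_ones(w):
    """Characteristic function of a hypothetical predicate:
    1 if the binary word w has an even number of 1s, else 0."""
    return 1 if w.count("1") % 2 == 0 else 0


def language_up_to(chi, max_len, alphabet="01"):
    """Collect every word of length <= max_len that chi maps to 1."""
    words = (
        "".join(p)
        for n in range(max_len + 1)
        for p in product(alphabet, repeat=n)
    )
    return {w for w in words if chi(w) == 1}


print(sorted(language_up_to(chi_even_ones, 2)))
```

Going the other way, membership in the set gives back the indicator: `1 if w in language else 0`.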

The field of computational complexity is primarily dedicated to partitioning sets of functions into a partially ordered hierarchy of computational classes with an increasing degree of difficulty. Historically, in automata theory, functions were characterized by the type of automata that compute them, or in other words, each computational class of functions was associated with such an automaton. Automata are theoretical machines, also called computational models and computational machines. We say that one type of automata is stronger than another if the first can compute (or generate) a set of functions or decide/recognize (or accept) a set of languages, while the latter can compute only a strict subset of these functions or languages. The most popular automaton is the Turing machine; it is neither the weakest nor the strongest, but it is the mathematical equivalent of a digital computer having unbounded resources [HU79, BDG90].

The modern theory of computational complexity does not deal only with the ultimate power of a machine, but also with its expressive power under constraints on resources, such as time and space. Resource constraints are defined as follows. To each domain element $w$ we associate a measure $|w|$ called its size or its length. We then define a partial function on natural numbers $T : \mathbb{N} \rightarrow \mathbb{N}$. $T(|w|)$ is defined only when the computation of all domain elements of this size halts. We say that a machine $M$ computes in time $T$, if for all inputs $w$ for which $T(|w|)$ is defined, $M$ halts after performing not more than $T(|w|)$ steps of computation. Time constraints give rise to time complexity classes of functions or languages, i.e. the class $A$-Time$(T(n))$ consists of all the functions/languages that are computed/decided by some automaton in $A$ in time $T(n)$. Similarly, space constraints (i.e. the number of cells scanned during the computation) give rise to space complexity classes. We will concentrate mainly on time complexity classes, and refer to them only as complexity classes, with the notation $A(T(n))$.
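The condition "M computes in time T" can be checked on a finite sample of inputs with a rough sketch like the one below. The toy machine `scan_steps` (which reads each symbol once and then takes one more step for the end marker) and the helper `computes_in_time` are hypothetical, and a finite sample can only refute a time bound, not prove it.

```python
def scan_steps(w):
    """Step count of a hypothetical machine that reads each symbol of w
    once and then spends one extra step on the end marker."""
    steps = 0
    for _ in w:
        steps += 1
    return steps + 1


def computes_in_time(step_count, T, inputs):
    """Check that the machine halts within T(|w|) steps on every sampled w."""
    return all(step_count(w) <= T(len(w)) for w in inputs)


sample = ["", "0", "0110", "111000"]
print(computes_in_time(scan_steps, lambda n: n + 1, sample))
print(computes_in_time(scan_steps, lambda n: n, sample))
```

The bound $T(n) = n + 1$ holds on the sample, while $T(n) = n$ already fails on the empty word, which needs one step.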

The rest of this chapter is organized as follows. We start with a brief introduction of neural networks; we then present an overview of some types of automata and their complexity classes and discuss their relation with neural computation. Our presentation begins with the weakest model and proceeds to more powerful ones, concluding with the advice Turing machine, which is the most powerful and pertains most directly to our work.

---- The above is the introduction to Chapter 1; below, I translate selected important sections.

1.2 Automata: A General Introduction

The search for mathematical models that reflect various physical control systems began in the field of automata theory [Min67]. The components of the actual system may take many physical forms, such as gears in mechanical devices, relays in electromechanical ones, integrated circuits in modern digital computers, or neurons. The behavior of such systems depends on the underlying physical principles. The description of a system as an automaton requires the identification of a set of states that characterize the status of the device at any moment in time, and the specification of transition rules that determine the next state based on the current state and inputs from the environment. Rules for producing output signals may be incorporated into the model as well.

Although automata were formalized prior to the advent of digital computers, it is useful to think of automata as describing computers, in order to explain their basic principles. In this view, the state of an automaton, at a given time $t$, corresponds to the specification of the complete contents of all RAM memory locations, and also of all other variables that can affect the operation of the computer, such as registers and instruction decoders. We use the symbol $x(t)$ to indicate the state of all variables at time $t$. At each instant (time step) the state is updated, leading to $x(t + 1)$. This update depends on the previous state, instructed by the program being executed, as well as on external inputs, such as keyboard strokes and pointing-device clicks. We use the notation $i(t)$ to summarize the contents of the inputs. (It is mathematically convenient to consider "no input" as a particular type of input.) Thus one postulates an update equation of the type
$x(t + 1) = f(x(t), i(t))$ (1.1)
for some mapping $f$, or in short-hand form, $x^+ = f(x, i)$, where the superscript "+" indicates a time-shift.

Also, at each instant, certain outputs are produced: update of the video display, characters sent to a printer, and so forth; $y(t)$ symbolizes the total output at time $t$. (Again, it is convenient to think of "no output" as a particular type of output.) The mapping $h$ calculates the output at time $t$ given the internal state at that instant
$y(t) = h(x(t))$ . (1.2)

Abstractly, an automaton is defined by the above data. Thus, as a mathematical object, an automaton is simply the quintuple
$M = (Q, I, Y, f, h)$
consisting of sets $Q$, $I$, and $Y$ (called respectively the state, input, and output spaces), as well as two functions
$f: Q \times I \rightarrow Q, \quad h: Q \rightarrow Y$
(called the next-state and the output maps, respectively). The sets $I$ and $Y$ are typically finite.

When defining the input/output map (I/O map for short) produced by an automaton, the input set is
$I = \Sigma \cup \{\$\}$
where $\Sigma$ is the set of possible input letters, and $\$$ is a special letter designating the end of a string, or equivalently, the empty letter (not to be confused with the "no-input" of the RAM example that is a letter in $\Sigma$). Attention is constrained to inputs of the form $i = w\$^\infty$, where "$w$" is a finite sequence over $\Sigma$ ($w \in \Sigma^*$) and "$\$^\infty$" is the infinite sequence of $\$$'s. Given such a finite input string and an initial state, a well-defined output string is obtained by recursively solving the update Equation (1.1) and reading out the corresponding outputs. There are many possible conventions regarding the interpretation of the input/output behavior (map) of an automaton. The response may be defined as the last output symbol produced when the input sequence ends. Alternatively, the computation may end when a special ("accepting" or "final") state has been reached, and in this case the output can be defined as either the output letter generated at the moment of arrival, or as the sequence of output letters accumulated during the computation. In all cases, the output string is either finite or not defined.

[Translator's note: in my own translation I stopped after the definition of $\$$; the rest of this paragraph did not seem useful enough to translate.]
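The update rule $x^+ = f(x, i)$, the output map $y = h(x)$, and the "last output symbol" convention can be put together in a small sketch. The parity automaton below (with $Q = Y = \{0, 1\}$ and $\Sigma = \{0, 1\}$) is a hypothetical example, not one from the book, and the infinite tail of end markers is truncated to a finite number, which is harmless here because the end marker leaves the state unchanged.

```python
END = "$"  # end-of-string letter


def run(f, h, x0, w, end_markers=1):
    """Iterate x+ = f(x, i) over w followed by end markers and
    return the list of outputs y(t) = h(x(t)), starting at t = 0."""
    x = x0
    outputs = [h(x)]
    for i in list(w) + [END] * end_markers:
        x = f(x, i)
        outputs.append(h(x))
    return outputs


def f(x, i):
    """Next-state map of a parity automaton: flip the state on '1'."""
    return 1 - x if i == "1" else x


def h(x):
    """Output map: emit the current state."""
    return x


def response(w):
    """Response under the 'last output symbol' convention."""
    return run(f, h, 0, w)[-1]


print(run(f, h, 0, "10"))
print(response("1011"))
```

Under this convention the response to a word is 1 exactly when it contains an odd number of 1s, so the automaton decides a language in the sense of the earlier characteristic-function discussion.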

---- Below, I translate useful definitions and properties piece by piece, as notes.

1.2.1 Input Sets in Computability Theory

The input of digital computational models, and of the model described in this book, is a stream of symbols (traditionally called words, strings, or sequences) belonging to a finite non-empty set $\Sigma$, commonly called an alphabet. In what follows we will consider the binary alphabet, namely the set $\Sigma = \{0, 1\}$. Using more than two symbols yields at most a linear speedup, while using only one symbol may cause an exponential slow-down. Yet, in some particular cases, we will still concentrate on the single symbol (unary) alphabet $\Sigma = \{0\}$. In these cases the sets of domain elements that are mapped to "1" are not called languages but tally sets instead.
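The effect of alphabet size on input length can be seen by counting the symbols needed to write a number. This is only a sketch of the encoding-length side of the claim, and it assumes the convention that the unary encoding of $n$ uses $n$ symbols; `positional_length` and `unary_length` are hypothetical helper names.

```python
def positional_length(n, base):
    """Number of digits needed to write n >= 0 in the given base (>= 2)."""
    digits = 0
    while n > 0:
        n //= base
        digits += 1
    return max(digits, 1)  # zero still needs one digit


def unary_length(n):
    """Number of symbols in a unary encoding, assuming n is written as n symbols."""
    return n


for n in (10, 1000, 10**6):
    print(n, positional_length(n, 4), positional_length(n, 2), unary_length(n))
```

Moving from base 2 to base 4 roughly halves the length (a constant factor), whereas the unary length grows exponentially in the binary length.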

One can also restrict the precision in the neurons, or add noise or stochasticity; these options are analyzed in Chapters 6 and 9, respectively. In addition to these variants, many others can be specified in analogy with other related models. In particular, we next demonstrate the relation of our model to the field of control.

3.2.2 Stack Operations

We next demonstrate the usefulness of our stack-encoding.

Reading the Top: Assume that a stack holds the value $\alpha = 1011$ from top to bottom that is encoded by the number $g = .3133_4$. As discussed above, the value of $g$ is at least $\frac{3}{4}$ when the top of the stack is “1”, and at most $\frac{1}{2}$ otherwise. The linear operation
$4g - 2$
transfers the range $[\frac{3}{4}, 1)$ that corresponds to the top element being “1” to $[1, 2)$, and the range $[\frac{1}{4}, \frac{1}{2})$ to $[-1, 0)$. Thus, the function
$\mathrm{top}(g) = \sigma(4g - 2)$
saturates the resulting values into $\{0, 1\}$ and provides the value of the top element.
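The top-reading operation can be checked numerically with a short sketch. The encoding used here is inferred from the example above (stack 1011 encoded as the base-4 fraction .3133, so bit 1 becomes digit 3 and bit 0 becomes digit 1) and is an assumption rather than a quotation of the book's definition; the saturation is taken to be the saturated-linear function that clips to $[0, 1]$. Exact rationals are used to avoid floating-point noise.

```python
from fractions import Fraction


def encode(stack):
    """Encode a list of bits (top first) as a base-4 fraction:
    bit b contributes digit 2*b + 1 (assumed from the 1011 -> .3133 example)."""
    return sum(Fraction(2 * b + 1, 4 ** (k + 1)) for k, b in enumerate(stack))


def sigma(x):
    """Saturated-linear function: clip x to the interval [0, 1]."""
    return min(max(x, 0), 1)


def top(g):
    """Read the top element of the encoded stack."""
    return sigma(4 * g - 2)


g = encode([1, 0, 1, 1])
print(g, 4 * g - 2, top(g))
print(top(encode([0, 1, 1])))
```

For the stack 1011 the encoding is 223/256, the linear map gives 95/64, which lies in $[1, 2)$, and saturation yields 1; a stack whose top is 0 lands in $[-1, 0)$ and saturates to 0.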