# 正则语言（regular language）

## 有限自动机 finite automaton

A **deterministic finite automaton (DFA)** is a 5-tuple (Q, Σ, δ, q_0, F ), where

- $Q$ is a finite set called the states,
- Σ is a finite set called the alphabet,
- δ : Q × Σ→ Q is the transition function,
- q_0 ∈ Q is the start state, and
- F ⊆ Q is the set of accept states.

We can describe M1 formally by writing M1 = (Q, Σ, δ, q1 , F ), where

- $Q = {q1,q2,q3}$,
- Σ = {0,1},
- δ is described as

0 | 1 | |
---|---|---|

q1 | q1 | q2 |

q2 | q3 | q2 |

q3 | q2 | q2 |

- $q1$ is the start state, and
- $F = {q2}$.

If $A$ is the set of all strings that machine $M$ accepts, we say that $A$ is the **language** of machine $M$ and write $L(M ) = A$. We say that $M$ recognizes $A$ or that $M$ accepts $A$.

Let M = (Q,Σ,δ,q_0,F) be a finite automaton and let w = w_1w_2 ··· w_n be a string where each $w_i$ is a member of the alphabet Σ. Then $M$ accepts $w$ if a sequence of states $r_0,r_1,...,r_n$ in $Q$ exists with three conditions:

- $r_0 = q_0$,
- δ(r_i,w_i+1)=r_i+1, for i=0,...,n−1, and
- r_n ∈ F .

### regular language

A language is called a **regular** **language** if some finite automaton recognizes it.

## 有限自动机 finite automaton

A **nondeterministic finite automaton (NFA)** is a 5-tuple (Q,Σ,δ,q_0,F), where

- $Q$ is a finite set of states,
- Σ is a finite alphabet,
- δ : Q × Σ_ε → P (Q) is the transition function,
- q_0 ∈ Q is the start state, and
- F ⊆ Q is the set of accept states.

### EQUIVALENCE OF NFAS AND DFAS

Deterministic and nondeterministic finite automata recognize the same class of languages.

## Regular expression

Say that R is a regular expression if $R$ is

- $a$ for some $a$ in the alphabet Σ,
- ε,
- ∅,
- (R1 ∪ R2), where $R1$ and $R2$ are regular expressions,
- (R1 ◦ R2), where $R1$ and $R2$ are regular expressions, or
- (R^∗), where $R1$ is a regular expression.

### EQUIVALENCE WITH FINITE AUTOMATA

Regular expressions and finite automata are equivalent in their descriptive power.

**THEOREM:** A language is regular if and only if some regular expression describes it.

**PROOF IDEA:** We need to show that if a language A is regular, a regular expression describes it. Because A is regular, it is accepted by a DFA. We describe a procedure for converting DFAs into equivalent regular expressions.

We break this procedure into two parts, using a new type of finite automaton called a **generalized nondeterministic finite automaton, GNFA**.

- First we show how to convert DFAs into GNFAs, and
- then GNFAs into regular expressions.

#### GNFA

Generalized nondeterministic finite automata are simply nondeterministic fi- nite automata wherein the transition arrows may have any regular expressions as labels, instead of only members of the alphabet or ε. The GNFA reads blocks of symbols from the input, not necessarily just one symbol at a time as in an ordi- nary NFA. The GNFA moves along a transition arrow connecting two states by reading a block of symbols from the input, which themselves constitute a string described by the regular expression on that arrow. A GNFA is nondeterministic and so may have several different ways to process the same input string. It ac- cepts its input if its processing can cause the GNFA to be in an accept state at the end of the input.

**Definition:** A generalized nondeterministic finite automaton is a 5-tuple,(Q, Σ, δ, q_{start}, q_{accept}), where

- $Q$ is the finite set of states,
- δ:(Q − \{q_{accept}\} × Q − \{q_{start}\}) →R is the transition function,
- $q_{start}$ is the start state, and
- $q_{accept}$ is the accept state.

The symbol $R$ is the collection of all regular expressions over the alphabet Σ, and $q_{start}$ and $q_{accept}$ are the start and accept states.