A special class of languages: "circular" languages. Is it known?

20

Define the following class of "circular" languages over a finite alphabet $\Sigma$. Admittedly, the name already seems to be used for something different in the field of DNA computing. AFAICT, that is a different class of languages.

A language L is circular iff for all words $w$ in $\Sigma^*$ we have:

$w$ belongs to L if and only if, for all integers $k > 0$, $w^k$ belongs to L.

Is this class of languages known? I am interested in the circular languages that are also regular, and in particular in:

a name for them, if they are already known
the decidability of the problem: given an automaton (in particular, a DFA), does the accepted language obey the above definition?
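
To make the definition concrete, here is a minimal Python sketch that searches for violations by brute force. The "if" direction of the definition is automatic (take $k = 1$), so only "w in L implies w^k in L" needs checking. The function name `circularity_violations`, the bounds `max_len` and `max_power`, and the two example languages are assumptions made for illustration. A bounded search like this can only refute circularity, never prove it.

```python
from itertools import product

def circularity_violations(in_language, alphabet, max_len, max_power):
    """Return pairs (w, k) with w in L but w^k not in L.

    Only words of length <= max_len and powers 2 <= k <= max_power are tried,
    and only the smallest failing k is reported for each word.
    """
    found = []
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            w = "".join(letters)
            if not in_language(w):
                continue
            for k in range(2, max_power + 1):
                if not in_language(w * k):
                    found.append((w, k))
                    break
    return found

# Hypothetical example languages over {a, b}.
def even_as(w):
    return w.count("a") % 2 == 0   # closed under taking powers

def short(w):
    return len(w) <= 2             # "a" is in the language, "aaa" is not
```

An empty result for `even_as` is consistent with that language being circular, while `short` produces violations immediately.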

fl.formal-languages automata-theory regular-language

— vincenzoml

1

This is a very interesting question. Two related questions: 1) If we have a regular language L and an associated DFA, can we "circularize" it? 2) Given any language L, is it the case that Circ(L) is regular or has some nice properties?

— Suresh Venkat

P.S. Maybe this is obvious, but why do you think that circular languages are a subclass of the regular languages?

— Suresh Venkat

3

@Suresh, I think he is defining a language to be circular if it a) is regular; b) satisfies a closure property $\forall w \in L, n \in \mathbb{N} : w^n \in L$.

— Peter Taylor

Crossposted on MO.

— Hsien-Chih Chang 張顯

1

Perhaps thanks should not be posted, but this was my first question and I greatly appreciated the quality of the comments, answers and discussion. Thank you.

— vincenzoml

19

In the first part, we show an exponential algorithm for deciding circularity. In the second part, we show that this problem is coNP-hard. In the third part, we show that every circular language is a union of languages of the form $r^+$ (here $r$ could be the empty regex); the union is not necessarily disjoint. In the fourth part, we exhibit a circular language which cannot be written as a disjoint sum $\sum r_i^+$.

Edit: Incorporated some corrections following Mark's comments. In particular, my earlier claims that circularity is coNP-complete or NP-hard have been corrected.

Edit: Corrected the normal form from $\sum r_i^*$ to $\sum r_i^+$. Exhibited an "inherently ambiguous" language.

Continuing Peter Taylor's comment, here is how to decide (extremely inefficiently) whether a language is circular given its DFA. Construct a new DFA whose states are $n$-tuples of the old states. This new DFA runs $n$ copies of the old DFA in parallel.

If the language is not circular then there is a word $w$ such that if we run it through the DFA repeatedly, starting with the initial state $s_0$, then we get states $s_1,\ldots,s_n$ such that $s_1$ is accepting but one of the others is not accepting (if all of them are accepting then the sequence $s_0,\ldots,s_n$ must cycle, so that every power $w^k$ with $k > 0$ is in the language). In other words, we have a path from $s_0,\ldots,s_{n-1}$ to $s_1,\ldots,s_n$ where $s_1$ is accepting but one of the others is not accepting. Conversely, if the language is circular then that cannot happen.

So we have reduced the problem to a simple directed reachability test (just check all possible "bad" $n$-tuples).
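
The reachability test can be sketched in Python as follows. This is a minimal sketch under assumptions that are not in the answer: states are numbered `0..n-1`, `delta` is a dict mapping `(state, letter)` to a state, and the name `is_circular_by_tuples` is made up for illustration. It tries every guess for $s_1,\ldots,s_{n-1}$, so it is only usable on very small DFAs.

```python
from collections import deque
from itertools import product

def is_circular_by_tuples(n, alphabet, delta, start, accepting):
    """Brute-force circularity test on the n-fold product automaton."""
    def step(tup, letter):
        return tuple(delta[(q, letter)] for q in tup)

    # Guess s_1..s_{n-1}; the first component is always the start state s_0.
    for rest in product(range(n), repeat=n - 1):
        source = (start,) + rest
        seen = {source}
        queue = deque([source])
        while queue:
            tup = queue.popleft()
            # tup is reached from source by some word w. It is "bad" if it
            # equals (s_1, ..., s_n), i.e. the source shifted by one position,
            # with s_1 accepting and some later s_i rejecting.
            shifted = tup[:-1] == rest
            if shifted and tup[0] in accepting and any(
                q not in accepting for q in tup[1:]
            ):
                return False
            for letter in alphabet:
                nxt = step(tup, letter)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return True
```

For example, a two-state DFA for "even number of a's" passes the test, while a three-state DFA for "exactly one a" fails it on the guess $(s_0, s_1, s_2) = (0, 1, 2)$ with $w = a$.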

The problem of circularity is coNP-hard. Suppose we are given a 3SAT instance with $n$ variables $\vec{x}$ and $m$ clauses $C_1,\ldots,C_m$. We can assume that $n = m$ (add dummy variables) and that $n$ is prime (otherwise find a prime between $n$ and $2n$ using AKS primality testing, and add dummy variables and clauses).

Consider the following language: "the input is not of the form $\vec{x}_1 \cdots \vec{x}_n$ where $\vec{x}_i$ is a satisfying assignment for $C_i$". It is easy to construct a DFA of size $O(n^2)$ for this language. If the language is not circular then there is a word $w$ in the language, some power of which is not in the language. Since the only words not in the language have length $n^2$, $w$ must be of length $1$ or $n$ (its length divides $n^2$, $n$ is prime, and $w$ itself is in the language). If it is of length $1$, consider $w^n$ instead of $w$ (it is still in the language), so that $w$ is in the language and $w^n$ is not in the language. The fact that $w^n$ is not in the language means that $w$ is a satisfying assignment.

Conversely, any satisfying assignment translates to a word proving the non-circularity of the language: the satisfying assignment $w$ belongs to the language but $w^n$ does not. Thus the language is circular iff the 3SAT instance is unsatisfiable.
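
The witness direction of this reduction can be illustrated with a small Python sketch. Everything concrete here is an assumption made for the example: assignments are bit strings, a clause is a list of signed variable indices, and the instance `clauses` is hypothetical. Membership is tested directly rather than by building the DFA.

```python
def satisfies(assignment, clause):
    """Literal v > 0 means x_v is true; literal -v means x_v is false."""
    return any((assignment[abs(lit) - 1] == "1") == (lit > 0) for lit in clause)

def in_language(word, clauses):
    """True unless word = x_1 ... x_n where each n-bit block x_i satisfies C_i."""
    n = len(clauses)
    if len(word) != n * n:
        return True
    blocks = [word[i * n:(i + 1) * n] for i in range(n)]
    return not all(satisfies(b, c) for b, c in zip(blocks, clauses))

# Hypothetical instance with n = m = 3 (3 is prime).
clauses = [[1, 2, -3], [-1, 2, 3], [1, -2, 3]]
w = "111"  # x1 = x2 = x3 = true satisfies every clause
```

Here `w` is in the language because its length is not $n^2$, while `w * 3` is rejected because each of its three blocks satisfies the matching clause, so `w` witnesses non-circularity.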

In this part, we discuss a normal form for circular languages. Consider some DFA for a circular language $L$. A sequence $C = C_0,\ldots$ is real if $C_0 = s$ (the initial state), all other states are accepting, and $C_i = C_j$ implies $C_{i+1} = C_{j+1}$. Thus every real sequence is eventually periodic, and there are only finitely many real sequences (since the DFA has finitely many states).

We say that a word behaves according to $C$ if the word takes the DFA from state $C_i$ to state $C_{i+1}$, for all $i$. The set $E(C)$ of all such words is regular (the argument is similar to the first part of this answer). Note that $E(C)$ is a subset of $L$.

Given a real sequence $C$, define $C^k$ to be the sequence $C^k(t) = C(kt)$. The sequence $C^k$ is also real. Since there are only finitely many different sequences $C^k$, the language $D(C)$, which is the union of all $E(C^k)$, is also regular.

We claim that $D(C)$ has the property that if $x,y \in D(C)$ then $xy \in D(C)$. Indeed, suppose that $x \in E(C^k)$ and $y \in E(C^l)$. Then $xy \in E(C^{k+l})$. Thus $D(C) = D(C)^+$ can be written in the form $r^+$ for some regular expression $r$.

Every word $w$ in the language corresponds to some real sequence $C$, i.e. there exists a real sequence $C$ that $w$ behaves according to. Thus $L$ is the union of $D(C)$ over all real sequences $C$. Therefore every circular language has a representation of the form $\sum r_i^+$. Conversely, every such language is circular (trivially).
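
The converse direction is easy to spot-check in Python. The sketch below assumes a hypothetical language of the form $r_1^+ + r_2^+$ with $r_1 = ab$ and $r_2 = a|bb$, written as a Python `re` pattern, and tests a bounded set of words and powers. It is a sanity check, not a proof.

```python
import re
from itertools import product

# Hypothetical language (ab)+ | (a|bb)+ over the alphabet {a, b}.
PATTERN = re.compile(r"(?:ab)+|(?:a|bb)+")

def in_language(w):
    return PATTERN.fullmatch(w) is not None

def powers_stay_inside(max_len=6, max_power=4):
    """Check that w in L implies w^k in L for all short w and small k."""
    for length in range(1, max_len + 1):
        for letters in product("ab", repeat=length):
            w = "".join(letters)
            if not in_language(w):
                continue
            if not all(in_language(w * k) for k in range(2, max_power + 1)):
                return False
    return True
```

The check succeeds because each $r_i^+$ is closed under concatenation, so a power of a word in $r_i^+$ stays in the same $r_i^+$.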

Consider the circular language $L$ of all words over $a,b$ that contain either an even number of $a$'s or an even number of $b$'s (or both). We show that it cannot be written as a disjoint sum $\sum r_i^+$; by "disjoint" we mean that $r_i^+ \cap r_j^+ = \varnothing$.

Let $N_i$ be the size of some DFA for $r_i^+$, and let $N > \max N_i$ be some odd integer. Consider $x = a^N b^{N!}$. Since $x \in L$, $x \in r_i^+$ for some $i$. By the pumping lemma, we can pump a prefix of $x$ of length at most $N$. Thus $r_i^+$ generates $z = a^{N!} b^{N!}$. Similarly, $y = a^{N!} b^N$ is generated by some $r_j^+$, which also generates $z$. Note that $i \neq j$ since $xy \notin L$ (if $i = j$ then $xy$ would be in $r_i^+$, which is contained in $L$). Thus the representation cannot be disjoint.
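
The parity facts used in this argument can be checked numerically. The Python sketch below uses $N = 3$ purely for illustration; the proof itself needs $N$ to be odd and larger than every $N_i$, which this toy value does not model.

```python
from math import factorial

def in_L(w):
    """Even number of a's or even number of b's (or both)."""
    return w.count("a") % 2 == 0 or w.count("b") % 2 == 0

N = 3                 # illustrative odd value only
F = factorial(N)      # N! is even for N >= 2
x = "a" * N + "b" * F
y = "a" * F + "b" * N
z = "a" * F + "b" * F
```

With these values `x`, `y` and `z` all lie in $L$, while `x + y` contains an odd number of both letters and so falls outside $L$.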

— Yuval Filmus

There seem to be a number of errors here. You're reducing from UNSAT, not SAT, so you're showing it's coNP-hard. What's your polynomial time witness for (non)-membership?

— Mark Reitblatt

"Since the only words not in the language have length $n^2$" Shouldn't that be $nm$?

— Mark Reitblatt

I don't think it's "trivially in coNP". At least, it's not trivially obvious to me. The "obvious" certificate would be a string $l$ in the language, and a power $k$ such that $l^k$ isn't in the language. But it's not immediately obvious to me why such a word must be polynomially-sized. Maybe it's by a simple fact of automata theory that I'm overlooking.

— Mark Reitblatt

An even more serious apparent flaw is that you jump from each clause being satisfiable individually to the whole formula being satisfiable. Unless I am misreading, of course.

— Mark Reitblatt

I agree that it's not clear that circularity is in coNP. On the other hand, I see no problems in the rest of the argument (now that I've put $n = m$). If each clause is satisfied by the same assignment, then the 3SAT instance is satisfied by this assignment.

— Yuval Filmus

17

Here are some papers that discuss these languages:

Thierry Cachat, The power of one-letter rational languages, DLT 2001, Springer LNCS #2295 (2002), 145-154.

S. Horváth, P. Leupold, and G. Lischke, Roots and powers of regular languages, DLT 2002, Springer LNCS #2450 (2003), 220-230.

H. Bordihn, Context-freeness of the power of context-free languages is undecidable, TCS 314 (2004), 445-449.

— Jeffrey Shallit

6

@Dave Clarke, L = a*|b* would be circular, but L* would be (a|b)*.
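
A quick way to see the difference, sketched with Python's `re` module (the helper name `in_L` is made up for illustration): $L$ is closed under powers of a single word but not under concatenation of two different words, which is why $L^*$ is strictly larger.

```python
import re

L = re.compile(r"a*|b*")
L_STAR = re.compile(r"(?:a|b)*")

def in_L(w):
    return L.fullmatch(w) is not None

# "a" and "b" are both in L, and so is every power of each of them,
# but their concatenation "ab" is only in L*.
powers_ok = all(in_L(w * k) for w in ("a", "b", "aaa") for k in range(1, 6))
concat_in_L = in_L("a" + "b")
concat_in_L_star = L_STAR.fullmatch("ab") is not None
```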

In terms of decidability, a language $L$ is circular if there is an $L'$ such that $L$ is the closure under + of $L'$ or if it is a finite union of circular languages.

(I'm dying to redefine "circular" replacing your $>$ with $\ge$ . It simplifies things a lot. We can then characterise the circular languages as those for which there exists a NDFA whose starting state has only epsilon-transitions to accepting states and has an epsilon-transition to each accepting state).

— Peter Taylor

You are right. I've removed my incorrect post.

— Dave Clarke

Regarding adaption with $\geq$: I am thinking that a minimal DFA should always have exactly one accepting state, namely the start state. Maybe more accepting states can happen, but then they need an $\varepsilon$-transition to the start state.

— Raphael

1

@Raphael, consider again L = a*|b*. A DFA whose start state is the only accepting state and which accepts a and b must accept (a|b)*.

— Peter Taylor

On the question of decidability, again: suppose you have a DFA with $n$ states of which $n_a$ are accepting. Suppose it accepts a word $w$, and also accepts $w^2$, $w^3$, ..., $w^{n_a+1}$. Then it accepts $w^x$ for $x > 0$. (Proof is a straightforward application of the pigeonhole principle). If it's possible to show that the minimal (minimising $|w|$) counterexample ($w$, $x$) to the circularity of the language accepted by the DFA has length bounded by a function of $n$ then brute force testing is possible. I suspect that $|w| \le n+1$, but I haven't proved it.
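
The pigeonhole observation gives a finite test for a single word, sketched below in Python. The dict-based DFA encoding, the function names, and the example automaton (a counter modulo 3 over the one-letter alphabet `a`, accepting lengths congruent to 0 or 1) are assumptions made for illustration.

```python
def run(delta, state, word):
    """Follow the DFA transitions for every letter of word."""
    for letter in word:
        state = delta[(state, letter)]
    return state

def accepts_all_positive_powers(delta, start, accepting, w):
    """Check w, w^2, ..., w^(n_a + 1), where n_a = len(accepting).

    If all of these are accepted, two of the n_a + 1 states visited must
    coincide, so the run is periodic from then on and stays accepting.
    """
    state = start
    for _ in range(len(accepting) + 1):
        state = run(delta, state, w)
        if state not in accepting:
            return False
    return True

# Hypothetical DFA: counts letters modulo 3, accepts remainders 0 and 1.
mod3 = {(q, "a"): (q + 1) % 3 for q in range(3)}
mod3_accepting = {0, 1}
```

On this automaton the word `aaa` passes, whereas `a` and `aaaa` are accepted themselves but fail at their second power.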

— Peter Taylor

To follow up on @Raphael's idea above. The idea of start state = only accept state is wrong for this problem, but it does capture some interesting property. When M is a minDFA, the start state is the only accept state if and only if L(M) is the Kleene star of a prefix-free language. This is one of my favorite DFA trivia tidbits and thus I am quick to share it! ;)

— mikero

5

Edit: A complete (simplified) PSPACE-completeness proof appears below.

Two updates. First, the normal form described in my other answer appears already in a paper by Calbrix and Nivat titled Prefix and period languages of rational $\omega$-languages, unfortunately not available online.

Second, deciding whether a language is circular given its DFA is PSPACE-complete.

Circularity in PSPACE. Since NPSPACE=PSPACE by Savitch's theorem, it is enough to give an NPSPACE algorithm for non-circularity. Let $A = (Q,\Sigma,\delta,q_0,F)$ be a DFA with $|Q|=n$ states. The fact that the syntactic monoid of $L(A)$ has size at most $n^n$ implies that if $L(A)$ is not circular then there is a word $w$ of length at most $n^n$ such that $w \in L(A)$ but $w^k \notin L(A)$ for some $k \leq n$. The algorithm guesses $w$ and computes $\delta_w(q) = \delta(q,w)$ for all $q \in Q$, using $O(n\log n)$ space (used to count up to $n^n$). It then verifies that $\delta_w(q_0) \in F$ but $\delta_w^{(k)}(q_0) \notin F$ for some $k \leq n$.
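
The same check can be run deterministically by enumerating every transformation $\delta_w$ instead of guessing $w$. The Python sketch below does that. It is not the NPSPACE procedure itself: it stores up to $n^n$ transformations, so it trades the polynomial space bound for simplicity. The DFA encoding (states `0..n-1`, `delta` as a dict) and the name `is_circular` are assumptions.

```python
from collections import deque

def is_circular(n, alphabet, delta, start, accepting):
    """Explore all maps f with f[q] = delta(q, w) for some word w."""
    identity = tuple(range(n))        # the transformation of the empty word
    seen = {identity}
    queue = deque([identity])
    while queue:
        f = queue.popleft()
        q = f[start]
        if q in accepting:            # w is accepted
            for _ in range(n):        # follow w^2, ..., w^(n+1)
                q = f[q]
                if q not in accepting:
                    return False
        for letter in alphabet:
            # Transformation of the word w followed by one more letter.
            g = tuple(delta[(f[p], letter)] for p in range(n))
            if g not in seen:
                seen.add(g)
                queue.append(g)
    return True
```

Checking powers up to $n+1$ is slightly more than needed, since the states reached by positive powers already all appear among the first $n$ of them.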

Circularity is PSPACE-hard. Kozen showed in his classic 1977 paper Lower bounds for natural proof systems that it is PSPACE-hard to decide, given a list of DFAs, whether the intersection of the languages accepted by them is empty. We reduce this problem to circularity. Given binary DFAs $A_1,\ldots,A_n$ , we find a prime $p \in [n,2n]$ and construct a ternary DFA $A$ accepting the language

$L(A) = \overline{\{2w_12w_2\cdots2w_p : w_i \in L(A_{1+(i\mod{n})})\}}.$

(With some more effort, we can make $A$ binary as well.) It is not difficult to see (using the fact that $p$ is prime) that $L(A)$ is circular if and only if the intersection $L(A_1) \cap \cdots \cap L(A_n)$ is empty.
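
One direction of this equivalence can be illustrated in Python. In the sketch below the DFAs $A_j$ are replaced by plain membership predicates, which is an assumption made to keep the example short; the function name and the two example languages are likewise hypothetical. A word $w$ in the intersection gives $u = 2w$ with $u \in L(A)$ but $u^p \notin L(A)$.

```python
def in_reduced_language(word, members, p):
    """Membership in L(A); members[j](w) decides w in L(A_{j+1}).

    word is a string over the ternary alphabet {0, 1, 2}.
    """
    n = len(members)
    if not word.startswith("2") or word.count("2") != p:
        return True
    blocks = word[1:].split("2")      # w_1, ..., w_p
    # w_i must lie in L(A_{1 + (i mod n)}); members is 0-indexed.
    return not all(members[i % n](blocks[i - 1]) for i in range(1, p + 1))

# Hypothetical binary languages: "even number of 1s" and "ends with 1".
members = [lambda w: w.count("1") % 2 == 0, lambda w: w.endswith("1")]
p = 3                                 # a prime in [n, 2n] for n = 2
u = "2" + "11"                        # "11" lies in both languages
```

Here `u` contains a single `2`, so it is in $L(A)$, while `u * p` splits into three blocks `11` that all pass their membership tests, so it is not.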

— Yuval Filmus

0

Every $s \in L$ of length $p>0$ can be written as $xy^{i}z$ where $x = z = \epsilon$ , $y = w \neq \epsilon$ . It's obvious that $|xy| \leq p$ and $|y| = |w| > 0$ . It follows that the language is regular for non-empty inputs, by the pumping lemma.

For $w= \epsilon$ , the definition holds, since a NDFA that accepts the empty string will also accept any number of empty strings.

The union of the above languages is the language L and since regular languages are closed under union, it follows that every circular language is regular.

By Rice's theorem, $CIRCULARITY/TM$ is undecidable. The proof is similar to regularity.

— chazisop

1

The pumping lemma is a necessary, but not sufficient, condition for regularity. In particular, there are nonregular languages satisfying the pumping condition. Also, Rice's theorem would say that $\{\langle M\rangle\vert L(M)\text{ is circular}\}$ is undecidable. This does not mean that $\{\langle D\rangle\vert L(D)\text{ is circular}\}$ is undecidable (where $D$ is a DFA, $M$ a TM)! For instance, emptiness testing for DFAs is decidable, while emptiness testing for TMs is not.
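
DFA emptiness is decidable by plain graph reachability, as in the minimal Python sketch below. The dict-based DFA encoding and the function name are assumptions made for illustration.

```python
from collections import deque

def dfa_language_is_empty(alphabet, delta, start, accepting):
    """The language is empty iff no accepting state is reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        q = queue.popleft()
        if q in accepting:
            return False
        for letter in alphabet:
            nxt = delta[(q, letter)]
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True
```

The search visits each state at most once, so it always terminates, which is exactly what fails for Turing machines.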

— alpoge

1

Here's a non-computable circular language. Let $D = \{ 0^x 1 : x \in R\}$, where $R$ is some non-computable language (e.g. codes of halting TMs). Then $D^*$ is circular but clearly non-computable (an oracle for $D^*$ can be used to decide $R$).

— Yuval Filmus

2

@Peter, have you read this answer? It was trying to prove that any circular language (without the condition of regularity) is regular.

— Yuval Filmus

1

@Yuval, my mistake. @chazisop, the pumping lemma is useful for proving non-regularity of languages, but not regularity. (Besides, the assertion of your first sentence reduces to "Every $s \in L$ of length $p > 0$ can be written as $y^i$ where $y \ne \epsilon$", which is clearly false).

— Peter Taylor

1

Yes, I use CIRCULARITY/TM to refer to this. CIRCULARITY/DFA is probably decidable.

— chazisop