A wish for splitting a junta

16

We say that a Boolean function $f: \{0,1\}^n \to \{0,1\}$ is a $k$-junta if $f$ has at most $k$ influencing variables.

Let $f: \{0,1\}^n \to \{0,1\}$ be a $2k$-junta. Denote the variables of $f$ by $x_1, x_2, \ldots, x_n$. Fix

$S_1 = \left\{ x_1, x_2, \ldots, x_{\frac{n}{2}} \right\},\quad S_2 = \left\{ x_{\frac{n}{2} + 1}, x_{\frac{n}{2} + 2}, \ldots, x_n \right\}.$

Clearly, there exists $S \in \{S_1, S_2\}$ such that $S$ contains at most $k$ of the influencing variables of $f$.

Now let $\epsilon > 0$, and assume that $f: \{0,1\}^n \to \{0,1\}$ is $\epsilon$-far from every $2k$-junta (i.e., one has to change at least an $\epsilon$ fraction of the values of $f$ in order to make it a $2k$-junta). Can we make a "robust" version of the statement above? That is, is there a universal constant $c$ and a set $S \in \{S_1, S_2\}$ such that $f$ is $\frac{\epsilon}{c}$-far from every function that has at most $k$ influencing variables in $S$?

Note: in the original formulation of the question, $c$ was fixed to be $2$. Neal's example shows that such a value of $c$ does not suffice. However, since in property testing we are usually not too concerned with constants, I relaxed the condition a bit.

Can you clarify your terms? Is a variable "influencing" unless the value of $f$ is always independent of the variable? Does "change a value of $f$" mean changing one of the values $f(x)$ for some particular $x$?

— Neal Young

Sure. A variable $x_i$ is influencing if there exists an $n$-bit string $y$ such that $f(y) \neq f(y')$, where $y'$ is the string $y$ with its $i$'th coordinate flipped. Changing a value of $f$ means making a change in its truth table.
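The definitions in this exchange can be checked by brute force on small examples. The following is a minimal sketch, not part of the original thread: the function names `influencing_variables`, `is_junta`, and `distance`, and the representation of $f$ as a Python callable on 0/1 tuples, are assumptions made for illustration. It enumerates all $2^n$ inputs, so it is only practical for small $n$.

```python
from itertools import product

def influencing_variables(f, n):
    """Return the indices i for which some input y has f(y) != f(y with bit i flipped)."""
    influencing = set()
    for y in product((0, 1), repeat=n):
        for i in range(n):
            y_flipped = y[:i] + (1 - y[i],) + y[i + 1:]
            if f(y) != f(y_flipped):
                influencing.add(i)
    return influencing

def is_junta(f, n, k):
    """True if f has at most k influencing variables."""
    return len(influencing_variables(f, n)) <= k

def distance(f, g, n):
    """Fraction of truth-table entries on which f and g differ."""
    inputs = list(product((0, 1), repeat=n))
    return sum(f(y) != g(y) for y in inputs) / len(inputs)

# Hypothetical example: a 4-variable function that depends only on bits 0 and 2.
example = lambda y: y[0] ^ y[2]
```

Here `example` is a 2-junta but not a 1-junta, and it is at distance $1/2$ from the all-zero function.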

17

The answer is "yes". The proof is by contradiction.

For notational convenience, denote the first $n/2$ variables by $x$ and the second $n/2$ variables by $y$. Suppose that $f(x,y)$ is $\delta$-close to a function $f_1(x,y)$ which depends on only $k$ coordinates of $x$. Denote its influential coordinates by $T_1$. Similarly, suppose that $f(x,y)$ is $\delta$-close to a function $f_2(x,y)$ which depends on only $k$ coordinates of $y$. Denote its influential coordinates by $T_2$. We need to prove that $f$ is $4\delta$-close to a $2k$-junta $\tilde f(x,y)$.

Let us say that $(x_1,y_1) \sim (x_2,y_2)$ if $x_1$ and $x_2$ agree on all coordinates in $T_1$ and $y_1$ and $y_2$ agree on all coordinates in $T_2$. We choose a representative uniformly at random from each equivalence class. Let $(\bar x, \bar y)$ be the representative for the class of $(x,y)$. Define $\tilde f$ as follows:

$\tilde f(x,y) = f(\bar x, \bar y).$

It is clear that $\tilde f$ is a $2k$-junta (it depends only on the variables in $T_1 \cup T_2$). We shall prove that it is at distance at most $4\delta$ from $f$ in expectation.

We want to prove that

$\Pr_{\tilde f}(\Pr_{x,y}(\tilde f(x,y) \neq f(x,y))) = \Pr(f(\bar x, \bar y) \neq f(x,y)) \leq 4\delta,$

where $x$ and $y$ are chosen uniformly at random. Consider a random vector $\tilde x$ obtained from $x$ by keeping all bits in $T_1$ and randomly flipping all bits not in $T_1$ (that is, resampling them uniformly), and a vector $\tilde y$ defined similarly from $y$ and $T_2$. Note that

$\Pr(\tilde f(x,y) \neq f(x,y)) = \Pr(f(\bar x, \bar y) \neq f(x,y))= \Pr(f(\tilde x, \tilde y) \neq f(x,y)).$

We have

$\Pr(f(x,y) \neq f(\tilde x, y)) \leq \Pr(f(x,y) \neq f_1(x, y)) + \Pr(f_1(x,y) \neq f_1(\tilde x, y)) + \Pr(f_1(\tilde x,y) \neq f(\tilde x, y)) \leq \delta + 0 + \delta = 2\delta.$

The middle term is $0$ because $f_1$ depends on $x$ only through the coordinates in $T_1$, on which $x$ and $\tilde x$ agree; the last term is at most $\delta$ because $(\tilde x, y)$ is itself uniformly distributed. Similarly, $\Pr(f(\tilde x,y) \neq f(\tilde x, \tilde y)) \leq 2\delta$. Combining the two bounds, we have

$\Pr(f(\bar x, \bar y) \neq f(x,y)) \leq 4\delta.$

QED

It is easy to "derandomize" this proof. For every $(x,y)$, let $\tilde f(x,y) = 1$ if $f(x',y') = 1$ for most $(x',y')$ in the equivalence class of $(x,y)$, and $\tilde f(x,y) = 0$ otherwise.
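The majority-vote rounding described above can be sketched in a few lines. This is an illustrative brute-force version, not code from the answer: the names `majority_junta` and `distance` are hypothetical, inputs are 0/1 tuples of length $n$ with $x$ and $y$ concatenated, `T` stands for $T_1 \cup T_2$ as a set of coordinate indices, and ties are broken toward 0 as an arbitrary choice.

```python
from itertools import product
from collections import defaultdict

def majority_junta(f, n, T):
    """Round f to a function of the coordinates in T by majority vote.

    Inputs that agree on every coordinate in T form one equivalence class;
    the returned function outputs the majority value of f on the class
    of its argument (ties go to 0).
    """
    T = sorted(T)
    ones = defaultdict(int)   # number of inputs in the class where f = 1
    total = defaultdict(int)  # size of the class
    for w in product((0, 1), repeat=n):
        key = tuple(w[i] for i in T)
        ones[key] += f(w)
        total[key] += 1

    def f_tilde(w):
        key = tuple(w[i] for i in T)
        return 1 if 2 * ones[key] > total[key] else 0

    return f_tilde

def distance(f, g, n):
    """Fraction of inputs on which f and g differ."""
    inputs = list(product((0, 1), repeat=n))
    return sum(f(w) != g(w) for w in inputs) / len(inputs)

# Hypothetical example: XOR of bits 0 and 2, corrupted on a single input.
noisy = lambda w: (w[0] ^ w[2]) ^ (1 if w == (1, 1, 1, 1) else 0)
rounded = majority_junta(noisy, 4, {0, 2})
```

In this example the rounding recovers the XOR of bits 0 and 2, which differs from `noisy` on exactly one of the 16 inputs.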

— Yuri

12

The smallest $c$ that works is $c = \frac{1}{\sqrt 2 - 1} \approx 2.41$.

Lemmas 1 and 2 show that the bound holds for this $c$. Lemma 3 shows that this bound is tight.

(In comparison, Yuri's elegant probabilistic argument gives $c=4$.)

Let $c=\frac{1}{\sqrt 2 - 1}$. Lemma 1 gives the upper bound for $k=0$.

Lemma 1: If $f$ is $\epsilon_g$-near a function $g$ that has no influencing variables in $S_2$, and $f$ is $\epsilon_h$-near a function $h$ that has no influencing variables in $S_1$, then $f$ is $\epsilon$-near a constant function, where $\epsilon \le c\,\frac{\epsilon_g+\epsilon_h}{2}$.

Proof. Let $\epsilon$ be the distance from $f$ to the nearest constant function. For contradiction, assume that $\epsilon$ does not satisfy the claimed inequality. Let $y=(x_1,x_2,\ldots,x_{n/2})$ and $z=(x_{n/2+1},\ldots,x_n)$, and write $f$, $g$, and $h$ as $f(y,z)$, $g(y,z)$ and $h(y,z)$, so $g(y,z)$ is independent of $z$ and $h(y,z)$ is independent of $y$.

(I find it helpful to visualize $f$ as the edge-labeling of the complete bipartite graph with vertex sets $\{y\}$ and $\{z\}$ , where $g$ gives a vertex-labeling of $\{y\}$ , and $h$ gives a vertex-labeling of $\{z\}$ .)

Let $g_0$ be the fraction of pairs $(y,z)$ such that $g(y,z) = 0$ . Let $g_1=1-g_0$ be the fraction of pairs such that $g(y,z) = 1$ . Likewise let $h_0$ be the fraction of pairs such that $h(y,z) = 0$ , and let $h_1$ be the fraction of pairs such that $h(y,z) = 1$ .

Without loss of generality, assume that, for any pair such that $g(y,z) = h(y,z)$, it also holds that $f(y,z) = g(y,z) = h(y,z)$. (Otherwise, toggling the value of $f(y,z)$ allows us to decrease both $\epsilon_g$ and $\epsilon_h$ by $1/2^n$, while decreasing $\epsilon$ by at most $1/2^n$, so the resulting function is still a counter-example.) Say any such pair is "in agreement".

The distance from $f$ to $g$ plus the distance from $f$ to $h$ is the fraction of $(y,z)$ pairs that are not in agreement, since on each such pair $f$ differs from exactly one of $g$ and $h$. That is, $\epsilon_g + \epsilon_h = g_0 h_1 + g_1 h_0$.

The distance from $f$ to the all-zero function is at most $1 - g_0 h_0$ .

The distance from $f$ to the all-ones function is at most $1-g_1 h_1$ .

Further, the distance from $f$ to the nearest constant function is at most $1/2$ .

Thus, the ratio $\epsilon/(\epsilon_g+\epsilon_h)$ is at most

$\frac{\min(1/2,\ 1-g_0 h_0,\ 1-g_1 h_1)}{g_0 h_1 + g_1 h_0},$

where $g_0,h_0 \in [0,1]$, $g_1 = 1-g_0$, and $h_1=1-h_0$.

By calculation, this ratio is at most $\frac{1}{2(\sqrt 2 - 1)} = c/2$ . QED
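The "by calculation" step can be sanity-checked numerically. The sketch below is only a grid search, not a proof, and the helper names `ratio_bound` and `grid_maximum` are hypothetical. It evaluates the ratio on a uniform grid of $(g_0, h_0)$ values, skipping the two corners where the denominator vanishes, and compares the largest value found with $c/2$.

```python
import math

def ratio_bound(g0, h0):
    """Upper bound on eps / (eps_g + eps_h) as a function of g0 and h0.

    Returns None when the denominator is zero (g0 = h0 = 0 or g0 = h0 = 1).
    """
    g1, h1 = 1.0 - g0, 1.0 - h0
    disagreement = g0 * h1 + g1 * h0
    if disagreement == 0.0:
        return None
    return min(0.5, 1.0 - g0 * h0, 1.0 - g1 * h1) / disagreement

def grid_maximum(steps=200):
    """Largest value of ratio_bound over a (steps+1) x (steps+1) grid on [0,1]^2."""
    best = 0.0
    for i in range(steps + 1):
        for j in range(steps + 1):
            value = ratio_bound(i / steps, j / steps)
            if value is not None and value > best:
                best = value
    return best

C = 1.0 / (math.sqrt(2.0) - 1.0)  # the constant c from the text
```

On the grid the maximum stays below $c/2$, and evaluating `ratio_bound` at $g_0 = h_0 = 1/\sqrt 2$ gives $c/2$ up to rounding, which matches the construction in Lemma 3.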

Lemma 2 extends Lemma 1 to general $k$ by arguing pointwise, over every possible setting of the $2k$ influencing variables. Recall that $c=\frac{1}{\sqrt 2 - 1}$ .

Lemma 2: Fix any $k$. If $f$ is $\epsilon_g$-near a function $g$ that has at most $k$ influencing variables in $S_2$, and $f$ is $\epsilon_h$-near a function $h$ that has at most $k$ influencing variables in $S_1$, then $f$ is $\epsilon$-near a function $\hat f$ that has at most $2k$ influencing variables, where $\epsilon \le c\,\frac{\epsilon_g+\epsilon_h}{2}$.

Proof. Express $f$ as $f(a,y,b,z)$ where $(a,y)$ contains the variables in $S_1$ with $a$ containing those that influence $h$ , while $(b,z)$ contains the variables in $S_2$ with $b$ containing those influencing $g$ . So $g(a,y,b,z)$ is independent of $z$ , and $h(a,y,b,z)$ is independent of $y$ .

For each fixed value of $a$ and $b$ , define $F_{ab}(y,z) = f(a,y,b,z)$ , and define $G_{ab}$ and $H_{ab}$ similarly from $g$ and $h$ respectively. Let $\epsilon^g_{ab}$ be the distance from $F_{ab}$ to $G_{ab}$ (restricted to $(y,z)$ pairs). Likewise let $\epsilon^h_{ab}$ be the distance from $F_{ab}$ to $H_{ab}$ .

By Lemma 1, there exists a constant $c_{ab}$ such that the distance (call it $\epsilon_{ab}$) from $F_{ab}$ to the constant function $c_{ab}$ is at most $c\,(\epsilon^h_{ab} + \epsilon^g_{ab})/2$. Define $\hat f(a,y,b,z) = c_{ab}$.

Clearly $\hat f$ depends only on $a$ and $b$ (and thus on at most $2k$ variables).

Let $\epsilon_{\hat f}$ be the average, over the $(a,b)$ pairs, of the $\epsilon_{ab}$ 's, so that the distance from $f$ to $\hat f$ is $\epsilon_{\hat f}$ .

Likewise, the distances from $f$ to $g$ and from $f$ to $h$ (that is, $\epsilon_g$ and $\epsilon_h)$ are the averages, over the $(a,b)$ pairs, of, respectively, $\epsilon^g_{ab}$ and $\epsilon^h_{ab}$ .

Since $\epsilon_{ab} \le c\,(\epsilon^h_{ab} + \epsilon^g_{ab})/2$ for all $a, b$, it follows that $\epsilon_{\hat f} \le c\,(\epsilon_g + \epsilon_h)/2$. QED
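The averaging step can be written out explicitly. This is a sketch under two assumptions introduced only for illustration: $A$ denotes the set of all settings $(a,b)$, each weighted uniformly, and $K$ stands for whatever constant factor the pointwise bound from Lemma 1 supplies.

```latex
\epsilon_{\hat f}
  = \frac{1}{|A|} \sum_{(a,b) \in A} \epsilon_{ab}
  \le \frac{1}{|A|} \sum_{(a,b) \in A} K \left( \epsilon^g_{ab} + \epsilon^h_{ab} \right)
  = K \left( \frac{1}{|A|} \sum_{(a,b) \in A} \epsilon^g_{ab}
           + \frac{1}{|A|} \sum_{(a,b) \in A} \epsilon^h_{ab} \right)
  = K \left( \epsilon_g + \epsilon_h \right).
```

The argument uses only linearity of the average, so any pointwise bound of this form carries over to $f$ with the same constant.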

Lemma 3 shows that the constant $c$ above is the best you can hope for (even for $k=0$ and $\epsilon=0.5$ ).

Lemma 3: There exists $f$ such that $f$ is $(0.5/c)$ -near two functions $g$ and $h$ , where $g$ has no influencing variables in $S_2$ and $h$ has no influencing variables in $S_1$ , and $f$ is $0.5$ -far from every constant function.

Proof. Let $y$ and $z$ be $x$ restricted to, respectively, $S_1$ and $S_2$ . That is, $y=(x_1,\ldots,x_{n/2})$ and $z=(x_{n/2+1},\ldots,x_n)$ .

Identify each possible $y$ with a unique element of $[N]$ , where $N=2^{n/2}$ . Likewise, identify each possible $z$ with a unique element of $[N]$ . Thus, we think of $f$ as a function from $[N]\times[N]$ to $\{0,1\}$ .

Define $f(y,z)$ to be 1 iff $\max(y,z) \ge \frac{1}{\sqrt 2}N$ .

By calculation, the fraction of $f$ 's values that are zero is $(\frac{1}{\sqrt 2})^2 = \frac{1}{2}$ , so both constant functions have distance $\frac{1}{2}$ to $f$ .

Define $g(y,z)$ to be 1 iff $y\ge \frac{1}{\sqrt 2}N$. Then $g$ has no influencing variables in $S_2$. The distance from $f$ to $g$ is the fraction of pairs $(y,z)$ such that $y<\frac{1}{\sqrt 2}N$ and $z\ge \frac{1}{\sqrt 2}N$. By calculation, this is at most $\frac{1}{\sqrt 2}(1-\frac{1}{\sqrt2}) = 0.5/c$.

Similarly, the distance from $f$ to $h$ , where $h(y,z)=1$ iff $z\ge \frac{1}{\sqrt 2}N$ , is at most $0.5/c$ .

QED
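The construction in Lemma 3 is easy to check numerically for a concrete $N$. The sketch below is illustrative only: the function name `lemma3_distances` is hypothetical, and it identifies $[N]$ with the integers $0, \ldots, N-1$, which is an assumption about indexing. Because $N/\sqrt 2$ is not an integer, the computed distances match the stated values only up to rounding error of order $1/N$.

```python
import math

def lemma3_distances(N):
    """Build f, g, h from Lemma 3 on [N] x [N] and return the relevant distances."""
    threshold = N / math.sqrt(2.0)
    f = lambda y, z: 1 if max(y, z) >= threshold else 0
    g = lambda y, z: 1 if y >= threshold else 0  # ignores z
    h = lambda y, z: 1 if z >= threshold else 0  # ignores y
    pairs = [(y, z) for y in range(N) for z in range(N)]
    total = len(pairs)
    return {
        "to_all_zeros": sum(f(y, z) != 0 for y, z in pairs) / total,
        "to_all_ones": sum(f(y, z) != 1 for y, z in pairs) / total,
        "to_g": sum(f(y, z) != g(y, z) for y, z in pairs) / total,
        "to_h": sum(f(y, z) != h(y, z) for y, z in pairs) / total,
    }

C = 1.0 / (math.sqrt(2.0) - 1.0)  # the constant c from the text
result = lemma3_distances(256)    # N = 2^(n/2) with n = 16
```

For $N = 256$ both constant functions are at distance close to $0.5$ from $f$, while $g$ and $h$ are each at distance close to $0.5/c \approx 0.207$.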

— Neal Young

First of all, thanks Neal! This indeed sums it up for $k=0$, and sheds some light on the general problem. However, in the case of $k=0$ the problem is a bit degenerate (as $2k=k$), so I'm more curious regarding the case of $k \ge 1$. I didn't manage to extend this claim for $k>0$, so if you have an idea on how to do it, I'd appreciate it. If it simplifies the problem, then the exact constants are not crucial; that is, $\epsilon/2$-far can be replaced by $\epsilon/c$-far, for some universal constant $c$.

2

I've edited it to add the extension to general k. And Yuri's argument below gives a slightly looser factor with an elegant probabilistic argument.

— Neal Young

Sincere thanks Neal! This line of reasoning is quite enlightening.