Generate uniformly distributed weights that sum to unity?

It is common to use weights in applications such as mixture modeling and linearly combining basis functions. The weights $w_i$ must often obey $w_i \ge 0$ and $\sum_{i} w_i=1$. I'd like to randomly choose a weight vector $\mathbf{w} = (w_1, w_2, \ldots)$ from a uniform distribution over such vectors.

It may be tempting to use $w_i = \frac{\omega_i}{\sum_{j} \omega_j}$ where $\omega_i \sim U(0,1)$; however, as discussed in the comments below, the resulting distribution of $\mathbf{w}$ is not uniform.
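A quick simulation makes the non-uniformity concrete. The sketch below is an illustration rather than part of the original question: for $n=2$, a uniform distribution on the simplex would make $w_1$ itself $U(0,1)$, so $P(w_1 < 0.1)$ should be $0.1$, whereas under the normalized-uniform scheme it works out to $P(\omega_1 < \omega_2/9) = 1/18 \approx 0.056$. The function name `naive_weights`, the seed, and the trial count are arbitrary choices.

```python
import random

def naive_weights(n, rng):
    """Normalize n iid U(0,1) draws by their sum (NOT uniform on the simplex)."""
    # 1 - rng.random() lies in (0, 1], so the sum can never be zero.
    omega = [1.0 - rng.random() for _ in range(n)]
    total = sum(omega)
    return [o / total for o in omega]

rng = random.Random(0)
trials = 100_000
hits = sum(naive_weights(2, rng)[0] < 0.1 for _ in range(trials))
frac_small = hits / trials  # a uniform w_1 would give a value near 0.1
print(frac_small)
```

The weights still sum to one and are nonnegative; only their distribution is off, with too little mass near the corners of the simplex.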

However, given the constraint $\sum_{i} w_i=1$, it seems that the underlying dimensionality of the problem is $n-1$, and that it should be possible to choose a $\mathbf{w}$ by choosing $n-1$ parameters according to some distribution and then computing the corresponding $\mathbf{w}$ from those parameters (because once $n-1$ of the weights are specified, the remaining weight is fully determined).

The problem appears to be similar to the sphere point picking problem (but, rather than picking 3-vectors whose $\ell_2$ norm is unity, I want to pick $n$-vectors whose $\ell_1$ norm is unity).

Thanks!

random-generation

— Chris

Your method does not generate a uniformly distributed vector on the simplex. To do what you want correctly, the most straightforward way is to generate $n$ iid $\mathrm{Exp}(1)$ random variables and then normalize them by their sum. You could try to do it by finding some other method that draws only $n-1$ variates, but I have my doubts about the efficiency tradeoff, since $\mathrm{Exp}(1)$ variates can be generated very efficiently from $U(0,1)$ variates.

— cardinal

Answers:

Choose $\mathbf{x} \in [0,1]^{n-1}$ uniformly (by means of $n-1$ uniform reals in the interval $[0,1]$ ). Sort the coefficients so that $0 \le x_1 \le \cdots \le x_{n-1}$ . Set

$\mathbf{w} = (x_1, x_2-x_1, x_3 - x_2, \ldots, x_{n-1} - x_{n-2}, 1 - x_{n-1}).$

Because we can recover the sorted $x_i$ by means of the partial sums of the $w_i$ , the mapping $\mathbf{x} \to \mathbf{w}$ is $(n-1)!$ to 1; in particular, its image is the $n-1$ simplex in $\mathbb{R}^n$ . Because (a) each swap in a sort is a linear transformation, (b) the preceding formula is linear, and (c) linear transformations preserve uniformity of distributions, the uniformity of $\mathbf{x}$ implies the uniformity of $\mathbf{w}$ on the $n-1$ simplex. In particular, note that the marginals of $\mathbf{w}$ are not necessarily independent.

[3D point plot]

This 3D point plot shows the results of 2000 iterations of this algorithm for $n=3$ . The points are confined to the simplex and are approximately uniformly distributed over it.
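The sort-and-difference construction above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `simplex_by_spacings` and the seed are hypothetical choices, not something from the answer.

```python
import random

def simplex_by_spacings(n, rng):
    """Sort n-1 uniforms on [0,1] and return the n gaps between consecutive cut points."""
    cuts = sorted(rng.random() for _ in range(n - 1))
    edges = [0.0] + cuts + [1.0]
    # Consecutive differences: x_1, x_2 - x_1, ..., 1 - x_{n-1}
    return [b - a for a, b in zip(edges, edges[1:])]

rng = random.Random(1)
w = simplex_by_spacings(5, rng)
print(w, sum(w))
```

The sort is what gives the $O(n \log(n))$ cost; everything else is linear in $n$.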

Because the execution time of this algorithm is $O(n \log(n)) \gg O(n)$, it is inefficient for large $n$. But this does answer the question! A better way (in general) to generate uniformly distributed values on the $n-1$-simplex is to draw $n$ uniform reals $(x_1, \ldots, x_n)$ on the interval $[0,1]$, compute

$y_i = -\log(x_i)$

(which makes each $y_i$ positive with probability $1$ , whence their sum is almost surely nonzero) and set

$\mathbf w = (y_1, y_2, \ldots, y_n) / (y_1 + y_2 + \cdots + y_n).$

This works because each $y_i$ has a $\Gamma(1)$ distribution, which implies $\mathbf w$ has a Dirichlet $(1,1,\ldots,1)$ distribution--and that is uniform.

[3D point plot 2]
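A minimal Python sketch of the logarithm-and-normalize method follows, assuming only the standard library. The function name `simplex_by_exponentials` is a hypothetical label, and drawing `1 - rng.random()` instead of `rng.random()` is an added precaution so that the logarithm never sees zero.

```python
import math
import random

def simplex_by_exponentials(n, rng):
    """Normalize n iid Exp(1) draws, y_i = -log(x_i), by their sum."""
    # 1 - rng.random() lies in (0, 1], so every logarithm is finite.
    y = [-math.log(1.0 - rng.random()) for _ in range(n)]
    total = sum(y)  # positive with probability 1
    return [yi / total for yi in y]

rng = random.Random(3)
w = simplex_by_exponentials(5, rng)
print(w, sum(w))
```

No sort is needed, so the cost is linear in $n$.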

— whuber

@Chris If by "Dir(1)" you mean the Dirichlet distribution with parameters $(\alpha_1, \ldots, \alpha_n) = (1,1,\ldots,1)$, then the answer is yes.

— whuber

(+1) One minor comment: The intuition is excellent. Care in interpreting (a) may need to be taken, as it seems that the "linear transformation" in that part is a random one. However, this is easily worked around at the expense of additional formality by using exchangeability of the generating process and a certain invariance property.

— cardinal

More explicitly: For distributions with a density $f$, the density of the order statistics of an iid sample of size $n$ is $n! f(x_1)\cdots f(x_n) 1_{(x_1 < x_2 < \cdots < x_n)}$. In the case of $f = 1_{[0,1]}(x)$, the distribution of the order statistics is uniform on a polytope. Taken from this point, the remaining transformations are deterministic and the result follows.

— cardinal
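The step from the order statistics to the weights can be written out as a short derivation. This is a sketch under the assumption that the sample has size $n-1$ (as in the answer) and that $x_{(i)}$ denotes the $i$-th order statistic; the symbols $g$ and $h$ are introduced here only for illustration.

```latex
% Joint density of the order statistics of n-1 iid U(0,1) draws
g(x_{(1)}, \ldots, x_{(n-1)}) = (n-1)!\; 1_{(0 < x_{(1)} < \cdots < x_{(n-1)} < 1)}

% Spacings: w_1 = x_{(1)}, \quad w_i = x_{(i)} - x_{(i-1)} \ (2 \le i \le n-1), \quad w_n = 1 - x_{(n-1)}
% The map from the order statistics to (w_1, ..., w_{n-1}) is lower triangular
% with ones on the diagonal, so its Jacobian determinant has absolute value 1:
\left| \det \frac{\partial (w_1, \ldots, w_{n-1})}{\partial (x_{(1)}, \ldots, x_{(n-1)})} \right| = 1

% Hence the density of the first n-1 weights is constant on the simplex
h(w_1, \ldots, w_{n-1}) = (n-1)!\; 1_{(w_1 > 0, \ldots, w_{n-1} > 0,\; w_1 + \cdots + w_{n-1} < 1)}
```

The constant $(n-1)!$ is the reciprocal of the volume of the region $\{w_i > 0,\ w_1 + \cdots + w_{n-1} < 1\}$, as expected for a uniform density.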

$I_{n-1}=[0,1]^{n-1}$ is carved into $(n-1)!$ regions, of which one is distinguished from the others, and there's a predetermined affine bijection between each region and the distinguished one. Whence, the only additional fact we need is that a uniform distribution on a region is uniform on any measurable subset of it, which is a complete triviality.

— whuber

@whuber: Interesting remarks. Thanks for sharing! I always appreciate your insightful thoughts on such things. Regarding my previous comment on "random linear transformation", my point was that, at least through $\mathbf{x}$, the transformation used depends on the sample point $\omega$. Another way to think of it is there is a fixed, predetermined function $T: \mathbb{R}^{n-1} \to \mathbb{R}^{n-1}$ such that $\mathbf{w} = T(\mathbf{x})$, but I wouldn't call that function linear, though it is linear on subsets that partition the $(n-1)$-cube. :)

— cardinal

    zz <- c(0, log(-log(runif(n-1))))
    ezz <- exp(zz)
    w <- ezz/sum(ezz)

The first entry is put to zero for identification; you would see that done in multinomial logistic models. Of course, in multinomial models, you would also have covariates under the exponents, rather than just the random zzs. The distribution of the zzs is the extreme value distribution; you'd need this to ensure that the resulting weights are i.i.d. I initially put rnormals there, but then had a gut feeling that this ain't gonna work.

— StasK

That doesn't work. Did you try looking at a histogram?

— cardinal

Your answer is now almost correct. If you generate $n$ iid $\mathrm{Exp}(1)$ and divide each by the sum, then you will get the correct distribution. See Dirichlet distribution for more details, though it doesn't discuss this explicitly.

— cardinal
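The gap between "almost correct" and correct can be checked numerically. The sketch below is a Python rendering of the R snippet in the answer (an assumption about its intent, with the hypothetical name `weights_first_fixed`) next to the all-exponential variant the comment describes. For $n=2$ a uniform $w_1$ gives $P(w_1 < 0.5) = 0.5$, while pinning the first exponent at zero gives $P(1/(1+E) < 0.5) = P(E > 1) = e^{-1} \approx 0.368$ with $E \sim \mathrm{Exp}(1)$.

```python
import math
import random

def weights_first_fixed(n, rng):
    """First exponent pinned at 0, so the first unnormalized entry is exp(0) = 1."""
    # exp(log(-log(u))) simplifies to -log(u), an Exp(1) draw.
    ezz = [1.0] + [-math.log(1.0 - rng.random()) for _ in range(n - 1)]
    total = sum(ezz)
    return [e / total for e in ezz]

def weights_all_exponential(n, rng):
    """All n unnormalized entries are iid Exp(1)."""
    y = [-math.log(1.0 - rng.random()) for _ in range(n)]
    total = sum(y)
    return [yi / total for yi in y]

rng = random.Random(2)
trials = 100_000
p_fixed = sum(weights_first_fixed(2, rng)[0] < 0.5 for _ in range(trials)) / trials
p_exp = sum(weights_all_exponential(2, rng)[0] < 0.5 for _ in range(trials)) / trials
print(p_fixed, p_exp)  # roughly exp(-1) versus 0.5
```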

Given the terminology you are using, you sound a little confused.

— cardinal

Actually, the Wiki link does discuss this (fairly) explicitly. See the second paragraph under the Support heading.

— cardinal

This characterization is both too restrictive and too general. It is too general in that the resulting distribution of $\mathbf{w}$ must be "uniform" on the $n-1$ simplex in $\mathbb{R}^n$. It is too restrictive in that the question is worded generally enough to allow that $\mathbf{w}$ be some function of an $n-1$-variate distribution, which in turn presumably, but not necessarily, consists of $n-1$ independent (and perhaps iid) variables.

— whuber

The solution is obvious. The following MATLAB code provides the answer for 3 weights.

    function [] = TESTGEN()
    SZ = 1000;
    V  = zeros(1, 3);
    VS = zeros(SZ, 3);
    for NIT = 1:SZ
        V(1) = rand(1, 1);               % uniform generation on the range 0..1
        V(2) = rand(1, 1) * (1 - V(1));
        V(3) = 1 - V(1) - V(2);
        PERM = randperm(3);              % random permutation of values 1,2,3
        for NID = 1:3
            VS(NIT, NID) = V(PERM(NID));
        end
    end
    figure;
    scatter3(VS(:, 1), VS(:, 2), VS(:, 3));
    end

— user96990

Your marginals do not have the correct distribution. Judging from the Wikipedia article on the Dirichlet distribution (random number generation section, which has the algorithm you have coded), you should be using a beta(1,2) distribution for V(1), not a uniform[0,1] distribution.

— soakley
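The beta-marginal construction mentioned in the comment can be sketched as follows. This is a hedged illustration for the all-ones Dirichlet case only: each step breaks off a $\mathrm{Beta}(1, m)$ share of what remains, where $m$ is the number of weights still to be assigned afterwards, so for three weights the first share is $\mathrm{Beta}(1,2)$ and the second is $\mathrm{Beta}(1,1)$, i.e. uniform. The function name `simplex_by_stick_breaking` is hypothetical.

```python
import random

def simplex_by_stick_breaking(n, rng):
    """Break off a Beta(1, n - k) share of the remaining mass at step k."""
    weights, remaining = [], 1.0
    for k in range(1, n):
        share = rng.betavariate(1.0, n - k)  # Beta(1, 2) for the first of 3 weights
        weights.append(remaining * share)
        remaining -= weights[-1]
    weights.append(remaining)  # the last weight is whatever is left
    return weights

rng = random.Random(4)
w = simplex_by_stick_breaking(3, rng)
print(w, sum(w))
```

With the beta shares in place, every coordinate has the same $\mathrm{Beta}(1, n-1)$ marginal, so no random permutation of the result is needed.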

It does appear that the density increases in the corners of this tilted triangle. Nonetheless, it provides a nice geometric display of the problem.

— DWin