It's perhaps worth reading about Lagrangian duality and the broader relationship (at times equivalence) between:
- optimization subject to hard (i.e. inviolable) constraints
- optimization with penalties for violating constraints.
Quick intro to weak duality and strong duality
Assume we have some function $f(x, y)$ of two variables. For any $\hat{x}$ and $\hat{y}$, we have:
$$\min_x f(x, \hat{y}) \;\le\; f(\hat{x}, \hat{y}) \;\le\; \max_y f(\hat{x}, y)$$
Since that holds for any $\hat{x}$ and $\hat{y}$, it also holds that:
$$\max_y \min_x f(x, y) \;\le\; \min_x \max_y f(x, y)$$
This is known as weak duality. In some circumstances, you also have strong duality (also known as the saddle point property):
$$\max_y \min_x f(x, y) \;=\; \min_x \max_y f(x, y)$$
When strong duality holds, solving the dual problem also solves the primal problem. They are in effect the same problem!
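To see weak duality concretely, here is a minimal numeric sketch on a grid. The function $f(x, y) = (x - y)^2$ and the domain $[0, 1]^2$ are my own choices for illustration, picked so the inequality is strict:

```python
import numpy as np

# Grid over [0, 1]^2 for f(x, y) = (x - y)^2.
xs = np.linspace(0.0, 1.0, 201)
ys = np.linspace(0.0, 1.0, 201)
F = (xs[:, None] - ys[None, :]) ** 2   # F[i, j] = f(xs[i], ys[j])

max_min = F.min(axis=0).max()   # max over y of (min over x): here 0
min_max = F.max(axis=1).min()   # min over x of (max over y): here 1/4

print(max_min, min_max)         # 0.0 <= 0.25: weak duality, with a gap
assert max_min <= min_max + 1e-12
```

For this $f$ the two values differ (a duality gap), so weak duality holds strictly and there is no saddle point.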
The Lagrangian for constrained Ridge regression
Let me define the function $\mathcal{L}$ as:
$$\mathcal{L}(b, \lambda) = \sum_{i=1}^n \left( y_i - x_i \cdot b \right)^2 + \lambda \left( \sum_{j=1}^p b_j^2 - t \right)$$
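In code, this Lagrangian is only a few lines. A sketch, assuming a design matrix `X` with rows $x_i$ ($n$ observations, $p$ features), a response vector `y`, and a budget `t`; the name `ridge_lagrangian` is illustrative, not from any library:

```python
import numpy as np

def ridge_lagrangian(b, lam, X, y, t):
    sse = np.sum((y - X @ b) ** 2)           # sum_i (y_i - x_i . b)^2
    return sse + lam * (np.sum(b ** 2) - t)  # + lambda (sum_j b_j^2 - t)
```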
The min-max interpretation of the Lagrangian
The Ridge regression problem subject to hard constraints is:
$$\min_b \max_{\lambda \ge 0} \mathcal{L}(b, \lambda)$$
You pick $b$ to minimize the objective, cognizant that after $b$ is picked, your opponent will set $\lambda$ to infinity if you chose $b$ such that $\sum_{j=1}^p b_j^2 > t$.
If strong duality holds (which it does here because Slater's condition is satisfied for $t > 0$), then you achieve the same result by reversing the order:
$$\max_{\lambda \ge 0} \min_b \mathcal{L}(b, \lambda)$$
Here, your opponent chooses $\lambda$ first! You then choose $b$ to minimize the objective, already knowing their choice of $\lambda$. The $\min_b \mathcal{L}(b, \lambda)$ part (taking $\lambda$ as given) is equivalent to the second, penalized form of your Ridge regression problem.
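To make that inner minimization concrete: for a fixed $\lambda$, the $-\lambda t$ term does not depend on $b$, so the minimizer is the usual ridge closed form, and its squared norm tells you which budget $t$ makes the constraint exactly active. A sketch on assumed random toy data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))   # assumed toy data: 50 obs, 3 features
y = rng.normal(size=50)

lam = 2.0                      # the opponent's fixed choice of lambda

# For fixed lambda, min_b L(b, lambda) is solved by the ridge
# closed form: b(lambda) = (X'X + lambda I)^{-1} X'y.
p = X.shape[1]
b_lam = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# The hard-constrained problem with budget t = ||b(lambda)||^2
# shares this solution: the constraint is exactly active there.
t = np.sum(b_lam ** 2)
print(b_lam, t)
```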
As you can see, this isn't a result particular to Ridge regression. It is a broader concept.
References
(I started this post following an exposition I read in Rockafellar.)
Rockafellar, R. T., Convex Analysis, Princeton University Press, 1970.
You might also examine lectures 7 and 8 from Prof. Stephen Boyd's course on convex optimization.