The weighted Euclidean distance between two points extends the standard Euclidean distance formula: each dimension (or variable) is assigned a weight that reflects its importance.
Given two points \(\mathbf{x} = (x_1, x_2, \ldots, x_n)\) and \(\mathbf{y} = (y_1, y_2, \ldots, y_n)\), and a vector of weights \(\mathbf{w} = (w_1, w_2, \ldots, w_n)\), the weighted Euclidean distance \(d_w(\mathbf{x}, \mathbf{y})\) is defined as:
\[ d_w(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{n} w_i (x_i - y_i)^2} \]
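As a minimal sketch of this definition in Python (the function name `weighted_euclidean` and the input checks are illustrative assumptions, not part of the original problem):

```python
import math


def weighted_euclidean(x, y, w):
    """Return sqrt(sum_i w_i * (x_i - y_i)**2) for equal-length sequences."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y, and w must have the same length")
    # Assumption: weights are non-negative, so the sum under the root is never negative.
    if any(wi < 0 for wi in w):
        raise ValueError("weights must be non-negative")
    return math.sqrt(sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w)))


# With every weight equal to 1, this reduces to the standard Euclidean distance.
print(weighted_euclidean([0, 0], [3, 4], [1, 1]))  # 5.0
```

Passing a weight vector such as `[10, 1, 1]` instead would scale the first variable's squared difference by 10 before the terms are summed.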
In the context of the given problem, let’s break it down step by step. To have the Familiarity variable weighted 10 times more than the others, we set its corresponding weight to 10 and all other weights to 1. If we have:

- \(\mathbf{x} = (x_1, x_2, \ldots, x_n)\)
- Mean vector \(\mathbf{y} = (y_1, y_2, \ldots, y_n)\)
- Weights \(\mathbf{w} = (w_1, w_2, \ldots, w_n)\)

The distance is: \[ d_w(\mathbf{x}, \mathbf{y}) = \sqrt{w_1 (x_1 - y_1)^2 + w_2 (x_2 - y_2)^2 + \cdots + w_n (x_n - y_n)^2} \]
For the specific case where Familiarity (let’s assume it’s the first variable) is weighted 10 times more, the weights would be \(\mathbf{w} = (10, 1, 1, \ldots, 1)\).
So the formula becomes: \[ d_w(\mathbf{x}, \mathbf{y}) = \sqrt{10 (x_1 - y_1)^2 + 1 (x_2 - y_2)^2 + 1 (x_3 - y_3)^2 + \cdots + 1 (x_n - y_n)^2} \]
This formula ensures that a difference in the Familiarity variable contributes 10 times as much to the squared distance as an equal difference in any other variable, so Familiarity has a much larger impact on the overall distance.
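As a worked illustration with made-up numbers (not taken from the problem), suppose there are three variables with Familiarity first, \(\mathbf{x} = (3, 2, 5)\), mean vector \(\mathbf{y} = (1, 2, 4)\), and \(\mathbf{w} = (10, 1, 1)\):

\[ d_w(\mathbf{x}, \mathbf{y}) = \sqrt{10 (3 - 1)^2 + 1 (2 - 2)^2 + 1 (5 - 4)^2} = \sqrt{40 + 0 + 1} = \sqrt{41} \approx 6.40 \]

With all weights equal to 1, the same two points would be \(\sqrt{4 + 0 + 1} = \sqrt{5} \approx 2.24\) apart. Because the weight sits inside the square root, a difference in Familiarity alone is scaled by \(\sqrt{10} \approx 3.16\) in the final distance, not by 10.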