Commit da13d9c

Merge branch 'beta'
2 parents 6ee5d95 + f25f70a commit da13d9c

1 file changed

+3
-3
lines changed


README.md

+3-3
@@ -164,13 +164,13 @@ A **map point** is a point <span class="math-tex" data-type="tex">\\(y_i\\)</spa
164164

165165
How do we choose the positions of the map points? We want to preserve the structure of the data. More specifically, if two data points are close together, we want the two corresponding map points to be close too. Let <span class="math-tex" data-type="tex">\\(\left| x_i - x_j \right|\\)</span> be the Euclidean distance between two data points, and <span class="math-tex" data-type="tex">\\(\left| y_i - y_j \right|\\)</span> the distance between the corresponding map points. We first define a conditional similarity between the two data points:
166166

167-
<span class="math-tex" data-type="tex">\(p_{j|i} = \frac{\exp\left(-\left| x_i - x_j\right|^2 \big/ 2\sigma_i^2\right)}{\displaystyle\sum_{k \neq i} \exp\left(-\left| x_i - x_k\right|^2 \big/ 2\sigma_i^2\right)}\)</span>
167+
<span class="math-tex" data-type="tex">\\(p_{j|i} = \frac{\exp\left(-\left| x_i - x_j\right|^2 \big/ 2\sigma_i^2\right)}{\displaystyle\sum_{k \neq i} \exp\left(-\left| x_i - x_k\right|^2 \big/ 2\sigma_i^2\right)}\\)</span>
168168

169169
This measures how close <span class="math-tex" data-type="tex">\\(x_j\\)</span> is to <span class="math-tex" data-type="tex">\\(x_i\\)</span>, considering a **Gaussian distribution** around <span class="math-tex" data-type="tex">\\(x_i\\)</span> with a given variance <span class="math-tex" data-type="tex">\\(\sigma_i^2\\)</span>. This variance is different for every point; it is chosen such that points in dense areas are given a smaller variance than points in sparse areas. The original paper details how this variance is computed exactly.
170170

171171
Now, we define the similarity as a symmetrized version of the conditional similarity:
172172

173-
<span class="math-tex" data-type="tex">\(p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N}\)</span>
173+
<span class="math-tex" data-type="tex">\\(p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N}\\)</span>
174174

175175
We obtain a **similarity matrix** for our original dataset. What does this matrix look like?
176176
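The conditional similarity and its symmetrized version can be sketched in plain Python. This is a minimal illustration, not the README's own code: the function names `conditional_similarities` and `symmetrize` are hypothetical, and the per-point `sigmas` are hand-picked here rather than computed as in the original paper.

```python
import math

def conditional_similarities(points, sigmas):
    """Return a matrix cond where cond[i][j] = p_{j|i}.

    Each row i uses a Gaussian centered on points[i] with variance
    sigmas[i] ** 2 and is normalized over all k != i.
    """
    n = len(points)
    cond = []
    for i in range(n):
        weights = []
        for k in range(n):
            if k == i:
                weights.append(0.0)  # a point is not compared with itself
            else:
                sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[k]))
                weights.append(math.exp(-sq_dist / (2.0 * sigmas[i] ** 2)))
        total = sum(weights)
        cond.append([w / total for w in weights])
    return cond

def symmetrize(cond):
    """Return joint where joint[i][j] = (p_{j|i} + p_{i|j}) / (2N)."""
    n = len(cond)
    return [[(cond[i][j] + cond[j][i]) / (2.0 * n) for j in range(n)]
            for i in range(n)]

# Three toy data points; the sigmas are assumed values for illustration only.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
sigmas = [1.0, 1.0, 2.0]
cond = conditional_similarities(points, sigmas)
joint = symmetrize(cond)
```

Each row of `cond` sums to 1, so the entries of `joint` sum to 1 over all pairs, which is what the division by 2N is for.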

@@ -231,7 +231,7 @@ We can already observe the 10 groups in the data, corresponding to the 10 number
231231

232232
Let's also define a similarity matrix for our map points.
233233

234-
<span class="math-tex" data-type="tex">\(q_{ij} = \frac{f(\left| x_i - x_j\right|)}{\displaystyle\sum_{k \neq i} f(\left| x_i - x_k\right|)} \quad \textrm{with} \quad f(z) = \frac{1}{1+z^2}\)</span>
234+
<span class="math-tex" data-type="tex">\\(q_{ij} = \frac{f(\left| x_i - x_j\right|)}{\displaystyle\sum_{k \neq i} f(\left| x_i - x_k\right|)} \quad \textrm{with} \quad f(z) = \frac{1}{1+z^2}\\)</span>
235235

236236
This is the same idea as for the data points, but with a different distribution (a [**Student's t-distribution with one degree of freedom**](http://en.wikipedia.org/wiki/Student%27s_t-distribution), also known as the [**Cauchy distribution**](http://en.wikipedia.org/wiki/Cauchy_distribution), instead of a Gaussian distribution). We'll elaborate on this choice later.
237237
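A similar sketch for the map-point similarities, under two stated assumptions: the distances are taken between map points (the text defines this matrix "for our map points", although the formula is written with x), and the normalization runs over k ≠ i for each row, exactly as the formula is written. The name `map_similarities` is hypothetical.

```python
import math

def map_similarities(map_points):
    """Return a matrix q where q[i][j] = f(d_ij) / sum_{k != i} f(d_ik),
    with f(z) = 1 / (1 + z^2) and d_ij the Euclidean distance between
    map points i and j.
    """
    def f(z):
        return 1.0 / (1.0 + z * z)

    n = len(map_points)
    q = []
    for i in range(n):
        kernel = [
            0.0 if k == i else f(math.dist(map_points[i], map_points[k]))
            for k in range(n)
        ]
        total = sum(kernel)
        q.append([v / total for v in kernel])
    return q

# Three toy map points in 2D, chosen only for illustration.
map_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
q = map_similarities(map_points)
```

For the first point, the kernel values are f(1) = 0.5 and f(3) = 0.1, so `q[0][1]` is 0.5 / 0.6 and `q[0][2]` is 0.1 / 0.6.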
