diff --git a/SentimentAnalysis Word2Vec word embeding.ipynb b/SentimentAnalysis Word2Vec word embeding.ipynb index 784d573..62126b9 100644 --- a/SentimentAnalysis Word2Vec word embeding.ipynb +++ b/SentimentAnalysis Word2Vec word embeding.ipynb @@ -758,8 +758,8 @@ "\n", "where $u.v$ is the dot product (or inner product) of two vectors, $||u||_2$ is the norm (or length) of the vector $u$, and $\\theta$ is the angle between $u$ and $v$. This similarity depends on the angle between $u$ and $v$. If $u$ and $v$ are very similar, their cosine similarity will be close to 1; if they are dissimilar, the cosine similarity will take a smaller value. \n", "\n", - "\n", - "
**Figure 1**: The cosine of the angle between two vectors is a measure of how similar they are
\n", + "\n", + "
Figure 1: The cosine of the angle between two vectors is a measure of how similar they are
\n", "\n", "**Exercise**: Implement the function `cosine_similarity()` to evaluate similarity between word vectors.\n", "\n", @@ -836,7 +836,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "** PCA ** : a linear deterministic algorithm (principal component analysis) that tries to capture as much of the data variability in as few dimensions as possible. PCA tends to highlight large-scale structure in the data, but can distort local neighborhoods. The Embedding Projector computes the top 10 principal components, from which you can choose two or three to view." + "**PCA** : a linear deterministic algorithm (principal component analysis) that tries to capture as much of the data variability in as few dimensions as possible. PCA tends to highlight large-scale structure in the data, but can distort local neighborhoods. The Embedding Projector computes the top 10 principal components, from which you can choose two or three to view." ] }, { @@ -1026,7 +1026,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "** t-SNE ** : a nonlinear nondeterministic algorithm (T-distributed stochastic neighbor embedding) that tries to preserve local neighborhoods in the data, often at the expense of distorting global structure. You can choose whether to compute two- or three-dimensional projections." + "**t-SNE** : a nonlinear nondeterministic algorithm (T-distributed stochastic neighbor embedding) that tries to preserve local neighborhoods in the data, often at the expense of distorting global structure. You can choose whether to compute two- or three-dimensional projections." 
] }, { @@ -1052,7 +1052,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### for 3d plotly word embeding t-sne 100 words visualization link :\n", + "#### For 3d plotly word embeding t-sne 100 words visualization link :\n", "For rendering link :\n", "https://plot.ly/~AlaBayoudh/6" ] @@ -1199,7 +1199,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "** t-sne ** closest words to 'good'" + "**t-sne** closest words to 'good'" ] }, { @@ -1257,7 +1257,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.5.4" + "version": "3.7.6" } }, "nbformat": 4, diff --git a/images/cosine_sim.png b/images/cosine_sim.png new file mode 100644 index 0000000..2c08681 Binary files /dev/null and b/images/cosine_sim.png differ
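
Review note on the first hunk: the context line describes cosine similarity as depending only on the angle $\theta$ between $u$ and $v$, i.e. $\cos\theta = \frac{u \cdot v}{||u||_2 \, ||v||_2}$, and then asks the reader to implement `cosine_similarity()`. The notebook's own solution is not part of this diff, so what follows is only a minimal pure-Python sketch of that formula. The argument types (plain sequences of floats) and the zero-vector handling are assumptions made for illustration; the notebook may well use NumPy arrays instead.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors.

    Assumption: u and v are plain sequences of numbers, not NumPy arrays.
    """
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")

    # Dot product u.v
    dot = sum(a * b for a, b in zip(u, v))

    # L2 norms ||u||_2 and ||v||_2
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))

    # The angle is undefined when either vector has zero length.
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")

    return dot / (norm_u * norm_v)
```

Vectors pointing the same way score close to 1, orthogonal vectors score 0, and opposite vectors score close to -1, which matches the caption on Figure 1.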