Fix duplication and wrong markup in docs #1633

Merged 2 commits on Oct 18, 2017
6 changes: 3 additions & 3 deletions docs/src/scripts/word2vec2tensor.rst
@@ -1,8 +1,8 @@
:mod:`scripts.word2vec2tensor` --
==================================
:mod:`scripts.word2vec2tensor` -- Convert the word2vec format to Tensorflow 2D tensor
=====================================================================================

.. automodule:: gensim.scripts.word2vec2tensor
:synopsis:
:synopsis: Convert the word2vec format to Tensorflow 2D tensor
:members:
:inherited-members:
:undoc-members:
4 changes: 2 additions & 2 deletions gensim/models/doc2vec.py
@@ -7,7 +7,7 @@

"""
Deep learning via the distributed memory and distributed bag of words models from
[1]_, using either hierarchical softmax or negative sampling [2]_ [3]_. See [tutorial]_
[1]_, using either hierarchical softmax or negative sampling [2]_ [3]_. See [#tutorial]_

**Make sure you have a C compiler before installing gensim, to use optimized (compiled)
doc2vec training** (70x speedup [blog]_).
@@ -35,7 +35,7 @@
In Proceedings of NIPS, 2013.
.. [blog] Optimizing word2vec in gensim, http://radimrehurek.com/2013/09/word2vec-in-python-part-two-optimizing/

.. [tutorial] Doc2vec in gensim tutorial, https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/doc2vec-lee.ipynb
.. [#tutorial] Doc2vec in gensim tutorial, https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/doc2vec-lee.ipynb



6 changes: 3 additions & 3 deletions gensim/models/word2vec.py
@@ -1075,10 +1075,10 @@ def score(self, sentences, total_sentences=int(1e6), chunksize=100, queue_factor
Note that you should specify total_sentences; we'll run into problems if you ask to
score more than this number of sentences but it is inefficient to set the value too high.

See the article by [taddy]_ and the gensim demo at [deepir]_ for examples of how to use such scores in document classification.
See the article by [#taddy]_ and the gensim demo at [#deepir]_ for examples of how to use such scores in document classification.

.. [taddy] Taddy, Matt. Document Classification by Inversion of Distributed Language Representations, in Proceedings of the 2015 Conference of the Association of Computational Linguistics.
.. [deepir] https://github.com/piskvorky/gensim/blob/develop/docs/notebooks/deepir.ipynb
.. [#taddy] Taddy, Matt. Document Classification by Inversion of Distributed Language Representations, in Proceedings of the 2015 Conference of the Association of Computational Linguistics.
.. [#deepir] https://github.com/piskvorky/gensim/blob/develop/docs/notebooks/deepir.ipynb

"""
if FAST_VERSION < 0:
6 changes: 4 additions & 2 deletions gensim/scripts/glove2word2vec.py
@@ -8,9 +8,11 @@
"""
USAGE:
$ python -m gensim.scripts.glove2word2vec --input <GloVe vector file> --output <Word2vec vector file>

Where:
<GloVe vector file>: Input GloVe .txt file
<Word2vec vector file>: Desired name of output Word2vec .txt file

* <GloVe vector file>: Input GloVe .txt file.
* <Word2vec vector file>: Desired name of output Word2vec .txt file.

This script is used to convert GloVe vectors in text format into the word2vec text format.
The only difference between the two formats is an extra header line in word2vec,
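
The glove2word2vec docstring in the hunk above says the two text formats differ only by an extra header line in word2vec. A minimal standard-library sketch of that idea follows. It is not the gensim implementation: the function name ``add_word2vec_header`` is hypothetical, and it assumes the header holds the vector count and the dimensionality, and that every GloVe line is a token followed by space-separated components.

```python
import os
import tempfile


def add_word2vec_header(glove_path, word2vec_path):
    """Copy a GloVe text file and prepend a "<count> <dimensions>" header line.

    Hypothetical helper for illustration; not part of gensim.
    """
    with open(glove_path, encoding="utf-8") as src:
        # Skip empty lines so they are not counted as vectors.
        lines = [line for line in src if line.strip()]
    num_vectors = len(lines)
    # Assumption: each line is a token followed by its vector components.
    num_dims = len(lines[0].split()) - 1 if lines else 0
    with open(word2vec_path, "w", encoding="utf-8") as dst:
        dst.write(f"{num_vectors} {num_dims}\n")
        dst.writelines(lines)
    return num_vectors, num_dims


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    glove_file = os.path.join(workdir, "glove.txt")
    with open(glove_file, "w", encoding="utf-8") as f:
        f.write("cat 0.1 0.2 0.3\ndog 0.4 0.5 0.6\n")
    print(add_word2vec_header(glove_file, os.path.join(workdir, "w2v.txt")))
```

The sketch reads the whole file into memory to count the lines, which is fine for a demonstration but may not suit very large vector files.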
14 changes: 8 additions & 6 deletions gensim/scripts/word2vec2tensor.py
@@ -9,20 +9,22 @@
USAGE: $ python -m gensim.scripts.word2vec2tensor --input <Word2Vec model file> --output <TSV tensor filename prefix> [--binary] <Word2Vec binary flag>

Where:
<Word2Vec model file>: Input Word2Vec model.
<TSV tensor filename prefix>: 2D tensor TSV output file name prefix.
<Word2Vec binary flag>: Set True if Word2Vec model is binary. Defaults to False.

* <Word2Vec model file>: Input Word2Vec model.
* <TSV tensor filename prefix>: 2D tensor TSV output file name prefix.
* <Word2Vec binary flag>: Set True if Word2Vec model is binary. Defaults to False.

Output:
The script will create two TSV files. A 2d tensor format file, and a Word Embedding metadata file. Both files will
us the --output file name as prefix
use the --output file name as prefix.

This script is used to convert the word2vec format to Tensorflow 2D tensor and metadata formats for Embedding Visualization
To use the generated TSV 2D tensor and metadata file in the Projector Visualizer, please

1) Open http://projector.tensorflow.org/.
2) Choose "Load Data" from the left menu.
3) Select "Choose file" in "Load a TSV file of vectors." and choose you local "_tensor.tsv" file
4) Select "Choose file" in "Load a TSV file of metadata." and choose you local "_metadata.tsv" file
3) Select "Choose file" in "Load a TSV file of vectors." and choose you local "_tensor.tsv" file.
4) Select "Choose file" in "Load a TSV file of metadata." and choose you local "_metadata.tsv" file.

For more information about TensorBoard TSV format please visit:
https://www.tensorflow.org/versions/master/how_tos/embedding_viz/
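
The word2vec2tensor docstring in the hunk above describes two TSV outputs that share the ``--output`` prefix: a 2D tensor file and a metadata file. The sketch below shows one plausible layout for those files using only the standard library. It is not the gensim script: ``write_projector_files`` is a hypothetical name, the input is assumed to be a plain ``{word: vector}`` mapping rather than a loaded Word2Vec model, and the one-row-per-word layout is an assumption based on the "_tensor.tsv" and "_metadata.tsv" names in the docstring.

```python
import os
import tempfile


def write_projector_files(vectors, prefix):
    """Write <prefix>_tensor.tsv and <prefix>_metadata.tsv.

    Hypothetical helper for illustration; not part of gensim.
    `vectors` maps each word to a sequence of floats.
    """
    tensor_path = prefix + "_tensor.tsv"
    metadata_path = prefix + "_metadata.tsv"
    with open(tensor_path, "w", encoding="utf-8") as tensor_file, \
            open(metadata_path, "w", encoding="utf-8") as metadata_file:
        for word, vector in vectors.items():
            # Row i of the metadata file labels row i of the tensor file.
            metadata_file.write(word + "\n")
            tensor_file.write("\t".join(str(x) for x in vector) + "\n")
    return tensor_path, metadata_path


if __name__ == "__main__":
    out_prefix = os.path.join(tempfile.mkdtemp(), "demo")
    print(write_projector_files({"cat": [0.1, 0.2], "dog": [0.3, 0.4]}, out_prefix))
```

Writing both files in the same loop keeps the rows aligned, which matters because the label for each vector is matched by row position.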