Make doc2vec imdb ipynb tutorial run in python 2 and 3 #1220

Merged
merged 2 commits into from
Mar 20, 2017

robotcator
Contributor

Fix Python 2 and Python 3 compatibility for the doc2vec-IMDB.ipynb notebook (#1139).

@tmylk
Contributor

tmylk commented Mar 19, 2017

Please merge develop into your branch to resolve the conflicts: git fetch; git merge develop

@robotcator robotcator changed the base branch from doc_fix to develop March 19, 2017 08:33
@robotcator
Contributor Author

It seems that I selected the wrong base branch. I have changed it to the develop branch and the conflicts are resolved. The commit (1aa3f33) was a mistake on my part.

@tmylk tmylk changed the title from "Fix notebook" to "Make doc2vec imdb ipynb tutorial run in python 2 and 3" Mar 20, 2017
@tmylk tmylk merged commit 854fad6 into piskvorky:develop Mar 20, 2017
@@ -92,8 +116,7 @@
" txt_files = glob.glob('/'.join([dirname, fol, '*.txt']))\n",
"\n",
" for txt in txt_files:\n",
" with open(txt, 'r', encoding='utf-8') as t:\n",
" control_chars = [chr(0x85)]\n",
" with codecs.open(txt, 'r', encoding='utf-8') as t:\n",
Owner

Use smart_open instead: drop codecs, open files in binary mode and convert content to unicode explicitly.
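
A minimal sketch of the read side of this suggestion, assuming the review files are UTF-8 encoded. The helper name read_text is hypothetical and not part of this PR, and the built-in open stands in for smart_open here, since the exact smart_open call is not shown in this thread. The point is only that binary mode returns bytes on both Python 2 and 3, so the decode step is explicit.

```python
def read_text(path, encoding='utf-8'):
    """Read a file as raw bytes and decode it to unicode explicitly.

    Hypothetical helper for illustration; not code from this PR.
    """
    # In binary mode, read() returns bytes on both Python 2 and Python 3,
    # so no interpreter-specific default encoding is involved.
    with open(path, 'rb') as f:
        raw = f.read()
    # Convert bytes to unicode in one explicit, visible step.
    return raw.decode(encoding)
```

With this shape, neither codecs.open nor the Python 3 only encoding argument of open is needed.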

Contributor Author

OK, I will drop codecs and switch to smart_open.

@@ -104,21 +127,28 @@
" temp += \"\\n\"\n",
"\n",
" temp_norm = normalize_text(temp)\n",
" with open('/'.join([dirname, output]), 'w', encoding='utf-8') as n:\n",
" with codecs.open('/'.join([dirname, output]), 'w', encoding='utf-8') as n:\n",
Owner

@piskvorky piskvorky Apr 9, 2017

Not portable -- please use os.path.join.
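
For illustration, a sketch of how the '/'.join([...]) calls in this hunk could be written with os.path.join. The names dirname and fol come from the diff; the helper list_txt_files is hypothetical and only wraps the glob call so it can be run on its own.

```python
import glob
import os


def list_txt_files(dirname, fol):
    """Return the .txt files under dirname/fol, sorted by path.

    Hypothetical helper for illustration; not code from this PR.
    """
    # os.path.join inserts the separator of the current platform,
    # instead of hard-coding '/' as '/'.join([...]) does.
    pattern = os.path.join(dirname, fol, '*.txt')
    return sorted(glob.glob(pattern))
```

The same substitution would apply to the output paths, for example os.path.join(dirname, output).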

" n.write(temp_norm)\n",
"\n",
" alldata += temp_norm\n",
"\n",
" with open('/'.join([dirname, 'alldata-id.txt']), 'w', encoding='utf-8') as f:\n",
" with codecs.open('/'.join([dirname, 'alldata-id.txt']), 'w', encoding='utf-8') as f:\n",
Owner

Drop codecs, use binary mode.
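
A minimal sketch of the write side, assuming the text to be written is already unicode (as it would be after an explicit decode on input). The helper name write_text is hypothetical and not part of this PR.

```python
def write_text(path, text, encoding='utf-8'):
    """Encode unicode text explicitly and write the resulting bytes.

    Hypothetical helper for illustration; not code from this PR.
    Assumes `text` is unicode (str on Python 3, unicode on Python 2).
    """
    # Encode first, then write bytes: 'wb' behaves identically on
    # Python 2 and Python 3, so codecs.open is not required.
    with open(path, 'wb') as f:
        f.write(text.encode(encoding))
```

Because the file is opened in binary mode, newlines are written exactly as they appear in the text, with no platform-specific translation.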

@robotcator robotcator deleted the fix-notebook branch June 1, 2017 01:58