moseq2-model learn-model error on o2 #6

Closed
ralphpeterson opened this issue Mar 2, 2018 · 11 comments

@ralphpeterson

ralphpeterson commented Mar 2, 2018

I get the error below after installing this repo with the following instructions:

conda create -n moseq2_model python=2.7
source activate moseq2_model
git clone https://github.com/dattalab/moseq2_model.git
cd moseq2_model/
pip install numpy==1.13.0
pip install future
pip install six
pip install cython
CXX=/usr/local/bin/g++-7 pip install -e . --process-dependency-links

Could be related to something here: mattjj/pyhsmm#55

>>moseq2-model learn-model /home/rep19/Code/_sex-diffs/data-600k-size-matched-compressed.p /home/rep19/Code/_sex-diffs/tar-model.p --robust --separate-trans --nu 5 -p

Traceback (most recent call last):
  File "/home/rep19/anaconda2/envs/mm/bin/moseq2-model", line 11, in <module>
    load_entry_point('moseq2-model', 'console_scripts', 'moseq2-model')()
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/rep19/Code/moseq2_model/moseq2_model/cli.py", line 214, in learn_model
    file=sys.stdout)
  File "/home/rep19/Code/moseq2_model/moseq2_model/train/util.py", line 19, in train_model
    model.resample_model(num_procs)
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/pyhsmm/models.py", line 442, in resample_model
    self.resample_states(num_procs=num_procs)
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/pyhsmm/models.py", line 469, in resample_states
    self._joblib_resample_states(self.states_list,num_procs)
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/pyhsmm/models.py", line 497, in _joblib_resample_states
    for idx in range(len(joblib_args)))
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/joblib/parallel.py", line 779, in __call__
    while self.dispatch_one_batch(iterator):
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/joblib/parallel.py", line 625, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/joblib/parallel.py", line 588, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/joblib/_parallel_backends.py", line 111, in apply_async
    result = ImmediateResult(func)
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/joblib/_parallel_backends.py", line 332, in __init__
    self.results = batch()
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/joblib/parallel.py", line 131, in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/pyhsmm/parallel.py", line 39, in _get_sampled_stateseq
    model.add_data(data,initialize_from_prior=False,**kwargs)
  File "/home/rep19/Code/autoregressive/autoregressive/models.py", line 26, in add_data
    super(_ARMixin,self).add_data(data=strided_data,**kwargs)
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/pyhsmm/models.py", line 73, in add_data
    **kwargs))
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/pyhsmm/internals/hmm_states.py", line 118, in __init__
    super(_SeparateTransMixin,self).__init__(**kwargs)
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/pyhsmm/internals/hmm_states.py", line 40, in __init__
    self.resample()
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/pyhsmm/internals/hmm_states.py", line 362, in resample
    return self.resample_normalized()
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/pyhsmm/internals/hmm_states.py", line 357, in resample_normalized
    alphan = self.messages_forwards_normalized()
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/pyhsmm/internals/hmm_states.py", line 347, in messages_forwards_normalized
    self._messages_forwards_normalized(self.trans_matrix,self.pi_0,self.aBl)
  File "/home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/pyhsmm/internals/hmm_states.py", line 621, in _messages_forwards_normalized
    from pyhsmm.internals.hmm_messages_interface import messages_forwards_normalized
ImportError: No module named hmm_messages_interface
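
To narrow down a failure like the one above, it can help to check separately whether the package itself imports and whether the compiled submodule does. The following is only a minimal sketch, not part of moseq2-model or pyhsmm; `check_import` is a hypothetical helper name, and the two module names are taken from the traceback.

```python
from __future__ import print_function

import importlib


def check_import(module_name):
    """Try to import module_name and return (ok, detail).

    detail is the module's file path on success, or the error text on failure.
    check_import is a hypothetical helper, not an API of any library.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return False, str(exc)
    return True, getattr(module, "__file__", "<built-in>")


if __name__ == "__main__":
    # Module names taken from the traceback above.
    for name in ("pyhsmm", "pyhsmm.internals.hmm_messages_interface"):
        ok, detail = check_import(name)
        print("{0}: {1} ({2})".format(name, "OK" if ok else "FAILED", detail))
```

If the first import succeeds and the second fails, the package is installed but the compiled extension is probably missing; the printed path also shows which copy of the package Python picked up.
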
@jmarkow
Collaborator

jmarkow commented Mar 2, 2018

Strange error. Can you write down in this thread what you did to fix this, based on our Slack conversation?

@ralphpeterson
Author

ralphpeterson commented Mar 2, 2018

Kind of a funky workaround, but a manageable band-aid for now:

1.) install fresh conda environment on local machine

conda create -n mm python=2.7
source activate mm
git clone https://github.com/dattalab/moseq2_model.git
cd moseq2_model/
pip install numpy==1.13.0
pip install future
pip install six
pip install cython
export CC=/usr/local/bin/gcc-7
export CXX=/usr/local/bin/g++-7
pip install -e . --process-dependency-links

2.) mount remote filesystem to access data

sudo mkdir /Volumes/orchestra-groups; sudo sshfs -o allow_other,defer_permissions rep19@orchestra.med.harvard.edu:/groups/datta/Ralph /Volumes/orchestra-groups

3.) Change matplotlib backend to TkAgg à la: https://stackoverflow.com/questions/21784641/installation-issue-with-matplotlib-python

4.) learn a model

moseq2-model learn-model /Volumes/orchestra-groups/data-600k-size-matched-compressed.p ~/Desktop/t-dist.p --robust --separate-trans --nu 5 -p

Still not sure what the problem is with running this all on O2, but am going to try installing from a fresh env and a few other things.

@davidhbrann

You probably already tried this, but if it's a gcc/g++ issue, did you have GCC loaded with module load gcc/6.2.0 when you tried to install everything in orchestra? That's in my bashrc. It looks like O2 also has gcc/4.8.5, but I don't think downgrading should be necessary, since we don't downgrade when we compile pyhsmm on orchestra (which uses gcc-5.2.0) or on O2.

@ralphpeterson
Author

Yep, I tried with both gcc/6.2.0 and gcc/4.8.5 — no dice unfortunately.

@davidhbrann

Weird! The installation instructions worked for me on O2. I can try to help if you're around this weekend.

@ralphpeterson
Author

Update: tried fresh installing yesterday with David, but the problem persists. I'm able to run the code fine on my laptop using the methodology outlined in my comment above.

@jmarkow
Collaborator

jmarkow commented Mar 5, 2018

Bizarre. Do you have any custom environment variables set?

@ralphpeterson
Author

Here's my env. ~/Code/autoregressive doesn't actually exist.

>>>env

XDG_SESSION_ID=69324
TERM=xterm-256color
SHELL=/bin/bash
SSH_CLIENT=10.11.176.104 49326 22
SSH_TTY=/dev/pts/25
USER=rep19
MAIL=/var/mail/rep19
PATH=/home/rep19/anaconda2/bin:/home/rep19/bin:/n/cluster/bin:/usr/local/bin:/usr/bin:/opt/puppetlabs/bin:/n/cluster/bin:/usr/local/bin:/usr/bin:/opt/puppetlabs/bin
PWD=/home/rep19
SHLVL=2
HOME=/home/rep19
LOGNAME=rep19
SSH_CONNECTION=10.11.176.104 49326 134.174.159.21 22
XDG_RUNTIME_DIR=/run/user/106812
_=/usr/bin/env
OLDPWD=/home/rep19
LS_COLORS=rs=0:di=38;5;27:ln=38;5;51:mh=44;38;5;15:pi=40;38;5;11:so=38;5;13:do=38;5;5:bd=48;5;232;38;5;11:cd=48;5;232;38;5;3:or=48;5;232;38;5;9:mi=05;48;5;232;38;5;15:su=48;5;196;38;5;15:sg=48;5;11;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;21;38;5;15:ex=38;5;34:*.tar=38;5;9:*.tgz=38;5;9:*.arc=38;5;9:*.arj=38;5;9:*.taz=38;5;9:*.lha=38;5;9:*.lz4=38;5;9:*.lzh=38;5;9:*.lzma=38;5;9:*.tlz=38;5;9:*.txz=38;5;9:*.tzo=38;5;9:*.t7z=38;5;9:*.zip=38;5;9:*.z=38;5;9:*.Z=38;5;9:*.dz=38;5;9:*.gz=38;5;9:*.lrz=38;5;9:*.lz=38;5;9:*.lzo=38;5;9:*.xz=38;5;9:*.bz2=38;5;9:*.bz=38;5;9:*.tbz=38;5;9:*.tbz2=38;5;9:*.tz=38;5;9:*.deb=38;5;9:*.rpm=38;5;9:*.jar=38;5;9:*.war=38;5;9:*.ear=38;5;9:*.sar=38;5;9:*.rar=38;5;9:*.alz=38;5;9:*.ace=38;5;9:*.zoo=38;5;9:*.cpio=38;5;9:*.7z=38;5;9:*.rz=38;5;9:*.cab=38;5;9:*.jpg=38;5;13:*.jpeg=38;5;13:*.gif=38;5;13:*.bmp=38;5;13:*.pbm=38;5;13:*.pgm=38;5;13:*.ppm=38;5;13:*.tga=38;5;13:*.xbm=38;5;13:*.xpm=38;5;13:*.tif=38;5;13:*.tiff=38;5;13:*.png=38;5;13:*.svg=38;5;13:*.svgz=38;5;13:*.mng=38;5;13:*.pcx=38;5;13:*.mov=38;5;13:*.mpg=38;5;13:*.mpeg=38;5;13:*.m2v=38;5;13:*.mkv=38;5;13:*.webm=38;5;13:*.ogm=38;5;13:*.mp4=38;5;13:*.m4v=38;5;13:*.mp4v=38;5;13:*.vob=38;5;13:*.qt=38;5;13:*.nuv=38;5;13:*.wmv=38;5;13:*.asf=38;5;13:*.rm=38;5;13:*.rmvb=38;5;13:*.flc=38;5;13:*.avi=38;5;13:*.fli=38;5;13:*.flv=38;5;13:*.gl=38;5;13:*.dl=38;5;13:*.xcf=38;5;13:*.xwd=38;5;13:*.yuv=38;5;13:*.cgm=38;5;13:*.emf=38;5;13:*.axv=38;5;13:*.anx=38;5;13:*.ogv=38;5;13:*.ogx=38;5;13:*.aac=38;5;45:*.au=38;5;45:*.flac=38;5;45:*.mid=38;5;45:*.midi=38;5;45:*.mka=38;5;45:*.mp3=38;5;45:*.mpc=38;5;45:*.ogg=38;5;45:*.ra=38;5;45:*.wav=38;5;45:*.axa=38;5;45:*.oga=38;5;45:*.spx=38;5;45:*.xspf=38;5;45:
LANG=en_US.UTF-8
LESSOPEN=||/usr/bin/lesspipe.sh %s
LMOD_sys=Linux
MODULEPATH_ROOT=/n/app/lmod/lmod/modulefiles
MODULEPATH=/n/app/lmod/lmod/modulefiles/Linux:/n/app/lmod/lmod/modulefiles/Core
BASH_ENV=/n/app/lmod/lmod/init/bash
MANPATH=/n/app/lmod/lmod/share/man::
LMOD_PKG=/n/app/lmod/lmod
LMOD_DIR=/n/app/lmod/lmod/libexec
LMOD_CMD=/n/app/lmod/lmod/libexec/lmod
MODULESHOME=/n/app/lmod/lmod
LMOD_SETTARG_CMD=:
LMOD_FULL_SETTARG_SUPPORT=no
LMOD_VERSION=7.5.17
HMS_CLUSTER=o2
SHARED_DATABASES=/n/groups/shared_databases
TZ=America/New_York
QT_GRAPHICSSYSTEM_CHECKED=1
SLURM_CONF=/etc/slurm/slurm.conf
ZSH=/home/rep19/.oh-my-zsh
PAGER=less
LESS=-R
LC_CTYPE=en_US.UTF-8
LSCOLORS=Gxfxcxdxbxegedabagacad
PYTHONPATH=/home/rep19/Code/moseq-alt:/home/rep19/Code/autoregressive:/home/rep19/Code:/home/rep19/Code/pybasicbayes:/home/rep19/Code/syllables:
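
Since at least one PYTHONPATH entry above is reported not to exist, a quick check of every entry can be useful. This is a minimal sketch under that assumption; `missing_path_entries` is a hypothetical helper name, and it only tests whether each entry is an existing directory.

```python
from __future__ import print_function

import os


def missing_path_entries(path_value):
    """Return the entries of a PATH-style string that are not existing directories.

    Empty entries (for example from a trailing separator) are ignored.
    missing_path_entries is a hypothetical helper, not a library function.
    """
    entries = [p for p in path_value.split(os.pathsep) if p]
    return [p for p in entries if not os.path.isdir(p)]


if __name__ == "__main__":
    for entry in missing_path_entries(os.environ.get("PYTHONPATH", "")):
        print("missing:", entry)
```

Run inside the same shell session as the failing command, it prints one line per stale entry and nothing if every entry exists.
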

@jmarkow
Collaborator

jmarkow commented Mar 5, 2018

What's the output of ls /home/rep19/anaconda2/envs/mm/lib/python2.7/site-packages/pyhsmm/internals/?

I'm betting that module didn't get compiled, or something like that. You may need to reinstall with pip, with CC and CXX either unset or set correctly to point to your current versions of gcc and g++. So something like:

unset CC
unset CXX
module load gcc/6.2.0
pip uninstall moseq2_model pybasicbayes pyhsmm autoregressive
pip install -e . --process-dependency-links --no-cache-dir

This will make sure that pip doesn't use any cached results to reinstall.
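
After reinstalling, one way to confirm that the extension was actually built is to look for a compiled file next to the package sources. The sketch below is an assumption-laden illustration: `compiled_extensions` is a hypothetical helper, the `.so`/`.pyd` suffixes are the usual ones for compiled Python extensions, and the directory to inspect has to be supplied by the user (for example, the pyhsmm/internals directory named above).

```python
from __future__ import print_function

import glob
import os
import sys


def compiled_extensions(directory, stem):
    """Return sorted paths in directory named stem*.so or stem*.pyd.

    compiled_extensions is a hypothetical helper; it only matches file names
    and does not verify that the files are loadable.
    """
    found = []
    for suffix in (".so", ".pyd"):
        found.extend(glob.glob(os.path.join(directory, stem + "*" + suffix)))
    return sorted(found)


if __name__ == "__main__":
    # Assumed usage: pass the package's internals directory as the first argument.
    directory = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in compiled_extensions(directory, "hmm_messages_interface"):
        print(path)
```

An empty result for `hmm_messages_interface` would be consistent with the ImportError at the top of this thread.
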

@ralphpeterson
Author

It worked!! I wasn't using --no-cache-dir when installing previously. I will add it to the install instructions.

I ran this successfully with gcc/4.8.5, and the CC/CXX environment variables were not set.

@jmarkow
Collaborator

jmarkow commented Mar 5, 2018

Considering this issue closed.

@jmarkow jmarkow closed this as completed Mar 5, 2018