Segmentation fault with tfa.seq2seq.gather_tree #125

Closed
guillaumekln opened this issue Apr 1, 2019 · 9 comments
Labels
bug (Something isn't working), build

Comments

@guillaumekln
Contributor

System information

  • Have I written custom code: Yes
  • OS Platform and Distribution: Ubuntu 16.04
  • TensorFlow installed from: binary
  • TensorFlow version: 2.0.0a0
  • TensorFlow Addons installed from: PyPI
  • TensorFlow Addons version: 0.2.0
  • Python version and type: 2.7.12 (stock)
  • Is GPU used? No

Describe the bug

The gather_tree function from the seq2seq module fails with a Segmentation fault while the same code using tf.contrib.seq2seq does not.

Describe the expected behavior

The function should run without failures.

Code to reproduce the issue

import tensorflow as tf
import tensorflow_addons as tfa

step_ids = tf.constant([[[1, 2, 3], [1, 3, 3]]], dtype=tf.int32)  # [batch, beam, time]
parent_ids = tf.constant([[[0, 0, 0], [0, 1, 1]]], dtype=tf.int32)  # [batch, beam, time]
maximum_lengths = tf.constant([3], dtype=tf.int32)  # [batch]

step_ids = tf.transpose(step_ids, perm=[2, 0, 1])
parent_ids = tf.transpose(parent_ids, perm=[2, 0, 1])

ids = tfa.seq2seq.beam_search_decoder.gather_tree(
    step_ids, parent_ids, maximum_lengths, 3)
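For readers unfamiliar with the op, the following is a minimal pure-Python sketch of the backtracking that gather_tree is understood to perform, applied to the inputs above after the transpose to [time, batch, beam]. The function name `gather_tree_reference` is hypothetical and this is not the Addons kernel; it only illustrates the result a non-crashing call would be expected to return for this input.

```python
def gather_tree_reference(step_ids, parent_ids, max_sequence_lengths, end_token):
    """Backtrack beams through parent pointers.

    step_ids and parent_ids are nested lists shaped [time][batch][beam].
    This is an illustrative reference, not the TensorFlow Addons kernel.
    """
    max_time = len(step_ids)
    batch_size = len(step_ids[0])
    beam_width = len(step_ids[0][0])

    # The output starts filled with end_token.
    beams = [[[end_token] * beam_width for _ in range(batch_size)]
             for _ in range(max_time)]

    for b in range(batch_size):
        max_len = min(max_time, max_sequence_lengths[b])
        if max_len <= 0:
            continue
        for k in range(beam_width):
            # Copy the final step, then walk the parent pointers backwards.
            last = max_len - 1
            beams[last][b][k] = step_ids[last][b][k]
            parent = parent_ids[last][b][k]
            for t in range(last - 1, -1, -1):
                if not 0 <= parent < beam_width:
                    raise ValueError("parent id %d out of range" % parent)
                beams[t][b][k] = step_ids[t][b][parent]
                parent = parent_ids[t][b][parent]
            # Every position after the first end_token becomes end_token.
            finished = False
            for t in range(max_len):
                if finished:
                    beams[t][b][k] = end_token
                elif beams[t][b][k] == end_token:
                    finished = True
    return beams


# Inputs from the report, already transposed to [time, batch, beam].
step_ids = [[[1, 1]], [[2, 3]], [[3, 3]]]
parent_ids = [[[0, 0]], [[0, 1]], [[0, 1]]]
result = gather_tree_reference(step_ids, parent_ids, [3], 3)
print(result)  # [[[1, 1]], [[2, 3]], [[3, 3]]]
```

For this particular input every beam's parent at each step is itself or carries the same token, so the sketch returns the step ids unchanged.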
@seanpmorgan
Member

@qlzh727 Mind taking a look when time allows? It's possible this is related to the packaging, but we'll see.

@qlzh727
Member

qlzh727 commented Apr 1, 2019

Hmm, gather_tree eventually calls a C op, which might be the cause here. Is there any log output before it throws the error?

@seanpmorgan, we do have a unit test for gather_tree in https://github.com/tensorflow/addons/blob/master/tensorflow_addons/seq2seq/beam_search_ops_test.py. Have users raised any similar issues about using C ops since the recent reorg of the ops?

@seanpmorgan
Member

Not that I've seen. For example, here is a call to the C ops in image:
https://colab.research.google.com/drive/1_RFgu_glO5bJ1PRcxsMtiBR708fjsW7a

When I tried to run the snippet above, though, I got a kernel restart, so I will need to run it locally to see what the crash says. Hopefully I will have time to look into this in the next day or so.

seanpmorgan added the bug and seq2seq labels Apr 2, 2019

@seanpmorgan
Member

seanpmorgan commented Apr 2, 2019

I'm unable to replicate this on my local PC, but I do get a crash on Colab with no error message to work with. Because this works on my local PC and passes the tests that exercise it, I'm leaning heavily toward this being related to #119.

An example of core dump due to improper packaging can be seen here
apache/arrow#1450

@guillaumekln were you able to get any meaningful failure message?
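One way to capture at least a Python-level traceback when the interpreter dies is the standard-library `faulthandler` module. The sketch below is an assumption about how it could be applied here, not something used in this thread: it requires Python 3.3+ (the reporter's Python 2.7 would need a separately installed backport), and the log path and the helper name `enable_crash_log` are hypothetical. Writing to a file rather than stderr may help on Colab, where a kernel restart discards the console output.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
CRASH_LOG_PATH = os.path.join(tempfile.gettempdir(), "gather_tree_crash.log")


def enable_crash_log(path=CRASH_LOG_PATH):
    """Dump Python tracebacks of all threads to `path` on a fatal signal.

    The returned file object must stay open for as long as the handler
    is enabled, so the caller should keep a reference to it.
    """
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


crash_log = enable_crash_log()
# Run the reproduction snippet after this point. If the process receives
# SIGSEGV, the Python call stack at the time of the crash is written to
# CRASH_LOG_PATH before the interpreter exits.
```

This only shows which Python call was active; a native stack trace from the op itself would still need a debugger or a core dump.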

@seanpmorgan
Member

seanpmorgan commented Apr 2, 2019

On second thought, I'm not totally sold that it's related to the packaging, because the same sequence of imports should crash on both computers (if I follow the packaging issue correctly).

@guillaumekln
Contributor Author

> @guillaumekln were you able to get any meaningful failure message?

No, it was a plain Segmentation fault. Will try to dig deeper.

@seanpmorgan
Member

Thanks, here is another (probably) related issue. I bumped to gcc5 so we could build a py37 whl, but that was probably not worth it. I'll get feedback from the SIG BUILD meeting today and push a 0.2.1 soon.

tensorflow/tensorflow#27067

@seanpmorgan
Member

@guillaumekln This should be fixed in 0.2.1. Can you please confirm?
It looks good on the Colab notebook.

@guillaumekln
Contributor Author

Yes, looks like it is working now. Thanks!
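Anyone landing here with the same crash may want to check that the installed package is at least the 0.2.1 release mentioned above. The following is a small sketch under stated assumptions: the helper names are hypothetical, the version lookup relies on `importlib.metadata` (Python 3.8+), and only the leading major.minor.patch numbers are compared.

```python
import re

# Release named in this thread as containing the fix.
FIXED_VERSION = (0, 2, 1)


def parse_release(version):
    """Return the leading major.minor.patch numbers as a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return tuple(int(part) for part in match.groups())


def has_fix(version):
    """True if `version` is 0.2.1 or later."""
    return parse_release(version) >= FIXED_VERSION


def installed_addons_version():
    """Look up the installed tensorflow-addons version (Python 3.8+)."""
    from importlib.metadata import version
    return version("tensorflow-addons")
```

With the package installed, `has_fix(installed_addons_version())` reports whether an upgrade is needed.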
