# Inferencing does not work #3
If you're running on Windows, that might be the problem. Windows sometimes has problems running multiprocessing code with globals, due to the fact that it doesn't fork like Unix does. Regardless, the easiest thing to do is to remove the multiprocessing: simply remove line 242 (`with Pool`) and change line 243 so it calls the worker function directly. That'll be a bit slower, but if you're only doing a few graphs it won't be noticeable.
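For concreteness, a sketch of that edit might look like the following (hypothetical; `worker` and `idx_keys` come from the snippet quoted later in this thread, and the real code at lines 242-243 may differ):

```python
# Before (parallel), roughly:
#   with Pool(processes=processes, maxtasksperchild=maxtpc) as pool:
#       for fdata in pool.imap_unordered(worker, idx_keys, chunksize=chunksize):
#           ...  # store the result, update the progress bar

# After (serial): call the worker directly in the parent process
for fdata in map(worker, idx_keys):
    ...  # same per-result handling as before
```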
@bjascob Thanks again for the speedy response. No, this is on macOS, so I'd think it would work like Linux, since it's FreeBSD-based with a Mach kernel. I actually did something similar to remove the multiprocessing (see below). However, it's very slow. Does it split the work across processes when just using the model? In other words, the multiprocessing isn't used only for training, is it?

```python
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool
...
# Use a thread pool instead of a process pool so module globals
# (e.g. gfeaturizer) are shared with the workers
if use_multithreading:
    pool = ThreadPool(processes=processes)
else:
    pool = Pool(processes=processes, maxtasksperchild=maxtpc)
for fdata in pool.imap_unordered(worker, idx_keys, chunksize=chunksize):
    dn, midx, sspans, dspans, words, sfeats, pfeats, slabels, plabels = fdata
    feat_data[dn][midx] = {'sspans': sspans, 'dspans': dspans, 'words': words,
                           'sfeats': sfeats, 'pfeats': pfeats,
                           'slabels': slabels, 'plabels': plabels}
    pbar.update(1)
```
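(For context on why the `ThreadPool` branch works: threads run inside the parent interpreter and share its memory, so a module global like `gfeaturizer` that is set before the pool starts is visible to every worker. The trade-off is that CPU-bound featurization is then serialized by the GIL, which would explain the slowdown.)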
I believe this is just for inference, but I haven't looked at this code in a long time. If you're using a Mac, I'm not sure what the issue is. That global should be defined in all "forked" processes, unless FreeBSD uses a different MP scheme than "fork" as its default. You might want to look at the docs and see if the behavior is the same as Ubuntu. Your solution looks OK to me, though simply removing the multiprocessing altogether would be worth trying.
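One detail worth checking here (an aside, not from the original thread): since Python 3.8, the default multiprocessing start method on macOS is "spawn", not "fork", which would explain a global that is set in the parent but missing in the workers. A minimal check and override:

```python
import multiprocessing as mp

if __name__ == '__main__':
    # Python 3.8+ changed the macOS default from 'fork' to 'spawn'; spawned
    # children re-import modules instead of inheriting the parent's memory,
    # so globals assigned at runtime (like gfeaturizer) come back unset.
    print(mp.get_start_method())             # 'spawn' on recent macOS Pythons
    mp.set_start_method('fork', force=True)  # opt back into Linux-like forking
```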
@bjascob I've tested it and it works. I've also added a simple Makefile and setup.py for the project. You mentioned you'd like to merge this. Do you want me to create a pull request for any of it? If so, what do you want, and in which repos? Thanks again for all this great work.
It's unlikely I will add this to amrlib. I think there are better co-referencing approaches available, and I may take a look at those at some point in the future. Todo: add a note to the README about compatibility with non-Ubuntu systems due to multiprocessing.
@bjascob This makes sense. A question about the overall implementation: why did you decide to create a new model using AMR graphs, rather than using the alignments along with the output of, say, neuralcoref?
Co-referencing the sentence and then aligning to the graph was the only method available previously. I simply tried it this way to see if it would work. While it did work, the scores seemed somewhat mediocre and the code programmatically complex. Recently, two different papers/projects were posted that I believe do this directly as well. I'm not sure which of all of these approaches gives the best results. At some point I'd like to dig into this a bit deeper, but it's not a priority for me right now.
I have made these changes, along with another fix for an error raised when using the library from a child process. I've forked this repo and made the changes there, including my own setup to build wheels and deploy them to PyPI. I can create a pull request if you like. Regarding the other AMR co-reference methods: I'll comment in the other issue, but despite the issues you mentioned, this is still the best working co-reference library out there (thanks again).
I followed the instructions to run the inference script `` but I get the following error (below). It appears the module global variable `gfeaturizer` is not being set in `amr_coref.amr_coref.coref.coref_featurizer`. Before running, I added the downloaded model (from the releases section of this repo) and installed the LDC2020T02 corpus in the data directory.
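To illustrate the suspected failure mode, here is a minimal, hypothetical example (not the library's actual code) showing how a module global assigned at runtime in the parent stays unset in "spawn"-started workers:

```python
import multiprocessing as mp

gfeaturizer = None                 # module-level default, as in coref_featurizer

def worker(_):
    # Under 'fork' this sees the parent's value; under 'spawn' the child
    # re-imports the module and only the None default is ever assigned.
    return gfeaturizer

if __name__ == '__main__':
    gfeaturizer = 'loaded model'   # assigned at runtime, in the parent only
    with mp.Pool(2) as pool:
        print(pool.map(worker, [0, 1]))
    # 'fork'  -> ['loaded model', 'loaded model']
    # 'spawn' -> [None, None]
```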