python script hangs when trainer_count > 1 and call trainer.train() more than once. #2534
Comments
juliecbd changed the title from "python script hangs when trainer_count > 1 and call trainer.train() more than one times." to "python script hangs when trainer_count > 1 and call trainer.train() more than once." on Jun 20, 2017
I have hit the same bug: #2565.
lcy-seso assigned ghost, jacquesqiao and lcy-seso and unassigned gongweibao and typhoonzero on Jun 24, 2017
Just to confirm: does it hang in paddle.train or paddle.infer? @lcy-seso met a similar problem when calling paddle.infer multiple times.
Mine hangs when calling train().
Thanks, can you paste or give a demo to reproduce the problem?
Yes, just this simple script:
import paddle.v2 as paddle
import paddle.v2.dataset.uci_housing as uci_housing


def main():
    # init
    paddle.init(use_gpu=False, trainer_count=2)

    # network config
    x = paddle.layer.data(name='x', type=paddle.data_type.dense_vector(13))
    y_predict = paddle.layer.fc(input=x, size=1, act=paddle.activation.Linear())
    y = paddle.layer.data(name='y', type=paddle.data_type.dense_vector(1))
    cost = paddle.layer.mse_cost(input=y_predict, label=y)

    # create parameters
    parameters = paddle.parameters.create(cost)

    # create optimizer
    optimizer = paddle.optimizer.Momentum(momentum=0)

    trainer = paddle.trainer.SGD(
        cost=cost, parameters=parameters, update_equation=optimizer)

    feeding = {'x': 0, 'y': 1}

    # event_handler to print training and testing info
    def event_handler(event):
        if isinstance(event, paddle.event.EndIteration):
            if event.batch_id % 100 == 0:
                print "Pass %d, Batch %d, Cost %f" % (
                    event.pass_id, event.batch_id, event.cost)
        if isinstance(event, paddle.event.EndPass):
            result = trainer.test(
                reader=paddle.batch(uci_housing.test(), batch_size=2),
                feeding=feeding)
            print "Test %d, Cost %f" % (event.pass_id, result.cost)

    # training: the first call completes normally
    trainer.train(
        reader=paddle.batch(
            paddle.reader.shuffle(uci_housing.train(), buf_size=500),
            batch_size=2),
        feeding=feeding,
        event_handler=event_handler,
        num_passes=30)

    # the second call is the one that hangs when trainer_count > 1
    trainer.train(
        reader=paddle.batch(
            paddle.reader.shuffle(uci_housing.train(), buf_size=500),
            batch_size=2),
        feeding=feeding,
        event_handler=event_handler,
        num_passes=30)


if __name__ == '__main__':
    main()
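If the hang only appears when a second trainer.train() call runs in the same process with trainer_count > 1, one possible workaround (a guess, not a fix) is to isolate each training run in its own OS process, so every call starts from a freshly initialized Paddle runtime. A minimal sketch, where run_training() is a hypothetical helper standing in for the network/trainer setup above:

import multiprocessing

def run_training():
    # hypothetical helper: rebuild the layers, parameters, optimizer and
    # trainer exactly as in the script above, then call trainer.train() once
    pass

if __name__ == '__main__':
    # each training run gets its own process and therefore its own runtime
    for _ in range(2):
        p = multiprocessing.Process(target=run_training)
        p.start()
        p.join()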
Thanks, I reproduced this problem with your demo. It's very strange; I will have a look.
Thank you
Original issue description:

import paddle.v2 as paddle
paddle.init(use_gpu=False, trainer_count=2)

If we set trainer_count > 1 and call trainer.train() more than once, the program just hangs. If we set trainer_count = 1, the script runs as expected. What could be the reason? Thanks.
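Assuming the second train() call deadlocks rather than crashes, one way to see where the Python side is stuck is to install a signal handler before training and send SIGUSR1 to the hung process (kill -USR1 <pid>). This only shows Python-level frames, and the actual deadlock may sit inside Paddle's C++ worker threads, but it can narrow down which call never returns. A rough sketch:

import signal
import sys
import traceback

def dump_stacks(signum, frame):
    # print the current stack of every Python thread in this process
    for thread_id, stack in sys._current_frames().items():
        print "Thread %d:" % thread_id
        traceback.print_stack(stack)

# register before calling trainer.train(); trigger with `kill -USR1 <pid>`
signal.signal(signal.SIGUSR1, dump_stacks)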