[Feature] Enable multi gpu distributed training of fluid #9746

Merged: 12 commits into PaddlePaddle:develop from the multigpumultinode branch, Apr 11, 2018

typhoonzero (Contributor) commented Apr 8, 2018

Resolves #8139

Sample code to run multi-GPU distributed training:

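import numpy as np
import paddle.fluid as fluid
import paddle.fluid.core as core

# avg_cost, images, label, train_reader and args come from the surrounding
# model script; trainer_prog and trainer_id are unused in this snippet.
def train_loop_parallel(use_gpu, trainer_prog, trainer_id=0, bcast=False):
    place = core.CUDAPlace(0) if use_gpu else core.CPUPlace()
    # Initialize parameters once with a single-device executor.
    startup_exe = fluid.Executor(place)
    startup_exe.run(fluid.default_startup_program())
    # The parallel executor runs each training step across the local devices.
    exe = fluid.ParallelExecutor(use_gpu, avg_cost.name)

    feeder = fluid.DataFeeder(place=place, feed_list=[images, label])

    for pass_id in range(args.num_passes):
        for batch_id, data in enumerate(train_reader()):
            print("before run one...")
            loss, = exe.run([avg_cost.name], feed_dict=feeder.feed(data))
            if bcast:
                # Optionally broadcast parameters after each step.
                exe.bcast_params()
            print("Pass %d, batch %d, loss %s" % (pass_id, batch_id, np.array(loss)))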
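
The sample calls `exe.bcast_params()` without saying what a parameter broadcast does. As a rough illustration only, here is a pure-Python sketch, assuming the broadcast copies one device's parameters over every other device's copy. The function `broadcast_params`, the `root` argument, and the parameter names are hypothetical and are not part of the Paddle API; the real call works on device tensors, not Python lists.

```python
import copy


def broadcast_params(replicas, root=0):
    """Overwrite every replica's parameters with copies of replicas[root].

    `replicas` is a list of dicts mapping parameter names to values, one dict
    per device. This is a hypothetical stand-in for device-side storage.
    """
    source = replicas[root]
    for idx, replica in enumerate(replicas):
        if idx == root:
            continue
        for name, value in source.items():
            # Deep copy so that replicas do not share the same list object.
            replica[name] = copy.deepcopy(value)
    return replicas


# Two hypothetical per-device parameter sets that have drifted apart.
replicas = [
    {"fc.w": [0.1, 0.2], "fc.b": [0.0]},
    {"fc.w": [0.3, 0.4], "fc.b": [0.5]},
]
broadcast_params(replicas)
```

After the call, both replicas hold equal values but keep separate storage, so later per-device updates stay independent until the next broadcast.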

@typhoonzero typhoonzero changed the title from "[WIP] [Feature] Enable multi gpu distributed training of fluid" to "[Feature] Enable multi gpu distributed training of fluid" Apr 11, 2018
@typhoonzero typhoonzero merged commit 652cf43 into PaddlePaddle:develop Apr 11, 2018
@typhoonzero typhoonzero deleted the multigpumultinode branch April 11, 2018 10:58

3 participants