Initialize large table value randomly #9787
Conversation
// Only allocate the memory of large table on CPU
auto cpu = platform::CPUPlace();
float *data = tensor->mutable_data<float>(cpu);
VLOG(3) << "generate seed";
VLOGs not needed.
auto tensor = out->mutable_value();
tensor->Resize(framework::make_ddim(shape));
// Only allocate the memory of large table on CPU
auto cpu = platform::CPUPlace();
Do we need to enforce that dev_place is on CPU?
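If such a check is wanted, a minimal sketch (assuming the kernel's ExecutionContext ctx is in scope, and that platform::is_cpu_place and PADDLE_ENFORCE are available as elsewhere in the codebase) could be:

// Sketch: enforce that the large table is allocated on the CPU place.
PADDLE_ENFORCE(platform::is_cpu_place(ctx.GetPlace()),
               "The large table must be allocated on CPUPlace");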
for (int64_t idx = 1; idx < rows_size; ++idx) {
  (*rows)[idx] = (*rows)[idx - 1] + shard_cnt;
}
out->set_height(max_id);
As discussed offline with @Yancey1989 and @jacquesqiao, this needs to be implemented using an auto-growth SelectedRows, initialized randomly with a startup buffer size such as 128M. This feature may also require implementing the corresponding operations for when "Prefetch" runs.
When the lookup table op finds that some id is not in the rows of the table, it should push this new id into table.rows and initialize it.
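A rough sketch of that idea (hypothetical, not the actual lookup_table kernel; table and ids are illustrative names):

// Hypothetical: append unseen ids to the SelectedRows so they can be initialized.
// Requires <algorithm> for std::find; `table` is a framework::SelectedRows*.
auto* rows = table->mutable_rows();
for (int64_t id : ids) {
  if (std::find(rows->begin(), rows->end(), id) == rows->end()) {
    rows->push_back(id);  // new id: grow table.rows ...
    // ... and randomly initialize the corresponding row of table.value().
  }
}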
@jacquesqiao Yes. I had an offline discussion with @typhoonzero; maybe we need a new class to represent the Table, and I created issue #9841 with the details. If you agree, I will create a new PR to implement it.
FROM @typhoonzero:
"and init it randomly with a startup init buffer size, like 128M"
We already have an attribute shape that represents the shape of SelectedRows.value(); maybe we should use only shape to determine the memory size. Using both shape and a separate memory size at the same time could be confusing.
And maybe we can reuse uniform_random_op, since they have the same computing logic.
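For reference, the shared computing logic is essentially a standard uniform fill; a sketch, assuming data, size, seed, min, and max have already been read from the op's attributes and T is the kernel's element type template parameter:

// Fill `data` with `size` values drawn uniformly from [min, max) using `seed`.
std::minstd_rand engine;
engine.seed(seed);
std::uniform_real_distribution<T> dist(static_cast<T>(min),
                                       static_cast<T>(max));
for (int64_t i = 0; i < size; ++i) {
  data[i] = dist(engine);
}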
tensor->Resize(framework::make_ddim(shape));
// Only allocate the memory of large table on CPU
auto cpu = platform::CPUPlace();
float *data = tensor->mutable_data<float>(cpu);
This float should be a template parameter.
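i.e., roughly (a sketch using the kernel's template parameter T):

T *data = tensor->mutable_data<T>(cpu);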
auto shape = Attr<std::vector<int>>("shape");

auto tensor = out->mutable_value();
tensor->Resize(framework::make_ddim(shape));
The first dimension of the tensor is not fixed; it should be calculated from some attributes, such as id_num / shard_cnt + buffer_size. Do you mean that this calculation is done in Python?
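A sketch of the intended calculation (id_num, shard_cnt, buffer_size, and embedding_dim are illustrative names here, not existing attributes):

// Hypothetical: rows owned by one shard, plus some growth headroom.
int64_t rows_per_shard = id_num / shard_cnt + buffer_size;
tensor->Resize(framework::make_ddim({rows_per_shard, embedding_dim}));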
"Do you mean that this calculation is done in Python?"
I think so, but per the discussion just now (#9787 (comment)), I will update this PR accordingly.
@jacquesqiao @typhoonzero I created an issue #9841 and will implement it ASAP.
  tensor = out_var->GetMutable<framework::SelectedRows>()->mutable_value();
  tensor->Resize(framework::make_ddim(shape));
} else {
  PADDLE_THROW("Only support SelectedRows and Tensor");
uniform_random_op's output only support...
Done.
@@ -24,7 +24,17 @@ template <typename T>
 class CPUUniformRandomKernel : public framework::OpKernel<T> {
  public:
   void Compute(const framework::ExecutionContext& ctx) const override {
-    auto* tensor = ctx.Output<framework::Tensor>("Out");
+    framework::Tensor* tensor(nullptr);
framework::Tensor *tensor = nullptr;
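Putting the snippets above together, the output handling in the kernel would look roughly like this (a sketch, not necessarily the exact final code of this PR):

framework::Tensor *tensor = nullptr;
auto out_var = ctx.OutputVar("Out");
if (out_var->IsType<framework::LoDTensor>()) {
  // Plain tensor output: write directly into it.
  tensor = out_var->GetMutable<framework::LoDTensor>();
} else if (out_var->IsType<framework::SelectedRows>()) {
  // SelectedRows output: resize its value tensor from the "shape" attribute.
  auto shape = ctx.Attr<std::vector<int>>("shape");
  tensor = out_var->GetMutable<framework::SelectedRows>()->mutable_value();
  tensor->Resize(framework::make_ddim(shape));
} else {
  PADDLE_THROW(
      "uniform_random_op's output only supports SelectedRows and LoDTensor");
}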
Fixed #9777
issue: #9211