Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scope design doc #2548

Merged
merged 32 commits into from
Jun 27, 2017
Merged

scope design doc #2548

merged 32 commits into from
Jun 27, 2017

Conversation

jacquesqiao
Copy link
Member

@jacquesqiao jacquesqiao commented Jun 21, 2017

@reyoung @kuke and @jacquesqiao will coedit this doc on the design of Scope

Markdown link is here

Fixes #2546


private:
/// variable name -> variable
std::unordered_map<std::string, VariablePtr> vars_;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why unordered_map other than map? I am not sure which is better -- I'd always believed the former is a hash map and the latter is an RD-tree.

Anyway, I noticed that Caffe2 is using map in its Workspace:

 private:
  BlobMap blob_map_;

where BlobMap was defined as a CaffeMap at here:

typedef CaffeMap<string, unique_ptr<Blob> > BlobMap;

and CaffeMap is defined here

using CaffeMap = std::map<Key, Value>;
// using CaffeMap = std::unordered_map;

Copy link
Contributor

@dzhwinter dzhwinter Jun 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yangqing has metioned why he prefer map instead unordered_map
here

Note(Yangqing): NVCC does not play well with unordered_map on some platforms,
// forcing us to use std::map instead of unordered_map.

using VariablePtr = std::shared_ptr<Variable>;

class Scope final {
public:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a reminder -- I've put .clang-format files that strictly follows Google style in new source directories paddle/platform, paddle/framework. So this line would trigger clang-format error because Google style requires a single space before public:.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


```cpp
class Variable;
using VariablePtr = std::shared_ptr<Variable>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need this using alias because in a different scenario we might use different smart pointers. I'd suggest that we write everything in the definition of vars_ like:

using string = std::string; // So we can switch to other implementation of strings later.
std::map<string, unique_ptr<Variable> > vars_;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should have a global header called TypeDefs.h. In this header, we specify all default types for Map, String, Set, etc. Like

namespace paddle {
template <typename KEY, typename VAL>
using Map = std::unordered_map<KEY, VAL>;
using String = std::string;
template <typename T>
using Set = std::unordered_set<T>;
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us do such kind of summarization later -- after we make sure that some types are common to various situations.


```cpp
class Variable;
using VariablePtr = std::shared_ptr<Variable>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, Caffe2 is using unique_ptr in its Workspace definition:

typedef CaffeMap<string, unique_ptr<Blob> > BlobMap;

Why do we use shared_ptr here?

To my understand, unique_ptr is the right choice here because it transfers ownership of the pointed object when copy, whereas shared_ptr increment the reference count when copy.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using unique_ptr and return a raw pointer, users must not hold this pointer inside private member or any global variable. Because when scope is destroyed, all pointers get before are invalid.

I think to use shared_ptr inside a Scope, but return a weak_ptr when GetVariable can resolve this issue. But this code is a little bit complicated.

Maybe we should have an agreement. We cannot store any variable pointer in our code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us don't save pointers from unique_ptr as a class data member. I don't think gemm need to hold such a pointer when we write an operator.


```cpp
Scope global;
auto x = newVar("X"); // x is created in scope global, implicitly.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a remind -- the following functions/methods do not following Google naming style:

newVar ==> NewVar
addOp ==> AddOp
run ==> Run

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

};
```

You need to specify a scope to run a Net. One net can run in different scopes and update different variable in the scope. If you did not specify one, It will run in a default scope.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed!

I second removing the requirement that a net belongs to a workspace and am happy to see that our Scope doesn't have a line like the one in caffe2::Workspace:

class Workspace {
  ...
  NetMap net_map_;


//! Get Variable in this scope.
//! @return nullptr if no such variable.
const VariablePtr& getVar(const std::string& name) const;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the return type must be Variable* if we use unique_ptr other than shared_ptr in vars_.

Also, following Google code style, this method should be named GetVar.

const VariablePtr& getVar(const std::string& name) const;

//! Create or get a variable in this scope.
VariablePtr& createOrGetVar(const std::string& name);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename it into GetOrCreateVar? After all, the primary purpose of this method is to "get", other than "create" -- it creates only if it cannot get.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@wangkuiyi wangkuiyi mentioned this pull request Jun 21, 2017
1. Scope should destruct all Variables within it when itself is destructed.

Because Variable can only be got from Scope, when destroying Scope, we also need to destroy all the Vars in it.
Variable can not belong to many scopes. If you want to use variables from parent scope, you can use `parent scope`.
Copy link
Member Author

@jacquesqiao jacquesqiao Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use multi instead of many is better?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe multiple ?


1. We can create local variables in a local scope. When that local scope are destroyed, all local variables should also be destroyed.
2. Variables in a parent scope can be retrieved from local scopes of that parent scope, i.e., when user get a variable from a scope, it will try to search this variable in current scope. If there is no such variable in the local scope, `scope` will keep searching from its parent, until the variable is found or there is no parent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

在编程语言里面,scope是可以多层嵌套的。这里scope可以嵌套多层吗?比如如果local没有,就先找parent,然后再找parent的parent,直到找到为止。

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

读到文档末尾看到了,是可以嵌套的

class Scope {
public:
Scope(const std::shared_ptr<Scope>& scope): parent_(scope) {}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里传递的参数是指针的引用,是说parent_这个指针还会发生变化吗?也就是说一个Scope的parent scope是可以自己修改的?

Copy link
Collaborator

@reyoung reyoung Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

对于复杂类型的传参,应该传递const 引用。而parent_调用了shared_ptr的复制方法,复制了一份scope的指针。

如果这里改成传递std::shared_ptr<Scope>,会在传参的时候创建一个临时变量。shared_ptr的开销在于每次创建变量的时候,要对这个变量加一个全局的Mutex,再增加一下计数器。所以开销也不算特别小。

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

一个Scope的parent scope自己不可以修改。

Copy link
Collaborator

@wangkuiyi wangkuiyi Jun 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us don't over use smart pointers. I believe the following would be enough for this case.

explicit Scope(const Scope& parent) : parent_(parent) {}

FYI, Caffe2 has the following:

explicit Workspace(Workspace* const shared) : shared_(shared) {}


private:
std::shared_ptr<Scope> parent_;
std::unordered_map<std::string, std::unique_ptr<Attribute>> vars_;
Copy link
Member

@QiJune QiJune Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里似乎是笔误吧,放置了一个Attribute

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if (var != nullptr) {
return var;
} else if (parent_ != nullptr) {
return parent_->Get(name);
Copy link
Member

@QiJune QiJune Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

既然parent也是scope,在这里调用的Get方法,那么Scope里面需要定义一个Get接口吧?还是说这里笔误了,应该是return parent_->GetVariable(name)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

```
## Only scope can create a variable

To ensure `only scope can create a variable`, we should mark `Variable`'s constructor as a private member function, and Scope is a friend class of Variable. And then only `CreateVariable` can construct `Variable`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I have a variable_test.cc, this test file must contain scope.h?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我在想我们是在写代码的时候就要限制死用户不能在其他地方创建Variable,还是这个只是个『约定』。

感觉如果写代码的时候限制死,用户在其他地方创建Variable的时候直接报编译错,似乎更科学。

如果要这样的话,variable_test.cc如果需要写单测就要include scope.h

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I second @hedaoyuan . I don't think we need the restriction that Variables can only be created by Scope.

Copy link
Contributor

@hedaoyuan hedaoyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember this afternoon we discussed the third point in this comments(#2545 (comment)).
The grad of the value is recorded by the Scope. But I did not seem to see the design.

@reyoung
Copy link
Collaborator

reyoung commented Jun 22, 2017

I remember this afternoon we discussed the third point in this comments(#2545 (comment)).
The grad of the value is recorded by the Scope. But I did not seem to see the design.

如何根据一个参数的名字,来找到他的梯度。这件事情并不是Scope设计中的事情。Scope只存储了名字到变量的映射。

如何根据一个参数的名字找到他的梯度,目前来看可以有两种方法:

  • 方法1: 参数的名字和梯度的名字二者具有一定规律。例如参数都叫做xxx_param,而梯度都叫做xxx_param_grad。这样二者就可以相互对应了。
  • 方法2: 在另一个地方存储一份映射表。里面的key是参数的名字,而value是梯度的名字。

不过,这两种方法究竟应该选择哪个,如何做,应该是另一个issue和PR确定的事情。

@jacquesqiao
Copy link
Member Author

jacquesqiao commented Jun 22, 2017

一点疑问
Variable是不是一定只能由Scope创建,Op.Run()内部可能会创建一些临时的Variable么,为什么?

@reyoung
Copy link
Collaborator

reyoung commented Jun 22, 2017

Op内部可能会创建一些临时的Variable

  1. Op是无状态的,不能在Op中保存临时变量。
    • Op的run函数是const的,它也没办法修改类的private member。
  2. 如果Op是有状态的,一个神经网络,运行时切换Scope就不能完成了。因为如果Op在显卡1上申请了临时变量,而切换到显卡2的Scope,就根本没办法跑了。
  3. 如果每一次run函数需要临时的计算局部变量,也应该创建一个Tensor而不是Variable。Variable是Op间的输入输出,而Tensor可以随时创建销毁。
     void run(...) const {
        Tensor<Place> tmp;
     }
  4. 一个Op函数,如果有某些变量需要充当这个Op的状态,例如BatchNorm中的MovingMeansMovingStd,也都应该是存在Scope中的Variable。

所以Variable只能由Scope创建。

@wangkuiyi
Copy link
Collaborator

I don't think Operators are "stateless". an RNNOp would even create a sub-scope as it runs.

Actually, I used to think that Net and Scope can be decoupled, but above seems an objective example. What do you think about this?

@reyoung
Copy link
Collaborator

reyoung commented Jun 23, 2017

I don't think Operators are "stateless". an RNNOp would even create a sub-scope as it runs.

RNN Op will create its sub-scopes actually. But it should return sub-scopes as Operator's output. The sub-scopes should not store locally in the operator class.

In general, if an Op holds a GPU memory by private data member, we cannot switch Scope when running this Op.

If an Op needs some temporary memory while computing, we can just use Tensor<GPUPlace> to create a temporary tensor. As we will write an effective memory allocation module, to create temporary tensor while running should be fast.

If an Op needs some temporary memory sharing between multiple ops, it should be a variable in scope by our definition.

@QiJune
Copy link
Member

QiJune commented Jun 23, 2017

在caffe2中,除了workspace中的blobMap保存了所有blob的信息,python端还提供了一个ModleHelper的类,抽象出了一个Model的概念,用来索引所有参数相关的blob;在dynet中,也存在一个Model的抽象,用来索引所有的参数。
当然,我们可以只需要scope,让用户来自己写optimize中参数更新部分的逻辑,对应更新相应的参数。但这样给用户的负担可能比较大,因此建议引入Model的抽象。

那么我们可能的做法有以下两个:

  • 在C++端就能够实现配置一个网络,完成网络的训练,那么我觉得在Scope之外,还需要一个Model的概念;
  • C++端仍然只是有一个Scope的概念,在python端做一个model(这样就与python强绑定了)

@reyoung
Copy link
Collaborator

reyoung commented Jun 23, 2017

在C++端就能够实现配置一个网络,完成网络的训练,那么我觉得在Scope之外,还需要一个Model的概念;

还是在C++这边实现这些吧。。不要和Python强绑定为好。

@hedaoyuan
Copy link
Contributor

Op是无状态的,不能在Op中保存临时变量
如果Op是有状态的,一个神经网络,运行时切换Scope就不能完成了。因为如果Op在显卡1上申请了临> 时变量,而切换到显卡2的Scope,就根本没办法跑了。

我也不认为OP只能是stateless的,另外,是否stateless和有无临时变量也没关系。卷积OP里面有临时变量并不妨碍什么。
另外,我记得讨论的时候说的Op是带Device模板参数的,确定Scope可以从显卡1切换到显卡2吗?

@reyoung
Copy link
Collaborator

reyoung commented Jun 23, 2017

是否stateless和有无临时变量也没关系
线下交流,道远比较担心在 Run函数中每次使用Tensor<GPU> tmp;申请临时变量比较耗费时间。特别是变量大小比较大的情况。

我们肯定期待使用高效的alloc算法会让临时变量的申请和释放不耗时。但是如果Tensor<GPU> tmp申请释放比较耗时,那么我们可以将临时的变量加到这个Op的output中。这样,这个Variable也会在Scope中被创建出来,不会消亡。

即:

class OpBase {
 private:
  vector<string> outputs_;
};

class SomeOpUseHugeMemory : public OpBase {
 public:
  SomeOpUseHugeMemory() {
    outputs_.push_back("HugeTempVariableName");
  }
};

@jacquesqiao
Copy link
Member Author

jacquesqiao commented Jun 23, 2017

在C++端就能够实现配置一个网络,完成网络的训练,那么我觉得在Scope之外,还需要一个Model的概念;

看了下ModelHelper貌似是对Cpp中的net和parameters的一层封装,net就是core.Net(), parameter是BlobReference,按照现在的设计,在实现Python部分的时候,应该也需要这样一层抽象

Just like [scope](https://en.wikipedia.org/wiki/Scope_(computer_science)) in programming languages, `Scope` in the neural network can also be a local scope. There are two attributes about local scope.

1. We can create local variables in a local scope. When that local scope are destroyed, all local variables should also be destroyed.
2. Variables in a parent scope can be retrieved from local scopes of that parent scope, i.e., when user get a variable from a scope, it will try to search this variable in current scope. If there is no such variable in the local scope, `scope` will keep searching from its parent, until the variable is found or there is no parent.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a local scope can have multiple parent scopes.
If a local scope Ls has two parent scopes PsA and PsB; PsA and PsB have two variables called a, and b, respectively.
Ls want to use PsA.a and PsB.b how to do?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, user cannot access PsA.a and PsB.b in one local scope.

The scope is a linked-list. It will get local variable firstly, and local variable will hide parent variables.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just not at present, or never? Whether to consider later?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this situation is not quite useful right now.

reyoung added a commit to reyoung/Paddle that referenced this pull request Jun 23, 2017
The original discuess in

* PaddlePaddle#2548 (comment)
* PaddlePaddle#2579 (comment)

This commit is just a proposal, let's do such kind of summarization in
this PR.
@jacquesqiao jacquesqiao merged commit 718eff9 into PaddlePaddle:develop Jun 27, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants