scope design doc #2548

Merged
merged 32 commits into from
Jun 27, 2017
Commits
bcac91a
scope design doc
jacquesqiao Jun 21, 2017
002a6c9
Rearrange docs
reyoung Jun 21, 2017
674b1d3
Update code
reyoung Jun 21, 2017
7a48507
fix code style
jacquesqiao Jun 21, 2017
3e09978
Add scope doc
reyoung Jun 22, 2017
acf0b75
Merge branch 'scope' of https://github.com/jacquesqiao/Paddle into fe…
reyoung Jun 22, 2017
04ad9b6
Add Scope Parent & Local section
reyoung Jun 22, 2017
581e4c1
Parent & local scope done
reyoung Jun 22, 2017
0b70361
Refining english
reyoung Jun 22, 2017
76e2a3c
Refine English
reyoung Jun 22, 2017
8282138
some properties of scope
jacquesqiao Jun 22, 2017
4413726
Merge branch 'scope' of https://github.com/jacquesqiao/Paddle into scope
jacquesqiao Jun 22, 2017
d7aca77
Update API
reyoung Jun 22, 2017
2d5507f
Add interfaces
reyoung Jun 22, 2017
64a1cdf
Merge branch 'scope' of https://github.com/jacquesqiao/Paddle into fe…
reyoung Jun 22, 2017
73b1c5b
add overview for scope design doc
Jun 22, 2017
1f0056b
Update interface
reyoung Jun 22, 2017
0b07583
Merge branch 'scope' of https://github.com/jacquesqiao/Paddle into scope
Jun 22, 2017
17eed33
Update key attributes
reyoung Jun 22, 2017
37fd48b
some detailed explaination of the Scope properties
jacquesqiao Jun 22, 2017
c3a4b8b
refine style of markdown
jacquesqiao Jun 22, 2017
db96c0e
Use unique_ptr instead of shared_ptr/weak_ptr.
reyoung Jun 22, 2017
f104ce2
fix a mistake share by nets -> share by scopes
jacquesqiao Jun 22, 2017
32fe097
Merge branch 'scope' of https://github.com/jacquesqiao/Paddle into fe…
reyoung Jun 22, 2017
eab0e52
To google code style
reyoung Jun 22, 2017
63a56b4
Change typo
reyoung Jun 22, 2017
921fa13
Remove delete
reyoung Jun 22, 2017
5d88249
Typo
reyoung Jun 22, 2017
f8a209c
Rearrange description.
reyoung Jun 22, 2017
c5ad89a
Change title
reyoung Jun 22, 2017
237efc2
Fix markdown
reyoung Jun 22, 2017
3bac2d0
Typo
reyoung Jun 22, 2017
124 changes: 124 additions & 0 deletions doc/design/scope.md
@@ -0,0 +1,124 @@
# Design of Scope in Paddle

## Overview

Scope is an important concept in programming languages. It defines a program region in which a set of bindings between names and entities applies. In a specific scope, a valid name is uniquely associated with an entity, such as a variable. In another scope, the same name may refer to another entity or to nothing at all. Scope thus restricts the visibility and validity of names in a program. Hence **Scope** is introduced to PaddlePaddle to manage variables in context. Unlike the original abstract concept, however, Scope here is an object with two important attributes:

- Scope is an association of a name to a variable.
- Variables in a parent scope can be retrieved from a local scope.

A detailed explanation of these two attributes follows.


## Scope is an association of a name to a variable

Scope is an association of a name to a variable. All variables belong to a `Scope`. You need to specify a scope to run a Net, i.e., `net.Run(&scope)`. One net can run in different scopes and update different variables in each scope.


1. Scope only contains a map of a name to a variable.

   All parameters, data, and states in a Net should be variables stored inside a scope. Each op should get its inputs and outputs, such as data buffers and states (e.g., momentum), from a scope to do its computation.

1. A Variable can only be created by a Scope, and a variable can only be obtained from a Scope. Users cannot create or get a variable outside a scope. This is a constraint of our framework, and it keeps the framework simple and clear.

1. Scope only contains methods that are used to Create and Get Variables. Scope does not contain Operators and has no information to run them.
   `Net` is designed to drive the computation, and Scope only contains a map of variables. There is no computation logic inside a `Scope`. Scope just handles the lifetime management of variables.
    - `Create` is used to create a Variable by its name and add the mapping relation.
    - `Get` is used to find a Variable by name.

1. Every variable belongs to exactly one Scope.

   A Variable cannot belong to multiple scopes. If you want to use variables from a parent scope, you can reach them through the local scope's parent link (see "Parent scope and local scope").

1. Scope should destruct all Variables inside it when it is itself destructed. Users must never store a `Variable` pointer somewhere else.

   Because a Variable can only be obtained from a Scope, destroying the Scope must also destroy all the Variables in it. If a user stores a `Variable` pointer in a private data member or a global variable, that pointer becomes invalid when the associated `Scope` is destroyed.

```cpp
class Scope {
 public:
  Variable* CreateVariable(const std::string& name);
  const Variable* GetVariable(const std::string& name) const;

 private:
  std::unordered_map<std::string, std::unique_ptr<Variable>> vars_;
};
```
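As a minimal sketch of the declaration above, the following fills in the two methods so that the name-to-variable mapping can be exercised. The `value` field of `Variable` and the choice to return `nullptr` on a duplicate name are assumptions made for illustration; the design doc does not fix either.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical payload; the design doc does not define Variable's contents.
struct Variable {
  float value = 0.0f;
};

class Scope {
 public:
  // Creates a variable under `name`. Returning nullptr when the name is
  // already taken is an assumption of this sketch.
  Variable* CreateVariable(const std::string& name) {
    auto result = vars_.emplace(name, nullptr);
    if (!result.second) return nullptr;
    result.first->second.reset(new Variable());
    return result.first->second.get();
  }

  // Returns the variable bound to `name`, or nullptr if there is none.
  const Variable* GetVariable(const std::string& name) const {
    auto it = vars_.find(name);
    return it == vars_.end() ? nullptr : it->second.get();
  }

 private:
  std::unordered_map<std::string, std::unique_ptr<Variable>> vars_;
};

// The same name resolves to a different variable in each scope.
void Example() {
  Scope a, b;
  a.CreateVariable("w")->value = 1.0f;
  b.CreateVariable("w")->value = 2.0f;
  assert(a.GetVariable("w") != b.GetVariable("w"));
  assert(a.GetVariable("missing") == nullptr);
}
```

Because each `Scope` owns its variables through `std::unique_ptr`, a net that runs against scope `a` and then against scope `b` touches two independent sets of variables.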


## Parent scope and local scope

Just like [scope](https://en.wikipedia.org/wiki/Scope_(computer_science)) in programming languages, `Scope` in the neural network can also be a local scope. A local scope has two attributes.

1. We can create local variables in a local scope. When that local scope is destroyed, all of its local variables should also be destroyed.
2. Variables in a parent scope can be retrieved from the local scopes of that parent scope. That is, when a user gets a variable from a scope, the scope first searches for the variable locally. If there is no such variable in the local scope, the `scope` keeps searching in its parent until the variable is found or there is no parent left.

Contributor

It looks like a local scope can have multiple parent scopes.
Suppose a local scope Ls has two parent scopes, PsA and PsB, and PsA and PsB have variables called a and b, respectively.
If Ls wants to use both PsA.a and PsB.b, how can it do that?

Collaborator

Currently, a user cannot access both PsA.a and PsB.b in one local scope.

The scopes form a linked list. A lookup tries the local variables first, and a local variable hides parent variables of the same name.

Contributor

Is that just for now, or never? Will it be considered later?

Collaborator

I don't think this case is useful right now.


Member

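In programming languages, scopes can be nested to multiple levels. Can scopes here be nested to multiple levels too? For example, if a variable is not in the local scope, look in the parent first, then in the parent's parent, and so on until it is found.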

Member

I read to the end of the doc and saw that nesting is supported.

```cpp
class Scope {
 public:
  Scope(const std::shared_ptr<Scope>& scope): parent_(scope) {}

Member

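The parameter here is passed as a reference to a pointer. Does that mean the parent_ pointer can still change? In other words, can a Scope's parent scope be modified by the Scope itself?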

Collaborator

@reyoung Jun 22, 2017

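Arguments of complex types should be passed by const reference. parent_ then invokes shared_ptr's copy operation and makes its own copy of the scope pointer.

If this were changed to pass std::shared_ptr<Scope> by value, a temporary would be created when the argument is passed. The cost of shared_ptr is that every time one is created, it has to take a global mutex on the variable and then increment the counter. So the overhead is not especially small.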

Collaborator

A Scope cannot modify its own parent scope.

Collaborator

@wangkuiyi Jun 23, 2017

Let's not overuse smart pointers. I believe the following would be enough for this case.

explicit Scope(const Scope& parent) : parent_(parent) {}

FYI, Caffe2 has the following:

explicit Workspace(Workspace* const shared) : shared_(shared) {}

  Variable* GetVariable(const std::string& name) const {
    Variable* var = GetVarLocally(name);
    if (var != nullptr) {
      return var;
    } else if (parent_ != nullptr) {
      return parent_->GetVariable(name);
    } else {
      return nullptr;
    }
  }

 private:
  std::shared_ptr<Scope> parent_ {nullptr};
};
```

The `Scope` class has a private data member called `parent_`, a smart pointer to its parent scope. When a user gets a variable by its `name`, the `name` is first searched for in the current scope. If the variable cannot be found locally and the parent scope is not `nullptr`, the search continues in the parent scope. The default value of `parent_` is `nullptr`, which means the scope is a global scope.
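The lookup rule can be sketched as follows, with a local scope that first resolves a name through its parent and then hides it with a local variable. This is an illustrative sketch: the `value` field is hypothetical, the constructor is public for brevity, and `CreateVariable` here returns the existing local variable if the name is already bound, which is a simplification rather than the documented behavior.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

struct Variable {
  int value = 0;  // hypothetical payload
};

class Scope {
 public:
  explicit Scope(const std::shared_ptr<Scope>& parent = nullptr)
      : parent_(parent) {}

  // Simplification: returns the existing local variable if already bound.
  Variable* CreateVariable(const std::string& name) {
    auto& slot = vars_[name];
    if (!slot) slot.reset(new Variable());
    return slot.get();
  }

  // Local lookup first, then the parent chain.
  Variable* GetVariable(const std::string& name) const {
    auto it = vars_.find(name);
    if (it != vars_.end()) return it->second.get();
    return parent_ ? parent_->GetVariable(name) : nullptr;
  }

 private:
  std::shared_ptr<Scope> parent_;
  std::unordered_map<std::string, std::unique_ptr<Variable>> vars_;
};

void Example() {
  auto global = std::make_shared<Scope>();
  auto local = std::make_shared<Scope>(global);

  Variable* w = global->CreateVariable("w");
  assert(local->GetVariable("w") == w);  // found through the parent

  Variable* shadow = local->CreateVariable("w");
  assert(shadow != w);
  assert(local->GetVariable("w") == shadow);  // local hides the parent's "w"
  assert(global->GetVariable("w") == w);      // the parent is unaffected
  assert(global->GetVariable("h") == nullptr);
}
```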

A local scope is very useful when we implement a Recurrent Neural Network. Each timestep of an RNN should be a `Net`. Each timestep's `Net` (`StepNet` for short) should use an independent local scope, just as the variables in the body of a while loop live in a local scope in programming languages. By using a single `StepNet` and changing its local scope, we can implement an RNN easily.
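To make the RNN usage concrete, here is a rough sketch in which one step net is run once per timestep, each time against a fresh local scope whose parent holds the shared weight. Everything in it is hypothetical: the `StepNet` struct stands in for a real `Net`, the variable names `w`, `x`, and `out` are invented, and the "computation" is a single multiplication. Memory links between consecutive timesteps are deliberately left out.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

struct Variable {
  float value = 0.0f;  // hypothetical payload
};

class Scope {
 public:
  explicit Scope(const std::shared_ptr<Scope>& parent = nullptr)
      : parent_(parent) {}

  Variable* CreateVariable(const std::string& name) {
    auto& slot = vars_[name];
    if (!slot) slot.reset(new Variable());
    return slot.get();
  }

  Variable* GetVariable(const std::string& name) const {
    auto it = vars_.find(name);
    if (it != vars_.end()) return it->second.get();
    return parent_ ? parent_->GetVariable(name) : nullptr;
  }

 private:
  std::shared_ptr<Scope> parent_;
  std::unordered_map<std::string, std::unique_ptr<Variable>> vars_;
};

// Hypothetical stand-in for a Net: reads "w" and "x", writes "out".
struct StepNet {
  void Run(Scope* scope) const {
    Variable* w = scope->GetVariable("w");  // resolved in the parent scope
    Variable* x = scope->GetVariable("x");  // resolved in the local scope
    assert(w != nullptr && x != nullptr);
    scope->CreateVariable("out")->value = w->value * x->value;
  }
};

std::vector<float> RunSteps(const std::vector<float>& inputs) {
  auto global = std::make_shared<Scope>();
  global->CreateVariable("w")->value = 2.0f;  // shared across timesteps

  StepNet step;  // a single net reused for every timestep
  std::vector<float> outputs;
  for (float x : inputs) {
    Scope local(global);  // fresh local scope per timestep
    local.CreateVariable("x")->value = x;
    step.Run(&local);
    outputs.push_back(local.GetVariable("out")->value);
  }  // each timestep's local variables are destroyed here
  return outputs;
}
```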

# Interface Design

```cpp
class Variable {
 private:
  Variable() = default;
  friend class Scope;
};

class Scope {
 private:
  Scope(const std::shared_ptr<Scope>& parent = nullptr);

 public:
  static std::shared_ptr<Scope> Create(const std::shared_ptr<Scope>& parent = nullptr);

  // Returns nullptr if not found.
  Variable* GetVariable(const std::string& name) const;

  // Returns an Error if a variable with the same name already exists.
  Error CreateVariable(const std::string& name);

 private:
  std::shared_ptr<Scope> parent_;
  std::unordered_map<std::string, std::unique_ptr<Variable>> vars_;
};
```
## Only scope can create a variable

To ensure that only a scope can create a variable, we mark `Variable`'s constructor as private and make `Scope` a friend class of `Variable`. Then only `CreateVariable` can construct a `Variable`.

Contributor

If I have a variable_test.cc, must this test file include scope.h?

Collaborator

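I am wondering whether we should enforce in the code itself that users cannot create a Variable anywhere else, or whether this is just a "convention".

It seems more sound to enforce it in the code, so that creating a Variable anywhere else fails with a compile error.

If we do that, variable_test.cc will have to include scope.h in order to write unit tests.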

Collaborator

I second @hedaoyuan . I don't think we need the restriction that Variables can only be created by Scope.
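
For reference, the private-constructor approach proposed in this section could look roughly like the sketch below. It departs from the listing in one respect, which is an assumption of this sketch: the constructor is user-provided (`Variable() {}`) rather than `= default`, because before C++20 a class whose only constructor is defaulted can still be an aggregate, so `Variable v{};` would compile outside `Scope`. The `static_assert` shows that code outside `Scope` cannot default-construct a `Variable`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <unordered_map>

class Variable {
 private:
  Variable() {}  // user-provided on purpose, see the note above
  friend class Scope;
};

class Scope {
 public:
  // Simplification: returns the existing variable if the name is bound.
  Variable* CreateVariable(const std::string& name) {
    auto& slot = vars_[name];
    // `new` is used because helper functions such as std::make_unique are
    // not friends of Variable and cannot call its private constructor.
    if (!slot) slot.reset(new Variable());
    return slot.get();
  }

 private:
  std::unordered_map<std::string, std::unique_ptr<Variable>> vars_;
};

// Code outside Scope cannot default-construct a Variable.
static_assert(!std::is_default_constructible<Variable>::value,
              "Variable must only be constructible by Scope");

void Example() {
  Scope scope;
  Variable* v = scope.CreateVariable("w");
  assert(v != nullptr);
  assert(scope.CreateVariable("w") == v);
}
```

This also illustrates the cost raised in the review comments: any translation unit that needs a `Variable`, including a unit test for `Variable` itself, has to go through `Scope`.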


## When a scope is destroyed, all variables inside it should be destroyed together

The scope holds unique pointers to all of its variables. A user can call `GetVariable` on a scope, but should not keep the returned pointer as a member variable, because when the scope is destroyed, all variables inside it are destroyed with it.
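
The ownership rule can be observed with a small sketch that counts live `Variable` objects. The `live` counter is a hypothetical instrumentation added only for this demonstration and is not part of the design.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

struct Variable {
  static int live;  // number of Variable objects currently alive (demo only)
  Variable() { ++live; }
  ~Variable() { --live; }
};
int Variable::live = 0;

class Scope {
 public:
  Variable* CreateVariable(const std::string& name) {
    auto& slot = vars_[name];
    if (!slot) slot.reset(new Variable());
    return slot.get();
  }

 private:
  // unique_ptr ties each variable's lifetime to the scope's lifetime.
  std::unordered_map<std::string, std::unique_ptr<Variable>> vars_;
};

int LiveAfterScopeExit() {
  {
    Scope scope;
    scope.CreateVariable("a");
    scope.CreateVariable("b");
    assert(Variable::live == 2);
    // Any Variable* saved here would dangle after the closing brace.
  }
  return Variable::live;
}
```

After the inner block ends, `LiveAfterScopeExit()` reports that no variables remain, which is why a `Variable*` must not outlive the scope it came from.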

## Sharing a parent scope

A local scope contains a `parent_` pointer, so scopes form a linked list. `parent_` is a `shared_ptr` because a parent scope must not be destroyed while one of its local scopes is still in use.

Also, because the parent scope is held as a `shared_ptr`, a scope can only be obtained as a shared pointer through `Create()`. We cannot construct a scope as a plain variable (for example, on the stack), because such an object could not be passed to another scope as its `parent` pointer.
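
A minimal sketch of this lifetime guarantee, with the variable map omitted for brevity: the parent stays alive as long as any local scope refers to it, even after the caller drops its own handle. The `std::weak_ptr` is used only to observe the parent's lifetime and is not part of the design.

```cpp
#include <cassert>
#include <memory>

class Scope {
 public:
  static std::shared_ptr<Scope> Create(
      const std::shared_ptr<Scope>& parent = nullptr) {
    // std::make_shared cannot reach the private constructor, so use new.
    return std::shared_ptr<Scope>(new Scope(parent));
  }

  bool HasParent() const { return parent_ != nullptr; }

 private:
  explicit Scope(const std::shared_ptr<Scope>& parent) : parent_(parent) {}

  std::shared_ptr<Scope> parent_;
};

void Example() {
  auto parent = Scope::Create();
  std::weak_ptr<Scope> watch = parent;  // observes without owning

  auto local = Scope::Create(parent);
  assert(local->HasParent());

  parent.reset();            // the caller drops its handle
  assert(!watch.expired());  // the local scope still keeps the parent alive

  local.reset();
  assert(watch.expired());   // the parent is released with its last child
}
```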

## Orthogonal interface

`GetVariable` returns `nullptr` when `name` is not found, so it can also serve as a `Contains` method. `CreateVariable` returns an `Error` when there is a local name conflict. By combining `GetVariable` and `CreateVariable`, we can implement `CreateOrGetVariable` easily.
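
One way the combination could look is sketched below. The `Error` enum and its values are hypothetical, since the design doc does not define the `Error` type, and the constructor is public for brevity. `Contains` and `CreateOrGetVariable` are built only from the two primitives.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

enum class Error { kOk, kAlreadyExists };  // hypothetical error type

struct Variable {};

class Scope {
 public:
  explicit Scope(std::shared_ptr<Scope> parent = nullptr)
      : parent_(std::move(parent)) {}

  // Returns nullptr if not found locally or in any parent.
  Variable* GetVariable(const std::string& name) const {
    auto it = vars_.find(name);
    if (it != vars_.end()) return it->second.get();
    return parent_ ? parent_->GetVariable(name) : nullptr;
  }

  // Fails only on a local name conflict.
  Error CreateVariable(const std::string& name) {
    if (vars_.count(name) != 0) return Error::kAlreadyExists;
    vars_[name].reset(new Variable());
    return Error::kOk;
  }

  // Built only from the two primitives above.
  Variable* CreateOrGetVariable(const std::string& name) {
    Variable* var = GetVariable(name);
    if (var == nullptr) {
      CreateVariable(name);  // cannot conflict: the name was not found
      var = GetVariable(name);
    }
    return var;
  }

 private:
  std::shared_ptr<Scope> parent_;
  std::unordered_map<std::string, std::unique_ptr<Variable>> vars_;
};

// GetVariable doubles as a Contains check.
bool Contains(const Scope& scope, const std::string& name) {
  return scope.GetVariable(name) != nullptr;
}

void Example() {
  Scope scope;
  assert(!Contains(scope, "w"));
  Variable* w = scope.CreateOrGetVariable("w");
  assert(w != nullptr && Contains(scope, "w"));
  assert(scope.CreateOrGetVariable("w") == w);
  assert(scope.CreateVariable("w") == Error::kAlreadyExists);
}
```

Note that because `GetVariable` also searches parent scopes, this version of `CreateOrGetVariable` returns a parent's variable when one exists instead of creating a local one. Whether that is the desired behavior is a design choice the doc leaves open.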