# Design Of Refactor Topology #1665

Closed · 33 commits
- 0e92fbf: Design Draft for using map in protobuf. (reyoung, Mar 21, 2017)
- f5a14b4: Add whole design (reyoung, Mar 21, 2017)
- cf2d77c: Typo (reyoung, Mar 21, 2017)
- a09299a: Make self mutable (reyoung, Mar 21, 2017)
- 52d43cd: Add invoke graph (reyoung, Mar 21, 2017)
- b79af86: Add More details (reyoung, Mar 21, 2017)
- cab093d: Update Design. (reyoung, Mar 27, 2017)
- d30c033: Follow luotao's tips, add more description. (reyoung, Mar 27, 2017)
- f001bc9: Fix wrong code style. (reyoung, Mar 31, 2017)
- 857f752: Rearrange Documentation (reyoung, Apr 5, 2017)
- 1cfd1da: Merge branch 'develop' of github.com:baidu/Paddle into feature/design… (reyoung, Apr 5, 2017)
- b922b00: Add how to write a layer in pure cpp. (reyoung, Apr 5, 2017)
- b3a3b0e: Add skeleton of dynet (reyoung, Apr 5, 2017)
- 4a94baa: Add comparation of ways to define layer. (reyoung, Apr 5, 2017)
- d346d49: Merge remote-tracking branch 'reyoung/feature/design_of_layer_code_ge… (QiJune, Apr 5, 2017)
- 3e5d22a: add dynamic net doc (QiJune, Apr 5, 2017)
- e3d0fa6: Merge pull request #4 from QiJune/feature/dynamic_net_doc (reyoung, Apr 6, 2017)
- 7d440eb: Merge branch 'develop' of github.com:baidu/Paddle into feature/design… (reyoung, Apr 6, 2017)
- 4ac8719: Simplize dynamic net implementation (reyoung, Apr 6, 2017)
- 386133a: Add link to dynet (reyoung, Apr 6, 2017)
- ff63670: Add how to write a layer in pure cpp (reyoung, Apr 6, 2017)
- 12a430a: Change highlight to text (reyoung, Apr 6, 2017)
- 7ce9fd5: Merge branch 'develop' of github.com:baidu/Paddle into feature/design… (reyoung, May 6, 2017)
- 03184c1: Unify topology design in CPP (reyoung, May 6, 2017)
- 4acd579: Add topology user stories and goals (reyoung, May 10, 2017)
- a109c54: Add title (reyoung, May 10, 2017)
- e99e19c: Fix typo (reyoung, May 10, 2017)
- 6b8893e: Refine English (reyoung, May 10, 2017)
- 726ba05: Add implementation steps. (reyoung, May 10, 2017)
- d4ccdea: Typo (reyoung, May 10, 2017)
- bb562b6: Merge branch 'develop' of github.com:baidu/Paddle into feature/design… (reyoung, May 16, 2017)
- bb68fda: Update developer code example (reyoung, May 17, 2017)
- ccf5d7d: Add implementation details (reyoung, May 17, 2017)
doc/design/topology.md (223 additions, 0 deletions)
# Topology Overview
> **Review (Collaborator):** "Topology Overview" ==> "Design Doc: Add New Layers", to make the title of this document consistent with the title of this PR.

Topology is a concept in Paddle for representing neural networks. A neural network contains one topology, which describes how layers connect to each other, and many parameters. Other deep learning frameworks may call this concept a computation graph or a neural network configuration.
> **Review (Collaborator):** I think this paragraph can be shortened as: "In PaddlePaddle, we represent a neural network by its topology and parameters. The topology is a directed graph of layers."


The topology is not only an API-level concept but also determines how we organize the computation code for each `Layer` or `Function` in Paddle. Paddle must maintain a dictionary from layer type to layer implementation, e.g. from the string `mul` to the function `void tensor_multiply(Tensor& ins, Tensor& outs)`. How users create and modify a topology, and how Paddle maps a user's topology to implementations of `Layer` and `Function`, is a fundamental problem for refactoring Paddle.
> **Review (@wangkuiyi, May 10, 2017):** This paragraph has quite a few problems; it is hard to tell what it is trying to say.
>
> 1. "Topology is not only ..." should be followed by "but also ...", but it never is.
> 2. "API level concept" is vague; it is unclear what it refers to.
> 3. What is a Layer Type?
> 4. In the example that follows, `mul` is clearly not a layer, so the example not only fails to explain the preceding sentence but raises new questions.
> 5. What does "manipulate topology" refer to?
> 6. One closing backtick was written as `'`.
>
> This paragraph seems to be trying to say something important, but it does not explain why this design makes adding layers easier, and it even introduces a new concept, Function.

> **Reply (Author):**
>
> * Topology is not only an API that end users see; more importantly, the design of the topology also determines how Paddle's computation code (`Layer`) is organized.
> * Layer Type is a string representing the type of a layer; e.g. for an `fc_layer`, its type is `"fc"`.
> * Paddle always has to maintain a `map<string, LayerCreator>` mapping, used to create Layer objects from the user's configuration.
> * "manipulate topology" means creating and modifying a topology; I have changed it to "create and modify".

> **Review (Collaborator):** I think we should make Function, Layer, Projection the same thing in the new design.

> **Reply (Author):** Making Layer and Projection the same thing is simple, but Function does not contain a backward method.
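The `map<string, LayerCreator>` registry mentioned in the discussion above can be sketched as follows. This is a minimal illustration only; `Layer`, `FcLayer`, and the registry function names are assumptions for the sketch, not Paddle's real classes.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical layer base class; Paddle's real Layer is much richer.
struct Layer {
  virtual ~Layer() = default;
  virtual std::string type() const = 0;
};

struct FcLayer : Layer {
  std::string type() const override { return "fc"; }
};

using LayerCreator = std::function<std::unique_ptr<Layer>()>;

// The global type-string -> creator map, as a Meyers singleton.
std::map<std::string, LayerCreator>& layerRegistry() {
  static std::map<std::string, LayerCreator> registry;
  return registry;
}

void registerLayer(const std::string& type, LayerCreator creator) {
  layerRegistry()[type] = std::move(creator);
}

// Create a Layer object from a user-configured type string.
std::unique_ptr<Layer> createLayer(const std::string& type) {
  auto it = layerRegistry().find(type);
  if (it == layerRegistry().end()) return nullptr;
  return it->second();
}
```

A language binding would then only need the type string from the user's configuration, never the concrete C++ class.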


## User Stories and Examples

### Kernel Developers

Alan is a professional CPU and GPU developer. He can write kernel functions for a new `Layer` with the best performance. However, he is not familiar with Paddle's API language, Python. Alan should only need to write the kernel function and register it in Paddle; Paddle should then generate the user-side APIs for these kernel functions without Alan writing any code.
> **Review (Collaborator):** What does this example have to do with "making it easier to add layers"?

> **Reply (Author):** Developers writing a new Layer no longer need to write Python code; they only need to write the "how to compute" part.


```cpp
struct CosSimAttribute : public topology::Attribute {
  double scale;

  REGISTER_FUNC_ATTRIBUTE() {
    regAttr(&CosSimAttribute::scale, "scale", "the scale of cosine operator")
        .defaultValue(1.0)
        .largerThan(0.0);
  }
};

template <DeviceType devType>
void cos_kernel(std::vector<Tensor>& ins,
                std::vector<Tensor>& outs,
                const CosSimAttribute& attr) {
  // implementation here.
}

BEGIN_REGISTER_FUNCTION(cosFwd, cosineForward, CosSimAttribute)
addTensor<INPUT>(/*dim*/ 2);
addTensor<INPUT>(/*dim*/ 2);
addTensor<OUTPUT>(/*shape = */ {topology::meta::kTensorShape_BATCH_SIZE, 1},
                  /*arg_type*/ ASSIGN_TO);

setDescription(R"DOC(Cosine similarity forward function.
There are two inputs of this function. The first input is a [h*w] matrix; the
second input is a [h*w] matrix or a [1*w] matrix. The output will be a
[h*1] matrix.
)DOC");

setShapeInferer([](std::vector<topology::TensorPtr>& ins,
                   std::vector<topology::TensorPtr>& outs) {
  auto& shape0 = ins[0]->shape();
  auto& shape1 = ins[1]->shape();

  if (shape0 != shape1 && (shape0[1] != shape1[1] || shape1[0] != 1))
    return Error(
        "Input shapes should be the same, or the second height should be 1");
  if (ins[0]->sequenceType() != ins[1]->sequenceType())
    return Error("Input sequence types should be the same");
  outs[0]->setShape({ins[0]->shape()[0], 1});
  outs[0]->setSequenceType(ins[0]->sequenceType());
  outs[0]->setDataType(ins[0]->dataType());
  return Error();
});
END_REGISTER_FUNCTION(cosFwd);
```

### QA Developers

Bob is a QA developer of Paddle. He wants to test all Paddle-supported `Function`s and `Layer`s. However, each layer has different configuration attributes, e.g. `scale` in the `cosine` function, and each configuration attribute has its own value range and data type. Bob should be able to easily test all boundary conditions of a Layer or Function by using the new topology mechanism.

```
auto cos = function::Register("cos");

for each_attribute in cos.attributes:
    each_attribute = each_attribute.min_value

test(cos);

for each_attribute in cos.attributes:
    each_attribute = each_attribute.max_value

test(cos);
```

> **Review (@wangkuiyi, May 10, 2017):** Why use a registerer here? Can we not just call the cos function directly?

> **Reply (Author):** Not really. This is the same as layer registration: the register manages all the layers and functions Paddle supports, so that other languages can use these functions too.

> **Review (Contributor):** I also do not quite understand `cos` here. If it is a `Function` object, shouldn't `function::Register("cos")` be a create?
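Bob's boundary-value loop can be sketched in self-contained C++. This is an illustration under assumptions: `AttrMeta` and its `minValue`/`maxValue` fields stand in for whatever the registered `Constraints` would expose; none of these names are Paddle's actual API.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical metadata for one numeric attribute; minValue/maxValue are
// assumed to be derivable from the attribute's registered Constraints.
struct AttrMeta {
  std::string name;
  double minValue;
  double maxValue;
};

// A function under test: its attribute metadata plus the currently
// configured attribute values.
struct FunctionUnderTest {
  std::vector<AttrMeta> attributes;
  std::map<std::string, double> attrValues;
};

// Set every attribute to one boundary (min or max), as Bob's loop does;
// a real test would then invoke the kernel with these values.
void configureBoundary(FunctionUnderTest& fn, bool useMin) {
  for (const auto& attr : fn.attributes) {
    fn.attrValues[attr.name] = useMin ? attr.minValue : attr.maxValue;
  }
}
```

Because the value ranges live in the meta information, this driver works for every registered function without per-layer test code.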

### Language Binding Developers

Carol is a language-binding developer of Paddle. She wants to develop a language binding for Paddle. She is not familiar with the Paddle C++ core and does not want to go deep into Paddle internals. She just wants a clear list of the layers Paddle supports and the configuration parameters of each layer.

Also, as a language-binding developer, Carol does not want to write any topology-validation code in the language binding, because the Paddle C++ core could be in flux and a layer's API could change.

She can simply access the register information of `Topology` and use this information in another language. She can use either reflection or code generation in that language to generate end-user APIs.

```python
import paddle

for layer_name in paddle.topology.meta.all_registered_layers:
    # Bind layer_name as a default argument so each generated function
    # keeps its own layer name instead of the last loop value.
    def __func__(layer_name=layer_name, **kwargs):
        layer_meta = paddle.topology.meta.all_registered_layers[layer_name]
        return layer_meta.new_configuration(kwargs)

    globals()[layer_name] = __func__
```
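The generation pattern matters because of Python's closure late binding: without binding `layer_name` per iteration, every generated function would see the last loop value. A minimal self-contained sketch of the same pattern, with a plain dict standing in for `paddle.topology.meta`'s registry (an assumption for illustration):

```python
# Dummy registry standing in for paddle.topology.meta's registered layers.
registry = {
    "fc": lambda kwargs: ("fc", kwargs),
    "cos": lambda kwargs: ("cos", kwargs),
}

generated = {}
for layer_name in registry:
    # Default-argument binding freezes layer_name per iteration,
    # avoiding the closure late-binding pitfall.
    def __func__(layer_name=layer_name, **kwargs):
        return registry[layer_name](kwargs)

    generated[layer_name] = __func__
```

Each generated function now configures its own layer, e.g. `generated["fc"](size=128)`.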

### API End-Users

David is a new user of Paddle who is not familiar with Paddle or deep learning. He writes a Python program that configures a neural network. When he runs this program, he expects a clear error message when his configuration is wrong. The error message should be something like `cosine layer's scale parameter should be larger than 0.0.`, not just a `check error` from our computation kernel. Because we register all parameters' meta information, it is easy to achieve this goal.


## Goals

After considering these user stories, we conclude what we want from the Topology design.

* Users should directly operate on the C++ topology configuration, because this maintains consistency between language bindings and keeps the language-binding layer thin and easy to develop.
* Our topology configuration should validate user input and give a reasonable error message. We should also maintain meta information for each configuration attribute; e.g. the `scale` attribute of the `cos` layer is a `double` value that should be larger than 0.0, with a default value of 1.0.
* We should serialize our topology into a portable format, so users can use a model they trained before for inference.
* We should let kernel developers easily register their kernel functions to Paddle without making them write configuration APIs in Python.

## Implementation

### Meta Information
To achieve the goals above, we need a place to store the meta information of each layer. The meta information describes how a layer can be configured, which attributes of the layer can be set, and what the input types can be.

For example, the cosine layer should have two inputs of the same shape, and both inputs should be dense matrices. The cosine layer should have only one output, whose shape should be [batch_size, 1] because, for each pair of input samples, the cosine similarity is a scalar. The cosine layer has one configurable argument, `scale`, which is the scalar multiplied into the cosine similarity. `scale` should be a `double` value, larger than 0.0, with a default value of 1.0.

All this meta information is defined in the namespace `paddle::topology::meta`. There are several basic classes in this namespace.

* Constraints: a list of functions that store the constraints of one attribute. It is used to validate that user input is correct.
```cpp
template <typename T> // T is attribute type
class Constraints {
private:
// (T* attr, bool alreadySet) -> Error
// attr is an inout parameter for attribute.
// alreadySet means if this attribute is set by user or previous callbacks.
// return Error if the attribute is not valid.
std::vector<std::function<Error(T*, bool)>> callbacks_;
public:

// Each constraint function will add a check function to callbacks_;
Constraints<T>& mustSet();
Constraints<T>& defaultValue(const T& val);
Constraints<T>& largerThan(const T& val);

// More constraint function below.
};
```
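As a sanity check of the callback-list design, here is a minimal self-contained implementation sketch. It is an illustration, not the real Paddle type: `Error` is reduced to a plain `std::string` (empty meaning success), and the attribute name is passed in so the check can produce the friendly message the end-user story asks for.

```cpp
#include <functional>
#include <string>
#include <vector>

using Error = std::string;  // empty string == no error (simplifying assumption)

template <typename T>
class Constraints {
public:
  explicit Constraints(std::string attrName) : attrName_(std::move(attrName)) {}

  // Fill in a default when the user did not set the attribute.
  Constraints& defaultValue(const T& val) {
    callbacks_.push_back([val](T* attr, bool alreadySet) -> Error {
      if (!alreadySet) *attr = val;
      return "";
    });
    return *this;
  }

  // Reject values that are not strictly larger than val, with a
  // human-readable message naming the attribute.
  Constraints& largerThan(const T& val) {
    std::string name = attrName_;
    callbacks_.push_back([val, name](T* attr, bool) -> Error {
      if (!(*attr > val))
        return name + " should be larger than " + std::to_string(val);
      return "";
    });
    return *this;
  }

  // Run every callback in registration order; stop at the first error.
  Error validate(T* attr, bool alreadySet) const {
    for (const auto& cb : callbacks_) {
      Error err = cb(attr, alreadySet);
      if (!err.empty()) return err;
    }
    return "";
  }

private:
  std::string attrName_;
  std::vector<std::function<Error(T*, bool)>> callbacks_;
};
```

Running the callbacks in registration order means `defaultValue` fills in a value before `largerThan` checks it, so an unset attribute validates against its default.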

* AttributeMeta: represents the meta information of an attribute, e.g. `scale`. It contains the attribute's name, description, type information, and `Constraints`.
```cpp
class AttributeMeta {
public:
  std::string name;            // e.g. "scale"
  std::string description;     // e.g. "the scale of cosine operator"
  const std::type_info& type;  // e.g. typeid(double)

  // The constraints of this attribute. The implementation uses `std::any`
  // to get rid of the template argument, but it actually holds a
  // Constraints<T>, where T's type is stored in the `type` field.
  std::any constraints;
};
typedef std::shared_ptr<AttributeMeta> AttributeMetaPtr;
```
* AttributeMetaMap: contains many AttributeMeta entries. Each LayerMeta, FunctionMeta, and TensorMeta is an AttributeMetaMap. Users can add attributes to an AttributeMetaMap.
```cpp
class AttributeMetaMap : public std::unordered_map<std::string, AttributeMetaPtr> {
public:
  /// Add an attribute to the map; returns a Constraints object to which
  /// the user can add constraints.
  /// @code
  /// attr_map.addAttribute("scale", "the scale of cosine operator")
  ///     .defaultValue(1.0).largerThan(0.0);
  /// @endcode
  template <typename T>
  Constraints<T>& addAttribute(const std::string& name, const std::string& description);
};
```

* AttributeMap: the data structure that stores attribute values. The AttributeMap is used not only for user-defined topology information but also for meta information, which lets meta information store attributes of any type and stay decoupled from its upper-level invoker.
```cpp
typedef std::unordered_map<std::string, std::any> AttributeMap;
```
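To make the `std::any` choice concrete, here is a minimal illustration of heterogeneous values living in one AttributeMap and being recovered with `std::any_cast` (C++17; the `getScale` helper is hypothetical):

```cpp
#include <any>
#include <string>
#include <unordered_map>
#include <vector>

// The AttributeMap from the design: attribute values of any type, by name.
typedef std::unordered_map<std::string, std::any> AttributeMap;

// A hypothetical typed accessor: the caller must know which T to cast back
// to, e.g. from the matching AttributeMeta's `type` field.
double getScale(const AttributeMap& attrs) {
  return std::any_cast<double>(attrs.at("scale"));
}
```

`std::any_cast` throws `std::bad_any_cast` on a type mismatch, which is exactly where the meta information's recorded `type` earns its keep.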

* TensorMeta: a Tensor is an input/output of a Layer or Function. TensorMeta is an `AttributeMetaMap`; the data type and sequence type are just attributes of the tensor.
```cpp
enum DataType { DENSE, SPARSE, ... };
enum SequenceType { NO_SEQUENCE, SEQUENCE, ... };
class TensorMeta : public AttributeMetaMap {
public:
  TensorMeta& setValidDataType(std::set<DataType> dataType);
  TensorMeta& setValidSeqType(std::set<SequenceType> seqType);
  TensorMeta& setShapeDim(size_t dim);

private:
  // Nothing! TensorMeta is just an AttributeMetaMap plus some helper functions.
};
```

* FunctionMeta: represents the meta information of a paddle::Function. It contains two vectors of TensorMeta: inputs and outputs. FunctionMeta is an AttributeMetaMap, so kernel developers can add the attributes used by their kernels.
```cpp
class FunctionMeta : public AttributeMetaMap {
public:
  std::vector<TensorMetaPtr> inputs;
  std::vector<TensorMetaPtr> outputs;

  /// `cpuKernel`, `gpuKernel`, `shapeInferer`, and `estimateFlops` are just
  /// stored in metaAttrs. The invoker of the function meta decides what the
  /// type and name of each entry should be, and each field can have many
  /// types. For example, shapeInferer is a function from
  /// `std::vector<Tensor>& in` to `std::vector<Tensor>& out`, but different
  /// functions can use different Attribute classes. The cosine layer's
  /// shapeInferer could be
  /// (std::vector<Tensor>& in, std::vector<Tensor>& out, const CosSimAttribute& attr) {...}
  AttributeMap metaAttrs;
};
```
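A sketch of how a typed kernel might be stored in and fetched back out of `metaAttrs` (C++17). The names `registerCpuKernel`/`invokeCpuKernel` and the toy kernel signature are illustrative assumptions, not Paddle's API; the point is that the map stores `std::any` and the invoker chooses the concrete type to cast back to.

```cpp
#include <any>
#include <functional>
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::any> AttributeMap;

struct FunctionMeta {
  AttributeMap metaAttrs;
};

// The kernel signature this particular invoker expects; a different invoker
// could store a different signature under the same key.
typedef std::function<int(const std::vector<int>&)> CpuKernel;

void registerCpuKernel(FunctionMeta& meta, CpuKernel kernel) {
  meta.metaAttrs["cpuKernel"] = std::move(kernel);
}

// The invoker decides the expected type and casts the std::any back.
int invokeCpuKernel(const FunctionMeta& meta, const std::vector<int>& ins) {
  const CpuKernel& kernel =
      std::any_cast<const CpuKernel&>(meta.metaAttrs.at("cpuKernel"));
  return kernel(ins);
}
```

This keeps FunctionMeta itself free of kernel-signature templates: only the code that registers and the code that invokes a kernel need to agree on the type.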

* LayerMeta: a concept similar to FunctionMeta, but used to represent a `Layer`.
* TopologyMeta: contains a vector of `AttributeMeta` representing the attributes that can be set globally in a topology.


### Topology information

The topology information is the actual description of a neural network; it corresponds one-to-one to the meta information. We use `std::any` (a.k.a. `boost::any`) to represent each attribute value, because an attribute can be of any type (double/int/vector<int>, etc.).

So `topology::Tensor` contains an attribute map, i.e. a `map<string, any>`. The `Function` contains an attribute map, input tensors, and output tensors. The remaining types of topology information correspond to their meta information.

## Step by step approach

After building the `Topology` concept in C++, Paddle's Python code can be cleaned up. However, the development process should be broken down into steps and carefully completed, to keep the Paddle code stable and avoid introducing bugs.

The step-by-step approach is:

1. Add `Constraints`, `AttributeMeta`, `TensorMeta`, and `FunctionMeta` to refactor the `paddle::Function` package. Make `paddle::Function` just a plain function registered to a `FunctionMeta`. Use a small-scope experiment to make sure we can use `topology::meta` and `topology` to represent a piece of a neural network.

2. Complete `LayerMeta`, `TopologyMeta`, etc., and write a conversion method from `protobuf::LayerConfig`/`protobuf::ModelConfig` to `topology::Layer`/`topology::Topology`, so that `paddle_trainer` can use and test the `topology` package. A side effect of this work is that `paddle_trainer` can validate the user's `trainer_config.conf` file and give a reasonable error message when the user provides a wrong configuration.

3. Clean up the implementation of the `paddle.v2` topology. Let the `v2` package invoke the `topology` package directly through the C-API instead of going through `trainer_config_helper`.