Design doc for operator attribute #2606
Changes from 5 commits
cdd28f7
87e3820
581ce7d
306dcfe
0d9b9d3
e3a63d7
ba54a0c
18dd0ad
908c8c1
b901e3b
7250a92
3090785
2bde865
b90a3a6
d21d486
911113d
1f35526
224c6a4
1895f06
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,125 @@ | ||
# Design Doc about operator attribute | ||
|
||
## Background | ||
Review comment: Background
Reply: Done. |
||
|
||
In a neural network, each operator can have configurable attributes. For example, a cosine similarity operator may have an attribute named `scale`. By default, cosine similarity returns a value in the range [-1.0, 1.0], but the user can scale this range manually: with `scale=5.0`, the cosine operator returns a value in the range [-5.0, 5.0]. | ||
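To make the effect of `scale` concrete, here is a minimal sketch of a scaled cosine similarity. The function name `ScaledCosine` and its signature are hypothetical and only illustrate the arithmetic; they are not part of the proposed design.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: cosine similarity of two equally sized vectors,
// multiplied by a configurable `scale` attribute.
float ScaledCosine(const std::vector<float>& a, const std::vector<float>& b,
                   float scale = 1.0f) {
  float dot = 0.0f, norm_a = 0.0f, norm_b = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    dot += a[i] * b[i];
    norm_a += a[i] * a[i];
    norm_b += b[i] * b[i];
  }
  // Plain cosine lies in [-1, 1]; multiplying by scale maps it to
  // [-scale, scale]. Assumes neither vector is all zeros.
  return scale * dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}
```

With `scale = 5.0f`, two identical vectors give 5.0 and two opposite vectors give -5.0.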
Review comment: ==>
Reply: Done. |
||
|
||
The configurable attributes can be of various types. Some operators need a `float` value; others need a `string` value. We need a data structure that can represent these different types. | ||
Review comment: be of various types
Reply: Done.
Review comment: ==>
Reply: Done. |
||
|
||
Each operator has its own set of configurable attributes, and the attribute names differ between operators. `Operator` therefore needs an associative map from attribute name to attribute value. | ||
Review comment: ==>
Reply: In protobuf, the type of
class CosineOp {
 public:
  void Init(const AttributeReader& reader) {
    scale_ = reader.Get<float>("scale");
  }
 private:
  float scale_;
};
If we extract |
||
|
||
Also, because we use `protobuf` to serialize and deserialize our model, we need to implement both the attribute value and the map from attribute name to attribute value in `protobuf`. | ||
|
||
In summary, the background gives us four requirements. | ||
|
||
1. We need an attribute type for Operator. | ||
1. That attribute type must be able to represent values of different types. | ||
1. Each attribute value should be associated with an attribute name, as in a `map<string, Attribute>`. | ||
1. We need to implement them in `protobuf`. | ||
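The first three requirements can be sketched in plain C++ before involving `protobuf`. This is only an illustration, assuming C++17: `AttributeValue`, `AttributeTable`, and `FindAttribute` are hypothetical names, and `std::variant` stands in for the protobuf message designed below.

```cpp
#include <string>
#include <unordered_map>
#include <variant>
#include <vector>

// Hypothetical in-memory stand-in for an attribute value: exactly one of
// the listed alternatives is held at any time.
using AttributeValue =
    std::variant<int, float, std::string, std::vector<int>,
                 std::vector<float>, std::vector<std::string>>;

// Map from attribute name to attribute value.
using AttributeTable = std::unordered_map<std::string, AttributeValue>;

// Returns a pointer to the stored value if `name` exists and holds a T,
// otherwise nullptr.
template <typename T>
const T* FindAttribute(const AttributeTable& attrs, const std::string& name) {
  auto it = attrs.find(name);
  if (it == attrs.end()) return nullptr;
  return std::get_if<T>(&it->second);
}
```

A lookup with the wrong type, such as `FindAttribute<int>` on a `float` attribute, yields `nullptr` rather than a silently converted value.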
|
||
## Protobuf Implementation | ||
|
||
Two frameworks implement the `Attribute` concept in `protobuf`: [`caffe2`](https://github.com/caffe2/caffe2/blob/master/caffe2/proto/caffe2.proto#L98) and [`tensorflow`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/attr_value.proto#L16). | ||
|
||
* Caffe2 uses `proto2` syntax. It stores all attributes in a list, and each attribute carries a `name`. Reading an attribute therefore means searching the list, which is slow when there are many attributes. Caffe2 also marks every field as `optional`, so it cannot ensure that exactly one attribute value is set. | ||
Review comment: Surprised to know that Caffe2 searches attributes in a protobuf list. The right way is to load from protobuf field
Reply: It's my mistake; Caffe2 loads all attributes into memory first. |
||
* TensorFlow uses `proto3` syntax, and its attribute implementation uses the `map` and `oneof` keywords. Looking up an attribute in the map is fast. | ||
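The difference between the two storage layouts can be sketched as follows. Both functions are hypothetical and use standard containers instead of generated protobuf code; the sketch only contrasts a name search over a list with a keyed lookup and says nothing about how either framework actually reads attributes at run time.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// List layout: each lookup scans the entries until the name matches, so
// the cost grows linearly with the number of attributes.
const float* FindInList(
    const std::vector<std::pair<std::string, float>>& attrs,
    const std::string& name) {
  for (const auto& entry : attrs) {
    if (entry.first == name) return &entry.second;
  }
  return nullptr;
}

// Map layout: the container is keyed by name, so a lookup in std::map is
// logarithmic in the number of attributes.
const float* FindInMap(const std::map<std::string, float>& attrs,
                       const std::string& name) {
  auto it = attrs.find(name);
  return it == attrs.end() ? nullptr : &it->second;
}
```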
|
||
Paddle depends on `protobuf 3`. By simplifying `tensorflow`'s implementation, Paddle's `Attribute` protobuf message schema could be: | ||
|
||
```protobuf | ||
message Attribute { | ||
  message ListValue { | ||
    repeated int32 ints = 1; | ||
    repeated float floats = 2; | ||
    repeated string strings = 3; | ||
  } | ||
|
||
  oneof value { | ||
    ListValue list = 1; | ||
    int32 i = 2; | ||
    float f = 3; | ||
    string s = 4; | ||
  } | ||
} | ||
``` | ||
Review comment (on `ListValue list = 1;`): The oneof directive doesn't seem to help much here -- it can locate a field in
syntax = "proto3";
message Attribute {
  enum Type {
    INTS = 0;
    FLOATS = 1;
    STRINGS = 2;
    INT = 3;
    FLOAT = 4;
    STRING = 5;
  }
  Type type = 1;
  repeated int32 ints = 2;
  repeated float floats = 3;
  repeated string strings = 4;
  int32 int = 5;
  float float = 6;
  string string = 7;
}
Review comment:
syntax = "proto3";
message Attribute {
  repeated int32 ints = 1;
  repeated float floats = 2;
  repeated string strings = 3;
  optional int32 int = 4;
  optional float float = 5;
  optional string string = 6;
}
Maybe the |
|
||
The `OperatorDescription` message should contain a field like this: | ||
Review comment: OperatorDescription => OperatorDesc ? |
||
|
||
```protobuf | ||
message OperatorDescription { | ||
  map<string, Attribute> attrs = 1;  // field number 1 is illustrative | ||
} | ||
``` | ||
|
||
## C++ Implementation | ||
|
||
### AttributeReader | ||
|
||
In C++, we need a helper class for reading `map<string, Attribute>`. Its read method takes a template parameter, which is the type of the attribute. If the type does not match or the attribute is not found, `Get` returns an `Error`. We name this helper class `AttributeReader`. | ||
|
||
The interface of `AttributeReader` is like this: | ||
|
||
```cpp | ||
using AttributeMap = google::protobuf::Map<std::string, Attribute>; | ||
class AttributeReader { | ||
 public: | ||
  explicit AttributeReader(const AttributeMap& attrs) : attrs_(attrs) {} | ||
|
||
  template <typename T> | ||
  Error __must_check Get(const std::string& attributeName, T* attr) const; | ||
|
||
  template <typename T> | ||
  Error __must_check GetArray(const std::string& attributeName, | ||
                              std::vector<T>* array) const; | ||
|
||
 private: | ||
  const AttributeMap& attrs_; | ||
}; | ||
``` | ||
Review comment (on `class AttributeReader`): AttributeReader => Attributes. This is not a reader; it holds and searches attributes as well.
Review comment (on `Get`): what will
Reply: The attr is always pointing to an outside variable. The pointer is not modified in
Review comment: oh, so the value
Reply: The pointer
Review comment (on `GetArray`): What if an attribute was marked int in the protobuf message, but the user tries to retrieve its string value? How can we check and find such kind of errors?
Reply: Because we use |
|
||
`AttributeReader` has two methods: `Get` and `GetArray`. `GetArray` is used for `ListValue`, and `Get` is used for all other types. Users invoke one of them to read an attribute value from the `AttributeMap`. | ||
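One possible shape for these two methods is sketched below. It is not the proposed implementation: `AttributeReaderSketch`, `Attr`, `AttrMap`, the minimal `Error` struct, and the error strings are all assumptions, and a C++17 `std::variant` stands in for the protobuf `oneof` so that the example compiles without generated code.

```cpp
#include <map>
#include <string>
#include <variant>
#include <vector>

// Hypothetical minimal error type; an empty message means success.
struct Error {
  std::string msg;
  bool isOK() const { return msg.empty(); }
};

// Stand-in for the protobuf Attribute message.
using Attr = std::variant<int, float, std::string, std::vector<int>,
                          std::vector<float>, std::vector<std::string>>;
using AttrMap = std::map<std::string, Attr>;

class AttributeReaderSketch {
 public:
  explicit AttributeReaderSketch(const AttrMap& attrs) : attrs_(attrs) {}

  // Copies the attribute into *attr. On error, *attr is left untouched.
  template <typename T>
  Error Get(const std::string& name, T* attr) const {
    auto it = attrs_.find(name);
    if (it == attrs_.end()) return Error{"Attribute Not Found"};
    const T* value = std::get_if<T>(&it->second);
    if (value == nullptr) return Error{"Attribute Type Mismatch"};
    *attr = *value;
    return Error{};
  }

  // Reads a list attribute; T is the element type, which selects the
  // matching list alternative.
  template <typename T>
  Error GetArray(const std::string& name, std::vector<T>* array) const {
    return Get<std::vector<T>>(name, array);
  }

 private:
  const AttrMap& attrs_;
};
```

In this sketch a type mismatch is detected because only one alternative is stored per attribute, and the single template parameter of `GetArray` is enough to pick the list of the requested element type.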
|
||
Review comment: The ListValue message is defined as:
And the GetArray method only has one template parameter T. So, how does GetArray get ListValue with various types?
Reply: Only one of
The tensorflow has a similar implementation |
||
### Attribute in Operator | ||
|
||
Each operator stores its own attributes. For fast attribute access, the `AttributeMap` should not be parsed inside the operator's `Run` method. Instead, when `NetworkBase` adds an operator to the computation graph, the attributes can be parsed once and stored in the operator's private members. | ||
|
||
```cpp | ||
class OperatorBase { | ||
 public: | ||
  // Operators are deleted through a base pointer, so the destructor is virtual. | ||
  virtual ~OperatorBase() = default; | ||
  virtual Error InitializeAttribute(const AttributeReader& attrs) = 0; | ||
}; | ||
|
||
class CosineOp : public OperatorBase { | ||
 public: | ||
  Error InitializeAttribute(const AttributeReader& attrs) override { | ||
    auto err = attrs.Get<float>("scale", &scale_); | ||
|
||
    // Ignore "Attribute Not Found": scale_ then keeps its default of 1.0. | ||
    if (!err.isOK() && err != "Attribute Not Found") { | ||
      return err; | ||
    } | ||
    if (scale_ <= 0.0f) { | ||
      return Error("Scale of cosine op should be larger than 0.0"); | ||
    } | ||
    return Error();  // OK | ||
  } | ||
|
||
 private: | ||
  float scale_{1.0}; | ||
}; | ||
``` | ||
Review comment (on the `Attribute Not Found` check): What will happen if
Reply: If the error is
The default value is set in private data member, like
Review comment: I think
Reply: Done. |
|
||
When `NetworkBase` invokes `CreateOperator(const OperatorDescription& desc)`, it first creates the operator. `CreateOperator` then invokes `InitializeAttribute` and returns its error code. The implementation of `CreateOperator` could be: | ||
|
||
Review comment: Also, I am curious about how the Python API can fill in the protobuf field |
||
```cpp | ||
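Error CreateOperator(const OperatorDescription& desc, OperatorBase** ptr) { | ||
  *ptr = OperatorRegister.create(desc.type(), desc.inputs(), desc.outputs()); | ||
  // AttributeReader's constructor is explicit, so wrap the map explicitly. | ||
  Error err = (*ptr)->InitializeAttribute(AttributeReader(desc.attrs())); | ||
  if (!err.isOK()) { | ||
    delete *ptr; | ||
    *ptr = nullptr;  // do not hand a dangling pointer back to the caller | ||
  } | ||
  return err; | ||
} | ||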
``` | ||
|
||
`InitializeAttribute` validates the user's configuration and may return an `Error`. Invoking `InitializeAttribute` and returning an `Error` is clearer than implementing this logic in each operator's constructor, because a constructor cannot return a value. |
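The two-phase pattern described here (construct first, then validate and report an `Error`) can be sketched in a self-contained form. All names below (`OperatorSketch`, `CosineOpSketch`, `CreateCosineOp`, the minimal `Error` struct) are hypothetical, the attribute is passed as a plain `float` instead of an `AttributeReader`, and `std::unique_ptr` replaces the raw pointer as one possible way to handle cleanup.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical minimal error type; an empty message means success.
struct Error {
  std::string msg;
  bool isOK() const { return msg.empty(); }
};

class OperatorSketch {
 public:
  virtual ~OperatorSketch() = default;
  virtual Error InitializeAttribute(float scale) = 0;
};

class CosineOpSketch : public OperatorSketch {
 public:
  // Validation lives here rather than in the constructor, so a failure
  // can be reported as a return value.
  Error InitializeAttribute(float scale) override {
    if (scale <= 0.0f) {
      return Error{"Scale of cosine op should be larger than 0.0"};
    }
    scale_ = scale;
    return Error{};
  }

 private:
  float scale_{1.0f};
};

// Construct, then validate. On failure the operator is destroyed
// automatically and *out is left unchanged.
Error CreateCosineOp(float scale, std::unique_ptr<OperatorSketch>* out) {
  auto op = std::make_unique<CosineOpSketch>();
  Error err = op->InitializeAttribute(scale);
  if (err.isOK()) *out = std::move(op);
  return err;
}
```

The caller checks the returned `Error` and only uses the operator when creation succeeded.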
Review comment: Design Doc: Operator Attributes
Reply: Done.