Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design doc for operator attribute #2606

Closed
Closed
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
125 changes: 125 additions & 0 deletions doc/design/attribute.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,125 @@
# Design Doc about operator attribute
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Design Doc: Operator Attributes

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


## background
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Background

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


In a neural network, each operator could contain some configurable attributes. For example, a cosine similarity operator may contain an attribute named `scale`. The default cosine similarity returns a value in range [-1.0, 1.0]. But the user can set range scale manually, e.g., user set `scale=5.0`, then that cosine operator will return a value in the range [-5.0, 5.0].
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a neural network, each operator could contain some configurable attributes.

==>

Operators have attributes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


The configurable attributes could be various types. Some operators need `float` value to configure; some need `string` value. We need a data structure to represent different types.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

be of various types

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The configurable attributes could be various types. Some operators need float value to configure; some need string value. We need a data structure to represent different types.

==>

Each attributes has a name and a type. We are going to describe them in a protobuf message.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


Each operator contains different configurable attributes. The names of attributes are not same. We need an associate map from attribute name to attribute value for `Operator`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each operator contains different configurable attributes. The names of attributes are not same. We need an associate map from attribute name to attribute value for Operator.

==>

We plan to list attributes of an operator in an associate map: map<string, Attribute*>.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In protobuf, the type of Attributes is a map<string, Attribute> attrs; But in each op class, it stores simple strong typed values as member data.

class CosineOp {
public:
  void Init(const AttributeReader& reader) {
    scale_ = reader.Get<float>("scale")
  }

private:
  float scale_;
};

If we extract scale from protobuf map every time when we use it, it is very slow.


Also as we want to use `protobuf` to serialize and deserialize our model, we need to implement the attribute value and the associated map from attribute name to attribute value in `protobuf`.

In conclusion, there are four things we know as background.

1. We need an attribute type for Operator.
1. That attribute type could represent different types.
1. That attribute value should be associated with an attribute name, like a map<string, Attribute>.
1. We need to implement them in `protobuf`.

## Protobuf Implementation

There are two frameworks implement `Attribute` concept in `protobuf`. They are [`caffe2`](https://github.com/caffe2/caffe2/blob/master/caffe2/proto/caffe2.proto#L98) and [`tensorflow`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/attr_value.proto#L16).

* Caffe2 uses `proto2` syntax. It treats all attributes as a list, and each attribute contains a `name`. Each time caffe2 read an attribute is searching a variable in a list. It is slow if the number of attributes is large. Caffe2 also mark all field as `optional`. It doesn't ensure `one of` attribute value is set.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surprise to know that Caffe2 searches attributes in a protobuf list. The right way is to load from protobuf field repeated Attrbribute attrs = x; into C++ data structure map<string, Attribute> and search in the C++ data structure.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's my mistake, Caffe2 is loading all attribute into memory first.

* By using `proto3` syntax in tensorflow, the attribute implementation in tensorflow is using `map`, and `oneof` keywords. Looking up from attribute map in tensorflow is fast.

Paddle is using `protobuf 3` as its dependency library. By simplify `tensorflow`'s implementation, Paddle's Attribute protobuf message schema could be

```protobuf
message Attribute {
message ListValue {
repeated int32 ints = 1;
repeated float floats = 2;
repeated string strings = 3;
}

oneof value {
ListValue list = 1;
Copy link
Collaborator

@wangkuiyi wangkuiyi Jun 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The oneof directive doesn't seem help much here -- it can locate a field in value but it cannot help us locate a field in ListValue. If our C++ code would still have to check which field is the one in ListValue, it would be easier just use the old plain syntax:

syntax = "proto3";

message Attribute {
  enum Type {
    INTS = 0;
    FLOATS = 1;
    STRINGS = 2;
    INT = 3;
    FLOAT = 4;
    STRING = 5;
  }
  Type type = 1;

  repeated int32 ints = 2;
  repeated float floats = 3;
  repeated string strings = 4;
  int32 int = 5;
  float float = 6;
  string string = 7;
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

syntax = "proto3";

message Attribute {
  repeated int32 ints = 1;
  repeated float floats = 2;
  repeated string strings = 3;
  optional int32 int = 4;
  optional float float = 5;
  optinoal string string = 6;
}

Maybe the Type is not useful, because we could use optional field and has_xxx() to detect AttributeType.

int32 i = 2;
float f = 3;
string s = 4;
}
}
```

In `OperatorDescription` message, there should be a field like this:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OperatorDescription => OperatorDesc ?


```protobuf
message OperatorDescription {
map<string, Attribute> attrs;
}
```

## CPP implementation

### AttributeReader

In CPP, it should be a helper class for reading `map<string, Attribute>`. The reading method should accept a template parameter, which is the type of Attribute. If type mismatch or attribute is not found, `Get` method should return an `Error`. That helper class we named `AttributeReader`.

The interface of `AttributeReader` is like this:

```cpp
using AttributeMap = google::protobuf::Map<std::string, Attribute>;
class AttributeReader {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AttributeReader => Attributes

This is not a reader, it holds and searches attributes as well.

public:
explicit AttributeReader(const AttributeMap& attrs) : attrs_(attrs) {}

template <typename T>
Error __must_check Get(const std::string& attributeName, T* attr) const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what will attr point to if Error happens?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attr is always pointing to an outside variable. The pointer is not modified in Get method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, so the value T* attr point to just not change, is that right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pointer attr is will never be changed inside Get method. The value *attr will not be changed if an error occurred. The value *attr will be set to correct value when there is no error.


template <typename T>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if an attribute was marked int in the protobuf message, but the user tries to retrieve its string value? How can we check and find such kind of errors?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we use optional, we can detect which type is set in protobuf, and assert fail when type mismatch.

Error __must_check GetArray(const std::string& attributeName,
std::vector<T>* array) const;

private:
const AttributeMap& attrs_;
};
```

There are two methods in `AttributeReader`: `Get` and `GetArray`. `GetArray` is used for `ListValue`, and `Get` is used for the rests. The user should invoke either of them when he wants to get an Attribute value from `AttributeMap`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ListValue message is defined as:

message ListValue {
     repeated int32 ints = 1;
     repeated float floats = 2;
     repeated string strings = 3;
 }

And GetArray method only has one template parameter T. So, how the GetArray get ListValue with various type.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one of ListValue field should be set. The protobuf 3 currently cannot make repeated field inside a oneof. So just make all ListValue a single message.

The tensorflow has a similar implementation

### Attribute in Operator

Each operator stores its attributes. For faster attribute access, we should not let user parse `AttributeMap` during `Run` method in Operator. When `NetworkBase` adds an operator to computation graph, the `Attribute` could be parsed, and stored in each operator's the private member.

```cpp
class OperatorBase {
public:
virtual Error InitializeAttribute(const AttributeReader& attrs) = 0;
};

class CosineOp : public OperatorBase {
public:
Error InitializeAttribute(const AttributeReader& attrs) {
auto err = attrs.Get<float>("scale", &scale_);

// ignore AttributeNotFound because scale_ is default = 1.0
if (!err.isOK() && err != "Attribute Not Found") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will happen if err == "Attribute Not Found" ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the error is Attribute Not Found, then the default value is used.

The default value is set in private data member, like float scale_{1.0};.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think If the error is Attribute Not Found, then the default value is used. should be added to comment.
Another thing is that remind the user to set a default value for their attributes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return err;
}
if (scale_ <= 0.0f) {
return Error("Scale of cosine op should be larger than 0.0");
}
return Error(); // OK;
}

private:
float scale_ {1.0};
};
```

When `NetworkBase` invokes `CreateOperator(const OperatorDescription& desc)`, it create an operator first. Then `CreateOperator` will invoke `InitializeAttribute` and returns error code. The implementation of `CreateOperator` could be

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I am curious about how the Python API can fill in the protobuf field OperatorDesc::attrs, so that the C++ code could load.

```cpp
Error CreateOperator(const OperatorDescription& desc, OperatorBase** ptr) {
*ptr = OperatorRegister.create(desc.type(), desc.inputs(), desc.outputs());
Error err = (*ptr) -> InitializeAttribute(desc.attrs());
if (!err.isOK()) {
delete (*ptr);
}
return err;
}
```

`InitializeAttribute` will validation the user's configuration, and might return an `Error`. It is clearer to invoke the method `InitializeAttribute` and return an `Error` than let each operator's constructor implement this logic because the constructor cannot return a value.