From cdd28f743f2ed9ee435c885c83acda64dd9cce8d Mon Sep 17 00:00:00 2001 From: Yu Yang Date: Mon, 26 Jun 2017 16:55:36 +0800 Subject: [PATCH 01/14] Init commit for attribute design. * Add background. --- doc/design/attribute.md | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) create mode 100644 doc/design/attribute.md diff --git a/doc/design/attribute.md b/doc/design/attribute.md new file mode 100644 index 0000000000000..e63ffddcc8438 --- /dev/null +++ b/doc/design/attribute.md @@ -0,0 +1,18 @@ +# Attribute of operator + +## background + +In a neural network, each operator could contain some configurable attributes. For example, a cosine similarity operator may contain an attribute named `scale`. The default cosine similarity returns a value in range [-1.0, 1.0]. But the user can set range scale manually, e.g., user set `scale=5.0`, then that cosine operator will return a value in the range [-5.0, 5.0]. + +The configurable attributes could be various types. Some operators need `float` value to configure; some need `string` value. We need a data structure to represent different types. + +Each operator contains different configurable attributes. The names of attributes are not same. We need an associate map from attribute name to attribute value for `Operator`. + +Also as we want to use `protobuf` to serialize and deserialize our model, we need to implement the attribute value and the associated map from attribute name to attribute value in `protobuf`. + +In conclusion, there are four things we know as background. + +1. We need an attribute type for Operator. +1. That attribute type could represent different types. +1. That attribute value should be associated with an attribute name, like a map. +1. We need to implement them in `protobuf`. From 87e3820617ab2962076c84838a32367a12d6a311 Mon Sep 17 00:00:00 2001 From: Yu Yang Date: Mon, 26 Jun 2017 17:06:43 +0800 Subject: [PATCH 02/14] Adding protobuf implementation --- doc/design/attribute.md | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/doc/design/attribute.md b/doc/design/attribute.md index e63ffddcc8438..34a720496d14a 100644 --- a/doc/design/attribute.md +++ b/doc/design/attribute.md @@ -16,3 +16,37 @@ In conclusion, there are four things we know as background. 1. That attribute type could represent different types. 1. That attribute value should be associated with an attribute name, like a map. 1. We need to implement them in `protobuf`. + +## Protobuf Implementation + +There are two frameworks implement `Attribute` concept in `protobuf`. They are [`caffe2`](https://github.com/caffe2/caffe2/blob/master/caffe2/proto/caffe2.proto#L98) and [`tensorflow`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/attr_value.proto#L16). + +* Caffe2 uses `proto2` syntax. It treats all attributes as a list, and each attribute contains a `name`. Each time caffe2 read an attribute is searching a variable in a list. It is slow if the number of attributes is large. Caffe2 also mark all field as `optional`. It doesn't ensure `one of` attribute value is set. +* By using `proto3` syntax in tensorflow, the attribute implementation in tensorflow is using `map`, and `oneof` keywords. Looking up from attribute map in tensorflow is fast. + +Paddle is using `protobuf 3` as its dependency library. By simplify `tensorflow`'s implementation, Paddle's Attribute protobuf message schema could be + +```protobuf +message Attribute { + message ListValue { + repeated int32 ints = 1; + repeated float floats = 2; + repeated string strings = 3; + } + + oneof value { + ListValue list = 1; + int32 i = 2; + float f = 3; + string s = 4; + } +} +``` + +In `OperatorDescription` message, there should be a field like this: + +```protobuf +message OperatorDescription { + map attrs; +} +``` From 581ce7decbf7e4c8286d89c76c8ce0ef0ee806af Mon Sep 17 00:00:00 2001 From: Yu Yang Date: Mon, 26 Jun 2017 17:33:52 +0800 Subject: [PATCH 03/14] Add implementation in CPP --- doc/design/attribute.md | 73 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 72 insertions(+), 1 deletion(-) diff --git a/doc/design/attribute.md b/doc/design/attribute.md index 34a720496d14a..5978b0934aa64 100644 --- a/doc/design/attribute.md +++ b/doc/design/attribute.md @@ -1,4 +1,4 @@ -# Attribute of operator +# Design Doc about operator attribute ## background @@ -50,3 +50,74 @@ message OperatorDescription { map attrs; } ``` + +## CPP implementation + +### AttributeReader + +In CPP, it should be a helper class for reading `map`. The reading method should accept a template parameter, which is the type of Attribute. If type mismatch or attribute is not found, `Get` method should return an `Error`. That helper class we named `AttributeReader`. + +The interface of `AttributeReader` is like this: + +```cpp +using AttributeMap = google::protobuf::Map; +class AttributeReader { + public: + explicit AttributeReader(const AttributeMap& attrs) : attrs_(attrs) {} + + template + Error __must_check Get(const std::string& attributeName, T* attr) const; + + template + Error __must_check GetArray(const std::string& attributeName, + std::vector* array) const; + + private: + const AttributeMap& attrs_; +}; +``` + +There are two methods in `AttributeReader`: `Get` and `GetArray`. `GetArray` is used for `ListValue`, and `Get` is used for the rests. The user should invoke either of them when he wants to get an Attribute value from `AttributeMap`. + +### Attribute in Operator + +Each operator stores its attributes. For faster attribute access, we should not let user parse `AttributeMap` during `Run` method in Operator. When `NetworkBase` adds an operator to computation graph, the `Attribute` could be parsed, and stored in each operator's the private member. + +```cpp +class OperatorBase { + public: + Error InitializeAttribute(const AttributeReader& attrs) = 0; +}; + +class CosineOp : public OperatorBase { + public: + Error InitializeAttribute(const AttributeReader& attrs) { + auto err = attrs.Get("scale", &scale_); + if (!err.isOK() && err != "Attribute Not Found") { + return err; + } + if (scale_ <= 0.0f) { + return Error("Scale of cosine op should be larger than 0.0"); + } + return Error(); // OK; + } + + private: + float scale_ {1.0}; +}; +``` + +When `NetworkBase` invokes `CreateOperator(const OperatorDescription& desc)`, it create an operator first. Then `CreateOperator` will invoke `InitializeAttribute` and returns error code. The implementation of `CreateOperator` could be + +```cpp +Error CreateOperator(const OperatorDescription& desc, OperatorBase** ptr) { + *ptr = OperatorRegister.create(desc.type(), desc.inputs(), desc.outputs()); + Error err = (*ptr) -> InitializeAttribute(desc.attrs()); + if (!err.isOK()) { + delete (*ptr); + } + return err; +} +``` + +`InitializeAttribute` will validation the user's configuration, and might return an `Error`. It is clearer to invoke the method `InitializeAttribute` and return an `Error` than let each operator's constructor implement this logic because the constructor cannot return a value. From 306dcfe73f89d89e813f1b05ed9ef4852fc581ed Mon Sep 17 00:00:00 2001 From: Yu Yang Date: Tue, 27 Jun 2017 15:41:25 +0800 Subject: [PATCH 04/14] Typo, base class should be virtual. --- doc/design/attribute.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/design/attribute.md b/doc/design/attribute.md index 5978b0934aa64..6238572a9b79d 100644 --- a/doc/design/attribute.md +++ b/doc/design/attribute.md @@ -86,7 +86,7 @@ Each operator stores its attributes. For faster attribute access, we should not ```cpp class OperatorBase { public: - Error InitializeAttribute(const AttributeReader& attrs) = 0; + virtual Error InitializeAttribute(const AttributeReader& attrs) = 0; }; class CosineOp : public OperatorBase { From 0d9b9d3ab02552ed1d5c28ea538213cea1326e22 Mon Sep 17 00:00:00 2001 From: Yu Yang Date: Tue, 27 Jun 2017 16:06:59 +0800 Subject: [PATCH 05/14] Add comments --- doc/design/attribute.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/doc/design/attribute.md b/doc/design/attribute.md index 6238572a9b79d..4118c4295cffb 100644 --- a/doc/design/attribute.md +++ b/doc/design/attribute.md @@ -93,6 +93,8 @@ class CosineOp : public OperatorBase { public: Error InitializeAttribute(const AttributeReader& attrs) { auto err = attrs.Get("scale", &scale_); + + // ignore AttributeNotFound because scale_ is default = 1.0 if (!err.isOK() && err != "Attribute Not Found") { return err; } From e3a63d76f5db2d6a6e70e42132eb9afa5b2f5916 Mon Sep 17 00:00:00 2001 From: Yu Yang Date: Wed, 28 Jun 2017 18:12:08 +0800 Subject: [PATCH 06/14] Update Attribute Design * Simplify background description. * Not using Error as return value, use PADDLE_ENFORCE --- doc/design/attribute.md | 59 +++++++++++++++-------------------------- 1 file changed, 21 insertions(+), 38 deletions(-) diff --git a/doc/design/attribute.md b/doc/design/attribute.md index 4118c4295cffb..6e137beebca9e 100644 --- a/doc/design/attribute.md +++ b/doc/design/attribute.md @@ -1,21 +1,14 @@ -# Design Doc about operator attribute +# Design Doc: Operator Attributes -## background +## Background -In a neural network, each operator could contain some configurable attributes. For example, a cosine similarity operator may contain an attribute named `scale`. The default cosine similarity returns a value in range [-1.0, 1.0]. But the user can set range scale manually, e.g., user set `scale=5.0`, then that cosine operator will return a value in the range [-5.0, 5.0]. +An operator could have attributes. For example, CosineOp could have a float typed attribute scale, which changes the output range from [-1,1] to [-scale,scale]. The default value of scale is `1.0`. -The configurable attributes could be various types. Some operators need `float` value to configure; some need `string` value. We need a data structure to represent different types. +Attributes is defined by a name and a type. An instance of an attribute has a value of that type. -Each operator contains different configurable attributes. The names of attributes are not same. We need an associate map from attribute name to attribute value for `Operator`. +As part of the network description, attribute need to be serialized. So we need a protobuf message that describes an attribute, say `Attribute`. -Also as we want to use `protobuf` to serialize and deserialize our model, we need to implement the attribute value and the associated map from attribute name to attribute value in `protobuf`. - -In conclusion, there are four things we know as background. - -1. We need an attribute type for Operator. -1. That attribute type could represent different types. -1. That attribute value should be associated with an attribute name, like a map. -1. We need to implement them in `protobuf`. +An operator could parse the Attribute and save them into its private data member. ## Protobuf Implementation @@ -65,12 +58,14 @@ class AttributeReader { public: explicit AttributeReader(const AttributeMap& attrs) : attrs_(attrs) {} + // return false if attribute is not found. + // For typemismatch, it just called `ENFORCE` and throw exception. template - Error __must_check Get(const std::string& attributeName, T* attr) const; + bool Get(const std::string& attributeName, T* attr) const; + // return false if attribute is not found. template - Error __must_check GetArray(const std::string& attributeName, - std::vector* array) const; + bool GetArray(const std::string& attributeName, std::vector* array) const; private: const AttributeMap& attrs_; @@ -86,22 +81,14 @@ Each operator stores its attributes. For faster attribute access, we should not ```cpp class OperatorBase { public: - virtual Error InitializeAttribute(const AttributeReader& attrs) = 0; + virtual void InitializeAttribute(const AttributeReader& attrs) = 0; }; class CosineOp : public OperatorBase { public: - Error InitializeAttribute(const AttributeReader& attrs) { - auto err = attrs.Get("scale", &scale_); - - // ignore AttributeNotFound because scale_ is default = 1.0 - if (!err.isOK() && err != "Attribute Not Found") { - return err; - } - if (scale_ <= 0.0f) { - return Error("Scale of cosine op should be larger than 0.0"); - } - return Error(); // OK; + void InitializeAttribute(const AttributeReader& attrs) { + attrs.Get("scale", &scale_); + PADDLE_ENFORCE(scale_ > 0.0f, "Scale of consine op should be larger than 0.0"); } private: @@ -109,17 +96,13 @@ class CosineOp : public OperatorBase { }; ``` -When `NetworkBase` invokes `CreateOperator(const OperatorDescription& desc)`, it create an operator first. Then `CreateOperator` will invoke `InitializeAttribute` and returns error code. The implementation of `CreateOperator` could be +When `NetworkBase` invokes `CreateOperator(const OperatorDescription& desc)`, it create an operator first. Then `CreateOperator` will invoke `InitializeAttributeĀ·. The implementation of `CreateOperator` could be ```cpp -Error CreateOperator(const OperatorDescription& desc, OperatorBase** ptr) { - *ptr = OperatorRegister.create(desc.type(), desc.inputs(), desc.outputs()); - Error err = (*ptr) -> InitializeAttribute(desc.attrs()); - if (!err.isOK()) { - delete (*ptr); - } - return err; +std::unique_ptr CreateOperator(const OperatorDescription& desc) { + std::unique_ptr op(OperatorRegister.Create( + desc.type(), desc.inputs(), desc.outputs())); + op->InitializeAttribute(AttributeReader(desc.attrs())); + return std::move(op); } ``` - -`InitializeAttribute` will validation the user's configuration, and might return an `Error`. It is clearer to invoke the method `InitializeAttribute` and return an `Error` than let each operator's constructor implement this logic because the constructor cannot return a value. From 908c8c1bf59b12025896452a5285cff62eaf92f4 Mon Sep 17 00:00:00 2001 From: Yu Yang Date: Thu, 29 Jun 2017 14:10:50 +0800 Subject: [PATCH 07/14] Implement Attribute --- paddle/framework/CMakeLists.txt | 2 + paddle/framework/attribute.proto | 17 ++++ paddle/framework/attribute_reader.h | 108 ++++++++++++++++++++++ paddle/framework/attribute_reader_test.cc | 25 +++++ 4 files changed, 152 insertions(+) create mode 100644 paddle/framework/attribute.proto create mode 100644 paddle/framework/attribute_reader.h create mode 100644 paddle/framework/attribute_reader_test.cc diff --git a/paddle/framework/CMakeLists.txt b/paddle/framework/CMakeLists.txt index b06ecc26286de..548799ab95373 100644 --- a/paddle/framework/CMakeLists.txt +++ b/paddle/framework/CMakeLists.txt @@ -3,3 +3,5 @@ cc_test(ddim_test SRCS ddim_test.cc DEPS ddim) nv_test(dim_test SRCS dim_test.cu DEPS ddim) cc_test(variable_test SRCS variable_test.cc) cc_test(enforce_test SRCS enforce_test.cc) +proto_library(attribute SRCS attribute.proto) +cc_test(attribute_reader_test SRCS attribute_reader_test.cc DEPS attribute protobuf) diff --git a/paddle/framework/attribute.proto b/paddle/framework/attribute.proto new file mode 100644 index 0000000000000..1b9bb25724de1 --- /dev/null +++ b/paddle/framework/attribute.proto @@ -0,0 +1,17 @@ +syntax="proto3"; +package paddle.framework; + +message Attribute { + message ListValue { + repeated int32 ints = 1; + repeated float floats = 2; + repeated string strings = 3; + } + + oneof value { + ListValue list = 1; + int32 i = 2; + float f = 3; + string s = 4; + } +} diff --git a/paddle/framework/attribute_reader.h b/paddle/framework/attribute_reader.h new file mode 100644 index 0000000000000..3047d8a38e2e5 --- /dev/null +++ b/paddle/framework/attribute_reader.h @@ -0,0 +1,108 @@ +#pragma once +#include +#include +#include "attribute.pb.h" + +namespace paddle { +namespace framework { +using AttributeMap = google::protobuf::Map; +class AttributeReader { + public: + explicit AttributeReader(const AttributeMap& attrs) : attrs_(attrs) {} + + template + bool ContainPlain(const std::string& name) const; + + bool IsArray(const std::string& name) const; + + template + T Get(const std::string& name) const; + + template + void GetArray(const std::string& name, std::vector* vec) const; + + private: + const AttributeMap& attrs_; +}; + +/// Implementation +namespace details { +inline const ::paddle::framework::Attribute* GetField(const AttributeMap& attrs, + const std::string& name) { + auto it = attrs.find(name); + if (it == attrs.end()) { + return nullptr; + } else { + return &it->second; + } +} +} // namespace details + +bool AttributeReader::IsArray(const std::string& name) const { + auto attr = details::GetField(attrs_, name); + if (attr) { + return attr->has_list(); + } else { + return false; + } +} + +namespace details { +template +inline bool IsType(const ::paddle::framework::Attribute* attr); +} // namespace details + +template +bool AttributeReader::ContainPlain(const std::string& name) const { + auto attr = details::GetField(attrs_, name); + if (attr) { + return details::IsType(attr); + } else { + return false; + } +} + +#define ATTR_READER_ISTYPE_IMPL(T, CASE) \ + namespace details { \ + template <> \ + inline bool IsType(const ::paddle::framework::Attribute* attr) { \ + return attr->value_case() == CASE; \ + } \ + } + +ATTR_READER_ISTYPE_IMPL(int, ::paddle::framework::Attribute::kI); +ATTR_READER_ISTYPE_IMPL(float, ::paddle::framework::Attribute::kF); +ATTR_READER_ISTYPE_IMPL(std::string, ::paddle::framework::Attribute::kS); + +#undef ATTR_READER_ISTYPE_IMPL + +namespace details { +template +inline T GetValue(const ::paddle::framework::Attribute* attr); +} + +template +T AttributeReader::Get(const std::string& name) const { + auto attr = details::GetField(attrs_, name); + PADDLE_ENFORCE(attr != nullptr, "Attribute %s not found", name); + PADDLE_ENFORCE(details::IsType(attr), + "Attribute type mismatch. Expected %s", typeid(T).name()); + return details::GetValue(attr); +} + +#define ATTR_READER_GETVALUE_IMPL(T, FIELD) \ + namespace details { \ + template <> \ + inline T GetValue(const ::paddle::framework::Attribute* attr) { \ + return attr->FIELD(); \ + } \ + } + +ATTR_READER_GETVALUE_IMPL(int, i); +ATTR_READER_GETVALUE_IMPL(float, f); +ATTR_READER_GETVALUE_IMPL(std::string, s); + +#undef ATTR_READER_GETVALUE_IMPL + +} // namespace framework +} // namespace paddle diff --git a/paddle/framework/attribute_reader_test.cc b/paddle/framework/attribute_reader_test.cc new file mode 100644 index 0000000000000..41b4f2a5157f1 --- /dev/null +++ b/paddle/framework/attribute_reader_test.cc @@ -0,0 +1,25 @@ +#include "attribute_reader.h" +#include + +TEST(AttributeReader, ReadPlain) { + using paddle::framework::AttributeMap; + AttributeMap test; + test["floatValue"].set_f(0.23); + test["strValue"].set_s("unittest string"); + test["intValue"].set_i(-1); + + using paddle::framework::AttributeReader; + + AttributeReader reader(test); + + ASSERT_TRUE(reader.ContainPlain("intValue")); + ASSERT_TRUE(reader.ContainPlain("strValue")); + ASSERT_TRUE(reader.ContainPlain("floatValue")); + + ASSERT_EQ(-1, reader.Get("intValue")); + ASSERT_EQ("unittest string", reader.Get("strValue")); + ASSERT_NEAR(0.23f, reader.Get("floatValue"), 1e-5); + + ASSERT_FALSE(reader.ContainPlain("intValue")); + ASSERT_FALSE(reader.ContainPlain("otherValue")); +} \ No newline at end of file From 7250a92d4e6038cd132e14df9a339f1a1f38bb68 Mon Sep 17 00:00:00 2001 From: Yu Yang Date: Thu, 29 Jun 2017 15:23:37 +0800 Subject: [PATCH 08/14] Change AttributeReader's header --- paddle/framework/attribute_reader.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/paddle/framework/attribute_reader.h b/paddle/framework/attribute_reader.h index 3047d8a38e2e5..60057bb4604e7 100644 --- a/paddle/framework/attribute_reader.h +++ b/paddle/framework/attribute_reader.h @@ -1,7 +1,7 @@ #pragma once #include +#include #include -#include "attribute.pb.h" namespace paddle { namespace framework { From 30907858cf88fcbb506345c8bfd85ec773e831c5 Mon Sep 17 00:00:00 2001 From: Yu Yang Date: Thu, 29 Jun 2017 15:24:12 +0800 Subject: [PATCH 09/14] Fix TravisCI --- paddle/pserver/CMakeLists.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/paddle/pserver/CMakeLists.txt b/paddle/pserver/CMakeLists.txt index f2e0b4b76bcad..2245c7d88ca74 100644 --- a/paddle/pserver/CMakeLists.txt +++ b/paddle/pserver/CMakeLists.txt @@ -17,7 +17,7 @@ add_library(paddle_network STATIC add_style_check_target(paddle_network ${NETWORK_SOURCES}) add_style_check_target(paddle_network ${NETWORK_HEADERS}) -add_dependencies(paddle_network paddle_proto) +add_dependencies(paddle_network paddle_proto ${external_project_dependencies}) ################### paddle_pserver ###################### set(PSERVER_SOURCES From b90a3a6c7e96ae56f3c0592da41f05e7a6c641f7 Mon Sep 17 00:00:00 2001 From: Yu Yang Date: Thu, 29 Jun 2017 16:18:38 +0800 Subject: [PATCH 10/14] Add GetArray --- paddle/framework/attribute_reader.h | 21 ++++++++++++++++++ paddle/framework/attribute_reader_test.cc | 27 +++++++++++++++++++++++ 2 files changed, 48 insertions(+) diff --git a/paddle/framework/attribute_reader.h b/paddle/framework/attribute_reader.h index 60057bb4604e7..2bad703f0a0d2 100644 --- a/paddle/framework/attribute_reader.h +++ b/paddle/framework/attribute_reader.h @@ -2,6 +2,8 @@ #include #include #include +#include +#include namespace paddle { namespace framework { @@ -104,5 +106,24 @@ ATTR_READER_GETVALUE_IMPL(std::string, s); #undef ATTR_READER_GETVALUE_IMPL +#define ATTR_GETARRAY_IMPL(T, FIELD) \ + template <> \ + void AttributeReader::GetArray(const std::string& name, \ + std::vector* vec) const { \ + PADDLE_ENFORCE(vec->empty(), "Input vector should be empty"); \ + auto attr = details::GetField(attrs_, name); \ + PADDLE_ENFORCE(attr != nullptr, "Attribute %s not found", name); \ + PADDLE_ENFORCE(attr->has_list(), "Attribute %s is not array", name); \ + auto& field = attr->list().FIELD(); \ + vec->reserve(field.size()); \ + std::copy(field.begin(), field.end(), std::back_inserter(*vec)); \ + } + +ATTR_GETARRAY_IMPL(int, ints); +ATTR_GETARRAY_IMPL(float, floats); +ATTR_GETARRAY_IMPL(std::string, strings); + +#undef ATTR_GETARRAY_IMPL + } // namespace framework } // namespace paddle diff --git a/paddle/framework/attribute_reader_test.cc b/paddle/framework/attribute_reader_test.cc index 41b4f2a5157f1..8e407b307ec0c 100644 --- a/paddle/framework/attribute_reader_test.cc +++ b/paddle/framework/attribute_reader_test.cc @@ -22,4 +22,31 @@ TEST(AttributeReader, ReadPlain) { ASSERT_FALSE(reader.ContainPlain("intValue")); ASSERT_FALSE(reader.ContainPlain("otherValue")); +} + +TEST(AttributeReader, ReadArray) { + using paddle::framework::AttributeMap; + AttributeMap test; + auto ilist = test["listInt"].mutable_list()->mutable_ints(); + auto strlist = test["listStr"].mutable_list()->mutable_strings(); + constexpr int length = 4; + ilist->Reserve(length); + strlist->Reserve(length); + std::vector expected; + std::vector expectedStr; + expected.reserve(length); + for (int i = 0; i < length; ++i) { + *ilist->Add() = i; + *strlist->Add() = std::to_string(i); + expected.push_back(i); + expectedStr.push_back(std::to_string(i)); + } + std::vector actual; + std::vector actualStr; + using paddle::framework::AttributeReader; + AttributeReader reader(test); + reader.GetArray("listInt", &actual); + reader.GetArray("listStr", &actualStr); + ASSERT_EQ(expected, actual); + ASSERT_EQ(expectedStr, actualStr); } \ No newline at end of file From 911113d55eb06ec5889ef8d319b9c76add99fb2e Mon Sep 17 00:00:00 2001 From: Yu Yang Date: Thu, 29 Jun 2017 18:49:36 +0800 Subject: [PATCH 11/14] Simplify API of AttributeReader Add copy right and comments Update design doc --- cmake/flags.cmake | 3 +- doc/design/attribute.md | 18 ++-- paddle/framework/attribute_reader.h | 108 +++++++++++++++++----- paddle/framework/attribute_reader_test.cc | 27 +++++- 4 files changed, 120 insertions(+), 36 deletions(-) diff --git a/cmake/flags.cmake b/cmake/flags.cmake index 7a996dea92b13..6978bad8a731f 100644 --- a/cmake/flags.cmake +++ b/cmake/flags.cmake @@ -101,7 +101,8 @@ set(COMMON_FLAGS -fPIC -fno-omit-frame-pointer -Wall - -Wextra +# -Wextra # Because map implementation of protobuf 3.1 has + # several warnings in GCC Wextra. Disable Wextra until we upgrad our protobuf library. -Werror -Wnon-virtual-dtor -Wdelete-non-virtual-dtor diff --git a/doc/design/attribute.md b/doc/design/attribute.md index 6e137beebca9e..831afe823e6c2 100644 --- a/doc/design/attribute.md +++ b/doc/design/attribute.md @@ -58,14 +58,14 @@ class AttributeReader { public: explicit AttributeReader(const AttributeMap& attrs) : attrs_(attrs) {} - // return false if attribute is not found. - // For typemismatch, it just called `ENFORCE` and throw exception. template - bool Get(const std::string& attributeName, T* attr) const; + T Get(const std::string& attributeName) const; - // return false if attribute is not found. template - bool GetArray(const std::string& attributeName, std::vector* array) const; + void GetArray(const std::string& attributeName, std::vector* array) const; + + template + bool Contain(const std::string& name) const; private: const AttributeMap& attrs_; @@ -87,8 +87,10 @@ class OperatorBase { class CosineOp : public OperatorBase { public: void InitializeAttribute(const AttributeReader& attrs) { - attrs.Get("scale", &scale_); - PADDLE_ENFORCE(scale_ > 0.0f, "Scale of consine op should be larger than 0.0"); + if (attrs.Contain("scale")) { + scale_ = attrs.Get("scale"); + PADDLE_ENFORCE(scale_ > 0.0f, "Scale of consine op should be larger than 0.0"); + } } private: @@ -96,7 +98,7 @@ class CosineOp : public OperatorBase { }; ``` -When `NetworkBase` invokes `CreateOperator(const OperatorDescription& desc)`, it create an operator first. Then `CreateOperator` will invoke `InitializeAttributeĀ·. The implementation of `CreateOperator` could be +When `NetworkBase` invokes `CreateOperator(const OperatorDescription& desc)`, it create an operator first. Then `CreateOperator` will invoke `InitializeAttribute`. The implementation of `CreateOperator` could be ```cpp std::unique_ptr CreateOperator(const OperatorDescription& desc) { diff --git a/paddle/framework/attribute_reader.h b/paddle/framework/attribute_reader.h index 2bad703f0a0d2..bcf26fdfcec06 100644 --- a/paddle/framework/attribute_reader.h +++ b/paddle/framework/attribute_reader.h @@ -1,8 +1,23 @@ +/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. */ + #pragma once #include #include #include #include +#include #include namespace paddle { @@ -12,14 +27,47 @@ class AttributeReader { public: explicit AttributeReader(const AttributeMap& attrs) : attrs_(attrs) {} + /** + * @brief Contains a attribute with name and type T. + * + * The example code like + * @code{cpp} + * AttributeReader reader; + * + * assert(reader.Contain("SomeIntValue")==true); + * assert(reader.Contain("SomeIntValue")==false); + * assert(reader.Contain>("SomeIntList")==true); + * @endcode{cpp} + * + * @tparam T Attribute Type, could be {int, float, string and std::vector of + * them}. + * @param name attribute name + * @return true if contain an attribute with name and type T, false if Type + * mismatch or not contains that name. + */ template - bool ContainPlain(const std::string& name) const; - - bool IsArray(const std::string& name) const; - + bool Contain(const std::string& name) const; + + /** + * @brief Get Attribute value. Not support std::vector. If want to return a + * std::vector, use `GetArray` + * @tparam T could be int, float, string. + * @param name attribute name. + * @return Value + * @throw If attribute is not found or type mismatch, an EnforceNotMet will + * throw + */ template T Get(const std::string& name) const; + /** + * @brief Get Attribute array values. + * @tparam T could be int, float, string. + * @param name attribute name. + * @param vec the return vector. Must be empty. + * @throw If attribute is not found, or type mismatch, or vec is not empty, an + * EnforceNotMet will throw + */ template void GetArray(const std::string& name, std::vector* vec) const; @@ -27,7 +75,7 @@ class AttributeReader { const AttributeMap& attrs_; }; -/// Implementation +/// Implementation of Contain namespace details { inline const ::paddle::framework::Attribute* GetField(const AttributeMap& attrs, const std::string& name) { @@ -38,30 +86,44 @@ inline const ::paddle::framework::Attribute* GetField(const AttributeMap& attrs, return &it->second; } } -} // namespace details -bool AttributeReader::IsArray(const std::string& name) const { - auto attr = details::GetField(attrs_, name); - if (attr) { - return attr->has_list(); - } else { - return false; +template +inline bool IsType(const ::paddle::framework::Attribute* attr); + +template +struct ContainImpl {}; + +template +struct ContainImpl { + bool operator()(const AttributeMap& attrs, const std::string& name) { + auto attr = GetField(attrs, name); + if (attr) { + return details::IsType(attr); + } else { + return false; + } } -} +}; -namespace details { template -inline bool IsType(const ::paddle::framework::Attribute* attr); +struct ContainImpl { + bool operator()(const AttributeMap& attrs, const std::string& name) { + auto attr = GetField(attrs, name); + if (attr) { + return attr->has_list(); + } else { + return false; + } + } +}; } // namespace details template -bool AttributeReader::ContainPlain(const std::string& name) const { - auto attr = details::GetField(attrs_, name); - if (attr) { - return details::IsType(attr); - } else { - return false; - } +bool AttributeReader::Contain(const std::string& name) const { + constexpr bool is_vec = std::is_same>::value || + std::is_same>::value || + std::is_same>::value; + return details::ContainImpl()(attrs_, name); } #define ATTR_READER_ISTYPE_IMPL(T, CASE) \ @@ -78,6 +140,7 @@ ATTR_READER_ISTYPE_IMPL(std::string, ::paddle::framework::Attribute::kS); #undef ATTR_READER_ISTYPE_IMPL +/// Implementation of Get namespace details { template inline T GetValue(const ::paddle::framework::Attribute* attr); @@ -106,6 +169,7 @@ ATTR_READER_GETVALUE_IMPL(std::string, s); #undef ATTR_READER_GETVALUE_IMPL +/// Implementation of GetArray #define ATTR_GETARRAY_IMPL(T, FIELD) \ template <> \ void AttributeReader::GetArray(const std::string& name, \ diff --git a/paddle/framework/attribute_reader_test.cc b/paddle/framework/attribute_reader_test.cc index 8e407b307ec0c..1e4042190f706 100644 --- a/paddle/framework/attribute_reader_test.cc +++ b/paddle/framework/attribute_reader_test.cc @@ -1,3 +1,17 @@ +/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. */ + #include "attribute_reader.h" #include @@ -12,16 +26,16 @@ TEST(AttributeReader, ReadPlain) { AttributeReader reader(test); - ASSERT_TRUE(reader.ContainPlain("intValue")); - ASSERT_TRUE(reader.ContainPlain("strValue")); - ASSERT_TRUE(reader.ContainPlain("floatValue")); + ASSERT_TRUE(reader.Contain("intValue")); + ASSERT_TRUE(reader.Contain("strValue")); + ASSERT_TRUE(reader.Contain("floatValue")); ASSERT_EQ(-1, reader.Get("intValue")); ASSERT_EQ("unittest string", reader.Get("strValue")); ASSERT_NEAR(0.23f, reader.Get("floatValue"), 1e-5); - ASSERT_FALSE(reader.ContainPlain("intValue")); - ASSERT_FALSE(reader.ContainPlain("otherValue")); + ASSERT_FALSE(reader.Contain("intValue")); + ASSERT_FALSE(reader.Contain("otherValue")); } TEST(AttributeReader, ReadArray) { @@ -49,4 +63,7 @@ TEST(AttributeReader, ReadArray) { reader.GetArray("listStr", &actualStr); ASSERT_EQ(expected, actual); ASSERT_EQ(expectedStr, actualStr); + + ASSERT_TRUE(reader.Contain>("listInt")); + ASSERT_TRUE(reader.Contain>("listStr")); } \ No newline at end of file From 1f3552682a0bfd6a71859acc0df8dc9c5a6e178a Mon Sep 17 00:00:00 2001 From: Yu Yang Date: Thu, 29 Jun 2017 18:56:22 +0800 Subject: [PATCH 12/14] Rename Contain -> Contains --- doc/design/attribute.md | 2 +- paddle/framework/attribute_reader.h | 12 ++++++------ paddle/framework/attribute_reader_test.cc | 14 +++++++------- 3 files changed, 14 insertions(+), 14 deletions(-) diff --git a/doc/design/attribute.md b/doc/design/attribute.md index 831afe823e6c2..500aad9244f2f 100644 --- a/doc/design/attribute.md +++ b/doc/design/attribute.md @@ -65,7 +65,7 @@ class AttributeReader { void GetArray(const std::string& attributeName, std::vector* array) const; template - bool Contain(const std::string& name) const; + bool Contains(const std::string& name) const; private: const AttributeMap& attrs_; diff --git a/paddle/framework/attribute_reader.h b/paddle/framework/attribute_reader.h index bcf26fdfcec06..b8e6407a99913 100644 --- a/paddle/framework/attribute_reader.h +++ b/paddle/framework/attribute_reader.h @@ -46,7 +46,7 @@ class AttributeReader { * mismatch or not contains that name. */ template - bool Contain(const std::string& name) const; + bool Contains(const std::string& name) const; /** * @brief Get Attribute value. Not support std::vector. If want to return a @@ -91,10 +91,10 @@ template inline bool IsType(const ::paddle::framework::Attribute* attr); template -struct ContainImpl {}; +struct ContainsImpl {}; template -struct ContainImpl { +struct ContainsImpl { bool operator()(const AttributeMap& attrs, const std::string& name) { auto attr = GetField(attrs, name); if (attr) { @@ -106,7 +106,7 @@ struct ContainImpl { }; template -struct ContainImpl { +struct ContainsImpl { bool operator()(const AttributeMap& attrs, const std::string& name) { auto attr = GetField(attrs, name); if (attr) { @@ -119,11 +119,11 @@ struct ContainImpl { } // namespace details template -bool AttributeReader::Contain(const std::string& name) const { +bool AttributeReader::Contains(const std::string& name) const { constexpr bool is_vec = std::is_same>::value || std::is_same>::value || std::is_same>::value; - return details::ContainImpl()(attrs_, name); + return details::ContainsImpl()(attrs_, name); } #define ATTR_READER_ISTYPE_IMPL(T, CASE) \ diff --git a/paddle/framework/attribute_reader_test.cc b/paddle/framework/attribute_reader_test.cc index 1e4042190f706..3e1137da7d46c 100644 --- a/paddle/framework/attribute_reader_test.cc +++ b/paddle/framework/attribute_reader_test.cc @@ -26,16 +26,16 @@ TEST(AttributeReader, ReadPlain) { AttributeReader reader(test); - ASSERT_TRUE(reader.Contain("intValue")); - ASSERT_TRUE(reader.Contain("strValue")); - ASSERT_TRUE(reader.Contain("floatValue")); + ASSERT_TRUE(reader.Contains("intValue")); + ASSERT_TRUE(reader.Contains("strValue")); + ASSERT_TRUE(reader.Contains("floatValue")); ASSERT_EQ(-1, reader.Get("intValue")); ASSERT_EQ("unittest string", reader.Get("strValue")); ASSERT_NEAR(0.23f, reader.Get("floatValue"), 1e-5); - ASSERT_FALSE(reader.Contain("intValue")); - ASSERT_FALSE(reader.Contain("otherValue")); + ASSERT_FALSE(reader.Contains("intValue")); + ASSERT_FALSE(reader.Contains("otherValue")); } TEST(AttributeReader, ReadArray) { @@ -64,6 +64,6 @@ TEST(AttributeReader, ReadArray) { ASSERT_EQ(expected, actual); ASSERT_EQ(expectedStr, actualStr); - ASSERT_TRUE(reader.Contain>("listInt")); - ASSERT_TRUE(reader.Contain>("listStr")); + ASSERT_TRUE(reader.Contains>("listInt")); + ASSERT_TRUE(reader.Contains>("listStr")); } \ No newline at end of file From 224c6a4f01613ddd94053a7514b522dada87ca0b Mon Sep 17 00:00:00 2001 From: Yu Yang Date: Thu, 29 Jun 2017 19:09:49 +0800 Subject: [PATCH 13/14] Simplify Attribute Doc --- cmake/flags.cmake | 2 +- doc/design/attribute.md | 18 ++++++++++-------- 2 files changed, 11 insertions(+), 9 deletions(-) diff --git a/cmake/flags.cmake b/cmake/flags.cmake index 6978bad8a731f..fd97210acc305 100644 --- a/cmake/flags.cmake +++ b/cmake/flags.cmake @@ -102,7 +102,7 @@ set(COMMON_FLAGS -fno-omit-frame-pointer -Wall # -Wextra # Because map implementation of protobuf 3.1 has - # several warnings in GCC Wextra. Disable Wextra until we upgrad our protobuf library. + # several warnings in GCC Wextra. Disable Wextra until we upgrade our protobuf library. -Werror -Wnon-virtual-dtor -Wdelete-non-virtual-dtor diff --git a/doc/design/attribute.md b/doc/design/attribute.md index 500aad9244f2f..f2bf56ba07503 100644 --- a/doc/design/attribute.md +++ b/doc/design/attribute.md @@ -48,7 +48,7 @@ message OperatorDescription { ### AttributeReader -In CPP, it should be a helper class for reading `map`. The reading method should accept a template parameter, which is the type of Attribute. If type mismatch or attribute is not found, `Get` method should return an `Error`. That helper class we named `AttributeReader`. +In CPP, it should be a helper class for reading `map`. The reading methods in that helper class should accept a template parameter, which is the type of Attribute. That helper class we named `AttributeReader`. The interface of `AttributeReader` is like this: @@ -58,12 +58,16 @@ class AttributeReader { public: explicit AttributeReader(const AttributeMap& attrs) : attrs_(attrs) {} + // Get a plain type T attribute, which name is `name` template - T Get(const std::string& attributeName) const; + T Get(const std::string& name) const; + // Get attribute with a array of type T, which name is `name` template - void GetArray(const std::string& attributeName, std::vector* array) const; + void GetArray(const std::string& name, std::vector* vec) const; + // Is that `name` attribute with type T in map or not. + // T could be int, float, string and std::vector of them template bool Contains(const std::string& name) const; @@ -72,11 +76,9 @@ class AttributeReader { }; ``` -There are two methods in `AttributeReader`: `Get` and `GetArray`. `GetArray` is used for `ListValue`, and `Get` is used for the rests. The user should invoke either of them when he wants to get an Attribute value from `AttributeMap`. - ### Attribute in Operator -Each operator stores its attributes. For faster attribute access, we should not let user parse `AttributeMap` during `Run` method in Operator. When `NetworkBase` adds an operator to computation graph, the `Attribute` could be parsed, and stored in each operator's the private member. +Each operator parse and store its attribute into private member data when `InitializeAttribute`. That method will be invoked by `CreateOperator `. User can use `PADDLE_ENFORCE` to validate attribute. Also, use `Contains` method, user can set default value of attributes. ```cpp class OperatorBase { @@ -87,7 +89,7 @@ class OperatorBase { class CosineOp : public OperatorBase { public: void InitializeAttribute(const AttributeReader& attrs) { - if (attrs.Contain("scale")) { + if (attrs.Contains("scale")) { scale_ = attrs.Get("scale"); PADDLE_ENFORCE(scale_ > 0.0f, "Scale of consine op should be larger than 0.0"); } @@ -98,7 +100,7 @@ class CosineOp : public OperatorBase { }; ``` -When `NetworkBase` invokes `CreateOperator(const OperatorDescription& desc)`, it create an operator first. Then `CreateOperator` will invoke `InitializeAttribute`. The implementation of `CreateOperator` could be +`InitializeAttribute` will be invoked by `CreateOperator`. Since `InitializeAttribute ` could throw an EnforceNotMet, a `unique_ptr` is used to make code exception-safe. ```cpp std::unique_ptr CreateOperator(const OperatorDescription& desc) { From 1895f06357109739d193e8bd214f56f702434445 Mon Sep 17 00:00:00 2001 From: Yu Yang Date: Thu, 29 Jun 2017 19:27:54 +0800 Subject: [PATCH 14/14] Add comments in code --- paddle/framework/attribute.proto | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/paddle/framework/attribute.proto b/paddle/framework/attribute.proto index 1b9bb25724de1..ce751292d1457 100644 --- a/paddle/framework/attribute.proto +++ b/paddle/framework/attribute.proto @@ -1,6 +1,9 @@ syntax="proto3"; package paddle.framework; +// Attribute for OperatorDescription. +// It represent a variant type of value. +// The int, float, string and repeated of them are supported. message Attribute { message ListValue { repeated int32 ints = 1; @@ -9,7 +12,10 @@ message Attribute { } oneof value { + // Since proto3 not support repeated filed in `oneof`, we use + // ListValue to support repeated. ListValue list = 1; + int32 i = 2; float f = 3; string s = 4;