-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ML] Implement new rules design #119
[ML] Implement new rules design #119
Conversation
250a85b
to
aaa2e04
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some tidy up suggestions, but basically LGTM.
return false; | ||
} | ||
|
||
for (auto& member : scopeObject.GetObject()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: looks like it can be const auto&
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried this but it doesn't compile. I guess I'm using public fields of member
and the square brackets operator which probably is not const.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's because RapidJSON doesn't have begin()
and end()
functions that return const iterators.
Read downwards from "Just one problem ...." in https://mbevin.wordpress.com/2012/11/14/range-based-for/
In RapidJSON there is this:
#if RAPIDJSON_HAS_CXX11_RANGE_FOR
MemberIterator begin() const { return value_.MemberBegin(); }
MemberIterator end() const { return value_.MemberEnd(); }
#endif
But sadly no corresponding functions that return ConstMemberIterator
.
It would be trivial to add - you could submit a PR to RapidJSON if you want to be famous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll hold on to get famous as a rock star instead!
lib/model/CDetectionRule.cc
Outdated
: m_Action(E_FilterResults), m_Conditions(), m_ConditionsConnective(E_Or), | ||
m_TargetFieldName(), m_TargetFieldValue() { | ||
m_Conditions.reserve(1); | ||
CDetectionRule::CDetectionRule() : m_Action(E_SkipResult), m_Conditions() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you can initialise m_Action
to E_SkipResult
where it is declared in the class body and then keep the auto generated default constructor.
lib/model/CDetectionRule.cc
Outdated
@@ -51,99 +42,48 @@ bool CDetectionRule::apply(ERuleAction action, | |||
return false; | |||
} | |||
|
|||
if (this->isInScope(model, pid, cid) == false) { | |||
if (m_Scope.check(model, pid, cid) == false) { | |||
return false; | |||
} | |||
|
|||
for (std::size_t i = 0; i < m_Conditions.size(); ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: switch to range based for loop.
lib/model/CRuleScope.cc
Outdated
bool CRuleScope::check(const CAnomalyDetectorModel& model, std::size_t pid, std::size_t cid) const { | ||
|
||
const CDataGatherer& gatherer = model.dataGatherer(); | ||
for (auto& scopeField : m_Scope) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const auto&
lib/model/CRuleScope.cc
Outdated
|
||
std::string CRuleScope::print() const { | ||
std::string result{""}; | ||
if (m_Scope.empty() == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this if is unnecessary, begin can safely be called and you don't enter the loop in this case so no harm done.
lib/model/CRuleScope.cc
Outdated
} | ||
|
||
void CRuleScope::exclude(std::string field, const core::CPatternSet& filter) { | ||
m_Scope.push_back({field, TPatternSetCRef(filter), E_Exclude}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: emplace_back
lib/model/CRuleScope.cc
Outdated
} | ||
|
||
void CRuleScope::include(std::string field, const core::CPatternSet& filter) { | ||
m_Scope.push_back({field, TPatternSetCRef(filter), E_Include}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: emplace_back
lib/model/CRuleScope.cc
Outdated
namespace ml { | ||
namespace model { | ||
|
||
CRuleScope::CRuleScope() : m_Scope() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use CRuleScope() = default
in header.
lib/model/CRuleCondition.cc
Outdated
} | ||
std::string result = this->print(m_AppliesTo); | ||
result += " " + this->print(m_Condition.s_Op) + " " + | ||
core::CStringUtils::typeToString(m_Condition.s_Threshold); | ||
return result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: naming the result doesn't gain anything in readability IMO, so I'd return a temporary here to guaranty RVO, i.e.
return this->print(m_AppliesTo) + " " + this->print(m_Condition.s_Op) + " " +
core::CStringUtils::typeToString(m_Condition.s_Threshold);
The ml-cpp side of the new rules implementation.
a518172
to
d1c9b36
Compare
@droberts195 @tveasey Could you have another look at this please? |
Noting here that I had a go at trying to reuse the state restore traverser paradigm to parse the rules. However, this didn't work out as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I left a couple of nits about unnecessary copying. In addition to these there will still be unnecessary copying because core::CTriple
doesn't have constructors that take rvalue references. Let's not complicate this PR by changing that here too, but it's worth noting that until all our classes are capable of efficiently consuming moved objects it might be better to stick to passing strings by const reference instead of value.
lib/model/CRuleScope.cc
Outdated
namespace model { | ||
|
||
void CRuleScope::include(std::string field, const core::CPatternSet& filter) { | ||
m_Scope.emplace_back(field, TPatternSetCRef(filter), E_Include); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're passing field
by value you should move it into m_Scope
, i.e.:
m_Scope.emplace_back(std::move(field), TPatternSetCRef(filter), E_Include);
Same in exclude
immediately below.
lib/model/CDetectionRule.cc
Outdated
void CDetectionRule::addCondition(const CRuleCondition& condition) { | ||
m_Conditions.push_back(condition); | ||
void CDetectionRule::includeScope(std::string field, const core::CPatternSet& filter) { | ||
m_Scope.include(field, filter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use std::move(field)
, as it was passed by value and is not used anywhere else.
Same in excludeScope
immediately below.
Regarding these cases: I agree, I don't think we should abandon passing by const reference, for consistency if nothing else. However, this is basically LGTM2. |
The ml-cpp side of the new rules implementation.
The ml-cpp side of elastic/elasticsearch#31110.
Release note: Detectors now support custom rules that enable the user to improve machine learning results by providing some domain-specific knowledge in the form of rule.