Add chunk eval op #5016

Merged 4 commits on Nov 10, 2017

Conversation

guoshengCS
Contributor

resolves #4749

ctx->SetOutputDim("F1-Score", {1});
}

framework::DataType IndicateDataType(
Contributor

IndicateDataType is a protected member function.

 protected:
  framework::DataType IndicateDataType(...) {
  }

Contributor Author

Done.

tag_single = -1;
} else {
PADDLE_THROW("Unknown chunk scheme.");
}
Contributor

Should we define a struct for these arguments and move their initialization into a separate member function?
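
One way this suggestion could look is sketched below. The struct name `ChunkSchemeConfig` and the helper `MakeChunkSchemeConfig` are hypothetical and do not come from the PR. The IOB values (0 for B, 1 for I) follow the doc under review; the use of -1 for tags a scheme lacks and the IOBES ordering B=0, I=1, E=2, S=3 are assumptions made for illustration.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical container for the per-scheme tag ids (not taken from the PR).
// A value of -1 is assumed to mean "this scheme has no such tag".
struct ChunkSchemeConfig {
  int num_tag_types;
  int tag_begin;
  int tag_inside;
  int tag_end;
  int tag_single;
};

// Builds the config for a scheme name in one place, so the kernel
// only has to call this helper instead of initializing each field inline.
inline ChunkSchemeConfig MakeChunkSchemeConfig(const std::string& scheme) {
  if (scheme == "IOB") {
    // Two tag types: 0 for B and 1 for I, as described in the doc.
    return ChunkSchemeConfig{2, 0, 1, -1, -1};
  }
  if (scheme == "IOBES") {
    // Four tag types; the ordering B=0, I=1, E=2, S=3 is an assumption.
    return ChunkSchemeConfig{4, 0, 1, 2, 3};
  }
  // Stand-in for PADDLE_THROW so the sketch compiles without Paddle.
  throw std::invalid_argument("Unknown chunk scheme.");
}
```

With something like this, the kernel would call `MakeChunkSchemeConfig(scheme)` once and pass the struct around instead of five separate integers.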

Chunk evaluator is used to evaluate segment labelling accuracy for a
sequence. It calculates precision, recall and F1 scores for the chunk detection.
To use chunk evaluator, several concepts need to be clarified firstly.
[Chunk type] is the type of the whole chunk and a chunk consists of one or several words. (For example in NER, ORG for organization name, PER for person name etc.)
Contributor

Add an empty line before lines 81 and 82.

Spell out NER (named entity recognition) in full.

Contributor Author

Rewrite the doc.

IOBES: Four labels for chunk type X: B-X for chunk beginning, I-X for chunk inside, E-X for chunk end and S-X for single-word chunk.

To make it clear, let's illustrate by an NER example.
Assuming that there are three named entity types including ORG, PER and LOC which are called 'chunk type' here,
Contributor

Explain LOC here?

Contributor Author

Rewrite the doc.


tagType = label % numTagType
chunkType = label / numTagType
otherChunkType = numChunkTypes
Contributor

numTagType and numChunkTypes are clear here, but it would be better to explain them again.

Contributor Author

Rewrite the doc.

tag_end, tag_single, excluded_chunk_types);
}
*precision_data =
!num_output_segments ? 0 : (T)num_correct / num_output_segments;
Contributor

(T)num_correct -> static_cast<T>(num_correct)
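
A minimal sketch of the metric computation with `static_cast` in place of the C-style cast is shown below. The names `ChunkMetrics` and `ComputeChunkMetrics` are hypothetical, and the F1 line uses the standard harmonic-mean formula, which is an assumption here since the PR's F1 code is not quoted in this thread.

```cpp
#include <cstdint>

// Hypothetical result type; not part of the PR.
template <typename T>
struct ChunkMetrics {
  T precision;
  T recall;
  T f1;
};

// Computes precision, recall and F1 from segment counts.
// Each ratio falls back to 0 when its denominator is 0.
template <typename T>
ChunkMetrics<T> ComputeChunkMetrics(std::int64_t num_correct,
                                    std::int64_t num_output_segments,
                                    std::int64_t num_label_segments) {
  ChunkMetrics<T> m;
  m.precision = num_output_segments == 0
                    ? static_cast<T>(0)
                    : static_cast<T>(num_correct) / num_output_segments;
  m.recall = num_label_segments == 0
                 ? static_cast<T>(0)
                 : static_cast<T>(num_correct) / num_label_segments;
  // Standard F1 (harmonic mean of precision and recall); assumed formula.
  m.f1 = num_correct == 0
             ? static_cast<T>(0)
             : 2 * m.precision * m.recall / (m.precision + m.recall);
  return m;
}
```

Casting the numerator first keeps the division in floating point, which is the same behavior the C-style cast had.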

Contributor Author

Done.

Chunk evaluator is used to evaluate segment labelling accuracy for a
sequence. It calculates precision, recall and F1 scores for the chunk detection.
To use chunk evaluator, several concepts need to be clarified firstly.
[Chunk type] is the type of the whole chunk and a chunk consists of one or several words. (For example in NER, ORG for organization name, PER for person name etc.)
Contributor

We need to explain the meaning of 'chunk' before 'chunk type'.
[Chunk] is a subset of the tokens in a sentence. For example, "a yellow dog" is a chunk of the sentence "I have a yellow dog." A chunk can be a noun phrase, a person name, an organization name, and so on.

Contributor Author

Rewrite the doc.

Chunk evaluator is used to evaluate segment labelling accuracy for a
sequence. It calculates precision, recall and F1 scores for the chunk detection.
To use chunk evaluator, several concepts need to be clarified firstly.
[Chunk type] is the type of the whole chunk and a chunk consists of one or several words. (For example in NER, ORG for organization name, PER for person name etc.)
Contributor

the whole chunk -> a chunk?

Contributor Author

Rewrite the doc.

sequence. It calculates precision, recall and F1 scores for the chunk detection.
To use chunk evaluator, several concepts need to be clarified firstly.
[Chunk type] is the type of the whole chunk and a chunk consists of one or several words. (For example in NER, ORG for organization name, PER for person name etc.)
[Tag type] indicates the position of a word in a chunk. (B for begin, I for inside, E for end, S for single)
Contributor

O for outside

Contributor Author

Rewrite the doc.

"IOB" so tagType has two values: 0 for B and 1 for I.
Here we will use I-LOC to explain the above mapping rules in detail.
For I-LOC, the label id is 5, so we can get tagType=1 and chunkType=2, which means I-LOC is a part of NER chunk LOC
and the tag is I.
Contributor

How about giving an example here?

Steven   B-PER  2
Paul     I-PER  3
Jobs     I-PER  3
works    O      6
for      O      6
Baidu    B-ORG  0
Inc.     I-ORG  1
at       O      6
Beijing  B-LOC  4
of       I-LOC  5
China    I-LOC  5
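
The mapping rules applied to the label ids in this example can be sketched as follows. `DecodedLabel` and `DecodeLabel` are hypothetical names used only for illustration; the arithmetic follows the `tagType = label % numTagType` and `chunkType = label / numTagType` rules from the doc, with the O label identified by `chunkType == numChunkTypes`.

```cpp
// Hypothetical result of decoding one label id; not part of the PR.
struct DecodedLabel {
  int tag_type;     // position tag inside the chunk (IOB: 0 = B, 1 = I)
  int chunk_type;   // chunk type id; equals num_chunk_types for the O label
  bool is_outside;  // true when the label is O (not part of any chunk)
};

// Splits a label id into tag type and chunk type using the doc's rules.
inline DecodedLabel DecodeLabel(int label, int num_tag_types,
                                int num_chunk_types) {
  DecodedLabel d;
  d.tag_type = label % num_tag_types;
  d.chunk_type = label / num_tag_types;
  d.is_outside = (d.chunk_type == num_chunk_types);
  return d;
}
```

For the IOB scheme with three chunk types (ORG = 0, PER = 1, LOC = 2), `DecodeLabel(5, 2, 3)` gives tag type 1 and chunk type 2, i.e. I-LOC, and `DecodeLabel(6, 2, 3)` gives chunk type 3, i.e. the O label.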

Contributor Author

Rewrite the doc.


void EvalOneSeq(const int* output, const int* label, int length,
std::vector<Segment>& output_segments,
std::vector<Segment>& label_segments,
Contributor

output_segments and label_segments are not used outside of EvalOneSeq. So why not define them inside EvalOneSeq and remove them from the argument list?
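
A rough sketch of what this could look like is given below, restricted to the IOB scheme to stay short. It is not the PR's implementation: `GetSegmentsIOB`, `SeqCounts` and the simplified `EvalOneSeq` signature are hypothetical, and treating an I tag with no open chunk of the same type as the start of a new chunk is an assumption.

```cpp
#include <cstddef>
#include <vector>

struct Segment {
  int begin;
  int end;  // inclusive
  int type;
  bool operator==(const Segment& other) const {
    return begin == other.begin && end == other.end && type == other.type;
  }
};

// Hypothetical per-sequence result; not part of the PR.
struct SeqCounts {
  int num_output_segments;
  int num_label_segments;
  int num_correct;
};

// Extracts chunks from IOB label ids (tag 0 = B, tag 1 = I).
// A chunk type equal to num_chunk_types marks the O label.
inline std::vector<Segment> GetSegmentsIOB(const int* labels, int length,
                                           int num_chunk_types) {
  std::vector<Segment> segments;
  int start = -1;  // start index of the open chunk, -1 if none
  int type = -1;
  for (int i = 0; i < length; ++i) {
    const int tag = labels[i] % 2;
    const int cur = labels[i] / 2;
    const bool outside = (cur == num_chunk_types);
    const bool continues = start >= 0 && !outside && tag == 1 && cur == type;
    if (continues) continue;
    if (start >= 0) segments.push_back(Segment{start, i - 1, type});
    start = outside ? -1 : i;
    type = outside ? -1 : cur;
  }
  if (start >= 0) segments.push_back(Segment{start, length - 1, type});
  return segments;
}

// The segment vectors are locals here instead of reference arguments.
inline SeqCounts EvalOneSeq(const int* output, const int* label, int length,
                            int num_chunk_types) {
  const std::vector<Segment> output_segments =
      GetSegmentsIOB(output, length, num_chunk_types);
  const std::vector<Segment> label_segments =
      GetSegmentsIOB(label, length, num_chunk_types);
  SeqCounts counts;
  counts.num_output_segments = static_cast<int>(output_segments.size());
  counts.num_label_segments = static_cast<int>(label_segments.size());
  counts.num_correct = 0;
  // Both lists are sorted and non-overlapping, so one linear pass suffices.
  std::size_t i = 0, j = 0;
  while (i < output_segments.size() && j < label_segments.size()) {
    if (output_segments[i] == label_segments[j]) ++counts.num_correct;
    if (output_segments[i].end < label_segments[j].end) {
      ++i;
    } else if (output_segments[i].end > label_segments[j].end) {
      ++j;
    } else {
      ++i;
      ++j;
    }
  }
  return counts;
}
```

The caller would then only accumulate the three counters across sequences, and the segment vectors never leave the function.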

Contributor

@lcy-seso lcy-seso left a comment

The code in this PR LGTM (from the original chunk evaluator), but the documentation needs refining. I think we can merge the code and ask someone who is familiar with sequence tagging tasks and good at English writing to help refine the doc.

AddInput("Label", "(Tensor, default: Tensor<int>) Labels of the data.");
AddOutput(
"Precision",
"(float) The precision ratio of the predictions on current data.");
Contributor

The evaluated precision (also called positive predictive value) of chunks on the given mini-batch.

Contributor Author

Done.

"Precision",
"(float) The precision ratio of the predictions on current data.");
AddOutput("Recall",
"(float) The recall ratio of the predictions on current data.");
Contributor

The evaluated recall (true positive rate or sensitivity) of chunks on the given mini-batch.

I think we should tell users that this evaluation is performed on the mini-batch, not on all the data tested so far. If we change this behavior later, make sure to update the doc.

Contributor Author

Done.

AddOutput("Recall",
"(float) The recall ratio of the predictions on current data.");
AddOutput("F1-Score",
"(float) The F1-Score of the predictions on current data.");
Contributor

The evaluated F1-Score on the given mini-batch.

Contributor Author

Done.

framework::OpAttrChecker *op_checker)
: OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("Inference",
"(Tensor, default: Tensor<int>) Predictions from the network.");
Contributor

Add a "." after (Tensor, default: Tensor<int>). The same applies below.

(Tensor, default: Tensor<int>). Predictions from the network.

Contributor Author

Done.

: OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("Inference",
"(Tensor, default: Tensor<int>) Predictions from the network.");
AddInput("Label", "(Tensor, default: Tensor<int>) Labels of the data.");
Contributor

The true tag sequences.

Contributor Author

Done.

"(float) The F1-Score of the predictions on current data.");
AddAttr<int>("num_chunk_types", "(int) The number of chunk type.");
AddAttr<std::string>("chunk_scheme",
"(string, default IOB) The label scheme.")
Contributor

@lcy-seso lcy-seso Nov 1, 2017

The labeling scheme indicating how to encode the chunks, including IOB, x, x, x (list all the supported schemes). It would be better to add a reference here explaining how these schemes label chunks.

Contributor Author

Done.

"excluded_chunk_types",
"(list<int>) A list<int> indicating chunk types not to be counted.")
.SetDefault(std::vector<int>{});
AddComment(R"DOC(
Contributor

Here, I think it would be much better to explain what a chunk is first. For example, something like this:

Chunks are about character spans. In the sequence tagging problem, chunks are sequences of tokens (words or other units) and tags (tag labels, categories).

Contributor Author

Rewrite the doc.

.SetDefault(std::vector<int>{});
AddComment(R"DOC(
Chunk evaluator is used to evaluate segment labelling accuracy for a
sequence. It calculates precision, recall and F1 scores for the chunk detection.
Contributor

the chunk detection --> chunks the model predicts.

Contributor Author

Rewrite the doc.

AddComment(R"DOC(
Chunk evaluator is used to evaluate segment labelling accuracy for a
sequence. It calculates precision, recall and F1 scores for the chunk detection.
To use chunk evaluator, several concepts need to be clarified firstly.
Contributor

we first introduce some related concepts.

Contributor Author

Rewrite the doc.

.SetDefault("IOB");
AddAttr<std::vector<int>>(
"excluded_chunk_types",
"(list<int>) A list<int> indicating chunk types not to be counted.")
Contributor

  • indicating chunk types that are not counted.
  • This explanation is hard to understand for users.

Contributor Author

Done. Added "see below for details".

@qingqing01 qingqing01 merged commit aa34067 into PaddlePaddle:develop Nov 10, 2017
Development

Successfully merging this pull request may close these issues.

Add Operator used as Chunk Evaluator
4 participants