[MLLIB] [SPARK-2222] Add multiclass evaluation metrics #1155
Can one of the admins verify this patch? |
Nice work. I am reading the implementation of MulticlassMetrics. According to your code, for the micro average you calculate the recall and then set precision and F1-measure equal to the recall. I am not sure whether this makes sense. According to this post: http://rushdishams.blogspot.com/2011/08/micro-and-macro-average-of-precision.html Assume we have just three classes. For each class we have three numbers: true positives (tp), false positives (fp), and false negatives (fn). Hence we have tp1, fp1, and fn1 for class 1, and so on. Micro-averaged precision is (tp1 + tp2 + tp3) / (tp1 + tp2 + tp3 + fp1 + fp2 + fp3). Based on this definition, recall and precision should not be the same. Is that correct? |
The micro-averaged precision and recall are equal for a multiclass classifier because sum(fni) = sum(fpi), i.e. both are just the sum of all off-diagonal elements of the confusion matrix. The F1-measure, as the harmonic mean of two equal numbers, also equals P and R. For more details, please refer to the book "Introduction to IR" by Manning. |
It makes sense. You are right. sum(fni)=sum(fpi). The recall and precision are the same. Thanks very much. |
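The equality discussed above can be checked numerically. The following is a minimal sketch in plain Python, not the PR's Scala code; the confusion matrix values are made up for illustration, and the row/column convention (rows are true labels, columns are predictions) is an assumption.

```python
# Hypothetical 3-class confusion matrix (illustrative values only).
# Assumed convention: rows are true labels, columns are predicted labels.
confusion = [
    [5, 1, 0],
    [2, 6, 1],
    [0, 3, 7],
]


def micro_precision_recall(cm):
    """Return micro-averaged (precision, recall) for a square confusion matrix."""
    n = len(cm)
    # True positives are the diagonal entries.
    tp = sum(cm[i][i] for i in range(n))
    # False positives of class i: off-diagonal entries of column i.
    fp = sum(cm[j][i] for i in range(n) for j in range(n) if j != i)
    # False negatives of class i: off-diagonal entries of row i.
    fn = sum(cm[i][j] for i in range(n) for j in range(n) if j != i)
    return tp / (tp + fp), tp / (tp + fn)


p, r = micro_precision_recall(confusion)
f1 = 2 * p * r / (p + r)
print(p, r, f1)
```

Both sums run over every off-diagonal cell exactly once, so they are the same number, and the F1-measure collapses to that shared value as well.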
👍 |
Jenkins, add to whitelist. |
Jenkins, test this please. |
Merged build triggered. |
Merged build started. |
Merged build finished. |
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16297/ |
 * Evaluator for multiclass classification.
 * NB: type Double both for prediction and label is retained
 * for compatibility with model.predict that returns Double
 * and MLUtils.loadLibSVMFile that loads class labels as Double
It is not necessary to mention loadLibSVMFile in particular here. This is a "global" assumption in MLlib.
QA tests have started for PR 1155. This patch merges cleanly. |
QA results for PR 1155: |
@avulanov In Scala, "for" is slower than "while". See https://issues.scala-lang.org/browse/SI-1338 for example. So please replace the for loop with two while loops in your implementation. |
 * as in "labels"
 */
lazy val confusionMatrix: Matrix = {
  val transposedFlatMatrix = Array.ofDim[Double](labels.size * labels.size)
Save labels.size to n? Btw, I'm not sure whether we should use lazy val here because the result matrix could be 1000x1000, different from other lazy vals used here.
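To make the review comment concrete, here is a rough sketch in plain Python (not the PR's Scala code) of filling a flat n*n array from (prediction, label) pairs, with the label count stored once in n. The function name, the column-major layout, and the rows-are-true-labels convention are assumptions for illustration; the snippet under review only shows that a flat array of size labels.size * labels.size is allocated.

```python
from collections import Counter


def flat_confusion_matrix(pairs, labels):
    """Build a flat confusion matrix (hypothetical helper).

    pairs:  iterable of (prediction, label) tuples
    labels: sorted list of distinct labels
    Assumed layout: column-major, entry (i, j) counts items whose true
    label is labels[i] and whose prediction is labels[j].
    """
    n = len(labels)  # computed once and reused, as the reviewer suggests
    index = {label: k for k, label in enumerate(labels)}
    counts = Counter((index[label], index[pred]) for pred, label in pairs)
    flat = [0.0] * (n * n)
    for (i, j), c in counts.items():
        flat[j * n + i] = float(c)
    return flat


pairs = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (2.0, 2.0), (2.0, 1.0)]
flat = flat_confusion_matrix(pairs, [0.0, 1.0, 2.0])
print(flat)
```

The array grows quadratically with the number of labels, which is the concern behind the question about caching it: a 1000x1000 matrix of doubles occupies about 8 MB.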
@mengxr I've addressed your comments. Thanks for pointing me to the Scala issue |
QA tests have started for PR 1155. This patch merges cleanly. |
@avulanov I made some minor updates and sent you a PR at avulanov#1. If it looks good to you, please merge that PR and the changes should show up here. Thanks! |
minor updates
@mengxr done! |
QA tests have started for PR 1155. This patch merges cleanly. |
QA results for PR 1155: |
Merged. Thanks for your contribution! |
Thanks! I'll be glad to contribute more. |
Adding two classes:
1) MulticlassMetrics implements various multiclass evaluation metrics
2) MulticlassMetricsSuite implements unit tests for MulticlassMetrics

Author: Alexander Ulanov <nashb@yandex.ru>
Author: unknown <ulanov@ULANOV1.emea.hpqcorp.net>
Author: Xiangrui Meng <meng@databricks.com>

Closes apache#1155 from avulanov/master and squashes the following commits:

2eae80f [Alexander Ulanov] Merge pull request apache#1 from mengxr/avulanov-master
5ebeb08 [Xiangrui Meng] minor updates
79c3555 [Alexander Ulanov] Addressing reviewers comments mengxr
0fa9511 [Alexander Ulanov] Addressing reviewers comments mengxr
f0dadc9 [Alexander Ulanov] Addressing reviewers comments mengxr
4811378 [Alexander Ulanov] Removing println
87fb11f [Alexander Ulanov] Addressing reviewers comments mengxr. Added confusion matrix
e3db569 [Alexander Ulanov] Addressing reviewers comments mengxr. Added true positive rate and false positive rate. Test suite code style.
a7e8bf0 [Alexander Ulanov] Addressing reviewers comments mengxr
c3a77ad [Alexander Ulanov] Addressing reviewers comments mengxr
e2c91c3 [Alexander Ulanov] Fixes to mutliclass metics
d5ce981 [unknown] Comments about Double
a5c8ba4 [unknown] Unit tests. Class rename
fcee82d [unknown] Unit tests. Class rename
d535d62 [unknown] Multiclass evaluation
Implementation of various multi-label classification measures, including: Hamming-loss, strict and default Accuracy, macro-averaged Precision, Recall and F1-measure based on documents and labels, micro-averaged measures: https://issues.apache.org/jira/browse/SPARK-2329

Multi-class measures are currently in the following pull request: #1155

Author: Alexander Ulanov <nashb@yandex.ru>
Author: avulanov <nashb@yandex.ru>

Closes #1270 from avulanov/multilabelmetrics and squashes the following commits:

fc8175e [Alexander Ulanov] Merge with previous updates
43a613e [Alexander Ulanov] Addressing reviewers comments: change Set to Array
517a594 [avulanov] Addressing reviewers comments: Scala style
cf4222b [avulanov] Addressing reviewers comments: renaming. Added label method that returns the list of labels
1843f73 [Alexander Ulanov] Scala style fix
79e8476 [Alexander Ulanov] Replacing fold(_ + _) with sum as suggested by srowen
ca46765 [Alexander Ulanov] Cosmetic changes: Apache header and parameter explanation
40593f5 [Alexander Ulanov] Multi-label metrics: Hamming-loss, strict and normal accuracy, fix to macro measures, bunch of tests
ad62df0 [Alexander Ulanov] Comments and scala style check
154164b [Alexander Ulanov] Multilabel evaluation metics and tests: macro precision and recall averaged by docs, micro and per-class precision and recall averaged by class
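For readers unfamiliar with the multi-label measures named in that commit message, here is a small sketch in plain Python using common textbook definitions. It is not taken from the referenced pull request, whose exact formulas may differ; the function names and sample data are hypothetical, and every document is assumed to have at least one predicted or true label.

```python
def hamming_loss(pairs, num_labels):
    """Fraction of (document, label) slots that are predicted wrongly."""
    wrong = sum(len(pred ^ true) for pred, true in pairs)
    return wrong / (len(pairs) * num_labels)


def strict_accuracy(pairs):
    """Fraction of documents whose predicted label set matches exactly."""
    return sum(1 for pred, true in pairs if pred == true) / len(pairs)


def accuracy(pairs):
    """Mean overlap |pred & true| / |pred | true| over documents.

    Assumes the union is never empty for any document.
    """
    return sum(len(pred & true) / len(pred | true) for pred, true in pairs) / len(pairs)


# Each pair is (predicted label set, true label set) for one document.
docs = [({0, 1}, {0, 2}), ({2}, {2}), ({1}, {0, 1})]
loss = hamming_loss(docs, num_labels=3)
strict = strict_accuracy(docs)
acc = accuracy(docs)
print(loss, strict, acc)
```

Strict accuracy only rewards exact matches, while the overlap-based accuracy gives partial credit, which is why the two can differ widely on the same predictions.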
@avulanov You have added a class called Can you give me an example for... say: the MNIST dataset (10 output neurons). Thanks! |
@tolgap As documentation suggests, For example:
|
@avulanov How many neurons does the output layer have in this case? 1 or 10? Because my current implementation has an output layer of 10 neurons, e.g: val output = Array[Double](7.466E-4, 4.16464E-9, 0.0, 0.0, 0.99462, /*..*/) In this case, this example has the highest probability of being the digit |
@tolgap ANNClassifier will create 10 output neurons for mnist, 10 is the number of distinct labels derived from the data. Each class usually is encoded with a separate output neuron, especially when there are no explicit relations (or ordering) between classes. If you wish to learn more, there is a good explanation here: http://www.faqs.org/faqs/ai-faq/neural-nets/part2/index.html |
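The one-output-neuron-per-class encoding described above can be sketched as follows. This is plain Python for illustration only, not ANNClassifier code; the helper names and the sample output vector are hypothetical.

```python
def one_hot(label_index, num_classes):
    """Encode a class index as a target vector with one active output neuron."""
    vec = [0.0] * num_classes
    vec[label_index] = 1.0
    return vec


def decode(output):
    """Return the index of the output neuron with the largest activation."""
    return max(range(len(output)), key=lambda k: output[k])


# Ten distinct labels (digits 0 to 9) give ten output neurons.
target = one_hot(7, 10)

# Hypothetical network output for some input image.
output = [0.01, 0.02, 0.0, 0.9, 0.0, 0.03, 0.0, 0.04, 0.0, 0.0]
predicted = decode(output)
print(target, predicted)
```

With this encoding the predicted class is simply the position of the most active output neuron, so no ordering between classes is implied.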