Label model symmetry breaking #1451

ajratner · 2019-09-05T23:43:09Z

Description of proposed changes

This PR primarily implements a heuristic for selecting one of several symmetric (equally optimal) solutions to the LabelModel parameter estimation procedure, which arise from orthogonal symmetries in that procedure. Concretely, for any mu that we learn (the estimated conditional probabilities of the LFs), column permutations of mu are often equally valid solutions. We therefore choose the permutation under which the most LFs are estimated to be better than random, as per our standard modeling assumption.
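As a rough illustration of the heuristic described above, here is a minimal standalone sketch. It is not the code in this PR. It assumes a simplified layout in which `cprobs[i, j, y]` holds P(LF i votes class j | true class y), ignores abstains, and treats an LF as "better than random" when its accuracy under a uniform class balance exceeds 1/k. The function names are hypothetical.

```python
from itertools import permutations

import numpy as np


def count_better_than_random(cprobs):
    """Count LFs whose uniform-prior accuracy exceeds 1/k (assumed criterion)."""
    k = cprobs.shape[-1]
    # The diagonal of each LF's k x k table is P(vote = y | Y = y).
    accs = np.trace(cprobs, axis1=1, axis2=2) / k
    return int(np.sum(accs > 1.0 / k))


def break_symmetry(cprobs):
    """Try every column permutation and keep the one with the most good LFs."""
    k = cprobs.shape[-1]
    best = max(
        permutations(range(k)),
        key=lambda p: count_better_than_random(cprobs[:, :, list(p)]),
    )
    return cprobs[:, :, list(best)]
```

The search is exhaustive over all k! column orderings, so this sketch is only practical for a small number of classes. Ties between permutations are resolved in favor of the first one enumerated.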

This PR also:

- Slightly refactors and changes the LabelModel.get_conditional_probs sub-function
- Factors out the post-processing operations in LabelModel.fit, which currently consist of this symmetry-breaking step and clamping

Related issue(s)

Fixes #1437 (at least to first order)

Test plan

Adding additional tests for (a) conditional probability table calculation, and (b) symmetry breaking specifically

Checklist

I have read the CONTRIBUTING document.
I have updated the documentation accordingly.
I have added tests to cover my changes.
All new and existing tests passed.

codecov · 2019-09-05T23:53:00Z

Codecov Report

Merging #1451 into master will increase coverage by 0.03%.
The diff coverage is 95.23%.

@@            Coverage Diff             @@
##           master    #1451      +/-   ##
==========================================
+ Coverage   97.55%   97.58%   +0.03%     
==========================================
  Files          55       55              
  Lines        2001     2032      +31     
  Branches      328      334       +6     
==========================================
+ Hits         1952     1983      +31     
  Misses         22       22              
  Partials       27       27

Impacted Files	Coverage Δ
snorkel/labeling/model/label_model.py	`95.72% <95.23%> (+0.46%)`	⬆️
snorkel/labeling/analysis.py	`100% <0%> (ø)`	⬆️

paroma

Is there a test that checks whether the symmetry breaking works correctly, or is that covered by one of the other tests?

snorkel/labeling/model/label_model.py

henryre

offline: added tests

snorkel/labeling/model/label_model.py

test/labeling/model/test_label_model.py

plison · 2019-10-01T11:40:59Z

I'm experiencing problems with the symmetry-breaking code. Because the number of possible permutations grows combinatorially with the number of output classes, the method does not really scale to problems with more than 6-7 classes. Maybe the code should stop after a few thousand permutations?
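To make the scaling concern concrete: a k-class problem has k! column permutations, so 7 classes already give 5040 candidates and 10 classes give 3628800. A minimal sketch of the suggested cap follows. The helper `capped_permutations` and its default limit are hypothetical and not part of Snorkel.

```python
from itertools import islice, permutations
from math import factorial


def capped_permutations(k, max_perms=5000):
    """Return an iterator over at most max_perms permutations of range(k)."""
    # permutations() is lazy, so islice stops the enumeration early
    # without ever materializing all k! tuples.
    return islice(permutations(range(k)), max_perms)


for k in (3, 7, 10):
    n_seen = sum(1 for _ in capped_permutations(k))
    print(f"k={k}: {factorial(k)} permutations in total, {n_seen} examined")
```

The trade-off is that a capped search may stop before reaching the best permutation, since `permutations` enumerates in lexicographic order rather than by any measure of quality.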

ajratner added 6 commits September 5, 2019 15:36

- Refactor get_conditional_probs

11ce72c

- Factor out two post-processing ops on mu in LabelModel.fit
- Implement heuristic symmetry breaking on mu

Fixing style check errors

bf619b3

Passing basic tests

41ec6fe

Passes tox

4faa7bf

Changed to the standard heuristic / assumption

4560f96

Changed to proper test of accuracies vs. cond prob

bf12afa

paroma requested changes Sep 5, 2019

henryre reviewed Sep 6, 2019

ajratner added 2 commits September 5, 2019 17:36

Refactor subfn for counting good LFs + add test

b01c38c

Address PR comments

4f5e009

ajratner requested review from henryre and paroma September 6, 2019 01:00

paroma approved these changes Sep 6, 2019

henryre reviewed Sep 6, 2019

snorkel/labeling/model/label_model.py (outdated)

henryre reviewed Sep 6, 2019

test/labeling/model/test_label_model.py (outdated)

ajratner added 2 commits September 5, 2019 18:31

Add unit test for symmetry breaking

8a5a13d

Address PR comments

a7bce9d

ajratner requested a review from henryre September 6, 2019 01:33

henryre approved these changes Sep 6, 2019

Fix naming bug

e0f0994

ajratner merged commit 8e4526e into master Sep 6, 2019

ajratner deleted the label-model-symmetry-breaking branch September 6, 2019 03:18
