Remove console logging of conversation-level F1-score and precision, since these calculations were not meaningful.
Add conversation-level accuracy to the core policy results written to story_report.json after running rasa test core or rasa test.