Using file_account() to filter training data doesn't work for certain use cases #122
Comments
Hi, the use case sounds somewhat exotic but legitimate. Thank you for bringing this up. As for a potential solution: what you want to achieve is to not filter training data by account, correct? Training data is filtered in `training_data_filter`, which a subclass can override:

    from smart_importer import PredictPostings

    class PredictPostingsCustom(PredictPostings):
        def training_data_filter(self, txn):
            # Keep every transaction as training data, regardless of account.
            return True

You can, of course, implement any other filtering logic instead of simply returning True. Is this what you are looking for?
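As a rough illustration of the "other filtering logic" mentioned above, here is a minimal sketch of a filter that keeps a transaction when any of its postings touches one of several accounts. It is standalone: `Posting` and `Txn` are simplified stand-ins for beancount's data types, and `make_account_filter` and the account names are hypothetical, not part of smart_importer.

```python
from collections import namedtuple

# Simplified stand-ins for beancount's Posting and Transaction (assumption:
# only the fields used by this sketch are modelled).
Posting = namedtuple("Posting", "account")
Txn = namedtuple("Txn", "narration postings")


def make_account_filter(accounts):
    """Return a predicate that keeps a transaction if any posting uses one of `accounts`."""
    wanted = frozenset(accounts)

    def training_data_filter(txn):
        return any(posting.account in wanted for posting in txn.postings)

    return training_data_filter


# Hypothetical accounts of interest, passed in as a plain list.
keep = make_account_filter(
    ["Assets:Bank:Checking", "Assets:Investments:Fidelity:USD"]
)

groceries = Txn("Groceries", [Posting("Assets:Bank:Checking"), Posting("Expenses:Food")])
salary = Txn("Salary", [Posting("Assets:Bank:Savings"), Posting("Income:Salary")])

# Narrations of the transactions that would be kept as training data.
kept = [txn.narration for txn in (groceries, salary) if keep(txn)]
```

In a real setup the same predicate body could live inside an overridden `training_data_filter(self, txn)` method, with the account list supplied however the importer is configured.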
Awesome, that helps! I do have a suggestion though: would This way, a) we can have more than one interesting account, and b) we can pass in the list without needing to write code. Here's my sense of why this might become a fairly common case in the future: anyone using commodity-leaf accounts (eg: Cash management accounts, which are becoming popular (anecdotal), change all that: they mix investments with expenses. So a few of us ran into it, including myself. Here's the ugly hack we've been using, just FYI.
Glad it helped!
Not sure. This needs careful consideration. If I remember correctly, the upstream beancount importer classes have a single account field that tells importers where to file the imported CSV or PDF. This may or may not conflict with what you are proposing.
It wouldn't be in conflict because what I'm proposing is to broaden the set of transactions that
`smart_importer` filters training data from a single account, obtained by calling the importer's `file_account()`, as discussed in #30. This presents two questions:

(For my understanding): shouldn't it filter on the accounts in each transaction instead of assuming the importer will always have one single account (the one that `file_account()` returns) as a part of every transaction?

My question: There are cases where `file_account()` is not the account we want to train by. The use case at least a couple of users have run into (including myself) is here. In short, when importing using a commodity-leaf structure, postings look like this:

`file_account()` must return `Assets:Investments:Fidelity` (without the `:USD`), so `bean-file` has the correct directory to file this against. However, this means that `smart_importer` fails to find training data, since that account doesn't exist.

Is there a config option or alternative I'm missing here? If this is a valid use case, I'm happy to help figure out a solution + patch.
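To make the mismatch described above concrete, here is a small standalone sketch. It assumes, as the description suggests, that training data is selected by comparing posting accounts against the value of `file_account()`. The helper `is_same_or_child` is hypothetical, and `Expenses:Groceries` is only an illustrative second account.

```python
def is_same_or_child(account, parent):
    """True if `account` equals `parent` or is nested below it on a ':' boundary."""
    return account == parent or account.startswith(parent + ":")


# What file_account() has to return so that bean-file picks the right directory.
file_account = "Assets:Investments:Fidelity"

# Accounts as they appear in postings under a commodity-leaf structure.
posting_accounts = ["Assets:Investments:Fidelity:USD", "Expenses:Groceries"]

# Exact comparison finds nothing, because the parent account never appears in a posting.
exact_matches = [a for a in posting_accounts if a == file_account]

# Treating file_account as a parent also matches its commodity leaves.
child_matches = [a for a in posting_accounts if is_same_or_child(a, file_account)]
```

Matching on the `:` boundary rather than on a bare string prefix avoids accidentally picking up a sibling such as a hypothetical `Assets:Investments:FidelityIRA`.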