Commit

fix text_bert incorrect dimension in run_single()
FlameSky-S committed Sep 23, 2022
1 parent 5e5c8ae commit bda2ace
Showing 3 changed files with 2 additions and 170 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -8,3 +8,4 @@ dist
__pycache__
exts
tmp.csv
+build
168 changes: 0 additions & 168 deletions src/MSA_FET/dataloader.py

This file was deleted.

3 changes: 1 addition & 2 deletions src/MSA_FET/single.py
@@ -36,8 +36,6 @@ class FeatureExtractionTool(object):
[ ] Add csv/dataframe output format.
[ ] Support specifying existing feature files, modify only some of the modalities.
[ ] Implement resume function for run_dataset().
-[ ] Forced Alignment & Aligned Feature Extraction.
-[ ] GPU support in `run_dataset()`. Maybe discard Dataset and Dataloader is a good idea. Just implement multiprocessing pool manually.
[ ] Clean up tmp folder before run_single.
[ ] Better error logs, optimize stack traces to avoid duplicate messages.
[ ] Set gpu_id during init, not in config.
@@ -174,6 +172,7 @@ def _text_extract_single(self, in_file : Path, in_text : str = None) -> np.ndarr
text = self.text_extractor.load_text_from_file(in_file)
text_result = self.text_extractor.extract(text)
text_tokens = self.text_extractor.tokenize(text)
+text_tokens = text_tokens.transpose(0, 2, 1)
return text_result, text_tokens

def _aligned_extract_single(
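
The added `transpose(0, 2, 1)` call keeps the first axis in place and swaps the last two. The sketch below shows that effect on a small NumPy array. The array name, its contents, and the `(1, 5, 3)` shape are assumptions chosen for illustration; the diff does not show the actual shape returned by `tokenize()`.

```python
import numpy as np

# Hypothetical stand-in for the tokenizer output: 1 sample, 5 positions,
# 3 values per position. The real shape is not shown in this commit.
tokens = np.arange(15).reshape(1, 5, 3)

# Same call as in the commit: axis 0 stays, axes 1 and 2 are swapped.
swapped = tokens.transpose(0, 2, 1)

print(tokens.shape)   # (1, 5, 3)
print(swapped.shape)  # (1, 3, 5)

# Element [b, i, j] of the input ends up at [b, j, i] in the result.
assert swapped[0, 2, 4] == tokens[0, 4, 2]
```

Note that `transpose` returns a view rather than a copy, so `tokens` itself keeps its original shape.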
