This repo is a collection of AWESOME things regarding data augmentation techniques in NLP, CV and GraphML, including papers, code, etc. Feel free to star and fork.
-
[SDA'20] Syntactic Data Augmentation Increases Robustness to Inference Heuristics. ACL 2020.
Junghyun Min, R. Thomas McCoy, Dipanjan Das, Emily Pitler, Tal Linzen.
-
[PDA'20] Parallel Data Augmentation for Formality Style Transfer. ACL 2020.
Yi Zhang, Tao Ge, Xu Sun.
-
[LGDA'20] Logic-Guided Data Augmentation and Regularization for Consistent Question Answering. ACL 2020.
Akari Asai, Hannaneh Hajishirzi.
-
[GEC'20] Good-Enough Compositional Data Augmentation. ACL 2020.
Jacob Andreas.
-
[EU'20] Evaluating the Utility of Model Configurations and Data Augmentation on Clinical Semantic Textual Similarity. ACL 2020.
Yuxia Wang, Fei Liu, Karin Verspoor, Timothy Baldwin.
-
[TRB'20] Towards Reversal-Based Textual Data Augmentation for NLI Problems with Opposable Classes. ACL 2020.
Alexey Tarasov.
-
[HTYD'20] How to Tame Your Data: Data Augmentation for Dialog State Tracking. ACL 2020.
Adam Summerville, Jordan Hashemi, James Ryan, William Ferguson.
-
[DATD'20] Data Augmentation for Training Dialog Models Robust to Speech Recognition Errors. ACL 2020.
Longshaokan Wang, Maryam Fazel-Zarandi, Aditya Tiwari, Spyros Matsoukas, Lazaros Polymenakos.
-
[DATB'20] Data Augmentation for Transformer-based G2P. ACL 2020.
Zach Ryan, Mans Hulden.
-
[LAB'20] Local Additivity Based Data Augmentation for Semi-supervised NER. EMNLP 2020.
Jiaao Chen, Zhenghui Wang, Ran Tian, Zichao Yang, Diyi Yang.
-
[SSMBA'20] SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving Out-of-Domain Robustness. EMNLP 2020.
Nathan Ng, Kyunghyun Cho, Marzyeh Ghassemi.
-
[LP'20] Learning Physical Common Sense as Knowledge Graph Completion via BERT Data Augmentation and Constrained Tucker Factorization. EMNLP 2020.
Zhenjie Zhao, Evangelos Papalexakis, Xiaojuan Ma.
-
[VHDA'20] Variational Hierarchical Dialog Autoencoder for Dialog State Tracking Data Augmentation. EMNLP 2020.
Kang Min Yoo, Hanbit Lee, Franck Dernoncourt, Trung Bui, Walter Chang, Sang-goo Lee.
-
[SDA'20] Simple Data Augmentation with the Mask Token Improves Domain Adaptation for Dialog Act Tagging. EMNLP 2020.
Semih Yavuz, Kazuma Hashimoto, Wenhao Liu, Nitish Shirish Keskar, Richard Socher, Caiming Xiong.
-
[CMR'20] Controllable Meaning Representation to Text Generation: Linearization and Data Augmentation Strategies. EMNLP 2020.
Chris Kedzie, Kathleen McKeown.
-
[SL'20] Sequence-Level Mixed Sample Data Augmentation. EMNLP 2020.
Demi Guo, Yoon Kim, Alexander Rush.
-
[TMH'20] Tell Me How to Ask Again: Question Data Augmentation with Controllable Rewriting in Continuous Space. EMNLP 2020.
Dayiheng Liu, Yeyun Gong, Jie Fu, Yu Yan, Jiusheng Chen, Jiancheng Lv, Nan Duan, Ming Zhou.
-
[DAGA'20] DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks. EMNLP 2020.
Bosheng Ding, Linlin Liu, Lidong Bing, Canasai Kruengkrai, Thien Hai Nguyen, Shafiq Joty, Luo Si, Chunyan Miao.
-
[TDA'20] Textual Data Augmentation for Efficient Active Learning on Tiny Datasets. EMNLP 2020.
Husam Quteineh, Spyridon Samothrakis, Richard Sutcliffe.
-
[DB'20] Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation. EMNLP 2020.
Ruibo Liu, Guangxuan Xu, Chenyan Jia, Weicheng Ma, Lili Wang, Soroush Vosoughi.
-
[TextAttack'20] TextAttack: A Framework for Adversarial Attacks, Data Augmentation, and Adversarial Training in NLP. EMNLP 2020.
John Morris, Eli Lifland, Jin Yong Yoo, Jake Grigsby, Di Jin, Yanjun Qi.
-
[PHICON'20] PHICON: Improving Generalization of Clinical Text De-identification Models via Data Augmentation. EMNLP 2020.
Xiang Yue, Shuang Zhou.
-
[GenAug'20] GenAug: Data Augmentation for Finetuning Text Generators. EMNLP 2020.
Steven Y. Feng, Varun Gangal, Dongyeop Kang, Teruko Mitamura, Eduard Hovy.
-
[GDA'20] Generative Data Augmentation for Commonsense Reasoning. EMNLP 2020.
Yiben Yang, Chaitanya Malaviya, Jared Fernandez, Swabha Swayamdipta, Ronan Le Bras, Ji-Ping Wang, Chandra Bhagavatula, Yejin Choi, Doug Downey.
-
[HE'20] How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers? EMNLP 2020.
Shayne Longpre, Yu Wang, Chris DuBois.
-
[TDA'20] Twitter Data Augmentation for Monitoring Public Opinion on COVID-19 Intervention Measures. EMNLP 2020.
Lin Miao, Mark Last, Marina Litvak.
-
[NS'20] Noising Scheme for Data Augmentation in Automatic Post-Editing. EMNLP 2020.
WonKee Lee, Jaehun Shin, Baikjin Jung, Jihyung Lee, Jong-Hyeok Lee.
-
[QE'20] Quantifying the Evaluation of Heuristic Methods for Textual Data Augmentation. EMNLP 2020.
Omid Kashefi, Rebecca Hwa.
-
[CIT'20] Linguist Geeks on WNUT-2020 Task 2: COVID-19 Informative Tweet Identification using Progressive Trained Language Models and Data Augmentation. EMNLP 2020.
Vasudev Awatramani, Anupam Kumar.
-
[DA'20] NEU at WNUT-2020 Task 2: Data Augmentation To Tell BERT That Death Is Not Necessarily Informative. EMNLP 2020.
Kumud Chauhan.
-
[DAFH'21] Data Augmentation for Hypernymy Detection. EACL 2021.
Thomas Kober, Julie Weeds, Lorenzo Bertolini, David Weir.
-
[FSL'21] Few-shot learning through contextual data augmentation. EACL 2021.
Farid Arthaud, Rachel Bawden, Alexandra Birch.
-
[DAVA'21] Data Augmentation for Voice-Assistant NLU using BERT-based Interchangeable Rephrase. EACL 2021.
Akhila Yerukola, Mason Bretan, Hongxia Jin.
-
[SSD'21] Sarcasm and Sentiment Detection In Arabic Tweets Using BERT-based Models and Data Augmentation. EACL 2021.
Abeer Abuzayed, Hend Al-Khalifa.
- [CDA'21] Counterfactual Data Augmentation for Neural Machine Translation. NAACL 2021.
Qi Liu, Matt Kusner, Phil Blunsom.
- [AS'21] Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks. NAACL 2021.
Nandan Thakur, Nils Reimers, Johannes Daxenberger, Iryna Gurevych.
-
[IZ'21] Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation. NAACL 2021.
Alexander Fabbri, Simeng Han, Haoyuan Li, Haoran Li, Marjan Ghazvininejad, Shafiq Joty, Dragomir Radev, Yashar Mehdad.
-
[TA'21] Target-Aware Data Augmentation for Stance Detection. NAACL 2021.
Yingjie Li, Cornelia Caragea.
-
[FS'21] Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning. NAACL 2021.
Jason Wei, Chengyu Huang, Soroush Vosoughi, Yu Cheng, Shiqi Xu.
-
[TDA'21] Training Data Augmentation for Code-Mixed Translation. NAACL 2021.
Abhirut Gupta, Aditya Vavre, Sunita Sarawagi.
-
[SC'21] Sentence Concatenation Approach to Data Augmentation for Neural Machine Translation. NAACL 2021.
Seiichiro Kondo, Kengo Hotate, Tosho Hirasawa, Masahiro Kaneko, Mamoru Komachi.
-
[TopGuNN'21] TopGuNN: Fast NLP Training Data Augmentation using Large Corpora. NAACL 2021.
Rebecca Iglesias-Flores, Megha Mishra, Ajay Patel, Akanksha Malhotra, Reno Kriz, Martha Palmer, Chris Callison-Burch.
-
[DAC'21] UoB at ProfNER 2021: Data Augmentation for Classification Using Machine Translation. NAACL 2021.
Frances Adriana Laureano De Leon, Harish Tayyar Madabushi, Mark Lee.
- [CG'21] Conversation Graph: Data Augmentation, Training, and Evaluation for Non-Deterministic Dialogue Management. NAACL 2021.
Milan Gritta, Gerasimos Lampouras, Ignacio Iacobacci.
-
[AugNLG'21] AugNLG: Few-shot Natural Language Generation using Self-trained Data Augmentation. ACL 2021.
Xinnuo Xu, Guoyin Wang, Young-Bum Kim, Sungjin Lee.
-
[CVRM'21] Can vectors read minds better than experts? Comparing data augmentation strategies for the automated scoring of children’s mindreading ability. ACL 2021.
Venelin Kovatchev, Phillip Smith, Mark Lee, Rory Devine.
-
[UXLA'21] UXLA: A Robust Unsupervised Data Augmentation Framework for Zero-Resource Cross-Lingual NLP. ACL 2021.
M Saiful Bari, Tasnim Mohiuddin, Shafiq Joty.
-
[DATG'21] Data Augmentation for Text Generation Without Any Augmented Data. ACL 2021.
Wei Bi, Huayang Li, Jiacheng Huang.
-
[LearnDA'21] LearnDA: Learnable Knowledge-Guided Data Augmentation for Event Causality Identification. ACL 2021.
Xinyu Zuo, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao, Weihua Peng, Yuguang Chen.
-
[CLSS'21] Cross-language Sentence Selection via Data Augmentation and Rationale Training. ACL 2021.
Yanda Chen, Chris Kedzie, Suraj Nair, Petra Galuscakova, Rui Zhang, Douglas Oard, Kathleen McKeown.
-
[HiddenCut'21] HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalizability. ACL 2021.
Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang.
-
[CoRI'21] CoRI: Collective Relation Integration with Data Augmentation for Open Information Extraction. ACL 2021.
Zhengbao Jiang, Jialong Han, Bunyamin Sisman, Xin Luna Dong.
-
[DAAT'21] Data Augmentation with Adversarial Training for Cross-Lingual NLI. ACL 2021.
Xin Dong, Yaxin Zhu, Zuohui Fu, Dongkuan Xu, Gerard de Melo.
-
[MulDA'21] MulDA: A Multilingual Data Augmentation Framework for Low-Resource Cross-Lingual NER. ACL 2021.
Linlin Liu, Bosheng Ding, Lidong Bing, Shafiq Joty, Luo Si, Chunyan Miao.
-
[NRQA'21] Neural Retrieval for Question Answering with Cross-Attention Supervised Data Augmentation. ACL 2021.
Yinfei Yang, Ning Jin, Kuo Lin, Mandy Guo, Daniel Cer.
-
[AO'21] Avoiding Overlap in Data Augmentation for AMR-to-Text Generation. ACL 2021.
Wenchao Du, Jeffrey Flanigan.
- [DAUM'21] Data Augmentation with Unsupervised Machine Translation Improves the Structural Similarity of Cross-lingual Word Embeddings. ACL 2021.
Sosuke Nishikawa, Ryokan Ri, Yoshimasa Tsuruoka.
- [CAiRE'21] CAiRE in DialDoc21: Data Augmentation for Information Seeking Dialogue System. ACL 2021.
Yan Xu, Etsuko Ishii, Genta Indra Winata, Zhaojiang Lin, Andrea Madotto, Zihan Liu, Peng Xu, Pascale Fung.
- [ASDA'21] A Survey of Data Augmentation Approaches for NLP. ACL 2021.
Steven Y. Feng, Varun Gangal, Jason Wei, Sarath Chandar, Soroush Vosoughi, Teruko Mitamura, Eduard Hovy.
- [BRMC'21] Better Robustness by More Coverage: Adversarial and Mixup Data Augmentation for Robust Finetuning. ACL 2021.
Chenglei Si, Zhengyan Zhang, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun.
- [SS'21] Substructure Substitution: Structured Data Augmentation for NLP. ACL 2021.
Haoyue Shi, Karen Livescu, Kevin Gimpel.
- [NN'21] Not Far Away, Not So Close: Sample Efficient Nearest Neighbour Data Augmentation via MiniMax. ACL 2021.
Ehsan Kamalloo, Mehdi Rezagholizadeh, Peyman Passban, Ali Ghodsi.
- [mixSeq'21] mixSeq: A Simple Data Augmentation Method for Neural Machine Translation. ACL 2021.
Xueqing Wu, Yingce Xia, Jinhua Zhu, Lijun Wu, Shufang Xie, Yang Fan, Tao Qin.
- [DABC'21] Data Augmentation by Concatenation for Low-Resource Translation: A Mystery and a Solution. ACL 2021.
Toan Q. Nguyen, Kenton Murray, David Chiang.
- [TUA'21] The University of Arizona at SemEval-2021 Task 10: Applying Self-training, Active Learning and Data Augmentation to Source-free Domain Adaptation. ACL 2021.
Xin Su, Yiyun Zhao, Steven Bethard.
- [NWM'21] Cambridge at SemEval-2021 Task 2: Neural WiC-Model with Data Augmentation and Exploration of Representation. ACL 2021.
Zheng Yuan, David Strohmaier.
- [RMDA'21] NLPIITR at SemEval-2021 Task 6: RoBERTa Model with Data Augmentation for Persuasion Techniques Detection. ACL 2021.
Vansh Gupta, Raksha Sharma.
- [DPT'21] LeCun at SemEval-2021 Task 6: Detecting Persuasion Techniques in Text Using Ensembled Pretrained Transformers and Data Augmentation. ACL 2021.
Dia Abujaber, Ahmed Qarqaz, Malak A. Abdullah.
- [DAL'21] Data augmentation for low-resource grapheme-to-phoneme mapping. ACL 2021.
Michael Hammond.
- [BSS'21] BME Submission for SIGMORPHON 2021 Shared Task 0. A Three Step Training Approach with Data Augmentation for Morphological Inflection. ACL 2021.
Gábor Szolnok, Botond Barta, Dorina Lakatos, Judit Ács.
-
[ZPDA'21] Zero-pronoun Data Augmentation for Japanese-to-English Translation. ACL 2021.
Ryokan Ri, Toshiaki Nakazawa, Yoshimasa Tsuruoka.
-
[RCDA'21] Reinforced Counterfactual Data Augmentation for Dual Sentiment Classification. EMNLP 2021.
Hao Chen, Rui Xia, Jianfei Yu.
-
[GOLD'21] GOLD: Improving Out-of-Scope Detection in Dialogues using Data Augmentation. EMNLP 2021.
Derek Chen, Zhou Yu.
- [ECL'21] Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning. EMNLP 2021.
Seonghyeon Ye, Jiseon Kim, Alice Oh.
- [MRC'21] Machine Reading Comprehension as Data Augmentation: A Case Study on Implicit Event Argument Extraction. EMNLP 2021.
Jian Liu, Yufeng Chen, Jinan Xu.
-
[VDA'21] Virtual Data Augmentation: A Robust and General Framework for Fine-tuning Pre-trained Models. EMNLP 2021.
Kun Zhou, Wayne Xin Zhao, Sirui Wang, Fuzheng Zhang, Wei Wu, Ji-Rong Wen.
-
[SPDA'21] Semantics-Preserved Data Augmentation for Aspect-Based Sentiment Analysis. EMNLP 2021.
Ting-Wei Hsu, Chung-Chi Chen, Hen-Hsen Huang, Hsin-Hsi Chen.
-
[UDA'21] Unsupervised Data Augmentation with Naive Augmentation and without Unlabeled Data. EMNLP 2021.
David Lowell, Brian Howard, Zachary C. Lipton, Byron Wallace.
-
[DACD'21] Data Augmentation for Cross-Domain Named Entity Recognition. EMNLP 2021.
Shuguang Chen, Gustavo Aguilar, Leonardo Neves, Thamar Solorio.
-
[SCDA'21] Simple Conversational Data Augmentation for Semi-supervised Abstractive Dialogue Summarization. EMNLP 2021.
Jiaao Chen, Diyi Yang.
-
[SDA'21] Synthetic Data Augmentation for Zero-Shot Cross-Lingual Question Answering. EMNLP 2021.
Arij Riabi, Thomas Scialom, Rachel Keraron, Benoît Sagot, Djamé Seddah, Jacopo Staiano.
-
[RDA'21] Rethinking Data Augmentation for Low-Resource Neural Machine Translation: A Multi-Task Learning Approach. EMNLP 2021.
Víctor M. Sánchez-Cartagena, Miquel Esplà-Gomis, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez.
-
[DAH'21] Data Augmentation with Hierarchical SQL-to-Question Generation for Cross-domain Text-to-SQL Parsing. EMNLP 2021.
Kun Wu, Lijie Wang, Zhenghua Li, Ao Zhang, Xinyan Xiao, Hua Wu, Min Zhang, Haifeng Wang.
-
[HypMix'21] HypMix: Hyperbolic Interpolative Data Augmentation. EMNLP 2021.
Ramit Sawhney, Megh Thakkar, Shivam Agarwal, Di Jin, Diyi Yang, Lucie Flek.
-
[DAIR'21] Data Augmentation of Incorporating Real Error Patterns and Linguistic Knowledge for Grammatical Error Correction. EMNLP 2021.
Xia Li, Junyi He.
-
[DAM'21] Data Augmentation Methods for Anaphoric Zero Pronouns. EMNLP 2021.
Abdulrahman Aloraini, Massimo Poesio.
-
[IDS'21] Improving Dialogue State Tracking with Turn-based Loss Function and Sequential Data Augmentation. EMNLP 2021.
Jarana Manotumruksa, Jeff Dalton, Edgar Meij, Emine Yilmaz.
-
[AEDA'21] AEDA: An Easier Data Augmentation Technique for Text Classification. EMNLP 2021.
Akbar Karimi, Leonardo Rossi, Andrea Prati.
-
[LDA'21] Learning Data Augmentation Schedules for Natural Language Processing. EMNLP 2021.
Daphné Chopard, Matthias S. Treder, Irena Spasić.
-
[SH'21] Sister Help: Data Augmentation for Frame-Semantic Role Labeling. EMNLP 2021.
Ayush Pancholy, Miriam R L Petruck, Swabha Swayamdipta.
-
[AuGPT'21] AuGPT: Auxiliary Tasks and Data Augmentation for End-To-End Dialogue with Pre-Trained Language Models. EMNLP 2021.
Jonáš Kulhánek, Vojtěch Hudeček, Tomáš Nekvinda, Ondřej Dušek.
- [LRQA'21] Learning to Rank Question Answer Pairs with Bilateral Contrastive Data Augmentation. EMNLP 2021.
Yang Deng, Wenxuan Zhang, Wai Lam.
-
[CipherDAug'22] CipherDAug: Ciphertext based Data Augmentation for Neural Machine Translation. ACL 2022.
Nishant Kambhatla, Logan Born, Anoop Sarkar.
-
[MELM'22] MELM: Data Augmentation with Masked Entity Language Modeling for Low-Resource NER. ACL 2022.
Ran Zhou, Xin Li, Ruidan He, Lidong Bing, Erik Cambria, Luo Si, Chunyan Miao.
-
[CFRL'22] Continual Few-shot Relation Learning via Embedding Space Regularization and Data Augmentation. ACL 2022.
Chengwei Qin, Shafiq Joty.
-
[PromDA'22] PromDA: Prompt-based Data Augmentation for Low-Resource NLU Tasks. ACL 2022.
Yufei Wang, Can Xu, Qingfeng Sun, Huang Hu, Chongyang Tao, Xiubo Geng, Daxin Jiang.
-
[FlipDA'22] FlipDA: Effective and Robust Data Augmentation for Few-Shot Learning. ACL 2022.
Jing Zhou, Yanan Zheng, Jie Tang, Li Jian, Zhilin Yang.
-
[STR'22] Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation. ACL 2022.
Tsz Kin Lam, Shigehiko Schamoni, Stefan Riezler.
-
[TS'22] Text Smoothing: Enhance Various Data Augmentation Methods on Text Classification Tasks. ACL 2022.
Xing Wu, Chaochen Gao, Meng Lin, Liangjun Zang, Songlin Hu.
-
[DABF'22] Data Augmentation for Biomedical Factoid Question Answering. ACL 2022.
Dimitris Pappas, Prodromos Malakasiotis, Ion Androutsopoulos.
-
[SBDA'22] Simple Semantic-based Data Augmentation for Named Entity Recognition in Biomedical Texts. ACL 2022.
Uyen Phan, Nhung Nguyen.
-
[DARS'22] Data Augmentation for Rare Symptoms in Vaccine Side-Effect Detection. ACL 2022.
Bosung Kim, Ndapa Nakashole.
-
[HZ'22] Horses to Zebras: Ontology-Guided Data Augmentation and Synthesis for ICD-9 Coding. ACL 2022.
Matúš Falis, Hang Dong, Alexandra Birch, Beatrice Alex.
-
[TDA'22] DE-ABUSE@TamilNLP-ACL 2022: Transliteration as Data Augmentation for Abuse Detection in Tamil. ACL 2022.
Vasanth Palanikumar, Sean Benhur, Adeep Hande, Bharathi Raja Chakravarthi.
-
[EDA'22] BpHigh@TamilNLP-ACL2022: Effects of Data Augmentation on Indic-Transformer based classifier for Abusive Comments Detection in Tamil. ACL 2022.
Bhavish Pahwa.
-
[RDA'22] Retrieval Data Augmentation Informed by Downstream Question Answering Performance. ACL 2022.
James Ferguson, Hannaneh Hajishirzi, Pradeep Dasigi, Tushar Khot.
-
[AUS'22] When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation. ACL 2022.
Ehsan Kamalloo, Mehdi Rezagholizadeh, Ali Ghodsi.
-
[LDC'22] Logic-Driven Context Extension and Data Augmentation for Logical Reasoning of Text. ACL 2022.
Siyuan Wang, Wanjun Zhong, Duyu Tang, Zhongyu Wei, Zhihao Fan, Daxin Jiang, Ming Zhou, Nan Duan.
-
[DA'22] Data Augmentation and Learned Layer Aggregation for Improved Multilingual Language Understanding in Dialogue. ACL 2022.
Evgeniia Razumovskaia, Ivan Vulić, Anna Korhonen.
-
[CNEG'22] Improving Chinese Grammatical Error Detection via Data augmentation by Conditional Error Generation. ACL 2022.
Tianchi Yue, Shulin Liu, Huihui Cai, Tao Yang, Shengkang Song, TingHao Yu.
-
[AMR'22] AMR-DA: Data Augmentation by Abstract Meaning Representation. ACL 2022.
Ziyi Shou, Yuxin Jiang, Fangzhen Lin.
-
[AR'22] Addressing Resource and Privacy Constraints in Semantic Parsing Through Data Augmentation. ACL 2022.
Kevin Yang, Olivia Deng, Charles Chen, Richard Shin, Subhro Roy, Benjamin Van Durme.
-
[CL'22] Cross-lingual Inflection as a Data Augmentation Method for Parsing. ACL 2022.
Alberto Muñoz-Ortiz, Carlos Gómez-Rodríguez, David Vilares.
-
[OIDA'22] On the Impact of Data Augmentation on Downstream Performance in Natural Language Processing. ACL 2022.
Itsuki Okimura, Machel Reid, Makoto Kawano, Yutaka Matsuo.
-
[IMT'22] Improving Machine Translation Formality Control with Weakly-Labelled Data Augmentation and Post Editing Strategies. ACL 2022.
Daniel Zhang, Jiang Yu, Pragati Verma, Ashwinkumar Ganesan, Sarah Campbell.
-
[ESM'22] FilipN@LT-EDI-ACL2022-Detecting signs of Depression from Social Media: Examining the use of summarization methods as data augmentation for text classification. ACL 2022.
Filip Nilsson, György Kovács.
-
[DAIS'22] Data Augmentation for Intent Classification with Off-the-shelf Large Language Models. ACL 2022.
Gaurav Sahu, Pau Rodriguez, Issam Laradji, Parmida Atighehchian, David Vazquez, Dzmitry Bahdanau.
- [Clozer'22] Clozer: Adaptable Data Augmentation for Cloze-style Reading Comprehension. ACL 2022.
Holy Lovenia, Bryan Wilie, Willy Chung, Zeng Min, Samuel Cahyawijaya, Dan Su, Pascale Fung.
-
[DALR'22] Data Augmentation for Low-Resource Dialogue Summarization. Findings 2022.
Yongtai Liu, Joshua Maynez, Gonçalo Simões, Shashi Narayan.
-
[TGD'22] Target-Guided Dialogue Response Generation Using Commonsense and Data Augmentation. Findings 2022.
Prakhar Gupta, Harsh Jhamtani, Jeffrey Bigham.
-
[EMG'22] Explore More Guidance: A Task-aware Instruction Network for Sign Language Translation Enhanced with Data Augmentation. Findings 2022.
Yong Cao, Wei Li, Xianzhi Li, Min Chen, Guangyong Chen, Long Hu, Zhengdao Li, Kai Hwang.
-
[DAD'22] Data Augmentation with Dual Training for Offensive Span Detection. NAACL 2022.
Nasim Nouri.
-
[PM'22] Practice Makes a Solver Perfect: Data Augmentation for Math Word Problem Solvers. NAACL 2022.
Vivek Kumar, Rishabh Maheshwary, Vikram Pudi.
-
[GCD'22] Generative Cross-Domain Data Augmentation for Aspect and Opinion Co-Extraction. NAACL 2022.
Junjie Li, Jianfei Yu, Rui Xia.
-
[ICG'22] Improving Compositional Generalization with Latent Structure and Data Augmentation. NAACL 2022.
Linlu Qiu, Peter Shaw, Panupong Pasupat, Pawel Nowak, Tal Linzen, Fei Sha, Kristina Toutanova.
-
[EPiDA'22] EPiDA: An Easy Plug-in Data Augmentation Framework for High Performance Text Classification. NAACL 2022.
Minyi Zhao, Lu Zhang, Yi Xu, Jiandong Ding, Jihong Guan, Shuigeng Zhou.
-
[TreeMix'22] TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding. NAACL 2022.
Le Zhang, Zichao Yang, Diyi Yang.
-
[EP'22] ExtraPhrase: Efficient Data Augmentation for Abstractive Summarization. NAACL 2022.
Mengsay Loem, Sho Takase, Masahiro Kaneko, Naoaki Okazaki.
-
[IC'22] Improving Classification of Infrequent Cognitive Distortions: Domain-Specific Model vs. Data Augmentation. NAACL 2022.
Xiruo Ding, Kevin Lybarger, Justin Tauscher, Trevor Cohen.
-
[GraDA'22] GraDA: Graph Generative Data Augmentation for Commonsense Reasoning. NAACL 2022.
Adyasha Maharana, Mohit Bansal.
-
[ZQA'22] ZusammenQA: Data Augmentation with Specialized Models for Cross-lingual Open-retrieval Question Answering System. NAACL 2022.
Chia-Chien Hung, Tommaso Green, Robert Litschko, Tornike Tsereteli, Sotaro Takeshita, Marco Bombieri, Goran Glavaš, Simone Paolo Ponzetto.
-
[IG'22] UU-Tax at SemEval-2022 Task 3: Improving the generalizability of language models for taxonomy classification through data augmentation. NAACL 2022.
Injy Sarhan, Pablo Mosteiro, Marco Spruit.
-
[PC'22] Tesla at SemEval-2022 Task 4: Patronizing and Condescending Language Detection using Transformer-based Models with Data Augmentation. NAACL 2022.
Sahil Bhatt, Manish Shrivastava.
-
[TA'22] Amsqr at SemEval-2022 Task 4: Towards AutoNLP via Meta-Learning and Adversarial Data Augmentation for PCL Detection. NAACL 2022.
Alejandro Mosquera.
-
[EDA'22] CS/NLP at SemEval-2022 Task 4: Effective Data Augmentation Methods for Patronizing Language Detection and Multi-label Classification with RoBERTa and GPT3. NAACL 2022.
Daniel Saeedi, Sirwe Saeedi, Aliakbar Panahi, Alvis C.M. Fong.
-
[SD'22] Plumeria at SemEval-2022 Task 6: Sarcasm Detection for English and Arabic Using Transformers and Data Augmentation. NAACL 2022.
Mosab Shaheen, Shubham Nigam.
-
[ACA'22] UTNLP at SemEval-2022 Task 6: A Comparative Analysis of Sarcasm Detection Using Generative-based and Mutation-based Data Augmentation. NAACL 2022.
Amirhossein Abaskohi, Arash Rasouli, Tanin Zeraati, Behnam Bahrak.
-
[ALI'22] HFL at SemEval-2022 Task 8: A Linguistics-inspired Regression Model with Data Augmentation for Multilingual News Similarity. NAACL 2022.
Zihang Xu, Ziqing Yang, Yiming Cui, Zhigang Chen.
-
[PT'22] ITNLP2022 at SemEval-2022 Task 8: Pre-trained Model with Data Augmentation and Voting for Multilingual News Similarity. NAACL 2022.
Zhongan Chen, Weiwei Chen, YunLong Sun, Hongqing Xu, Shuzhe Zhou, Bohan Chen, Chengjie Sun, Yuanchao Liu.
-
[IDA'22] MT-Speech at SemEval-2022 Task 10: Incorporating Data Augmentation and Auxiliary Task with Cross-Lingual Pretrained Language Model for Structured Sentiment Analysis. NAACL 2022.
Cong Chen, Jiansong Chen, Cao Liu, Fan Yang, Guanglu Wan, Jinxiong Xia.
-
[AS'22] Infrrd.ai at SemEval-2022 Task 11: A system for named entity recognition using data augmentation, transformer-based sequence labeling model, and EnsembleCRF. NAACL 2022.
Jianglong He, Akshay Uppal, Mamatha N, Shiv Vignesh, Deepak Kumar, Aditya Kumar Sarda.
-
[OL'22] TEAM-Atreides at SemEval-2022 Task 11: On leveraging data augmentation and ensemble to recognize complex Named Entities in Bangla. NAACL 2022.
Nazia Tasnim, Md. Istiak Shihab, Asif Shahriyar Sushmit, Steven Bethard, Farig Sadeque.
-
[DAE'22] UA-KO at SemEval-2022 Task 11: Data Augmentation and Ensembles for Korean Named Entity Recognition. NAACL 2022.
Hyunju Song, Steven Bethard.
-
[DA'22] Sharing Data by Language Family: Data Augmentation for Romance Language Morpheme Segmentation. NAACL 2022.
Lauren Levine.
-
[CG'22] Compositional Generalization for Kinship Prediction through Data Augmentation. NAACL 2022.
Kangda Wei, Sayan Ghosh, Shashank Srivastava.