Conferences/Journals (by topic):

*: My advisee; #: visiting student (work done/started at OSU).
  • Shijie Chen‡, Ziru Chen‡, Xiang Deng†, Ashley Lewis†, Lingbo Mo†, Samuel Stevens†, Zhen Wang†, Xiang Yue†, Tianshu Zhang†, Yu Su, Huan Sun, “Bootstrapping a User-Centered Task-Oriented Dialogue System,” 1st Proceedings of Alexa Prize TaskBot (Alexa Prize 2021, report finished in 2022). [paper, project website] (‡: Team Lead; †: Equal Contribution)
  • Boshi Wang*, Xiang Deng*, Huan Sun, “Shepherd Pre-trained Language Models to Develop a Train of Thought: An Iterative Prompting Approach,” arXiv 2022. [paper, code]
  • Bernal Jiménez Gutiérrez, Nikolas McNeal, Clay Washington, You Chen, Lang Li, Huan Sun, Yu Su, “Thinking about GPT-3 In-Context Learning for Biomedical IE? Think Again,” arXiv 2022. [paper, code]
  • Xiang Yue*, Ziyu Yao, Huan Sun, “Synthetic Question Value Estimation for Domain Adaptation of Question Answering,” The 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022, long). [paper, code]
  • Lingbo Mo*, Ashley Lewis, Huan Sun, Michael White, “Towards Transparent Interactive Semantic Parsing via Step-by-Step Correction,” Findings of the 60th Annual Meeting of the Association for Computational Linguistics (Findings of ACL 2022, long). [paper, data&code]
  • Xiang Deng*, Prashant Shiralkar, Colin Lockard, Binxuan Huang, Huan Sun, “DOM-LM: Learning Generalizable Representations for HTML Documents,” arXiv 2022.
  • Xiang Yue*, Xinliang Frederick Zhang*, Ziyu Yao*, Simon Lin, Huan Sun, “CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering,” 2021 IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM 2021, long paper). Best Paper Award. [paper, code]
  • Xiang Deng*, Yu Su, Alyssa Lees, You Wu, Cong Yu, Huan Sun, “ReasonBERT: Pre-trained to Reason with Distant Supervision,” The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021, long paper). [paper, code]
  • Xinliang Frederick Zhang*, Heming Sun*, Xiang Yue*, Simon Lin, Huan Sun, “COUGH: A Challenge Dataset and Models for COVID-19 FAQ Retrieval,” The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021, short paper). [paper, code]
  • Boyuan Pan#, Yazheng Yang, Deng Cai, Huan Sun, “TopNet: Learning from Neural Topic Model to Generate Long Stories,” The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD 2021, research track, acceptance rate: ~15.4%). [paper, code]
  • Xiang Yue*, Minxin Du, Tianhao Wang, Yaliang Li, Huan Sun, Sherman S. M. Chow, “Differential Privacy for Text Analytics via Natural Text Sanitization,” Findings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Findings of ACL-IJCNLP 2021, long). [paper, code]
  • Xiang Deng*, Ahmed Hassan Awadallah, Christopher Meek, Oleksandr Polozov, Huan Sun, Matthew Richardson, “Structure-Grounded Pretraining for Text-to-SQL,” The 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2021, long). [paper, code]
  • Ziyu Yao*, Frank F. Xu, Pengcheng Yin, Huan Sun, Graham Neubig, “Learning Structural Edits via Incremental Tree Transformations,” The Ninth International Conference on Learning Representations 2021 (ICLR'21). [paper, code]
  • Xiang Deng*, Huan Sun, Alyssa Lees, You Wu, Cong Yu, “TURL: Table Understanding through Representation Learning,” 47th International Conference on Very Large Data Bases (VLDB'21). Selected for the 2022 ACM SIGMOD Research Highlight Award. [paper, code, an earlier version on arXiv]
  • Zhen Wang*, Bo Zong, Huan Sun, “Modeling Context Pair Interaction for Pairwise Tasks on Graphs,” The 14th International Conference on Web Search and Data Mining (WSDM'21, acceptance rate: ~18.6%). [paper, code]
  • Kaushik Mani*, Xiang Yue*, Bernal Jimenez Gutierrez*, Yungui Huang, Simon Lin, and Huan Sun, “Clinical Phrase Mining with Language Models,” IEEE International Conference on Bioinformatics and Biomedicine 2020 (BIBM'20, short). [paper, code, a longer version] (The first two authors contributed equally.)
  • Xiang Yue*, Xinliang (Frederick) Zhang*, Ziyu Yao*, Simon Lin, and Huan Sun, “CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering,” arXiv, 2020. [paper, code] (The first two authors contributed equally.)
  • Ziyu Yao*, Yiqi Tang, Wen-tau Yih, Huan Sun, Yu Su, “An Imitation Game for Learning Semantic Parsers from User Interaction,” 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP'20, long). [paper, code, an earlier version on arXiv]
  • Bernhard Kratzwald#, Stefan Feuerriegel, Huan Sun, “Learning a Cost-Effective Annotation Policy for Question Answering,” 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP'20, long). [paper, code]
  • Jie Zhao*, Huan Sun, “Adversarial Training for Code Retrieval with Question-Description Relevance Regularization,” Findings of 2020 Conference on Empirical Methods in Natural Language Processing (Findings of EMNLP'20, a new acceptance category, long). [paper, code] [Scores of this paper (reviewed together with all papers submitted to EMNLP'20): 4/4/4, with 4 being "Strong: I learned a lot from it. I would like to see it accepted" on a rating scale of 1-5 (5 being the highest)]
  • Zhen Wang*, Jennifer Lee, Simon Lin, Huan Sun, “Rationalizing Medical Relation Prediction from Corpus-level Statistics,” Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL'20, long). [paper, code]
  • Xiang Yue*, Bernal Jimenez*, Huan Sun, “Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset,” Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL'20, long). [paper, code]
  • Jiankai Sun, Jie Zhao*, Huan Sun, Srinivasan Parthasarathy, “EndCold: An End-to-End Framework for Cold Question Routing in Community Question Answering Services,” The 29th International Joint Conference on Artificial Intelligence (IJCAI'20). [paper]
  • Jie Zhao*, Xiang Deng*, Huan Sun, “Easy-to-Hard: Leveraging Simple Questions for Complex Question Generation,” arXiv, 2019. [paper, code]
  • Ziyu Yao*, Yu Su, Huan Sun, Wen-tau Yih, “Model-based Interactive Semantic Parsing: A Unified Formulation and A Text-to-SQL Case Study,” 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP'19, long). [paper, code]
  • Xiang Deng*, Huan Sun, “Leveraging 2-hop Distant Supervision from Table Entity Pairs for Relation Extraction,” 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP'19, long). [paper, code]
  • Xiang Yue*, Zhen Wang*, Jingong Huang*, Srinivasan Parthasarathy, Soheil Moosavinasab, Yungui Huang, Simon M. Lin, Wen Zhang, Ping Zhang, and Huan Sun, “Graph Embedding on Biomedical Networks: Methods, Applications, and Evaluations,” Bioinformatics, 2019. [paper, code]
  • Boyuan Pan#, Hao Li, Ziyu Yao*, Deng Cai, Huan Sun, “Reinforced Dynamic Reasoning for Conversational Question Generation,” Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL'19, long). [paper, code]
  • Jie Zhao*, Ziyu Guan, Huan Sun, “Riker: Mining Rich Keyword Representations for Interpretable Product Question Answering,” The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2019 (SIGKDD'19, research track, acceptance rate: ~14.2%, poster). [paper, code]
  • (SIGKDD'19: ~110 oral + ~60 poster presentations selected from ~1200 submissions)
  • Zhen Wang*, Xiang Yue*, Soheil Moosavinasab, Yungui Huang, Simon Lin and Huan Sun, “SurfCon: Synonym Discovery on Privacy-Aware Clinical Data,” The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2019 (SIGKDD'19, research track, acceptance rate: ~14.2%, oral). [paper, code]
  • Z. Yao*, J. Peddamail*, H. Sun, “CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning,” The Web Conference (former WWW Conference) 2019 (WWW'19, acceptance rate: 18%, Oral + Poster). [paper, code]
  • Z. Yao*, X. Li, J. Gao, B. Sadler, H. Sun, “Interactive Semantic Parsing for If-Then Recipes via Hierarchical Reinforcement Learning,” The AAAI Conference on Artificial Intelligence 2019 (AAAI’19, acceptance rate: 16.2%). [paper, code]
  • L. Chen, Z. Guan, W. Zhao, W. Zhao, X. Wang, Z. Zhao, H. Sun, “Answer Identification from Product Reviews for User Questions by Multi-task Attentive Networks,” The AAAI Conference on Artificial Intelligence 2019 (AAAI’19, acceptance rate: 16.2%). [paper]
  • W. Zhao, Z. Guan, Y. Huang, T. Xi, H. Sun, Z. Wang, X. He, “Discerning Influence Patterns with Beta-Poisson Factorization in Microblogging Environments,” Transactions on Knowledge and Data Engineering (TKDE 2019). [paper]
  • J. Peddamail*, Z. Yao*, Z. Wang*, H. Sun, “A Comprehensive Study of StaQC for Deep Code Summarization,” SIGKDD Deep Learning Day 2018. [paper, slides] (SPOTLIGHT)
  • Z. Yao*, D. S. Weld, W.P. Chen, H. Sun, “StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow,” The Web Conference (former WWW Conference) 2018 (WWW'18, acceptance rate: 14.8%). [paper, code]
  • Y. Su, H. Liu, S. Yavuz, I. Gur, H. Sun, X. Yan, “Global Relation Embedding for Relation Extraction,” In Proc. of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018 (NAACL-HLT’18, long). [paper, code]
  • J. Zhao*, Y. Su, Z. Guan, H. Sun, “An End-to-End Deep Framework for Answer Triggering with a Novel Group-Level Objective,” Empirical Methods in Natural Language Processing 2017 (EMNLP'17). [paper, code]
  • Y. Li, N. Du, C. Liu, Y. Xie, W. Fan, Q. Li, J. Gao, H. Sun, “Reliable Medical Diagnosis from Crowdsourcing: Discover Trustworthy Answers from Non-Experts,” ACM Int. Conf. on Web Search and Data Mining 2017 (WSDM’17). [paper]
  • C. Liu, H. Sun, N. Du, S. Tan, H. Fei, W. Fan, T. Yang, H. Wu, Y. Li, C. Zhang, “Augmented LSTM Framework to Construct Medical Self-diagnosis Android,” IEEE Int. Conf. on Data Mining 2016 (ICDM’16). [paper]
  • Y. Su, H. Sun, B. Sadler, M. Srivatsa, I. Gur, Z. Yan, X. Yan, “On Generating Characteristic-rich Question Sets for QA Evaluation,” Empirical Methods in Natural Language Processing 2016 (EMNLP'16, long). [paper, appendix, New Question-Answer Set (with rich characteristics to train more advanced QA systems)]
  • H. Sun, H. Ma, X. He, W. Yih, Y. Su, X. Yan, “Table Cell Search for Question Answering,” The 25th Int. World Wide Web Conference (WWW'16). [paper]
  • Y. Li, S. Tan, H. Sun, J. Han, D. Roth, X. Yan, “Entity Disambiguation with Linkless Knowledge Bases,” The 25th Int. World Wide Web Conference (WWW'16). [paper]
  • F. Han, S. Tan, H. Sun, X. Yan, M. Srivatsa, D. Cai, “Distributed Representations of Expertise,” SIAM Int. Conf. on Data Mining 2016 (SDM'16). [paper]
  • H. Sun, H. Ma, W. Yih, C. Tsai, J. Liu, M. Chang, “Open Domain Question Answering via Semantic Enrichment,” The 24th Int. World Wide Web Conference (WWW'15, acceptance rate: 14.1%). [paper]
  • Y. Su, S. Yang, H. Sun, M. Srivatsa, S. Kase, M. Vanni, X. Yan, “Exploiting Relevance Feedback in Knowledge Graph Search,” Proc. of the 21st Int. Conf. on Knowledge Discovery and Data Mining (KDD’15, acceptance rate: 19.4%). [paper]
  • Z. Guan, S. Yang, H. Sun, M. Srivatsa, X. Yan, “Fine-Grained Knowledge Sharing in Collaborative Environments,” Transactions on Knowledge and Data Engineering (TKDE 2015). [paper]
  • H. Sun, M. Srivatsa, S. Tan, Y. Li, L. Kaplan, S. Tao, X. Yan, “Analyzing Expert Behaviors in Collaborative Networks,” Proc. of the 20th Int. Conf. on Knowledge Discovery and Data Mining (KDD'14, acceptance rate: 14.6%). [paper, slides, poster, Source Code]
  • S. Yang, Y. Wu, H. Sun, X. Yan, “Schemaless and Structureless Graph Querying,” Proc. of Int. Conf. on Very Large Data Bases (VLDB'14). [paper, poster]
  • S. Yang, Y. Xie, Y. Wu, T. Wu, H. Sun, J. Wu, X. Yan, “SLQ: A User-friendly Graph Querying System,” Proc. of Int. Conf. on Management of Data (SIGMOD'14, Demo Track).
  • H. Sun, M. Srivatsa, L. Kaplan, X. Yan, “Analyzing Expert Behaviors in Collaborative Networks,” International School and Conference on Network Science 2014 (NetSci'14).
  • N. Li, H. Sun, K. Chipman, J. George, X. Yan, “A Probabilistic Approach to Uncovering Attributed Graph Anomalies,” SIAM Int. Conf. on Data Mining 2014 (SDM'14, acceptance rate: 15.4%). [paper]
  • H. Sun, A. Morales, X. Yan, “Synthetic Review Spamming and Defense,” Proc. of the 19th Int. Conf. on Knowledge Discovery and Data Mining (KDD'13, acceptance rate: 17%). [paper, poster, Demo]
  • S. Tan, Y. Li, H. Sun, Z. Guan, X. Yan, J. Bu, C. Chen, X. He, “Interpreting the Public Sentiment Variations on Twitter,” Transactions on Knowledge and Data Engineering (TKDE 2014). [paper]
  • H. Sun, G. Miao, X. Yan, “Noise-Resistant Bicluster Recognition,” IEEE Int. Conf. on Data Mining 2013 (ICDM'13, Oral presentation, acceptance rate: 11.6%). [paper] [slides] [homepage] [A talk related to deep learning literature and techniques in this paper]
  • A. Morales, H. Sun, X. Yan, “Synthetic Review Spamming and Defense,” Proc. of the 22nd International World Wide Web Conference (WWW'13, Companion Volume).
  • H. Sun, G. Miao, X. Yan, “Noise-Resistant Bicluster Recognition,” The 17th Annual International Conference on Research in Computational Molecular Biology (RECOMB'13, Poster).

Dissertations I advised:

Tutorials:

  • J. Pujara, P. Szekely, H. Sun, M. Chen. “From Tables to Knowledge: Recent Advances in Table Understanding,” Tutorials of KDD'21 (co-presenter). [website][slides (Part III)]
  • F. Zhu, H. Sun, X. Yan. “Network Mining and Analysis for Social Applications,” Tutorials of KDD'14 (co-presenter). [slides]

Miscellaneous:

  • Spatial Continuity Constrained Robust PCA for Recovering Images with Continuous Corruption, Internship work during 01/2010~06/2010, supervised by Dr. Yi Ma at MSRA. Excellent Graduation Thesis Award of USTC (top 5%) in 2010
  • Rating prediction of Collaborative Filtering recommendation systems, Undergraduate Research Project during 06~09/2009, supervised by Prof. Nenghai Yu at USTC. Excellent Undergraduate Research Project Scholarship (University-wide top 20%) in 2009