Conferences/Journals:

*: My advisee; #: visiting student (work done/started at OSU). A few papers might appear under multiple topics.

On semantic parsing and NLP/ML for automated programming:

  • Xiang Deng*, Ahmed Hassan Awadallah, Christopher Meek, Oleksandr Polozov, Huan Sun, Matthew Richardson, “Structure-Grounded Pretraining for Text-to-SQL,” The 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2021). [paper, code, an earlier version]
  • Ziyu Yao*, Frank F. Xu, Pengcheng Yin, Huan Sun, Graham Neubig, “Learning Structural Edits via Incremental Tree Transformations,” The Ninth International Conference on Learning Representations 2021 (ICLR'21). [paper, code]
  • Ziyu Yao*, Yiqi Tang, Wen-tau Yih, Huan Sun, Yu Su, “An Imitation Game for Learning Semantic Parsers from User Interaction,” 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP'20). [paper, code, an earlier version on arXiv]
  • Jie Zhao*, Huan Sun, “Adversarial Training for Code Retrieval with Question-Description Relevance Regularization,” Findings of 2020 Conference on Empirical Methods in Natural Language Processing (Findings of EMNLP'20, A new acceptance category). [paper, code] [Scores of this paper (reviewed with all papers to EMNLP'20): 4/4/4, with 4 being "Strong: I learned a lot from it. I would like to see it accepted" under a rating scale of 1-5 (5 being the highest)]
  • Ziyu Yao*, Yu Su, Huan Sun, Wen-tau Yih, “Model-based Interactive Semantic Parsing: A Unified Formulation and A Text-to-SQL Case Study,” 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP'19). [paper, code]
  • Z. Yao*, J. Peddamail*, H. Sun, “CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning,” The Web Conference (former WWW Conference) 2019 (WWW'19, acceptance rate: 18%, Oral + Poster). [paper, code]
  • Z. Yao*, X. Li, J. Gao, B. Sadler, H. Sun, “Interactive Semantic Parsing for If-Then Recipes via Hierarchical Reinforcement Learning,” The AAAI Conference on Artificial Intelligence 2019 (AAAI’19, acceptance rate: 16.2%). [paper, code]
  • Z. Yao*, D. S. Weld, W.P. Chen, H. Sun, “StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow,” The Web Conference (former WWW Conference) 2018 (WWW'18, acceptance rate: 14.8%). [paper, code]
  • J. Peddamail*, Z. Yao*, Z. Wang*, H. Sun, “A Comprehensive Study of StaQC for Deep Code Summarization,” SIGKDD Deep Learning Day 2018. [paper, slides] (SPOTLIGHT)
On pre-training and semi-structured table representation and understanding:
  • Xiang Deng*, Ahmed Hassan Awadallah, Christopher Meek, Oleksandr Polozov, Huan Sun, Matthew Richardson, “Structure-Grounded Pretraining for Text-to-SQL,” The 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2021). [paper, code, an earlier version]
  • Xiang Deng*, Huan Sun, Alyssa Lees, You Wu, Cong Yu, “TURL: Table Understanding through Representation Learning,” 47th International Conference on Very Large Data Bases (VLDB'21). [paper, code, an earlier version on arXiv]
  • Xiang Deng*, Huan Sun, “Leveraging 2-hop Distant Supervision from Table Entity Pairs for Relation Extraction,” 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP'19). [paper, code]
On knowledge representation and reasoning in (textual) graphs, with emphasis on interpretability:
  • Zhen Wang*, Bo Zong, Huan Sun, “Modeling Context Pair Interaction for Pairwise Tasks on Graphs,” The 14th International Conference on Web Search and Data Mining (WSDM'21, acceptance rate: ~18.6%). [paper, code]
  • Zhen Wang*, Jennifer Lee, Simon Lin, Huan Sun, “Rationalizing Medical Relation Prediction from Corpus-level Statistics,” Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL'20, long). [paper, code]
  • Zhen Wang*, Xiang Yue*, Soheil Moosavinasab, Yungui Huang, Simon Lin and Huan Sun, “SurfCon: Synonym Discovery on Privacy-Aware Clinical Data,” The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2019 (SIGKDD'19, research track, acceptance rate: ~14.2%, oral). [paper, code]
  • Y. Su, H. Liu, S. Yavuz, I. Gur, H. Sun, X. Yan, “Global Relation Embedding for Relation Extraction,” In Proc. of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018 (NAACL-HLT’18). [paper, code]
On question answering and reading comprehension, with applications to the clinical domain:
  • Bernhard Kratzwald#, Stefan Feuerriegel, Huan Sun, “Learning a Cost-Effective Annotation Policy for Question Answering,” 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP'20). [paper, code]
  • Xiang Yue*, Bernal Jimenez*, Huan Sun, “Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset,” Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL'20, long). [paper, code]
  • Xiang Yue*, Xinliang (Frederick) Zhang*, Ziyu Yao*, Simon Lin, and Huan Sun, “CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering,” arXiv, 2020. [paper, code] (The first two authors contributed equally.)
  • Jiankai Sun, Jie Zhao*, Huan Sun, Srinivasan Parthasarathy, “EndCold: An End-to-End Framework for Cold Question Routing in Community Question Answering Services,” The 29th International Joint Conference on Artificial Intelligence (IJCAI'20). [paper]
  • Jie Zhao*, Xiang Deng*, Huan Sun, “Easy-to-Hard: Leveraging Simple Questions for Complex Question Generation,” arXiv, 2019. [paper, code]
  • Boyuan Pan#, Hao Li, Ziyu Yao*, Deng Cai, Huan Sun, “Reinforced Dynamic Reasoning for Conversational Question Generation,” Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL'19). [paper, code]
  • Jie Zhao*, Ziyu Guan, Huan Sun, “Riker: Mining Rich Keyword Representations for Interpretable Product Question Answering,” The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2019 (SIGKDD'19, research track, acceptance rate: ~14.2%, poster). [paper, code]
  • (SIGKDD'19: ~110 oral + ~60 poster presentations selected from ~1200 submissions)
  • L. Chen, Z. Guan, W. Zhao, W. Zhao, X. Wang, Z. Zhao, H. Sun, “Answer Identification from Product Reviews for User Questions by Multi-task Attentive Networks,” The AAAI Conference on Artificial Intelligence 2019 (AAAI’19, acceptance rate: 16.2%). [paper]
  • J. Zhao*, Y. Su, Z. Guan, H. Sun, “An End-to-End Deep Framework for Answer Triggering with a Novel Group-Level Objective,” Empirical Methods in Natural Language Processing 2017 (EMNLP'17). [paper, code]
  • H. Sun, H. Ma, X. He, W. Yih, Y. Su, X. Yan, “Table Cell Search for Question Answering,” The 25th Int. World Wide Web Conference (WWW'16). [paper]
  • Y. Su, H. Sun, B. Sadler, M. Srivatsa, I. Gur, Z. Yan, X. Yan, “On Generating Characteristic-rich Question Sets for QA Evaluation,” Empirical Methods in Natural Language Processing 2016 (EMNLP'16). [paper, appendix, New Question-Answer Set (with rich characteristics to train more advanced QA systems)]
  • H. Sun, H. Ma, W. Yih, C. Tsai, J. Liu, M. Chang, “Open Domain Question Answering via Semantic Enrichment,” The 24th Int. World Wide Web Conference (WWW'15, acceptance rate: 14.1%). [paper]
  • S. Yang, Y. Wu, H. Sun, X. Yan, “Schemaless and Structureless Graph Querying,” Proc. of Int. Conf. on Very Large Data Bases (VLDB'14). [paper, poster]
  • S. Yang, Y. Xie, Y. Wu, T. Wu, H. Sun, J. Wu, X. Yan, “SLQ: A User-friendly Graph Querying System,” Proc. of Int. Conf. on Management of Data (SIGMOD'14, Demo Track).
On biomedical and clinical data mining:
  • Zhen Wang*, Jennifer Lee, Simon Lin, Huan Sun, “Rationalizing Medical Relation Prediction from Corpus-level Statistics,” Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL'20, long). [paper, code]
  • Zhen Wang*, Xiang Yue*, Soheil Moosavinasab, Yungui Huang, Simon Lin and Huan Sun, “SurfCon: Synonym Discovery on Privacy-Aware Clinical Data,” The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2019 (SIGKDD'19, research track, acceptance rate: ~14.2%, oral). [paper, code]
  • Kaushik Mani*, Xiang Yue*, Bernal Jimenez Gutierrez*, Yungui Huang, Simon Lin, and Huan Sun, “Clinical Phrase Mining with Language Models,” IEEE International Conference on Bioinformatics and Biomedicine 2020 (BIBM'20, short). [paper, code, a longer version] (The first two authors contributed equally.)
  • Xiang Yue*, Zhen Wang*, Jingong Huang*, Srinivasan Parthasarathy, Soheil Moosavinasab, Yungui Huang, Simon M. Lin, Wen Zhang, Ping Zhang, and Huan Sun, “Graph Embedding on Biomedical Networks: Methods, Applications, and Evaluations,” Bioinformatics, 2019. [paper, code]
  • Y. Li, N. Du, C. Liu, Y. Xie, W. Fan, Q. Li, J. Gao, H. Sun, “Reliable Medical Diagnosis from Crowdsourcing: Discover Trustworthy Answers from Non-Experts,” ACM Int. Conf. on Web Search and Data Mining 2017 (WSDM’17). [paper]
  • C. Liu, H. Sun, N. Du, S. Tan, H. Fei, W. Fan, T. Yang, H. Wu, Y. Li, C. Zhang, “Augmented LSTM Framework to Construct Medical Self-diagnosis Android,” IEEE Int. Conf. on Data Mining 2016 (ICDM’16). [paper]
On other topics in the area of data mining (especially network analysis and text mining):
  • W. Zhao, Z. Guan, Y. Huang, T. Xi, H. Sun, Z. Wang, X. He, “Discerning Influence Patterns with Beta-Poisson Factorization in Microblogging Environments,” Transactions on Knowledge and Data Engineering (TKDE 2019). [paper]
  • Y. Li, S. Tan, H. Sun, J. Han, D. Roth, X. Yan, “Entity Disambiguation with Linkless Knowledge Bases,” The 25th Int. World Wide Web Conference (WWW'16). [paper]
  • F. Han, S. Tan, H. Sun, X. Yan, M. Srivatsa, D. Cai, “Distributed Representations of Expertise,” SIAM Int. Conf. on Data Mining 2016 (SDM'16). [paper]
  • Y. Su, S. Yang, H. Sun, M. Srivatsa, S. Kase, M. Vanni, X. Yan, “Exploiting Relevance Feedback in Knowledge Graph Search,” Proc. of the 21st Int. Conf. on Knowledge Discovery and Data Mining (KDD’15, acceptance rate: 19.4%). [paper]
  • Z. Guan, S. Yang, H. Sun, M. Srivatsa, X. Yan, “Fine-Grained Knowledge Sharing in Collaborative Environments,” Transactions on Knowledge and Data Engineering (TKDE 2015). [paper]
  • H. Sun, M. Srivatsa, S. Tan, Y. Li, L. Kaplan, S. Tao, X. Yan, “Analyzing Expert Behaviors in Collaborative Networks,” Proc. of the 20th Int. Conf. on Knowledge Discovery and Data Mining (KDD'14, acceptance rate: 14.6%). [paper, slides, poster, Source Code]
  • H. Sun, M. Srivatsa, L. Kaplan, X. Yan, “Analyzing Expert Behaviors in Collaborative Networks,” International School and Conference on Network Science 2014 (NetSci'14).
  • N. Li, H. Sun, K. Chipman, J. George, X. Yan, “A Probabilistic Approach to Uncovering Attributed Graph Anomalies,” SIAM Int. Conf. on Data Mining 2014 (SDM'14, acceptance rate: 15.4%). [paper]
  • H. Sun, A. Morales, X. Yan, “Synthetic Review Spamming and Defense,” Proc. of the 19th Int. Conf. on Knowledge Discovery and Data Mining (KDD'13, acceptance rate: 17%). [paper, poster, Demo]
  • S. Tan, Y. Li, H. Sun, Z. Guan, X. Yan, J. Bu, C. Chen, X. He, “Interpreting the Public Sentiment Variations on Twitter,” Transactions on Knowledge and Data Engineering (TKDE 2014). [paper]
  • H. Sun, G. Miao, X. Yan, “Noise-Resistant Bicluster Recognition,” IEEE Int. Conf. on Data Mining 2013 (ICDM'13, Oral presentation, acceptance rate: 11.6%). [paper][slides][homepage] [A talk related to deep learning literature and techniques in this paper]
  • A. Morales, H. Sun, X. Yan, “Synthetic Review Spamming and Defense,” Proc. of the 22nd International World Wide Web Conference (WWW'13, Companion Volume).
  • H. Sun, G. Miao, X. Yan, “Noise-Resistant Bicluster Recognition,” The 17th Annual International Conference on Research in Computational Molecular Biology (RECOMB'13, Poster).

Tutorial:

  • F. Zhu, H. Sun, X. Yan. “Network Mining and Analysis for Social Applications,” Tutorials of KDD'14 (co-presenter). [slides]

Miscellaneous:

  • Spatial Continuity Constrained Robust PCA for Recovering Images with Continuous Corruption, Internship work during 01/2010~06/2010, supervised by Dr. Yi Ma at MSRA. Excellent Graduation Thesis Award of USTC (top 5%) in 2010
  • Rating prediction of Collaborative Filtering recommendation systems, Undergraduate Research Project during 06~09/2009, supervised by Prof. Nenghai Yu at USTC. Excellent Undergraduate Research Project Scholarship (University-wide top 20%) in 2009