
I am a second-year Ph.D. student at Michigan State University, supervised by Dr. Jiliang Tang. I also work closely with Dr. Rongrong Wang. Over these years, I opened up a new direction in our group: Graph Foundation Models. See details in the DSE GFM subgroup. I am super happy and proud to work with these talented people.

My research revolves around understanding network and LLM mechanisms from a data generation perspective. I enjoy understanding interesting phenomena, finding pitfalls in the existing literature, and defining interesting new challenges. I am skilled in (1) conducting curiosity-driven analytical experiments to observe interesting behaviors and (2) defining interesting research perspectives (which may not always be practical). Throughout my research, I aim to answer the following research questions:

Network mechanism & new data mining challenges

Details
  • Graph data can be diverse. How can one GNN perform well across graphs from different domains? This seemingly obeys the Occam's razor I believe in.
  • Despite graph data being seemingly diverse, are there any shared principles across certain graphs? What are the latent factors underlying the graphs we observe? At the current stage, I believe in three perspectives: (1) a geometric perspective, which can help when the graph is strictly constructed following certain principles; (2) a network analysis perspective: finding the frequent motifs among graphs; (3) an LLM perspective: I am not sure what the superpower of this black box is.
  • How can we build graph foundation models, and what is the killer application for them?
  • How can we apply graph techniques to an industry-level large graph with a trade-off between effectiveness and efficiency?
  • Is there any additional gain from graph structure after utilizing an LLM to encode textual node features?
  • How can we define a more practical ranking and recommendation scenario in academia and build a suitable system accordingly?


LLM mechanism analysis

Details
  • A neural network is a complicated black-box system. How can we understand it? Is there any interesting generalization behavior for each individual neuron? How can we improve each individual neuron toward better generalization? I generally believe in the “competing subnetworks” concept: the model initially represents a variety of distinct algorithms, corresponding to different subnetworks, and generalization occurs when it ultimately converges into one. However, I think this may not work for large-scale LLMs.
  • LLMs show many emergent capabilities after specific fine-tuning, e.g., instruction following. However, the fine-tuning requires only minor modifications to the original weights. How can a new capability emerge from only minor modifications?
  • The most amazing LLM capability is in-context learning, which lets the model learn from contextual information without any gradient update. How does the LLM learn new knowledge or activate a particular subnetwork with contextual information? I am specifically interested in the self-correction and moral reasoning capabilities.

Selected publications

The order indicates my personal preference.

  • Graph Foundation Models
    Haitao Mao*, Zhikai Chen*, Wenzhuo Tang, Jianan Zhao, Yao Ma, Tong Zhao, Neil Shah, Mikhail Galkin, Jiliang Tang
    ICML 2024
    collaboration with SnapChat and Intel
    [pdf] [blog] [reading List 1] [reading List 2]

  • Revisiting Link Prediction: A Data Perspective
    Haitao Mao, Juanhui Li, Harry Shomer, Bingheng Li, Wenqi Fan, Yao Ma, Tong Zhao, Neil Shah, Jiliang Tang
    ICLR 2024
    collaboration with SnapChat
    [pdf] [slides] [video]

  • A Data Generation Perspective to the Mechanism of In-Context Learning
    Haitao Mao, Guangliang Liu, Yao Ma, Rongrong Wang, Jiliang Tang
    Preprint [pdf]

  • Neuron Campaign for Initialization Guided by Information Bottleneck Theory
    Haitao Mao, Xu Chen, Qiang Fu, Lun Du, Shi Han, Dongmei Zhang
    CIKM 2021 Best Short Paper
    Work done during an internship at Microsoft Research Asia
    [pdf] [code] [blog] [Chinese blog] [Poster] [Slides] [Video]

  • Exploring the Potential of Large Language Models (LLMs) in Learning on Graphs
    Zhikai Chen, Haitao Mao, Hang Li, Wei Jin, Hongzhi Wen, Xiaochi Wei, Shuaiqiang Wang, Dawei Yin, Wenqi Fan, Hui Liu, Jiliang Tang
    SIGKDD Explorations 2023
    collaboration with Baidu
    [pdf] [code] [slides]

  • Demystifying Structural Disparity in Graph Neural Networks: Can One Size Fit All?
    Haitao Mao, Zhikai Chen, Wei Jin, Haoyu Han, Yao Ma, Tong Zhao, Neil Shah, Jiliang Tang
    NeurIPS 2023
    collaboration with SnapChat
    [pdf] [code] [slides] [poster] [video]

  • Source Free Graph Unsupervised Domain Adaptation
    Haitao Mao, Lun Du, Yujia Zheng, Qiang Fu, Zelin Li, Xu Chen, Shi Han, Dongmei Zhang
    WSDM 2024 Best Paper Honorable Mention
    Work done during an internship at Microsoft Research Asia
    [pdf] [blog] [code]

Awards

  • WSDM 2024 Best Paper Honorable Mention award (first author) (3/615)
  • CIKM 2021 Best Short Paper award (first author) (1/626)
  • NSF Student Travel Grant - WSDM 2024
  • NeurIPS 2023 Scholar Award
  • Excellent Student of Higher Education in Sichuan Province (30/763)
  • Outstanding Graduate of the University of Electronic Science and Technology of China (74/763)
  • Star of Tomorrow intern award at Microsoft Research Asia (top 10%)
  • National first prize in Chinese Software (20/45,000) [Github]
  • Best project and best performance award at the Nanjing University NLP lab (3/57)
  • A+ performance (4/40) at the summer camp of the National University of Singapore, School of Computing

Professional Experience

  • Visiting scholar at Hong Kong Polytechnic University (March 2023 - September 2023): supervised by Research Assistant Professor Wenqi Fan and Professor Qing Li
  • Research intern at Baidu (March 2022 - September 2022): Search strategy department, supervised by Lixin Zou (now an associate professor at Wuhan University)
  • Research intern at Microsoft Research Asia (January 2021 - November 2021)

Education

Michigan State University (August 2022 - present)

University of Electronic Science and Technology of China (September 2018 - June 2022)

Support

This page is supported by Hanlin Lan, one of my best friends from my undergraduate years. Thanks for his great help.