Type
Text
Type
Dissertation
Advisor
Skiena, Steven | Akoglu, Leman | Gao, Jie | Mirrokni, Vahab.
Date
2016-12-01
Keywords
community detection, deep learning, graph mining, social networks | Computer science
Department
Department of Computer Science
Language
en_US
Source
This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree.
Identifier
http://hdl.handle.net/11401/77247
Publisher
The Graduate School, Stony Brook University: Stony Brook, NY.
Format
application/pdf
Abstract
It is increasingly common to encounter real-world graphs which have attributes associated with the nodes, in addition to their raw connectivity information. For example, social networks contain both the friendship relations as well as user attributes such as interests and demographics. A protein-protein interaction network may not only have the interaction relations but the expression levels associated with the proteins. Such information can be described by a graph in which nodes represent the objects, edges represent the relations between them, and feature vectors associated with the nodes represent the attributes. This graph data is often referred to as an attributed graph. This thesis focuses on developing scalable algorithms and models for attributed graphs. This data can be viewed as either discrete (set of edges), or continuous (distances between embedded nodes), and I examine the issue from both sides. Specifically, I present an online learning algorithm which utilizes recent advances in deep learning to create rich graph embeddings. The multiple scales of social relationships encoded by this novel approach are useful for multi-label classification and regression tasks in networks. I also present local algorithms for anomalous community scoring in discrete graphs. These algorithms discover subsets of the graph's attributes which cause communities to form (e.g. shared interests on a social network). The scalability of all the methods in this thesis is ensured by building from a restricted set of graph primitives, such as ego-networks and truncated random walks, which exploit the local information around each vertex. In addition, limiting the scope of graph dependencies we consider enables my approaches to be trivially parallelized using commodity tools for big data processing, like MapReduce or Spark. The applications of this work are broad and far reaching across the fields of data mining and information retrieval, including user profiling/demographic inference, online advertising, and fraud detection. | 121 pages
Recommended Citation
Perozzi, Bryan, "Local Modeling of Attributed Graphs: Algorithms and Applications" (2016). Stony Brook Theses and Dissertations Collection, 2006-2020 (closed to submissions). 3072.
https://commons.library.stonybrook.edu/stony-brook-theses-and-dissertations-collection/3072