第 8 3 章 :Network Module Definition Module detection Bayesian approach Markov clustering algorithm
Network Modular
Modularity Suppose we are given a candidate division of the vertices into some number of groups. The modularity of this division is defined to be the fraction of the edges that fall within the given groups minus the expected such fraction if edges were distributed at random. http://en.wikipedia.org/wiki/modularity_(networks)
Modularity A ij : adjacency matrix k i : degree m: total number of edges
Modularity For two class problem, let s i =1 of node I belongs to group1 and s i = 1 if it belongs to group 2,
Example 1 2 4 3 5
Spectrum Method The largest eigenvectors will gives the best grouping, positive entries corresponding to one class, and negative ones corresponding to another class. This can be achieved by power method where e=(1,1,,1) T
Example 对于上述矩阵 B, 可以计算出最大特征值为 10, 对应的特征向量 于是我们对节点的划分为 {1,2}; {3,4,5} 1 2 4 3 5
优化方法 既然现在有一个衡量划分 好坏 的量 Q, 那么一般的优化方法都可以使用 ; 1. 给定初始划分 2. 对于划分的某种修正, 计算 Q 的改变量 3. 依据一定的原则考虑是否接受这种修正, 重复步骤 2, 直到某种收敛条件满足 Greedy 方法 模拟退火方法
A Bayesian Approach to Network Modularity Slides for this part are mainly from Hofman s talk www.jakehofman.com/talks/apam_20071019.pdf
Overview: Modular Networks Given a network Assign nodes to modules? Determine number of modules(scale/complexity)?
Overview: Modular Networks With a generative model of modular networks, rules of probability tell us how to calculate model parameters (e.g. number of modules & assignments)
Generative Models Known model (parameters, assignment variables, complexity) Generate Synthetic data Inferred Model (parameters, latent variables, complexity) Observed real data
Generating Modular Networks For each node: Roll K sided die to determine z i =1,...,K, the (unobserved) module assignment for ith node For each pair of nodes (i,j): If z i =z j, flip in community coin with bias θ c to determine edge If z i z j, flip between communities coin with bias θ d to determine edge
Generating Modular Networks Die rolling, coin flipping, and priors:
Generating Modular Networks Edges within modules Non edges within modules Edges between modules Non edges between modules Nodes in each modules
Inferring Modular Networks From observed graph structure, infer distributions over module assignments, model parameters, and model complexity
Approximate Inference for Modular Networks Jensen s inequality (log of expected value bounds expected value of log) for any distribution q
Approximate Inference for Modular Networks F is a functional of q; find approximation to posterior by optimizing approximation to evidence Take q(z, π, θ)=q(z)q(π)q(θ); Q iμ is probability of node iin module μ
Variational Bayesian
Variational Bayesian Where the expected counts
Variational Bayesian Method 问题 : 给定数据 D, 缺失数据 Z={Z 1, Z 2,,Z k }, 极大化后验概率分布, 目标函数复杂, 很难得到 Close form. 策略寻找一个合适的近似分布 Q(Z)
如何近似 解决两个问题 如何度量 Q(Z) 和 P(Z D) 的差异 Kullback Leibler Divergence 如何得到简单的分布 Q(Z) 可分解分布 (Factorized Approximation)
Kullback Leibler Divergence 具有如下三个性质 不对称性 : 非负性 : 当且仅当 p=q 时为 0 不满足三角不等式
优化基础 故令 则
优化基础 另一方面 这里 log P(Z,D) 在物理上称之为能量, 而后者是 Q 的熵
平均场 (Mean Field) 近似 采用可分解分布进行近似
平均场近似下的优化算法 优化问题 约束条件
优化问题求解 可以证明, 上述优化问题的解 : 其中 C 为归一化因子. 表示除了 Z i 之外的其它隐变量
考虑 Z=(Z i, Z [ i] ) 优化公式推导
优化公式推导 定义其中 C 为 Q * 的归一化因子. Q* 定义了一个 Z i 上的新的概率分布 另一方面
优化公式推导 于是 上式作为 Q i 的泛函, 由变分原理得
优化公式推导 实际上, 直接由 KL 散度的非负性, 只有当散度为 0 时,L(Q) 取最大值, 即
变分原理 泛函 : 以函数为自变量的函数 泛函 J(y) 在 y=y(x) 极大值 : 对于 y 的任何一个扰动 y 从形式上看, 和 y 是普通变量的含义类似
变分原理 极大值的必要条件 : 一级变分为 0 带约束 J 0 (y) 极大值的必要条件 (Lagrange 乘子 ) 从形式上看, 和 y 是普通变量的含义类似
迭代算法
Markov Clustering Algorithm van Dongen. A cluster algorithm for graphs. Information Systems, 2000
K length Path Basic idea: dense regions in sparse graphs corresponding with regions in which the number of k length path is relatively large. Random walks can also be used to detect clusters in graphs, the idea is that the more closed is a subgraph, the largest the time a random walker need to escape from it.
K path Clustering Matrix manipulation: (N+I) 2
Markov Clustering Expansion: Through matrix manipulation (power), one obtains a matrix for a n steps connection. Inflation: Enhance intercluster passages by raising the elements to a certain power and then normalize
Markov Clustering Algorithm Iteratively running two operators Inflation: Column normalization Expansion:
MCL Running
MCL Running
MCL Running
MCL Running
A Heuristic for MCL We take a random walk on the graph described by the similarity matrix After each step we weaken the links between distant nodes and strengthen the links between nearby nodes Graphic from van Dongen, 2000
References Newman MEJ, Modularity and community structure in network. PNAS 103: 8577 8582, 2006. Hofman, JM and Wiggins, CH. Bayesian approach to network modularity. Physical Review Letters 100:258701, 2008. S. van Dongen. A cluster algorithm for graphs. CWI Report INS R0010, 2000. 侯琳, 邓明华. 网络聚类算法及其生物信息学应用. 数学建模及其生物信息学应用. Vol.2(2): 9 14, 2013. CharlesW. Fox, Stephen J. Roberts. A tutorial on variational Bayesian inference. Artif Intell Rev (2012) 38:85 95.DOI 10.1007/s10462 011 9236 8 C. M. Bishop. Pattern Recognition and Machine Learning. Springer. 2006. ( 第 10 章 : 变分近似 )