node_similarities
Module for node similarity measures in graphs.
LeichtHolmeNewman_index
¶
Compute the Leicht-Holme-Newman index between two nodes.
The Leicht-Holme-Newman index for all pairs of nodes \(V^2\) is defined as
where \(m\) is the number of edges in the graph, \(\lambda_1\) is the largest eigenvalue of the adjacency matrix \(A\), \(D\) is the diagonal degree matrix, and \(I\) is the identity matrix. To obtain \(S_{LHN}(v, w)\), the entry corresponding to nodes \(v\) and \(w\) is selected from the matrix \(S_{LHN}\).
Reference
Proposed by Leicht, Holme, and Newman in "Vertex similarity in networks"1.
-
Leicht, E. A., Holme, P. & Newman, M. E. J. Vertex similarity in networks. Phys. Rev. E 73, 026120 (2006). ↩
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
graph
|
pathpyG.core.graph.Graph
|
The input graph. |
required |
v
|
str | int
|
The first node given the node label (or index if no label is provided). |
required |
w
|
str | int
|
The second node given the node label (or index if no label is provided). |
required |
alpha
|
float
|
Parameter controlling the weight of longer paths. |
required |
Returns:
| Type | Description |
|---|---|
float
|
The Leicht-Holme-Newman index between nodes |
Source code in src/pathpyG/statistics/node_similarities.py
adamic_adar_index
¶
Compute the Adamic-Adar index between two nodes.
The Adamic-Adar index between two nodes \(v, w \in V\) is defined as
where \(\mathcal{N}(v)\) and \(\mathcal{N}(w)\) are the sets of neighbors of nodes \(v\) and \(w\), respectively.
Reference
Proposed by Adamic and Adar in "Friends and neighbors on the web"1.
-
Adamic, L. A. & Adar, E. Friends and neighbors on the Web. Social Networks 25, 211-230 (2003). ↩
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
graph
|
pathpyG.core.graph.Graph
|
The input graph. |
required |
v
|
str | int
|
The first node given the node label (or index if no label is provided). |
required |
w
|
str | int
|
The second node given the node label (or index if no label is provided). |
required |
Returns:
| Type | Description |
|---|---|
float
|
The Adamic-Adar index between nodes |
Source code in src/pathpyG/statistics/node_similarities.py
common_neighbors
¶
Compute the number of common neighbors between two nodes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
graph
|
pathpyG.core.graph.Graph
|
The input graph. |
required |
v
|
str | int
|
The first node given the node label (or index if no label is provided). |
required |
w
|
str | int
|
The second node given the node label (or index if no label is provided). |
required |
Returns:
| Type | Description |
|---|---|
float
|
The number of common neighbors between nodes |
Source code in src/pathpyG/statistics/node_similarities.py
cosine_similarity
¶
Compute the cosine similarity between two nodes.
The cosine similarity between two nodes \(v, w \in V\) is defined as
where \(\mathbf{A}_v\) and \(\mathbf{A}_w\) are row vectors of the adjacency matrix corresponding to nodes \(v\) and \(w\), respectively, \(\|\cdot\|_2\) denotes the Euclidean norm, and \(\cdot\) denotes the dot product.
Reference
For more details, see Equation (7.35) in Networks1 by Mark Newman.
-
Newman, M. E. J. Networks. (Oxford University Press, 2018). doi:10.1093/oso/9780198805090.001.0001. ↩
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
graph
|
pathpyG.core.graph.Graph
|
The input graph. |
required |
v
|
str | int
|
The first node given the node label (or index if no label is provided). |
required |
w
|
str | int
|
The second node given the node label (or index if no label is provided). |
required |
Returns:
| Type | Description |
|---|---|
float
|
The cosine similarity between nodes |
Source code in src/pathpyG/statistics/node_similarities.py
inverse_path_length
¶
Compute the inverse path length similarity between two nodes.
Given a graph \(G=(V, E)\) and two nodes \(v, w \in V\), the inverse path length similarity is defined as
with the shortest path length \(d(v, w)\) between nodes \(v\) and \(w\).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
graph
|
pathpyG.core.graph.Graph
|
The input graph. |
required |
v
|
str | int
|
The first node given the node label (or index if no label is provided). |
required |
w
|
str | int
|
The second node given the node label (or index if no label is provided). |
required |
Returns:
| Type | Description |
|---|---|
float
|
The inverse path length similarity between nodes |
Source code in src/pathpyG/statistics/node_similarities.py
jaccard_similarity
¶
Compute the Jaccard similarity between two nodes.
The Jaccard similarity between two nodes \(v, w \in V\) is defined as
where \(\mathcal{N}(v)\) and \(\mathcal{N}(w)\) are the sets of neighbors of nodes \(v\) and \(w\), respectively.
Reference
For more details, see Equation (7.38) in Networks1 by Mark Newman.
-
Newman, M. E. J. Networks. (Oxford University Press, 2018). doi:10.1093/oso/9780198805090.001.0001. ↩
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
graph
|
pathpyG.core.graph.Graph
|
The input graph. |
required |
v
|
str | int
|
The first node given the node label (or index if no label is provided). |
required |
w
|
str | int
|
The second node given the node label (or index if no label is provided). |
required |
Returns:
| Type | Description |
|---|---|
float
|
The Jaccard similarity between nodes |
Source code in src/pathpyG/statistics/node_similarities.py
katz_index
¶
Compute the Katz index between two nodes.
The Katz index for all pairs of nodes \(V^2\) is defined as
where \(A\) is the adjacency matrix of the graph, \(I\) is the identity matrix, and \(\beta\) is a parameter controlling the weight of longer paths. To get \(S_{Katz}(v, w)\), the entry corresponding to nodes \(v\) and \(w\) is selected from the matrix \(S_{Katz}\).
Reference
While the Katz index was originally proposed by Leo Katz in 19531 as a measure of social influence of a node (centrality), it can also be used as a node similarity measure between two nodes.
-
Katz, L. A new status index derived from sociometric analysis. Psychometrika 18, 39-43 (1953). ↩
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
graph
|
pathpyG.core.graph.Graph
|
The input graph. |
required |
v
|
str | int
|
The first node given the node label (or index if no label is provided). |
required |
w
|
str | int
|
The second node given the node label (or index if no label is provided). |
required |
beta
|
float
|
Parameter controlling the weight of longer paths. |
required |
Returns:
| Type | Description |
|---|---|
float
|
The Katz index between nodes |
Source code in src/pathpyG/statistics/node_similarities.py
overlap_coefficient
¶
Compute the overlap coefficient between two nodes.
The overlap coefficient between two nodes \(v, w \in V\) is defined as
where \(\mathcal{N}(v)\) and \(\mathcal{N}(w)\) are the sets of neighbors of nodes \(v\) and \(w\), respectively.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
graph
|
pathpyG.core.graph.Graph
|
The input graph. |
required |
v
|
str | int
|
The first node given the node label (or index if no label is provided). |
required |
w
|
str | int
|
The second node given the node label (or index if no label is provided). |
required |
Returns:
| Type | Description |
|---|---|
float
|
The overlap coefficient between nodes |