pandas
add_edge_attributes
Add edge attributes from a pandas data frame to an existing graph, where source/target node IDs are given in columns v and w, and edge attributes x are given in columns edge_x.
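As a minimal sketch of the layout described above, the following builds a data frame whose v and w columns identify edges and whose edge_-prefixed column carries an attribute. The attribute name `weight` and the variable `attribute_columns` are illustrative assumptions and not part of pathpyG; the call to add_edge_attributes itself is left out because its exact signature is not shown on this page.

```python
import pandas as pd

# One row per edge: source in 'v', target in 'w', attributes in 'edge_*' columns.
df = pd.DataFrame({
    "v": ["a", "b", "c"],
    "w": ["b", "c", "a"],
    "edge_weight": [1.0, 2.5, 0.5],  # hypothetical attribute named 'weight'
})

# Columns that follow the edge_x naming convention.
attribute_columns = [c for c in df.columns if c.startswith("edge_")]
print(attribute_columns)
```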
Source code in src/pathpyG/io/pandas.py
add_node_attributes
Add node attributes from a pandas data frame to an existing graph, where node IDs or indices are given in column v, and node attributes x are given in columns node_x.
df_to_graph
Reads a network from a pandas data frame.
The data frame is expected to have a minimum of two columns that give the source and target nodes of edges. Additional columns in the data frame will be mapped to edge attributes.
Args:
df: pandas.DataFrame
A data frame with rows containing edges and optional edge attributes. If the
data frame contains column names, the source and target columns must be called
'v' and 'w' respectively. If no column names are used the first two columns
are interpreted as source and target.
is_undirected: Optional[bool]=True
whether or not to interpret edges as undirected
multiedges: Optional[bool]=False
whether or not to allow multiple edges between the same node pair. By
default multi edges are ignored.
Example
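A minimal sketch of the expected input, using pandas only. The `weight` column is an illustrative assumption. Dropping repeated (v, w) pairs mimics what ignoring multi-edges means for the input rows; which of the duplicate rows is kept is not specified on this page, so keeping the first one is an assumption of this sketch.

```python
import pandas as pd

# Named columns: 'v' is the source, 'w' the target, anything else an attribute.
df = pd.DataFrame({
    "v": ["a", "b", "a"],
    "w": ["b", "c", "b"],        # the pair (a, b) appears twice
    "weight": [1.0, 2.0, 3.0],   # hypothetical extra column (edge attribute)
})

# Rows left once repeated node pairs are dropped (assumption: first row wins).
simple_edges = df.drop_duplicates(subset=["v", "w"])
print(simple_edges)

# The same edges without column names: the first two columns are read as
# source and target.
unnamed = pd.DataFrame([
    ["a", "b", 1.0],
    ["b", "c", 2.0],
    ["a", "b", 3.0],
])
print(unnamed.iloc[:, :2])
```

With pathpyG installed, a frame like `df` would be the `df` argument passed to df_to_graph, together with the is_undirected and multiedges options described above.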
df_to_temporal_graph
Reads a temporal graph from a pandas data frame.
The data frame is expected to have a minimum of two columns, v and w, that give the source and target nodes of edges. Additional column names can be configured in config.cfg as v_synonyms and w_synonyms. The time information on edges can be stored either in an additional timestamp column (for instantaneous interactions) or in two columns, start and end or timestamp and duration, for networks where edges appear and exist for a certain time. Synonyms for those column names can be configured in config.cfg. Each row in the data frame is mapped to one temporal edge. Additional columns in the data frame are mapped to edge attributes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | pandas.DataFrame | pandas.DataFrame with rows containing time-stamped edges and optional edge attributes | required |
timestamp_format | | timestamp format | '%Y-%m-%d %H:%M:%S' |
time_rescale | | time stamp rescaling factor | 1 |
**kwargs | typing.Any | Arbitrary keyword arguments that will be set as network-level attributes. | {} |
Example:
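import pathpyG as pp
import pandas as pd

# Data frame with named columns: source 'v', target 'w', time 't'
df = pd.DataFrame({
    'v': ['a', 'b', 'c'],
    'w': ['b', 'c', 'a'],
    't': [1, 2, 3]})
g = pp.io.df_to_temporal_graph(df)
print(g)

# Data frame without column names: one row per edge (source, target, time)
df = pd.DataFrame([
    ['a', 'b', 1],
    ['b', 'c', 2],
    ['c', 'a', 3]
])
g = pp.io.df_to_temporal_graph(df)
print(g)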
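The two interval layouts described above carry the same information. The sketch below converts a start/end record into a timestamp/duration record, assuming that timestamp equals start and duration equals end minus start. That relation and the helper name `to_timestamp_duration` are assumptions made for illustration, not documented pathpyG behaviour.

```python
def to_timestamp_duration(row):
    """Convert a start/end edge record to timestamp/duration form."""
    # Keep every field except the two interval columns.
    out = {k: v for k, v in row.items() if k not in ("start", "end")}
    out["timestamp"] = row["start"]
    out["duration"] = row["end"] - row["start"]  # assumed relation
    return out


rows = [
    {"v": "a", "w": "b", "start": 1, "end": 4},
    {"v": "b", "w": "c", "start": 2, "end": 3},
]
converted = [to_timestamp_duration(r) for r in rows]
print(converted)
```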
graph_to_df
Returns a pandas data frame for a given graph.
Returns a pandas data frame that contains all edges, including edge attributes. Node and network-level attributes are not included. To facilitate import into network analysis tools that only support integer node identifiers, node uids can be replaced by a consecutive, zero-based index.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
graph | pathpyG.core.graph.Graph | The graph to export as pandas DataFrame | required |
node_indices | typing.Optional[bool] | whether nodes should be exported as integer indices | False |
Example:
import pathpyG as pp
n = pp.Graph.from_edge_list([('a', 'b'), ('b', 'c'), ('c', 'a')])
df = pp.io.graph_to_df(n)
print(df)
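To illustrate what replacing node uids by a consecutive, zero-based index means, here is a small pure-Python sketch. Numbering nodes in order of first appearance is an assumption of this sketch; the order pathpyG uses is not stated on this page, and the names `index_of` and `indexed_edges` are hypothetical.

```python
edges = [("a", "b"), ("b", "c"), ("c", "a")]

# Assign consecutive, zero-based indices in order of first appearance.
index_of = {}
for v, w in edges:
    for uid in (v, w):
        index_of.setdefault(uid, len(index_of))

# Rewrite every edge with integer endpoints.
indexed_edges = [(index_of[v], index_of[w]) for v, w in edges]
print(indexed_edges)  # [(0, 1), (1, 2), (2, 0)]
```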
read_csv_graph
Reads a Graph or TemporalGraph from a csv file. To read a temporal graph, the csv file must have a header with a column t containing the time stamps of edges.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
loops | | whether or not to add self-loops | required |
directed | | whether or not to interpret edges as directed | required |
multiedges | bool | whether or not to add multiple edges | False |
sep | str | character separating columns in the csv file | ',' |
header | bool | whether or not the first line of the csv file is interpreted as header with column names | True |
timestamp_format | | format of timestamps | required |
time_rescale | | rescaling of timestamps | required |
Example
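The following sketch only illustrates the file layout described above: a csv file whose header contains a column t would be treated as temporal, while one without it would be treated as a static graph. The helper `has_time_column` is hypothetical and uses the standard library only; it is not how pathpyG itself makes the decision.

```python
import csv
import io


def has_time_column(csv_text, sep=","):
    """Return True if the first line is a header that contains a column 't'."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=sep)
    header = next(reader, [])
    return "t" in [name.strip() for name in header]


static_csv = "v,w\na,b\nb,c\n"
temporal_csv = "v,w,t\na,b,1\nb,c,2\n"
print(has_time_column(static_csv), has_time_column(temporal_csv))
```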
read_csv_temporal_graph
Reads a TemporalGraph from a csv file that has at least three columns containing source, target and time.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sep | str | character separating columns in the csv file | ',' |
header | bool | whether or not the first line of the csv file is interpreted as header with column names | True |
directed | | whether or not to interpret edges as directed | required |
timestamp_format | str | format of timestamps | '%Y-%m-%d %H:%M:%S' |
time_rescale | int | rescaling of timestamps | 1 |
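As a rough illustration of how timestamp_format and time_rescale could act on the time column, the sketch below parses timestamp strings with the default format and then rescales them. It assumes timestamps are converted to Unix seconds in UTC and that rescaling means integer division by the factor; both are assumptions of this sketch, and `parse_time` is a hypothetical helper rather than a pathpyG function.

```python
import csv
import io
from datetime import datetime, timezone


def parse_time(text, timestamp_format="%Y-%m-%d %H:%M:%S", time_rescale=1):
    """Parse a timestamp string to Unix seconds (UTC assumed), then rescale."""
    dt = datetime.strptime(text, timestamp_format).replace(tzinfo=timezone.utc)
    return int(dt.timestamp()) // time_rescale  # assumed meaning of rescaling


# Three columns: source, target and time.
csv_text = "v,w,t\na,b,2024-01-01 00:00:00\nb,c,2024-01-01 00:01:00\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

seconds = [parse_time(r["t"]) for r in rows]
minutes = [parse_time(r["t"], time_rescale=60) for r in rows]
print(seconds[1] - seconds[0], minutes[1] - minutes[0])
```

Under these assumptions, a time_rescale of 60 turns second-resolution timestamps into minute-resolution ones.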
temporal_graph_to_df
Returns a pandas data frame for a given temporal graph.
Returns a pandas data frame that contains all edges, including edge attributes. Node and network-level attributes are not included. To facilitate import into network analysis tools that only support integer node identifiers, node uids can be replaced by a consecutive, zero-based index.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
graph | pathpyG.core.temporal_graph.TemporalGraph | The graph to export as pandas DataFrame | required |
node_indices | typing.Optional[bool] | whether nodes should be exported as integer indices | False |
Example:
import pathpyG as pp
n = pp.TemporalGraph.from_edge_list([('a', 'b', 1), ('b', 'c', 2), ('c', 'a', 3)])
df = pp.io.temporal_graph_to_df(n)
print(df)
write_csv
Stores all edges including edge attributes in a csv file.