PAX2GRAPHML python package documentation¶

pax2graphml.pax_import¶

This module contains functions to manipulate BIOPAX and GRAPHML files.

pax_import.annotation_dict(alias_file)[source]¶

Create a dictionary from an annotation json file:

Parameters: alias_file -- annotation json file
Returns: annotation dictionary
Return type: dict

pax_import.biopax_filter(biopax_file, datasources, output_file='output.owl')[source]¶

Remove datasources from a BIOPAX file. The process uses PAXTOOLS, needs a lot of memory, and can be slow for large BIOPAX files:

Parameters

biopax_file -- input BIOPAX file
datasources -- list of datasources to exclude
output_file -- output BIOPAX file

Returns

void

Return type

None

pax_import.biopax_merge(biopax_list, output_file='output.owl')[source]¶

Merge multiple BIOPAX (RDF/XML) files. The process uses PAXTOOLS, needs a lot of memory, and can be slow for large BIOPAX files:

Parameters

biopax_list -- a list of input BIOPAX files to be merged
output_file -- output BIOPAX file

Returns

void

Return type

None

pax_import.biopax_to_reaction_graph(biopax_file, graphml_file, black_list=None, control_mode=2)[source]¶

Generate a reaction graph with binary interactions as a GRAPHML file from a BIOPAX file. The BIOPAX file is filtered, keeping only the regulation part (metabolism and genes). The process uses PAXTOOLS, needs a lot of memory, and can be slow for large BIOPAX files:

Parameters

biopax_file -- input BIOPAX file
graphml_file -- output GRAPHML reaction file
black_list -- entity black_list (e.g. hubs, h2o...)
control_mode -- control mechanism representation: use the default (2) for the compressed control representation, or 1 for the extended representation with entity duplication

Returns

void

Return type

None

pax_import.influence_graph(input_graph, output_graph, output_image)[source]¶

Generate an influence graph as a graphml file from a checked reaction graphml file:

Parameters

input_graph -- input graphml file containing the checked reaction graph
output_graph -- output graphml file containing the influence graph
output_image -- output png file for graph visualization

Returns

void

Return type

None

pax_import.influence_subgraph(input_graph, output_graph, output_image, min_node, max_node)[source]¶

Generate an influence graph as a graphml file from a checked reaction graphml file. The graph is generated from one connected component:

Parameters

input_graph -- input graphml file containing the checked reaction graph
output_graph -- output graphml file containing the influence graph
output_image -- output png file for graph visualization
min_node -- minimum node count of the connected component
max_node -- maximum node count of the connected component

Returns

void

Return type

None

pax_import.join_annotation(g, alias_file, annot_field, dest_field, default_val, property_type='string')[source]¶

Populate a new node property with values extracted from an annotation json file:

Parameters

g -- a graph
alias_file -- json annotation file
annot_field -- field in json data dictionary to be processed
dest_field -- new property name
default_val -- property default value for None
property_type -- new property type

Returns

annotation dictionary

Return type

dict

pax_import.name_alias(biopax_file, output_file='entities_aliases.json', opt='--uri-ids')[source]¶

Generate a json file with annotations extracted from a BIOPAX file. The process uses PAXTOOLS:

Parameters

biopax_file -- input BIOPAX file
output_file -- output json file
opt -- output generation options

Returns

void

Return type

None

pax_import.prepare_spaim(input_graph, output_graph, output_image, checkInvertP=True)[source]¶

Generate a checked reaction graph from a raw reaction graph:

Parameters

input_graph -- input graphml file containing the raw reaction graph
output_graph -- output graphml file containing the checked reaction graph

Returns

void

Return type

None

pax_import.reaction_to_influence_graph(reaction_graph)[source]¶

Generate an influence graph from a checked reaction graph:

Parameters: reaction_graph -- checked reaction graph
Returns: influence graph
Return type: graph object

pax2graphml.extract¶

This module contains functions to extract subgraphs and connected components from a graphml graph and to assemble such graphs

extract.connected_component_by_annotation(g, targ, annot_keys, add_void=False, extend_to_cc=True)[source]¶

Extract a subgraph built by merging a subset of the connected components of the original graph. Each connected component contains nodes with properties matching the provided values:

Parameters

g -- a graph
targ -- a list of values that must match the values of the properties defined by annot_keys
annot_keys -- a list of properties
add_void -- a boolean defining whether void values are kept. Void values have the string value ""
extend_to_cc -- if True, add all members of any connected component in which at least one node matches a property value defined in targ

Returns

a subgraph

Return type

graph

extract.define_boolean_filter(gr, att, val, usecase=True)[source]¶

Create a boolean filter property:

Parameters

gr -- a graph
att -- an existing property name
val -- a value for the property att. Each node with this property value will be selected by the filter. Optionally, val can be a list of values
usecase -- if False and val is a string, the match is case-insensitive

Returns

a filter as a dictionary (node index, True/False)

Return type

dict
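
To illustrate the filter shape described here (a dictionary mapping node index to True/False), the following minimal sketch uses a plain list of property dictionaries in place of a graph. The `nodes` data and the helper name `boolean_filter_sketch` are hypothetical; this is not the package's implementation, only a sketch of the documented behaviour.

```python
def boolean_filter_sketch(nodes, att, val, usecase=True):
    """Map each node index to True when its `att` value matches `val`."""
    wanted = val if isinstance(val, list) else [val]

    def norm(x):
        # With usecase=False, strings are compared case-insensitively.
        return x.lower() if (not usecase and isinstance(x, str)) else x

    wanted = {norm(v) for v in wanted}
    return {i: norm(props.get(att)) in wanted for i, props in enumerate(nodes)}


# Hypothetical node properties standing in for a graph.
nodes = [{"name": "TP53"}, {"name": "tp53"}, {"name": "MDM2"}]
strict = boolean_filter_sketch(nodes, "name", "TP53")
relaxed = boolean_filter_sketch(nodes, "name", "TP53", usecase=False)
```

A dictionary of this shape is what filter_from_boolean_filter expects as its vfilter argument.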

extract.filter_by_node_attribute(gr, att, val)[source]¶

Create a subgraph whose nodes match a property value:

Parameters

gr -- a graph
att -- an existing property name
val -- a value for the property att. Each node with this property value will be selected by the filter

Returns

a subgraph

Return type

graph

extract.filter_from_boolean_filter(gr, vfilter)[source]¶

Create a subgraph using a filter:

Parameters

gr -- a graph
vfilter -- a filter as a dictionary (node index, True/False)

Returns

a subgraph

Return type

graph

extract.largest_connected_component(g, directed=False)[source]¶

Select the largest connected component:

Parameters: g -- a graph
Returns: a subgraph
Return type: graph

extract.merge_graph(gr1, gr2, properties_list, add_void=False, caseSensitive=True)[source]¶

Merge two graphs. All nodes of both graphs that share the same values for a list of properties are merged:

Parameters

gr1 -- first graph
gr2 -- second graph
properties_list -- a list of node properties
add_void -- if True, nodes with void values are merged
caseSensitive -- if True, the value match is case sensitive

Returns

the merged graph

Return type

graph

extract.merge_node_by_property(gr1, properties_list, add_void=False, caseSensitive=True)[source]¶

Merge all nodes of a graph that share the same values for a list of properties:

Parameters

gr1 -- a graph
properties_list -- a list of node properties
add_void -- if True, nodes with void values are merged
caseSensitive -- if True, the value match is case sensitive

Returns

the modified graph

Return type

graph
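
The grouping step behind this kind of merge can be sketched without a graph library. The example below only finds which node indices would be merged; it assumes that a void value is None or the empty string, and the helper name and the `uniprot` property are hypothetical, not taken from the package.

```python
from collections import defaultdict


def group_nodes_by_properties(nodes, properties_list, add_void=False,
                              case_sensitive=True):
    """Return lists of node indices that share the same property values."""
    groups = defaultdict(list)
    for index, props in enumerate(nodes):
        key = tuple(props.get(name) for name in properties_list)
        # Assumption: a void value is None or the empty string.
        if not add_void and any(v in (None, "") for v in key):
            continue
        if not case_sensitive:
            key = tuple(v.lower() if isinstance(v, str) else v for v in key)
        groups[key].append(index)
    # Only groups with more than one member would actually be merged.
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical node properties standing in for a graph.
nodes = [
    {"uniprot": "P04637"},
    {"uniprot": "p04637"},
    {"uniprot": ""},
    {"uniprot": ""},
    {"uniprot": "Q00987"},
]
sensitive = group_nodes_by_properties(nodes, ["uniprot"])
insensitive = group_nodes_by_properties(nodes, ["uniprot"], case_sensitive=False)
with_void = group_nodes_by_properties(nodes, ["uniprot"], add_void=True)
```

With case-sensitive matching and void values excluded, no two nodes in this sample share a key, so nothing would be merged.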

extract.merge_nodes(gr, first_node, second_node, remove_node=True)[source]¶

Merge two nodes of a graph, preserving edges and properties:

Parameters

gr -- a graph
first_node -- first node to be merged
second_node -- second node to be merged
remove_node -- if True, remove first_node

Returns

the modified graph

Return type

graph

extract.remove_largest_cc(g, directed=False)[source]¶

Remove the largest connected component:

Parameters: g -- a graph
Returns: a subgraph
Return type: graph

extract.sub_graph_by_value(g, targets, annot_keys, add_void=False, void_symbol=None)[source]¶

Extract a subgraph list. Each graph contains nodes with properties matching provided values:

Parameters

g -- a graph
targets -- a list of values that must match the values of the properties defined by annot_keys
annot_keys -- a list of properties
add_void -- a boolean defining whether void values are kept
void_symbol -- void_symbol represents the void value symbol, used if add_void is true

Returns

a list of dictionaries in which the key "subgraph" holds the subgraph

Return type

list

extract.sub_graph_filter(g, iteration_count, central_node, direction=<built-in function all>, node_limit=None, neighbour_count=None)[source]¶

Define a subgraph filter according to the parameters:

Parameters

g -- a graph
iteration_count -- number of iterations
central_node -- selected node id
direction -- edge direction (all,in,out)
node_limit -- maximum node number
neighbour_count -- number of neighbours

Returns

the filter

Return type

dict

extract.subgraph_by_direction(g, iteration_count, chosen_node_id=None, direction='all', node_limit=None, neighbour_count=None)[source]¶

Extract a subgraph according to the parameters:

Parameters

g -- a graph
iteration_count -- number of iterations
chosen_node_id -- selected node id
direction -- edge direction (all,in,out)
node_limit -- maximum node number
neighbour_count -- number of neighbours

Returns

the subgraph

Return type

graph
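
The role of the direction parameter can be sketched on a plain edge list. This example assumes that iteration_count is the number of expansion steps taken from the selected node, which the documentation does not state explicitly; node_limit and neighbour_count are left out, and the helper name `neighbourhood` is hypothetical.

```python
def neighbourhood(edges, start, iteration_count, direction="all"):
    """Collect nodes reachable from `start` in at most `iteration_count` steps."""
    selected = {start}
    frontier = {start}
    for _ in range(iteration_count):
        reached = set()
        for source, target in edges:
            # "out" follows edges forward, "in" backward, "all" both ways.
            if direction in ("out", "all") and source in frontier:
                reached.add(target)
            if direction in ("in", "all") and target in frontier:
                reached.add(source)
        frontier = reached - selected
        selected |= frontier
    return selected


# Hypothetical directed edges: a -> b -> c, and d -> b.
edges = [("a", "b"), ("b", "c"), ("d", "b")]
out_one = neighbourhood(edges, "b", 1, direction="out")
in_one = neighbourhood(edges, "b", 1, direction="in")
all_two = neighbourhood(edges, "a", 2, direction="all")
```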

extract.subgraph_by_node(input_graph, output_graph, nodeid, direction='in', neighbour_count=3)[source]¶

Extract a connected component holding a node specified by node id:

Parameters

input_graph -- input graphml file
output_graph -- output graphml file
nodeid -- selected node id
direction -- edge direction (all,in,out)
neighbour_count -- number of neighbours

Returns

the subgraph

Return type

graph

extract.subgraphs_by_datasource(g, add_void=False, void_symbol='')[source]¶

Extract a subgraph list. Each graph contains nodes with datasource/provider matching input values:

Parameters

g -- a graph
add_void -- a boolean defining if we keep void values
void_symbol -- void_symbol represents the void value symbol, used if add_void is true

Returns

a list of dictionaries in which the key "subgraph" holds the subgraph

Return type

list

pax2graphml.properties¶

This module contains functions to manipulate edge and node properties

properties.annot_edge_from_file(g, annot_file, map_key, new_prop, new_prop_type='string', delimiter=',')[source]¶

Populate the edges with a new property. The values of the property are extracted from a tabular file:

Parameters

g -- a graph
annot_file -- the tabular annotation file
map_key -- the edge property holding the primary key that must be present as a named column in the file. 'index' references the edge index (from 0)
new_prop -- the new property to be created that must be present as a named column in the file
new_prop_type -- type of the new property ('string','int', 'float', 'long','bool')
delimiter -- tabular file delimiter

Return type

void

properties.annot_edge_to_file(g, output_prop_file, key_prop, annot_prop, defval=None, excluded_keys=[None, ''], delimiter=',')[source]¶

Export two properties to a tabular file. The first property acts as a key to identify the edge, the second as an additional annotation attribute. The uniqueness of the key is not tested:

Parameters

g -- a graph
output_prop_file -- the tabular output file
key_prop -- the key edge property that will be present as a named column in the file. 'index' references the edge index (from 0)
annot_prop -- the additional property to be exported as a named column in the file
defval -- value to replace None values in output
excluded_keys -- list of key_prop values that will be excluded
delimiter -- tabular file delimiter

Return type

void

properties.annot_node_from_file(g, annot_file, map_key, new_prop, new_prop_type='string', delimiter=',')[source]¶

Populate the nodes with a new property. The values of the property are extracted from a tabular file:

Parameters

g -- a graph
annot_file -- the tabular annotation file
map_key -- the node property holding the primary key that must be present as a named column in the file. 'index' references the node index (from 0)
new_prop -- the new property to be created that must be present as a named column in the file
new_prop_type -- type of the new property ('string','int', 'float', 'long','bool')
delimiter -- tabular file delimiter

Return type

void

properties.annot_node_to_file(g, output_prop_file, key_prop, annot_prop, defval=None, excluded_keys=[None, ''], delimiter=',')[source]¶

Export two properties to a tabular file. The first property acts as a key to identify the node, the second as an additional annotation attribute. The uniqueness of the key is not tested:

Parameters

g -- a graph
output_prop_file -- the tabular output file
key_prop -- the key node property that will be present as a named column in the file. 'index' references the node index (from 0)
annot_prop -- the additional property to be exported as a named column in the file
defval -- value to replace None values in output
excluded_keys -- list of key_prop values that will be excluded
delimiter -- tabular file delimiter

Return type

void
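
The tabular layout described above (two named columns, a default for None values, excluded keys) can be sketched with the standard csv module. This sketch writes to a string instead of a file so that it runs anywhere. The helper name and the `uri` and `biopaxType` property names are hypothetical, and the code is not the package's implementation.

```python
import csv
import io


def export_annotation(nodes, key_prop, annot_prop, defval=None,
                      excluded_keys=(None, ""), delimiter=","):
    """Write two properties as named columns and return the text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow([key_prop, annot_prop])
    for props in nodes:
        key = props.get(key_prop)
        if key in excluded_keys:
            continue  # rows whose key is excluded are skipped
        value = props.get(annot_prop)
        writer.writerow([key, defval if value is None else value])
    return buffer.getvalue()


# Hypothetical node properties standing in for a graph.
nodes = [
    {"uri": "n1", "biopaxType": "Protein"},
    {"uri": "", "biopaxType": "Complex"},
    {"uri": "n3", "biopaxType": None},
]
text = export_annotation(nodes, "uri", "biopaxType", defval="NA")
```

As in the documented function, nothing here checks that the key column is unique.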

properties.change_property_type(g, property_name, property_type, entity='node')[source]¶

Change a node or edge property type:

Parameters

g -- a graph
property_name -- the name of the property to be affected
property_type -- the new primitive property type (string, int, bool, float, double); if None, the initial property is replaced
entity -- defines whether the property belongs to nodes or edges

Returns

void

properties.client_annot_impl(prot, conf=None)[source]¶

Configure a mart for Uniprot to GO annotation:

Parameters

prot -- list of Uniprot gene symbols
conf -- the configuration dictionary

Returns

a dictionary of annotations

Return type

dict

properties.copy_edge_properties(g, source_edge, target_edge)[source]¶

Copy all properties of a source edge to a target edge:

Parameters

g -- a graph
source_edge -- source edge
target_edge -- target edge

Return type

void

properties.copy_node_properties(g, sourceNode, targetNode)[source]¶

Copy all properties of a source node to a target node:

Parameters

g -- a graph
sourceNode -- source node
targetNode -- target node

Return type

void

properties.count_edges_by_values(gr, att)[source]¶

Count edges for each value of an input property:

Parameters

gr -- a graph
att -- an existing property name

Returns

a dictionary (property value/count)

Return type

dict

properties.count_nodes_by_values(gr, att)[source]¶

Count nodes for each value of an input property:

Parameters

gr -- a graph
att -- an existing property name

Returns

a dictionary (property value/count)

Return type

dict
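
The returned structure (property value mapped to a count) is what collections.Counter produces. In this sketch a list of property dictionaries stands in for the graph nodes, and the `biopaxType` property name is hypothetical.

```python
from collections import Counter

# Hypothetical node properties standing in for a graph.
nodes = [
    {"biopaxType": "Protein"},
    {"biopaxType": "Protein"},
    {"biopaxType": "SmallMolecule"},
]

# Count how many nodes carry each value of the chosen property.
counts = dict(Counter(props.get("biopaxType") for props in nodes))
```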

properties.create_property_from_map(g, annot_map, primary_key, new_property, case_sensitive=False)[source]¶

Create a new node property from a dictionary:

Parameters

g -- a graph
annot_map -- a dictionary
primary_key -- the primary key property (e.g. uri, uniprot...)
new_property -- the new property name. The expected type is 'object'
case_sensitive -- define if the primary key mapping is case sensitive or not

Returns

void

properties.defaultNodeValue(gr, prop, default_val)[source]¶

Assign a user-defined value to a node property when it is None or equal to "":

Parameters

gr -- a graph
prop -- an existing property name
default_val -- the value to be used to replace None and "" string

Returns

void

Return type

None

properties.default_edge_value(gr, prop, default_val)[source]¶

Assign a user-defined value to an edge property when it is None or equal to "":

Parameters

gr -- a graph
prop -- an existing property name
default_val -- the value to be used to replace None and "" string

Returns

void

Return type

None

properties.define_biomart_server(url, mart_name)[source]¶

Define a biomart server:

Parameters

url -- the url of the biomart server
mart_name -- the mart name

Returns

mart

Return type

mart Object

properties.describe_properties(g, name=None)[source]¶

Return a description of node and edge properties with names and types:

Parameters

g -- a graph
name -- property name (optional). If None, all properties are described

Returns

a description of edge and node properties

Return type

string

properties.edge_property_values(g, annot_key)[source]¶

Return a list of unique values corresponding to an existing edge property:

Parameters

g -- a graph
annot_key -- an existing property name

Returns

a list of edge property values

Return type

list

properties.ensembl_api(in_list, conf=None, chunck_size=50)[source]¶

Configure a mart for any annotation:

Parameters

in_list -- list of input identifiers
conf -- the configuration dictionary
chunck_size -- the size of each chunk of inputs to be submitted at one time

Returns

a dictionary of annotations

Return type

dict

properties.is_unique(g, key_prop, exclude_void=True)[source]¶

Evaluate if a property contains one unique value for each node:

Parameters

g -- a graph
key_prop -- the key node property to be evaluated
exclude_void -- defines whether None values are excluded

Return type

boolean
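
A uniqueness check of this kind can be sketched by comparing the number of values with the number of distinct values. The sketch assumes that a void value means None, as the exclude_void description suggests; the helper name and the `uri` property are hypothetical.

```python
def is_unique_sketch(nodes, key_prop, exclude_void=True):
    """Return True when no two nodes share the same `key_prop` value."""
    values = [props.get(key_prop) for props in nodes]
    if exclude_void:
        # Assumption: void means None; such values are ignored.
        values = [v for v in values if v is not None]
    return len(values) == len(set(values))


# Hypothetical node properties: two nodes have no uri at all.
nodes = [{"uri": "n1"}, {"uri": "n2"}, {"uri": None}, {"uri": None}]
ignoring_void = is_unique_sketch(nodes, "uri")
counting_void = is_unique_sketch(nodes, "uri", exclude_void=False)
```

A check like this is useful before exporting with annot_node_to_file, which does not test the uniqueness of its key.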

properties.list_to_string_property(g, string_prop, new_property=None, sep=';', entity='node')[source]¶

Convert a property containing a list to a concatenated string property, for each node or edge:

Parameters

g -- a graph
string_prop -- initial property
new_property -- new property name; if None, the initial property is replaced
sep -- string separator used in the string property
entity -- defines whether the properties belong to nodes or edges

Returns

void

properties.node_property_values(g, annot_key)[source]¶

Return a list of unique values corresponding to an existing node property:

Parameters

g -- a graph
annot_key -- an existing property name

Returns

a list of node property values

Return type

list

properties.populate_color(g, colors=None)[source]¶

Define the node colors:

Parameters

g -- a graph
color -- the target property name
colors -- optional dictionary of existing values (dict keys) associated with the new values (dict values)

Returns

count of modified nodes

properties.populate_shape(g, shapes=None)[source]¶

Define the node shapes:

Parameters

g -- a graph
color -- the target property name
shapes -- optional dictionary of existing values (dict keys) associated with the new values (dict values)

Returns

count of modified nodes

properties.property_values(g, annot_key)[source]¶

Alias of node_property_values

properties.replace_property_values(g, prop_name, map_values, entity_type='node')[source]¶

Replace the values of a property by the specified values:

Parameters

g -- a graph
prop_name -- the target property name
map_values -- dictionary of existing values (dict keys) associated with the new values (dict values)
entity_type -- related entity type, "node" for node, "edge" for edge

Returns

count of modified entities
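
The mapping semantics (existing values as dict keys, new values as dict values, a count returned) can be sketched on plain dictionaries. The example reuses the spaim codes and labels listed for graph_explore.color_edges; the helper name and the sample edges are hypothetical, and the code is not the package's implementation.

```python
def replace_values(entities, prop_name, map_values):
    """Replace mapped values in place and return how many entities changed."""
    modified = 0
    for props in entities:
        current = props.get(prop_name)
        if current in map_values:
            props[prop_name] = map_values[current]
            modified += 1
    return modified


# Hypothetical edge properties; "x" has no entry in the mapping.
edges = [{"spaim": "s"}, {"spaim": "i"}, {"spaim": "x"}]
labels = {"s": "substrate", "i": "inhibitor"}
changed = replace_values(edges, "spaim", labels)
```

Values with no entry in the mapping are left as they are.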

properties.string_to_list_property(g, string_prop, new_property=None, sep=';', entity='node')[source]¶

Convert a string property to a property containing a list, for each node or edge:

Parameters

g -- a graph
string_prop -- initial property
new_property -- new property name; if None, the initial property is replaced
sep -- string separator used in the string property
entity -- defines whether the properties belong to nodes or edges

Returns

void
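
This conversion and its inverse, list_to_string_property, amount to a split and a join on the separator. The sketch below works on a single property dictionary; the helper names and the `go` property are hypothetical, and treating an empty or missing string as an empty list is an assumption.

```python
def string_to_list(props, string_prop, new_property=None, sep=";"):
    """Split a separator-delimited string into a list."""
    target = string_prop if new_property is None else new_property
    value = props.get(string_prop)
    # Assumption: an empty or missing string becomes an empty list.
    props[target] = value.split(sep) if value else []


def list_to_string(props, list_prop, new_property=None, sep=";"):
    """Join a list back into a single separator-delimited string."""
    target = list_prop if new_property is None else new_property
    props[target] = sep.join(props.get(list_prop) or [])


# Hypothetical node carrying two GO identifiers in one string.
node = {"go": "GO:0006915;GO:0008283"}
string_to_list(node, "go", new_property="go_list")
list_to_string(node, "go_list", new_property="go_joined")
```

When new_property is None, both helpers overwrite the initial property, matching the documented default.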

properties.uniprot_to_go(protein_list, conf=None, chunck_size=50)[source]¶

Configure a mart for Uniprot to GO annotation:

Parameters

protein_list -- list of Uniprot gene symbols
conf -- the configuration dictionary
chunck_size -- the size of each chunk of inputs to be submitted at one time

Returns

a dictionary of annotations

Return type

dict

pax2graphml.graph_explore¶

This module contains functions to read and write graphml and compute topological statistics on graphs

graph_explore.color_edges(g)[source]¶

Generate an edge color property that differentiates the edge semantics (substrate, product, activator, inhibitor, modulator) using the spaim edge property (s,p,a,i,m):

Parameters: g -- a graph instance
Returns: void
Return type: None

graph_explore.color_nodes(g)[source]¶

Generate a node color property that differentiates the node entities (reaction, chemical):

Parameters: g -- a graph instance
Returns: void
Return type: None

graph_explore.compute_betweenness(g)[source]¶

Compute the graph Betweenness:

Parameters: g -- a graph instance
Returns: dictionary holding the metrics data
Return type: dict

graph_explore.compute_closeness(g)[source]¶

Compute the graph Closeness:

Parameters: g -- a graph instance
Returns: dictionary holding the metrics data
Return type: dict

graph_explore.compute_graph_metrics(g)[source]¶

Compute multiple topological graph metrics (degree distribution, betweenness, pagerank, closeness):

Parameters: g -- a graph instance
Returns: dictionary holding the metrics data
Return type: dict

graph_explore.compute_page_rank(g)[source]¶

Compute the graph PageRank:

Parameters: g -- a graph instance
Returns: dictionary holding the metrics data
Return type: dict

graph_explore.degree_distribution(g)[source]¶

Generate the distribution of degrees of the nodes of the graph:

Parameters: g -- a graph instance
Returns: distribution of the degrees of the nodes
Return type: DataFrame object
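
A degree distribution counts how many nodes have each degree. The sketch below computes it from a plain edge list and returns a dictionary rather than the DataFrame that the documented function returns. Using the total degree (in plus out) is an assumption, and the helper name is hypothetical.

```python
from collections import Counter


def degree_distribution_sketch(node_ids, edges):
    """Map each degree to the number of nodes having that degree."""
    degree = {n: 0 for n in node_ids}
    for source, target in edges:
        # Assumption: total degree, counting both edge ends.
        degree[source] += 1
        degree[target] += 1
    return dict(sorted(Counter(degree.values()).items()))


# Hypothetical star graph: "a" is linked to three leaves.
edges = [("a", "b"), ("a", "c"), ("a", "d")]
dist = degree_distribution_sketch(["a", "b", "c", "d"], edges)
```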

graph_explore.describe_graph(g)[source]¶

Return a string describing the graph, with all edges and nodes and their property values:

Parameters: g -- a graph
Return type: string

graph_explore.graphml_xml_string(graphml_file, ids=1, entity='node')[source]¶

Return an XML extract of the graphml file:

Parameters

graphml_file -- graphml file path
ids -- an integer or list of integers that correspond to the id attribute values of the selected entities
entity -- "edge" or "node", defining which entity should be selected

Returns

an XML string

Return type

string

graph_explore.largest_cc_degree_dist(g)[source]¶

Generate the distribution of degrees of the nodes of the largest connected component:

Parameters: g -- a graph instance
Returns: distribution of the degrees of the nodes
Return type: DataFrame object

graph_explore.load_graphml(graphml_file, directed=True)[source]¶

Return a graph instance from a GRAPHML file :

Parameters

graphml_file -- a graphml file
directed -- a boolean that defines if the edges of the graph are oriented

Returns

graph

Return type

graph object

graph_explore.save_graphml(g, graphml_file, friendly=False)[source]¶

Save a graph instance as a graphml file:

Parameters

g -- a graph instance
graphml_file -- graphml output file path

Returns

void

Return type

None

graph_explore.save_image(g, image_file, size=3000, conf=None)[source]¶

Generate an image from a graph instance :

Parameters

g -- a graph instance
image_file -- png file path
size -- image size
conf -- image configuration dictionary with nodelabel and edgelabel keys

Returns

void

Return type

None

graph_explore.save_yed_graphml(g, graphmlOutFile)[source]¶

Save a graphml file enriched with graphics, for display in the yEd editor:

Parameters

g -- a graph instance
graphmlOutFile -- output graphml file

Returns

void

Return type

None

graph_explore.summary(g)[source]¶

Return a string with the node count and edge count of the graph:

Parameters: g -- a graph
Return type: string

pax2graphml.utils¶

This module contains utility functions related to graph and file manipulation, package management and execution.

utils.cc_by_node_count(g, min, max)[source]¶

Select a subgraph from a graph, using the minimum and maximum node count of each connected component as a filter:

Parameters

g -- a graph
min -- minimum node count of each connected component
max -- maximum node count of each connected component

Returns

a subgraph

Return type

graph
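
Filtering connected components by size can be sketched on an undirected view of a plain edge list. The example assumes that both bounds are inclusive, which the documentation does not state. The helper names are hypothetical, and the function returns the kept node set rather than a graph object.

```python
def components(node_ids, edges):
    """Return the connected components as a list of node sets."""
    neighbours = {n: set() for n in node_ids}
    for source, target in edges:
        neighbours[source].add(target)
        neighbours[target].add(source)
    seen, result = set(), []
    for start in node_ids:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first traversal from `start`
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        result.append(component)
    return result


def keep_by_node_count(node_ids, edges, min_count, max_count):
    """Keep the nodes of components whose size lies within the bounds."""
    kept = set()
    for component in components(node_ids, edges):
        # Assumption: both bounds are inclusive.
        if min_count <= len(component) <= max_count:
            kept |= component
    return kept


# Hypothetical graph with components of size 3, 2 and 1.
node_ids = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (4, 5)]
kept = keep_by_node_count(node_ids, edges, 2, 2)
```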

utils.color_range_hexa(color_number=20)[source]¶

utils.count_edge(n, mode='all')[source]¶

Compute the edge count of a selected node:

Parameters

n -- graph node
mode -- count mode. Values: "all", "in", "out"

Returns

the edge count

Return type

int

utils.data_path()[source]¶

Return the path of the data folder holding the example datasets

Returns: a string representing the data folder path containing example data files like BIOPAX

utils.defineXmx(xmx)[source]¶

Redefine the xmx java parameter for large BIOPAX file processing

utils.edge_description(g, e)[source]¶

Return a string giving all details of an edge, including the source and target descriptions

Returns: a string

utils.edge_list(g)[source]¶

Return a simple list of all edges of a graph (without iterator)

Returns: a list of edges

utils.edge_to_string(gh, e, sep='\n')[source]¶

Return a string representing all the property values of an edge

Returns: a string

utils.friendly_format_graphml(graphml_file, usetemp=False)[source]¶

Modify a graphml file in place to give its property data keys more human-readable names

Parameters: graphml_file -- a graphml file following the graph-tool generation rules

utils.node_list(g)[source]¶

Return a simple list of all nodes of a graph (without iterator)

Returns: a list of nodes

utils.node_shape_to_color(code, colors)[source]¶

Convert the biopax type numeric code, as defined in the shape node property, to a yEd-compatible shape name

utils.node_to_string(gh, n, sep='\n')[source]¶

Return a string representing all the property values of a node

Returns: a string

utils.resource_path()[source]¶

Return the resources path

Returns: a string representing the resource path containing additional files like template and jar

utils.spaim_edge_label(code)[source]¶

Convert a spaim code, as defined in the spaim edge property, to a human-readable label

utils.to_string(gh, n, sep='\n')[source]¶

Alias of node_to_string

Returns: a string
