Helper to explore and describe the BIOPAX model¶

The core of Biopax Explorer is an object-oriented Python implementation of the BIOPAX model.

In [1]:
# Import the documentation helper, the class-hierarchy utilities, and the BioSource class
from biopax_explorer.biopax.doc import helper
from biopax_explorer.biopax.utils import gen_utils as gu
from biopax_explorer.biopax import BioSource

Describe the BioSource class¶

In [2]:
dh1=helper.select('BioSource')
print(dh1.classInfo())
Definition: The biological source (organism, tissue or cell type) of an Entity. 

Usage: Some entities are considered source-neutral (e.g. small molecules), and the biological source of others can be deduced from their constituentss (e.g. complex, pathway).

Instances: HeLa cells, Homo sapiens, and mouse liver tissue.
    

Describe the PhysicalEntity class¶

In [3]:
dh2=helper.select('PhysicalEntity')
print(dh2.attributeNames())
['cellularLocation', 'feature', 'memberPhysicalEntity', 'notFeature', 'dataSource', 'evidence', 'xref', 'availability', 'comment', 'displayName', 'name', 'standardName']

Describe the attribute cellType of the BioSource class¶

In [4]:
print(dh1.attributeInfo("cellType"))
A cell type, e.g. 'HeLa'. This should reference a term in a controlled vocabulary of cell types. Best practice is to refer to OBO Cell Ontology. http://www.obofoundry.org/cgi-bin/detail.cgi?id=cell
    

List the attributes of the BioSource class¶

  • attributes whose values are objects of the model (Object)
  • attributes of a simple type (str, float, ...)
In [5]:
dh1=helper.select('BioSource')

print("----------------attributes of BioSource----------------------------")
print("objects")
print(dh1.objectAttributeNames())
print("simple")
print(dh1.typeAttributeNames())
print("----attribute_type_by_name-----")
for attn, tp in BioSource().attribute_type_by_name().items():
    print("   %s : %s" % (attn, tp))

print("--------------------------------------------")
print("information related to its 'cellType' attribute:")
attn="cellType"
print(attn," type:", dh1.attributeType(attn))


print(dh1.attributeInfo(attn))
----------------attributes of BioSource----------------------------
objects
['cellType', 'tissue', 'xref']
simple
['displayName', 'name', 'standardName', 'comment']
----attribute_type_by_name-----
   comment : str
   cellType : CellVocabulary
   tissue : TissueVocabulary
   xref : Xref
   displayName : str
   name : str
   standardName : str
--------------------------------------------
information related to its 'cellType' attribute:
cellType  type: CellVocabulary

A cell type, e.g. 'HeLa'. This should reference a term in a controlled vocabulary of cell types. Best practice is to refer to OBO Cell Ontology. http://www.obofoundry.org/cgi-bin/detail.cgi?id=cell
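
The split between object and simple attributes can also be derived from the name-to-type mapping alone. The following is a minimal sketch, assuming the mapping has been copied into a plain dictionary of type names (transcribed from the output above). `SIMPLE_TYPES` and `split_attributes` are hypothetical names used for illustration, not part of Biopax Explorer.

```python
# Attribute types of BioSource, transcribed from the output above.
attribute_types = {
    "comment": "str",
    "cellType": "CellVocabulary",
    "tissue": "TissueVocabulary",
    "xref": "Xref",
    "displayName": "str",
    "name": "str",
    "standardName": "str",
}

# Assumption: these type names are the ones treated as "simple".
SIMPLE_TYPES = {"str", "int", "float", "bool"}


def split_attributes(types):
    """Return (object_attributes, simple_attributes), each sorted by name."""
    objects = sorted(n for n, t in types.items() if t not in SIMPLE_TYPES)
    simple = sorted(n for n, t in types.items() if t in SIMPLE_TYPES)
    return objects, simple


object_attrs, simple_attrs = split_attributes(attribute_types)
print("objects:", object_attrs)
print("simple:", simple_attrs)
```

The two lists contain the same names as `objectAttributeNames()` and `typeAttributeNames()` above, although sorted alphabetically rather than in the library's order.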
    

Show all the classes of the model and their children classes (inheritance)¶

In [6]:
for cl, children in gu.classes_children().items():
    print(cl, "=>", children)
EntityReference => ['RnaRegionReference', 'SmallMoleculeReference', 'DnaReference', 'DnaRegionReference', 'ProteinReference', 'RnaReference']
PathwayStep => ['BiochemicalPathwayStep']
Degradation => []
CellVocabulary => []
ModificationFeature => ['CovalentBindingFeature']
DnaRegion => []
Pathway => []
TemplateReaction => []
TransportWithBiochemicalReaction => []
TemplateReactionRegulation => []
PhysicalEntity => ['SmallMolecule', 'DnaRegion', 'RnaRegion', 'Dna', 'Rna', 'Protein', 'Complex']
TissueVocabulary => []
Conversion => ['TransportWithBiochemicalReaction', 'ComplexAssembly', 'BiochemicalReaction', 'Transport', 'Degradation']
PhenotypeVocabulary => []
CovalentBindingFeature => []
SequenceInterval => []
ChemicalStructure => []
RnaRegionReference => []
Evidence => []
RnaRegion => []
ProteinReference => []
Xref => ['RelationshipXref', 'PublicationXref', 'UnificationXref']
EvidenceCodeVocabulary => []
BiochemicalPathwayStep => []
EntityReferenceTypeVocabulary => []
KPrime => []
PublicationXref => []
MolecularInteraction => []
Catalysis => []
Dna => []
SequenceRegionVocabulary => []
SequenceModificationVocabulary => []
SmallMolecule => []
ControlledVocabulary => ['ExperimentalFormVocabulary', 'TissueVocabulary', 'EvidenceCodeVocabulary', 'CellVocabulary', 'SequenceRegionVocabulary', 'CellularLocationVocabulary', 'SequenceModificationVocabulary', 'RelationshipTypeVocabulary', 'PhenotypeVocabulary', 'InteractionVocabulary', 'EntityReferenceTypeVocabulary']
Stoichiometry => []
Gene => []
DnaRegionReference => []
RelationshipTypeVocabulary => []
GeneticInteraction => []
ExperimentalFormVocabulary => []
SequenceSite => []
CellularLocationVocabulary => []
Provenance => []
Protein => []
DnaReference => []
DeltaG => []
InteractionVocabulary => []
SmallMoleculeReference => []
Complex => []
Transport => ['TransportWithBiochemicalReaction']
Interaction => ['Control', 'TransportWithBiochemicalReaction', 'Modulation', 'Conversion', 'ComplexAssembly', 'BiochemicalReaction', 'Catalysis', 'Transport', 'TemplateReaction', 'TemplateReactionRegulation', 'GeneticInteraction', 'Degradation', 'MolecularInteraction']
Rna => []
ExperimentalForm => []
BioSource => []
RnaReference => []
EntityFeature => ['BindingFeature', 'FragmentFeature', 'ModificationFeature', 'CovalentBindingFeature']
ComplexAssembly => []
Score => []
BiochemicalReaction => ['TransportWithBiochemicalReaction']
RelationshipXref => []
BindingFeature => ['CovalentBindingFeature']
FragmentFeature => []
Modulation => []
SequenceLocation => ['SequenceSite', 'SequenceInterval']
Control => ['Modulation', 'Catalysis', 'TemplateReactionRegulation']
UnificationXref => []
Entity => ['Modulation', 'Dna', 'Rna', 'Catalysis', 'GeneticInteraction', 'Degradation', 'MolecularInteraction', 'DnaRegion', 'TemplateReactionRegulation', 'Protein', 'Transport', 'PhysicalEntity', 'Control', 'Gene', 'SmallMolecule', 'BiochemicalReaction', 'TemplateReaction', 'Complex', 'Interaction', 'Pathway', 'TransportWithBiochemicalReaction', 'Conversion', 'RnaRegion', 'ComplexAssembly']
UtilityClass => ['TissueVocabulary', 'ModificationFeature', 'EvidenceCodeVocabulary', 'SequenceRegionVocabulary', 'CellularLocationVocabulary', 'SequenceInterval', 'SequenceLocation', 'Provenance', 'Xref', 'Stoichiometry', 'RelationshipTypeVocabulary', 'DnaReference', 'RelationshipXref', 'PathwayStep', 'BioSource', 'FragmentFeature', 'DnaRegionReference', 'UnificationXref', 'RnaReference', 'EntityFeature', 'RnaRegionReference', 'KPrime', 'SmallMoleculeReference', 'CellVocabulary', 'InteractionVocabulary', 'SequenceSite', 'BiochemicalPathwayStep', 'SequenceModificationVocabulary', 'ChemicalStructure', 'ExperimentalFormVocabulary', 'DeltaG', 'ControlledVocabulary', 'PublicationXref', 'CovalentBindingFeature', 'Score', 'BindingFeature', 'ExperimentalForm', 'EntityReference', 'PhenotypeVocabulary', 'ProteinReference', 'EntityReferenceTypeVocabulary']
In [7]:
print("--------------------")
print(gu.class_children('Interaction'))
print("--------------------")
print(gu.class_children('Conversion'))
print("--------------------")
print(gu.class_children('Entity'))
 
 
--------------------
['Control', 'TransportWithBiochemicalReaction', 'Modulation', 'Conversion', 'ComplexAssembly', 'BiochemicalReaction', 'Catalysis', 'Transport', 'TemplateReaction', 'TemplateReactionRegulation', 'GeneticInteraction', 'Degradation', 'MolecularInteraction']
--------------------
['TransportWithBiochemicalReaction', 'ComplexAssembly', 'BiochemicalReaction', 'Transport', 'Degradation']
--------------------
['Modulation', 'Dna', 'Rna', 'Catalysis', 'GeneticInteraction', 'Degradation', 'MolecularInteraction', 'DnaRegion', 'TemplateReactionRegulation', 'Protein', 'Transport', 'PhysicalEntity', 'Control', 'Gene', 'SmallMolecule', 'BiochemicalReaction', 'TemplateReaction', 'Complex', 'Interaction', 'Pathway', 'TransportWithBiochemicalReaction', 'Conversion', 'RnaRegion', 'ComplexAssembly']
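
The lists above contain every descendant of a class, not only its immediate subclasses: `TransportWithBiochemicalReaction` appears under `Conversion` even though it is also listed under `Transport` and `BiochemicalReaction`. The following is a minimal sketch of how the direct subclasses could be recovered, using an excerpt of the output above copied into a plain dictionary. `direct_children` is a hypothetical helper, not a Biopax Explorer function.

```python
# Excerpt of the descendant lists printed above.
descendants = {
    "Conversion": [
        "TransportWithBiochemicalReaction",
        "ComplexAssembly",
        "BiochemicalReaction",
        "Transport",
        "Degradation",
    ],
    "Transport": ["TransportWithBiochemicalReaction"],
    "BiochemicalReaction": ["TransportWithBiochemicalReaction"],
    "ComplexAssembly": [],
    "Degradation": [],
    "TransportWithBiochemicalReaction": [],
}


def direct_children(cls, descendants):
    """Keep the descendants of cls that are not below another descendant."""
    below = descendants[cls]
    # Everything reachable through one of the descendants is indirect.
    indirect = {d for child in below for d in descendants.get(child, [])}
    return sorted(c for c in below if c not in indirect)


conversion_children = direct_children("Conversion", descendants)
print(conversion_children)
```

On this excerpt, `TransportWithBiochemicalReaction` is filtered out because it is reachable through `Transport` and `BiochemicalReaction`.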

Display the parent classes of a class in the BIOPAX model¶

In [8]:
st=gu.parentTree()
print(st['TransportWithBiochemicalReaction'])   
['Conversion', 'Transport', 'Interaction', 'BiochemicalReaction', 'Entity']
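
The same parent information is implied by the children mapping: the ancestors of a class are the classes whose descendant list contains it. The following is a minimal sketch on an abridged excerpt of the `classes_children()` output shown earlier, copied into a plain dictionary. `ancestors` is a hypothetical helper, and its result is sorted, so the order differs from `parentTree()`.

```python
# Abridged excerpt of the classes_children() output (lists are shortened).
descendants = {
    "Entity": [
        "Interaction",
        "Conversion",
        "Transport",
        "BiochemicalReaction",
        "TransportWithBiochemicalReaction",
        "Pathway",
    ],
    "Interaction": [
        "Conversion",
        "Transport",
        "BiochemicalReaction",
        "TransportWithBiochemicalReaction",
    ],
    "Conversion": [
        "Transport",
        "BiochemicalReaction",
        "TransportWithBiochemicalReaction",
    ],
    "Transport": ["TransportWithBiochemicalReaction"],
    "BiochemicalReaction": ["TransportWithBiochemicalReaction"],
    "Pathway": [],
    "TransportWithBiochemicalReaction": [],
}


def ancestors(cls, descendants):
    """Return every class whose descendant list contains cls, sorted by name."""
    return sorted(parent for parent, below in descendants.items() if cls in below)


twbr_parents = ancestors("TransportWithBiochemicalReaction", descendants)
print(twbr_parents)
```

The result contains the same five class names as the `parentTree()` entry above.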

Display the description of the PhysicalEntity class¶

In [9]:
s=helper.describe('PhysicalEntity')
print(s)
********************
PhysicalEntity

Definition: A pool of molecules or molecular complexes. 

Comments: Each PhysicalEntity is defined by a  sequence or structure based on an EntityReference AND any set of Features that are given. For example,  ser46 phosphorylated p53 is a physical entity in BioPAX defined by the p53 sequence and the phosphorylation feature on the serine at position 46 in the sequence.  Features are any combination of cellular location, covalent and non-covalent bonds with other molecules and covalent modifications.  

For a specific molecule to be a member of the pool it has to satisfy all of the specified features. Unspecified features are treated as unknowns or unneccesary. Features that are known to not be on the molecules should be explicitly stated with the "not feature" property. 
A physical entity in BioPAX  never represents a specific molecular instance. 

Physical Entity can be heterogenous and potentially overlap, i.e. a single molecule can be counted as a member of multiple pools. This makes BioPAX semantics different than regular chemical notation but is necessary for dealing with combinatorial complexity. 

Synonyms: part, interactor, object, species

Examples: extracellular calcium, ser 64 phosphorylated p53
    
--------------------
primitive type attributes:
----------
availability (str): 

 Describes the availability of this data (e.g. a copyright statement).
----------
comment (str): 

 Comment on the data in the container class. This property should be
used instead of the OWL documentation elements (rdfs:comment) for
instances because information in 'comment' is data to be exchanged,
whereas the rdfs:comment field is used for metadata about the
structure of the BioPAX ontology.
----------
displayName (str): 

 An abbreviated name for this entity, preferably a name that is short
enough to be used in a visualization application to label a graphical
element that represents this entity. If no short name is available, an
xref may be used for this purpose by the visualization application.
Warning:  Subproperties of name are functional, that is we expect to
have only one standardName and shortName for a given entity. If a user
decides to assign a different name to standardName or shortName, they
have to remove the old triplet from the model too. If the old name
should be retained as a synonym a regular "name" property should also
be introduced with the old name.
----------
name (str): 

 Synonyms for this entity.  standardName and shortName are
subproperties of this property and if declared they are automatically
considered as names.   Warning:  Subproperties of name are functional,
that is we expect to have only one standardName and shortName for a
given entity. If a user decides to assign a different name to
standardName or shortName, they have to remove the old triplet from
the model too. If the old name should be retained as a synonym a
regular "name" property should also be introduced with the old name.
----------
standardName (str): 

 The preferred full name for this entity, if exists assigned by a
standard nomenclature organization such as HUGO Gene Nomenclature
Committee.  Warning:  Subproperties of name are functional, that is we
expect to have only one standardName and shortName for a given entity.
If a user decides to assign a different name to standardName or
shortName, they have to remove the old triplet from the model too. If
the old name should be retained as a synonym a regular "name" property
should also be introduced with the old name.
--------------------
object attributes:
----------
cellularLocation (CellularLocationVocabulary): 

 A cellular location, e.g. 'cytoplasm'. This should reference a term
in the Gene Ontology Cellular Component ontology. The location
referred to by this property should be as specific as is known. If an
interaction is known to occur in multiple locations, separate
interactions (and physicalEntities) must be created for each different
location.  If the location of a participant in a complex is
unspecified, it may be assumed to be the same location as that of the
complex.    A molecule in two different cellular locations are
considered two different physical entities.
----------
feature (EntityFeature): 

 Sequence features of the owner physical entity.
----------
memberPhysicalEntity (PhysicalEntity): 

 This property stores the members of a generic physical entity.   For
representing homology generics a better way is to use generic entity
references and generic features. However not all generic logic can be
captured by this, such as complex generics or rare cases where feature
cardinality is variable. Usages of this property should be limited to
such cases.
----------
notFeature (EntityFeature): 

 Sequence features where the owner physical entity has a feature. If
not specified, other potential features are not known.
----------
dataSource (Provenance): 

 A free text description of the source of this data, e.g. a database
or person name. This property should be used to describe the source of
the data. This is meant to be used by databases that export their data
to the BioPAX format or by systems that are integrating data from
multiple sources. The granularity of use (specifying the data source
in many or few instances) is up to the user. It is intended that this
property report the last data source, not all data sources that the
data has passed through from creation.
----------
evidence (Evidence): 

 Scientific evidence supporting the existence of the entity as
described.
----------
xref (Xref): 

 Values of this property define external cross-references from this
entity to entities in external databases.
********************

All helper entries¶

In [10]:
print(helper.entries())
['EntityReference', 'PathwayStep', 'Degradation', 'CellVocabulary', 'ModificationFeature', 'DnaRegion', 'Pathway', 'TemplateReaction', 'TransportWithBiochemicalReaction', 'TemplateReactionRegulation', 'PhysicalEntity', 'TissueVocabulary', 'Conversion', 'PhenotypeVocabulary', 'CovalentBindingFeature', 'SequenceInterval', 'ChemicalStructure', 'RnaRegionReference', 'Evidence', 'RnaRegion', 'ProteinReference', 'Xref', 'EvidenceCodeVocabulary', 'BiochemicalPathwayStep', 'EntityReferenceTypeVocabulary', 'KPrime', 'PublicationXref', 'MolecularInteraction', 'Catalysis', 'Dna', 'SequenceRegionVocabulary', 'SequenceModificationVocabulary', 'SmallMolecule', 'ControlledVocabulary', 'Stoichiometry', 'Gene', 'DnaRegionReference', 'RelationshipTypeVocabulary', 'GeneticInteraction', 'ExperimentalFormVocabulary', 'SequenceSite', 'CellularLocationVocabulary', 'Provenance', 'Protein', 'DnaReference', 'DeltaG', 'InteractionVocabulary', 'SmallMoleculeReference', 'Complex', 'Transport', 'Interaction', 'Rna', 'ExperimentalForm', 'BioSource', 'RnaReference', 'EntityFeature', 'ComplexAssembly', 'Score', 'BiochemicalReaction', 'RelationshipXref', 'BindingFeature', 'FragmentFeature', 'Modulation', 'SequenceLocation', 'Control', 'UnificationXref', 'Entity', 'UtilityClass']