Helper to explore and describe the BIOPAX model¶
The core of Biopax Explorer is an object-oriented Python implementation of the BIOPAX model.
In [1]:
#import
from biopax_explorer.biopax.doc import helper
from biopax_explorer.biopax.utils import gen_utils as gu
from biopax_explorer.biopax import BioSource
Describe the BioSource class¶
In [2]:
dh1=helper.select('BioSource')
print(dh1.classInfo())
Definition: The biological source (organism, tissue or cell type) of an Entity. Usage: Some entities are considered source-neutral (e.g. small molecules), and the biological source of others can be deduced from their constituentss (e.g. complex, pathway). Instances: HeLa cells, Homo sapiens, and mouse liver tissue.
Describe the PhysicalEntity class¶
In [3]:
dh2=helper.select('PhysicalEntity')
print(dh2.attributeNames())
['cellularLocation', 'feature', 'memberPhysicalEntity', 'notFeature', 'dataSource', 'evidence', 'xref', 'availability', 'comment', 'displayName', 'name', 'standardName']
Describe the attribute cellType of the BioSource class¶
In [4]:
print(dh1.attributeInfo("cellType"))
A cell type, e.g. 'HeLa'. This should reference a term in a controlled vocabulary of cell types. Best practice is to refer to OBO Cell Ontology. http://www.obofoundry.org/cgi-bin/detail.cgi?id=cell
List the attributes of the class BioSource¶
- object attributes, whose values are instances of classes of the model (Object)
- simple attributes, whose values have a primitive type (String, Float...)
In [5]:
dh1=helper.select('BioSource')
print("----------------attributes of BioSource----------------------------")
print("objects")
print(dh1.objectAttributeNames())
print("simple")
print(dh1.typeAttributeNames())
print("----attribute_type_by_name-----")
for attn,tp in BioSource().attribute_type_by_name().items():
    print(" %s : %s" %(attn,tp))
print("--------------------------------------------")
print("information related to its 'cellType' attribute:")
attn="cellType"
print(attn," type:", dh1.attributeType(attn))
print(dh1.attributeInfo(attn))
----------------attributes of BioSource----------------------------
objects
['cellType', 'tissue', 'xref']
simple
['displayName', 'name', 'standardName', 'comment']
----attribute_type_by_name-----
comment : str
cellType : CellVocabulary
tissue : TissueVocabulary
xref : Xref
displayName : str
name : str
standardName : str
--------------------------------------------
information related to its 'cellType' attribute:
cellType type: CellVocabulary
A cell type, e.g. 'HeLa'. This should reference a term in a controlled vocabulary of cell types. Best practice is to refer to OBO Cell Ontology. http://www.obofoundry.org/cgi-bin/detail.cgi?id=cell
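The object/simple split printed above can be reproduced from the name-to-type mapping alone. The following is a minimal sketch, not the library's implementation: it assumes the types are available as plain strings (copied from the output above), and `SIMPLE_TYPES` and `split_attributes` are hypothetical names introduced only for this illustration.

```python
# Type names copied from the attribute_type_by_name() output above.
# Assumption: types are represented here as plain strings for illustration.
attribute_types = {
    "comment": "str",
    "cellType": "CellVocabulary",
    "tissue": "TissueVocabulary",
    "xref": "Xref",
    "displayName": "str",
    "name": "str",
    "standardName": "str",
}

# Hypothetical set of type names treated as "simple" (primitive) types.
SIMPLE_TYPES = {"str", "int", "float", "bool"}


def split_attributes(types_by_name):
    """Return (object_attributes, simple_attributes), each sorted by name."""
    objects = sorted(n for n, t in types_by_name.items() if t not in SIMPLE_TYPES)
    simple = sorted(n for n, t in types_by_name.items() if t in SIMPLE_TYPES)
    return objects, simple


object_attrs, simple_attrs = split_attributes(attribute_types)
print("objects", object_attrs)
print("simple", simple_attrs)
```

The two lists contain the same names as `objectAttributeNames()` and `typeAttributeNames()` in the output above, although the order may differ because this sketch sorts alphabetically.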
Show all the classes of the model and their children classes (inheritance)¶
In [6]:
for cl, children in gu.classes_children().items():
    print(cl, "=>", children)
EntityReference => ['RnaRegionReference', 'SmallMoleculeReference', 'DnaReference', 'DnaRegionReference', 'ProteinReference', 'RnaReference']
PathwayStep => ['BiochemicalPathwayStep']
Degradation => []
CellVocabulary => []
ModificationFeature => ['CovalentBindingFeature']
DnaRegion => []
Pathway => []
TemplateReaction => []
TransportWithBiochemicalReaction => []
TemplateReactionRegulation => []
PhysicalEntity => ['SmallMolecule', 'DnaRegion', 'RnaRegion', 'Dna', 'Rna', 'Protein', 'Complex']
TissueVocabulary => []
Conversion => ['TransportWithBiochemicalReaction', 'ComplexAssembly', 'BiochemicalReaction', 'Transport', 'Degradation']
PhenotypeVocabulary => []
CovalentBindingFeature => []
SequenceInterval => []
ChemicalStructure => []
RnaRegionReference => []
Evidence => []
RnaRegion => []
ProteinReference => []
Xref => ['RelationshipXref', 'PublicationXref', 'UnificationXref']
EvidenceCodeVocabulary => []
BiochemicalPathwayStep => []
EntityReferenceTypeVocabulary => []
KPrime => []
PublicationXref => []
MolecularInteraction => []
Catalysis => []
Dna => []
SequenceRegionVocabulary => []
SequenceModificationVocabulary => []
SmallMolecule => []
ControlledVocabulary => ['ExperimentalFormVocabulary', 'TissueVocabulary', 'EvidenceCodeVocabulary', 'CellVocabulary', 'SequenceRegionVocabulary', 'CellularLocationVocabulary', 'SequenceModificationVocabulary', 'RelationshipTypeVocabulary', 'PhenotypeVocabulary', 'InteractionVocabulary', 'EntityReferenceTypeVocabulary']
Stoichiometry => []
Gene => []
DnaRegionReference => []
RelationshipTypeVocabulary => []
GeneticInteraction => []
ExperimentalFormVocabulary => []
SequenceSite => []
CellularLocationVocabulary => []
Provenance => []
Protein => []
DnaReference => []
DeltaG => []
InteractionVocabulary => []
SmallMoleculeReference => []
Complex => []
Transport => ['TransportWithBiochemicalReaction']
Interaction => ['Control', 'TransportWithBiochemicalReaction', 'Modulation', 'Conversion', 'ComplexAssembly', 'BiochemicalReaction', 'Catalysis', 'Transport', 'TemplateReaction', 'TemplateReactionRegulation', 'GeneticInteraction', 'Degradation', 'MolecularInteraction']
Rna => []
ExperimentalForm => []
BioSource => []
RnaReference => []
EntityFeature => ['BindingFeature', 'FragmentFeature', 'ModificationFeature', 'CovalentBindingFeature']
ComplexAssembly => []
Score => []
BiochemicalReaction => ['TransportWithBiochemicalReaction']
RelationshipXref => []
BindingFeature => ['CovalentBindingFeature']
FragmentFeature => []
Modulation => []
SequenceLocation => ['SequenceSite', 'SequenceInterval']
Control => ['Modulation', 'Catalysis', 'TemplateReactionRegulation']
UnificationXref => []
Entity => ['Modulation', 'Dna', 'Rna', 'Catalysis', 'GeneticInteraction', 'Degradation', 'MolecularInteraction', 'DnaRegion', 'TemplateReactionRegulation', 'Protein', 'Transport', 'PhysicalEntity', 'Control', 'Gene', 'SmallMolecule', 'BiochemicalReaction', 'TemplateReaction', 'Complex', 'Interaction', 'Pathway', 'TransportWithBiochemicalReaction', 'Conversion', 'RnaRegion', 'ComplexAssembly']
UtilityClass => ['TissueVocabulary', 'ModificationFeature', 'EvidenceCodeVocabulary', 'SequenceRegionVocabulary', 'CellularLocationVocabulary', 'SequenceInterval', 'SequenceLocation', 'Provenance', 'Xref', 'Stoichiometry', 'RelationshipTypeVocabulary', 'DnaReference', 'RelationshipXref', 'PathwayStep', 'BioSource', 'FragmentFeature', 'DnaRegionReference', 'UnificationXref', 'RnaReference', 'EntityFeature', 'RnaRegionReference', 'KPrime', 'SmallMoleculeReference', 'CellVocabulary', 'InteractionVocabulary', 'SequenceSite', 'BiochemicalPathwayStep', 'SequenceModificationVocabulary', 'ChemicalStructure', 'ExperimentalFormVocabulary', 'DeltaG', 'ControlledVocabulary', 'PublicationXref', 'CovalentBindingFeature', 'Score', 'BindingFeature', 'ExperimentalForm', 'EntityReference', 'PhenotypeVocabulary', 'ProteinReference', 'EntityReferenceTypeVocabulary']
In [7]:
print("--------------------")
print(gu.class_children('Interaction'))
print("--------------------")
print(gu.class_children('Conversion'))
print("--------------------")
print(gu.class_children('Entity'))
--------------------
['Control', 'TransportWithBiochemicalReaction', 'Modulation', 'Conversion', 'ComplexAssembly', 'BiochemicalReaction', 'Catalysis', 'Transport', 'TemplateReaction', 'TemplateReactionRegulation', 'GeneticInteraction', 'Degradation', 'MolecularInteraction']
--------------------
['TransportWithBiochemicalReaction', 'ComplexAssembly', 'BiochemicalReaction', 'Transport', 'Degradation']
--------------------
['Modulation', 'Dna', 'Rna', 'Catalysis', 'GeneticInteraction', 'Degradation', 'MolecularInteraction', 'DnaRegion', 'TemplateReactionRegulation', 'Protein', 'Transport', 'PhysicalEntity', 'Control', 'Gene', 'SmallMolecule', 'BiochemicalReaction', 'TemplateReaction', 'Complex', 'Interaction', 'Pathway', 'TransportWithBiochemicalReaction', 'Conversion', 'RnaRegion', 'ComplexAssembly']
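Judging from the printed lists, the children of a class appear to include all of its descendants, not only its direct subclasses: TransportWithBiochemicalReaction is listed under Conversion as well as under Transport and BiochemicalReaction. The sketch below shows one way the direct subclasses could be recovered from such a map. It works on a hand-copied excerpt of the output above, and `direct_children` is a hypothetical helper, not part of `gen_utils`.

```python
# Excerpt of the children map printed above (hand-copied for illustration).
children_excerpt = {
    "Conversion": ["TransportWithBiochemicalReaction", "ComplexAssembly",
                   "BiochemicalReaction", "Transport", "Degradation"],
    "Transport": ["TransportWithBiochemicalReaction"],
    "BiochemicalReaction": ["TransportWithBiochemicalReaction"],
    "TransportWithBiochemicalReaction": [],
    "ComplexAssembly": [],
    "Degradation": [],
}


def direct_children(cls, children_map):
    """Keep only the descendants of cls that are not reachable through another descendant."""
    descendants = children_map.get(cls, [])
    indirect = set()
    for descendant in descendants:
        # Anything listed under a descendant is at least two levels below cls.
        indirect.update(children_map.get(descendant, []))
    return sorted(d for d in descendants if d not in indirect)


conversion_direct = direct_children("Conversion", children_excerpt)
print(conversion_direct)
```

With this excerpt, TransportWithBiochemicalReaction is filtered out for Conversion because it is already listed under Transport and BiochemicalReaction.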
Display the parent classes of a class of the BIOPAX model¶
In [8]:
st=gu.parentTree()
print(st['TransportWithBiochemicalReaction'])
['Conversion', 'Transport', 'Interaction', 'BiochemicalReaction', 'Entity']
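The parent list above is consistent with the children map shown earlier: the parents of a class are exactly the classes whose children list contains it. The sketch below illustrates that relationship on a trimmed, hand-copied excerpt of the children output. It is an assumption about how the two views relate, not a description of how `parentTree()` is implemented, and `parents_of` is a hypothetical name.

```python
# Trimmed excerpt of the children map (lists shortened to the entries relevant here).
children_map_excerpt = {
    "Entity": ["Interaction", "Conversion", "Transport", "BiochemicalReaction",
               "TransportWithBiochemicalReaction", "PhysicalEntity"],
    "Interaction": ["Control", "Conversion", "Transport", "BiochemicalReaction",
                    "TransportWithBiochemicalReaction"],
    "Conversion": ["TransportWithBiochemicalReaction", "ComplexAssembly",
                   "BiochemicalReaction", "Transport", "Degradation"],
    "Transport": ["TransportWithBiochemicalReaction"],
    "BiochemicalReaction": ["TransportWithBiochemicalReaction"],
    "PhysicalEntity": ["Protein"],
    "TransportWithBiochemicalReaction": [],
}


def parents_of(cls, children_map):
    """Return every class whose children list contains cls, sorted by name."""
    return sorted(parent for parent, kids in children_map.items() if cls in kids)


twbr_parents = parents_of("TransportWithBiochemicalReaction", children_map_excerpt)
print(twbr_parents)
```

The result contains the same five class names as the `parentTree()` output above; only the order differs, because this sketch sorts alphabetically.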
Display the description of the PhysicalEntity class¶
In [9]:
s=helper.describe('PhysicalEntity')
print(s)
********************
PhysicalEntity
Definition: A pool of molecules or molecular complexes.
Comments: Each PhysicalEntity is defined by a sequence or structure based on an EntityReference AND any set of Features that are given. For example, ser46 phosphorylated p53 is a physical entity in BioPAX defined by the p53 sequence and the phosphorylation feature on the serine at position 46 in the sequence. Features are any combination of cellular location, covalent and non-covalent bonds with other molecules and covalent modifications. For a specific molecule to be a member of the pool it has to satisfy all of the specified features. Unspecified features are treated as unknowns or unneccesary. Features that are known to not be on the molecules should be explicitly stated with the "not feature" property. A physical entity in BioPAX never represents a specific molecular instance. Physical Entity can be heterogenous and potentially overlap, i.e. a single molecule can be counted as a member of multiple pools. This makes BioPAX semantics different than regular chemical notation but is necessary for dealing with combinatorial complexity.
Synonyms: part, interactor, object, species
Examples: extracellular calcium, ser 64 phosphorylated p53
--------------------
primitive type attributes:
----------
availability (str): Describes the availability of this data (e.g. a copyright statement).
----------
comment (str): Comment on the data in the container class. This property should be used instead of the OWL documentation elements (rdfs:comment) for instances because information in 'comment' is data to be exchanged, whereas the rdfs:comment field is used for metadata about the structure of the BioPAX ontology.
----------
displayName (str): An abbreviated name for this entity, preferably a name that is short enough to be used in a visualization application to label a graphical element that represents this entity. If no short name is available, an xref may be used for this purpose by the visualization application. Warning: Subproperties of name are functional, that is we expect to have only one standardName and shortName for a given entity. If a user decides to assign a different name to standardName or shortName, they have to remove the old triplet from the model too. If the old name should be retained as a synonym a regular "name" property should also be introduced with the old name.
----------
name (str): Synonyms for this entity. standardName and shortName are subproperties of this property and if declared they are automatically considered as names. Warning: Subproperties of name are functional, that is we expect to have only one standardName and shortName for a given entity. If a user decides to assign a different name to standardName or shortName, they have to remove the old triplet from the model too. If the old name should be retained as a synonym a regular "name" property should also be introduced with the old name.
----------
standardName (str): The preferred full name for this entity, if exists assigned by a standard nomenclature organization such as HUGO Gene Nomenclature Committee. Warning: Subproperties of name are functional, that is we expect to have only one standardName and shortName for a given entity. If a user decides to assign a different name to standardName or shortName, they have to remove the old triplet from the model too. If the old name should be retained as a synonym a regular "name" property should also be introduced with the old name.
--------------------
object attributes:
----------
cellularLocation (CellularLocationVocabulary): A cellular location, e.g. 'cytoplasm'. This should reference a term in the Gene Ontology Cellular Component ontology. The location referred to by this property should be as specific as is known. If an interaction is known to occur in multiple locations, separate interactions (and physicalEntities) must be created for each different location. If the location of a participant in a complex is unspecified, it may be assumed to be the same location as that of the complex. A molecule in two different cellular locations are considered two different physical entities.
----------
feature (EntityFeature): Sequence features of the owner physical entity.
----------
memberPhysicalEntity (PhysicalEntity): This property stores the members of a generic physical entity. For representing homology generics a better way is to use generic entity references and generic features. However not all generic logic can be captured by this, such as complex generics or rare cases where feature cardinality is variable. Usages of this property should be limited to such cases.
----------
notFeature (EntityFeature): Sequence features where the owner physical entity has a feature. If not specified, other potential features are not known.
----------
dataSource (Provenance): A free text description of the source of this data, e.g. a database or person name. This property should be used to describe the source of the data. This is meant to be used by databases that export their data to the BioPAX format or by systems that are integrating data from multiple sources. The granularity of use (specifying the data source in many or few instances) is up to the user. It is intended that this property report the last data source, not all data sources that the data has passed through from creation.
----------
evidence (Evidence): Scientific evidence supporting the existence of the entity as described.
----------
xref (Xref): Values of this property define external cross-references from this entity to entities in external databases.
********************
All helper entries¶
In [10]:
print(helper.entries())
['EntityReference', 'PathwayStep', 'Degradation', 'CellVocabulary', 'ModificationFeature', 'DnaRegion', 'Pathway', 'TemplateReaction', 'TransportWithBiochemicalReaction', 'TemplateReactionRegulation', 'PhysicalEntity', 'TissueVocabulary', 'Conversion', 'PhenotypeVocabulary', 'CovalentBindingFeature', 'SequenceInterval', 'ChemicalStructure', 'RnaRegionReference', 'Evidence', 'RnaRegion', 'ProteinReference', 'Xref', 'EvidenceCodeVocabulary', 'BiochemicalPathwayStep', 'EntityReferenceTypeVocabulary', 'KPrime', 'PublicationXref', 'MolecularInteraction', 'Catalysis', 'Dna', 'SequenceRegionVocabulary', 'SequenceModificationVocabulary', 'SmallMolecule', 'ControlledVocabulary', 'Stoichiometry', 'Gene', 'DnaRegionReference', 'RelationshipTypeVocabulary', 'GeneticInteraction', 'ExperimentalFormVocabulary', 'SequenceSite', 'CellularLocationVocabulary', 'Provenance', 'Protein', 'DnaReference', 'DeltaG', 'InteractionVocabulary', 'SmallMoleculeReference', 'Complex', 'Transport', 'Interaction', 'Rna', 'ExperimentalForm', 'BioSource', 'RnaReference', 'EntityFeature', 'ComplexAssembly', 'Score', 'BiochemicalReaction', 'RelationshipXref', 'BindingFeature', 'FragmentFeature', 'Modulation', 'SequenceLocation', 'Control', 'UnificationXref', 'Entity', 'UtilityClass']