hit
A Hit is created when hmmsearch finds similarities between a profile and a protein of the input dataset.
[Figure: diagram showing the interactions between CoreGene, ModelGene, Model, Hit and ValidHit]
The diagram above represents the models, genes and hits generated from the definitions below.
<model name="A" inter_gene_max_space="2">
<gene name="abc" presence="mandatory"/>
<gene name="def" presence="accessory"/>
</model>
<model name="B" inter_gene_max_space="5">
<gene name="def" presence="mandatory"/>
<exchangeables>
<gene name="abc"/>
</exchangeables>
<gene name="ghj" presence="accessory"/>
</model>
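As a rough illustration of how the two definitions above can be read, the sketch below parses them with Python's standard `xml.etree.ElementTree`. It is not MacSyFinder's own parser: the `<models>` wrapper element and the `summarize` helper are hypothetical, added only so that both models fit in one well-formed document, and the `ghj` gene is written as a properly closed element.

```python
import xml.etree.ElementTree as ET

# The two definitions, wrapped in a hypothetical <models> root so that they
# form a single well-formed XML document (the wrapper is an assumption).
DEFINITIONS = """
<models>
  <model name="A" inter_gene_max_space="2">
    <gene name="abc" presence="mandatory"/>
    <gene name="def" presence="accessory"/>
  </model>
  <model name="B" inter_gene_max_space="5">
    <gene name="def" presence="mandatory"/>
    <exchangeables>
      <gene name="abc"/>
    </exchangeables>
    <gene name="ghj" presence="accessory"/>
  </model>
</models>
"""


def summarize(xml_text):
    """Map each model name to its direct genes and its exchangeable genes."""
    summary = {}
    for model in ET.fromstring(xml_text).findall("model"):
        summary[model.get("name")] = {
            "inter_gene_max_space": int(model.get("inter_gene_max_space")),
            # Direct <gene> children only, keyed by name.
            "genes": {g.get("name"): g.get("presence") for g in model.findall("gene")},
            # Genes listed inside <exchangeables>.
            "exchangeables": [g.get("name") for g in model.findall("exchangeables/gene")],
        }
    return summary


SUMMARY = summarize(DEFINITIONS)
```

In this reading, gene `abc` appears in both models: as a mandatory gene of model A and as an exchangeable in model B.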
hit
- class macsypy.hit.Hit(gene, hit_id, hit_seq_length, replicon_name, position_hit, i_eval, score, profile_coverage, sequence_coverage, begin_match, end_match)[source]
Handle the hits filtered from the Hmmer search. The hits are instantiated by the HMMReport.extract() method.
- __eq__(other)[source]
Return True if two hits are totally equivalent, False otherwise.
- Parameters
other (macsypy.report.Hit object) – the hit to compare to the current object
- Returns
the result of the comparison
- Return type
boolean
- __gt__(other)[source]
Compare two Hits. If the sequence identifiers are the same, compare the scores; otherwise, compare the sequence identifiers alphabetically.
- Parameters
other (macsypy.report.Hit object) – the hit to compare to the current object
- Returns
True if self is > other, False otherwise
- __init__(gene, hit_id, hit_seq_length, replicon_name, position_hit, i_eval, score, profile_coverage, sequence_coverage, begin_match, end_match)[source]
- Parameters
gene (macsypy.gene.CoreGene object) – the gene corresponding to this profile
hit_id (str) – the identifier of the hit
hit_seq_length (int) – the length of the hit sequence
replicon_name (str) – the name of the replicon
position_hit (int) – the rank of the sequence matched in the input dataset file
i_eval (float) – the best-domain evalue (i-evalue, “independent evalue”)
score (float) – the score of the hit
profile_coverage (float) – percentage of the profile that matches the hit sequence
sequence_coverage (float) – percentage of the hit sequence that matches the profile
begin_match (int) – position in the sequence where the match with the profile starts
end_match (int) – position in the sequence where the match with the profile ends
- __lt__(other)[source]
Compare two Hits. If the sequence identifiers are the same, compare the scores; otherwise, compare the sequence identifiers alphabetically.
- Parameters
other (macsypy.report.Hit object) – the hit to compare to the current object
- Returns
True if self is < other, False otherwise
- __str__()[source]
- Returns
Useful information on the Hit: Hmmer statistics and sequence information
- Return type
str
- __weakref__
list of weak references to the object (if defined)
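The ordering rule described for `__lt__` and `__gt__` can be sketched with a small stand-in class. `SketchHit` is hypothetical and carries only the two fields the comparison needs; it is not the `macsypy.hit.Hit` implementation, which may differ in its details.

```python
from dataclasses import dataclass


@dataclass
class SketchHit:
    """Hypothetical stand-in for a hit, reduced to the compared fields."""
    hit_id: str   # sequence identifier
    score: float  # score of the hit

    def __lt__(self, other):
        # Same sequence: compare on score; otherwise compare identifiers.
        if self.hit_id == other.hit_id:
            return self.score < other.score
        return self.hit_id < other.hit_id

    def __gt__(self, other):
        if self.hit_id == other.hit_id:
            return self.score > other.score
        return self.hit_id > other.hit_id


HITS = [SketchHit("seq_b", 10.0), SketchHit("seq_a", 5.0), SketchHit("seq_a", 50.0)]
ORDERED = sorted(HITS)
```

With this rule, sorting groups the hits by sequence identifier and orders them by increasing score within each group.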
- class macsypy.hit.HitWeight(itself: float = 1, exchangeable: float = 0.8, mandatory: float = 1, accessory: float = 0.5, neutral: float = 0, loner_multi_system: float = 0.7)[source]
The weights used to compute the cluster and system scores (see the “macsyfinder functioning” section of the user documentation for further details). By default:
itself = 1
exchangeable = 0.8
mandatory = 1
accessory = 0.5
neutral = 0
loner_multi_system = 0.7
- __delattr__(name)
Implement delattr(self, name).
- __eq__(other)
Return self==value.
- __hash__()
Return hash(self).
- __init__(itself: float = 1, exchangeable: float = 0.8, mandatory: float = 1, accessory: float = 0.5, neutral: float = 0, loner_multi_system: float = 0.7) None
- __repr__()
Return repr(self).
- __setattr__(name, value)
Implement setattr(self, name, value).
- __weakref__
list of weak references to the object (if defined)
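The documented `__setattr__`, `__delattr__` and `__hash__` methods suggest that HitWeight is an immutable dataclass. The sketch below assumes this and mirrors the default values listed above. `SketchHitWeight` and `hit_score` are hypothetical names; in particular, multiplying the status weight by the `itself` or `exchangeable` factor is only one plausible way to combine the weights, so refer to the user documentation for the actual scoring.

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class SketchHitWeight:
    """Hypothetical mirror of the documented default weights."""
    itself: float = 1
    exchangeable: float = 0.8
    mandatory: float = 1
    accessory: float = 0.5
    neutral: float = 0
    loner_multi_system: float = 0.7


def hit_score(weights, status, via_exchangeable=False):
    """Assumed combination: status weight times the itself/exchangeable factor."""
    status_weight = getattr(weights, status)
    factor = weights.exchangeable if via_exchangeable else weights.itself
    return status_weight * factor


DEFAULT = SketchHitWeight()
# A frozen dataclass cannot be modified in place; build a modified copy instead.
CUSTOM = replace(DEFAULT, accessory=0.25)
```

Attempting `DEFAULT.accessory = 1` on such a frozen instance raises `FrozenInstanceError`, which is why the customized weights are created with `dataclasses.replace`.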
- class macsypy.hit.ValidHit(hit, gene_ref, gene_status)[source]
Encapsulates a macsypy.report.Hit. This class stores a Hit that has been attributed to a putative system. Thus, it also stores:
the system,
the status of the gene in this system (‘mandatory’, ‘accessory’, …),
the gene in the model for which it’s an occurrence.
- __hash__ = None
- __init__(hit, gene_ref, gene_status)[source]
- Parameters
hit (macsypy.hit.Hit object) – a match between a hmm profile and a replicon
gene_ref (macsypy.gene.ModelGene object) – the ModelGene linked to this hit. The ModelGene has the same name as the CoreGene, but one hit can be linked to several ModelGenes (in several Models). To know for which gene this hit plays a role, use the macsypy.gene.ModelGene.alternate_of() method: hit.gene_ref.alternate_of()
gene_status (macsypy.gene.GeneStatus object) – the status of the gene in this system
- __weakref__
list of weak references to the object (if defined)
- property loner
- Returns
True if the hit represents a loner macsypy.Gene.ModelGene, False otherwise.
- property multi_system
- Returns
True if the hit represents a multi_systems macsypy.Gene.ModelGene, False otherwise.
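The encapsulation described for ValidHit can be sketched as a thin wrapper around a hit. All class names below are hypothetical stand-ins, and two points are assumptions rather than documented behaviour: that `loner` and `multi_system` simply read flags from the referenced gene, and that unknown attributes are forwarded to the wrapped hit.

```python
from dataclasses import dataclass


@dataclass
class SketchHit:
    """Hypothetical stand-in for a hit."""
    hit_id: str
    score: float


@dataclass
class SketchModelGene:
    """Hypothetical stand-in for a gene of a model."""
    name: str
    loner: bool = False
    multi_system: bool = False


class SketchValidHit:
    """Wrap a hit together with the model gene and status it was assigned."""

    def __init__(self, hit, gene_ref, gene_status):
        self.hit = hit
        self.gene_ref = gene_ref
        self.gene_status = gene_status

    def __getattr__(self, name):
        # Called only when normal lookup fails: forward to the wrapped hit
        # (assumed behaviour). Guard against recursion if 'hit' is not set yet.
        if name == "hit":
            raise AttributeError(name)
        return getattr(self.hit, name)

    @property
    def loner(self):
        return self.gene_ref.loner

    @property
    def multi_system(self):
        return self.gene_ref.multi_system


VALID_HIT = SketchValidHit(SketchHit("seq_1", 42.0), SketchModelGene("abc", loner=True), "mandatory")
```

Here `VALID_HIT.score` is answered by the wrapped hit, while `VALID_HIT.loner` is answered by the referenced gene.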
- macsypy.hit.get_best_hits(hits, key='score')[source]
If several hits match the same protein, keep only the best match based either on
score
i_evalue
profile_coverage
- Parameters
hits ([macsypy.hit.Hit object, …]) – the hits to filter; all hits must match the same protein.
key (str) – the criterion used to select the best hit: ‘score’, ‘i_evalue’ or ‘profile_coverage’
- Returns
the list of the best hits
- Return type
[macsypy.hit.Hit object, …]
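The selection described for get_best_hits can be sketched as follows. This is not the macsypy implementation: `SketchHit` and `best_hits_sketch` are hypothetical, hits are grouped by `hit_id` as a stand-in for "the same protein", and the sketch assumes that a higher score or profile coverage is better while a lower i-evalue is better.

```python
from dataclasses import dataclass


@dataclass
class SketchHit:
    """Hypothetical stand-in for a hit, with the fields used for selection."""
    gene_name: str
    hit_id: str
    score: float
    i_eval: float
    profile_coverage: float


def best_hits_sketch(hits, key="score"):
    """Keep one hit per protein, chosen according to the given criterion."""
    # criterion -> (attribute name, True if larger values are better)
    criteria = {
        "score": ("score", True),
        "i_evalue": ("i_eval", False),
        "profile_coverage": ("profile_coverage", True),
    }
    if key not in criteria:
        raise ValueError(f"unknown criterion: {key!r}")
    attr, higher_is_better = criteria[key]
    best = {}
    for hit in hits:
        current = best.get(hit.hit_id)
        if current is None:
            best[hit.hit_id] = hit
            continue
        new, old = getattr(hit, attr), getattr(current, attr)
        if (new > old) if higher_is_better else (new < old):
            best[hit.hit_id] = hit
    return list(best.values())


HITS = [
    SketchHit("abc", "prot_1", 10.0, 1e-5, 0.9),
    SketchHit("def", "prot_1", 25.0, 1e-3, 0.7),
    SketchHit("ghj", "prot_2", 5.0, 1e-2, 0.8),
]
```

On this sample, the criterion changes which gene wins `prot_1`: `def` has the higher score, while `abc` has the lower i-evalue and the higher profile coverage.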