propy.PyPro

Computing different types of protein descriptors.

Authors: Dongsheng Cao and Yizeng Liang. Date: 2012.9.4. Email: oriental-cds@163.com

class propy.PyPro.GetProDes(ProteinSequence='')[source]

Bases: object

Collect all descriptor calculation modules.

AALetter = ['A', 'R', 'N', 'D', 'C', 'E', 'Q', 'G', 'H', 'I', 'L', 'K', 'M', 'F', 'P', 'S', 'T', 'W', 'Y', 'V']
GetAAComp() → Dict[str, float][source]

Amino acid composition descriptors (20).

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetAAComp()
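
To make the shape of a 20-value composition vector concrete, here is a minimal sketch in plain Python. It does not call propy; the helper name aa_composition and the choice to report percentages are assumptions made for this illustration only.

```python
# Same residue order as the AALetter attribute listed above.
AA_LETTERS = "ARNDCEQGHILKMFPSTWYV"


def aa_composition(sequence: str) -> dict:
    """Return the percentage of each of the 20 standard residues (illustrative helper)."""
    n = len(sequence)
    if n == 0:
        raise ValueError("sequence must not be empty")
    # One entry per standard residue, even when the residue is absent.
    return {aa: 100.0 * sequence.count(aa) / n for aa in AA_LETTERS}


comp = aa_composition("ACDAAC")
print(comp["A"], comp["C"], comp["D"])
```

Every residue gets a key, so the result always has 20 entries regardless of the input sequence.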
GetAAindex1(name: str, path='.')[source]

Get the amino acid property values from aaindex1.

Parameters:name (str) – the name of the amino acid property (e.g., KRIW790103)
Returns:result – a dict containing the property values of the 20 amino acids
Return type:dict

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetAAindex1(name="KRIW790103")
GetAAindex23(name, path='.')[source]

Get the amino acid property values from aaindex2 and aaindex3.

Parameters:name – the name of the amino acid pair property (e.g., TANS760101, GRAR740104)
Returns:result – a dict containing the property values of the 400 amino acid pairs
Return type:dict

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetAAindex23(name="TANS760101")
GetALL()[source]

Calculate all descriptors except tri-peptide descriptors.

GetAPAAC(lamda=10, weight=0.5)[source]

Amphiphilic (Type II) Pseudo amino acid composition descriptors (default is 30).

Parameters:
  • lamda (int, optional (default: 10)) – reflects the rank of correlation and is a non-negative integer, such as 15. Note that (1) lamda should NOT be larger than the length of the input protein sequence; (2) lamda must be a non-negative integer, such as 0, 1, 2, …; (3) when lamda = 0, the output of the PseAA server is the 20-D amino acid composition.
  • weight (float, optional (default: 0.5)) – lets the user weight the additional PseAA components relative to the conventional AA components. Any value in the range 0.05 to 0.7 can be selected for the weight factor.

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetAPAAC(lamda=10, weight=0.5)
GetCTD()[source]

Composition Transition Distribution descriptors (147).

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetCTD()
GetDPComp() → Dict[str, float][source]

Dipeptide composition descriptors (400).

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetDPComp()
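
The count of 400 follows from the 20 x 20 possible residue pairs. The sketch below shows one way such a vector could be built; it is not propy's code, and the helper name, the overlapping sliding window, and the percentage scaling are all assumptions of this example.

```python
from itertools import product

AA_LETTERS = "ARNDCEQGHILKMFPSTWYV"


def dipeptide_composition(sequence: str) -> dict:
    """Percentage of each of the 400 dipeptides, counted with an overlapping window."""
    total = len(sequence) - 1  # number of adjacent residue pairs
    if total <= 0:
        raise ValueError("sequence must contain at least two residues")
    counts = {a + b: 0 for a, b in product(AA_LETTERS, repeat=2)}
    for i in range(total):
        pair = sequence[i:i + 2]
        if pair in counts:  # pairs with non-standard letters are ignored
            counts[pair] += 1
    return {pair: 100.0 * c / total for pair, c in counts.items()}


dp = dipeptide_composition("AAAC")
print(dp["AA"], dp["AC"])
```

An explicit window is used rather than str.count, because str.count skips overlapping matches and would undercount runs such as "AAA".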
GetGearyAuto()[source]

Geary autocorrelation descriptors (240).

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetGearyAuto()
GetGearyAutop(AAP=None, AAPName='p')[source]

Geary autocorrelation descriptors for the given property (30).

Parameters:AAP (Dict[Any, Any]) – contains the physicochemical properties of the 20 amino acids

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetGearyAutop(AAP={}, AAPName='p')
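
For readers unfamiliar with the statistic, the following sketch evaluates the textbook Geary autocorrelation for a single property at a single lag. It is an illustration under stated assumptions, not propy's implementation: the function name and the two-letter toy property are hypothetical, and any normalization of property values that the library may apply beforehand is left out.

```python
from statistics import mean

# Hypothetical toy property values, for illustration only.
TOY_PROPERTY = {"A": 1.0, "C": 2.0}


def geary_autocorrelation(sequence: str, prop: dict, lag: int) -> float:
    """Geary's C at the given lag for one per-residue property."""
    values = [prop[res] for res in sequence]
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(sequence)")
    avg = mean(values)
    # Halved mean squared difference between residues `lag` positions apart.
    numerator = sum(
        (values[i] - values[i + lag]) ** 2 for i in range(n - lag)
    ) / (2 * (n - lag))
    # Sample variance of the property along the sequence.
    denominator = sum((v - avg) ** 2 for v in values) / (n - 1)
    return numerator / denominator


geary_lag1 = geary_autocorrelation("ACAC", TOY_PROPERTY, lag=1)
geary_lag2 = geary_autocorrelation("ACAC", TOY_PROPERTY, lag=2)
```

Evaluating one such value per lag from 1 to 30 would give 30 numbers for a property, which matches the count quoted above.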
GetMoranAuto()[source]

Moran autocorrelation descriptors (240).

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetMoranAuto()
GetMoranAutop(AAP=None, AAPName='p')[source]

Moran autocorrelation descriptors for the given property (30).

Parameters:AAP (Dict[Any, Any]) – contains the physicochemical properties of the 20 amino acids

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetMoranAutop(AAP={}, AAPName='p')
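
The sketch below computes the textbook Moran autocorrelation for one property at one lag, to show what a single descriptor value represents. It is not taken from propy: the function name and the toy property dict are hypothetical, and any standardization of the property values is omitted.

```python
from statistics import mean

# Hypothetical toy property values, for illustration only.
TOY_PROPERTY = {"A": 1.0, "C": 2.0}


def moran_autocorrelation(sequence: str, prop: dict, lag: int) -> float:
    """Moran's I at the given lag for one per-residue property."""
    values = [prop[res] for res in sequence]
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(sequence)")
    avg = mean(values)
    # Mean product of deviations for residues `lag` positions apart.
    numerator = sum(
        (values[i] - avg) * (values[i + lag] - avg) for i in range(n - lag)
    ) / (n - lag)
    # Population variance of the property along the sequence.
    denominator = sum((v - avg) ** 2 for v in values) / n
    return numerator / denominator


moran_lag1 = moran_autocorrelation("ACAC", TOY_PROPERTY, lag=1)
moran_lag2 = moran_autocorrelation("ACAC", TOY_PROPERTY, lag=2)
```

For the alternating toy sequence, neighbours at lag 1 always differ and residues at lag 2 always match, so the two values have opposite signs.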
GetMoreauBrotoAuto()[source]

Normalized Moreau-Broto autocorrelation descriptors (240).

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetMoreauBrotoAuto()
GetMoreauBrotoAutop(AAP: Dict[Any, Any] = None, AAPName='p')[source]

Normalized Moreau-Broto autocorrelation descriptors for the given property (30).

Parameters:AAP (Dict[Any, Any]) – contains the physicochemical properties of the 20 amino acids

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetMoreauBrotoAutop(AAP={}, AAPName='p')
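
As a rough illustration, a Moreau-Broto autocorrelation normalized by the number of residue pairs can be written as below. This is a sketch, not the library's code: the function name and toy property are hypothetical, and the raw property values are used without any prior standardization.

```python
# Hypothetical toy property values, for illustration only.
TOY_PROPERTY = {"A": 1.0, "C": 2.0}


def moreau_broto_autocorrelation(sequence: str, prop: dict, lag: int) -> float:
    """Sum of property products at the given lag, divided by the number of pairs."""
    values = [prop[res] for res in sequence]
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(sequence)")
    total = sum(values[i] * values[i + lag] for i in range(n - lag))
    return total / (n - lag)


mb_lag1 = moreau_broto_autocorrelation("ACAC", TOY_PROPERTY, lag=1)
mb_lag2 = moreau_broto_autocorrelation("ACAC", TOY_PROPERTY, lag=2)
```

Dividing by the pair count keeps values comparable across lags, since larger lags contribute fewer terms to the sum.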
GetPAAC(lamda=10, weight=0.05)[source]

Type I Pseudo amino acid composition descriptors (default is 30).

Parameters:
  • lamda (int, optional (default: 10)) – reflects the rank of correlation and is a non-negative integer, such as 15. Note that (1) lamda should NOT be larger than the length of the input protein sequence; (2) lamda must be a non-negative integer, such as 0, 1, 2, …; (3) when lamda = 0, the output of the PseAA server is the 20-D amino acid composition.
  • weight (float, optional (default: 0.05)) – lets the user weight the additional PseAA components relative to the conventional AA components. Any value in the range 0.05 to 0.7 can be selected for the weight factor.

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetPAAC(lamda=10, weight=0.05)
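
The roles of lamda and weight can be seen in the following sketch of a Type I pseudo amino acid composition built from a single property. It is an illustration only, not propy's implementation: the function names and the toy property are hypothetical, and the sketch requires lamda to be strictly smaller than the sequence length so that every lag has at least one residue pair.

```python
AA_LETTERS = "ARNDCEQGHILKMFPSTWYV"

# Hypothetical toy property: the position of each residue in AA_LETTERS.
TOY_PROPERTY = {aa: float(i) for i, aa in enumerate(AA_LETTERS)}


def sequence_order_factor(sequence: str, prop: dict, k: int) -> float:
    """Mean squared property difference between residues k positions apart."""
    n = len(sequence)
    return sum(
        (prop[sequence[i]] - prop[sequence[i + k]]) ** 2 for i in range(n - k)
    ) / (n - k)


def pseudo_aa_composition(sequence, prop, lamda=10, weight=0.05):
    """Return 20 composition terms followed by `lamda` sequence-order terms."""
    if not 0 <= lamda < len(sequence):
        raise ValueError("lamda must satisfy 0 <= lamda < len(sequence)")
    freqs = [sequence.count(aa) / len(sequence) for aa in AA_LETTERS]
    thetas = [sequence_order_factor(sequence, prop, k) for k in range(1, lamda + 1)]
    # A shared denominator lets `weight` trade off the two groups of terms.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


paac = pseudo_aa_composition("ARA", TOY_PROPERTY, lamda=1, weight=0.05)
```

With lamda = 0 the list of sequence-order terms is empty and the result reduces to the plain 20-value composition, as the parameter description notes; with lamda = 10 the sketch yields 20 + 10 = 30 values.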
GetPAACp(lamda=10, weight=0.05, AAP=None)[source]

Type I Pseudo amino acid composition descriptors for the given properties (default is 30).

Parameters:
  • lamda (int, optional (default: 10)) – reflects the rank of correlation and is a non-negative integer, such as 15. Note that (1) lamda should NOT be larger than the length of the input protein sequence; (2) lamda must be a non-negative integer, such as 0, 1, 2, …; (3) when lamda = 0, the output of the PseAA server is the 20-D amino acid composition.
  • weight (float, optional (default: 0.05)) – lets the user weight the additional PseAA components relative to the conventional AA components. Any value in the range 0.05 to 0.7 can be selected for the weight factor.
  • AAP (List) – contains the properties, each of which is a dict.

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetPAACp(lamda=10, weight=0.05, AAP=[])
GetQSO(maxlag=30, weight=0.1)[source]

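Quasi-sequence-order descriptors (default is 50).

Parameters:maxlag (int, optional (default: 30)) – is the maximum lag; the length of the protein should be larger than maxlag.

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetQSO(maxlag=30, weight=0.1)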
GetQSOp(maxlag=30, weight=0.1, distancematrix=None)[source]

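Quasi-sequence-order descriptors for the given distance matrix (default is 50).

Parameters:
  • maxlag (int, optional (default: 30)) – is the maximum lag; the length of the protein should be larger than maxlag.
  • distancematrix (dict) – is a dict containing 400 distance values.

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetQSO(maxlag=30, weight=0.1)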
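
To show how maxlag, weight, and a distance matrix could combine into one vector, here is a minimal sketch of quasi-sequence-order values. It is not propy's code. The function names, the two-letter key format of the distance dict, and the toy distances (0 for identical residues, 1 otherwise) are assumptions made for the example.

```python
from itertools import product

AA_LETTERS = "ARNDCEQGHILKMFPSTWYV"

# Hypothetical toy distance matrix with 400 entries keyed by residue pair.
TOY_DISTANCE = {
    a + b: (0.0 if a == b else 1.0) for a, b in product(AA_LETTERS, repeat=2)
}


def coupling_number(sequence: str, lag: int, dist: dict) -> float:
    """Sum of squared distances between residues `lag` positions apart."""
    return sum(
        dist[sequence[i] + sequence[i + lag]] ** 2
        for i in range(len(sequence) - lag)
    )


def quasi_sequence_order(sequence, maxlag=30, weight=0.1, dist=TOY_DISTANCE):
    """Return 20 composition-based terms followed by `maxlag` coupling-based terms."""
    if maxlag >= len(sequence):
        raise ValueError("the sequence must be longer than maxlag")
    freqs = [sequence.count(aa) / len(sequence) for aa in AA_LETTERS]
    taus = [coupling_number(sequence, d, dist) for d in range(1, maxlag + 1)]
    denom = sum(freqs) + weight * sum(taus)
    return [f / denom for f in freqs] + [weight * t / denom for t in taus]


qso = quasi_sequence_order("ACA", maxlag=2, weight=0.1)
```

Under these assumptions one distance matrix yields 20 + maxlag values, so maxlag = 30 gives the 50 values mentioned above.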
GetSOCN(maxlag: int = 45)[source]

Sequence order coupling numbers (default is 45).

Parameters:maxlag (int) – is the maximum lag; the length of the protein should be larger than maxlag

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetSOCN(maxlag=45)
GetSOCNp(maxlag=45, distancematrix=None)[source]

Sequence order coupling numbers for the given distance matrix (default is 45).

Parameters:
  • maxlag (int, optional (default: 45)) – is the maximum lag; the length of the protein should be larger than maxlag.
  • distancematrix (dict) – is a dict containing 400 distance values.

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetSOCN(maxlag=45)
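
A sequence order coupling number at lag d is commonly defined as the sum of squared distances between residues d positions apart. The sketch below illustrates that definition with a tiny, hypothetical distance dict keyed by two-letter residue pairs; the key format, the values, and the function name are assumptions of the example rather than documented propy behaviour.

```python
# Hypothetical distances for a two-letter alphabet, for illustration only.
TOY_DISTANCE = {"AA": 0.0, "AC": 1.0, "CA": 1.0, "CC": 0.0}


def coupling_numbers(sequence: str, maxlag: int, dist: dict) -> list:
    """Return one coupling number per lag from 1 to maxlag."""
    if maxlag >= len(sequence):
        raise ValueError("the sequence must be longer than maxlag")
    return [
        sum(
            dist[sequence[i] + sequence[i + lag]] ** 2
            for i in range(len(sequence) - lag)
        )
        for lag in range(1, maxlag + 1)
    ]


socn = coupling_numbers("ACA", maxlag=2, dist=TOY_DISTANCE)
```

The length check mirrors the requirement above that the protein be longer than maxlag; otherwise the largest lags would have no residue pairs to sum over.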
GetSubSeq(ToAA='S', window=3)[source]

Obtain the sub-sequences of length 2*window+1 whose central point is ToAA.

ToAA is the central (query point) amino acid in the sub-sequence.

window is the span.

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetSubSeq(ToAA='S', window=3)
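
The windowing described above can be sketched in a few lines of plain Python. This is an illustration, not the library's code: the function name is hypothetical, and skipping occurrences that sit too close to either end of the sequence to fill a full window is an assumption of this example.

```python
def sub_sequences(sequence: str, to_aa: str = "S", window: int = 3) -> list:
    """Collect every window of length 2*window+1 centred on `to_aa`."""
    result = []
    for i, residue in enumerate(sequence):
        if residue != to_aa:
            continue
        start, end = i - window, i + window + 1
        # Keep only windows that fit entirely inside the sequence.
        if start >= 0 and end <= len(sequence):
            result.append(sequence[start:end])
    return result


subs = sub_sequences("AAASAAAGGSG", to_aa="S", window=3)
print(subs)
```

In the toy sequence the first "S" has three residues on each side and is kept, while the second "S" is one position from the end and is skipped.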
GetTPComp() → Dict[str, int][source]

Tri-peptide composition descriptors (8000).

Examples

>>> from propy.GetProteinFromUniprot import GetProteinSequence
>>> protein = GetProteinSequence(ProteinID="Q9NQ39")
>>> result = GetProDes(protein).GetTPComp()
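
The figure of 8000 corresponds to the 20 x 20 x 20 possible residue triplets. A minimal sketch of such a count table follows; it does not use propy, and the helper name and the overlapping sliding window are assumptions made for the example.

```python
from itertools import product

AA_LETTERS = "ARNDCEQGHILKMFPSTWYV"


def tripeptide_counts(sequence: str) -> dict:
    """Integer count of each of the 8000 tripeptides, using an overlapping window."""
    counts = {"".join(t): 0 for t in product(AA_LETTERS, repeat=3)}
    for i in range(len(sequence) - 2):
        tri = sequence[i:i + 3]
        if tri in counts:  # triplets with non-standard letters are ignored
            counts[tri] += 1
    return counts


tp = tripeptide_counts("ACDACD")
print(tp["ACD"], tp["CDA"], tp["DAC"])
```

Because most of the 8000 entries are zero for any single protein, this vector is very sparse, which may be one reason it is left out of GetALL().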
Version = 1.0