HW4_Sapozhnikov #22

Open
wants to merge 38 commits into
base: main
38 commits
d98ccaf
Initial commit
NSapozhnikov Sep 26, 2023
42c89b3
Add local alignment function
NSapozhnikov Sep 26, 2023
873f7d7
Add local alignment functionality
NSapozhnikov Sep 28, 2023
86f281b
start development from_proteins_seqs_to_rna function
Sep 28, 2023
231efcc
add cycles converting proteins to RNA in from_proteins_seqs_to_rna fu…
Sep 28, 2023
09d4711
add from_proteins_seqs_to_rna and isoelectric_point_determination fun…
Sep 29, 2023
1e98426
add gitignore
Sep 29, 2023
d6f1bfd
remove excess files
Sep 29, 2023
f5e4308
Minor code revision
NSapozhnikov Sep 29, 2023
c80ea15
Merge pull request #3 from NSapozhnikov/HW4_Nekrasova
NSapozhnikov Sep 30, 2023
3e337f7
Merge branch 'dev' into local_alignment
NSapozhnikov Sep 30, 2023
197d0a4
Merge pull request #2 from NSapozhnikov/local_alignment
NSapozhnikov Sep 30, 2023
c318a3a
Add recode() function
NSapozhnikov Sep 30, 2023
96e209d
add raise ValueError in from_proteins_seqs_to_rna function, add line …
Sep 30, 2023
ca4847a
Merge pull request #4 from NSapozhnikov/HW4_Nekrasova
NSapozhnikov Sep 30, 2023
703249c
Add back_transcribe function
AlinaPotyseva Sep 30, 2023
1ba062b
Add gc_content function
AlinaPotyseva Sep 30, 2023
b21741c
Add count_protein_molecular_weigh function
AlinaPotyseva Sep 30, 2023
463dbf3
Add recode() function
NSapozhnikov Sep 30, 2023
f19c48e
Merge branch 'dev' into recode_sequences
NSapozhnikov Sep 30, 2023
2d5e2de
Merge pull request #5 from NSapozhnikov/recode_sequences
NSapozhnikov Sep 30, 2023
a29692d
changed order of functions
AlinaPotyseva Sep 30, 2023
6ce8cf8
Changed order of functions
AlinaPotyseva Sep 30, 2023
5cc5a9b
changed order of functions
AlinaPotyseva Sep 30, 2023
3d32430
Merge branch 'dev' into dev_Alina
AlinaPotyseva Sep 30, 2023
e359ad1
Merge pull request #7 from NSapozhnikov/dev_Alina
AlinaPotyseva Sep 30, 2023
be1abc5
Major code review and merging all functions together
NSapozhnikov Oct 1, 2023
05caf3c
Update README.md
NSapozhnikov Oct 1, 2023
1bdbb2f
Update README.md
NSapozhnikov Oct 1, 2023
3d76bb5
Update README.md
NSapozhnikov Oct 1, 2023
f32641a
Update README.md
NSapozhnikov Oct 1, 2023
9d6f687
Update README.md
NSapozhnikov Oct 1, 2023
9563f39
Update README.md
NSapozhnikov Oct 1, 2023
597f21b
Update README.md
NSapozhnikov Oct 1, 2023
f6a34ef
Update README.md
NSapozhnikov Oct 1, 2023
f6e10e8
Update README.md
NSapozhnikov Oct 1, 2023
fc97f7d
Update README.md
NSapozhnikov Oct 1, 2023
ca5cf54
Update HW4_Sapozhnikov/README.md
NSapozhnikov Oct 6, 2023
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
.idea/*
109 changes: 109 additions & 0 deletions HW4_Sapozhnikov/README.md
@@ -0,0 +1,109 @@
# HW 4. Functions 2
> *This is the repo for the fourth homework of the BI Python 2023 course*

### Prototool
`prototool.py` is a special script for working with polyaminoacid sequences

***

### Overview
`prototool.py` includes 7 methods for processing polyaminoacid sequences.
`prototool.py` can be used for the following purposes:
- recoding 1-letter coded polyaminoacid sequences into 3-letter coded and vice versa;
Member

Suggested change
- recoding 1-letter coded polyaminoacid sequences into 3-letter coded and vice versa;
- recoding 1-letter coded polyaminoacid sequences into 3-letter coded and *vice versa*;

- local alignment of polyaminoacid sequences with the Smith-Waterman algorithm [^1];
- finding possible RNA sequences for given polyaminoacid sequences;
- determining polyaminoacid isoelectric point;
- calculating polyaminoacid molecular weight;
Comment on lines +15 to +16
Member

Suggested change
- determining polyaminoacid isoelectric point;
- calculating polyaminoacid molecular weight;
- calculating polyaminoacid isoelectric point;
- calculating polyaminoacid molecular weight;

- finding possible DNA sequences for given polyaminoacid sequences;
- determining the GC-content of the DNA sequence corresponding to a given polyaminoacid sequence

***

### Usage
This tool can be used both standalone and as a module.
- to use `prototool` standalone, add these lines to the code:
![image](https://github.com/NSapozhnikov/HW4_Sapozhnikov/assets/81642791/5fa3cf7f-e6f3-4294-9e81-b1ebe17c8514)
  - where `*args` are the sequences you want to process and `method` is the algorithm to apply
  - the result is stored in a variable (`test` in the picture)
- to use `prototool` as a module (recommended), import it like any other module (check the path: `prototool.py` should be in the same directory as your script). You can then call any of its functions (see Examples).

***

### Options
Arguments:
- `*args[str]` - sequences to work with. You can pass several sequences to any function
- `method` - a method to use

Output: all functions return a dict whose keys are the original sequences and whose values are the results of the corresponding method.
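
The calling convention described above can be sketched as follows. This is a minimal illustration, not the actual `prototool` code: the `METHODS` registry and the `sequence_length` stand-in method are hypothetical names introduced only to show how a `main(*args, method=...)` dispatcher might return a dict keyed by the original sequences.

```python
from typing import Callable, Dict


def sequence_length(*seqs: str) -> Dict[str, int]:
    """Stand-in method (hypothetical): maps each sequence to its length."""
    return {seq: len(seq) for seq in seqs}


# Hypothetical registry: method name -> function that accepts *args of sequences.
METHODS: Dict[str, Callable[..., dict]] = {
    "sequence_length": sequence_length,
}


def main(*args: str, method: str) -> dict:
    """Dispatch the given sequences to the function registered under `method`."""
    if method not in METHODS:
        raise ValueError(f"{method} is not a valid method.")
    return METHODS[method](*args)


result = main("AlaValTyr", "DNT", method="sequence_length")
print(result)  # {'AlaValTyr': 9, 'DNT': 3}
```

Keeping the registry as a plain dict means a new method only needs one extra entry, and an unknown name fails early with a `ValueError`.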

***

### Examples

def recode allows to translate 1-letter to 3-letters polyaminoacids code
Member

Suggested change
def recode allows to translate 1-letter to 3-letters polyaminoacids code
Function `recode` translates 1-letter to 3-letters polyaminoacids code

- `main('AlaValTyr', 'DNT', method = 'recode')`
- `recode('AlaValTyr', 'DNT')`
- ![image](https://github.com/NSapozhnikov/HW4_Sapozhnikov/assets/81642791/117befa5-feaa-433a-9ac9-23cffe9b024f)
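
For readers who cannot view the screenshot, the recoding idea can be sketched as below. This is an assumption-laden illustration rather than the project's `recode()` implementation: the helper name `recode_one` and the rule for telling 3-letter input from 1-letter input (length divisible by 3 and every triplet is a known 3-letter code) are choices made for this example.

```python
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
THREE_TO_ONE = {three: one for one, three in ONE_TO_THREE.items()}


def recode_one(seq: str) -> str:
    """Recode a single sequence (hypothetical helper).

    Assumption: a sequence counts as 3-letter coded only if it splits
    cleanly into known 3-letter codes; anything else is read as 1-letter
    code, and an unknown letter raises KeyError.
    """
    triplets = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if seq and len(seq) % 3 == 0 and all(t in THREE_TO_ONE for t in triplets):
        return "".join(THREE_TO_ONE[t] for t in triplets)
    return "".join(ONE_TO_THREE[aa] for aa in seq)


def recode(*seqs: str) -> dict:
    """Return {original sequence: recoded sequence} for every input."""
    return {seq: recode_one(seq) for seq in seqs}


print(recode("AlaValTyr", "DNT"))  # {'AlaValTyr': 'AVY', 'DNT': 'AspAsnThr'}
```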
Member

It's great that you included screenshots! Still, for real tools and READMEs it is better to switch to a light theme in such cases.

***
Member

IMHO these lines between the examples are not needed here; otherwise the README ends up looking ruled like lined paper.


Function `local_alignmen` performs a local alignment of two given sequences. At least two sequences must be passed
- `main('MetAsnTrp', 'MNT', method='local_alignment')`
- `local_alignmen('MetAsnTrp', 'MNT')`
- Note that the `local_alignment` function has a `prettify` flag (default `True`) that prints the aligned sequences one above the other
- ![image](https://github.com/NSapozhnikov/HW4_Sapozhnikov/assets/81642791/4dd36d24-a177-4419-9053-a5e2923a980c)
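
The Smith-Waterman procedure behind this example can be sketched as follows. This is a minimal version written for illustration, not the repository's code: it works on 1-letter coded sequences, and the function name `smith_waterman` and the scoring values (match `+2`, mismatch `-1`, linear gap `-1`) are assumptions.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -1):
    """Return (best score, aligned part of a, aligned part of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; a cell never drops below 0, which makes the alignment local.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


best_score, aligned_a, aligned_b = smith_waterman("AAMNTKK", "GGMNTCC")
# Print the aligned fragments one above the other.
print(best_score)
print(aligned_a)
print(aligned_b)
```

With these scores the shared fragment `MNT` is recovered from both inputs; a substitution matrix such as BLOSUM62 could replace the flat match/mismatch values.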
***

Function `from_proteins_seqs_to_rna` converts polyaminoacid sequences into RNA sequences
- `main('AlaValTyr', 'DNT', method = 'from_proteins_seqs_to_rna')`
- `from_proteins_seqs_to_rna('AlaValTyr', 'DNT')`
- ![image](https://github.com/NSapozhnikov/HW4_Sapozhnikov/assets/81642791/9ee92d0d-68a4-471b-b65a-2fa6b46ab844)
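
One way to picture this conversion is a reverse lookup in the standard genetic code. The sketch below is an illustration under assumptions, not the project's function: it takes 1-letter coded input, and because most aminoacids have several codons, the hypothetical `CODON_FOR` table picks just one valid codon per aminoacid, so the output is only one of the possible RNA sequences.

```python
# One valid codon per aminoacid from the standard genetic code (illustrative choice).
CODON_FOR = {
    "A": "GCU", "R": "CGU", "N": "AAU", "D": "GAU", "C": "UGU",
    "Q": "CAA", "E": "GAA", "G": "GGU", "H": "CAU", "I": "AUU",
    "L": "CUU", "K": "AAA", "M": "AUG", "F": "UUU", "P": "CCU",
    "S": "UCU", "T": "ACU", "W": "UGG", "Y": "UAU", "V": "GUU",
}


def protein_to_rna(seq: str) -> str:
    """Return one possible RNA sequence for a 1-letter coded protein."""
    try:
        return "".join(CODON_FOR[aa] for aa in seq.upper())
    except KeyError:
        raise ValueError("Non-protein aminoacids in sequence") from None


def proteins_to_rna(*seqs: str) -> dict:
    """Return {original sequence: one possible RNA sequence}."""
    return {seq: protein_to_rna(seq) for seq in seqs}


print(proteins_to_rna("AVY", "DNT"))  # {'AVY': 'GCUGUUUAU', 'DNT': 'GAUAAUACU'}
```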
***

Function `isoelectric_point_determination` determines the isoelectric point of polyaminoacid sequences
- `main('AlaValTyr', 'DNT', method = 'isoelectric_point_determination')`
- `isoelectric_point_determination('AlaValTyr', 'DNT')`
- ![image](https://github.com/NSapozhnikov/HW4_Sapozhnikov/assets/81642791/24027a07-b20b-42d4-bb10-4ca7189038d4)
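
A common way to estimate an isoelectric point is to find the pH at which the net charge from the Henderson-Hasselbalch equation is zero. The sketch below shows that idea with bisection; it is not necessarily how `isoelectric_point_determination` works. The function names are hypothetical, and the pKa values are illustrative assumptions, since published pKa tables differ between sources.

```python
# Illustrative pKa values (assumption): different references give different numbers.
PKA_POSITIVE = {"N_term": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"C_term": 3.6, "D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}


def net_charge(seq: str, ph: float) -> float:
    """Net charge of a 1-letter coded peptide at the given pH."""
    positive = [PKA_POSITIVE["N_term"]] + [PKA_POSITIVE[aa] for aa in seq if aa in PKA_POSITIVE]
    negative = [PKA_NEGATIVE["C_term"]] + [PKA_NEGATIVE[aa] for aa in seq if aa in PKA_NEGATIVE]
    charge = sum(1 / (1 + 10 ** (ph - pka)) for pka in positive)
    charge -= sum(1 / (1 + 10 ** (pka - ph)) for pka in negative)
    return charge


def isoelectric_point(seq: str, tol: float = 1e-4) -> float:
    """Bisect on pH in [0, 14]; the net charge decreases as pH rises."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2
        if net_charge(seq, mid) > 0:
            low = mid   # still positively charged: the pI lies at a higher pH
        else:
            high = mid
    return round((low + high) / 2, 2)


for peptide in ("DNT", "KKK"):
    print(peptide, isoelectric_point(peptide))
```

An acidic peptide such as `DNT` lands at a low pH and a lysine-rich one at a high pH; the exact numbers depend on the pKa table chosen.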
***

Function `back_transcribe` converts polyaminoacid sequences into DNA sequences
- `main('AlaValTyr', 'DNT', method = 'back_transcribe')`
- `back_transcribe('AlaValTyr', 'DNT')`
- ![image](https://github.com/NSapozhnikov/HW4_Sapozhnikov/assets/81642791/71f07616-a37d-48da-9e63-82b81836b9d7)
***

Function `count_gc_content` calculates the GC ratio over the entire DNA sequence
- `main('AlaValTyr', 'DNT', method = 'count_gc_content')`
- `count_gc_content('AlaValTyr', 'DNT')`
- ![image](https://github.com/NSapozhnikov/HW4_Sapozhnikov/assets/81642791/d2705714-a3e8-4054-8998-61d922a4feb6)
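
The GC ratio itself is a simple count. The sketch below assumes the DNA string is already available (in the tool it is derived from the polyaminoacid sequence first) and returns a fraction between 0 and 1; the helper name `gc_fraction` is hypothetical, and the real function may report a percentage instead.

```python
def gc_fraction(dna: str) -> float:
    """Fraction of G and C nucleotides in a DNA sequence (0.0 for empty input)."""
    dna = dna.upper()
    if not dna:
        return 0.0
    return (dna.count("G") + dna.count("C")) / len(dna)


# "GATAATACT" is one possible DNA sequence for the peptide DNT.
print(round(gc_fraction("GATAATACT"), 3))  # 0.222
```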
***

Function `count_protein_molecular_weight` calculates the molecular weight of the polyaminoacid
- `main('AlaValTyr', 'DNT', method = 'count_protein_molecular_weight')`
- `count_protein_molecular_weight('AlaValTyr', 'DNT')`
- ![image](https://github.com/NSapozhnikov/HW4_Sapozhnikov/assets/81642791/cc1eff9a-1b39-4232-98e4-80f622101083)
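
A molecular weight estimate is usually the sum of residue masses plus one water molecule for the free termini. The sketch below follows that formula for 1-letter coded input; the function name and the rounded average residue masses (in daltons) are assumptions for illustration and may differ from the table used in `prototool.py`.

```python
# Approximate average residue masses in Da (aminoacid mass minus water).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER_MASS = 18.015


def molecular_weight(seq: str) -> float:
    """Sum of residue masses plus one water for the terminal H and OH."""
    try:
        residues = sum(RESIDUE_MASS[aa] for aa in seq.upper())
    except KeyError:
        raise ValueError("Non-protein aminoacids in sequence") from None
    return residues + WATER_MASS


for peptide in ("AVY", "DNT"):
    print(peptide, round(molecular_weight(peptide), 1))
```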

***

### Troubleshooting
If you get `ValueError("No input defined.")`, your input is empty. Please provide a non-empty input.
***
If you get `ValueError(method, " is not a valid method.")`, the specified method is not valid. Please pass a valid method name.
***
If you get `ValueError('Non-protein aminoacids in sequence')`, your sequences contain non-protein aminoacids. Please check your sequences and correct the input.
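
The input-related errors above could come from a validation step along these lines. This is a guess at the shape of such a check, not the project's `check_input()`: the `validate_sequences` name is hypothetical, and the sketch only handles 1-letter coded sequences, whereas the real tool also accepts 3-letter input.

```python
VALID_AMINOACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def validate_sequences(*seqs: str) -> None:
    """Raise ValueError for empty input or for letters outside the 20 aminoacids."""
    if not seqs or any(not seq for seq in seqs):
        raise ValueError("No input defined.")
    for seq in seqs:
        if not set(seq.upper()) <= VALID_AMINOACIDS:
            raise ValueError("Non-protein aminoacids in sequence")


validate_sequences("DNT", "MNW")  # passes silently

try:
    validate_sequences("DNX")
except ValueError as err:
    print(err)  # Non-protein aminoacids in sequence
```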

***

### Contributions and contacts

Feel free to report any bugs and problems encountered.
Email: [email protected] developed recode(), prettify_alignment(), local_alignmen(), check_input()
***
[email protected] developed from_proteins_seqs_to_rna(), isoelectric_point_determination()
***
[email protected] developed back_transcribe(), count_gc_content(), count_protein_molecular_weight()
Comment on lines +99 to +103
Member

Suggested change
Email: [email protected] developed recode(), prettify_alignment(), local_alignmen(), check_input()
***
[email protected] developed from_proteins_seqs_to_rna(), isoelectric_point_determination()
***
[email protected] developed back_transcribe(), count_gc_content(), count_protein_molecular_weight()
Nikita Sapozhnikov ([[email protected]](mailto:[email protected])) developed `recode()`, `prettify_alignment()`, `local_alignmen()`, `check_input()`
***
Daria Nekrasova ([[email protected]](mailto:[email protected])) developed `from_proteins_seqs_to_rna()`, `isoelectric_point_determination()`
***
Potyseva Alina ([[email protected]](mailto:[email protected])) developed `back_transcribe()`, `count_gc_content()`, `count_protein_molecular_weight()`


***

### References

[^1]: T.F. Smith, M.S. Waterman, (1981). [Identification of common molecular subsequences](https://doi.org/10.1016/0022-2836(81)90087-5). Journal of Molecular Biology.
Member

👍
