Review RNF34 #41
@@ -0,0 +1,134 @@
from abc import ABC, abstractmethod


class BiologicalSequence(ABC):
    @abstractmethod
    def __init__(self, sequence):
        pass

    @abstractmethod
    def __len__(self):
        pass

    @abstractmethod
    def __getitem__(self, slc):
        pass

    @abstractmethod
    def __str__(self):
        pass

    @abstractmethod
    def __repr__(self):
        pass

    @abstractmethod
    def is_valid_alphabet(self):
        pass


class NucleicAcid(BiologicalSequence):
    def __init__(self, sequence):
        self.sequence = sequence

    def __len__(self):
        return len(self.sequence)

    def __getitem__(self, slc):
        return self.sequence[slc]

    def __str__(self):
        return str(self.sequence)
Good that you played it safe and made sure the value is a string :) But it isn't necessary here, since self.sequence is already a string.
Suggested change

    def __repr__(self):
        return self.sequence
The __repr__ method should rather convey technical information about the class. I also returned self.sequence in my own homework, but while reviewing I realized it would be more correct to use a string that names the class.
Suggested change
Well spotted!
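A possible shape for the `__repr__` discussed in this thread, as a minimal sketch. The reviewer's own suggested diff is not shown here, so the class name `NucleicAcidSketch` and the exact output format are assumptions made for illustration only.

```python
class NucleicAcidSketch:
    """Hypothetical stand-in for NucleicAcid, reduced to the two string methods."""

    def __init__(self, sequence):
        self.sequence = sequence

    def __str__(self):
        # User-facing form: just the sequence itself.
        return self.sequence

    def __repr__(self):
        # Developer-facing form: class name plus the value, so a subclass
        # reports its own name without overriding this method.
        return f"{type(self).__name__}({self.sequence!r})"


print(repr(NucleicAcidSketch("ATGC")))  # NucleicAcidSketch('ATGC')
```

Using `type(self).__name__` keeps the method polymorphic in the same way the PR already does with `type(self).ALPHABET`.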

    def is_valid_alphabet(self):
        alphabet = type(self).ALPHABET
Nice! This satisfies the assignment's requirement for polymorphism across the NucleicAcid, DNASequence and RNASequence classes.
I would also add that since the alphabet is no longer an external constant but a class attribute, it no longer needs to be written in caps.
        if set(self.sequence).issubset(alphabet):
            return True
        else:
            return False
Comment on lines +48 to +51
A very sensible check, but there is no need to return True and False explicitly here. This works just as well:
Suggested change
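The reviewer's original snippet is not preserved here. A minimal sketch of the simplification the comment describes, written as a standalone function (a hypothetical form, since the PR defines it as a method) so it can be run on its own:

```python
def is_valid_alphabet(sequence, alphabet):
    # set.issubset already returns a bool, so the if/else wrapper is redundant.
    return set(sequence).issubset(alphabet)


print(is_valid_alphabet("ATGC", set("ATGC")))  # True
print(is_valid_alphabet("AUGC", set("ATGC")))  # False
```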

    def complement(self):
        if type(self) == NucleicAcid:
            raise NotImplementedError("Cannot complement NucleicAcid instance")
Comment on lines +54 to +55
It's great that you check the type of the incoming sequence! However, this check does not account for the difference between DNA and RNA sequences. In my view, it would be more correct to split it per class. Also, type checks are better done with the `is` construct.
Suggested change
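The suggested diff itself is not shown here. One way to read the `is` remark, as a sketch under assumptions: the classes below are trimmed stand-ins for the PR's `NucleicAcid` and `DNASequence`, keeping only what `complement` needs.

```python
class NucleicAcid:
    def __init__(self, sequence):
        self.sequence = sequence

    def complement(self):
        # Class objects are compared by identity, so `is` states the intent
        # more directly than `==`.
        if type(self) is NucleicAcid:
            raise NotImplementedError("Cannot complement NucleicAcid instance")
        complement_map = type(self).MAP
        return type(self)("".join(complement_map[base] for base in self.sequence))


class DNASequence(NucleicAcid):
    MAP = {"A": "T", "T": "A", "C": "G", "G": "C"}


print(DNASequence("ATGC").complement().sequence)  # TACG
```

The DNA/RNA distinction the comment asks for is already carried by each subclass's own `MAP`; whether to move the check itself into the subclasses is a design choice this sketch does not settle.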

        map_dict = type(self).MAP
        comp_seq = "".join([map_dict[base] for base in self.sequence])
Nice construct! I've made a note of it for myself :)

        return type(self)(comp_seq)
Great! It returns not just plain strings but instances of the specific class.

    def gc_content(self):
        if type(self) == NucleicAcid:
            raise NotImplementedError("Cannot gc_content NucleicAcid instance")
        gc_count = sum([1 for base in self.sequence if base in ["C", "G"]])
Another nice construct. It might be worth handling lowercase c and g here as well, just in case. Who knows what the user might pass in.
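A minimal sketch of a case-insensitive GC count, written as a standalone function rather than the PR's method. The empty-sequence guard is an assumption added here and was not part of the review comment.

```python
def gc_content(sequence):
    """Return the GC percentage, counting upper- and lowercase bases alike."""
    if not sequence:
        # Assumption: fail early instead of dividing by zero.
        raise ValueError("Cannot compute GC content of an empty sequence")
    # Normalize case once, then count G and C.
    gc_count = sum(1 for base in sequence.upper() if base in "GC")
    return gc_count / len(sequence) * 100


print(gc_content("atGC"))  # 50.0
```

Note that the PR's `MAP` dictionaries only have uppercase keys, so accepting lowercase input in `gc_content` alone would still leave `complement` failing on it; normalizing the sequence once in `__init__` may be the simpler option.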
        gc_content = (gc_count / len(self)) * 100

        return gc_content


class DNASequence(NucleicAcid):
    ALPHABET = set("ATGC")
    MAP = {"A": "T", "T": "A", "C": "G", "G": "C"}
Comment on lines +72 to +73
Great that the constants are defined as class attributes. In my view, though, they are syntactically no different from variables, so they should be written in lowercase.
Suggested change
Ah, great, yes.

    def transcribe(self):
        transcribed = self.sequence.replace("T", "U")

        return RNASequence(transcribed)


class RNASequence(NucleicAcid):
The DNASequence and RNASequence classes contain all the methods required by the assignment. However, they do not check that the sequence actually matches the class. Roughly speaking, RNASequence('ATGC') works fine, but it is biologically incorrect. We already defined the is_valid_alphabet() method, which would serve this purpose well. To apply it, we can override __init__ in the new class. For example, I did it like this:
Suggested change
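The reviewer's own `__init__` is not preserved here. One way the idea could look, as a sketch under assumptions: the classes are trimmed stand-ins for the PR's, and raising `ValueError` with this message is a choice made for illustration.

```python
class NucleicAcid:
    ALPHABET = set()

    def __init__(self, sequence):
        self.sequence = sequence

    def is_valid_alphabet(self):
        return set(self.sequence).issubset(type(self).ALPHABET)


class RNASequence(NucleicAcid):
    ALPHABET = set("AUGC")

    def __init__(self, sequence):
        super().__init__(sequence)
        # Reject input that does not fit this class's alphabet.
        if not self.is_valid_alphabet():
            raise ValueError(f"Not a valid RNA sequence: {sequence!r}")


print(RNASequence("AUGC").sequence)  # AUGC
```

With this in place, `RNASequence("ATGC")` raises `ValueError` instead of silently building a biologically invalid object.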
    ALPHABET = set("AUGC")
    MAP = {"A": "U", "U": "A", "C": "G", "G": "C"}


class AminoAcidSequence(BiologicalSequence):
It would also be good to add a check here that the sequence is a valid amino acid sequence.
    ALPHABET = set("ACDEFGHIKLMNPQRSTVWY")

    def __init__(self, sequence):
        self.sequence = sequence

    def __len__(self):
        return len(self.sequence)

    def __getitem__(self, slc):
        return self.sequence[slc]

    def __str__(self):
        return str(self.sequence)

    def __repr__(self):
        return self.sequence

    def is_valid_alphabet(self):
        alphabet = type(self).ALPHABET
        if set(self.sequence).issubset(alphabet):
            return True
        else:
            return False

    amino_acid_frequency = {}

    def calculate_aa_freq(self):
        """
        Calculates the frequency of each amino acid in a protein sequence or sequences.

        :param sequences: protein sequence or sequences
        :type sequences: str or list of str
        :return: dictionary with the frequency of each amino acid
        :rtype: dict
        """
Comment on lines +114 to +121
The docstring could have been tweaked a little when moving the function into the new program :)
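A sketch of what an adjusted docstring might look like now that the function is a method with no `sequences` parameter. The wording and the class name `AminoAcidSequenceSketch` are assumptions, and the counting loop is condensed with `dict.get` rather than copied from the PR.

```python
class AminoAcidSequenceSketch:
    """Hypothetical stand-in for AminoAcidSequence, reduced to one method."""

    def __init__(self, sequence):
        self.sequence = sequence

    def calculate_aa_freq(self):
        """
        Count how many times each amino acid occurs in this sequence.

        :return: dictionary mapping each amino acid to its number of occurrences
        :rtype: dict
        """
        amino_acid_frequency = {}
        for amino_acid in self.sequence:
            # Start from 0 for a residue seen for the first time.
            amino_acid_frequency[amino_acid] = amino_acid_frequency.get(amino_acid, 0) + 1
        return amino_acid_frequency


print(AminoAcidSequenceSketch("AAC").calculate_aa_freq())  # {'A': 2, 'C': 1}
```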

        # Creating a dictionary with aminoacid frequencies:
        amino_acid_frequency = {}

        for amino_acid in self.sequence:
            # If the aminoacid has been already in:
            if amino_acid in amino_acid_frequency:
                amino_acid_frequency[amino_acid] += 1
            # If the aminoacid hasn't been already in:
            else:
                amino_acid_frequency[amino_acid] = 1

        return amino_acid_frequency
Excellent work with the abstract class. All the required abstract methods are covered!