Added a clustal test file, an example notebook, and alignment parsing… #1

biophyser · 2017-10-29T02:28:27Z

… in the _read function. Also made two hidden functions to iterate over Biopython sequence and alignment objects.

Did a bit of hacking so the DataFrames keep the same form as before. Multiple sequence alignments come out as a single MultiIndexed DataFrame.
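For anyone skimming the thread, here is a rough sketch of what a MultiIndexed result could look like. It is not the code in this PR: the row dicts, the `id` and `sequence` column names, and the index level names are all made up for illustration.

```python
import pandas as pd

# Hypothetical parsed output: one list of row dicts per alignment in the file.
alignments = [
    [{"id": "mouse", "sequence": "ACGT-A"}, {"id": "opossum", "sequence": "ACGTTA"}],
    [{"id": "mouse", "sequence": "GG-CA"}, {"id": "opossum", "sequence": "GGTCA"}],
]

# One DataFrame per alignment, keyed by its position in the file.
frames = {i: pd.DataFrame(rows) for i, rows in enumerate(alignments)}

# Concatenating a dict of frames yields a two-level MultiIndex:
# outer level = alignment number, inner level = row within that alignment.
data = pd.concat(frames, names=["alignment", "row"])

# Select every record of the second alignment by its outer index label.
second = data.loc[1]
```

With this layout, each alignment is addressed by a number in the outer index level rather than by a name.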


Zsailer · 2017-10-29T18:15:43Z

Awesome! This looks good. I'm testing it now and will leave comments if I see things that need fixing.

Zsailer · 2017-10-29T18:24:38Z

One general comment about coding style:

I typically use a "lowercase and underscore" convention when naming instance variables, functions,
and methods (except in the unusual scenario where a function is a factory for a class).

This convention was set by PEP 8. I'd like to follow PEP 8 as closely as we can.

I'll leave comments in the PR where this convention is broken.
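To make the convention concrete, here is a small sketch of a helper that turns mixedCase names into the lowercase-and-underscore form. `to_snake_case` is a hypothetical helper written for this comment, not something in phylopandas, and it only handles simple names like the ones in this PR.

```python
import re


def to_snake_case(name):
    """Convert a mixedCase identifier to lowercase_with_underscores."""
    # Insert an underscore wherever a lowercase letter or digit is
    # immediately followed by an uppercase letter, then lowercase it all.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


# Names taken from the diff under review.
for old_name in ["alignIter", "alignDict", "seqIter", "seqDict", "_parseSeqRec"]:
    print(old_name, "->", to_snake_case(old_name))
```

A mechanical conversion only fixes the casing. Choosing a clearer name while renaming is still a judgment call.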

Zsailer

I only marked a few of the places where PEP 8 was broken. There are a few more that need fixing as well.

Zsailer · 2017-10-29T18:25:40Z

phylopandas/dataframe.py

+
+    # Distinguish sequence vs. alignment, pass SeqRecords
+    if schema == 'clustal':
+    	alignIter = AlignIO.parse(filename, format=schema, **kwargs)


follow PEP 8: alignIter -> align_iter

Zsailer · 2017-10-29T18:26:10Z

phylopandas/dataframe.py

+    if schema == 'clustal':
+    	alignIter = AlignIO.parse(filename, format=schema, **kwargs)
+    	# Parse the multiple sequence alignments
+    	alignDict = _parseAlignRec(alignIter, seq_label)


follow PEP 8: alignDict -> align_dict

Zsailer · 2017-10-29T18:27:22Z

phylopandas/dataframe.py

+    # Return DataFrame.
+    return data
+
+def _parseAlignRec(alignIter, seq_label):


follow PEP 8:

_parseAlignRec -> _parse_align_record

alignIter -> align_iter

Zsailer · 2017-10-29T18:28:03Z

phylopandas/dataframe.py

+
+	return alignData
+
+def _parseSeqRec(seqIter, seq_label):


follow PEP 8:

_parseSeqRec -> _parse_seq_rec

seqIter -> seq_iter

Zsailer · 2017-10-29T18:28:30Z

phylopandas/dataframe.py

+def _parseSeqRec(seqIter, seq_label):
+	"""A Bio.SequenceRecord  parser.
+	"""
+	seqDict = dict()


follow PEP 8: seqDict -> seq_dict

Continue below.
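Putting the renames together, the sequence helper might end up looking roughly like the sketch below. The body is an assumption, since the diff only shows the first lines of the function. `Record` is a hypothetical stand-in with `id`, `seq`, and `description` attributes so the example runs without Biopython, and the output column names are illustrative.

```python
from collections import namedtuple

# Hypothetical stand-in for a Biopython record; only these attributes are assumed.
Record = namedtuple("Record", ["id", "seq", "description"])


def _parse_seq_rec(seq_iter, seq_label):
    """Collect records from an iterator into a dict of column lists."""
    seq_dict = {"id": [], seq_label: [], "description": []}
    for record in seq_iter:
        seq_dict["id"].append(record.id)
        # Store the sequence as a plain string under the caller's label.
        seq_dict[seq_label].append(str(record.seq))
        seq_dict["description"].append(record.description)
    return seq_dict


records = iter([
    Record("mouse", "ACGT", "example 1"),
    Record("opossum", "ACGA", "example 2"),
])
columns = _parse_seq_rec(records, seq_label="sequence")
```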

Zsailer · 2017-10-29T19:16:59Z

It might be nice to be able to name the alignments instead of having each alignment be multiindexed by a number but maybe that's not a huge issue.

I think I agree with this statement: give each item a value in the name column (like alignment_1 and alignment_2) rather than using a MultiIndex.
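Roughly what I have in mind, as a sketch only. The row dicts and the `id` and `sequence` columns are placeholders, not the real parser output.

```python
import pandas as pd

# Hypothetical parsed output: one list of row dicts per alignment in the file.
alignments = [
    [{"id": "mouse", "sequence": "ACGT-A"}, {"id": "opossum", "sequence": "ACGTTA"}],
    [{"id": "mouse", "sequence": "GG-CA"}, {"id": "opossum", "sequence": "GGTCA"}],
]

rows = []
for number, records in enumerate(alignments, start=1):
    # Label every record with the alignment it came from.
    label = f"alignment_{number}"
    for record in records:
        rows.append({"name": label, **record})

# A flat DataFrame with a default integer index and a 'name' column.
data = pd.DataFrame(rows)

# Filtering on the column replaces selecting on an outer index level.
first = data[data["name"] == "alignment_1"]
```

The index stays a plain integer range, and grouping by alignment becomes an ordinary `groupby("name")`.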

Could you explain the output in the example clustal file? Is each mouse/opossum line pair just a segment of one longer sequence, or is each pair a different sequence?

Added a clustal test file, an example notebook, and alignment parsing…

90330e1

… in the _read function. Also made two hidden functions to iterate over Biopython sequence and alignment objects.

Zsailer requested changes Oct 29, 2017
