Note
This is an introductory Python course for Biology Master's students at the University of Würzburg. The material is based on books that are freely available or accessible through the University Library. The course material can be re-used under CC-BY and the source code is under the MIT license. The original course material is hosted on WueCampus.
An announcement forum (hosted in WueCampus)
Dear students,
this message contains important information about our course Programmieren mit Python, please read carefully:
- please fill out the very short pre-course survey in WueCampus by Friday, February 14
- there will be a beginner and an advanced group (you can decide for yourself which one to join)
- the course starts next week on Monday, February 17 at 9:00 am for everyone
- after that, the beginner and advanced groups will meet at different times via Zoom (see tentative schedule in WueCampus)
- between meetings you will work through assignments on your own machine
- please perform the setup steps outlined in WueCampus and verify that the setup works
- if you have any problems with the setup, let me know by Friday, February 14 (so it is best to do the setup now)
- join the Zoom meetings with the device you use for programming so you can potentially share your screen
I'm looking forward to seeing you all next week. Markus Ankenbrand
- English or German: Do you prefer to have this course in English or German?
- Prior experience: Have you ever worked with Python before? yes/no
- Other languages: Have you programmed in any other programming languages? Which ones?
- Reading code: Do you understand what the following Python code does? And would you feel comfortable explaining it to the class? yes (understanding)/yes (understanding and explaining)/no
l = [1, 4, 6, 9, 2, 4, 3]
m = l[0]
for i in l:
    if i > m:
        m = i
print(m)
- Writing code: Do you feel comfortable writing Python code to solve the following problem and explaining it to the class? yes (coding)/yes (coding and explaining)/no
Given two lists (l1 and l2) with an equal number of elements, create a third list, l3, that contains the smaller of the two numbers from l1 and l2 at that position:
l1 = [1, 4, 6, 9, 2, 4, 3]
l2 = [3, 2, 1, 6, 3, 7, 2]
# task: write some code that produces
l3 = [1, 2, 1, 6, 2, 4, 2]
https://github.com/BioMeDS/Python_Course_WS24
All lectures happen in the same Zoom room (accessible through WueCampus)
A question forum (hosted in WueCampus)
This is a hands-on course, so you will practice programming in Python using your own computer. Therefore, you need to install some required software:
- Miniforge
- Visual Studio Code (with python and jupyter extensions)
- Git (advanced group)
Please follow the instructions linked above to first install Miniforge (follow them closely) and then VS Code. Then install the extensions from within VS Code. If you plan to attend the advanced group, install Git as well.
To finalize and test the installation, create a folder for the python course and open this folder in VS Code. Create a new file called "test.ipynb" and create a new "Code" cell in the created notebook. Enter "1+1" and click the little play icon to run the code. You will be prompted to install required packages. After confirming and waiting a minute, the result "2" should show up under the cell (see screenshot). 🎉 Congratulations, you are all set 🎉

Time | Monday | Tuesday | Wednesday | Thursday | Friday |
---|---|---|---|---|---|
09:00 | Welcome | Beginner | Beginner | Beginner | Exam Assignments |
10:00 | Beginner | | | | |
11:00 | Advanced | | | | |
12:00 | | | | | |
13:00 | Advanced | Advanced | Advanced | Final Question Round | |
14:00 | | | | | |
15:00 | Beginner | Beginner | Beginner | Beginner | |
16:00 | | | | | |
Python for the Life Sciences - A Gentle Introduction to Python for Life Scientists by Alexander Lancaster and Gordon Webster
Free full text from the university network | SpringerLink
- work through Chapter 2, "Python at the Lab Bench"
- follow along with the code examples by manually typing them into a jupyter notebook (in VS Code)
- execute the code examples and try some modifications
- write down any questions and problems that occur, and we'll discuss them in the afternoon session
- when you are done, try to solve the exercises
- Write a function fahrenheit_to_celsius(temperature) that can convert Fahrenheit to Celsius and a function celsius_to_fahrenheit(temperature) that does the reverse
- Write a function is_leap_year(year) that tells whether any given year is a leap year or not
- Writing and running code (with Jupyter Notebooks)
- Variables and Types (float, int, string, boolean (True/False))
- Comments
- Code blocks (indentation)
- Functions
- Conditionals (if/elif/else)
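The topics above can be illustrated with a minimal sketch. The function classify_ph and all variable names are hypothetical and not taken from the book; they only show variables of different types, a comment, an indented code block, a function, and an if/elif/else chain.

```python
# Hypothetical example: classify a solution by its pH value.

def classify_ph(ph):
    """Return 'acidic', 'neutral' or 'basic' for a pH value."""
    # The indented lines below form the body (code block) of the function.
    if ph < 7:
        return "acidic"
    elif ph == 7:
        return "neutral"
    else:
        return "basic"


sample_name = "buffer A"  # string
ph_value = 7.4            # float
replicates = 3            # int
is_sterile = True         # boolean

label = classify_ph(ph_value)
print(sample_name, "is", label)
```

Typing such a snippet into a notebook cell and changing ph_value is a quick way to see which branch of the conditional is executed.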
- work through Chapter 3, "Making Sense of Sequences"
- follow along with the code examples, execute them and try some modifications
- write down any questions and problems that occur, and we'll discuss them in the afternoon session
- try to implement the final restrictionDigest function (from the end of the chapter) without typing it 1-to-1
- when you are done, try to solve the exercises
- Write a function subsequence_positions(sequence, subsequence) that takes a sequence and a subsequence as a string and returns a list of all positions where the subsequence occurs within the sequence
- Write a function subsequences_positions(sequence, subsequences) that takes a sequence as a string and a list of subsequences and returns a dictionary with each subsequence as keys and a list of all positions where that subsequence occurs within the sequence as values
- Loops (for)
- lists
- dictionaries
- string searching
- methods
- Code blocks (indentation)
- Functions
- Conditionals (if/elif/else)
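As a rough illustration of these topics, the following sketch uses a made-up DNA string (an assumption for demonstration, not an example from the book). It combines a for loop, a dictionary, a list, and the string method find, which returns the index of the first match or -1 if there is none.

```python
# Made-up sequence for illustration only.
sequence = "ATGGCGAATTCTTAG"

# Count each base with a for loop and a dictionary.
base_counts = {}
for base in sequence:
    base_counts[base] = base_counts.get(base, 0) + 1

# String searching: find() gives the first position of a motif, or -1.
first_site = sequence.find("GAATTC")
missing_site = sequence.find("CCCGGG")

# Build a list of codons by stepping through the sequence in threes.
codons = []
for start in range(0, len(sequence), 3):
    codons.append(sequence[start:start + 3])

print(base_counts)
print(first_site, missing_site)
print(codons)
```

Note that find only reports the first occurrence; locating all occurrences requires calling it repeatedly with a changing start position or looping over the sequence.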
- work through Chapter 4, "A Statistical Interlude"
- follow along with the code examples, execute them, and try some modifications
- write down any questions and problems that occur, and we'll discuss them in the afternoon session
- when you are done, try to solve the exercises; you'll need the following additional tutorials:
- walk through the first two pandas tutorials: "What kind of data does pandas handle" and "How do I read and write tabular data"
- walk through the basic matplotlib tutorial
- Calculate the results of biomarker(pDisease, 0.8, 0.04) with pDisease taking all values from 0 to 1 in steps of 0.05. Store the results in a list.
- Create a pandas DataFrame with two columns pDisease and biomarker with the values from the previous exercise and save it as biomarker.csv.
- Plot the result from exercise 1 using matplotlib, with pDisease on the x-axis and biomarker on the y-axis.
To install pandas, run mamba install pandas in the Miniforge Prompt (Windows) or a Terminal (macOS, Linux).
- function arguments
- string formatting
- tables with pandas
- plotting with matplotlib
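The first two topics in this list, function arguments and string formatting, can be sketched without any additional packages. The function dilute below is a hypothetical example (not from the book) based on the dilution relation C1 * V1 = C2 * V2; it shows positional arguments, default values, keyword arguments, and f-string number formatting.

```python
def dilute(stock_conc, final_conc, final_volume=1.0, unit="mL"):
    """Return a pipetting instruction for a simple dilution (hypothetical helper)."""
    # C1 * V1 = C2 * V2, solved for the stock volume V1.
    stock_volume = final_conc * final_volume / stock_conc
    # :.2f formats a number with two decimal places.
    return f"Use {stock_volume:.2f} {unit} stock in {final_volume:.2f} {unit} total"


# Positional arguments only; the defaults are used for the rest.
print(dilute(10, 2))
# Override one default by name.
print(dilute(10, 2, final_volume=5))
# All arguments passed as keywords, in any order.
print(dilute(unit="uL", final_volume=250, final_conc=1, stock_conc=50))
```

Keyword arguments make calls easier to read when a function takes several numbers whose order is easy to mix up.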
- work through Chapter 5, "Opening Doors to Your Data"
- follow along with the code examples, execute them, and try some modifications
- write down any questions and problems that occur, and we'll discuss them in the afternoon session
- when you are done, try to solve the exercises
- Re-write the fasta parser to a function parse(file) that takes a file name and returns the dictionary
- Add an optional argument save_stats=False to the function that, if set to True, will save a file stats.csv with the three columns: sequence id, length, GC %
- reading text files
- type conversion
- exception handling (try/except)
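The three topics above often appear together when reading measurement files. The sketch below is an assumption-laden illustration: the file name measurements.txt and its contents are invented, and the file is written to a temporary folder first so that the example is self-contained.

```python
import os
import tempfile

# Create a small example file (invented data) so the sketch runs anywhere.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "measurements.txt")
with open(path, "w") as handle:
    handle.write("0.52\n1.30\nn/a\n2.75\n")

values = []   # successfully converted numbers
skipped = []  # entries that could not be converted

# Read the file line by line.
with open(path) as handle:
    for line in handle:
        text = line.strip()  # remove the trailing newline
        try:
            # Type conversion: str -> float; raises ValueError for "n/a".
            values.append(float(text))
        except ValueError:
            skipped.append(text)

print(values)
print(skipped)
```

Catching only ValueError, rather than every exception, keeps unrelated problems such as a missing file visible.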
Bioinformatics Algorithms: An Active Learning Approach by Phillip Compeau and Pavel Pevzner
- work through Chapter 1.1-1.3, "A journey of 1000 miles...", "Hidden messages in the replication origin", and "Some Hidden Messages are More Surprising than Others"
- create a git repository for this course (optionally link it to a public GitHub repository)
- implement code to solve the "coding challenges"
- put all functions in a python file and use them with example input from a jupyter notebook (you might want to use autoreload)
- write down any questions and problems that occur, and we'll discuss them in the session tomorrow
- make git commits after every meaningful step (e.g. implementing a new function)
- Implement PatternCount
- Solve the Frequent Words Problem
- Solve the Reverse Complement Problem
- Solve the Pattern Matching Problem
- Version Control with Git: An introduction to Git in VS Code
- Pro Git Book: A deep-dive into git. The basics are covered in the first few chapters. You don't need to read this book to follow along.
- Version control (with git)
- Interactive code execution (with jupyter notebooks)
- Importing code from a local python file
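Importing from a local Python file can be sketched as follows. The module name seqtools_demo and the function count_base are hypothetical; the module is written to a temporary folder here only to keep the example self-contained. In the course setup the file would simply sit next to the notebook, in which case the sys.path line is not needed.

```python
import os
import sys
import tempfile

# Write a tiny module file (hypothetical name and content).
project_dir = tempfile.mkdtemp()
module_path = os.path.join(project_dir, "seqtools_demo.py")
with open(module_path, "w") as handle:
    handle.write(
        "def count_base(sequence, base):\n"
        "    return sequence.count(base)\n"
    )

# Make the folder importable; unnecessary if the file is next to the notebook.
sys.path.insert(0, project_dir)

import seqtools_demo

n_adenine = seqtools_demo.count_base("GATTACA", "A")
print(n_adenine)

# In a Jupyter notebook, the IPython autoreload extension re-imports
# edited modules automatically:
#   %load_ext autoreload
#   %autoreload 2
```

Without autoreload, changes to the file only take effect after restarting the kernel or reloading the module explicitly.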
- work through Chapter 1.4, "An Explosion of Hidden Messages"
- add documentation for all functions of this course (past and future)
- add type hints for all functions of this course (past and future)
- add extensive tests for all functions of this course (past and future)
- benchmark your functions with %%timeit
- implement code to solve the "coding challenge"
- write down any questions and problems that occur, and we'll discuss them in the session tomorrow
- continue to make git commits after every meaningful step (e.g. implementing a new function, test, etc.)
- Solve the Clump Finding Problem
- Find all (500,3)-clumps in the E. coli genome (how long does your code take for this task?)
- Documentation (docstrings)
- Type hints
- Testing (unittest)
- Benchmarking (%%timeit)
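A minimal sketch of how these four topics can fit together is shown below. The function gc_content and the test class are hypothetical examples, not part of the course solutions. Because %%timeit only works inside a notebook cell, the sketch uses the standard-library timeit module as a stand-in.

```python
import timeit
import unittest


def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence.

    An empty sequence returns 0.0.
    """
    if not sequence:
        return 0.0
    gc = sequence.count("G") + sequence.count("C")
    return gc / len(sequence)


class TestGcContent(unittest.TestCase):
    def test_mixed_sequence(self):
        self.assertAlmostEqual(gc_content("ATGC"), 0.5)

    def test_empty_sequence(self):
        self.assertEqual(gc_content(""), 0.0)


# Run the tests programmatically (works in a script and in a notebook).
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestGcContent)
result = unittest.TextTestRunner(verbosity=0).run(suite)

# Benchmark: total time in seconds for 100 calls on a 4000-base sequence.
elapsed = timeit.timeit(lambda: gc_content("ATGC" * 1000), number=100)
print(f"100 calls took {elapsed:.4f} s")
```

In a notebook, the benchmark line could instead be a cell starting with %%timeit followed by the call to measure.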
- work through Chapters 1.5-1.7, "The Simplest Way to Replicate DNA", "Asymmetry of Replication", and "Peculiar Statistics of the Forward and Reverse Half-Strands"
- select and install a formatter and let it format your python files and notebooks (PEP-8 style guide)
- select and install a linter and solve any problems it reports
- implement code to solve the "coding challenge"
- write down any questions and problems that occur, and we'll discuss them in the session tomorrow
- continue to
  - make regular git commits
  - document your code (including type hints)
  - test your code
- Solve the Minimum Skew Problem
- Write a program that takes a single-sequence fasta file as an input and creates a skew diagram as a png
- Extend the above program for multi-sequence fasta files and instead create a pdf file with one plot per sequence (multi-page pdf with matplotlib)
- Try out the debugging feature of VS Code to run your code step-by-step
- Inspect the test coverage in your repository
- Salmonella_enterica.fa
- Prochlorococcus.fa (multi fasta)
- Enforcing a consistent style (formatting)
- Avoid potential problems (linting)
- Debugging
- Test Coverage
- work through Chapters 1.8-1.9, "Some Hidden Messages are More Elusive than Others", and "A Final Attempt at Finding DnaA Boxes in E. coli"
- implement code to solve the "coding challenge"
- transform your repository into a python package pycourse24 that you can install into your environment with pip install -e . and then import from anywhere in the file system
- continue to
  - make regular git commits
  - document your code (including type hints)
  - test your code
- Solve the Hamming Distance Problem
- Solve the Approximate Pattern Matching Problem
- Implement ApproximatePatternCount
- Solve the Frequent Words with Mismatches Problem
- Solve the Frequent Words with Mismatches and Reverse Complements Problem
- (Optionally) Solve the Final Challenge: Find a DnaA box in Salmonella enterica.
- Python Packaging Guide. Minimal Structure for a python package. Don't publish your package on PyPI.
- Python packages
- Codingame: Level up your coding with games, puzzles, and challenges.
- Introduction to Machine Learning with Python by Andreas C. Müller, Sarah Guido
- Python Data Science Handbook: Introduction to Data Science with Python after you have learned the language basics.
[Link to evaluation form]
Please take the time to (anonymously) evaluate this course.
Solve all six assignments in the Rosalind course room by March 7, 2025. As this is an official examination, you must solve the assignments independently. You can use Google, but don't copy and paste solutions from someone else (including ChatGPT). You need to upload your code for each challenge. Be prepared to explain your solution to Markus. In order to pass the exam:
- sign up for the exam in WueStudy
- solve all six assignments before the deadline
- Markus checks the code you submitted
- if the code looks suspicious you have to explain your solutions to Markus
- some students will be selected at random to explain their solutions to Markus
- if you are unable to explain your solution, you need to hand in another solution
[Course Sign-up on Rosalind] Set your full name to your real name.