Misc small edits, beef up explanation of some new concepts
projectgus committed Jun 12, 2013
1 parent 825d75a commit 4a14105
Showing 5 changed files with 63 additions and 36 deletions.
22 changes: 18 additions & 4 deletions core/charts.md
@@ -9,13 +9,13 @@ So far we haven't done anything to really explore IPython Notebook's features, b

## Inline Charts

Start by running the following snippet in an IPython Notebook cell:
If you're using Windows, you'll need to run the following special IPython command in an IPython Notebook cell:

%pylab inline

... this tells IPython Notebook that you want charts to be shown "inline" inside your notebook.

NB: You can also specify this on the command line by launching notebook with the arguments "--pylab inline"
If you're not using Windows and you started Notebook from a command line with the arguments `--pylab inline`, then this is already done.

## Simple Example

@@ -95,9 +95,23 @@ We create a range of indexes for the X values in the graph, one entry for each e

`plt.xticks()` specifies a range of values to use as labels ("ticks") for the X axis.

`x + 0.5` is a special expression because x is a NumPy array. NumPy arrays have some special capabilities that normal lists or `range()` objects don't have. `x + 0.5` for a normal Python range would be erroneous (you can't add a plain number to a list), but for [NumPy arrays this means](http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html#arithmetic-and-comparison-operations) "add 0.5 to all of the numbers in the array."
`x + 0.5` is a special expression because x is a NumPy array. NumPy arrays have some special capabilities that normal lists or `range()` objects don't have.

This means that `0,1,2,3`,etc. becomes `0.5,1.5,2.5,3.5`,etc. This is what positions the X axis labels in the middle of each bar. If you remove the `+ 0.5` then the labels move across to the left hand side of each bar. Try it and see!
Doing this with a normal range is an error (try it and see):

x = range(5)
print(x)
print(x + 0.5)

However, for [NumPy arrays this means](http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html#arithmetic-and-comparison-operations) "add 0.5 to all of the numbers in the array."

x = np.arange(5)
print(x)
print(x + 0.5)

Run the above code in IPython Notebook and see what it prints out.

This is what positions the X axis labels in the middle of each bar (0.5 across from the left hand side.) If you remove the `+ 0.5` from the bar graph example then the labels move across to the left hand side of each bar. Try it and see!
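
For comparison, here is a small sketch of what the same shift looks like without NumPy. It only uses built-in Python, and the variable names `worked` and `shifted` are made up for this illustration. Adding a number to a plain range raises a `TypeError`, so each value has to be shifted one at a time, for example with a list comprehension:

```python
x = range(5)

# Adding a float to a plain range is not allowed and raises TypeError.
try:
    x + 0.5
    worked = True
except TypeError:
    worked = False

# Without NumPy, each element has to be shifted individually.
shifted = [value + 0.5 for value in x]

print(worked)
print(shifted)
```

The NumPy version does the same element-by-element addition in a single expression.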

Finally, `rotation=90` ensures that the labels are drawn sideways (90 degree angle) not straight. You can experiment with different rotations to create different effects.

26 changes: 14 additions & 12 deletions core/csv.md
@@ -7,12 +7,12 @@ title: Reading and writing comma-separated data

Comma-separated values (CSV) is a way of expressing structured data in flat text files:

"Coffee","Water,Milk,Icecream"
"Espresso","No,No,No"
"Long Black","Yes,No,No"
"Flat White","No,Yes,No"
"Cappuccino","No,Yes - Frothy,No"
"Affogato","No,No,Yes"
"Coffee","Water","Milk","Icecream"
"Espresso","No","No","No"
"Long Black","Yes","No","No"
"Flat White","No","Yes","No"
"Cappuccino","No","Yes - Frothy","No"
"Affogato","No","No","Yes"

It's a commonly used format to get data in and out of programs like spreadsheet software, where the data is tabular.
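
As a minimal sketch of how data like this can be read, Python's standard `csv` module splits each line into fields and removes the surrounding quotes. The sample text is pasted into a string and wrapped in `io.StringIO` purely for illustration; a real program would open a file instead.

```python
import csv
import io

# A few rows copied from the example above; io.StringIO stands in for a real file.
sample = '''"Coffee","Water","Milk","Icecream"
"Espresso","No","No","No"
"Cappuccino","No","Yes - Frothy","No"
'''

rows = list(csv.reader(io.StringIO(sample)))
header = rows[0]

# Each row is a list of strings, one entry per column.
for row in rows[1:]:
    print(row[0], "- milk:", row[2])
```

Because every field is quoted separately, a value such as `Yes - Frothy` arrives as a single list entry.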

@@ -44,7 +44,9 @@ We're going to do some processing of real-world data now, using freely available

**TIP:** As we're moving on from radishes to aircraft, now is a good time to start a new notebook in IPython Notebook (under File->New) to keep everything organised. Don't forget to save your old notebook!

Visit the [OpenFlights data page](http://openflights.org/data.html) and download their airports data file - "airports.dat". This is a file in CSV format.
Visit the [OpenFlights data page](http://openflights.org/data.html) and download their airports data file - "airports.dat". This is a file in CSV format; open it in a text editor if you want to have a look at it.

## Challenge

Can you use this file to print all of the airport names for a particular country (say, Australia or Russia)?

@@ -83,9 +85,9 @@ By using both data sources, we can calculate how far each route travels and then

This is a multiple stage problem:

* Read the airport database (airports.dat) and build a dictionary mapping the unique airport ID to the geographical coordinates (latitude & longitude.) This allows you to look up the location of each airport by its ID.
* Read the airports file (airports.dat) and build a dictionary mapping the unique airport ID to the geographical coordinates (latitude & longitude.) This allows you to look up the location of each airport by its ID.

* Read the routes database (routes.dat) and get the IDs of the source and destination airports. Look up the latitude and longitude based on the ID. Using those coordinates, calculate the length of the route and append it to a list of all route lengths.
* Read the routes file (routes.dat) and get the IDs of the source and destination airports. Look up the latitude and longitude based on the ID. Using those coordinates, calculate the length of the route and append it to a list of all route lengths.

* Plot a histogram based on the route lengths, to show the distribution of different flight distances.
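
The first two stages can be sketched roughly as follows. This is only an illustration under stated assumptions: the airport IDs, the approximate coordinates and the `great_circle_km` helper are made up here, and the haversine formula is one common way to estimate distance on a sphere, not necessarily the method this workshop uses. Real code would fill `airports` and `routes` from airports.dat and routes.dat.

```python
import math

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km, treating the Earth as a sphere of radius 6371 km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

# Hypothetical sample data: airport ID -> (latitude, longitude) in degrees.
airports = {"1": (-33.95, 151.18), "2": (-37.67, 144.84)}

# Hypothetical routes as (source ID, destination ID) pairs.
routes = [("1", "2"), ("2", "1"), ("1", "999")]

distances = []
for source_id, dest_id in routes:
    # Skip routes that mention an airport we have no coordinates for.
    if source_id in airports and dest_id in airports:
        lat1, lon1 = airports[source_id]
        lat2, lon2 = airports[dest_id]
        distances.append(great_circle_km(lat1, lon1, lat2, lon2))

print(distances)
```

Keeping the airports in a dictionary means each lookup by ID is a single step, rather than a search through the whole file for every route.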

@@ -165,7 +167,7 @@ Now we're ready to create a histogram displaying the frequency of flights by dis
import numpy as np
import matplotlib.pyplot as plt

plt.hist(distances, 100, facecolor='r', alpha=0.75)
plt.hist(distances, 100, facecolor='r')
plt.xlabel("Distance (km)")
plt.ylabel("Number of flights")

@@ -175,9 +177,9 @@ Now we're ready to create a histogram displaying the frequency of flights by dis

`plt.hist()` does most of the work here. The first argument we supply is the dataset (list of distances.)

The second argument is the number of bins to divide the histogram up into. You can increase this number to see more distinct bars and a more detailed picture, or reduce it to see a coarser picture. Try setting it to some other values and see what happens to the histogram plot.
The second argument (100) is the number of bins to divide the histogram up into. You can increase this number to see more distinct bars and a more detailed picture, or reduce it to see a coarser picture. Try setting it to some other values and see what happens to the histogram plot.

The third argument sets the colour of the graph, "r" for red. There are a lot of ways to specify colours in matplotlib, [the documentation explains them all](http://matplotlib.org/api/colors_api.html).
The third argument, `facecolor`, sets the colour of the graph, "r" for red. There are a lot of ways to specify colours in matplotlib, [the documentation explains them all](http://matplotlib.org/api/colors_api.html).

The [full arguments available for hist() can be viewed in the matplotlib documentation](http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.hist).
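
To get a feel for what a "bin" is, here is a rough pure-Python sketch of the kind of counting a histogram performs. It is not how matplotlib implements `hist()`, and the distances are invented for the example; it only shows the idea of dividing the range into equal-width bins and counting the values in each.

```python
# Hypothetical flight distances in km, made up for illustration.
distances = [120, 450, 480, 900, 950, 990, 1500, 1999]

num_bins = 4
low, high = min(distances), max(distances)
bin_width = (high - low) / num_bins

counts = [0] * num_bins
for d in distances:
    # Work out which bin the value falls into; the maximum value goes in the last bin.
    index = min(int((d - low) / bin_width), num_bins - 1)
    counts[index] += 1

print(counts)
```

Changing `num_bins` here has the same effect as changing the second argument to `plt.hist()`: more bins give a finer picture, fewer bins a coarser one.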

14 changes: 7 additions & 7 deletions core/notebook.md
@@ -78,12 +78,12 @@ Python 2 and Python 3 have some minor incompatible differences in language synta

You interact with IPython Notebook using your web browser. The Notebook program creates a "web server" locally on your computer that you then connect to.

To start IPython Notebook on Windows or OS X, there should be a clickable launcher under the Start menu (Windows) or in the Applications folder (OS X.)
To start IPython Notebook on Windows (with Anaconda), there should be a clickable launcher under the Start menu.

Otherwise, you can start it from a command line terminal by running this command:
On Linux or OS X, you can start IPython Notebook from the command line. First open a terminal window, use 'cd' to navigate to the directory where you want to store your notebooks and other Python files. Then run this command:

ipython notebook --pylab inline

You should see some output like this:

[NotebookApp] Using existing profile dir: u'/home/gus/.ipython/profile_default'
@@ -110,7 +110,7 @@ Try typing something like `print("Hello World")` into the cell. To run the code
<img src="../images/notebook_hello_world.png" alt="IPython Notebook Hello World">
</img>

You'll see that whenever you run a cell, a new cell appears where you can enter another set of Python statements. Try assigning a variable. Let's make another shopping list:
You'll see that whenever you run a cell, a new empty cell appears where you can enter another set of Python statements. Try assigning a variable. Let's make another shopping list:

<img src="../images/assign_shopping_list.png" alt="IPython Notebook Assign Variable">
</img>
@@ -136,9 +136,9 @@ You can also load a pre-existing Python file into an IPython Notebook cell by ty

%load "myprogram.py"

and running it, which loads up a new cell containing the contents of *myprogram.py*.
into a cell and running it. This loads up a new cell containing the contents of *myprogram.py*.

Test this feature out by loading one of the scripts you wrote before. You may have to specify the full path to script file, depending on the directory IPython Notebook started up from.
Test this feature out by loading one of the scripts you wrote during the recap session. You may have to specify the full path to the script file, depending on the directory IPython Notebook started up from.

There is one other useful built-in tool for working with Python files:

@@ -168,7 +168,7 @@ If you're using the command line on Windows, you can use Explorer to find your d

* In previous workshops we used `help()` to view help information in the Python interpreter. IPython Notebook makes this even simpler, you can just type the name of a Python function or module and end it with a `?`. Try it now, type `print?` into a cell and run it.

* Using a nifty tool called NBViewer you can easily share IPython Notebooks on the internet, rendered as web pages (but still downloadable to play with in IPython.) Check out the [NBViewer home page](http://nbviewer.ipython.org/) or the [IPython Notebook gallery](https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks) for some interesting starting points
* Using a nifty tool called NBViewer you can easily share IPython Notebooks on the internet, rendered as web pages (but still downloadable to play with in IPython.) Check out the [NBViewer home page](http://nbviewer.ipython.org/) or the [IPython Notebook gallery](https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks) for some interesting ones.

## Next Chapter

12 changes: 8 additions & 4 deletions core/strings.md
@@ -7,7 +7,7 @@ title: Working with Strings

# A problem

Now we know how to read information from text files, we'll use that knowledge to solve a problem:
Now we know how to work with text files, we'll use that knowledge to solve a problem:

Suppose you're a greengrocer, and you run a survey to see what radish varieties your customers prefer the most. You have your assistant type up the survey results into a text file on your computer, so you have 300 lines of survey data in the file [radishsurvey.txt](../files/radishsurvey.txt). Each line consists of a name, a hyphen, then a radish variety:

@@ -188,7 +188,7 @@ Using your function, can you write a program which counts votes for White Icicle

# Counting All The Votes

Counting votes for each radish variety is a bit time consuming, you have to know all the names in advance and you have to loop through the file multiple times. How about if you could automatically find all the varieties that were votes for, and count them all in one pass?
Counting votes for each radish variety is a bit time consuming, you have to know all the names in advance and you have to loop through the file multiple times. How about if you could automatically find all the varieties that were voted for, and count them all in one pass?

You'll need a data structure where you can associate a radish variety with the number of votes counted for it. A dictionary would be perfect!

@@ -208,7 +208,7 @@ Imagine a program that can count votes to create a dictionary with contents like
'April Cross': 72
}

Meaning 65 votes for White Icicle, 63 votes for Snow Belle, etc, etc.
Meaning the key 'White Icicle' is associated with the value of 65 votes, the key 'Snow Belle' is associated with the value of 63 votes, 'Champion' has 76 votes, etc, etc.
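
As a reminder of how keys and values behave, here is a minimal sketch using the vote totals quoted above. It shows looking up a value by its key, changing a value, and adding a new key; it does not read the survey file.

```python
counts = {"White Icicle": 65, "Snow Belle": 63, "Champion": 76}

# Look up the value stored under a key.
print(counts["Champion"])

# Change the value stored under an existing key.
counts["Champion"] += 1

# Assigning to a key that doesn't exist yet adds it to the dictionary.
counts["April Cross"] = 72

print(counts)
```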

Can you create such a program? Start with one of your previous vote-counting programs and try to modify it to count all varieties.

@@ -346,7 +346,7 @@ They all have something in common, a double space " " between the first and sec

strip() only cleaned up additional whitespace at the start and end of the string.

The `replace` function can be used to replace all double spaces " " with a single space " ":
The `replace` function can be used to replace all double spaces with a single space:

vote = vote.replace("  ", " ")
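
A quick way to see the difference between the two functions is to try them on one string. The string below is a made-up example with extra whitespace at both ends and a double space in the middle:

```python
# A made-up vote with stray whitespace and a double space between the words.
vote = "  Red  King \n"

# strip() only removes whitespace at the start and end.
stripped = vote.strip()
print(repr(stripped))

# replace() then swaps the double space in the middle for a single space.
cleaned = stripped.replace("  ", " ")
print(repr(cleaned))
```

Printing with `repr()` shows the quotes around the string, which makes leftover spaces easy to spot.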

@@ -477,6 +477,10 @@ Our program prints the number of votes cast for each radish variety, but it does

(You may want to add this as a totally separate cell, after the previous cells, rather than modifying your existing loops.)

## Hint

You can make a for loop which iterates over all of the keys in a dictionary by using the syntax `for key in dictionary:`. In this case it might be `for name in counts:`.
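
For instance, a loop of this shape visits every key once, and the matching value can be looked up inside the loop. The dictionary here is a small stand-in with the example totals from earlier in the chapter, and the `total` variable is just for illustration:

```python
# A small stand-in for the real vote counts.
counts = {"White Icicle": 65, "Snow Belle": 63, "Champion": 76}

total = 0
for name in counts:
    # name is the key; counts[name] is the value stored under it.
    print(name + ": " + str(counts[name]))
    total += counts[name]

print("Total votes:", total)
```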

## Solution

You can do something like this:
25 changes: 16 additions & 9 deletions core/text-files.md
@@ -9,12 +9,17 @@ title: Working With Text Files

A text file is any file containing only readable characters.


<a href="http://www.flickr.com/photos/mwichary/2355783479/" title="IBM 1403 printout (from the power-of-two program) by Marcin Wichary, on Flickr"><img src="http://farm3.staticflickr.com/2087/2355783479_ba8837ac18_q.jpg" width="150" height="150" alt="IBM 1403 printout (from the power-of-two program)"></a>

<div style="width: 33%; float: right;">
<a href="http://www.flickr.com/photos/dplmedmss/8644579302/" title="Book of Hours, Latin with additions in Middle English; second page of the second Middle English prayer to Christ. Southern Netherlands (probably Bruges), ca. 1440, f.45r by Dunedin Public Libraries Medieval Manuscripts, on Flickr"><img src="http://farm9.staticflickr.com/8529/8644579302_879e79aff1_q.jpg" width="150" height="150" alt="Book of Hours, Latin with additions in Middle English; second page of the second Middle English prayer to Christ. Southern Netherlands (probably Bruges), ca. 1440, f.45r"></a>
</div>

<div style="width: 33%; float: right;">
<a href="http://www.flickr.com/photos/mwichary/3249196669/" title="Untitled by Marcin Wichary, on Flickr"><img src="http://farm4.staticflickr.com/3051/3249196669_7f313c2fa7_q.jpg" width="150" height="150" alt="Untitled"></a>
</div>

<div style="width: 33%;">
<a href="http://www.flickr.com/photos/mwichary/2355783479/" title="IBM 1403 printout (from the power-of-two program) by Marcin Wichary, on Flickr"><img class="text-left" src="http://farm3.staticflickr.com/2087/2355783479_ba8837ac18_q.jpg" width="150" height="150" alt="IBM 1403 printout (from the power-of-two program)"></a>
</div>

A character can be a number like 3 or 6, or a letter of the alphabet like M or p. Taken together, programmers call numbers and letters the set of *alphanumeric* characters.

@@ -54,7 +59,7 @@ Try it out in an IPython Notebook cell.

### Solution:

It prints the contents of the text file out on the console.
It prints out the contents of the text file.

## What's really happening here?

@@ -98,6 +103,8 @@ It prints the contents of the file, one character at a time, until the end of th

What is the `while` statement in the above code doing? When does the program exit the while loop?

Think about the value that the variable `next` has each time the while loop is evaluated. What happens when the end of the file is reached?

### Bonus Question #2

What would happen if you replaced the `read(1)`s in the code above with `read(2)`s? Think about it first, then try it and see what happens!
@@ -126,9 +133,9 @@ Control characters like `\n` date from the days when computers had typewriter st

### Enough about typewriters!

Yes, back to Python files! To read a file line by line you could just keep reading one character a time with `.read(1)`, until you run into a newline character `\n`.
Yes, back to Python files! To read a file line by line you could just keep reading one character at a time with `.read(1)`, until you run into a newline character `\n`.

There's an easier way though, which os to use the `.readline()` method in place of `.read()`.
There's an easier way though, which is to use the `.readline()` method in place of `.read()`.
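
As a small sketch of what `.readline()` returns, the example below reads from an `io.StringIO` object, which behaves like an open text file but holds its contents in memory. That stand-in and the two-line sample text are assumptions made for this illustration; a file opened with `open()` responds to `.readline()` in the same way.

```python
import io

# io.StringIO stands in for a real file that contains two lines of text.
f = io.StringIO("first line\nsecond line\n")

first = f.readline()   # includes the trailing newline character
second = f.readline()
end = f.readline()     # an empty string means the end of the file was reached

print(repr(first), repr(second), repr(end))
```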

Have another look at the one-character-per-line code example from earlier in this chapter. Can you modify it to read from the file line by line instead of character by character?

@@ -244,11 +251,11 @@ Alternatively, if you're using OS X or Linux you can type `cat <filename>` in a

### Exercise!

If you want to try all this out, here's something simple to make sure you've got everything down pat. First, make a file with a few lines of random text. Then write a program that:
If you want to try all this out, here's a quick exercise to make sure you've got everything down pat. First, use a text editor to create a plain text file with a few lines of random text. Then write a Python program that:

1. Reads all the lines in the file into a list
1. Reads all the lines from your text file into a list.
2. Appends something crazy to each line in the list. " Ya mum!" is nicely inappropriate, if you're struggling for ideas.
3. Writes all lines in that list into a new file. Check out your handy work!
3. Writes all lines in that list into a new file. Check out your handy work by looking in the new file!

Hopefully pretty simple, but that should make sure you have all the above ideas down-pat.

