247 changes: 247 additions & 0 deletions IMC_Python/Storing_Data_In_Python/Storing_Data_in_Python.md
# How to Store Data in Python

Python can power many kinds of applications: we can create desktop applications, build websites, analyze data, and much more. However, in order to do anything of that caliber with this programming language, you must start from the bottom and master the basics. One essential building block of Python is learning to store the pieces of data that our program needs and, more importantly, deciding how to store them. Lists, sets, and dictionaries are among the most common ways to store data. In this blog post, I will show you how you can use them, why you would want to use them, and what features each data structure has to offer.

The first data structure I will talk about is probably the most commonly used in Python - lists. A list stores values in a specific order (the values are often of the same type, although Python does not require it). Then, you can access and modify them with an index.

## Lists

### Initialization

*list= [element1, element2, …, elementn]*

When you initialize a list, you must give it a name. If you choose to add elements to the list at this time (optional), they must be separated by commas and wrapped in square brackets, [].

Ex: Take a list of integers called **numbers**. Let’s suppose you want it to contain numbers 0-4. This is how we would initialize the list:

numbers = [0, 1, 2, 3, 4]

If we wanted **numbers** initialized to be empty, its declaration would look like this:

numbers = []

### Access an Element

*list[index]*

The most important thing to keep in mind when using a Python list is how indices are numbered. When we count in programming, we always start at 0, not 1. When we keep track of the index of a list, the first index is always numbered ‘0’. This means that if your list has six elements, the last element is stored at index ‘5’.

With that being said, we can access an element through the use of square brackets, [], and its index.

Ex: If we want to access the fourth element (index 3) and save it to a variable **x**, we would call:

x = numbers[3]

“x =” assigns a value to x, and “numbers[3]” returns the item at index 3 of the list. If you are new to coding, you should keep in mind that **“=” does not indicate an equivalence relation; it means that a value is being assigned to the variable on the left.** Looking at the example above, we can say that *x is numbers[3]* or *x is 3*, but the statement does NOT assign anything to *numbers[3]*. The expression on the right is not modified at all. Therefore, even if we later assign a different value to ‘x’, the value of numbers[3] remains the same.
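As a minimal sketch of the point above, using the **numbers** list from the earlier example (the helper names `first_x` and `last` are only for illustration): reading an element into a variable and then reassigning that variable leaves the list untouched.

```python
numbers = [0, 1, 2, 3, 4]

# Copy the value stored at index 3 (the fourth element) into x.
x = numbers[3]
first_x = x

# Reassigning x changes only what the name x refers to, not the list.
x = 99

# With five elements, the last valid index is len(numbers) - 1, which is 4.
last = numbers[len(numbers) - 1]

print(numbers)  # [0, 1, 2, 3, 4]
print(x, last)  # 99 4
```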

### Modify an Element

*list[index] = new element*

I like this format you have of
Heading
Code general
Information
Code example and comments

However, the code and comments seem a little mixed with the bold/italicized to reference different info. Is there a way you could indent or highlight code areas? Just to make them more distinguishable.

Contributor Author

Sorry! When I did it on my computer, I used

to center the code but I guess it didn't translate when I uploaded to GitHub. I've fixed it now


Just as we can access an element and assign its value to a variable (like we did above), we can also modify the value of an element given its index.

Ex: If we wanted to change the third value of the list (index 2) to 10, we would call the following:

numbers[2] = 10

The “numbers[2]” accesses the list at index 2 using the brackets, []. Placing an equals sign, “=”, to the right of it indicates that we want to reassign numbers[2]’s value to the value on the right of the “=” – in this instance, it is 10.
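A short sketch of the assignment above, starting from the original **numbers** list. It also shows, as an extra detail not covered in the text, that assigning to an index that does not exist yet raises an `IndexError` (the flag `out_of_range` is just an illustrative name).

```python
numbers = [0, 1, 2, 3, 4]

# Replace the third value (index 2) with 10.
numbers[2] = 10
print(numbers)  # [0, 1, 10, 3, 4]

# Assignment only works for an index that already exists in the list.
try:
    numbers[10] = 7
    out_of_range = False
except IndexError:
    out_of_range = True

print(out_of_range)  # True
```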

### Add an Element

*append(new_element)*

*insert(index, new_element)*

Say we want to add the number ‘5’ to our **numbers** list. If we had a specific position in mind, we would need to find that position’s index and call insert() with the index as the first parameter and ‘5’ as the second.

Ex: If we wanted to place 5 at index 2, we would use the following statement:

numbers.insert(2, 5)

Now, the contents of numbers is: **[0, 1, 5, 2, 3, 4]**. However, if we wanted to tack ‘5’ onto the end of the original list, we would simply use the append() function and pass 5 as a parameter.

numbers.append(5)

Here, the contents of numbers is: **[0, 1, 2, 3, 4, 5]**

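The append() function always adds exactly one element, so appending a list nests that whole list inside the original as a single element. To join the elements of one list onto another, use the extend() function instead. Starting again from the original list:

letters = ['a', 'b', 'c']
numbers.extend(letters)

Now the contents of numbers is: **[0, 1, 2, 3, 4, 'a', 'b', 'c']**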

Because of how Python is designed, if we insert an element at an index, the list automatically expands and shifts all the elements to the right of the given index one position to the right.
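The three ways of adding elements can be compared side by side. This is a small sketch with illustrative variable names (`inserted`, `appended`, `nested`, `extended`); note that `list.insert` takes the index first and the new element second, and that `append` adds its argument as one single element.

```python
# insert(index, element): everything from index 2 onward shifts right.
inserted = [0, 1, 2, 3, 4]
inserted.insert(2, 5)
print(inserted)  # [0, 1, 5, 2, 3, 4]

# append(element): adds one element to the end.
appended = [0, 1, 2, 3, 4]
appended.append(5)
print(appended)  # [0, 1, 2, 3, 4, 5]

letters = ['a', 'b', 'c']

# Appending a list nests it as a single element.
nested = [0, 1, 2, 3, 4]
nested.append(letters)
print(nested)  # [0, 1, 2, 3, 4, ['a', 'b', 'c']]

# extend(iterable): adds each element of the other list individually.
extended = [0, 1, 2, 3, 4]
extended.extend(letters)
print(extended)  # [0, 1, 2, 3, 4, 'a', 'b', 'c']
```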

### Remove an Element

*pop(index)*

*remove(element)*

There are two ways in which we can remove an element - according to its index or according to its value. In the first case, we use the pop() function on the list to remove an element at a given index.

Ex: If we wanted to remove the fifth element (index 4) of our original list, we would call:

numbers.pop(4)

Now the contents of numbers is: **[0, 1, 2, 3]**

However, if we wanted to remove an element without using its index, then we would use remove() and pass the element as the parameter.

To better show this technique, let’s use a list of strings as an example:

Let’s define a list called **fruits** where fruits = **["apple", "strawberry", "orange", "banana"].**

If we wanted to remove "orange" from the list, we would call

fruits.remove("orange")

which would result in the list: **["apple", "strawberry", "banana"]**
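Here is a brief sketch of both removal styles. Two details go slightly beyond the text and are worth verifying in your own interpreter: `pop()` returns the element it removes, and `remove()` raises a `ValueError` when the value is not in the list ("kiwi" and the names `removed` and `kiwi_missing` are only illustrative).

```python
numbers = [0, 1, 2, 3, 4]

# pop(index) removes the element at that index and returns it.
removed = numbers.pop(4)
print(removed, numbers)  # 4 [0, 1, 2, 3]

fruits = ["apple", "strawberry", "orange", "banana"]

# remove(element) deletes the first element equal to the given value.
fruits.remove("orange")
print(fruits)  # ['apple', 'strawberry', 'banana']

# Removing a value that is not in the list raises ValueError.
try:
    fruits.remove("kiwi")
    kiwi_missing = False
except ValueError:
    kiwi_missing = True

print(kiwi_missing)  # True
```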

The next method for storing data that we will go over is sets. Sets allow you to store values in a group; however, a set is unordered and each element must be unique.

## Sets

### Initialization

*set = {element1, element2, …, elementn}*

Similar to a list, you can initialize a set to be empty, or you can add elements. However, with a set, we use curly brackets, {}, instead of square brackets, [], to list its values. An empty set must be created with set(), because empty curly brackets, {}, create an empty dictionary.

Ex: fruits = {"apple", "strawberry", "banana"}
fruits = set()
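One detail is easy to get wrong here: in Python, empty curly brackets create a dictionary, so an empty set has to be created with `set()`. The sketch below checks this and also shows that duplicate values collapse into one (the names `empty_set` and `empty_braces` are illustrative).

```python
# A duplicate "apple" is stored only once.
fruits = {"apple", "strawberry", "banana", "apple"}
print(len(fruits))  # 3

# set() creates an empty set; {} creates an empty dictionary.
empty_set = set()
empty_braces = {}

print(type(empty_set).__name__)     # set
print(type(empty_braces).__name__)  # dict
```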

### Access an Element

Since sets are unordered and do not have indices, you cannot access a specific item in it as you would in a list.
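Although indexing is not available, a set can still answer the question "is this value in here?" and can be looped over. A minimal sketch, with illustrative variable names:

```python
fruits = {"apple", "strawberry", "banana"}

# Membership tests use the `in` keyword.
has_apple = "apple" in fruits
has_kiwi = "kiwi" in fruits
print(has_apple, has_kiwi)  # True False

# Looping visits every element; sorted() gives a predictable order for display.
for fruit in sorted(fruits):
    print(fruit)

# Indexing a set is not supported and raises TypeError.
try:
    fruits[0]
    indexing_failed = False
except TypeError:
    indexing_failed = True

print(indexing_failed)  # True
```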

### Modify an Element

Unfortunately, once an item is added to a set, it cannot be modified, only removed. This is because there is no way to access the element to modify it since there are no indices.

### Add an Element

*set.add(new_element)*

*set.update([new_element1, new_element2, …, new_elementn])*

Sets have an add() function that takes a single value and adds it to the set; if the value is already present, the set is left unchanged. Since sets are unordered, there is no need to worry about where you should place it like you would when adding an item to a list.

Ex: fruits.add("orange")

Now the contents of fruits is: **{"apple", "strawberry", "banana", "orange"}**

You can also add multiple items at once using update() by passing a list (or any other iterable) of elements.

Ex: fruits.update(["pineapple", "mango"])

Now the contents of fruits is: **{"apple", "strawberry", "banana", "orange", "pineapple", "mango"}**
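A small sketch of both calls, assuming the **fruits** set from the example. `add()` takes one value, while `update()` takes an iterable such as a list; passing a list to `add()` would fail because lists cannot be set elements.

```python
fruits = {"apple", "strawberry", "banana"}

# add() inserts a single value; adding it again has no effect.
fruits.add("orange")
fruits.add("orange")
print(len(fruits))  # 4

# update() adds every element of the iterable it receives.
fruits.update(["pineapple", "mango"])
print(sorted(fruits))
# ['apple', 'banana', 'mango', 'orange', 'pineapple', 'strawberry']
```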

### Remove an Element

*set.remove(element)*

*set.discard(element)*

There are two functions that you may use to remove an element from a set. Both require that you pass the element you wish to remove since there is no other way to reference it. The two methods do the same exact thing, *except*, remove() will raise an error if the element you are trying to remove is not in the set while discard() will not.

If **fruits** = {"apple", "strawberry", "banana", "orange"}, then

fruits.remove("strawberry")
and
fruits.discard("strawberry")

will both give you fruits = **{"apple", "banana", "orange"}**
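The difference between the two functions only shows up when the element is missing. In this sketch (the flag `remove_raised` is an illustrative name), the first call removes "strawberry", after which `discard()` quietly does nothing and `remove()` raises a `KeyError`.

```python
fruits = {"apple", "strawberry", "banana", "orange"}

# The element is present, so remove() succeeds.
fruits.remove("strawberry")

# The element is already gone; discard() does nothing and raises no error.
fruits.discard("strawberry")

# remove() on a missing element raises KeyError.
try:
    fruits.remove("strawberry")
    remove_raised = False
except KeyError:
    remove_raised = True

print(sorted(fruits))  # ['apple', 'banana', 'orange']
print(remove_raised)   # True
```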

Finally, let’s address dictionaries. This data structure stores key-value pairs, which allows you to find a value given a key, such as a string or a number, rather than using a typical numerical index. Though a dictionary has no positional indices, it is still addressable since you have a way of finding a value using its unique key. (Since Python 3.7, dictionaries also preserve the order in which keys were inserted.)

## Dictionaries

### Initialization

*dictionary = {key1:value1, key2:value2}*

Because of the way dictionaries are designed, you can use different data types as keys and values. We use curly brackets, {}, to define the dictionary, a colon, ‘:’, to assign a key to a value, and a comma, ‘,’, to separate entries.

Ex: If we wanted to create a dictionary about a person with attributes “name”, “age”, and “college”, we could use the attribute names as keys and the personal attributes as values.

my_dictionary = {"name" : "Danielle",
    "age" : 20,
    "college" : "Binghamton University"
}

Though you do not have to insert a new line and tab for each input, it may make your dictionary easier to read.

### Access an Element

*dictionary[key]*

To get the value stored at a key, you use square brackets, [], just as you would with a list - except, this time you pass a key as the parameter rather than an index.

Ex: my_dictionary["name"] would return **"Danielle"**
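A minimal sketch of key lookup. It also shows two behaviours that the text does not cover, so treat them as a side note: looking up a missing key with square brackets raises a `KeyError`, while the built-in `get()` method returns `None` instead (the names `missing_key` and `phone` are illustrative).

```python
my_dictionary = {
    "name": "Danielle",
    "age": 20,
    "college": "Binghamton University",
}

# Square brackets return the value stored under the key.
name = my_dictionary["name"]
print(name)  # Danielle

# A key that is not in the dictionary raises KeyError.
try:
    my_dictionary["phone number"]
    missing_key = False
except KeyError:
    missing_key = True

# get() returns None for a missing key instead of raising an error.
phone = my_dictionary.get("phone number")

print(missing_key, phone)  # True None
```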

### Modify an Element

*dictionary[key] = new_value*

Similar to the way we update an element of a list, you would also use square brackets, [], and an equals sign, =, to update the value of a dictionary at a certain key. It should be noted that you can only update the value in a key-value pair – if you would like to update the key, you must create a new key-value pair and delete the old one.

my_dictionary["age"] = 21

Now, my_dictionary would look like this:

{"name": "Danielle", "age": 21, "college": "Binghamton University"}
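The sketch below updates a value and then "renames" a key in the way described above, by creating a new pair and deleting the old one. The new key name "university" is hypothetical, and using `pop()` to fetch and delete the old pair in one step is just one convenient way to do it.

```python
my_dictionary = {
    "name": "Danielle",
    "age": 20,
    "college": "Binghamton University",
}

# Update the value stored under an existing key.
my_dictionary["age"] = 21

# Keys cannot be edited in place: pop() removes the old pair and returns
# its value, which is then stored under the new (hypothetical) key.
my_dictionary["university"] = my_dictionary.pop("college")

print(my_dictionary)
# {'name': 'Danielle', 'age': 21, 'university': 'Binghamton University'}
```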

### Add an Element

*dictionary[new_key] = new_value*

The way in which you add a new key-value pair to a dictionary is the same as how you would update an existing pair. The only difference is that the key in the square brackets is not expected to be in the dictionary. **This is where you need to keep in mind that keys are unique.** If you mistakenly pass a key that is already in the dictionary when you mean to pass a new one, no error will be thrown and the existing key’s value will be updated.

Ex: If you wanted to add a phone number to my_dictionary, the statement would look like this:

my_dictionary["phone number"] = "(888)123-4567"

Now, my_dictionary’s contents would be: **{"name": "Danielle", "age": 21, "college": "Binghamton University", "phone number": "(888)123-4567"}**
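Because assigning to an existing key silently overwrites its value, it can help to check for the key first. This is one possible guard, not the only approach; the replacement value 30 is made up purely to show that it is never stored.

```python
my_dictionary = {
    "name": "Danielle",
    "age": 21,
    "college": "Binghamton University",
}

# "phone number" is a new key, so this adds a new pair.
my_dictionary["phone number"] = "(888)123-4567"

# Guard against overwriting: only assign when the key is not present yet.
if "age" not in my_dictionary:
    my_dictionary["age"] = 30  # skipped, because "age" already exists

print(my_dictionary["age"])  # 21
print(len(my_dictionary))    # 4
```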

### Remove an Element

*dictionary.pop(key)*

If you would like to delete a specific key-value pair, you would use the pop() function and pass the key as the parameter.

Ex: If you wanted to remove the “college” key, you would call the following:

my_dictionary.pop("college")

Now, the dictionary would look like this:

{"name": "Danielle", "age": 21, "phone number": "(888)123-4567"}
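A short sketch of `pop()` on a dictionary. Beyond what the text states, `pop()` returns the removed value, and it accepts an optional second argument that is returned when the key is missing, which avoids a `KeyError` (the names `removed` and `fallback` are illustrative).

```python
my_dictionary = {
    "name": "Danielle",
    "age": 21,
    "college": "Binghamton University",
    "phone number": "(888)123-4567",
}

# pop(key) deletes the pair and returns its value.
removed = my_dictionary.pop("college")
print(removed)  # Binghamton University

# The key is gone now; a default value prevents a KeyError.
fallback = my_dictionary.pop("college", None)
print(fallback)  # None

print(my_dictionary)
# {'name': 'Danielle', 'age': 21, 'phone number': '(888)123-4567'}
```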

## When to Use Which

Now that we have laid out the basic functions of lists, sets, and dictionaries, we can differentiate the three and list out some scenarios to use each one. First, let's find what they all have in common and what makes them unique.

Below is a Venn diagram that shows the differences and similarities between lists, sets, and dictionaries. As you can see, the only functionality all three share is storing data. This diagram will help you narrow down your choice simply based on whether you need something indexable, modifiable, or if you need your values to be unique.

![Venn diagram comparing lists, sets, and dictionaries](https://i.ibb.co/LrNQ2D1/bit-project-diagram.png)

Now that we know what makes each data structure unique, let’s get into features that will further assist you in your choice.


#### Lists

Because a list is ordered, it features functions such as *sort()*, which re-orders your list in ascending order or according to a given key function, and *reverse()*, which reverses the current order of your list. It also has a method called *count()* which returns the number of times a passed value is found in the list. These features make lists useful for a dataset that you would want to arrange in a certain manner, especially if your data features duplicate values.
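A quick sketch of these three methods on a list with a duplicate value. Passing `key=len` is one example of sorting "according to a given function"; here it orders the strings by length.

```python
fruits = ["strawberry", "apple", "orange", "banana", "apple"]

# sort() re-orders the list in place, here alphabetically.
fruits.sort()
print(fruits)  # ['apple', 'apple', 'banana', 'orange', 'strawberry']

# reverse() flips the current order in place.
fruits.reverse()
print(fruits)  # ['strawberry', 'orange', 'banana', 'apple', 'apple']

# count() reports how many times a value appears.
apple_count = fruits.count("apple")
print(apple_count)  # 2

# A key function changes what the sort compares; here, the string length.
fruits.sort(key=len)
print(fruits)  # ['apple', 'apple', 'orange', 'banana', 'strawberry']
```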

#### Sets

Though sets seem difficult to work with as they aren’t indexable, they can be quite useful. Python sets are similar to mathematical sets - they are unordered and have unique elements. Just as you can find the union and intersection of sets in math, you can also find the union and intersection of Python sets using the *union()* function and *intersection()* function respectively. Both methods can take several sets as parameters and return the resulting set. These two functions only scratch the surface of what Python sets are capable of - you can determine if a set is disjoint from another, if one set is a subset of another, and so much more. Therefore, this option is useful if you plan to perform such operations on your datasets.
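The set operations mentioned above can be sketched with three small sets of numbers (the sets `a`, `b`, and `c` and the result names are made up for illustration):

```python
a = {1, 2, 3}
b = {3, 4, 5}
c = {5, 6}

# union() accepts several sets and returns every element found in any of them.
all_values = a.union(b, c)
print(all_values == {1, 2, 3, 4, 5, 6})  # True

# intersection() returns only the elements shared by all the sets involved.
shared_ab = a.intersection(b)
shared_abc = a.intersection(b, c)
print(shared_ab)        # {3}
print(len(shared_abc))  # 0

# isdisjoint() is True when two sets have no element in common.
a_c_disjoint = a.isdisjoint(c)

# issubset() is True when every element of one set is in the other.
is_subset = {1, 2}.issubset(a)

print(a_c_disjoint, is_subset)  # True True
```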

#### Dictionaries

Dictionaries give you the best of both sets and lists. They are indexable and keep your information organized with the key-value pair design. Dictionaries can also be nested so that the value that a key maps to is a dictionary rather than a number or a string. This data structure is best if you want to store several pieces of information about one item. A good example would be if you were to simulate a phone book or an address book. You could set each key to be a contact’s name and each value as their phone number or address. This problem could also be approached using a nested dictionary where each nested dictionary represents a contact with several key-value pairs to store more information on each of them.
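Both approaches to the phone book idea can be sketched as follows. The structure and the names `phone_book` and `address_book` are assumptions made for illustration, reusing the contact details from the earlier examples.

```python
# Flat version: each contact name maps directly to a phone number.
phone_book = {"Danielle": "(888)123-4567"}
print(phone_book["Danielle"])  # (888)123-4567

# Nested version: each contact name maps to its own dictionary of details.
address_book = {
    "Danielle": {
        "phone number": "(888)123-4567",
        "college": "Binghamton University",
    },
}

# Chain square brackets to reach a value inside the nested dictionary.
college = address_book["Danielle"]["college"]
print(college)  # Binghamton University

# New details can be added to one contact without affecting the others.
address_book["Danielle"]["age"] = 21
print(address_book["Danielle"]["age"])  # 21
```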

Overall, none of these data structures is better than another; each has its own qualities and capabilities that make it useful for a certain purpose. Hopefully you have figured out which method is best for storing the information that you will be using in your Python program. Otherwise, I advise you to pick one and see where you run into problems. To learn more about Python, visit bitproject.org or e-mail me at [email protected].