Ironhack Data Analytics Bootcamp

Inspired by my colleague, Ricardo Zacarias.

This repo contains all of the practical exercises I did during the Data Analytics Bootcamp @ Ironhack. The course lasted 9 weeks (20 January to 20 March 2020), plus an additional career week. It was divided into 3 modules:

Git, Python and SQL;
Statistics and probability;
Machine Learning.

Projects

Mod/Week	Project	Language	Libraries	Topics/Methods
M1-W3	Covid-19 & Global Awareness	Python	regex, os, numpy, pandas, requests, getpass, GetOldTweets3, csv, tweepy, wordcloud, matplotlib, seaborn	Team project with Ana Frias and Tristan Piat. We collected data from Johns Hopkins University, Google Trends and the Chinese National Health Center, scraped the Twitter activity of the World Health Organization (WHO) account, and analysed the relationship between them.
M2-W6	CAN PRISONERS, HORSE KICKS AND A MAN CALLED FISH PREDICT FOOTBALL SCORES?	Python	pandas, numpy, os, statsmodels, scipy, matplotlib, seaborn; stats, poisson, chi2, chisquare, norm	Used hypothesis testing and the chi-squared test to check whether the number of goals scored in a football (soccer) match follows a Poisson distribution.
M3-W9	O_Lemma	Python	eli5, io, itertools, json, nltk, numpy, os, pandas, pdfminer, random, re, sklearn.metrics, sklearn.model_selection, sklearn_crfsuite, spacy, sys, tqdm	Created a residual CNN (convolutional neural network) to automatically redact documents: it identifies sensitive words in context, replaces them, and returns a redacted copy of the document.
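
The M2-W6 project above tests whether goals per match follow a Poisson distribution. The sketch below illustrates that kind of chi-squared goodness-of-fit test using only the standard library. It is not the project's actual code: the `goals` data, the `poisson_gof` helper, the pooling threshold `max_k` and the hard-coded 5% critical values are all assumptions made for illustration.

```python
import math
from collections import Counter

# 5% critical values of the chi-squared distribution, keyed by degrees of freedom
# (standard table values, hard-coded here to avoid a SciPy dependency).
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592}


def poisson_pmf(k, lam):
    """Probability of observing exactly k events under Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_gof(goals, max_k=4):
    """Chi-squared goodness-of-fit statistic of counts against a Poisson model.

    Counts of max_k or more are pooled into one bin so that expected
    frequencies in the tail do not become too small.
    """
    n = len(goals)
    lam = sum(goals) / n  # Poisson rate estimated from the sample mean
    counts = Counter(min(g, max_k) for g in goals)
    observed = [counts.get(k, 0) for k in range(max_k + 1)]
    probs = [poisson_pmf(k, lam) for k in range(max_k)]
    probs.append(1.0 - sum(probs))  # tail probability P(X >= max_k)
    expected = [n * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    dof = len(observed) - 1 - 1  # bins - 1, minus one estimated parameter
    return stat, dof, lam


# Hypothetical sample: goals scored in 100 matches (made-up numbers).
goals = [0] * 23 + [1] * 35 + [2] * 25 + [3] * 12 + [4] * 4 + [5] * 1
stat, dof, lam = poisson_gof(goals)
reject = stat > CHI2_CRIT_05[dof]  # True means the Poisson fit is rejected at 5%
print(f"lambda={lam:.2f}, chi2={stat:.3f}, dof={dof}, reject={reject}")
```

With SciPy available, `scipy.stats.chisquare(observed, expected, ddof=1)` should give the same statistic together with a p-value, which replaces the lookup in the critical-value table.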

Lab Index

The table below indexes each exercise by bootcamp module and week. For each lab it gives a link to the exercise, the programming language, the libraries used, and the main topics covered or methods I used to solve the problems.

Mod/Week	Lab	Language	Libraries	Topics/Methods
M1-W1	resolving-git-conflicts	Git, Command Line, Bash	-	GitHub, add, commit, push, pull, merge, conflicts, pull requests
M1-W1	tuple-set-dict	Python	random, operator, pandas	random.sample, operator.itemgetter, pd.DataFrame
M1-W1	list-comprehensions	Python	os, numpy, pandas	os.listdir, os.path.join, pd.concat, np.array, df._get_numeric_data
M1-W1	string-operations	Python	re, math	f-strings, str.lower, str.endswith, str.join, str.split, str.replace, re.findall, re.search, bag of words
M1-W1	lambda-functions	Python	-	functions, lambda, zip, sorted, dict.items
M1-W1	numpy	Python	numpy	np.random (random, rand, sample), np.ones, size, shape, np.reshape, np.transpose, np.array_equal, max, min, mean, np.empty, np.nditer
M1-W1	functions	Python	iter	functions, iterators, generators, yield
M1-W1	intro-pandas	Python	pandas, numpy	pd.Series, pd.DataFrame, df.columns, subsetting, df.mean, df.max, df.median, df.sum
M1-W2	map-reduce-filter	Python	numpy, pandas, functools	functions, map, reduce, filter
M1-W2	import-export	Python	pandas	pd.read_csv, df.to_csv, pd.read_excel, df.head, df.value_counts
M1-W2	dataframe-calculations	Python	pandas, numpy, zipfile	df.shape, df.unique, str.contains, df.astype, df.isnull, df.apply, df.sort_values, df.equals, pd.get_dummies, df.corr, df.drop, df.groupby.agg, df.quantile
M1-W2	first-queries	SQL	-	create db, create table, select, distinct, group by, order by, where, limit, count
M1-W2	my-sql-select	SQL	-	aliases, inner join, left join, sum, coalesce
M1-W2	my-sql	SQL	-	db design, table relationships, db seeding, forward engineering schemas, one-to-many, many-to-one, many-to-many, linking tables
M1-W2	advanced-mysql	SQL	-	temporary tables, subqueries, permanent tables
M1-W2	data-cleaning	Python	pandas, numpy, scipy.stats	df.rename, df.dtypes, pd.merge, df.fillna, np.abs, stats.zscore
M1-W3	api-scavenger	Python, APIs, Command Line	pandas, pandas.io.json	curl, pd.read_json, json_normalize, pd.to_datetime
M1-W3	web-scraping	Python, APIs	requests, beautifulsoup, tweepy	requests.get, requests.get.content, BeautifulSoup, soup.find_all, soup.tag.text, soup.tag.get, soup.tag.find, tweepy.get_user, tweepy.user_timeline, tweepy.user.statuses_count, tweepy.user.followers_count
M1-W3	advanced-regex	Python	re	re.findall, re.sub
M1-W3	matplotlib-seaborn	Python	matplotlib.pyplot, seaborn, numpy, pandas	plt.plot, plt.show, plt.subplots, plt.legend, plt.bar, plt.barh, plt.pie, plt.boxplot, plt.xticks, ax.set_title, ax.set_xlabel, sns.set, sns.distplot, sns.barplot, sns.despine, sns.violinplot, sns.catplot, sns.heatmap, np.linspace, df.select_dtypes, pd.Categorical, df.cat.codes, np.triu, sns.diverging_palette
M1-W3	pandas-deep-dive	Python	pandas	df.describe, df.groupby.agg, df.apply
M2-W4	subsetting-and-descriptive-stats	Python	pandas, matplotlib, seaborn	df.loc, df.groupby.agg, df.quantile, df.describe
M2-W4	understanding-descriptive-stats	Python	pandas, random, matplotlib, numpy	random.choice, plt.hist, plt.vlines, np.mean, np.std
M2-W4	regression-analysis	Python	numpy, pandas, scipy, sklearn.linear_model, matplotlib, seaborn	plt.scatter, df.corr, scipy.stats.linregress, sns.heatmap, sklearn.LinearRegression, lm.fit, lm.score, lm.coef_, lm.intercept_
M2-W4	advanced-pandas	Python	pandas, numpy, random	df.isnull, df.set_index, df.reset_index, random.choices, df.lookup, pd.cut
M2-W4	mini-project1	Python	pandas, numpy, matplotlib, seaborn, scipy.stats	EDA, df.map, df.info, df.apply (with lambda), df.replace, df.dropna, sns.boxplot, plt.subplots_adjust, df.drop, sns.pairplot, sns.regplot, sns.jointplot, stats.linregress
M2-W4	pivot-table-and-correlation	Python	pandas, scipy.stats	df.pivot_table(index, columns, aggfunc), stats.linregress, plt.scatter, stats.pearsonr, stats.spearmanr
M2-W4	tableau	Tableau	-	mini project: analyzed the relationship between the number of characters in the title and description of apps and the number of downloads
M2-W5	intro-probability	Probability	-	probability space, conditional probability, contingency tables
M2-W5	reading-stats-concepts	Statistics	-	p-values, AB testing, means and expected values
M2-W5	probability-distributions	Python	scipy.stats, numpy	discrete: stats.binom, stats.poisson. continuous: stats.uniform, stats.norm, stats.expon, np.random.exponential, stats.rvs, stats.cdf, stats.pdf, stats.ppf
M2-W5	confidence-intervals	Python	scipy.stats, numpy	stats.norm.interval, calculating sample sizes
M2-W5	intro-to-scipy	Python	scipy, numpy	stats.tmean, stats.fisher_exact, scipy.interpolate, interpolate.interp1d, np.arange
M2-W5	hypothesis-testing-1	Python	scipy.stats, numpy, pandas, statsmodels	stats.ttest_1samp, stats.sem, stats.t.interval, pd.crosstab, statsmodels.proportions_ztest
M2-W5	hypothesis-testing-2	Python	pandas, scipy.stats	stats.f_oneway, stats.ttest_ind, stats.ttest_rel, pd.concat
M2-W5	mini-project2	Python	pandas, numpy, scipy.stats, matplotlib	stats.norm, stats.ppf, stats.t.interval, stats.pdf, np.linspace, stats.shapiro
M2-W6	two-sample-hyp-test	Python	pandas, scipy.stats, numpy	stats.ttest_ind, stats.ttest_rel, stats.ttest_1samp, stats.chi2_contingency, np.where
M2-W6	goodfit-indeptests	Python	scipy.stats, numpy	stats.poisson, stats.pmf, stats.chisquare, stats.norm, stats.kstest, stats.cdf, stats.chi2_contingency, stats.binom
M3-W7	intro-to-ml	Python	pandas, numpy, datetime, sklearn.model_selection	pd.to_numeric, df.interpolate, np.where, dt.strptime, dt.toordinal, train_test_split
M3-W7	supervised-learning-feature-extraction	Python	pandas, numpy	pd.to_numeric, df.apply, pd.to_datetime, np.where, pd.merge
M3-W7	supervised-learning	Python	pandas, seaborn, sklearn.model_selection, sklearn.linear_model, sklearn.neighbors, sklearn.preprocessing	df.corr, sns.heatmap, df.drop, df.dropna, pd.get_dummies, train_test_split, LogisticRegression, confusion_matrix, accuracy_score, KNeighborsClassifier, RobustScaler
M3-W7	supervised-learning-sklearn	Python	sklearn.linear_model, sklearn.datasets, sklearn.preprocessing, sklearn.model_selection, statsmodels.api, sklearn.metrics, sklearn.feature_selection	LinearRegression, load_diabetes, PolynomialFeatures, StandardScaler, train_test_split, sm.OLS, r2_score, RFE
M3-W7	unsupervised-learning	Python	sklearn.preprocessing, sklearn.cluster, sklearn.metrics, yellowbrick.cluster	StandardScaler, KMeans, silhouette_score, KElbowVisualizer, DBSCAN
M3-W7	unsupervised-learning-and-sklearn	Python	sklearn.preprocessing, sklearn.cluster, mpl_toolkits.mplot3d	LabelEncoder, KMeans, fig.gca(projection='3d')
M3-W8	problems-in-ml	Python	sklearn.metrics, sklearn.model_selection, sklearn.ensemble, sklearn.datasets, sklearn.svm, matplotlib.colors	r2_score, mean_squared_error, train_test_split, RandomForestRegressor, load_boston, SVC, ListedColormap
M3-W8	imbalance	Python	sklearn.model_selection, sklearn.preprocessing, sklearn.linear_model, sklearn.tree, sklearn.metrics	train_test_split, LabelEncoder, LogisticRegression, DecisionTreeClassifier, RobustScaler, StandardScaler, PolynomialFeatures, MinMaxScaler, confusion_matrix, accuracy_score
M3-W8	deep-learning	Python	tensorflow, keras.models, keras.layers, keras.utils, sklearn.model_selection	keras.Sequential, keras.Dense, keras.to_categorical, save_weights, load_weights
M3-W8	nlp	Python	re, nltk, nltk.stem, nltk.corpus, sklearn.feature_extraction.text, nltk.probability	WordNetLemmatizer, stopwords, CountVectorizer, TfidfVectorizer, ConditionalFreqDist, nltk.word_tokenize, nltk.PorterStemmer, nltk.WordNetLemmatizer, nltk.NaiveBayesClassifier, nltk.classify.accuracy, classifier.show_most_informative_features


duarteharris/ironhack-labs
