From f53228721d49c757be4299d365d3255b647da5b4 Mon Sep 17 00:00:00 2001 From: Soumayan-pal01 <66107748+Soumayan-pal01@users.noreply.github.com> Date: Wed, 21 Jul 2021 23:21:56 +0530 Subject: [PATCH] Create ML_Assignment_1_1929121.ipynb --- .../ML_Assignment_1_1929121.ipynb | 142 ++++++++++++++++++ 1 file changed, 142 insertions(+) create mode 100644 Assignments/ML Assignment-1/ML_Assignment_1_1929121.ipynb diff --git a/Assignments/ML Assignment-1/ML_Assignment_1_1929121.ipynb b/Assignments/ML Assignment-1/ML_Assignment_1_1929121.ipynb new file mode 100644 index 0000000..c549d20 --- /dev/null +++ b/Assignments/ML Assignment-1/ML_Assignment_1_1929121.ipynb @@ -0,0 +1,142 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# ML Assignment-1(1929121_Soumayan)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q1. How would you define `Machine Learning` ?**\n", + "\n", + "**Ans:** Machine learning is a branch of artificial intelligence (AI) and computer science which focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy.\n", + "\n", + "**Q2. Can you name four `types of problems` where it shines ?**\n", + "\n", + "**Ans:**\n", + "A) Problems in which existing solutions require a lot of repetitive work or long lists of instructions. ML can often simplify code and/or perform better.\n", + "\n", + "B) Complex problems for which no good solution exists. ML can find a solution.\n", + "\n", + "C) Problems that have a wide range of input variables, i.e. a fluctuating environment. ML can adapt to new data.\n", + "\n", + "D) Complex problems and large data sets in which we want deeper insights. ML can identify non-obvious (highly complex) patterns.\n", + "\n", + "**Q3. What is a `labeled training set` ?**\n", + "\n", + "**Ans:** A labeled training set is a training set in which each instance comes with its desired solution, called a label (for example, the class of an email or the price of a car).\n", + "\n", + "**Q4. 
What are the two most `common supervised` tasks ?**\n", + "\n", + "**Ans:** Classification and regression are the two most common supervised tasks.\n", + "\n", + "Classification is a task in which the algorithm is trained on many labeled examples so that it can assign the correct class to new, unseen instances.\n", + "\n", + "Regression is a task in which the algorithm predicts a target numeric value from a set of features. A feature is an attribute together with its value. For example, for a ‘car’, the attribute ‘mileage’ with its value is a feature, and the price of ‘1000 US Dollars’ is the numeric target to predict.\n", + "**Q5. Can you name four `common unsupervised` tasks ?**\n", + "\n", + "**Ans:** Four common unsupervised tasks are:\n", + "\n", + "A) Clustering is a common example of unsupervised machine learning in which at no point do you tell the algorithm which group a data point belongs to. It finds connections between data points by itself.\n", + "\n", + "B) Visualisation is another good example. You feed an algorithm a lot of complex and unlabelled data, which it then represents in 2D or 3D so that it can be plotted.\n", + "\n", + "C) Anomaly detection is also an important example of an unsupervised machine learning task. The system is trained with a set of normal instances, and when it sees a new instance it can tell whether it looks like a normal one or whether it is likely an anomaly.\n", + "\n", + "D) Association rule learning. A machine learning task in which the goal is to analyse complex sets of data and discover relations between attributes. 
Compared to clustering or visualisation, association rule learning focuses on the relations between the attributes or features of data points, rather than on the data points themselves.\n", + "\n", + "**Q6. What type of `Machine Learning algorithm` would you use to allow a robot to walk in various unknown terrains ?**\n", + "\n", + "**Ans:** We would use reinforcement learning for this purpose. In reinforcement learning, the learning system, or agent, can observe its environment. Based on its observations it selects and performs actions, and it receives rewards or penalties in return. It must therefore learn by itself the best strategy, or policy, to collect the highest total reward over time.\n", + "\n", + "**Q7. What type of algorithm would you use to segment your `customers into multiple groups` ?**\n", + "\n", + "**Ans:** We can use a clustering algorithm (unsupervised learning) to segment customers into clusters of similar customers when the groups are not known in advance. However, if we already know which groups we would like to have, we can feed many examples of each group to a classification algorithm (supervised learning), and it will classify all customers into these groups.\n", + "\n", + "**Q8. Would you frame the problem of `spam detection` as a `supervised learning` problem or an `unsupervised learning` problem ?**\n", + "\n", + "**Ans:** Spam detection is a typical supervised learning problem: the algorithm is fed many emails along with their label (spam or not spam).\n", + "\n", + "**Q9. What is an `online learning system` ?**\n", + "\n", + "**Ans:** An online learning system can learn incrementally, as opposed to a batch learning system. This makes it capable of adapting rapidly to both changing data and autonomous systems, and of training on very large quantities of data.\n", + "\n", + "**Q10. What is `out-of-core` learning ?**\n", + "\n", + "**Ans:** Out-of-core algorithms can handle vast quantities of data that cannot fit in a computer’s main memory. 
An out-of-core learning algorithm chops the data into mini-batches and uses online learning techniques to learn from these mini-batches.\n", + "\n", + "**Q11. What type of learning algorithm relies on a `similarity measure` to make predictions ?**\n", + "\n", + "**Ans:** An instance-based learning system learns the training data by heart; then, when given a new instance, it uses a similarity measure to find the most similar learned instances and uses them to make predictions.\n", + "\n", + "**Q12. What is the `difference` between a `model parameter` and a learning algorithm's `hyperparameter` ?**\n", + "\n", + "**Ans:** A model has one or more model parameters that determine what it will predict given a new instance. A learning algorithm tries to find optimal values for these parameters such that the model generalizes well to new instances. A hyperparameter is a parameter of the learning algorithm itself, not of the model (for example, the amount of regularization to apply).\n", + "\n", + "**Q13. What do `model-based learning` algorithms search for ? What is the most common `strategy` they use to succeed ? How do they `make predictions` ?**\n", + "\n", + "**Ans:**\n", + "\n", + "Model-based learning algorithms search for an optimal value for the model parameters such that the model will generalize well to new instances. We usually train such systems by minimizing a cost function that measures how bad the system is at making predictions on the training data, plus a penalty for model complexity if the model is regularized. 
To make predictions, we feed the new instance’s features into the model’s prediction function, using the parameter values found by the learning algorithm.\n", + "\n", + "**Q14. Can you name four of the main challenges in `Machine Learning` ?**\n", + "\n", + "**Ans:** The main challenges in Machine Learning include:\n", + "\n", + "Lack of data, poor data quality, nonrepresentative data, uninformative features, excessively simple models that underfit the training data, and excessively complex models that overfit the data.\n", + "\n", + "**Q15. If your `model performs great` on the `training data` but `generalizes poorly to new instances`, what is happening ? Can you name `three possible solutions` ?**\n", + "\n", + "**Ans:** If a model performs great on the training data but generalizes poorly to new instances, the model is likely overfitting the training data (or we got extremely lucky on the training data). Possible solutions to overfitting are getting more data, simplifying the model (selecting a simpler algorithm, reducing the number of parameters or features used, or regularizing the model), or reducing the noise in the training data.\n", + "\n", + "**Q16. What is a `test set`, and why would you want to use it ?**\n", + "\n", + "**Ans:** A test set is used to estimate the generalization error that a model will make on new instances before the model is launched in production.\n", + "\n", + "**Q17. What is the purpose of a `validation set` ?**\n", + "\n", + "**Ans:** A validation set is used to compare models. After training, the fitted model is used to predict the responses for the observations in the validation dataset. It makes it possible to select the best model and tune the hyperparameters. 
Validation datasets can be used for regularization by early stopping (stopping training when the error on the validation dataset increases, as this is a sign of overfitting to the training dataset). The test dataset is a dataset used to provide an unbiased evaluation of a final model fit on the training dataset. Cross-validation is a related technique that makes it possible to compare models (for model selection and hyperparameter tuning) without the need for a separate validation set, which saves precious training data.\n", + "\n", + "**Q18. What is the `train-dev set`, when do you need it, and how do you use it?**\n", + "\n", + "**Ans:** The train-dev set is used when there is a risk of mismatch between the training data and the data used in the validation and test sets. It is a part of the training set that is held out, so the model is not trained on it. The model is trained on the rest of the training set and evaluated on both the train-dev set and the validation set. If it performs well on the training set but poorly on the train-dev set, it is likely overfitting the training set; if it performs well on both but poorly on the validation set, there is probably a significant data mismatch.\n", + "\n", + "**Q19. What can go wrong if you tune `hyperparameters` using the test set?**\n", + "\n", + "**Ans:** If you tune hyperparameters using the test set, you risk overfitting the test set, and the measured generalization error will be optimistic (the model may perform worse in production than expected).\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.9" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}