Fixed some spelling errors. #32

6 changes: 3 additions & 3 deletions mlops-roadshow/1-data-prep-feature-store.ipynb
@@ -552,7 +552,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Ok, so you can process data locally, but this is a smaller dataset. What if you need to process hundreds of gigabytes or even terabytes of data? The processing done so far has been constrained by local resources; this notebook is being run on a single instance type that has memory and compute contraints so we can only process so much data with it.\n",
"Ok, so you can process data locally, but this is a smaller dataset. What if you need to process hundreds of gigabytes or even terabytes of data? The processing done so far has been constrained by local resources; this notebook is being run on a single instance type that has memory and compute constraints so we can only process so much data with it.\n",
"\n",
"In order to process larger amounts of data in a reasonable time, we really need to distribute our processing across a cluster of instances. Fortunately, SageMaker has a feature called SageMaker Processing that can help us with this task."
]
@@ -563,9 +563,9 @@
"source": [
"## SageMaker Processing\n",
" \n",
"To process large amounts of data, we fortunately will not need to write distributed code oursleves. Instead, we can use [SageMaker Processing](https://docs.aws.amazon.com/sagemaker/latest/dg/processing-job.html) which will do all the processing _outside_ of this notebook's resources and will apply our processing script to multiple data files in parallel.\n",
"To process large amounts of data, we fortunately will not need to write distributed code ourselves. Instead, we can use [SageMaker Processing](https://docs.aws.amazon.com/sagemaker/latest/dg/processing-job.html) which will do all the processing _outside_ of this notebook's resources and will apply our processing script to multiple data files in parallel.\n",
" \n",
"Keep in mind that inn a typical SageMaker workflow, notebooks are only used for prototyping and can be run on relatively inexpensive and less powerful instances, while processing, training and model hosting tasks are run on separate, more powerful SageMaker-managed instances. SageMaker Processing includes off-the-shelf support for [scikit-learn](https://docs.aws.amazon.com/sagemaker/latest/dg/use-scikit-learn-processing-container.html), [PySpark](https://docs.aws.amazon.com/sagemaker/latest/dg/use-spark-processing-container.html), and [other frameworks](https://docs.aws.amazon.com/sagemaker/latest/dg/processing-job-frameworks.html) like Hugging Face, MXNet, PyTorch, TensorFlow, and XGBoost. You can even a Bring Your Own Container if one our our built-in containers does not suit your use case."
"Keep in mind that in a typical SageMaker workflow, notebooks are only used for prototyping and can be run on relatively inexpensive and less powerful instances, while processing, training and model hosting tasks are run on separate, more powerful SageMaker-managed instances. SageMaker Processing includes off-the-shelf support for [scikit-learn](https://docs.aws.amazon.com/sagemaker/latest/dg/use-scikit-learn-processing-container.html), [PySpark](https://docs.aws.amazon.com/sagemaker/latest/dg/use-spark-processing-container.html), and [other frameworks](https://docs.aws.amazon.com/sagemaker/latest/dg/processing-job-frameworks.html) like Hugging Face, MXNet, PyTorch, TensorFlow, and XGBoost. You can even a Bring Your Own Container if one of our built-in containers does not suit your use case."
]
},
{
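For reference, a minimal sketch of the kind of SageMaker Processing job the edited cells describe, assuming the SageMaker Python SDK v2, an execution role available to the notebook, and an illustrative `preprocessing.py` script and S3 prefixes (all names here are hypothetical, not part of this PR):

```python
# Sketch only: run a scikit-learn preprocessing script on managed instances,
# outside the notebook's own resources, sharding the input files across instances.
import sagemaker
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.processing import ProcessingInput, ProcessingOutput

session = sagemaker.Session()
role = sagemaker.get_execution_role()
bucket = session.default_bucket()

sklearn_processor = SKLearnProcessor(
    framework_version="1.0-1",     # built-in scikit-learn processing container
    role=role,
    instance_type="ml.m5.xlarge",
    instance_count=2,              # distribute the work across two instances
)

sklearn_processor.run(
    code="preprocessing.py",       # hypothetical script with the pandas/scikit-learn logic
    inputs=[
        ProcessingInput(
            source=f"s3://{bucket}/mlops-roadshow/raw/",             # illustrative input prefix
            destination="/opt/ml/processing/input",
            s3_data_distribution_type="ShardedByS3Key",              # split files across instances
        )
    ],
    outputs=[
        ProcessingOutput(
            source="/opt/ml/processing/output",
            destination=f"s3://{bucket}/mlops-roadshow/processed/",  # illustrative output prefix
        )
    ],
)
```

Setting `s3_data_distribution_type="ShardedByS3Key"` is what lets the job apply the script to multiple data files in parallel, one shard per instance, as the cell text describes.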