Lesson 1
An Introduction to Your Nanodegree Program
Welcome! We're so glad you're here. Join us in learning a bit more about what to expect and ways to succeed.
Predictive analytics
scikit-learn
Lesson 2
Getting Help
You are starting a challenging but rewarding journey! Take 5 minutes to read how to get help with projects and content.
Lesson 3
The Skills That Set You Apart
Learn the data science process, including how to build effective data visualizations, and how to communicate with various stakeholders.
Lesson 1
Welcome to the Course
This lesson gives you an overview of the course and discusses prerequisites and stakeholders.
Lesson 2
The Data Science Process
In this lesson, you will learn about CRISP-DM and how you can apply it to many data science problems.
Lesson 3
Communicating to Stakeholders
In this lesson, you will be creating a post to communicate your findings via Medium.
Lesson 4 • Project
Project: Writing a Data Scientist Blog Post
In this project, you will create a blog post and GitHub repository that you can use as you build your data science portfolio.
Software engineering skills are increasingly important for data scientists. In this course, you'll learn best practices for writing software. Then you'll work on your software skills by coding a Python package and a web data dashboard.
Lesson 1
Introduction to Software Engineering
Welcome to Software Engineering for Data Scientists! Learn about the course and meet your instructors.
Lesson 2
Software Engineering Practices Pt I
Learn software engineering practices and how they apply in data science. Part one covers clean and modular code, code efficiency, refactoring, documentation, and version control.
Lesson 3
Software Engineering Practices Pt II
Learn software engineering practices and how they apply in data science. Part two covers testing code, logging, and conducting code reviews.
Lesson 4
Introduction to Object-Oriented Programming
Learn the basics of object-oriented programming so that you can build your own Python package.
Lesson 5
Portfolio Exercise: Upload a Package to PyPI
Create your own Python package and upload it to PyPI.
Lesson 6
Web Development
Develop a data dashboard using Flask, Bootstrap, Plotly and Pandas.
Lesson 7
Portfolio Exercise: Deploy a Data Dashboard
Customize the data dashboard from the previous lesson to make it your own. Upload the dashboard to the web.
In data engineering for data scientists, you will practice building ETL, NLP, and machine learning pipelines. This will prepare you for the project with our industry partner Figure 8.
Lesson 1
Introduction to Data Engineering
You will get an introduction to the data engineering for data scientists course and project. The lessons include ETL pipelines, natural language pipelines, and machine learning pipelines.
Lesson 2
ETL Pipelines
ETL stands for extract, transform, and load. This is the most common type of data pipeline, and you will practice each step in this lesson.
Lesson 3
NLP Pipelines
In order to complete the project at the end of the course, you will need some natural language processing skills. Here you will practice engineering machine learning features from text data.
Lesson 4
Machine Learning Pipelines
You'll use the Scikit-Learn package to code a machine learning pipeline. With these skills, you can ingest data, create features, and train a machine learning algorithm in just one step.
Lesson 5 • Project
Project: Disaster Response Pipeline
You’ll build a machine learning pipeline to categorize emergency messages based on the needs communicated by the sender.
Learn to design experiments and analyze A/B test results. Explore approaches for building recommendation systems.
Lesson 1
Intro to Experimental Design & Recommendation Engines
Why do we care about experiment design and recommendation engines? In this lesson, you'll get an overview of the topics you'll learn in this course.
Lesson 2
Concepts in Experiment Design
In this lesson, you will learn about conceptual topics that must be considered when designing and running an experiment, in order to ensure good, interpretable results.
Lesson 3
Statistical Considerations in Testing
In this lesson, you will learn how statistics can be used to benefit the design of an experiment, as well as additional statistical tests that can be used to analyze results.
Lesson 4
A/B Testing Case Study
In this lesson, you will go through an A/B Testing case study to see how the conceptual and statistical concepts covered in the previous lessons can be applied in experiment designs.
Lesson 5
Portfolio Exercise: Starbucks
In this lesson, you will analyze data that was originally used in screening interviews for data scientists at Starbucks.
Lesson 6
Introduction to Recommendation Engines
In this lesson, you will learn about the different methods used to create recommendation engines.
Lesson 7
Matrix Factorization for Recommendations
In this lesson, you will learn how machine learning is being used to make recommendations.
Lesson 8 • Project
Project: Recommendation Engines
Put your skills to work to make recommendations for IBM Watson Studio's data platform.
Leverage what you’ve learned throughout the program to build your own open-ended Data Science project. This project will serve as a demonstration of your valuable abilities as a Data Scientist.
Lesson 1 • Project
Data Scientist Capstone
Now you will put your Data Science skills to the test by solving a real world problem using all that you have learned throughout the program.
Congratulations on finishing your program!
Lesson 1
Congratulations!
Congratulations on your graduation from this program! Please join us in celebrating your accomplishments.
Optional Courses
Lesson 1
Neural Networks
Luis will give you an overview of logistic regression, gradient descent, and the building blocks of neural networks.
Lesson 2
Deep Neural Networks
A deeper dive into backpropagation and the training process of neural networks, including techniques to improve the training.
Lesson 3
Convolutional Neural Networks
Alexis explains the theory behind Convolutional Neural Networks and how they help us dramatically improve performance in image classification.
Lesson 1
Introduction to the Course
In this lesson, you will learn more about this course - what will be covered, and who you will be learning from - let's get started!
Lesson 2
The Power of Spark
In this lesson, you will learn about the problems that Apache Spark is designed to solve. You'll also learn about the greater Big Data ecosystem and how Spark fits into it.
Lesson 3
Data Wrangling with Spark
In this lesson, we'll dive into how to use Spark for cleaning and aggregating data.
Lesson 4
Debugging and Optimization
In this lesson, we will cover various troubleshooting techniques and potential ways of optimizing the performance of your Spark applications.
Lesson 5
Machine Learning with Spark
In this lesson, we'll explore Spark's ML capabilities and build ML models and pipelines.
Lesson 6 • Project
[DSND Capstone] Cloud Deployment Instructions
Lesson 1
Why Python Programming
Welcome to Introduction to Python! Here's an overview of the course.
Lesson 2
Data Types and Operators
Familiarize yourself with the building blocks of Python! Learn about data types and operators, compound data structures, type conversion, built-in functions, and style guidelines.
Lesson 3
Control Flow
Build logic into your code with control flow tools! Learn about conditional statements, repeating code with loops and useful built-in functions, and list comprehensions.
Lesson 4
Functions
Learn how to use functions to improve and reuse your code! Learn about functions, variable scope, documentation, lambda expressions, iterators, and generators.
Lesson 5
Scripting
Set up your own programming environment to write and run Python scripts locally! Learn good scripting practices, interact with different inputs, and discover awesome tools.
Lesson 6
NumPy
Learn the basics of NumPy and how to use it to create and manipulate arrays.
Lesson 7
Pandas
Learn the basics of Pandas Series and DataFrames and how to use them to load and process data.
Lesson 1
Basic SQL
In this section, you will gain knowledge about SQL basics for working with a single table. You will learn the key commands to filter a table in many different ways.
Lesson 2
SQL Joins
In this lesson, you will learn how to combine data from multiple tables together.
Lesson 3
SQL Aggregations
In this lesson, you will learn how to aggregate data using SQL functions like SUM, AVG, and COUNT. Additionally, CASE, HAVING, and DATE functions give you a powerful problem-solving toolkit.
Lesson 4
SQL Subqueries & Temporary Tables
In this lesson, you will be learning to answer much more complex business questions using nested querying methods - also known as subqueries.
Lesson 5
SQL Data Cleaning
Cleaning data is an important part of the data analysis process. You will be learning how to perform data cleaning using SQL in this lesson.
Lesson 6
[Advanced] SQL Window Functions
Compare one row to another without doing any joins using one of the most powerful concepts in SQL data analysis: window functions.
Lesson 7
[Advanced] SQL Advanced JOINs & Performance Tuning
Learn advanced joins and how to make queries that run quickly across giant datasets. Most of the examples in the lesson involve edge cases, some of which come up in interviews.
Lesson 1
Data Visualization in Data Analysis
In this lesson, see the motivations for why data visualization is an important part of the data analysis process and where it fits in.
Lesson 2
Design of Visualizations
Learn about elements of visualization design, especially to avoid those elements that can cause a visualization to fail.
Lesson 3
Univariate Exploration of Data
In this lesson, you will see how you can use matplotlib and seaborn to produce informative visualizations of single variables.
Lesson 4
Bivariate Exploration of Data
In this lesson, build up from your understanding of individual variables and learn how to use matplotlib and seaborn to look at relationships between two variables.
Lesson 5
Multivariate Exploration of Data
In this lesson, see how you can use matplotlib and seaborn to visualize relationships and interactions between three or more variables.
Lesson 6
Explanatory Visualizations
Previous lessons covered how you could use visualizations to learn about your data. In this lesson, see how to polish up those plots to convey your findings to others!
Lesson 7
Visualization Case Study
Put to practice the concepts you've learned about exploratory and explanatory data visualization in this case study on factors that impact diamond prices.
Lesson 1
Shell Workshop
The Unix shell is a powerful tool for developers of all sorts. In this lesson, you'll get a quick introduction to the very basics of using it on your own computer.
Lesson 1
What is Version Control?
Version control is an incredibly important part of a professional programmer's life. In this lesson, you'll learn about the benefits of version control and install the version control tool Git!
Lesson 2
Create A Git Repo
Now that you've learned the benefits of Version Control and gotten Git installed, it's time you learn how to create a repository.
Lesson 3
Review a Repo's History
Knowing how to review an existing Git repository's history of commits is extremely important. You'll learn how to do just that in this lesson.
Lesson 4
Add Commits To A Repo
A repository is nothing without commits. In this lesson, you'll learn how to make commits, write descriptive commit messages, and verify the changes you're about to save to the repository.
Lesson 5
Tagging, Branching, and Merging
Being able to work on your project in isolation from other changes will multiply your productivity. You'll learn how to do this isolated development with Git's branches.
Lesson 6
Undoing Changes
Help! Disaster has struck! You don't have to worry, though, because your project is tracked in version control! You'll learn how to undo and modify changes that have been saved to the repository.
Lesson 7
Working With Remotes
You'll learn how to create remote repositories on GitHub and how to get and send changes to the remote repository.
Lesson 8
Working On Another Developer's Repository
In this lesson, you'll learn how to fork another developer's project. Collaborating with other developers can be a tricky process, so you'll learn how to contribute to a public project.
Lesson 9
Staying In Sync With A Remote Repository
You'll learn how to send suggested changes to another developer by using pull requests. You'll also learn how to use the powerful `git rebase` command to squash commits together.
Lesson 1
Introduction
Take a sneak peek into the beautiful world of Linear Algebra and learn why it is such an important mathematical tool.
Lesson 2
Vectors
Learn about vectors, the basic building block of Linear Algebra.
Lesson 3
Linear Combination
Learn how to scale and add vectors and how to visualize the process.
Lesson 4
Linear Transformation and Matrices
What is a linear transformation and how is it directly related to matrices? Learn how to apply the math and visualize the concept.
Lesson 1
Descriptive Statistics - Part I
In this lesson, you will learn about data types, measures of center, and the basics of statistical notation.
Lesson 2
Descriptive Statistics - Part II
In this lesson, you will learn about measures of spread, shape, and outliers as associated with quantitative data. You will also get a first look at inferential statistics.
Lesson 3
Admissions Case Study
Learn to ask the right questions, as you learn about Simpson's Paradox.
Lesson 4
Probability
Learn the basics of probability using coins and dice.
Lesson 5
Binomial Distribution
Learn about one of the most popular distributions in probability - the Binomial Distribution.
Lesson 6
Conditional Probability
Not all events are independent. Learn the probability rules for dependent events.
Lesson 7
Bayes' Rule
Learn one of the most popular rules in all of statistics - Bayes' rule.
Lesson 8
Python Probability Practice
Take what you have learned in the last lessons and put it to practice in Python.
Lesson 9
Normal Distribution Theory
Learn the mathematics behind moving from a coin flip to a normal distribution.
Lesson 10
Sampling distributions and the Central Limit Theorem
Learn all about the underpinning of confidence intervals and hypothesis testing - sampling distributions.
Lesson 11
Confidence Intervals
Learn how to use sampling distributions and bootstrapping to create a confidence interval for any parameter of interest.
Lesson 12
Hypothesis Testing
Learn the necessary skills to create and analyze the results in hypothesis testing.
Lesson 13
Case Study: A/B tests
Work through a case study of how A/B testing works for an online education company called Audacity.
Lesson 14
Regression
Use Python to fit linear regression models, and learn how to interpret their results.
Lesson 15
Multiple Linear Regression
Learn to apply multiple linear regression models in Python. Learn to interpret the results and understand if your model fits well.
Lesson 16
Logistic Regression
Learn to apply logistic regression models in Python. Learn to interpret the results and understand if your model fits well.
Lesson 1 • Project
Take 30 Min to Improve your LinkedIn
Find your next job or connect with industry peers on LinkedIn. Ensure your profile attracts relevant leads that will grow your professional network.
Lesson 2 • Project
Optimize Your GitHub Profile
Other professionals are collaborating on GitHub and growing their network. Submit your profile to ensure it is on par with leaders in your field.