Description
You will learn the skills required to be a successful Data Scientist. Working on projects designed by industry experts, you'll learn how to run data pipelines, design experiments, build recommendation systems, and deploy solutions to the cloud.
Syllabus:
Course 1: Solving Data Science Problems
The Data Science Process
- Apply the CRISP-DM process to business applications
- Wrangle, explore, and analyze a dataset
- Apply machine learning for prediction
- Apply statistics for descriptive and inferential understanding
- Draw conclusions that motivate others to act on your results
Communicating with Stakeholders
- Implement best practices in sharing your code and written summaries
- Learn what makes a great data science blog
- Learn how to share your ideas with the data science community
Project: Write a Data Science Blog Post
In this project, you will select a dataset, identify three questions, and analyze the data to answer them. To communicate your findings to the appropriate audience, you will create a GitHub repository for your project and write a blog post. This project will help you reinforce and extend your knowledge of machine learning, data visualization, and communication.
Course 2: Software Engineering for Data Scientists
Software Engineering Practices
- Write clean, modular, and well-documented code
- Refactor code for efficiency
- Create unit tests to test programs
- Write useful programs in multiple scripts
- Track actions and results of processes with logging
- Conduct and receive code reviews
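The unit-testing practice above can be sketched with Python's built-in unittest module. The function and test names here are hypothetical illustrations, not course material:

```python
import unittest

def days_to_seconds(days):
    """Convert a number of days to seconds."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return days * 24 * 60 * 60

class TestDaysToSeconds(unittest.TestCase):
    def test_one_day(self):
        self.assertEqual(days_to_seconds(1), 86400)

    def test_negative_input_raises(self):
        with self.assertRaises(ValueError):
            days_to_seconds(-1)

# Run with: python -m unittest <module_name>
```

Keeping each behavior in its own small test method makes failures easy to localize during code review.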
Object-Oriented Programming
- Understand when to use object-oriented programming
- Build and use classes
- Understand magic methods
- Write programs that include multiple classes, and follow good code structure
- Learn how large, modular Python packages, such as pandas and scikit-learn, use object-oriented programming
- Portfolio Exercise: Build your own Python package
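A minimal sketch of the OOP ideas above: a class with attributes, methods, and "magic" (dunder) methods. The Gaussian example is a hypothetical illustration, not the course's actual exercise:

```python
import math

class Gaussian:
    """A Gaussian distribution described by its mean and standard deviation."""

    def __init__(self, mean=0.0, stdev=1.0):
        self.mean = mean
        self.stdev = stdev

    def pdf(self, x):
        """Probability density of the distribution at x."""
        coeff = 1.0 / (self.stdev * math.sqrt(2 * math.pi))
        return coeff * math.exp(-0.5 * ((x - self.mean) / self.stdev) ** 2)

    def __add__(self, other):
        """The sum of two independent Gaussians is itself Gaussian."""
        return Gaussian(self.mean + other.mean,
                        math.sqrt(self.stdev ** 2 + other.stdev ** 2))

    def __repr__(self):
        return f"Gaussian(mean={self.mean}, stdev={self.stdev})"
```

Defining `__add__` and `__repr__` lets instances work naturally with `+` and `print`, the same pattern large packages use to make their objects feel built-in.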
Web Development
- Learn about the components of a web app
- Build a web application that uses Flask, Plotly, and the Bootstrap framework
- Portfolio Exercise: Build a data dashboard using a dataset of your choice and deploy it to a web application
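A minimal sketch of a Flask app like the one described above, assuming Flask is installed; the route and the Plotly payload are hypothetical placeholders for a real dashboard:

```python
import json
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    # In the real dashboard this would render a template embedding
    # Plotly figures; here we return a tiny JSON figure payload.
    figure = {"data": [{"type": "bar", "x": ["a", "b"], "y": [1, 2]}]}
    return json.dumps(figure)

# Run locally with: flask --app <module_name> run
```

Plotly figures serialize to JSON, which is why a Flask route returning a JSON payload is enough for the front end to render a chart.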
Course 3: Data Engineering for Data Scientists
ETL Pipelines
- Understand what ETL pipelines are
- Access and combine data from CSV, JSON, logs, APIs, and databases
- Standardize encodings and columns
- Normalize data and create dummy variables
- Handle outliers, missing values, and duplicated data
- Engineer new features by running calculations
- Build a SQLite database to store cleaned data
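The steps above can be sketched as a small extract-transform-load script, assuming pandas is available; the column names and values are hypothetical:

```python
import sqlite3
import pandas as pd

# Extract: in practice, read from CSV/JSON/APIs; here, an in-memory frame.
df = pd.DataFrame({
    "country": ["US", "US", None, "DE"],
    "gdp": [21.4, 21.4, 3.8, 3.8],
})

# Transform: drop duplicates and missing values, engineer a new
# feature, and create dummy variables.
df = df.drop_duplicates().dropna()
df["gdp_billions"] = df["gdp"] * 1000
df = pd.get_dummies(df, columns=["country"])

# Load: store the cleaned data in a SQLite database.
conn = sqlite3.connect(":memory:")
df.to_sql("economy", conn, index=False, if_exists="replace")
rows = conn.execute("SELECT COUNT(*) FROM economy").fetchone()[0]
```

Writing the cleaned table to SQLite means downstream modeling code can query it without repeating the cleaning steps.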
Natural Language Processing
- Prepare text data for analysis with tokenization, lemmatization, and removing stop words
- Use scikit-learn to transform and vectorize text data
- Build features with bag of words and tf-idf
- Extract features with tools such as named entity recognition and part of speech tagging
- Build an NLP model to perform sentiment analysis
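The bag-of-words and tf-idf steps above can be sketched with scikit-learn; the two-message corpus is a hypothetical example:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

corpus = [
    "Water needed urgently after the storm",
    "The storm destroyed the water supply",
]

# Bag of words: raw token counts, with English stop words removed.
bow = CountVectorizer(stop_words="english")
counts = bow.fit_transform(corpus)

# Tf-idf: reweight the counts so terms that appear in every
# document contribute less than distinctive terms.
tfidf = TfidfVectorizer(stop_words="english")
weights = tfidf.fit_transform(corpus)
```

Both vectorizers produce sparse document-term matrices, so the same feature-building code scales to large message corpora.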
Machine Learning Pipelines
- Understand the advantages of using machine learning pipelines to streamline the data preparation and modeling process
- Chain data transformations and an estimator with scikit-learn's Pipeline
- Use feature unions to perform steps in parallel and create more complex workflows
- Grid search over the pipeline to optimize parameters for the entire workflow
- Complete a case study to build a full machine learning pipeline that prepares data and creates a model for a dataset
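A compact sketch of the pipeline ideas above: chain a vectorizer and a classifier, then grid search over the whole workflow at once. The toy messages and parameter grid are hypothetical:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X = ["need water", "need food", "all clear here",
     "everything is fine", "send water now", "we are safe"]
y = [1, 1, 0, 0, 1, 0]  # 1 = message expresses a need

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression()),
])

# Parameters are addressed as "<step-name>__<parameter>", so one grid
# search tunes the transformer and the estimator together.
grid = GridSearchCV(
    pipeline,
    param_grid={"tfidf__ngram_range": [(1, 1), (1, 2)],
                "clf__C": [0.1, 1.0]},
    cv=2,
)
grid.fit(X, y)
```

Because the vectorizer is fit inside each cross-validation fold, the pipeline also prevents the data leakage that occurs when text features are built before splitting.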
Project: Build Disaster Response Pipelines with Figure Eight
Figure Eight (formerly Crowdflower) used crowdsourcing to tag and translate messages in order to apply artificial intelligence to disaster relief. In this project, you will create a data pipeline to prepare message data from major natural disasters worldwide, and a machine learning pipeline to categorize emergency text messages based on the need expressed by the sender.
Course 4: Experiment Design and Recommendations
Experiment Design
- Understand how to set up an experiment, and the ideas associated with experiments vs. observational studies
- Define control and test conditions
- Choose control and testing groups
Statistical Concerns of Experimentation
- Applications of statistics in the real world
- Establishing key metrics
- SMART experiments: Specific, Measurable, Actionable, Realistic, Timely
A/B Testing
- How it works and its limitations
- Sources of Bias: Novelty and Recency Effects
- Multiple Comparison Techniques (FDR, Bonferroni, Tukey)
- Portfolio Exercise: Use a technical screener from Starbucks to analyze the results of an experiment and write up your findings
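The Bonferroni correction listed above can be sketched in a few lines: when running m tests at family-wise error rate alpha, compare each p-value against alpha / m. The p-values below are hypothetical:

```python
def bonferroni(p_values, alpha=0.05):
    """Return which hypotheses remain significant after correction."""
    m = len(p_values)
    threshold = alpha / m
    return [p < threshold for p in p_values]

p_values = [0.004, 0.030, 0.011, 0.200]
significant = bonferroni(p_values)  # only p < 0.05/4 = 0.0125 survive
```

Bonferroni is the most conservative of the techniques listed; FDR methods trade some false-positive control for more statistical power across many metrics.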
Introduction to Recommendation Engines
- Distinguish between common techniques for creating recommendation engines including knowledge based, content based, and collaborative filtering based methods.
- Implement each of these techniques in Python.
- List business goals associated with recommendation engines, and be able to recognize which of these goals are most easily met with existing recommendation techniques.
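The collaborative-filtering idea above can be sketched with numpy: find the user whose rating pattern is most similar and recommend what they liked. The ratings matrix (rows = users, columns = items) is a hypothetical example:

```python
import numpy as np

ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 0.0, 2.0],
    [1.0, 0.0, 5.0, 4.0],
])

def cosine_similarity(u, v):
    """Cosine of the angle between two rating vectors."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Similarity of every user to user 0; the most similar other user's
# liked items become candidate recommendations for user 0.
sims = [cosine_similarity(ratings[0], ratings[i]) for i in range(len(ratings))]
most_similar = int(np.argmax(sims[1:])) + 1  # skip user 0 itself
```

Knowledge-based and content-based methods differ only in what the similarity is computed over: item attributes or user-stated constraints instead of rating vectors.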
Matrix Factorization for Recommendations
- Understand the pitfalls of traditional methods, and of measuring the influence of recommendation engines with standard regression and classification techniques.
- Create recommendation engines using matrix factorization and FunkSVD
- Interpret the results of matrix factorization to better understand latent features of customer data
- Identify common pitfalls of recommendation engines, such as the cold start problem and the difficulty of assessing their effectiveness with the usual evaluation techniques, along with potential solutions.
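A minimal FunkSVD-style sketch for the ideas above: factor a ratings matrix into user and item latent-feature matrices, training only on the observed (non-NaN) entries via gradient descent. The matrix sizes and hyperparameters are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(42)
ratings = np.array([
    [5.0, 4.0, np.nan],
    [4.0, np.nan, 1.0],
    [np.nan, 5.0, 2.0],
])
n_users, n_items = ratings.shape
k = 2  # number of latent features

U = rng.normal(scale=0.1, size=(n_users, k))  # user latent features
V = rng.normal(scale=0.1, size=(n_items, k))  # item latent features

lr = 0.01
for _ in range(2000):
    for i in range(n_users):
        for j in range(n_items):
            if np.isnan(ratings[i, j]):
                continue  # skip missing entries: the key FunkSVD idea
            err = ratings[i, j] - U[i] @ V[j]
            u_old = U[i].copy()
            U[i] += lr * err * V[j]
            V[j] += lr * err * u_old

predicted = U @ V.T  # every cell is now filled, including the NaNs
```

The learned rows of U and V are the latent features: inspecting which items load on the same feature is how matrix factorization helps interpret customer data.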
Project: Design a Recommendation Engine with IBM
Members of IBM's online data science community can share tutorials, notebooks, articles, and datasets. In this project, you will create a recommendation engine on IBM Watson Studio's data platform, based on user behavior and social networks, to surface the content most likely to be relevant to a user.
Course 5: Data Science Projects
Elective 1: Dog Breed Classification
- Use convolutional neural networks to classify different dogs according to their breeds
- Deploy your model so others can upload images of their dogs and receive the corresponding breed predictions.
- Complete one of the most popular projects in Udacity history, and show the world how you can use your deep learning skills to entertain an audience!
Elective 2: Starbucks
- Use purchasing habits to design discount offers that acquire and retain customers
- Identify groups of individuals that are most likely to be responsive to rebates.
Elective 3: Arvato Financial Services
- Work through a real-world dataset and challenge provided by Arvato Financial Services, a Bertelsmann company
- Top performers have a chance at an interview with Arvato or another Bertelsmann company!
Elective 4: Spark for Big Data
- Take a course on Apache Spark and complete a project using a massive, distributed dataset to predict customer churn
- Learn to deploy your Spark cluster on either AWS or IBM Cloud
Elective 5: Your Choice
- Use your skills to tackle any other project of your choice
Project: Data Science Capstone Project
In this capstone project, you will use what you've learned throughout the program to create a data science project of your choice. You will define the problem you want to solve, identify and explore the data, then conduct your analyses and draw conclusions. You will present your findings and analysis in a blog post and a GitHub repository.