Machine Learning, Data Science and Generative AI with Python

Video description

This course opens with environment setup on Microsoft Windows-based PCs, Linux desktops, and Macs, along with an optional Python crash course. After the setup, we delve into machine learning, AI, and data mining techniques, including deep learning and neural networks with TensorFlow and Keras; generative models with variational autoencoders and generative adversarial networks; data visualization in Python with Matplotlib and Seaborn; transfer learning, sentiment analysis, image recognition, and classification; and regression analysis, K-Means Clustering, Principal Component Analysis, training/testing and cross-validation, Bayesian methods, decision trees, and random forests.

Additionally, we will cover multiple regression, multilevel models, support vector machines, reinforcement learning, collaborative filtering, K-Nearest Neighbors, the bias/variance tradeoff, ensemble learning, term frequency/inverse document frequency, experimental design and A/B testing, feature engineering, and hyperparameter tuning. A dedicated section on machine learning with Apache Spark shows how to scale these techniques up to "big data" analyzed on a computing cluster.

The course also covers the Transformer architecture and the role of self-attention, explores applications of GPT, and practices fine-tuning Transformers for tasks such as movie review analysis. Finally, we work with the OpenAI API: chat completions for ChatGPT, image generation with DALL-E, embeddings, moderation, and speech-to-text.

What you will learn

  • Implement machine learning on a massive scale with Apache Spark’s MLlib
  • Visualize data with Matplotlib and Seaborn
  • Understand reinforcement learning and how to build a Pac-Man bot
  • Use train/test and K-Fold cross-validation to choose and tune models
  • Build artificial neural networks with TensorFlow and Keras
  • Design and evaluate A/B tests using T-Tests and P-Values

Audience

Software developers or programmers who want to transition into the lucrative data science career path will learn a lot from this course. Data analysts in finance or other non-tech industries who want to move into the tech industry can use it to learn how to analyze data using code instead of tools. You will need some prior experience in coding or scripting to be successful. If you have none, this course is not the place to start: the Python material in the early sections is a crash course, not an introduction to programming.

About the Author

Frank Kane spent nine years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers. He holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology and on teaching others about big data analysis.

Table of contents

  1. Chapter 1 : Getting Started
    1. Introduction
    2. [Activity] Windows: Installing and Using Anaconda and Course Materials
    3. [Activity] MAC: Installing and Using Anaconda and Course Materials
    4. [Activity] Linux: Installing and Using Anaconda and Course Materials
    5. Python Basics, Part 1 [Optional]
    6. [Activity] Python Basics, Part 2 [Optional]
    7. [Activity] Python Basics, Part 3 [Optional]
    8. [Activity] Python Basics, Part 4 [Optional]
    9. Introducing the Pandas Library [Optional]
  2. Chapter 2 : Statistics and Probability Refresher, and Python Practice
    1. Types of Data (Numerical, Categorical, Ordinal)
    2. Mean, Median, Mode
    3. [Activity] Using Mean, Median, and Mode in Python
    4. [Activity] Variation and Standard Deviation
    5. Probability Density Function; Probability Mass Function
    6. Common Data Distributions (Normal, Binomial, Poisson, and So On)
    7. [Activity] Percentiles and Moments
    8. [Activity] A Crash Course in matplotlib
    9. [Activity] Advanced Visualization with Seaborn
    10. [Activity] Covariance and Correlation
    11. [Exercise] Conditional Probability
    12. Exercise Solution: Conditional Probability of Purchase by Age
    13. Bayes' Theorem
  3. Chapter 3 : Predictive Models
    1. [Activity] Linear Regression
    2. [Activity] Polynomial Regression
    3. [Activity] Multiple Regression and Predicting Car Prices
    4. Multi-Level Models
  4. Chapter 4 : Machine Learning with Python
    1. Supervised Versus Unsupervised Learning, and Train/Test
    2. [Activity] Using Train/Test to Prevent Overfitting a Polynomial Regression
    3. Bayesian Methods: Concepts
    4. [Activity] Implementing a Spam Classifier with Naive Bayes
    5. K-Means Clustering
    6. [Activity] Clustering People Based on Income and Age
    7. Measuring Entropy
    8. [Activity] Windows: Installing GraphViz
    9. [Activity] MAC: Installing GraphViz
    10. [Activity] Linux: Installing GraphViz
    11. Decision Trees: Concepts
    12. [Activity] Decision Trees: Predicting Hiring Decisions
    13. Ensemble Learning
    14. [Activity] XGBoost
    15. Support Vector Machines (SVM) Overview
    16. [Activity] Using SVM to Cluster People Using Scikit-Learn
  5. Chapter 5 : Recommender Systems
    1. User-Based Collaborative Filtering
    2. Item-Based Collaborative Filtering
    3. [Activity] Finding Movie Similarities Using Cosine Similarity
    4. [Activity] Improving the Results of Movie Similarities
    5. [Activity] Making Movie Recommendations with Item-Based Collaborative Filtering
    6. [Exercise] Improve the Recommender's Results
  6. Chapter 6 : More Data Mining and Machine Learning Techniques
    1. K-Nearest-Neighbors: Concepts
    2. [Activity] Using KNN to Predict a Rating for a Movie
    3. Dimensionality Reduction; Principal Component Analysis (PCA)
    4. [Activity] PCA Example with the Iris Dataset
    5. Data Warehousing Overview: ETL and ELT
    6. Reinforcement Learning
    7. [Activity] Reinforcement Learning and Q-Learning with Gym
    8. Understanding a Confusion Matrix
    9. Measuring Classifiers (Precision, Recall, F1, ROC, AUC)
  7. Chapter 7 : Dealing with Real-World Data
    1. Bias/Variance Tradeoff
    2. [Activity] K-Fold Cross-Validation to Avoid Overfitting
    3. Data Cleaning and Normalization
    4. [Activity] Cleaning Web Log Data
    5. Normalizing Numerical Data
    6. [Activity] Detecting Outliers
    7. Feature Engineering and the Curse of Dimensionality
    8. Imputation Techniques for Missing Data
    9. Handling Unbalanced Data: Oversampling, Undersampling, and SMOTE
    10. Binning, Transforming, Encoding, Scaling, and Shuffling
  8. Chapter 8 : Apache Spark: Machine Learning on Big Data
    1. [Activity] Installing Spark - Part 1
    2. [Activity] Installing Spark - Part 2
    3. Spark Introduction
    4. Spark and the Resilient Distributed Dataset (RDD)
    5. Introducing MLLib
    6. Introduction to Decision Trees in Spark
    7. [Activity] K-Means Clustering in Spark
    8. TF / IDF
    9. [Activity] Searching Wikipedia with Spark
    10. [Activity] Using the Spark DataFrame API for MLLib
  9. Chapter 9 : Experimental Design / ML in the Real World
    1. Deploying Models to Real-Time Systems
    2. A/B Testing Concepts
    3. T-Tests and P-Values
    4. [Activity] Hands-On with T-Tests
    5. Determining How Long to Run an Experiment
    6. A/B Test Gotchas
  10. Chapter 10 : Deep Learning and Neural Networks
    1. Deep Learning Prerequisites
    2. The History of Artificial Neural Networks
    3. [Activity] Deep Learning in the TensorFlow Playground
    4. Deep Learning Details
    5. Introducing TensorFlow
    6. [Activity] Using TensorFlow, Part 1
    7. [Activity] Using TensorFlow, Part 2
    8. [Activity] Introducing Keras
    9. [Activity] Using Keras to Predict Political Affiliations
    10. Convolutional Neural Networks (CNNs)
    11. [Activity] Using CNNs for Handwriting Recognition
    12. Recurrent Neural Networks (RNNs)
    13. [Activity] Using a RNN for Sentiment Analysis
    14. [Activity] Transfer Learning
    15. Tuning Neural Networks: Learning Rate and Batch Size Hyperparameters
    16. Deep Learning Regularization with Dropout and Early Stopping
    17. The Ethics of Deep Learning
  11. Chapter 11 : Generative Models
    1. Variational Auto-Encoders (VAEs) - How They Work
    2. Variational Auto-Encoders (VAE) - Hands-On with Fashion MNIST
    3. Generative Adversarial Networks (GANs) - How They Work
    4. Generative Adversarial Networks (GANs) - Playing with Some Demos
    5. Generative Adversarial Networks (GANs) - Hands-On with Fashion MNIST
    6. Learning More about Deep Learning
  12. Chapter 12 : Generative AI: GPT, ChatGPT, Transformers, Self-Attention Based Neural Networks
    1. The Transformer Architecture (encoders, decoders, and self-attention.)
    2. Self-Attention, Masked Self-Attention, and Multi-Headed Self Attention in depth
    3. Applications of Transformers (GPT)
    4. How GPT Works, Part 1: The GPT Transformer Architecture
    5. How GPT Works, Part 2: Tokenization, Positional Encoding, Embedding
    6. Fine Tuning / Transfer Learning with Transformers
    7. [Activity] Tokenization with Google CoLab and HuggingFace
    8. [Activity] Positional Encoding
    9. [Activity] Masked, Multi-Headed Self Attention with BERT, BERTViz, and exBERT
    10. [Activity] Using small and large GPT models within Google CoLab and HuggingFace
    11. [Activity] Fine Tuning GPT with the IMDb dataset
    12. From GPT to ChatGPT: Deep Reinforcement Learning, Proximal Policy Gradients
    13. From GPT to ChatGPT: Reinforcement Learning from Human Feedback and Moderation
  13. Chapter 13 : The OpenAI API (Developing with GPT and ChatGPT)
    1. [Activity] The OpenAI Chat Completions API
    2. [Activity] Using Functions in the OpenAI Chat Completion API
    3. [Activity] The Images (DALL-E) API in OpenAI
    4. [Activity] The Embeddings API in OpenAI: Finding similarities between words
    5. [Activity] The Completions API in OpenAI
    6. The Legacy Fine-Tuning API for GPT Models in OpenAI
    7. [Demo] Fine-Tuning OpenAI's Davinci Model to simulate Data from Star Trek
    8. The New OpenAI Fine-Tuning API; Fine-Tuning GPT-3.5 to simulate Commander Data!
    9. [Activity] The OpenAI Moderation API
    10. [Activity] The OpenAI Audio API (speech to text)
  14. Chapter 14 : Final Project
    1. Your Final Project Assignment: Mammogram Classification
    2. Final Project Review
  15. Chapter 15 : You Made It!
    1. More to Explore

Product information

  • Title: Machine Learning, Data Science and Generative AI with Python
  • Author(s): Frank Kane
  • Release date: August 2022
  • Publisher(s): Packt Publishing
  • ISBN: 9781787127081