Solid machine learning foundations presented by a world-leading expert, covering the full life cycle of machine learning development applied to enterprise-grade projects. The course is based on the instructor’s book Intuitive Machine Learning. It includes Python coding, scientific computing, optimization algorithms, explainable AI, and state-of-the-art methods favoring simplicity, scalability, reusability, replicability, fast implementation, and easy maintenance.
Vincent Granville is a pioneering data scientist and machine learning expert, co-founder of Data Science Central (acquired by TechTarget in 2020), former VC-funded executive, author, and patent owner. His past corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, CNET, and InfoSpace. He is also a former post-doc at Cambridge University and at the National Institute of Statistical Sciences (NISS).
Vincent has published in the Journal of Number Theory, the Journal of the Royal Statistical Society (Series B), and IEEE Transactions on Pattern Analysis and Machine Intelligence. He is also the author of multiple books, available here. He lives in Washington state and enjoys doing research on stochastic processes, dynamical systems, experimental math, and probabilistic number theory.
Software engineers, data analysts, and data scientists.
Be able to complete machine learning projects from beginning to end, ranging from NLP, clustering, and regression to computer vision.
Covers the core of machine learning, including classification, clustering, regression, structuring unstructured data, cross-validation, model fitting, feature selection, and ensemble methods such as boosted trees. Nearest neighbor graphs and deep neural networks are discussed in the context of GPU machine learning: classifying data with image processing techniques, after turning tabular data into images (see the sketch below). A new, simple clustering and mode-finding algorithm with an exact solution is also introduced and compared to K-means.
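The exact tabular-to-image conversion used in the course is not described here; as one simple way such a conversion could work, the sketch below scales each row to [0, 1] and reshapes it into a small grayscale grid suitable for image-based classifiers. The function name and grid size are assumptions for illustration, not course code.

```python
# Minimal sketch (assumed approach, not the course's method): turning tabular
# rows into small grayscale "images" for image-based classifiers.
import numpy as np

def rows_to_images(X, side=4):
    """Scale each feature to [0, 1], zero-pad the rows, reshape to side x side grids."""
    X = np.asarray(X, dtype=float)
    mins, maxs = X.min(axis=0), X.max(axis=0)
    scaled = (X - mins) / np.where(maxs > mins, maxs - mins, 1.0)
    padded = np.zeros((X.shape[0], side * side))
    padded[:, :X.shape[1]] = scaled[:, :side * side]
    return padded.reshape(-1, side, side)

# Example: 100 rows with 10 features become 100 images of shape 4 x 4.
rng = np.random.default_rng(0)
images = rows_to_images(rng.normal(size=(100, 10)))
print(images.shape)  # (100, 4, 4)
```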
Testing black-box systems, designing better ones, and leveraging rich synthetic data to improve the robustness of predictions, minimize overfitting, and assess when an algorithm performs well and when it does not. These techniques are useful for wide data and fraud analysis. This module also covers bootstrapping, alternatives to R-squared, minimum contrast estimation, and dual confidence regions.
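Of the topics listed above, bootstrapping lends itself to a compact illustration. The sketch below is a generic percentile bootstrap for the median, not code from the course; the statistic, sample, and number of resamples are arbitrary choices.

```python
# Generic percentile-bootstrap sketch: confidence interval for the median
# obtained by resampling the data with replacement.
import numpy as np

def bootstrap_ci(data, stat=np.median, n_boot=5000, alpha=0.05, seed=42):
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    estimates = np.array([
        stat(rng.choice(data, size=data.size, replace=True))
        for _ in range(n_boot)
    ])
    lo, hi = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return lo, hi

sample = np.random.default_rng(0).exponential(scale=2.0, size=200)
print(bootstrap_ci(sample))  # approximate 95% CI for the median
```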
Producing high-quality visualizations, including animated GIFs, data videos, and even soundtracks, to present insights that are easy to grasp for non-experts. Topics include optimal color palettes, leveraging color transparency, video processing in Python, scatterplots and other visualizations for high-dimensional data, and sound processing in Python.
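As a hypothetical example of the kind of output discussed (animated visuals with color transparency), the sketch below writes a small animated GIF with matplotlib. The rotation effect, frame count, and output file name are illustrative assumptions, not material from the course.

```python
# Illustrative sketch: animated GIF of a rotating scatterplot of random points,
# using matplotlib's animation module and alpha (color transparency).
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import animation

rng = np.random.default_rng(1)
points = rng.normal(size=(300, 2))

fig, ax = plt.subplots(figsize=(4, 4))
scat = ax.scatter(points[:, 0], points[:, 1], s=8, alpha=0.6)  # transparency via alpha
ax.set_xlim(-4, 4)
ax.set_ylim(-4, 4)

def update(frame):
    # Rotate the point cloud by a small angle at each frame.
    theta = 2 * np.pi * frame / 60
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    scat.set_offsets(points @ rot.T)
    return (scat,)

anim = animation.FuncAnimation(fig, update, frames=60, blit=True)
anim.save("rotation.gif", writer=animation.PillowWriter(fps=15))
```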
Topics include random walks, 2D Brownian motions with strong clustering structure, integrated Brownian motions, smooth and chaotic processes, parameter estimation for non-periodic time series, pseudo-random numbers and prime-number tests of randomness, auto-regressive processes, special time series, and an introduction to discrete dynamical systems. Special topics: long-range autocorrelations, and optimization techniques in the presence of numerical instability, using hybrid Monte-Carlo simulations and fixed-point algorithms.
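For concreteness, here is a minimal simulation sketch for two of the processes listed: a 2D Brownian motion and an AR(1) auto-regressive series. The step counts and parameters are illustrative assumptions, not taken from the course material.

```python
# Minimal simulation sketch: 2D Brownian motion (cumulative Gaussian steps)
# and an AR(1) auto-regressive process.
import numpy as np

rng = np.random.default_rng(7)
n = 10_000

# 2D Brownian motion: independent Gaussian increments, cumulated over time.
brownian = np.cumsum(rng.normal(scale=np.sqrt(1.0 / n), size=(n, 2)), axis=0)

# AR(1) process: x[t] = phi * x[t-1] + noise, with |phi| < 1 for stationarity.
phi = 0.8
x = np.zeros(n)
noise = rng.normal(size=n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + noise[t]

print(brownian[-1], x[-5:])  # final Brownian position, last AR(1) values
```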
Enterprise-grade web crawling and text parsing techniques are used to create keyword taxonomies, with numerous practical applications. Besides solving original real-life problems, the goal is to structure unstructured data and to develop distributed algorithms that can be resumed without data loss after computer crashes. Computational complexity and fast clustering of text data are also discussed.
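The resumability idea can be sketched with a simple checkpoint file: progress is flushed to disk after each page, so a restarted run skips pages already fetched. The URL list, file name, and fetch logic below are placeholders, not the course's actual crawler.

```python
# Sketch of crash-resumable crawling via checkpointing (standard library only).
import json
from pathlib import Path
from urllib.request import urlopen

CHECKPOINT = Path("crawl_state.json")  # assumed checkpoint file name

def load_done():
    return set(json.loads(CHECKPOINT.read_text())) if CHECKPOINT.exists() else set()

def save_done(done):
    CHECKPOINT.write_text(json.dumps(sorted(done)))  # flush progress to disk

def crawl(urls):
    done = load_done()                 # resume: skip pages already fetched
    for url in urls:
        if url in done:
            continue
        try:
            text = urlopen(url, timeout=10).read().decode("utf-8", "ignore")
            # ... parse keywords from `text` here ...
        except Exception as exc:
            print(f"skipping {url}: {exc}")
        done.add(url)
        save_done(done)                # checkpoint after every page

crawl(["https://example.com/"])        # placeholder URL
```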