## Probability Theory and Introduction to Machine Learning

Probability spaces, Random Variables and Stochastic Processes, Conditional Probability, Expected Values, Conditional Expectations, Independence of Random Variables. The Bernoulli Stochastic Process and Sums of Independent Random Variables: Bernoulli Process, Number of Successes, Times of Successes, Sums of Independent Random Variables, Chebyshev Inequality, Weak and Strong Law of Large Numbers, Central Limit Theorem. Basics of Inference and Testing (including maximum likelihood estimates, hypothesis testing, likelihood ratio test, and Bayesian inference), Linear regression models, Generalized linear regression models (including logistic regression), Nonparametric Regression (including Gaussian Process regression), Tree methods and Forests, Bagging and Ensemble methods, Statistical Computing.

## Optimization

Brief summary of concepts from Linear Algebra and Calculus of Several Variables: vector, line, plane, functions of several variables, gradient, Hessian. Introduction to MATLAB. Implementation of linear algebra functions and tools in MATLAB. Vectors and matrices in MATLAB. Convex sets: definition, examples, operations on convex sets that preserve convexity. Convex functions: definition, examples, operations on convex functions that preserve convexity. Convex optimization problems: definition, examples, optimal points. Unconstrained convex optimization problems: characterization of optimal solution, gradient descent method, convergence analysis, Newton method, local convergence analysis. Convex optimization with constraints: Farkas’s Lemma, Fritz John (FJ) conditions, Karush-Kuhn-Tucker (KKT) conditions, examples. Duality, Lagrangian, dual function, weak and strong duality, examples. Convex optimization with affine equality constraints: KKT conditions, convex quadratic problem with affine equality constraints, feasible-point Newton method, primal-dual method. General convex optimization problems: KKT conditions, logarithmic barrier function, interior point method, primal-dual method, examples. Alternating Direction Method of Multipliers (ADMM): definition, examples.

## Practical Data Science and Applications

Fundamentals of programming with Python and Python libraries for data manipulation. This is a hands-on course with a substantial amount of project work. No prior knowledge of Python is needed, but a basic understanding of programming concepts (variables, types, functions, files, etc.) and basic computer skills are required.

Part I: We will cover principles of object-oriented and structured programming, data representation via Python data structures, and fundamental data manipulation in idiomatic Python. Then, we will examine Python libraries for scientific computing (NumPy and SciPy).

Part II: We will learn to work with basic software development technologies (version control, package installation, package publication) and interactive environments. A practical introduction to Jupyter, pandas, scikit-learn and matplotlib will be followed by students demonstrating their use on introductory data analysis tasks.

Part III: An introduction to high-performance computing techniques in Python, including Dask and TensorFlow, will be given. Students will be asked to implement all stages of a scientific workflow, demonstrating the skills acquired in the previous stages of the course.

Requirements: basic knowledge of programming in any programming language.

## Programming and Database Fundamentals

Database design and use of databases in applications. Design and implementation issues in databases. Design and implementation of relational systems. Design and implementation of object-oriented systems. XML databases. Query optimization in databases. Optimizing the performance of applications with design at the physical level, cost optimization for transactions, recovery. Distributed databases. Data Warehousing. Data mining on databases. Continuous Databases. Stream Processing. Big Data Systems and Frameworks. SQL.

## Machine Learning

Supervised learning: least mean squares (LMS), logistic regression, perceptron, Gaussian discriminant analysis, naive Bayes, support vector machines, model selection and feature selection, ensemble methods (bagging, boosting). Deep Neural Networks & Modern Tools (e.g., TensorFlow, AutoML). Generative Adversarial Networks (GANs) & Applications. Learning theory: bias/variance tradeoff, union and Chernoff/Hoeffding bounds, VC dimension. Unsupervised learning: clustering, k‐means, EM, mixture of Gaussians, factor analysis, principal components analysis (PCA), independent component analysis (ICA). Reinforcement learning: Markov decision processes (MDPs), algorithms for POMDPs.

## Big Data Processing and Analysis

Effective compression techniques for high‐volume data sets: sampling, histograms, wavelets; Approximate query processing; Continuous data streams: basic models, problems, and applications; Algorithms and tools for data‐stream processing: reservoir sampling, basic sketch structures (AMS, FM, etc.); Distributed data streaming: Models and techniques; Modern big data management systems.

## Time Series Modeling and Analysis

Stochastic processes (discrete and continuous), stationarity, mean and autocorrelation functions, frequency‐domain analysis, ARMA(p,q) Models, SARIMA models for time series with complex trends and periodicities, Nonlinear auto‐regressive, conditionally heteroskedastic models (ARCH/GARCH), Parameter estimation (method of moments, least squares and maximum likelihood), Optimal model selection and residual analysis, Cross validation methodologies, Forecasting methodologies (for stationary time series, time series with trends and periodicities, exponentially weighted smoothing), Analysis of multivariate time series, Estimation of cross‐covariance function, Transfer function models, Introduction to nonlinear time series analysis with dynamical systems theory. Applications in R/MATLAB.

## Detection and Estimation Theory

Revision of Linear Algebra and Probability. Binary Hypothesis Testing Examples; Sufficient Statistics, Receiver Operating Characteristic (ROC) & Neyman‐Pearson Tests. Gaussian Detection. M‐ary Hypothesis Testing and Performance Analysis Bounds. Bayesian Estimation, Properties of Mean Squared Error and Linear Least Squares Estimators. Estimation of Non‐random parameters, Cramer‐Rao Bound (theorem, proof, examples). Uniform Minimum Variance Unbiased (UMVU) Estimators, RBLS Theorem. Asymptotic Behavior of Maximum Likelihood (ML) Estimators, BLUE Estimators. Composite Hypothesis Testing: UMP Tests, GLR Tests (GLRT) and Asymptotic Properties of GLRT. Standard Kalman/Wiener Filtering. Iterative parameter estimation: Expectation‐Maximization (EM). Introduction to non‐parametric estimation: particle filtering. Examples in Machine Learning and Data Science.

## Probabilistic Graphical Models & Inference Algorithms

PGMs encode (conditional) dependencies among random variables on carefully crafted graphs. Such a description is powerful enough to capture a variety of well-known algorithms, such as (Gaussian) Belief Propagation, Kalman Filtering, Viterbi, and Expectation-Maximization. This class offers an introduction to representation with PGMs, algorithms for exact inference, approximate inference, and learning/estimation: Directed acyclic graphs (DAGs, Bayesian Nets), factorization theorem and semantics (I-map, d-separation, p-map). Undirected graphs (Markov Blanket, Hammersley-Clifford theorem), factor graphs (and techniques to convert), Gaussian Graphical Models. Exact Inference (elimination algorithm, sum-product/belief propagation, max-product on Trees, HMMs and Kalman Filtering, Junction Tree algorithm). Approximate Inference: Loopy Belief Propagation, Sampling Methods (Particle Filtering, Metropolis-Hastings). Intro to learning graphs: ML Techniques, Chow-Liu, BIC-based Techniques, Expectation-Maximization. Applications in Machine Learning & Data Science.

## Quantum Information and Quantum Estimation

State vector Hilbert spaces, Qubit. Theory of quantum measurements, orthonormal complete bases, projectors, positive operator‐valued probability measures (POVM). Density matrix, spectral decomposition, convex decomposition. Bloch sphere and vector. Introduction to quantum entanglement. Quantum correlations, Biorthogonal analysis, Schmidt numbers. Measures of entanglement, Quantum entropy measures. Quantum information. The Schroedinger–HJW theorem. Quantum channels (single qubit, collective channels). Quantum algorithms, Computational and communicational algorithms. The Deutsch–Jozsa algorithm. Quantum teleportation algorithm for states, gates, channels. The LOCC protocol. Quantum walks (QW). Coin‐walker Hilbert spaces. The QW channel maps. Quadratic speed ups. Introduction to the Helstrom‐Holevo quantum estimation theory. Cramer‐Rao bound and quantum Fisher information. Optimal measurements and the symmetric logarithmic derivative operator. Phase estimation problems of quantum states. Temperature estimation and qubit thermometry for closed and open quantum systems.

## Nonlinear Systems

Phase portrait. Second-order systems. Existence and uniqueness of solutions. Sensitivity equations. Comparison principle. Lyapunov stability. LaSalle theorem. Linearization. Center manifold theorem. Stability of perturbed systems. Input-to-state stability. Input-output stability. Perturbation theory and averaging. Singular perturbations. Circle and Popov criteria. Nonlinear control design using backstepping.

## Reinforcement Learning and Dynamic Optimization

The course will cover tools for optimization problems where a sequence of (interdependent) decisions must be made dynamically, often under uncertainty about the environment to be optimized (which requires learning). Problems like this arise in many modern applications, ranging from autonomous driving and robotics to game playing (chess, poker, Go) and wired/wireless network management. The specific topics covered will be the following: Quick revision of first order optimization methods (gradient descent and stochastic gradient descent); Multi-armed Bandits; First order algorithms for Online Convex Optimization; Online Convex Optimization with instantaneous and long-term constraints; Distributed online convex optimization; Backpressure algorithms for Stochastic Network Optimization; Markov Decision Processes; Tabular Reinforcement Learning.

## Advanced Concepts in Machine Learning and Pattern Recognition

The course builds on the theoretical underpinnings of machine learning and also addresses explainability in learning. It provides a broad and detailed treatment of issues in machine learning, data mining, and statistical pattern recognition. Topics include: Regression and Classification; Image Categorization vs Segmentation; Regularization to prevent overfitting the training data; Neural Networks: Representation & Learning; Associations of Neural with brain networks: Perception, abstraction and detailed recognition; Evaluation of Machine Learning; Machine Learning System Design; Addressing skewed data; Dimensionality Reduction; Anomaly Detection; Large Scale Machine Learning vs Transfer Learning; Deep Neural Networks: Recurrent, Convolutional, Physics-inspired neural networks.

## Quantum Machine Learning, Optimization and Applications

Notion of Qubit and the Bloch Sphere. Single and Two Qubit Gates and Basic Quantum Circuits. Entangled states. Quantum parallelism and phase kickback. Basic quantum algorithms (Deutsch and Grover). Quantum Phase Estimation. Basics of Quantum Annealing and Adiabatic Quantum Computing. Max Cut and quadratic unconstrained binary optimization (QUBO). Ising model and solving QUBO problems with a quantum computer. Quantum Approximate Optimization Algorithm (QAOA) and variational quantum algorithms (VQAs). Applications of VQAs in solving optimization problems using quantum programming languages (IBM Qiskit). Basics of quantum machine learning. Data encoding in quantum circuits. Quantum circuit learning and training. Quantum matrix inversion. Quantum clustering and quantum support vector machines. Basics of quantum neural networks. Quantum Boltzmann Machines. Applications using quantum programming languages (PennyLane, IBM Qiskit and others).

## Secure Systems

Access control models and mechanisms, malware, phishing, botnets, spam, denial of service (DoS), code injection, race conditions, defenses.

## Decision Making and Learning in Multiagent Worlds

Utility Theory, Decision Theory, and Game Theory (cooperative and non-cooperative). Rationality and strategic decision making. Reinforcement Learning and Multiagent Reinforcement Learning. Unsupervised Learning and Probabilistic Topic Modeling. Deep Learning and Deep Reinforcement Learning. Learning in Game Theoretic Settings.

## Research Seminar / Independent Study / Capstone Project

This element of the program combines independent study by the students with attendance at research seminars, in preparation for the selection and execution of a capstone project. The project will involve research and analysis of datasets based on concepts and tools learned by students in MLDS courses. The results of each project will be presented at the end of the academic year to the MLDS instructors in front of a public audience. Each presentation should last 20 minutes. Some ideas for effective presentations are given in the Seminars/Project e-class module, along with research papers that students can use for independent study.