template: inverse
layout: true

---
class: center, middle, firstslide

# Introduction to Machine Learning

---
class: darkback

# What is Machine Learning?

## Machine learning is a branch of artificial intelligence.

Using computing, we design systems that learn from data by being trained. These systems improve with experience and, over time, refine a model that can be used to predict the outcomes of new questions based on what was learned previously.

---
class: darkback

# History of Machine Learning

### Alan Turing

Considered the 'father of AI'. His famous [*Computing Machinery and Intelligence* paper](http://www.csee.umbc.edu/courses/471/papers/turing.pdf) asked "Can machines think?" (1950)

### Arthur Samuel (IBM)

Defined machine learning as "[A] field of study that gives computers the ability to learn without being explicitly programmed." (1959)

### Tom M. Mitchell (Chair of ML at Carnegie Mellon)

"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with the experience E." (1998)

---
class: darkback

# Types of Machine Learning

## Supervised Learning

## Unsupervised Learning

## Reinforcement Learning

???

- Supervised learning is the machine learning task of inferring a function from labeled training data. The training data consist of a set of training examples. Each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples.
- Unsupervised learning is the task of finding hidden structure in unlabeled data. Many methods employed in unsupervised learning are based on data mining methods used to preprocess data.
- Reinforcement learning is a third type that I found noted elsewhere.
- Current research topics include adaptive methods that work with fewer (or no) parameters under a large number of conditions, addressing the exploration problem in large MDPs (Markov decision processes), large-scale empirical evaluations, and learning and acting under partial information.
- Reinforcement learning is particularly well suited to problems that include a long-term versus short-term reward trade-off. It has been applied successfully to various problems, including robot control, elevator scheduling, telecommunications, backgammon and checkers (Sutton and Barto 1998, Chapter 11).

---
class: darkback

# How is Machine Learning Used?

- Spam Detection
- Voice Recognition / Natural Language Processing
- Computer Vision
- Handwriting Recognition
- Stock Trading
- Robotics
- Medicine and Healthcare
- Advertising
- Retail and E-Commerce
- Product Recommendations
- Gaming Analytics
- [The Internet of Things](http://www.google.com/about/datacenters/efficiency/internal/)
- Understanding human learning

???

# How is Machine Learning Used?

- Stock Trading (by some estimates, about 70% of stock market trading is done by high-frequency trading algorithms)
- Gaming Analytics (fraud analysis, predicting churn, whale spotting, help systems)
- The Internet of Things: modeling plant performance and predicting PUE (Power Usage Effectiveness). With the advent of many connected sensors, Google was able to take data about its data centers, from average load to pump speed to chiller condition, and create a model that predicts PUE with 99.6% accuracy.
- The data center industry uses PUE, or power usage effectiveness, to measure efficiency. A PUE of 2.0 means that for every watt of IT power, an additional watt is consumed to cool and distribute power to the IT equipment. A PUE closer to 1.0 means nearly all of the energy is used for computing.
- Data center engineer Jim Gao did this as his 20% project.

---
class: darkback

# Languages and Tools

.left-column[
- **Python** [scikit-learn](http://scikit-learn.org/stable/), [PyML](http://pyml.sourceforge.net/), [pybrain](http://pybrain.org/), [IPython Notebook](http://ipython.org/notebook.html)
- **Java** [Java-ML](http://java-ml.sourceforge.net/), [Mallet](http://mallet.cs.umass.edu/)
- **R** [R Project](http://www.r-project.org/)
- **Matlab** [Mathworks site](http://www.mathworks.com/products/matlab/)
- **Scala** [ScalaNLP](http://www.scalanlp.org/) (and the Java libs above too)
- **Clojure** [Clojush](https://github.com/lspector/Clojush) (and the Java libs above too)
- **Go** [GoLearn](https://github.com/sjwhitworth/golearn)
- **JavaScript** [Brain](https://github.com/harthur/brain), [ConvNetJS](http://cs.stanford.edu/people/karpathy/convnetjs/), [encog-javascript](https://github.com/encog/encog-javascript)
- **C#** [Accord.NET](http://accord-framework.net/intro.html)
]

.right-column[
- [Hadoop](http://hadoop.apache.org/)
- [Mahout](https://mahout.apache.org/)
- [Spark](http://spark.apache.org/)
- [Weka](http://www.cs.waikato.ac.nz/ml/weka/)
- [Spring XD](http://projects.spring.io/spring-xd/)
- [Azure ML](http://azure.microsoft.com/en-us/services/machine-learning/)
]

???

ScalaNLP - Puck is an insanely fast GPU-powered parser, built on the same grammars produced by the Berkeley Parser. On a mid-range Nvidia GTX 680, it can parse over 400 sentences a second, or over half a million words per minute.

---
class: darkback

# Types of Machine Learning

## Decision Tree Learning

## Artificial Neural Networks

## Clustering

## Bayesian Networks

???

Not an all-encompassing list, just a few.

- Decision tree learning uses a decision tree as a predictive model, which maps observations about an item to conclusions about the item's target value.
- An artificial neural network (ANN) learning algorithm, usually called "neural network" (NN), is a learning algorithm that is inspired by the structure and functional aspects of biological neural networks. Computations are structured in terms of an interconnected group of artificial neurons, processing information using a connectionist approach to computation. Modern neural networks are non-linear statistical data modeling tools. They are usually used to model complex relationships between inputs and outputs, to find patterns in data, or to capture the statistical structure in an unknown joint probability distribution between observed variables.
- Cluster analysis is the assignment of a set of observations into subsets (called clusters) so that observations within the same cluster are similar according to some predesignated criterion or criteria, while observations drawn from different clusters are dissimilar. Different clustering techniques make different assumptions on the structure of the data, often defined by some similarity metric and evaluated for example by internal compactness (similarity between members of the same cluster) and separation between different clusters. Other methods are based on estimated density and graph connectivity. Clustering is a method of unsupervised learning, and a common technique for statistical data analysis.
- A Bayesian network, belief network or directed acyclic graphical model is a probabilistic graphical model that represents a set of random variables and their conditional independencies via a directed acyclic graph (DAG). For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases. Efficient algorithms exist that perform inference and learning.
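
To make the disease and symptom example concrete, here is a minimal sketch of inference in the smallest possible Bayesian network: one `Disease` node pointing at one `Symptom` node. The probabilities and the function name `posterior_disease_given_symptom` are invented for illustration and do not come from any real dataset or library. A real network would have many nodes and would normally rely on a dedicated inference library.

```python
# Hypothetical two-node Bayesian network: Disease -> Symptom.
# All probabilities are made-up illustrative values, not real medical data.

P_DISEASE = 0.01                # prior P(disease)
P_SYMPTOM_GIVEN_DISEASE = 0.90  # P(symptom | disease)
P_SYMPTOM_GIVEN_HEALTHY = 0.05  # P(symptom | no disease)


def posterior_disease_given_symptom(prior, p_sym_if_sick, p_sym_if_healthy):
    """Return P(disease | symptom observed) using Bayes' rule."""
    # Joint probability of each disease state together with the symptom.
    joint_sick = prior * p_sym_if_sick
    joint_healthy = (1.0 - prior) * p_sym_if_healthy
    # Normalize by the total probability of observing the symptom.
    return joint_sick / (joint_sick + joint_healthy)


if __name__ == "__main__":
    p = posterior_disease_given_symptom(
        P_DISEASE, P_SYMPTOM_GIVEN_DISEASE, P_SYMPTOM_GIVEN_HEALTHY
    )
    print(f"P(disease | symptom) = {p:.3f}")
```

With these assumed numbers the posterior is about 0.154: even though the symptom is common among the sick, the disease is rare, so most people showing the symptom are still healthy.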

---
class: darkback

# Moar Machine Learning

## [Practical Machine Learning](https://www.coursera.org/course/predmachlearn) online class

## [Machine Learning](https://www.coursera.org/course/ml) online class

## [Machine Learning - Hands-On for Developers and Technical Professionals](http://www.wiley.com/WileyCDA/WileyTitle/productCd-1118889061,subjectCd-ST70.html) book

## [Kaggle.com - Machine Learning Competitions](http://www.kaggle.com/competitions)

## [Azure Machine Learning from Microsoft](http://azure.microsoft.com/en-us/services/machine-learning/)

???

---
class: darkback

# Thanks!

## How can you leverage Machine Learning in your job?