# Introduction to Machine Learning: Part 1

Machine learning and artificial intelligence have become topics of great interest within the last 10 to 15 years. Many believe A.I. to be humanity’s last hope; others see it as a harbinger of self-destruction. Whatever your opinion of A.I., one thing is clear: it is now a part of our modern lives. I think artificial intelligence has the potential to be another tool humans use to create more opportunities for future generations. I will attempt to demystify artificial intelligence by educating people on its use cases and the importance of using it safely and ethically. To begin, I will cover an overview of machine learning, also known as statistical learning. Then I will drill down into specific machine learning topics with accompanying Python programming labs. This will build practical knowledge of machine learning tools: when and how to use them. Going forward, it’s important to remember the context of these tools and why one outperforms the others. At the end of this series, I hope you will come away with a better understanding of artificial intelligence, its use cases, its dangers, and its benefits.

Pandora’s box symbolizes the unforeseen consequences of human actions. The story illustrates that even in the midst of disaster, hope endures. The gods gave Pandora a box containing all the world’s evils along with hope, all of which were released when she opened it. The advent of artificial intelligence parallels this Greek myth. AI has unlocked the potential for various adverse consequences, such as cybersecurity threats, privacy issues, and negative impacts on employment. The philosophical “Hope” in this real-life tale of AI is that the barrier to entry in any field has been fundamentally reduced to how much effort an individual is willing to put in. Every human technological endeavor has carried unforeseen consequences. Each breakthrough, from the discovery of fire, which could destroy life or unlock nutrients in food, to the invention of the wheel, which could enhance mobility or facilitate warfare, carries potential dual outcomes. Even nuclear power, a source of clean energy, originated from humanity’s greatest existential crisis — the creation of the atomic bomb. Historically, the better aspects of humanity have consistently surfaced to address these challenges. However, this does not happen by accident. We have to come together. I would like to do my part in reducing the barrier to entry in this field and contribute to “Artificial Intelligence Literacy.” With that, let us begin.

**Machine Learning**

First, let’s talk about what machine learning is. Machine learning is a vast set of tools for understanding and analyzing data. It primarily consists of supervised and unsupervised learning, but also includes reinforcement learning among other methodologies.

Within these types of learning, different kinds of problems can be addressed using statistical learning algorithms. These problems typically fall into three main categories: regression, classification, and clustering. Each category aims to solve distinct types of questions about the data, whether it’s predicting continuous outcomes, categorizing data into predefined labels, or grouping data based on similarity (Gareth James, 2023, pp. 1–8). Laying out the playing field for these tools will allow one to effectively gain insights from most sources of data.

**Types of Machine Learning**

Next, let’s talk about the various types of machine learning. Machine learning is grouped into two main categories: supervised learning and unsupervised learning. Supervised learning is a type of statistical modeling that focuses on predicting or estimating an output based on one or more inputs. This process involves training a model on a dataset containing input-output pairs, allowing the model to learn the relationship between the inputs and outputs. Unsupervised learning, on the other hand, involves inputs but no supervising output labels or targets. Despite this lack of direct guidance, unsupervised learning algorithms can still discover patterns, relationships, and structures within the data. Unsupervised learning algorithms are particularly useful for exploring the underlying organization of the data or for finding natural groupings (clusters), reducing dimensionality, and more. Both categories encompass numerous algorithms that solve specific machine learning problems (Gareth James, 2023, pp. 1–8). A third type, reinforcement learning, represents a more recent category and will be discussed later.

**Machine Learning Problems**

Although supervised and unsupervised learning are vital for understanding machine learning, it’s important to frame the use of these algorithms in the context of the problems they solve. Machine learning problems can be broken down into three main categories: regression, classification, and clustering. In a regression problem, the goal is to predict a continuous or quantitative output value based on one or more input variables. This type of problem is characterized by the ability to estimate numerical values from given predictors, such as predicting prices, temperatures, or other measurable quantities. In a classification problem, the goal is to predict which category or class an observation belongs to. This involves categorizing data into predefined groups or classes based on its attributes, such as determining whether an email is spam or ham, or diagnosing whether a tumor is benign or malignant. Finally, in a clustering problem, the goal is to group a set of objects based on their similarities without any pre-assigned labels or outputs. This involves observing input variables to discover natural patterns or groupings within the data. Clustering helps identify distinct categories in the data, which can reveal underlying structures, such as grouping customers by purchase behavior or categorizing genes with similar functions (Gareth James, 2023, pp. 1–8). Once you understand how problems are framed, it’s up to you to determine which tool is most appropriate for each job.

**Supervised Learning Algorithm**

Let’s delve deeper into supervised learning and the algorithms it comprises. Among them are Linear Regression, Logistic Regression, Linear Discriminant Analysis, Naive Bayes, K-Nearest Neighbors, Decision Trees, Neural Networks, and finally, Support Vector Machines.

Linear Regression is a statistical method that predicts a quantitative response by modeling the relationship between a dependent variable and one or more independent variables using a linear equation. Coefficients derived from the data indicate the impact of each predictor on the outcome (Gareth James, 2023, p. 69). The machine learning problem that Linear Regression solves is a regression problem.
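Since this course will use Python, here is a minimal sketch of fitting a linear regression. The use of scikit-learn and the toy data are my own illustrative choices, not part of the book’s labs:

```python
# Minimal linear regression sketch (scikit-learn assumed; toy data invented).
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data roughly following y = 2x + 1, with a little noise
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])  # one predictor
y = np.array([3.1, 4.9, 7.2, 9.0, 11.1])           # quantitative response

model = LinearRegression().fit(X, y)
print(model.coef_[0])          # fitted slope, close to 2
print(model.intercept_)        # fitted intercept, close to 1
print(model.predict([[6.0]]))  # predicted response for a new input
```

The fitted coefficient is exactly the “impact of each predictor” described above: how much the response changes per unit increase in that input.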

Logistic Regression is a classification method used to predict a binary outcome by modeling the probability that a given input belongs to one of two classes. The machine learning problem that this method solves is a classification problem (Gareth James, 2023, p. 27).
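A minimal sketch of the same idea in Python, again using scikit-learn and invented data for illustration:

```python
# Minimal logistic regression sketch (scikit-learn assumed; toy data invented).
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy binary data: class 0 for small inputs, class 1 for large inputs
X = np.array([[0.5], [1.0], [1.5], [3.5], [4.0], [4.5]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)
print(clf.predict([[1.0], [4.0]]))   # predicted classes: [0 1]
print(clf.predict_proba([[2.5]]))    # modeled probability of each class
```

Note that the model outputs a probability, which is then thresholded into a class label.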

Linear Discriminant Analysis is a statistical method that identifies a linear combination of features to separate multiple classes of objects or events, which can then be used either as a classifier or to reduce dimensionality for improved classification (https://en.wikipedia.org/wiki/Linear_discriminant_analysis). The machine learning problem that LDA solves is a classification problem.
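Both uses of LDA, classification and dimensionality reduction, can be sketched in a few lines (scikit-learn and the toy data are my own illustrative choices):

```python
# Minimal LDA sketch (scikit-learn assumed; toy data invented).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Two well-separated classes in two dimensions
X = np.array([[1.0, 1.2], [1.5, 0.8], [2.0, 1.0],
              [6.0, 5.5], [6.5, 6.0], [7.0, 5.8]])
y = np.array([0, 0, 0, 1, 1, 1])

lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.predict([[1.5, 1.0], [6.5, 5.7]]))  # used as a classifier: [0 1]
print(lda.transform(X).shape)  # projected onto 1 discriminant axis: (6, 1)
```

The `transform` call shows the dimensionality-reduction use: two features are collapsed onto the single linear combination that best separates the classes.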

Naive Bayes is a family of scalable, probabilistic classifiers that assume features are conditionally independent given the target class. It uses Bayes’ theorem for decision-making and is characterized by its simplicity and linear scalability in terms of parameter requirements and computation time (https://en.wikipedia.org/wiki/Naive_Bayes_classifier). The machine learning problem that Naive Bayes solves is a classification problem.
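As a member of a family of classifiers, Naive Bayes comes in several variants; here is a sketch of the Gaussian variant for continuous features (scikit-learn and the toy data are my own illustrative choices):

```python
# Minimal Gaussian Naive Bayes sketch (scikit-learn assumed; toy data invented).
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Two classes with clearly different feature distributions
X = np.array([[1.0], [1.2], [0.8], [5.0], [5.2], [4.8]])
y = np.array([0, 0, 0, 1, 1, 1])

nb = GaussianNB().fit(X, y)
print(nb.predict([[1.1], [5.1]]))       # predicted classes: [0 1]
print(nb.predict_proba([[3.0]]))        # class probabilities via Bayes' theorem
```

Other variants, such as the multinomial one often used for spam filtering on word counts, follow the same fit/predict pattern.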

The K-nearest neighbor algorithm is a non-parametric supervised learning method that classifies or predicts outputs based on the k closest training examples in the dataset. (https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm). K-nearest neighbor is capable of solving both regression and classification problems based on the principle that similar data points are likely to yield similar results.
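Because KNN serves both problem types, scikit-learn (an assumed library choice, with invented toy data) exposes it as two estimators:

```python
# Minimal KNN sketch for classification and regression
# (scikit-learn assumed; toy data invented).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y_class = np.array([0, 0, 0, 1, 1, 1])          # labels for classification
y_reg = np.array([1.5, 2.5, 3.5, 10.5, 11.5, 12.5])  # targets for regression

knn_c = KNeighborsClassifier(n_neighbors=3).fit(X, y_class)
knn_r = KNeighborsRegressor(n_neighbors=3).fit(X, y_reg)

print(knn_c.predict([[2.5]]))   # majority vote of 3 nearest labels: [0]
print(knn_r.predict([[11.0]]))  # mean of 3 nearest targets: [11.5]
```

The classifier takes a majority vote among the k nearest labels, while the regressor averages the k nearest target values, the same neighborhood principle applied to two problem types.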

A decision tree is a model where classification trees handle discrete target values with leaves as class labels and branches as feature conjunctions, while regression trees manage continuous outcomes, although adaptations can extend to data defined by pairwise dissimilarities (https://en.wikipedia.org/wiki/Decision_tree_learning). A decision tree is capable of solving both regression and classification problems based on the structured decision-making process it follows, which mimics human decision-making by sequentially splitting the data into increasingly specific groups.
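That sequential splitting can even be printed out as human-readable rules, which is part of why trees are so interpretable (scikit-learn and the toy data are my own illustrative choices):

```python
# Minimal decision tree sketch (scikit-learn assumed; toy data invented).
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

X = np.array([[1.0], [2.0], [3.0], [8.0], [9.0], [10.0]])
y = np.array([0, 0, 0, 1, 1, 1])

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(tree.predict([[2.5], [9.5]]))  # predicted classes: [0 1]
print(export_text(tree))             # the learned splits as readable rules
```

For a regression problem, `DecisionTreeRegressor` follows the same interface but predicts a continuous value at each leaf.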

Neural networks take an input vector consisting of several variables and construct a nonlinear function through layers of interconnected nodes or neurons. Each node applies a combination of weights, biases, and activation functions to process the input and ultimately predict the output response (Gareth James, 2023, p. 400). Neural networks are capable of solving both regression and classification problems based on their ability to model complex patterns and relationships in large datasets.
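A small multilayer perceptron makes the layered structure concrete (scikit-learn’s `MLPClassifier` and the toy data are my own illustrative choices; real neural network work typically involves far more data and tuning):

```python
# Minimal neural network sketch (scikit-learn assumed; toy data invented).
import numpy as np
from sklearn.neural_network import MLPClassifier

# A trivially separable toy problem
X = np.array([[0.0], [0.2], [0.4], [2.0], [2.2], [2.4]])
y = np.array([0, 0, 0, 1, 1, 1])

# One hidden layer of 8 neurons: each applies weights, a bias,
# and a nonlinear activation, as described above
mlp = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                    random_state=0).fit(X, y)
print(mlp.predict([[0.1], [2.3]]))  # predicted classes for new inputs
```

Deeper networks simply stack more hidden layers, which is what allows them to model increasingly complex patterns.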

Support-vector machines are versatile supervised learning models that handle binary and multi-class classification by maximizing the decision boundary. They perform both linear and non-linear classification using the kernel trick and are also used for ε-sensitive (epsilon sensitive) regression and clustering in complex, high-dimensional spaces (https://en.wikipedia.org/wiki/Support_vector_machine). Support-vector machines are capable of solving both regression and classification problems based on their ability to find the optimal hyperplane that separates different classes with the maximum margin, making them effective in high-dimensional spaces.
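Both the classification and ε-sensitive regression uses can be sketched with scikit-learn’s `SVC` and `SVR` (an assumed library choice, with invented toy data):

```python
# Minimal SVM sketch for classification and regression
# (scikit-learn assumed; toy data invented).
import numpy as np
from sklearn.svm import SVC, SVR

X = np.array([[1.0], [2.0], [3.0], [7.0], [8.0], [9.0]])

# Classification: find the maximum-margin boundary between two classes
y_class = np.array([0, 0, 0, 1, 1, 1])
svc = SVC(kernel="linear").fit(X, y_class)
print(svc.predict([[2.0], [8.0]]))  # predicted classes: [0 1]

# Epsilon-sensitive regression: errors within the epsilon tube are ignored
y_reg = np.array([1.1, 2.0, 3.1, 7.0, 8.1, 9.0])
svr = SVR(kernel="linear", epsilon=0.1).fit(X, y_reg)
print(svr.predict([[5.0]]))  # predicted value between the two groups
```

Swapping `kernel="linear"` for `kernel="rbf"` is the kernel trick in action: the same maximum-margin machinery then learns a non-linear boundary.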

**Unsupervised Learning Algorithms**

Understanding these common supervised learning methods and the problems they solve allows for more effective insight extraction from datasets. Now, let’s explore unsupervised learning and examine its principal algorithms. This domain is largely dominated by methods such as K-Means Clustering and Hierarchical Clustering. In contrast to supervised learning, which employs a diverse array of algorithms for various tasks, unsupervised learning is primarily focused on clustering problems. Therefore, it’s generally understood that clustering tasks fall under the category of unsupervised learning. K-Means Clustering is an algorithm that segments a dataset into a specified number of distinct, non-overlapping clusters, K, by assigning each data point to the nearest cluster, effectively solving a straightforward optimization problem (Gareth James, 2023, p. 521). Hierarchical Clustering builds a dendrogram, a tree-like structure that represents observations, allowing the visualization of cluster formation at every possible level without pre-specifying the number of clusters (Gareth James, 2023, p. 521). These two clustering methods exemplify the core focus of unsupervised learning on uncovering the inherent structure within data without the need for labeled outcomes.
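Both clustering methods can be sketched side by side; note that neither is given any labels, only the inputs (scikit-learn and the toy data are my own illustrative choices):

```python
# Minimal clustering sketch: K-Means and hierarchical (agglomerative)
# clustering (scikit-learn assumed; toy data invented).
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering

# Two obvious groups of points; no labels are provided
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])

# K-Means requires choosing K up front
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)   # each point assigned to one of 2 clusters

# Agglomerative clustering builds the hierarchy bottom-up;
# here it is cut at 2 clusters for comparison
agg = AgglomerativeClustering(n_clusters=2).fit(X)
print(agg.labels_)
```

Both algorithms recover the same two groups here; the practical difference is that K-Means demands K in advance, while the hierarchical method builds the full dendrogram and lets you choose the cut afterward.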

**Core Principles of Machine Learning**

Finally, it’s important to know the core principles of machine learning. They can be summarized as: 1. Cross-Discipline Relevance, 2. Not a Black Box, 3. Practical Knowledge vs. Technical Knowledge, and 4. Real-World Applications.

In the first principle, ‘Cross-Discipline Relevance,’ machine learning emphasizes the broad applicability of statistical learning methods across various fields, not just statistics. The focus must be on presenting the most universally useful methods rather than covering every possible technique.

In the second principle, ‘Not a Black Box,’ it’s important to understand the inner workings and trade-offs of each method to choose the most suitable one for a given application because no single approach is the best for all situations.

In the third principle, ‘Practical Knowledge vs. Technical Knowledge,’ minimizing the technicalities of algorithm construction is crucial, aiming to make the content accessible without advanced mathematical knowledge. Concepts are explained in a way that avoids complex mathematics like matrix algebra.

In the fourth principle, ‘Real-World Applications,’ a fundamental aspect is to incorporate tutorials in various programming languages that apply machine learning algorithms to real-world scenarios. These practical exercises are designed to be engaging and are essential for understanding the material comprehensively. Understanding these core principles helps in effectively applying machine learning techniques across diverse contexts and ensures a well-rounded comprehension of the field.

**Next Steps**

In conclusion, machine learning and artificial intelligence have emerged as pivotal technologies in the past decade, offering immense potential and posing significant questions about their role in society. As we navigate this evolving landscape, it is crucial to explain these technologies by understanding their applications, methodologies, and ethical implications. Supervised learning, with its diverse array of algorithms such as Linear Regression, Logistic Regression, and Neural Networks, provides powerful tools for prediction and classification. Meanwhile, unsupervised learning methods like K-Means Clustering and Hierarchical Clustering reveal the inherent structures within data, allowing for insightful analysis without the need for labeled outcomes. By grasping the core principles of machine learning — Cross-Discipline Relevance, Not a Black Box, Practical Knowledge vs. Technical Knowledge, and Real-World Applications — we can better appreciate the versatility and complexity of these tools. These principles emphasize the broad applicability of machine learning, the importance of understanding algorithmic workings, and the necessity of practical, real-world applications to reinforce learning. Ultimately, mastering these concepts and techniques enables us to harness the full potential of machine learning, driving innovation and creating opportunities across various fields. As we continue to explore and develop these technologies, our ability to apply them thoughtfully and ethically will shape the future of our increasingly data-driven world.

Thank you for reading this article as I attempt to unravel Artificial Intelligence and Machine Learning. My name is Angel Sanchez and I am currently teaching myself to become an expert in Artificial Intelligence by developing a free online course to document my learning. This journey is not just about acquiring knowledge but also about sharing and growth. I’m learning as I go, and I invite you to be part of this learning process with me. If you found this content insightful or if you have suggestions or corrections, please comment below or give feedback. Also, check out the resources below, in particular the book “An Introduction to Statistical Learning, with Applications in Python,” on which this course is based. Your input is invaluable as it helps me and others learn more effectively. Don’t forget to stay tuned for the next part of this series, where I will dive even deeper into the mechanics and applications of machine learning.

<iframe width="560" height="315" src="https://www.youtube.com/embed/rxB8QeX0ntA?si=mIYVTjPFuL97y__1" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>

# Works cited

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2023). An Introduction to Statistical Learning: with Applications in Python. Springer. https://www.statlearning.com/

https://en.wikipedia.org/wiki/Linear_discriminant_analysis

https://en.wikipedia.org/wiki/Naive_Bayes_classifier

https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm

https://en.wikipedia.org/wiki/Decision_tree_learning

https://en.wikipedia.org/wiki/Support_vector_machine

Machine Learning: (Gareth James, 2023, pp. 1–8)

Supervised Learning: (Gareth James, 2023, pp. 1–8)

Unsupervised Learning: (Gareth James, 2023, pp. 1–8)

Regression problem: (Gareth James, 2023, pp. 1–8)

Classification problem: (Gareth James, 2023, pp. 1–8)

Clustering problem: (Gareth James, 2023, pp. 1–8)

Linear Regression: (Gareth James, 2023, p. 69)

Logistic Regression: (Gareth James, 2023, p. 27)

Linear Discriminant Analysis: (https://en.wikipedia.org/wiki/Linear_discriminant_analysis)

Naive Bayes: (https://en.wikipedia.org/wiki/Naive_Bayes_classifier)

K-nearest neighbor algorithm: (https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm)

Decision trees: (https://en.wikipedia.org/wiki/Decision_tree_learning)

Neural networks: (Gareth James, 2023, p. 400)

Support-vector machines: (https://en.wikipedia.org/wiki/Support_vector_machine)

K-Means Clustering: (Gareth James, 2023, p. 521)

Hierarchical Clustering: (Gareth James, 2023, p. 521)

Core Principles of Machine Learning: (Gareth James, 2023, p. 7)