10 Machine Learning Blueprints You Should Know for Cybersecurity: Protect your systems and boost your defenses with cutting-edge AI techniques

Author: Oak Rajvardhan

Release date: 2023

Publisher: Packt Publishing Limited

Number of pages: 309

File size: 3.2 MB

File type: PDF

Added by: codelibs


10 Machine Learning Blueprints You Should Know for Cybersecurity

Contributors

About the author

About the reviewers

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Conventions used

Get in touch

Share Your Thoughts

Download a free PDF copy of this book

Chapter 1: On Cybersecurity and Machine Learning

The basics of cybersecurity

Traditional principles of cybersecurity

Modern cybersecurity – a multi-faceted issue

Privacy

An overview of machine learning

Machine learning workflow

Supervised learning

Unsupervised learning

Semi-supervised learning

Evaluation metrics

Machine learning – cybersecurity versus other domains

Summary

Chapter 2: Detecting Suspicious Activity

Technical requirements

Basics of anomaly detection

What is anomaly detection?

Introducing the NSL-KDD dataset

Statistical algorithms for intrusion detection

Univariate outlier detection

Elliptic envelope

Local outlier factor

Machine learning algorithms for intrusion detection

Density-based scan (DBSCAN)

One-class SVM

Isolation forest

Autoencoders

Summary

Chapter 3: Malware Detection Using Transformers and BERT

Technical requirements

Basics of malware

What is malware?

Types of malware

Malware detection

Malware detection methods

Malware analysis

Transformers and attention

Understanding attention

Understanding transformers

Understanding BERT

Detecting malware with BERT

Malware as language

The relevance of BERT

Getting the data

Preprocessing the data

Building a classifier

Summary

Chapter 4: Detecting Fake Reviews

Technical requirements

Reviews and integrity

Why fake reviews exist

Evolution of fake reviews

Statistical analysis

Exploratory data analysis

Feature extraction

Statistical tests

Modeling fake reviews with regression

Ordinary Least Squares regression

OLS assumptions

Interpreting OLS regression

Implementing OLS regression

Summary

Chapter 5: Detecting Deepfakes

Technical requirements

All about deepfakes

A foray into GANs

How are deepfakes created?

The social impact of deepfakes

Detecting fake images

A naive model to detect fake images

Detecting deepfake videos

Building deepfake detectors

Summary

Chapter 6: Detecting Machine-Generated Text

Technical requirements

Text generation models

Understanding GPT

Naïve detection

Creating the dataset

Feature exploration

Using machine learning models for detecting text

Playing around with the model

Automatic feature extraction

Transformer methods for detecting automated text

Compare and contrast

Summary

Chapter 7: Attributing Authorship and How to Evade It

Technical requirements

Authorship attribution and obfuscation

What is authorship attribution?

What is authorship obfuscation?

Techniques for authorship attribution

Dataset

Feature extraction

Training the attributor

Improving authorship attribution

Techniques for authorship obfuscation

Improving obfuscation techniques

Summary

Chapter 8: Detecting Fake News with Graph Neural Networks

Technical requirements

An introduction to graphs

What is a graph?

Representing graphs

Graphs in the real world

Machine learning on graphs

Traditional graph learning

Graph embeddings

GNNs

Fake news detection with GNN

Modeling a GNN

The UPFD framework

Dataset and setup

Implementing GNN-based fake news detection

Playing around with the model

Summary

Chapter 9: Attacking Models with Adversarial Machine Learning

Technical requirements

Introduction to AML

The importance of ML

Adversarial attacks

Adversarial tactics

Attacking image models

FGSM

PGD

Attacking text models

Manipulating text

Further attacks

Developing robustness against adversarial attacks

Adversarial training

Defensive distillation

Gradient regularization

Input preprocessing

Ensemble methods

Certified defenses

Summary

Chapter 10: Protecting User Privacy with Differential Privacy

Technical requirements

The basics of privacy

Core elements of data privacy

Privacy and the GDPR

Privacy by design

Privacy and machine learning

Differential privacy

What is differential privacy?

Differential privacy – a real-world example

Benefits of differential privacy

Differentially private machine learning

IBM Diffprivlib

Credit card fraud detection with differential privacy

Differentially private deep learning

DP-SGD algorithm

Implementation

Differential privacy in practice

Summary

Chapter 11: Protecting User Privacy with Federated Machine Learning

Technical requirements

An introduction to federated machine learning

Privacy challenges in machine learning

How federated machine learning works

The benefits of federated learning

Challenges in federated learning

Implementing federated averaging

Importing libraries

Dataset setup

Client setup

Model implementation

Weight scaling

Global model initialization

Setting up the experiment

Putting it all together

Reviewing the privacy-utility trade-off in federated learning

Global model (no privacy)

Local model (full privacy)

Understanding the trade-off

Beyond the MNIST dataset

Summary

Chapter 12: Breaking into the Sec-ML Industry

Study guide for machine learning and cybersecurity

Machine learning theory

Hands-on machine learning

Cybersecurity

Interview questions

Theory-based questions

Experience-based questions

Conceptual questions

Additional project blueprints

Improved intrusion detection

Adversarial attacks on intrusion detection

Hate speech and toxicity detection

Detecting fake news and misinformation

Summary

Index

Why subscribe?

Other Books You May Enjoy

Packt is searching for authors like you

Share Your Thoughts

Download a free PDF copy of this book

Machine learning in security is harder than other domains because of the changing nature and abilities of adversaries, high stakes, and a lack of ground-truth data. This book will prepare machine learning practitioners to effectively handle tasks in the challenging yet exciting cybersecurity space.

The book begins by explaining how advanced ML algorithms work and then shows, through practical Python examples, how to apply them to security-specific problems, using open source datasets or guiding you through creating your own. In one exercise, you'll use GPT-3.5, the model behind ChatGPT, to generate an artificial dataset of fabricated news. Later, you'll find out how to apply the expert knowledge and human-in-the-loop decision-making that the cybersecurity space requires. The book is designed to address the lack of resources for people interested in moving into a data scientist role in cybersecurity. It concludes with case studies, interview questions, and blueprints for four projects that you can use to enhance your portfolio.

By the end of this book, you'll be able to apply machine learning algorithms to detect malware, fake news, deepfakes, and more, and to implement privacy-preserving machine learning techniques such as differentially private ML.

What you will learn

Use GNNs to build feature-rich graphs for bot detection and engineer graph-powered embeddings and features
Discover how to apply ML techniques in the cybersecurity domain
Apply state-of-the-art algorithms such as transformers and GNNs to solve security-related issues
Leverage ML to solve modern security issues such as deepfake detection, machine-generated text identification, and stylometric analysis
Apply privacy-preserving ML techniques and use differential privacy to protect user data while training ML models
Build your own portfolio with end-to-end ML projects for cybersecurity

Who this book is for

This book is for machine learning practitioners interested in applying their skills to solve cybersecurity issues. Cybersecurity workers looking to leverage ML methods will also find this book useful. An understanding of the fundamental machine learning concepts and beginner-level knowledge of Python programming are needed to grasp the concepts in this book. Whether you're a beginner or an experienced professional, this book offers a unique and valuable learning experience that'll help you develop the skills needed to protect your network and data against the ever-evolving threat landscape.
