Transformer, BERT, and GPT: Including ChatGPT and Prompt Engineering

Author: Oswald Campesato

Publication date: 2024

Publisher: Mercury Learning and Information LLC.

Number of pages: 379

File size: 1.7 MB

File type: PDF

Added by: codelibs

Front Cover

Half-Title Page

LICENSE, DISCLAIMER OF LIABILITY, AND LIMITED WARRANTY

Title Page

Dedication

Contents

Preface

Chapter 1 Introduction

What is Generative AI?

Conversational AI Versus Generative AI

Is DALL-E Part of Generative AI?

Are ChatGPT-3 and GPT-4 Part of Generative AI?

DeepMind

OpenAI

Cohere

Hugging Face

AI21

InflectionAI

Anthropic

What are LLMs?

What is AI Drift?

Machine Learning and Drift (Optional)

What is Attention?

Calculating Attention: A High-Level View

An Example of Self Attention

Multi-Head Attention (MHA)

Summary

Chapter 2 Tokenization

What is Pre-Tokenization?

What is Tokenization?

Word, Character, and Subword Tokenizers

Trade-Offs with Character-Based Tokenizers

Subword Tokenization

Subword Tokenization Algorithms

Hugging Face Tokenizers and Models

Hugging Face Tokenizers

Tokenization for the DistilBERT Model

Token Selection Techniques in LLMs

Summary

Chapter 3 Transformer Architecture Introduction

Sequence-to-Sequence Models

Examples of seq2seq Models

What About RNNs and LSTMs?

Encoder/Decoder Models

Examples of Encoder/Decoder Models

Autoregressive Models

Autoencoding Models

The Transformer Architecture: Introduction

The Transformer is an Encoder/Decoder Model

The Transformer Flow and Its Variants

The transformers Library from Hugging Face

Transformer Architecture Complexity

Hugging Face Transformer Code Samples

Transformer and Mask-Related Tasks

Summary

Chapter 4 Transformer Architecture in Greater Depth

An Overview of the Encoder

What are Positional Encodings?

Other Details Regarding Encoders

An Overview of the Decoder

Encoder, Decoder, or Both: How to Decide?

Delving Deeper into the Transformer Architecture

Autoencoding Transformers

The “Auto” Classes

Improved Architectures

Hugging Face Pipelines and How They Work

Hugging Face Datasets

Transformers and Sentiment Analysis

Source Code for Transformer-Based Models

Summary

Chapter 5 The BERT Family Introduction

What is Prompt Engineering?

Aspects of LLM Development

Kaplan and Under-Trained Models

What is BERT?

BERT and NLP Tasks

BERT and the Transformer Architecture

BERT and Text Processing

BERT and Data Cleaning Tasks

Three BERT Embedding Layers

Creating a BERT Model

Training and Saving a BERT Model

The Inner Workings of BERT

Summary

Chapter 6 The BERT Family in Greater Depth

A Code Sample for Special BERT Tokens

BERT-Based Tokenizers

Sentiment Analysis with DistilBERT

BERT Encoding: Sequence of Steps

Sentence Similarity in BERT

Generating BERT Tokens (1)

Generating BERT Tokens (2)

The BERT Family

Working with RoBERTa

Italian and Japanese Language Translation

Multilingual Language Models

Translation for 1,000 Languages

M-BERT

Comparing BERT-Based Models

Web-Based Tools for BERT

Topic Modeling with BERT

What is T5?

Working with PaLM

Summary

Chapter 7 Working with GPT-3 Introduction

The GPT Family: An Introduction

GPT-2 and Text Generation

What is GPT-3?

GPT-3 Models

What is the Goal of GPT-3?

What Can GPT-3 Do?

Limitations of GPT-3

GPT-3 Task Performance

How GPT-3 and BERT are Different

The GPT-3 Playground

Inference Parameters

Overview of Prompt Engineering

Details of Prompt Engineering

Few-Shot Learning and Fine-Tuning LLMs

Summary

Chapter 8 Working with GPT-3 in Greater Depth

Fine-Tuning and Reinforcement Learning (Optional)

GPT-3 and Prompt Samples

Working with Python and OpenAI APIs

Text Completion in OpenAI

The Completion() API in OpenAI

Text Completion and Temperature

Text Classification with GPT-3

Sentiment Analysis with GPT-3

GPT-3 Applications

Open-Source Variants of GPT-3

Miscellaneous Topics

Summary

Chapter 9 ChatGPT and GPT-4

What is ChatGPT?

Plugins, Code Interpreter, and Code Whisperer

Detecting Generated Text

Concerns about ChatGPT

Sample Queries and Responses from ChatGPT

ChatGPT and Medical Diagnosis

Alternatives to ChatGPT

Machine Learning and ChatGPT: Advanced Data Analytics

What is InstructGPT?

VizGPT and Data Visualization

What is GPT-4?

ChatGPT and GPT-4 Competitors

LlaMa-2

When Will GPT-5 Be Available?

Summary

Chapter 10 Visualization with Generative AI

Generative AI and Art and Copyrights

Generative AI and GANs

What is Diffusion?

CLIP (OpenAI)

GLIDE (OpenAI)

Text-to-Image Generation

Text-to-Image Models

The DALL-E Models

DALL-E 2

DALL-E Demos

Text-to-Video Generation

Text-to-Speech Generation

Summary

Index

This book covers the Transformer architecture, BERT models, and the GPT series, including GPT-3 and GPT-4, in detail. Across ten chapters, it begins with foundational concepts such as the attention mechanism and tokenization techniques, moves on to the Transformer and BERT architectures, and concludes with the most recent members of the GPT series, including ChatGPT. Key chapters cover the evolution and significance of attention in deep learning, the Transformer architecture, the BERT family (in two parts), and hands-on work with GPT-3. The final chapters give an overview of ChatGPT, GPT-4, and visualization using generative AI. The book also surveys influential AI organizations such as DeepMind, OpenAI, Cohere, and Hugging Face. Readers will gain an understanding of the current landscape of NLP models, their underlying architectures, and their practical applications.

FEATURES:

Covers the Transformer architecture, BERT models, and the GPT series, including GPT-3 and GPT-4, in detail.
Features companion files with numerous code samples and figures from the book.
