Python Data Analysis: An end-to-end guide covering data processing, data manipulation and data visualization. 4th Ed.

Authors: Navlani Avinash, Wijaya Cornellius Yudha
Release date: 2026
Publisher: Packt Publishing Limited
Number of pages: 223
File size: 4.4 MB
File type: PDF
Added by: codelibs

Welcome to Packt Early Access

Python Data Analysis, Fourth Edition: An end-to-end guide covering data processing, data manipulation and data visualization

Chapter 1: Getting Started with Python Libraries

Join our book community on Discord

Navigating the landscape of data analysis

Exploring libraries for data analysis

Data analysis process methodologies

Knowledge discovery from data (KDD)

SEMMA

CRISP-DM

Standard process of data analysis

Comparing Data Analysis, Data Science and Data Engineering

Data Science Domain Job Roles and Skillsets

Roles of Data Analyst, Data Scientist, and Data Engineer

Skillsets for Data Analyst and Data Scientist

Roles of ML Engineer and NLP Engineer

Skillsets for Data Engineer and ML Engineer

A quick look at MLOps

Installing Python 3

Python installation and setup on Windows

Python installation and setup on Linux

Python installation and setup on Mac OS X with a GUI installer

Python installation and setup on Mac OS X with brew

Software tools used in this book

Using IPython as a shell

Hands on with IPython

Reading manual pages

Where to find help?

Using JupyterLab

Using Jupyter Notebooks

Advanced features of Jupyter Notebooks

Using PyCharm and VS Code

PyCharm

Visual Studio Code

Using Databricks for PySpark

Summary

Chapter 2: NumPy and pandas

Join our book community on Discord

Technical requirements

Grasping the essence of NumPy arrays

Array properties and attributes

Selecting array elements

NumPy array numerical data types

Data type objects

Data type character codes

Data type constructors

Data type attributes

Converting arrays

Manipulating array shapes

Stacking arrays

Splitting arrays

Creating views and copies

Slicing NumPy Arrays

Broadcasting arrays

More on NumPy Methods

Creating pandas DataFrames and Series

Describing pandas DataFrames

Understanding pandas Series

pandas Series Features

Reading and querying the Quandl and Nasdaq Data Link data

Grouping and joining pandas DataFrames

Concatenating DataFrames

Working with missing values

Creating pivot tables

Dealing with dates

Date Features

Date Methods

Summary

References

Chapter 3: Statistics for Data Insights

Join our book community on Discord

Technical requirements

Understanding attributes of data and their types

Nominal attributes

Ordinal attributes

Numeric attributes

Discrete and continuous attributes

Measuring central tendency

Mean

Mode

Median

Measuring dispersion

Range

Interquartile Range (IQR)

Variance

Standard deviation

Skewness and kurtosis

Understanding relationships using covariance and correlation coefficients

Covariance

Correlation

Pearson's correlation coefficient

Spearman's rank correlation coefficient

Kendall's rank correlation coefficient

Collecting samples

Probability Sampling

Non-probability sampling

Performing parametric tests

Understanding t-tests

One Sample t-test

Two Sample t-test

Paired Sample t-test

ANOVA

One-way ANOVA

Two-way ANOVA

Performing non-parametric tests

Chi-Square Test

Mann-Whitney U Test

Wilcoxon Signed-Rank Test

Kruskal-Wallis Test

A/B testing

Performing Sampling and Splitting the Data into Groups

Formulating a Hypothesis and Performing Sampling

Bayes' theorem

Summary

Chapter 4: Linear Algebra

Join our book community on Discord

What is linear algebra?

Introduction to scalar, vector, matrix, and tensor

Scalars and vectors

Matrices and tensors

Working with linear algebra in Python

Fitting polynomials with NumPy

Exploring matrix operations

The determinant operation

Finding the rank of a matrix

Matrix inverse using NumPy

Solving linear equations using NumPy

Eigenvalues, eigenvectors, and matrix decomposition

Eigenvectors and Eigenvalues

Decomposing a matrix using SVD

LU Decomposition

QR Decomposition

Probability distributions and random number generation

Probability Functions for Random Variables

Probability Mass Functions

Density Functions

Types of data distributions

Discrete Probability Distributions

Continuous Probability Distributions

Generating random numbers

Testing normality of data using SciPy

Histogram

Anderson-Darling Test

D'Agostino-Pearson test

Creating a masked array using the numpy.ma subpackage

Summary

Understand data analysis pipelines using machine learning algorithms and techniques with this practical guide

Key Features

  • Prepare and clean your data to use it for exploratory analysis, data manipulation, and data wrangling
  • Discover supervised, unsupervised, probabilistic, and Bayesian machine learning methods
  • Get to grips with graph processing and sentiment analysis

Book Description

Data analysis enables you to generate value from small and big data by discovering new patterns, and Python is one of the most popular tools for analyzing a wide variety of data. With this book, you'll get up and running using Python for data analysis by exploring the different phases used in data analysis and learning how to use modern libraries from the Python ecosystem to create efficient data pipelines.

Starting with the essential statistical and data analysis fundamentals using Python, you'll perform complex data analysis and modeling, data manipulation, data cleaning, and data visualization using easy-to-follow examples. You'll then understand how to conduct time series analysis and signal processing using ARMA models. As you advance, you'll get to grips with smart processing and data analytics using machine learning algorithms such as regression, classification, Principal Component Analysis (PCA), and clustering. You'll also work on real-world examples to analyze textual and image data using natural language processing (NLP) and image analytics techniques, respectively. Finally, the book will demonstrate parallel computing using Dask.

By the end of this data analysis book, you'll be equipped with the skills you need to prepare data for analysis and create meaningful data visualizations for forecasting values from data.

What you will learn

  • Prepare, clean, and transform your data for exploratory analysis, manipulation, and wrangling.
  • Explore concepts in signal processing, time series analysis, and predictive analytics.
  • Understand and apply key machine learning techniques, including supervised, unsupervised, probabilistic, and Bayesian methods.
  • Work with graph data and perform sentiment analysis.
  • Handle large-scale image and text analytics efficiently.
  • Accelerate data manipulation using Dask, Modin, and Ray.
  • Perform scalable big data analytics with PySpark.

Who this book is for

This book is for data analysts, business analysts, statisticians, and data scientists looking to learn how to use Python for data analysis. Students and faculty members will also find this book useful for learning and teaching Python data analysis using a hands-on approach. A basic understanding of math and working knowledge of the Python programming language will help you get started with this book.
