
Understanding your Data Set

John Jarvis
2 min read · Aug 30, 2024


pandas functions and attributes for exploring and describing a data set in Python. Useful for exploratory analysis and troubleshooting.

Image generated using Copilot Designer

Loading Data:

  • pd.read_csv(): Load data from a CSV file.
  • pd.read_excel(): Load data from an Excel file.
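
As a minimal sketch of the loading step, the example below reads a small CSV held in memory, so it runs without any file on disk. The column names, values, and the `people.xlsx` file name are hypothetical and used only for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content, kept in memory so the example is self-contained.
csv_text = "name,age,city\nAna,34,Lisbon\nBen,,Oslo\nCara,29,Lima\n"

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df)

# pd.read_excel is called the same way for spreadsheets, but it needs an
# Excel engine such as openpyxl to be installed, for example:
# df = pd.read_excel("people.xlsx", sheet_name=0)  # hypothetical file name
```

The empty `age` field for the second row is read as a missing value, which makes this small frame a convenient test case for the checks that follow.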

Basic Information:

  • df.head(): Display the first few rows of the DataFrame.
  • df.tail(): Display the last few rows of the DataFrame.
  • df.info(): Get a summary of the DataFrame, including each column's data type and non-null count.
  • df.describe(): Generate descriptive statistics for numeric columns.
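
The four calls above can be tried on a small hand-built DataFrame. This is only a sketch: the `product`, `units`, and `price` columns and their values are made up for the example.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "product": ["A", "B", "C", "D", "E", "F"],
    "units": [10, 4, 7, 12, 3, 8],
    "price": [2.5, 4.0, 3.25, 1.75, 6.0, 2.0],
})

first_rows = df.head(3)  # first 3 rows (the default is 5)
last_rows = df.tail(2)   # last 2 rows (the default is 5)
print(first_rows)
print(last_rows)

# info() prints its summary directly instead of returning it.
df.info()

# describe() returns a DataFrame of statistics; by default it covers
# only the numeric columns, so "product" is left out here.
stats = df.describe()
print(stats)
```

Passing a number to `head()` or `tail()` is optional; it is shown here because the sample frame is shorter than the default of five rows would make interesting.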

Data Types and Structure:

  • df.dtypes: Display the data type of each column.
  • df.shape: Get the dimensions of the DataFrame (rows, columns).
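
A short sketch of both attributes, using a hypothetical three-column frame. Note that `dtypes` and `shape` are attributes, so they are written without parentheses.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "score": [0.5, 0.75, 1.0],
    "label": ["x", "y", "z"],
})

# One dtype per column. Text columns typically appear as "object"
# (or as a string dtype, depending on the pandas version).
print(df.dtypes)

# shape is a (rows, columns) tuple and can be unpacked directly.
n_rows, n_cols = df.shape
print(f"{n_rows} rows, {n_cols} columns")
```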

Column and Index Information:

  • df.columns: List all column names.
  • df.index: Display the index (row labels) of the DataFrame.
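
As an illustration, the sketch below builds a frame with custom row labels so the index is easier to tell apart from the data. The labels and column names are invented for the example.

```python
import pandas as pd

# Hypothetical sample data with explicit row labels.
df = pd.DataFrame(
    {"height": [1.0, 2.0], "width": [3.0, 4.0]},
    index=["row_a", "row_b"],
)

# Both attributes return pandas Index objects; tolist() turns them
# into plain Python lists, which are often easier to print or compare.
column_names = df.columns.tolist()
row_labels = df.index.tolist()

print(column_names)
print(row_labels)
```

Without the `index=` argument, pandas assigns a default integer index starting at 0.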

Missing Values:

  • df.isnull().sum(): Count the number of missing values in each column.
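
A minimal sketch of the missing-value count, assuming a small frame with gaps deliberately placed in two of its three columns. The data is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: "a" has one gap, "b" has two, "c" has none.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", None, None],
    "c": [1, 2, 3],
})

# isnull() returns a True/False frame of the same shape; summing it
# counts the True values column by column.
missing_per_column = df.isnull().sum()
print(missing_per_column)
```

Both `np.nan` and `None` are treated as missing here, so the result is a Series indexed by column name that holds one count per column.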

Unique Values and Value Counts:

  • df['column'].unique(): Get unique values in…
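
The sketch below shows `unique()` on a hypothetical column of color names. The section heading also mentions value counts; since the original list is cut off, using `value_counts()` for that is an assumption here, although it is the standard pandas method for counting occurrences.

```python
import pandas as pd

# Hypothetical sample data; 'column' is a placeholder column name.
df = pd.DataFrame({"column": ["red", "blue", "red", "green", "red", "blue"]})

# unique() returns the distinct values in order of first appearance.
unique_values = df["column"].unique()
print(unique_values)

# value_counts() returns how often each value occurs,
# sorted from most to least frequent by default.
counts = df["column"].value_counts()
print(counts)
```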
