Pandas

Pandas is a Python library for data analysis. It was developed to bring some of the statistical capabilities of R to Python. It does this by introducing two objects for representing data, the Series and the DataFrame, which build on many features of NumPy and integrate Matplotlib to simplify data representation, analysis, and plotting. For data modeling, Pandas works with the statsmodels and scikit-learn packages. It also supports data alignment, handling of missing data, pivoting, grouping, merging and joining of datasets, and many other data analysis features.

Note that we will follow the convention of importing Pandas as pd:

import pandas as pd

This naming scheme is not required, but it is very common, much like np for NumPy.
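
To make the Series and DataFrame objects and a few of the features listed above more concrete, here is a minimal sketch. The data, variable names (temps, offsets, sales, regions), and column names are made up for illustration; only the Pandas and NumPy calls themselves are real.

```python
import numpy as np
import pandas as pd

# A Series is a one-dimensional labeled array; NaN marks a missing value.
# (Illustrative data, not taken from any real dataset.)
temps = pd.Series([21.5, np.nan, 19.0], index=["mon", "tue", "wed"])

# Data alignment: arithmetic matches values by index label, not by position.
offsets = pd.Series([1.0, 1.0], index=["wed", "mon"])
adjusted = temps + offsets  # "tue" has no usable value on either side, so it is NaN

# Missing data: reductions such as mean() skip NaN by default.
mean_temp = temps.mean()

# A DataFrame is a two-dimensional table of labeled columns.
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "units": [10, 15, 7, 3],
})
regions = pd.DataFrame({"store": ["A", "B"], "region": ["north", "south"]})

# Grouping: total units per store.
units_per_store = sales.groupby("store")["units"].sum()

# Merging: attach each store's region by joining on the shared "store" column.
merged = sales.merge(regions, on="store")

print(adjusted)
print(units_per_store)
print(merged)
```

Note that the addition lines up "mon" with "mon" and "wed" with "wed" even though the two Series list their labels in a different order. This label-based alignment is one of the main things Pandas adds on top of plain NumPy arrays.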
