Pandas is an open-source library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
Python is great for data preparation and munging, but on its own it is less suited to data analysis. Pandas helps us fill this gap, enabling us to carry out the entire data analysis process in Python.
Pandas provides two distinct data structures:
- Pandas Series
- Pandas DataFrame
A Pandas Series is a one-dimensional labelled array that can hold any data type. The axis labels are collectively called the index. In simple terms, a Pandas Series is like a column in Excel, as shown below. Here the series is labelled A and the row number represents the index.
The difference between this Excel column and a Pandas Series is that indexing in a Series starts at 0 by default.
Another feature of a Pandas Series is that the default index is not mandatory, i.e. we can use a custom index, as shown in the figure above. Label-based indexing provides a host of methods for performing operations involving the index.
Creating a Pandas Series
A Pandas Series can be created from a NumPy array, a Python list, a Python dictionary, a scalar value, etc. by passing it to the Series constructor.
We can check the data type of the Series elements with the dtype attribute.
To create a Pandas Series we first have to import the pandas library. To create a Series from a list, we can either create the list first and pass it in, or write the list directly inside the Series call. We will see both examples.
import pandas as pd

# Creating a list
list1 = [100, 90, 80, 70, 60]

# Passing the list to the Series constructor
series1 = pd.Series(list1)
print(series1)

# Creating the list inside the Series call
series2 = pd.Series([1, 2, 3, 4, 5, 6])
print(series2)
0    100
1     90
2     80
3     70
4     60
dtype: int64
0    1
1    2
2    3
3    4
4    5
5    6
dtype: int64
Using NumPy Array
To create a Pandas Series from a NumPy array, we have to import the numpy library as well. We'll create an array using the array() function and pass that array to pd.Series().
import pandas as pd
import numpy as np

# Creating an array
array1 = np.array(["P", "Y", "T", "H", "O", "N"])

# Creating a series from the array
series3 = pd.Series(array1)
print(series3)
0    P
1    Y
2    T
3    H
4    O
5    N
dtype: object
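The introduction above also lists dictionaries and scalar values as sources for a Series, and mentions checking the element type. The following is a minimal sketch of those cases; the subject names and marks are made-up sample data, not taken from the original examples.

```python
import pandas as pd

# From a dictionary: the keys become the index labels
marks = {"Maths": 90, "Physics": 85, "Chemistry": 78}  # hypothetical sample data
series_from_dict = pd.Series(marks)
print(series_from_dict)

# From a scalar: the value is repeated once for every label in the index
series_from_scalar = pd.Series(5, index=["a", "b", "c"])
print(series_from_scalar)

# dtype reports the data type of the elements
print(series_from_dict.dtype)
```

When a scalar is used, the index has to be supplied so that pandas knows how many times to repeat the value.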
Indexing of Pandas Series
In a Pandas Series, if we do not pass a custom index, the series gets the default numerical index, which starts from 0. We can change the default index while creating the series by passing the index argument.
We can also add new index labels after creating the series by using the reindex method. The newly created labels will have a null (NaN) value.
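import pandas as pd

# Values for the series (taken from the output shown below)
data = [20, 30, 27, 24]

# Creating a custom index
ser1 = pd.Series(data, index=["TATA", "Hundai", "Ferari", "Suzuki"])
print(ser1)

# Adding a new index label; reindex returns a new series
new_index = ["TATA", "Hundai", "Ferari", "Suzuki", "Mercedes"]
ser2 = ser1.reindex(new_index)
print(ser2)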
TATA      20
Hundai    30
Ferari    27
Suzuki    24
dtype: int64
TATA        20.0
Hundai      30.0
Ferari      27.0
Suzuki      24.0
Mercedes     NaN
dtype: float64
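Note that the values changed from int64 to float64 in the output above, because NaN is a floating-point value. As a small sketch of one way to avoid this, reindex accepts a fill_value argument for the new labels; choosing 0 as the fill value here is only an assumption for illustration.

```python
import pandas as pd

ser = pd.Series([20, 30, 27, 24], index=["TATA", "Hundai", "Ferari", "Suzuki"])
new_index = ["TATA", "Hundai", "Ferari", "Suzuki", "Mercedes"]

# Default behaviour: the new label gets NaN and the dtype becomes float64
with_nan = ser.reindex(new_index)
print(with_nan.isna())  # True only for "Mercedes"

# fill_value supplies a value for new labels instead of NaN
with_zero = ser.reindex(new_index, fill_value=0)
print(with_zero)
```

Whether a placeholder such as 0 is appropriate depends on the data; NaN is often the more honest marker for a missing value.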
Accessing Elements from a Series
We can access the elements of a Series in two ways: by position and by index label.
Accessing the elements of a Series by their position number is similar to accessing the elements of an array: we use the index operator [] with the position. To access multiple elements, or part of a series, we use a slice.
Accessing the elements of a Series by index label works in much the same way as accessing them by position. Below are examples of both types.
import pandas as pd

series1 = pd.Series(["Ajay", "Amar", "Priya", "Sumit", "Tarun"],
                    index=["First", "Second", "Third", "Fourth", "Fifth"])
print(series1)

# Positional access: a single element
# (.iloc is used because plain series1[2] is deprecated in recent pandas versions)
print("\n Third Element")
print(series1.iloc[2])  # third element of series1

# Positional slicing: multiple elements
print("\n First 3 Elements")
print(series1[:3])  # first 3 values

# Access by index label
print("\n Second Element")
print(series1["Second"])
First      Ajay
Second     Amar
Third     Priya
Fourth    Sumit
Fifth     Tarun
dtype: object

 Third Element
Priya

 First 3 Elements
First      Ajay
Second     Amar
Third     Priya
dtype: object

 Second Element
Amar
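The two access styles above can also be written with the explicit .iloc (position) and .loc (label) accessors, which removes any ambiguity about which kind of lookup is meant. This is a minimal sketch that reuses the same sample names; the variable names are chosen here for illustration.

```python
import pandas as pd

names = pd.Series(
    ["Ajay", "Amar", "Priya", "Sumit", "Tarun"],
    index=["First", "Second", "Third", "Fourth", "Fifth"],
)

# .iloc is always positional (0-based)
third = names.iloc[2]
first_three = names.iloc[:3]  # the stop position is excluded

# .loc is always label-based
second = names.loc["Second"]
first_to_third = names.loc["First":"Third"]  # label slices include the end label

print(third, second)
print(first_three.equals(first_to_third))
```

One difference worth remembering: a positional slice excludes its stop position, while a label slice includes its end label, so both slices here select the same three elements.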