5. Time Series & Advanced Operations

pandas has built-in support for time series data: it can parse dates, index rows by timestamp, convert between frequencies, and compute statistics over moving windows.

5.1. Date and Time Handling

Converting strings to datetime objects is the first step in time series analysis. Setting the datetime column as the index gives the DataFrame a DatetimeIndex, which enables time-based operations such as date-range slicing and resampling.

import pandas as pd

# Create a DataFrame with a date column stored as strings
data = {'Date': ['2023-01-01', '2023-01-02', '2023-01-03', '2023-01-04', '2023-01-05'],
        'Value': [10, 12, 15, 13, 16]}
df_time = pd.DataFrame(data)

# Convert the 'Date' column from strings to datetime64 values
df_time['Date'] = pd.to_datetime(df_time['Date'])

# Set 'Date' as the index, giving the DataFrame a DatetimeIndex
df_time = df_time.set_index('Date')
print(df_time.head())

# info() prints its summary itself and returns None, so it is not wrapped in print()
df_time.info()
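
One of the time-based operations a DatetimeIndex makes possible is selecting rows by date. The following is a minimal sketch, assuming the same five-day data as above; the names `subset` and `weekday_names` are illustrative only.

```python
import pandas as pd

# Rebuild the example frame so this snippet runs on its own
df_time = pd.DataFrame(
    {"Value": [10, 12, 15, 13, 16]},
    index=pd.to_datetime(
        ["2023-01-01", "2023-01-02", "2023-01-03", "2023-01-04", "2023-01-05"]
    ),
)
df_time.index.name = "Date"

# Label-based slicing with date strings; both endpoints are included
subset = df_time.loc["2023-01-02":"2023-01-04"]
print(subset)

# Datetime components are available directly on the index
weekday_names = df_time.index.day_name()
print(list(weekday_names))
```

Unlike positional slicing, a date-string slice on a DatetimeIndex includes the end label, so the result here has three rows.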

5.2. Resampling Time Series Data

Resampling converts a time series from one frequency to another (e.g., daily to monthly). Call .resample() with the target frequency, then apply an aggregation function such as .mean() or .sum() to combine the values that fall into each bin.

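# Resample daily data to weekly bins (weeks end on Sunday by default), taking the mean
weekly_mean = df_time.resample('W')['Value'].mean()
print(weekly_mean)

# Resample to month-end bins, taking the sum
# Note: pandas 2.2+ deprecates the 'M' alias in favor of 'ME'
monthly_sum = df_time.resample('M')['Value'].sum()
print(monthly_sum)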
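
To see how values are assigned to bins, it can help to aggregate with more than one function at once. The sketch below assumes the same five daily values; `weekly` and `half_daily` are illustrative names, and the 12-hour target frequency is chosen only to demonstrate upsampling.

```python
import pandas as pd

dates = pd.date_range("2023-01-01", periods=5, freq="D")
df_time = pd.DataFrame({"Value": [10, 12, 15, 13, 16]}, index=dates)

# 'W' bins end on Sunday by default, and 2023-01-01 is a Sunday,
# so the first bin holds only that single day
weekly = df_time["Value"].resample("W").agg(["mean", "count"])
print(weekly)

# Upsampling goes the other way: move to a 12-hour frequency and
# forward-fill the newly created time slots with the last known value
half_daily = df_time["Value"].resample(pd.Timedelta(hours=12)).ffill()
print(half_daily.head(4))
```

The count column shows one observation in the first weekly bin and four in the second, which explains why the two weekly means (10.0 and 14.0) are based on different amounts of data.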

5.3. Rolling and Expanding Windows

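Window functions perform calculations over a sliding (rolling) or growing (expanding) window of data. A rolling window covers a fixed number of the most recent rows, which is how moving averages are computed. An expanding window starts at the first row and grows to include every row up to the current one, which gives cumulative statistics such as running totals.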

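# Calculate a 3-row rolling mean (3 days here, because the data is daily);
# the first two rows are NaN because their windows are incomplete
df_time['Rolling_Mean_3D'] = df_time['Value'].rolling(window=3).mean()
print(df_time)

# Calculate an expanding (cumulative) sum from the first row onward
df_time['Expanding_Sum'] = df_time['Value'].expanding().sum()
print(df_time)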
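
An integer window counts rows, not days, which matters when dates are missing. The following sketch, which assumes the same values with one day deliberately dropped, compares a row-based window with a time-based one and shows the effect of `min_periods`. The names `strict`, `lenient`, `gappy`, `by_rows`, and `by_time` are illustrative.

```python
import pandas as pd

dates = pd.date_range("2023-01-01", periods=5, freq="D")
values = pd.Series([10, 12, 15, 13, 16], index=dates, name="Value")

# Default: the first two results are NaN because fewer than 3 rows exist
strict = values.rolling(window=3).mean()

# min_periods=1 emits a result as soon as one observation is available
lenient = values.rolling(window=3, min_periods=1).mean()

# Drop one day to create a gap in the series
gappy = values.drop(pd.Timestamp("2023-01-03"))

# Row-based window: always the last 2 rows, regardless of the gap
by_rows = gappy.rolling(window=2).sum()

# Time-based window: only rows within the last 2 days of each timestamp
by_time = gappy.rolling("2D").sum()

print(pd.DataFrame({"by_rows": by_rows, "by_time": by_time}))
```

On 2023-01-04 the row-based window still sums the two preceding rows (12 + 13), while the time-based window sees only that day's value, because 2023-01-02 falls outside the two-day span.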

Together, datetime indexing, resampling, and window functions cover the core operations needed for time series analysis in pandas.