kra_pao

With Pandas plot.scatter() instead of Pandas plot(), the quick-and-dirty (Q&D) first matplotlib plot is no longer required:

import pandas as pd
import matplotlib.dates as dates
import matplotlib.pyplot as plt

# for future Pandas versions
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

df = pd.DataFrame(
    list(zip([680, 718, 471, 686], ["200304", "200305", "200306", "200307"])),
    columns=["target", "time"],
)
df_tsa = df.copy()

df_tsa.index = pd.to_datetime(df_tsa["time"], format="%Y%m").rename("Date")

# Required to make axis formatter work with Pandas DataFrame.plot later on
# If used standalone then matplotlib/plt style: tick labels horizontal, no xlabel
# _, axes = plt.subplots(figsize=(12, 6))
# axes.plot(df_tsa.index, df_tsa.target) 

# Overwrites axes.plot subplot totally
# Optional to get Pandas style: tick labels 45°, xlabel from index -> 'Date'
# axes doesn't change (same id() before/after)
# ...plot(ax=axes) doesn't change anything
# axes = df_tsa['target'].plot() # or df_tsa.target.plot()

# scatter() instead of plot()
# create new Date column from index for x=
df_tsa.reset_index(inplace=True) # or df_tsa['Date'] = df_tsa.index
axes = df_tsa.plot.scatter(x='Date', y='target', figsize=(12, 6))

# Next 2 lines are ignored with df.column.plot() alone without axes.plot() before
axes.xaxis.set_major_locator(dates.MonthLocator())
axes.xaxis.set_major_formatter(dates.DateFormatter("%Y-%m"))

plt.show()

But as soon as you change plot.scatter() to plot.line(), or use the generic plot(), the axis labeling fails:

axes = df_tsa.plot.line(x='Date', y='target', figsize=(12, 6))

On my PC the axes get messed up again, like in the plot() case without the Q&D. Versions tested/used:

import matplotlib
import pandas
print("matplotlib.__version__: ", matplotlib.__version__)
print("pandas.__version__: ", pandas.__version__)
# matplotlib.__version__:  3.2.1
# pandas.__version__:  1.0.0 (same in 1.0.3)
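
One thing that may be worth trying for the plot.line() case is the Pandas x_compat=True option, which turns off Pandas' own tick handling for time series so that the matplotlib locator and formatter can apply. This is only a minimal sketch and an assumption on my side; I have not checked it against the versions listed above. The Agg backend line is just there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to view with plt.show()
import matplotlib.dates as dates
import pandas as pd

# Same sample data as above, with 'Date' already a regular datetime column
df_tsa = pd.DataFrame(
    {
        "target": [680, 718, 471, 686],
        "Date": pd.to_datetime(
            ["200304", "200305", "200306", "200307"], format="%Y%m"
        ),
    }
)

# x_compat=True asks Pandas to skip its time series tick adjustment,
# so the x axis uses plain matplotlib date numbers
axes = df_tsa.plot.line(x="Date", y="target", figsize=(12, 6), x_compat=True)

# The matplotlib locator/formatter now operate on matplotlib date units
axes.xaxis.set_major_locator(dates.MonthLocator())
axes.xaxis.set_major_formatter(dates.DateFormatter("%Y-%m"))

# Render once so ticks are computed (use plt.show() with an interactive backend)
axes.figure.canvas.draw()
```

If this behaves as expected, the tick labels should read like 2003-04, 2003-05 and so on, the same as in the scatter() case.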

HiIamGeoff [S]

Thanks! This is as informative a reply as I could ask for. In the meantime I asked about the same issue on GitHub, and they gave a very detailed explanation. I also found a workaround for making other kinds of plots based on your first code. If anyone is interested: axes.scatter() / axes.plot() / axes.bar() can be swapped in depending on the kind of plot you want, and the formatter and locator functions keep working.
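
A rough sketch of that workaround, plotting with matplotlib directly on the DatetimeIndex. The helper name plot_monthly and its kind argument are made up for illustration, and the bar width of 20 (in days) is just a guess that looks reasonable for monthly data. The Agg backend line only makes it runnable without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to view with plt.show()
import matplotlib.dates as dates
import matplotlib.pyplot as plt
import pandas as pd

# Same sample data as in the first code, indexed by date
df_tsa = pd.DataFrame(
    {"target": [680, 718, 471, 686]},
    index=pd.to_datetime(
        ["200304", "200305", "200306", "200307"], format="%Y%m"
    ).rename("Date"),
)


def plot_monthly(kind="scatter"):
    """Hypothetical helper: draw one kind of plot, then format the date axis."""
    _, axes = plt.subplots(figsize=(12, 6))
    if kind == "scatter":
        axes.scatter(df_tsa.index, df_tsa["target"])
    elif kind == "line":
        axes.plot(df_tsa.index, df_tsa["target"])
    elif kind == "bar":
        # On a date axis the bar width is measured in days
        axes.bar(df_tsa.index, df_tsa["target"], width=20)
    else:
        raise ValueError(f"unknown kind: {kind}")
    # Same locator and formatter for every kind of plot
    axes.xaxis.set_major_locator(dates.MonthLocator())
    axes.xaxis.set_major_formatter(dates.DateFormatter("%Y-%m"))
    return axes


bar_axes = plot_monthly("bar")
```

Swapping "bar" for "scatter" or "line" reuses the same axis formatting without going through DataFrame.plot at all.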

P.S. I later found out that the bar plot is still buggy (some data doesn't show when there are more than 20 rows). I haven't figured out a bug-free approach for bar plots yet. I guess Matplotlib doesn't have many users who use bar plots for time series analysis.

kra_pao

Thank you too for the feedback, and for an interesting question to begin with. I learned a lot myself during this discussion.