Read CSV with dd.mm.yyyy in Python and Pandas
I am reading a CSV file with a German date format. It seems this worked fine in this post:
Picking dates from an imported CSV with pandas/python
However, in my case the date is not recognized as such. I could not find anything wrong with the strings in the test file.
    import pandas as pd
    import numpy as np
    %matplotlib inline
    import matplotlib.pyplot as plt
    from matplotlib import style
    from pandas import DataFrame

    style.use('ggplot')

    df = pd.read_csv('testdata.csv', dayfirst=True, parse_dates=True)
    df[:5]
This results in:
So, the column with the dates is not recognized as such. What am I doing wrong here? Or is this date format simply not compatible?
- OS X 10.10.3
- Anaconda conda 3.13.0
- Python 3.4.3-0
- IPython Notebook 3.1.0
If you use parse_dates=True, then read_csv tries to parse the index as a date. Therefore, you would also need to declare the first column as the index with index_col=[0]:
    In [216]: pd.read_csv('testdata.csv', dayfirst=True, parse_dates=True, index_col=[0])
    Out[216]:
                morgens  mittags  abends
    datum
    2015-03-16      382      452     202
    2015-03-17      288      467     192
Alternatively, if you don't want the datum column to be the index, you could use parse_dates=[0] to explicitly tell read_csv to parse the first column as dates:
    In [217]: pd.read_csv('testdata.csv', dayfirst=True, parse_dates=[0])
    Out[217]:
           datum  morgens  mittags  abends
    0 2015-03-16      382      452     202
    1 2015-03-17      288      467     192
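Both variants can be checked without the original file. The following is a minimal sketch that assumes testdata.csv is comma-separated and has the four columns shown in the output above; the in-memory csv_text is a hypothetical stand-in, not the asker's actual file.

```python
import io

import pandas as pd

# Hypothetical stand-in for testdata.csv (comma delimiter is an assumption).
csv_text = """datum,morgens,mittags,abends
16.03.2015,382,452,202
17.03.2015,288,467,192
"""

# parse_dates=True only looks at the index. With no index_col the index is
# just 0, 1, ... so the datum column is left unparsed.
plain = pd.read_csv(io.StringIO(csv_text), dayfirst=True, parse_dates=True)

# Variant 1: make the first column the index, then parse the index.
indexed = pd.read_csv(
    io.StringIO(csv_text), dayfirst=True, parse_dates=True, index_col=[0]
)

# Variant 2: keep the default index and parse column 0 explicitly.
by_column = pd.read_csv(io.StringIO(csv_text), dayfirst=True, parse_dates=[0])

is_datetime = pd.api.types.is_datetime64_any_dtype
print("plain datum parsed:    ", is_datetime(plain["datum"]))
print("indexed index parsed:  ", is_datetime(indexed.index))
print("by_column datum parsed:", is_datetime(by_column["datum"]))
```

Checking the dtype this way is usually quicker than eyeballing the printed frame, since an unparsed date string and a parsed date can look similar in the notebook output.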
Under the hood, read_csv uses dateutil.parser.parse to parse date strings:
    In [218]: import dateutil.parser as dp

    In [221]: dp.parse('16.03.2015', dayfirst=True)
    Out[221]: datetime.datetime(2015, 3, 16, 0, 0)
Since dateutil.parser has no trouble parsing date strings in dd.mm.yyyy format, you don't have to declare a custom date parser here.
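It may still be worth keeping dayfirst=True, because it matters as soon as the day is 12 or lower. A small sketch of that, where '03.04.2015' is an illustrative value that does not come from the test file:

```python
import datetime

import dateutil.parser as dp

# Unambiguous: 16 cannot be a month, so the result is the same either way.
unambiguous_default = dp.parse('16.03.2015')
unambiguous_dayfirst = dp.parse('16.03.2015', dayfirst=True)

# Ambiguous: 03 and 04 are both valid months. dateutil reads the string
# month-first unless dayfirst=True is passed.
ambiguous_default = dp.parse('03.04.2015')
ambiguous_dayfirst = dp.parse('03.04.2015', dayfirst=True)

print(unambiguous_default.date(), unambiguous_dayfirst.date())
print(ambiguous_default.date(), ambiguous_dayfirst.date())
```

So for German-style dates, the flag decides whether 03.04.2015 ends up as 3 April or 4 March.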