I have a dataframe:
df = pd.DataFrame({'unix_utc_ts': [1503007204222, 1503007210206, 1503007215121,
1503007220475], 'tz': ['+0000', '+0100', 'CEST', 'EEST']})
I want to convert the Unix timestamps into datetimes in each row's timezone, so I want something like this:
df['local_ts'] = pd.to_datetime(df['unix_utc_ts'], unit='ms', tz=df['tz'])
The above code doesn't work. Without the tz argument I get this:
      tz    unix_utc_ts                  utc_ts
0  +0000  1503007204222 2017-08-17 22:00:04.222
1  +0100  1503007210206 2017-08-17 22:00:10.206
2   CEST  1503007215121 2017-08-17 22:00:15.121
3   EEST  1503007220475 2017-08-17 22:00:20.475
But of course I want timezone to be included in datetime, so I want this dataframe:
      tz    unix_utc_ts                  utc_ts                local_ts
0  +0000  1503007204222 2017-08-17 22:00:04.222 2017-08-17 22:00:04.222
1  +0100  1503007210206 2017-08-17 22:00:10.206 2017-08-17 23:00:10.206
2   CEST  1503007215121 2017-08-17 22:00:15.121 2017-08-18 00:00:15.121
3   EEST  1503007220475 2017-08-17 22:00:20.475 2017-08-18 01:00:20.475
I've searched and read lots of Stack Overflow questions but didn't find a working answer :(
Thank you!
Solution
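Use apply with tz_localize and tz_convert to convert each row. The parsed timestamps are naive but represent UTC, so mark them as UTC first and then convert to the row's timezone (calling tz_localize with the local zone directly would treat the UTC clock time as local time):
import pandas as pd
# IANA zone names are used here because abbreviations such as 'CEST' or 'EEST' are not recognised
df = pd.DataFrame({'unix_utc_ts': [1503007204222, 1503007210206, 1503007215121,
1503007220475], 'tz': ['Europe/Berlin', 'Europe/Helsinki', 'Europe/Berlin', 'Europe/Helsinki']})
df['datetime_utc'] = pd.to_datetime(df['unix_utc_ts'], unit='ms')
# Attach UTC to the naive timestamp, then shift it into the zone named in the same row
df['datetime_local'] = df.apply(lambda x: x['datetime_utc'].tz_localize('UTC').tz_convert(x['tz']), axis=1)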
However, you are going to have an issue with the format of your timezones. pandas uses pytz to resolve timezone strings, and abbreviations such as CEST and EEST are not among the names pytz accepts (here is a listing I found of them). So you'll have to build a mapping from your timezone labels to names that pytz recognises.
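One way such a mapping could look is sketched below. It is only an illustration: the dictionary TZ_OFFSET_HOURS is a hypothetical name, and it assumes that the four labels from the question stand for fixed offsets (CEST as UTC+2, EEST as UTC+3). Instead of building timezone-aware values, it adds each row's offset to the UTC time, which yields the naive local clock times shown in the desired output.

```python
import pandas as pd

# Hypothetical lookup table (an assumption, not part of pandas or pytz):
# each label from the question mapped to a fixed UTC offset in hours.
TZ_OFFSET_HOURS = {'+0000': 0, '+0100': 1, 'CEST': 2, 'EEST': 3}

df = pd.DataFrame({
    'unix_utc_ts': [1503007204222, 1503007210206, 1503007215121, 1503007220475],
    'tz': ['+0000', '+0100', 'CEST', 'EEST'],
})

# Parse the millisecond epoch values; the result is naive but represents UTC.
df['utc_ts'] = pd.to_datetime(df['unix_utc_ts'], unit='ms')

# Translate each label to a timedelta and add it to the UTC time.
# This is vectorized, so no row-wise apply is needed.
offsets = pd.to_timedelta(df['tz'].map(TZ_OFFSET_HOURS), unit='h')
df['local_ts'] = df['utc_ts'] + offsets

print(df[['tz', 'utc_ts', 'local_ts']])
```

A fixed-offset table like this ignores daylight-saving transitions, so it only suits data where the label already says which offset applies. If timezone-aware values are needed, the same idea works with a dictionary from labels to IANA names (for example 'Europe/Berlin') combined with a per-row tz_convert.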