I have the value 38142 and need to convert it to a date using Python. If I enter this number in Excel and format the cell as a date, it is displayed as 04/06/2004. I need the same result in Python. How can I achieve this?
7 Answers
The offset in Excel is the number of days since 1900/01/01, with 1 being the first of January 1900, so add the number of days as a timedelta to 1899/12/31:
from datetime import datetime, timedelta
def from_excel_ordinal(ordinal: float, _epoch0=datetime(1899, 12, 31)) -> datetime:
    if ordinal >= 60:
        ordinal -= 1  # Excel leap year bug, 1900 is not a leap year!
    return (_epoch0 + timedelta(days=ordinal)).replace(microsecond=0)
You have to adjust the ordinal by one day for any date after 1900/02/28; Excel has inherited a leap year bug from Lotus 1-2-3 and treats 1900 as a leap year. The code above returns datetime(1900, 2, 28, 0, 0) for both 59 and 60 to correct for this, with fractional values in the range [59.0, 61.0) all being a time between 00:00:00.0 and 23:59:59.999999 on that day.
The above also supports serials with a fraction to represent time, but since Excel doesn't support microseconds those are dropped.
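As a quick check, here is a minimal sketch that repeats the function above and runs it on the value from the question and on the serials around the phantom leap day. The sample serials other than 38142 are chosen here for illustration only.

```python
from datetime import datetime, timedelta

def from_excel_ordinal(ordinal: float, _epoch0=datetime(1899, 12, 31)) -> datetime:
    if ordinal >= 60:
        ordinal -= 1  # skip Excel's non-existent 1900-02-29
    return (_epoch0 + timedelta(days=ordinal)).replace(microsecond=0)

print(from_excel_ordinal(38142))    # 2004-06-04 00:00:00
print(from_excel_ordinal(38142.5))  # 2004-06-04 12:00:00
print(from_excel_ordinal(59))       # 1900-02-28 00:00:00
print(from_excel_ordinal(60))       # 1900-02-28 00:00:00 (Excel shows 1900-02-29)
print(from_excel_ordinal(61))       # 1900-03-01 00:00:00
```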
-
@Krish: the bug was popularized by Joel Spolsky: My First BillG Review – jfs Commented Apr 2, 2015 at 20:07
-
Are you sure the epoch is not December 31, 1899? datetime(1899, 12, 31) + timedelta(ordinal - (ordinal > 59)) – jfs Commented Apr 2, 2015 at 20:20
-
@J.F.Sebastian I stuck to the documentation for Excel here; it makes little difference here to subtract one relative to 1900-01-01. Commented Apr 2, 2015 at 21:00
-
It makes no sense to have _epoch as a parameter if we hard-code the ordinal check for being > 59. Commented Oct 29, 2018 at 17:36
-
@FinanceGuyThatCantCode: The _epoch parameter is there to cache the value as a local variable, nothing more. This helps avoid having to create it for each call, or to have to look up a global (slightly slower). Commented Oct 29, 2018 at 17:40
from datetime import datetime, timedelta
def from_excel_ordinal(ordinal, epoch=datetime(1900, 1, 1)):
    # Adapted from above, thanks to @Martijn Pieters
    if ordinal > 59:
        ordinal -= 1  # Excel leap year bug, 1900 is not a leap year!
    inDays = int(ordinal)
    frac = ordinal - inDays
    inSecs = int(round(frac * 86400.0))
    return epoch + timedelta(days=inDays - 1, seconds=inSecs)  # epoch is day 1
excelDT = 42548.75001  # Float representation of 27/06/2016 6:00:01 PM in Excel format
pyDT = from_excel_ordinal(excelDT)
The answer above is fine for just a date value, but here I extend that solution to include the time and return a datetime value as well, rounded to the nearest second.
-
There is no need to split out the days and the seconds; timedelta() does this for you when days is a floating point value. Commented Apr 20, 2022 at 15:09
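To illustrate that comment, here is a minimal sketch that passes the fractional serial straight to timedelta() and still rounds to the nearest second. The helper name from_excel_serial is made up for this example, and the half-second trick is just one way to round, not something taken from the answers above.

```python
from datetime import datetime, timedelta

# timedelta() splits a float number of days into days and seconds on its own.
assert timedelta(days=1.75) == timedelta(days=1, seconds=64800)

def from_excel_serial(serial: float) -> datetime:
    """Hypothetical helper: Excel 1900-system serial to datetime, rounded to the second."""
    if serial >= 60:
        serial -= 1  # skip Excel's non-existent 1900-02-29
    exact = datetime(1899, 12, 31) + timedelta(days=serial)
    # Round to the nearest second: add half a second, then drop the microseconds.
    return (exact + timedelta(microseconds=500_000)).replace(microsecond=0)

print(from_excel_serial(42548.75001))  # 2016-06-27 18:00:01
```

Truncating instead of rounding (as .replace(microsecond=0) alone does) would give 18:00:00 for the same input.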
I would recommend the following:
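import pandas as pd
def convert_excel_time(excel_time):
    # Base date is 1899-12-30, not 1900-01-01: serial 1 is 1900-01-01,
    # and Excel also counts a non-existent 1900-02-29.
    return pd.to_datetime('1899-12-30') + pd.to_timedelta(excel_time, 'D')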
Or
import datetime
def xldate_to_datetime(xldate):
    temp = datetime.datetime(1899, 12, 30)
    delta = datetime.timedelta(days=xldate)
    return temp + delta
This is inspired by https://gist.github.com/oag335/9959241 but fixed to handle the two-day offset in Excel.
-
xldate_to_datetime(44000) gives 2020-06-20, whereas the correct answer is 2020-06-18. Commented Oct 5, 2020 at 15:58
-
@PoornaPrudhvi is correct; the base date should be 1899-12-30. One day offset because we should be adding to Dec 31 and another day offset because of the leap year bug mentioned in the accepted answer. – JJL Commented Oct 14, 2021 at 19:11
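The two-day offset discussed in these comments can be checked with a short standard-library sketch. The names EXCEL_EPOCH and xl_to_datetime are invented for this example.

```python
from datetime import datetime, timedelta

EXCEL_EPOCH = datetime(1899, 12, 30)  # hypothetical constant name

def xl_to_datetime(serial):
    """Hypothetical helper: simple epoch shift, valid for serials of 61 and up."""
    return EXCEL_EPOCH + timedelta(days=serial)

print(xl_to_datetime(44000))                        # 2020-06-18 00:00:00
print(xl_to_datetime(38142))                        # 2004-06-04 00:00:00
print(datetime(1900, 1, 1) + timedelta(days=44000)) # 2020-06-20 00:00:00, two days late

# Caveat: before the phantom 1900-02-29 only one of the two offsets applies,
# so serials below 61 come out one day early with this epoch.
print(xl_to_datetime(1))                            # 1899-12-31 00:00:00 (Excel: 1900-01-01)
```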
I came to this question when trying to do the same above, but for entire columns within a df. I made this function, which did it for me:
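import pandas as pd
from datetime import datetime, timedelta
import copy as cp
def xlDateConv(df, *cols):
    tempDt = []
    fin = cp.deepcopy(df)  # work on a copy so the original df is untouched
    for col in [*cols]:
        for i in range(len(fin[col])):
            # Base date is 1899-12-30 (not 1900-01-01) to absorb Excel's two-day offset
            tempDate = datetime(1899, 12, 30)
            delta = timedelta(float(fin[col].iloc[i]))  # .iloc: positional, works with any index
            tempDt.append(pd.to_datetime(tempDate + delta))
        fin[col] = tempDt
        tempDt = []
    return fin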
Note that you need to type each column, quoted (as string), as one parameter, which can most likely be improved (list of columns as input, for instance). Also, it returns a copy of the original df (doesn't change the original).
Btw, partly inspired by this (https://gist.github.com/oag335/9959241).
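One possible version of the improvement mentioned above (a list of columns as input) is sketched below. It assumes the columns hold 1900-system serials of 61 or more and relies on the unit and origin parameters of pd.to_datetime rather than a Python loop. The function name xl_date_conv and the sample frame are made up for illustration.

```python
import pandas as pd

def xl_date_conv(df, cols):
    """Hypothetical helper: return a copy of df with Excel serial columns converted."""
    out = df.copy()
    for col in cols:
        # Count days from 1899-12-30, the effective day 0 of Excel's 1900 system.
        out[col] = pd.to_datetime(out[col], unit="D", origin=pd.Timestamp("1899-12-30"))
    return out

df = pd.DataFrame({"opened": [38142, 44000], "note": ["a", "b"]})
converted = xl_date_conv(df, ["opened"])
print(converted["opened"].tolist())
# [Timestamp('2004-06-04 00:00:00'), Timestamp('2020-06-18 00:00:00')]
```

As in the function above, the original df is left unchanged.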
If you are working with pandas, this could be useful:
import xlrd
import datetime as dt
def from_excel_datetime(x):
    return dt.datetime(*xlrd.xldate_as_tuple(x, datemode=0))
df['date'] = df.excel_date.map(from_excel_datetime)
If the date seems to be 4 years delayed, maybe you can try with datemode 1.
:param datemode: 0: 1900-based, 1: 1904-based.
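To see why the wrong datemode shifts dates by about four years, here is a standard-library sketch of the two date systems. It is not xlrd's implementation: the EPOCHS table and the function name are assumptions for illustration, and the 1900-system entry ignores serials below 61.

```python
from datetime import datetime, timedelta

# Hypothetical lookup: effective day 0 for each date system.
EPOCHS = {
    0: datetime(1899, 12, 30),  # 1900-based (valid for serials of 61 and up)
    1: datetime(1904, 1, 1),    # 1904-based
}

def serial_to_datetime(serial, datemode=0):
    return EPOCHS[datemode] + timedelta(days=serial)

# The same calendar day has serials that differ by 1462 in the two systems.
print(serial_to_datetime(40729, datemode=0))  # 2011-07-05 00:00:00
print(serial_to_datetime(39267, datemode=1))  # 2011-07-05 00:00:00
# Reading a 1904-based serial as 1900-based lands four years and a day early.
print(serial_to_datetime(39267, datemode=0))  # 2007-07-04 00:00:00
```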
I had the same problem and then I used this function: (source: https://gist.github.com/OmarArain/9959241)
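import datetime
def xldate_to_datetime(xldate):
    xldate = int(xldate)  # drop any time-of-day fraction
    # Base date is 1899-12-30; starting from 1900-01-01 gives dates two days late
    temp = datetime.datetime(1899, 12, 30)
    delta = datetime.timedelta(days=xldate)
    return temp + delta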
And then I applied it to my dataframe:
df['column_date'] = df['column_date'].apply(lambda x: xldate_to_datetime(x))
This is going to be the simplest solution yet: openpyxl has a built-in just for that:
from openpyxl.utils.datetime import from_excel
excel_ordinal = 38142
python_datetime = from_excel(excel_ordinal)
print(python_datetime, type(python_datetime))  # 2004-06-04 00:00:00 <class 'datetime.datetime'>
It takes care of the 1900-02-29 bug by itself.
Just make sure you understand which date system you're working with – read more here.
date.fromordinal() does. But that number is missing a digit then.