34

I have a value, 38142, that I need to convert into a date using Python. If I enter this number in Excel, right-click, and choose Format Cells, the value is displayed as 04/06/2004. I need the same result in Python. How can I achieve this?

5
  • That's a weird ordinal; are you sure 04/06/2004 is correct? If the value 38142 stands for days then that'd be an offset from either 1993/12/25 or 1993/10/27 depending on what you interpret as the month. Commented Apr 1, 2015 at 9:29
  • Formula to convert date to number suggests it should be a number of days since 1900/01/01, which is what date.fromordinal() does. But that number is missing a digit then. Commented Apr 1, 2015 at 9:30
  • My file has this value; I don't know whether it is an ordinal or not. My client says it is an ordinal and told me "if you want to find the actual date, just do Format Cells in Excel for the given value", and that is how I get this date. @MartijnPieters Commented Apr 1, 2015 at 9:39
  • yeah, it is indeed an ordinal, but there's a bug in Excel which caused me to discount my initial theory. Commented Apr 1, 2015 at 9:51
  • related, older question: How to convert a python datetime.datetime to excel serial date number Commented Apr 16, 2021 at 6:17

7 Answers

49

The offset in Excel is the number of days since 1900/01/01, with 1 being the first of January 1900, so add the number of days as a timedelta to 1899/12/31:

from datetime import datetime, timedelta

def from_excel_ordinal(ordinal: float, _epoch0=datetime(1899, 12, 31)) -> datetime:
    if ordinal >= 60:
        ordinal -= 1  # Excel leap year bug, 1900 is not a leap year!
    return (_epoch0 + timedelta(days=ordinal)).replace(microsecond=0)

You have to adjust the ordinal by one day for any date after 1900/02/28; Excel has inherited a leap year bug from Lotus 1-2-3 and treats 1900 as a leap year. The code above returns datetime(1900, 2, 28, 0, 0) for both 59 and 60 to correct for this, with fractional values in the range [59.0 - 61.0) all being a time between 00:00:00.0 and 23:59:59.999999 on that day.

The above also supports serials with a fraction to represent time, but since Excel doesn't support microseconds those are dropped.
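As a quick check of the behaviour described above, here is a small sketch that repeats the same logic (so that it runs on its own) and prints the result for a few serials around the leap year bug, plus the value from the question. The chosen sample serials are only illustrative.

```python
from datetime import datetime, timedelta

def from_excel_ordinal(ordinal, _epoch0=datetime(1899, 12, 31)):
    # Same logic as the function above, repeated so this snippet is self-contained.
    if ordinal >= 60:
        ordinal -= 1  # skip Excel's nonexistent 1900-02-29
    return (_epoch0 + timedelta(days=ordinal)).replace(microsecond=0)

# Serials around the bug, then the value from the question.
results = {serial: from_excel_ordinal(serial) for serial in (1, 59, 60, 61, 38142)}
for serial, value in results.items():
    print(serial, value.date())
```

Serials 59 and 60 both land on 1900-02-28, and 38142 gives 2004-06-04, which matches what Excel shows as 04/06/2004 in day/month/year order.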

6
  • 1
    @Krish: the bug is popularized by Joel Spolsky: My First BillG Review
    – jfs
    Commented Apr 2, 2015 at 20:07
  • Are you sure the epoch is not December 31, 1899? datetime(1899, 12, 31) + timedelta(ordinal - (ordinal > 59))
    – jfs
    Commented Apr 2, 2015 at 20:20
  • @J.F.Sebastian I stuck to the documentation for Excel here; it makes little difference here to subtract one relative to 1900-01-01. Commented Apr 2, 2015 at 21:00
  • makes no sense to have _epoch as a parameter if we hard code the ordinal check for being > 59. Commented Oct 29, 2018 at 17:36
  • 2
    @FinanceGuyThatCantCode: The _epoch parameter is there to cache the value as a local variable, nothing more. This helps avoid having to create it for each call, or to have to look up a global (slightly slower). Commented Oct 29, 2018 at 17:40
8
from datetime import datetime, timedelta

def from_excel_ordinal(ordinal, epoch=datetime(1900, 1, 1)):
    # Adapted from above, thanks to @Martijn Pieters 

    if ordinal >= 60:  # serials in [59, 60) are still 1900-02-28 and must not be shifted
        ordinal -= 1  # Excel leap year bug, 1900 is not a leap year!
    inDays = int(ordinal)
    frac = ordinal - inDays
    inSecs = int(round(frac * 86400.0))

    return epoch + timedelta(days=inDays - 1, seconds=inSecs) # epoch is day 1

excelDT = 42548.75001           # Float representation of 27/06/2016  6:00:01 PM in Excel format  
pyDT = from_excel_ordinal(excelDT)

The above answer is fine for just a date value, but here I extend that solution to include the time and return a full datetime value, rounded to the nearest second.

1
  • There is no need to split out the days and the seconds; timedelta() does this for you when days is a floating point value. Commented Apr 20, 2022 at 15:09
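A minimal sketch of what the comment describes: `timedelta(days=...)` accepts a fractional number of days, so the manual split is not required. The helper name `from_excel_serial_rounded` is hypothetical, and the 1899-12-30 base date is an assumption that only holds for serials of 61 or more (dates from 1900-03-01 onwards). Rounding to the nearest second is done by adding half a second and dropping the microseconds.

```python
from datetime import datetime, timedelta

# Assumption: serial >= 61, so the 1899-12-30 base absorbs both the
# 1-based day count and Excel's fictitious 1900-02-29.
_BASE = datetime(1899, 12, 30)

def from_excel_serial_rounded(serial):
    # timedelta splits the float into days, seconds and microseconds itself.
    exact = _BASE + timedelta(days=serial)
    # Round to the nearest second instead of truncating.
    return (exact + timedelta(microseconds=500_000)).replace(microsecond=0)

py_dt = from_excel_serial_rounded(42548.75001)
print(py_dt)
```

For 42548.75001 the fractional part is about 64800.864 seconds, so rounding gives 18:00:01 on 2016-06-27, whereas simply dropping the microseconds would give 18:00:00.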
2

I would recommend the following:

import pandas as pd

def convert_excel_time(excel_time):
    
    # 1899-12-30 as the base date compensates for Excel's two-day offset
    return pd.to_datetime('1899-12-30') + pd.to_timedelta(excel_time, 'D')

Or

import datetime

def xldate_to_datetime(xldate):
    temp = datetime.datetime(1899, 12, 30)
    delta = datetime.timedelta(days=xldate)
    return temp+delta

Inspired by https://gist.github.com/oag335/9959241, but fixed to handle the two-day offset in Excel: one day because serial 1 is 1900-01-01, and one day for the leap year bug.

2
  • xldate_to_datetime(44000) gives 2020-06-20 where as the answer is 2020-06-18 Commented Oct 5, 2020 at 15:58
  • 1
    @PoornaPrudhvi is correct; the base date should be 1899-12-30. One day of offset because we should be adding to Dec 31, and another day of offset because of the leap year bug mentioned in the accepted answer.
    – JJL
    Commented Oct 14, 2021 at 19:11
1

I came to this question when trying to do the same above, but for entire columns within a df. I made this function, which did it for me:

import pandas as pd    
from datetime import datetime, timedelta
import copy as cp

def xlDateConv(df, *cols):
    fin = cp.deepcopy(df)  # work on a copy so the original df is untouched
    for col in cols:
        tempDt = []
        for value in fin[col]:
            # 1899-12-30 as base date compensates for Excel's two-day offset
            tempDate = datetime(1899, 12, 30)
            delta = timedelta(float(value))
            tempDt.append(pd.to_datetime(tempDate + delta))

        fin[col] = tempDt
    return fin

Note that each column name has to be passed as a separate string argument, which can most likely be improved (a list of columns as input, for instance). Also, the function returns a copy of the original df and does not change the original.
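The list-of-columns variant mentioned above could look like the following sketch. It relies on the `unit` and `origin` parameters of `pd.to_datetime` instead of a Python loop. The function name `xl_date_conv_vectorized` and the sample column `when` are made up for illustration, and the 1899-12-30 origin is an assumption that is only correct for serials of 61 or more.

```python
import pandas as pd

def xl_date_conv_vectorized(df, cols):
    # Hypothetical helper: takes a list of column names, returns a converted copy.
    out = df.copy()
    for col in cols:
        # Treat the numbers as days counted from 1899-12-30
        # (assumes every serial is >= 61, i.e. 1900-03-01 or later).
        out[col] = pd.to_datetime(out[col], unit="D", origin="1899-12-30")
    return out

df = pd.DataFrame({"when": [38142, 42548.75]})
converted = xl_date_conv_vectorized(df, ["when"])
print(converted)
```

The original `df` still holds the numeric serials afterwards; only the returned copy contains timestamps.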

Btw, partly inspired by this (https://gist.github.com/oag335/9959241).

1
  • 1
    Thanks a lot for this
    – Veronica
    Commented Jul 7, 2020 at 15:51
1

If you are working with Pandas this could be useful

    import xlrd
    import datetime as dt
    
    def from_excel_datetime(x):
        return dt.datetime(*xlrd.xldate_as_tuple(x, datemode=0))
    
    df['date'] = df.excel_date.map(from_excel_datetime)

If the date seems to be 4 years delayed, maybe you can try with datemode 1.

:param datemode: 0: 1900-based, 1: 1904-based.
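To illustrate what the two date modes mean without depending on xlrd, here is a standard-library sketch. The helper `from_excel_serial` and its `date1904` flag are hypothetical names, not part of xlrd. It assumes that in the 1904 system serial 0 is 1904-01-01 and that there is no leap year bug to correct, which puts the two systems 1462 days apart.

```python
from datetime import datetime, timedelta

def from_excel_serial(serial, date1904=False):
    # Hypothetical helper mirroring the idea behind xlrd's datemode argument.
    if date1904:
        # 1904 system: serial 0 is 1904-01-01, no 1900 leap year bug to correct.
        return datetime(1904, 1, 1) + timedelta(days=serial)
    if serial >= 60:
        serial -= 1  # 1900 system: skip the nonexistent 1900-02-29
    return datetime(1899, 12, 31) + timedelta(days=serial)

in_1900_system = from_excel_serial(38142)
# The same calendar day has a serial that is 1462 smaller in the 1904 system.
in_1904_system = from_excel_serial(38142 - 1462, date1904=True)
# Reading a 1900-based serial with the wrong mode shifts the date by 1462 days.
wrong_mode = from_excel_serial(38142, date1904=True)
print(in_1900_system, in_1904_system, wrong_mode - in_1900_system)
```

A shift of roughly four years and one day in the results is therefore a good hint that the workbook uses the other date system.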

0

I had the same problem and then I used this function: (source: https://gist.github.com/OmarArain/9959241)

import datetime
def xldate_to_datetime(xldate):
    xldate = int(xldate)
    temp = datetime.datetime(1899, 12, 30)  # base date that compensates for Excel's two-day offset
    delta = datetime.timedelta(days=xldate)
    return temp+delta

And then I applied it to my dataframe:

df['column_date'] = df['column_date'].apply(lambda x: xldate_to_datetime(x))
0

This is going to be the simplest solution yet: openpyxl has a built-in just for that:

from openpyxl.utils.datetime import from_excel

excel_ordinal = 38142
python_datetime = from_excel(excel_ordinal)

print(python_datetime, type(python_datetime)) # 2004-06-04 00:00:00 <class 'datetime.datetime'>

It takes care of the 1900-02-29 bug by itself. Just make sure you understand which date system you're working with – read more here.
