All Questions

0 votes
1 answer
62 views

Replace iterrow loops in pandas matrices with something else to shorten the running time

This post is modified from this one: https://codereview.stackexchange.com/posts/292885/edit (Alternatives to iterrow loops in python pandas dataframes). I have a piece of code to calculate price ...
Laura's user avatar
  • 61
5 votes
2 answers
632 views

Alternatives to iterrow loops in python pandas dataframes

I have a piece of code to calculate price sensitivity based on the product and its rating. Below is the original data set with product type, reported year, customer’s rating, price per unit, and ...
Laura's user avatar
  • 61
2 votes
1 answer
37 views

Maintain a log containing values if certain conditions are met

I'm trying to capture profits and set a stop loss in my trading strategy. I want the stop loss to be set daily based on the past data and if the current price, i.e., price for the date falls below the ...
driver's user avatar
  • 232
2 votes
1 answer
222 views

Python using generators with Excelwriter - Performance

I'm looking to understand if my code has an obvious blockage or performance pain point that will cause it to operate slower or use more memory than it should. The current Excel file I am processing ...
sayth's user avatar
  • 131
3 votes
1 answer
268 views

Transferring dataframe columns into dataframe rows

I have the following data: ...
mahmoud988's user avatar
1 vote
1 answer
108 views

Custom neural network implementation in TensorFlow to compare normalisation vs. no normalisation on data

I am working on a sports prediction multi-class classification problem, and wanted to compare the differences in model performance between normalised and non-normalised data. You can see the 2 ...
pastybake2002's user avatar
3 votes
1 answer
210 views

Machine learning training, hyperparameter tuning and testing with 3 different models

I am trying to solve a multi-class classification problem involving predicting the outcome of a football match (target variable = Win, Lose or Draw). With a dataset of 2280 rows, which is 6 seasons of ...
pastybake2002's user avatar
3 votes
1 answer
74 views

Calculating premium splits for policies

Looking for a better approach to write the transformation below using Python. Is it possible to avoid the loop and still achieve the desired output? It is too slow for 10 million rows. ...
user278818's user avatar
5 votes
2 answers
98 views

Creating csvs using Pandas on large dataset for document retrieval

I am trying to build a usable NLP corpus but am getting bottlenecked by how long the program takes (200 hours). With so much data I know that optimizing my code even a little bit will net me huge time ...
evader110's user avatar
  • 143
1 vote
1 answer
60 views

Extending die roll simulations for complex data science tasks

I've developed a Python script that simulates die rolls and analyses the results. I'm now looking to extend and modify this code for more complex data science tasks and simulations. Is this code ...
Attila Vajda's user avatar
3 votes
3 answers
152 views

Syntactic sugar for derived variables from Pandas DataFrame columns

Update: Okay, after trying to use this for a while, I think it's probably a bad idea. Please use (lambda x: x["a"] + x["b"])(df) if really ...
user1537366's user avatar
0 votes
2 answers
120 views

Optimize Python code which indicates duplicated values in an Excel file [closed]

I wrote this code to indicate duplicated values. It actually works but I hope to know if there's another possible solution to optimize this process. Thanks. ...
peternish's user avatar
1 vote
0 answers
63 views

Combined or separate data-cleaning routine

I am a junior data engineer with 3 years of experience with Python. I write a lot of Python code for my job and I came up with this question I can't solve on my own. I don't have the chance to ...
Izem's user avatar
  • 11
2 votes
1 answer
67 views

Use row data from a database to find rows in dataframes that match and use data to generate a separate dataframe

I have a DataFrame (database_df) that contains the general record with the IDs that are the same team in each of the lines, containing these values I need to find ...
Digital Farmer's user avatar
1 vote
2 answers
53 views

Improve performance when updating DataFrame rows based on complex criteria

My question got rejected the last time so I am trying a better approach to getting a solution: ...
PyNoob's user avatar
  • 21
