All Questions
604
questions
0
votes
1
answer
62
views
Replace iterrow loops in pandas matrices with something else to shorten the running time
This post is modified from this one: https://codereview.stackexchange.com/posts/292885/edit (Alternatives to iterrow loops in python pandas dataframes).
I have a piece of code to calculate price ...
5
votes
2
answers
632
views
Alternatives to iterrow loops in python pandas dataframes
I have a piece of code to calculate price sensitivity based on the product and its rating.
Below is the original data set with product type, reported year, customer’s rating, price per unit, and ...
2
votes
1
answer
37
views
Maintain a log containing values if certain conditions are met
I'm trying to capture profits and set a stop loss in my trading strategy. I want the stop loss to be set daily based on the past data and if the current price, i.e., price for the date falls below the ...
2
votes
1
answer
222
views
Python using generators with Excelwriter - Performance
I'm looking to understand if my code has an obvious blockage or performance pain point that will cause it to operate slower or use more memory than it should.
The current Excel file I am processing ...
3
votes
1
answer
268
views
Transferring dataframe columns into dataframe rows
I have the following data:
...
1
vote
1
answer
108
views
Custom neural network implementation in TensorFlow to compare normalisation vs. no normalisation on data
I am performing a sports prediction multi-class classification problem, and wanted to compare the differences in model performance between normalised and non-normalised data. You can see the 2 ...
3
votes
1
answer
210
views
Machine learning training, hyperparameter tuning and testing with 3 different models
I am trying to solve a multi-class classification problem: predicting the outcome of a football match (target variable = Win, Lose or Draw). With a dataset of 2280 rows, which is 6 seasons of ...
3
votes
1
answer
74
views
Calculating premium splits for policies
Looking for a better approach to write the transformation below using Python. Is it possible to avoid the loop and still achieve the desired output?
It is too slow for 10 million rows.
...
5
votes
2
answers
98
views
Creating csvs using Pandas on large dataset for document retrieval
I am trying to build a useable NLP corpus but getting bottlenecked by how long the program takes (200 hours). With so much data I know that optimizing my code even a little bit will net me huge time ...
1
vote
1
answer
60
views
Extending die roll simulations for complex data science tasks
I've developed a Python script that simulates die rolls and analyses the results. I'm now looking to extend and modify this code for more complex data science tasks and simulations.
Is this code ...
3
votes
3
answers
152
views
Syntactic sugar for derived variables from Pandas DataFrame columns
Update: Okay, after trying to use this for a while, I think it's probably a bad idea. Please use (lambda x: x["a"] + x["b"])(df) if really ...
0
votes
2
answers
120
views
Optimize Python code that flags duplicated values in an Excel file [closed]
I wrote this code to indicate duplicated values. It actually works but I hope to know if there's another possible solution to optimize this process. Thanks.
...
1
vote
0
answers
63
views
Combined or separate data-cleaning routine
I am a junior data engineer with 3 years of experience with Python. I write a lot of Python code for my job and I came up with this question that I can't solve on my own. I don't have the chance to ...
2
votes
1
answer
67
views
Use row data from a database to find rows in dataframes that match and use data to generate a separate dataframe
I have a DataFrame (database_df) that contains the general records, where the IDs in each line belong to the same team. Using these values, I need to find ...
1
vote
2
answers
53
views
Improve performance when updating DataFrame rows based on complex criteria
My question got rejected the last time so I am trying a better approach to getting a solution:
...