
Questions tagged [pandas]

Pandas is a Python library for data manipulation and analysis, e.g. dataframes, multidimensional time series and cross-sectional datasets commonly found in statistics, experimental science results, econometrics, or finance. Pandas is one of the main data science libraries in Python.

0 votes
0 answers
18 views

Vectorize a sampling of a dataframe based on filtered conditions?

I have two dataframes: one has three variables (one discrete, two continuous), and the other has the same three plus an additional fourth variable. For the purposes of a minimum ...
user13132640
-1 votes
0 answers
28 views

Expand Pandas dataframe horizontally using mean of adjacent columns [closed]

How can I expand a Pandas dataframe horizontally so that, between every two adjacent columns, a column is added containing the mean of those two columns (for each row), without using explicit ...
Dhruv Jain
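One possible approach to the question above, offered as a sketch rather than a definitive answer: build the widened array with NumPy slicing so that no loop runs over the data. The function name `interleave_adjacent_means` and the `<left>_<right>_mean` column naming are assumptions made for illustration, and the sketch assumes all columns are numeric.

```python
import numpy as np
import pandas as pd


def interleave_adjacent_means(df: pd.DataFrame) -> pd.DataFrame:
    """Insert the row-wise mean of each adjacent column pair between the pair.

    Hypothetical helper: assumes every column is numeric.
    """
    values = df.to_numpy(dtype=float)
    n_rows, n_cols = values.shape

    # Original columns go to even positions, pairwise means to odd positions.
    out = np.empty((n_rows, 2 * n_cols - 1), dtype=float)
    out[:, 0::2] = values
    out[:, 1::2] = (values[:, :-1] + values[:, 1:]) / 2

    # Column labels only (not data) are built with a comprehension.
    names = np.empty(2 * n_cols - 1, dtype=object)
    names[0::2] = df.columns
    names[1::2] = [
        f"{left}_{right}_mean"
        for left, right in zip(df.columns[:-1], df.columns[1:])
    ]
    return pd.DataFrame(out, index=df.index, columns=names)


sample = pd.DataFrame({"a": [1, 3], "b": [3, 5], "c": [5, 9]})
expanded = interleave_adjacent_means(sample)
print(expanded)
```

The slicing `0::2` / `1::2` is what keeps the originals and the means interleaved; converting to a float array first is a design choice that would not suit mixed-dtype frames.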
0 votes
0 answers
28 views

What is the recommended way to process large Spark DataFrames in chunks: `toPandas()` or `RDD.foreachPartition()`?

I am working with large datasets using PySpark and need to process my data in chunks of 500 records each. I am deciding between converting my Spark DataFrames to Pandas DataFrames using toPandas()...
Bo Yuan
  • 109
0 votes
0 answers
24 views

How to install geopandas on termux

I can't figure out what I am doing wrong on Termux. I have tried everything. I ran pip install geopandas, but it doesn't work: installing 'geopandas' on Termux always gets stuck, and I don't know why ...
StoreyedJoker72
0 votes
0 answers
16 views

num_samples = set(int(i.shape[0]) for i in tree.flatten(data)) IndexError: tuple index out of range (Tensorflow)

When trying to train a TensorFlow LSTM I am getting the following error: File "C:\Users\user\Documents\LSTM_Volatility.py", line 66, in <module> model.fit(x_train , ...
Harry Dunn
0 votes
2 answers
94 views

Lists in pandas dataframe cells

If we have a (part of a bigger) dataframe that shows what states individuals (rows) visited in a trip: df = pd.DataFrame({'states_visited': [['NY', 'CA'], 'CA', 'CA']}, index = ['John', 'Mary', 'Joe'])...
Saeed
  • 1,969
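The excerpt above is cut off before the actual question, so the following is only a sketch of one common way to work with such a column, assuming the goal is to treat list cells and scalar cells uniformly. The names `as_lists`, `exploded` and `visit_counts` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"states_visited": [["NY", "CA"], "CA", "CA"]},
    index=["John", "Mary", "Joe"],
)

# Wrap scalar cells in a one-element list so every cell has the same type.
as_lists = df["states_visited"].apply(
    lambda cell: cell if isinstance(cell, list) else [cell]
)

# One row per (person, state); the index label is repeated for list cells.
exploded = as_lists.explode()

# Example aggregate: how many people visited each state.
visit_counts = exploded.value_counts()
print(visit_counts)
```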
0 votes
0 answers
36 views

Pandas reads inconsistent date formats from Excel files generated by PDF4me API

I'm using the PDF4me API to convert PDF invoices into Excel files. The dates are read correctly by the API. However, when I open the Excel files, the dates are displayed inconsistently: Some dates ...
Fateh Muhammad
1 vote
2 answers
53 views

Identify starting row of actual data in Pandas DataFrame with merged header cells

My original df looks like this: df. Note that in the data frame the headers run through row 3, and the values for those headers start from row 4 onwards. The numbers of rows & columns ...
Debojit Roy
2 votes
1 answer
48 views

Read Met Office DataPoint JSON into pandas

I am using the MetOffice Datapoint API to download UK Weather data as a JSON. I would then like to read that JSON file into a pandas DataFrame. The format of the JSON file is as shown {"SiteRep&...
user284377
0 votes
1 answer
43 views

Filter datetime column with object as data type in python

I have a df with Timestamp and Value columns. Both have the 'object' dtype | Timestamp | Value | -------------------------------------- | 8/21/2023 12:00:00 AM | a | | 11/...
aditya tandel
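A minimal sketch of one way to approach this, assuming the strings follow the month/day/year 12-hour pattern visible in the first row of the excerpt: parse the column with `pd.to_datetime` and then filter with an ordinary comparison. The second sample row and the cutoff date are made up for illustration.

```python
import pandas as pd

# First row is taken from the excerpt; the second row is hypothetical.
df = pd.DataFrame(
    {
        "Timestamp": ["8/21/2023 12:00:00 AM", "9/15/2023 1:30:00 PM"],
        "Value": ["a", "b"],
    }
)

# Convert the object column to datetime64 using an explicit format.
df["Timestamp"] = pd.to_datetime(df["Timestamp"], format="%m/%d/%Y %I:%M:%S %p")

# Hypothetical cutoff: keep rows on or after 1 September 2023.
cutoff = pd.Timestamp("2023-09-01")
recent = df[df["Timestamp"] >= cutoff]
print(recent)
```

Passing an explicit `format` avoids day/month ambiguity; if the real data mixes formats, this call would raise and a different parsing strategy would be needed.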
1 vote
2 answers
43 views

Converting JSON list with multiple nested dictionaries to csv or excel

I have a JSON that I download from a website that has multiple nested dictionaries inside the main list. This is a very simplified version of it. [ { "id": 1, "...
TxHemi
  • 11
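For a list of nested dictionaries like the one described, `pd.json_normalize` is one option worth trying. The sketch below uses invented records (the real JSON is truncated in the excerpt), so the key names `name`, `details`, `owner` and `active` are assumptions.

```python
import pandas as pd

# Hypothetical records standing in for the downloaded JSON list.
records = [
    {"id": 1, "name": "alpha", "details": {"owner": {"first": "A", "last": "B"}, "active": True}},
    {"id": 2, "name": "beta", "details": {"owner": {"first": "C", "last": "D"}, "active": False}},
]

# Flatten nested dictionaries into columns such as "details_owner_first".
flat = pd.json_normalize(records, sep="_")

# Write to CSV text; pass a file path to to_csv to write to disk instead.
csv_text = flat.to_csv(index=False)
print(csv_text)
```

Nested lists inside the records are not expanded by this call; those would need the `record_path` argument or a separate `explode` step.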
-1 votes
1 answer
31 views

Series is empty when using .loc to slice

I'd like to get the items between Q1 and Q9. I used .loc to slice the Series object: s.loc['Q1':'Q2'] However, it returns an empty Series: Series([], dtype: object) Normally, I should get a return of ['...
NNInsomniaTonight
1 vote
2 answers
39 views

Map Dataframe Column Values Based on Two Dictionaries Conditionally [duplicate]

I have a dataframe df_test. I want to map the column color conditionally: if category is 'tv', then map using the tv_map dictionary; otherwise, map using the radio_map dictionary. I could split df_test by ...
shsh
  • 727
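A sketch of one way to do this without splitting the frame, assuming `category` and `color` are plain string columns: map the whole column with each dictionary and pick per row with `Series.where`. The sample rows, the dictionary contents and the output column name `mapped` are all invented for illustration.

```python
import pandas as pd

# Hypothetical data; only the names df_test, tv_map, radio_map come from the question.
df_test = pd.DataFrame(
    {
        "category": ["tv", "radio", "tv", "radio"],
        "color": ["r", "r", "g", "b"],
    }
)
tv_map = {"r": "red", "g": "green"}
radio_map = {"r": "rose", "b": "blue"}

is_tv = df_test["category"].eq("tv")

# Keep the tv mapping where is_tv is True, otherwise fall back to the radio mapping.
df_test["mapped"] = df_test["color"].map(tv_map).where(
    is_tv, df_test["color"].map(radio_map)
)
print(df_test)
```

Both mappings are computed for every row, which is wasteful on very large frames but keeps the code short; keys missing from the chosen dictionary come out as NaN.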
3 votes
2 answers
61 views

How do I get variable length slices of values using Pandas?

I have data that includes a full name and first name, and I need to make a new column with the last name. I can assume full - first = last. I've been trying to use slice with an index the length of ...
J Web
  • 65
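The `.str.slice` accessor takes scalar bounds, so a per-row start position needs another route. One sketch, assuming every full name begins with the corresponding first name, is a plain comprehension over the two columns. The column names `full`, `first` and `last` and the sample rows are assumptions.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {
        "full": ["Ada Lovelace", "Alan Mathison Turing"],
        "first": ["Ada", "Alan"],
    }
)

# Drop the first name's characters from the front of the full name,
# then strip the separating whitespace.
df["last"] = [
    full[len(first):].strip()
    for full, first in zip(df["full"], df["first"])
]
print(df)
```

If some full names do not start with the first name, this silently returns the wrong tail, so a `str.startswith` check per row may be worth adding.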
2 votes
0 answers
67 views

How to compare rows within the same csv file faster

I have a csv file containing 720,000 rows and 10 columns. The columns that are relevant to the problem are ['timestamp_utc', 'looted_by__name', 'item_id', 'quantity']. This file contains logs of items ...
banom
  • 31
