All Questions

0 votes
1 answer
47 views

Pyspark Filtering Array inside a Struct column

I have a column in my Spark DataFrame that has this schema: root |-- my_feature_name: struct (nullable = true) | |-- first_profiles: map (nullable = true) | | |-- key: string | | |--...
MathLal's user avatar
  • 392
1 vote
2 answers
40 views

How to keep the first appearance of a value while filtering everything else out in R?

This is the appearance of my dataset currently. I want to include patient 1 data until the first '1' occurs in 'test.result' then remove any information about patient 1 after that. current dataset ...
DN98024's user avatar
  • 19
1 vote
2 answers
42 views

How to get column-wise summary statistics with missing codes?

I have written a custom function ord_table() to extract summary statistics from a series of databases. To get those summary statistics, I have to filter out missing data codes (all codes are large ...
Suzanne Segerstrom's user avatar
1 vote
1 answer
59 views

Conditional filtering of dataframe in R

How can I dplyr::filter() my DATA to catch the rows for IDs whose Language value changes from other languages to "English" between 'Type!=5F' and 'Type==5F'? For example, ID==1 has ...
Simon Harmel's user avatar
  • 1,449
0 votes
1 answer
45 views

How to write a function to read csv files with different separators in pandas in Python?

I have a bunch of CSV files for different years named my_file_2019, my_file_2020, my_file_2023 and so on. Some files have tab separator while others have semi-colon. I want to write a common function ...
hbstha123's user avatar
  • 1,456
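
One possible approach to the question above, as a minimal sketch: guess the separator from the header line, then pass it to `pd.read_csv`. The helper names `detect_separator` and `read_yearly_csv`, the `.csv` extension, and the sample columns are assumptions made for illustration; only the file name pattern and the two separators (tab and semicolon) come from the question.

```python
import os
import tempfile

import pandas as pd


def detect_separator(path, candidates=("\t", ";")):
    """Guess the separator by counting each candidate in the header line."""
    with open(path, encoding="utf-8") as fh:
        header = fh.readline()
    # Pick the candidate that appears most often in the header.
    return max(candidates, key=header.count)


def read_yearly_csv(path):
    """Read a delimited file whose separator is either a tab or a semicolon."""
    return pd.read_csv(path, sep=detect_separator(path))


# Demo with two throwaway files (hypothetical contents).
with tempfile.TemporaryDirectory() as tmp:
    tab_path = os.path.join(tmp, "my_file_2019.csv")
    semi_path = os.path.join(tmp, "my_file_2020.csv")
    with open(tab_path, "w", encoding="utf-8") as fh:
        fh.write("year\tvalue\n2019\t1\n")
    with open(semi_path, "w", encoding="utf-8") as fh:
        fh.write("year;value\n2020;2\n")

    df_tab = read_yearly_csv(tab_path)
    df_semi = read_yearly_csv(semi_path)
```

pandas can also sniff the delimiter itself when called with `sep=None` and `engine="python"`; the explicit count above simply keeps the choice restricted to the two separators mentioned.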
2 votes
4 answers
92 views

Filter DataFrame events not in time windows DataFrame

I have a DataFrame of events (Event Name - Time) and a DataFrame of time windows (Start Time - End Time). I want to get a DataFrame containing only the events not in any of the time windows. I am ...
Yakir Shlezinger's user avatar
1 vote
3 answers
46 views

Finding rows in a data.frame that are the same on one variable but different on another variable in R

In my DATA below, how could I filter the rows where the Nm values are the same but Descr values are different to achieve my Desired_out below? DATA <- read.table(header=T, text =" Cd Nm ...
Simon Harmel's user avatar
  • 1,449
1 vote
3 answers
82 views

Remove elements with same prefix from a string in a column

and thank you for being part of an awesome community of learners and explorers! :) I have an issue with removing elements from a character column. This answer pointed me in the right direction (split ...
ferallOut's user avatar
1 vote
5 answers
79 views

How to filter out the top Nth value of each row in an R dataframe?

I have an R dataframe and want to filter out the top Nth value of each row, with the remaining values set to 0. For example, suppose the dataframe looks like the following table: 0.5 0.3 0.2 0.15 0.9 ...
Li Ma's user avatar
  • 31
0 votes
2 answers
37 views

Pandas - Filter - TypeError: 'in <string>' requires string as left operand, not list

I am exploring Pandas filter and came across this error while using the query below: df2.filter(like=['Republic', 'United'], axis=0). How do I provide a list in the like parameter in ...
srijan bansal's user avatar
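
The error in the question above arises because `like` expects a single string. One possible workaround, sketched here with made-up data, is to join the terms into a single pattern and pass it through the `regex` parameter of `DataFrame.filter` instead. The index labels and the `population` column are hypothetical; only `df2`, the two search terms, and `axis=0` come from the question.

```python
import re

import pandas as pd

# Hypothetical frame; the row labels play the role of country names.
df2 = pd.DataFrame(
    {"population": [10, 20, 30, 40]},
    index=["Czech Republic", "United States", "France", "United Kingdom"],
)

terms = ["Republic", "United"]

# Escape each term so it is matched literally, then OR them together.
pattern = "|".join(re.escape(term) for term in terms)

# axis=0 filters on the row labels rather than the column names.
matched = df2.filter(regex=pattern, axis=0)
```

Here `matched` keeps the rows whose label contains either term and preserves their original order.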
2 votes
3 answers
69 views

Extracting entries in a dataframe corresponding to n smallest positive values and n largest negative values of a certain variable in r

Imagine I have a table like the following one. set.seed(12) table = data.frame( value = rnorm(n = 10), par = runif(n = 10, min = - 1, max = 1) ) How can I extract the entries of value ...
Mr Frog's user avatar
  • 446
0 votes
1 answer
43 views

Filtering Dataframe for first letter of a Classification Code in a column entry that has multiple of them

I am trying to filter a Dataframe of patents into their classification codes. I want to only get the patents that have a specific first letter in the code, but each column entry has multiple of these ...
Cio's user avatar
  • 3
0 votes
0 answers
53 views

Pandas filtering yields null values

I have a df which I am filtering based on a particular column. This is my code to do that: test_table[(test_table['Item']=='1')]. Ideally this code should return the rows where the value of column '...
AnonymousMe's user avatar
1 vote
0 answers
30 views

Is there another way to select rows with null values within a function in python? [duplicate]

I've been working on cleaning a dataset and so far I've been using missing_df = main_df[main_df.col.isnull()] to select rows with null values under 'col'. Since I'm using this a lot I'm trying to put ...
jPV's user avatar
  • 23
3 votes
1 answer
55 views

How to filter dataframe column names containing 2 specified substrings?

I need the column names from the dataframe that contain both the term software and packages. I'm able to filter out columns containing one string, e.g.: software_cols = df.filter(regex='Software|...
say_n's user avatar
  • 45
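
A minimal sketch of one way to get the column names that contain both terms: test each name against every required substring. The column names below are invented for illustration, and the case-insensitive comparison is an assumption, since the question does not say whether case matters.

```python
import pandas as pd

# Hypothetical column names; the frame has no rows.
df = pd.DataFrame(
    columns=[
        "Software packages used",
        "Software version",
        "R packages",
        "packages_software_list",
    ]
)

required = ("software", "packages")

# Keep a column only if every required term occurs in its lowercased name.
both_cols = [
    col for col in df.columns
    if all(term in col.lower() for term in required)
]
```

Selecting `df[both_cols]` then yields only those columns.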
