Filter data removes the entire data frame

My data is like this:

X1e  X2e  X3e  X4e
360    0    0    0
360    0    0    0
260    0    0    0
0      0    0    0
0      0    0    0
0      0    0    0
90     0    0    0
360    0  360    0
360    0  360  260

I want to remove the rows where the X1e or X4e value is between 0 and 270, but the rows with 0 should not be removed.

for (i in c(1,4)){
  e <- assign(paste("X",i, "e",sep = ""),i)
  dat <- dat[with(dat, !((e>0)&(e<270))), ] 
}

This removes all rows from dat and it becomes empty. Where is my problem?

Base R Solution:

dat[!(dat$X1e>0 & dat$X1e<270) & !(dat$X4e>0 & dat$X4e<270),]

OR

Using sqldf:

library(sqldf)
sqldf("select * from dat where X1e not between 1 AND 270 AND X4e not between 1 AND 270")

Output:

   X1e X2e X3e X4e
1 360   0   0   0
2 360   0   0   0
3   0   0   0   0
4   0   0   0   0
5   0   0   0   0
6 360   0 360   0
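A note on the sqldf query: SQL's BETWEEN is inclusive at both ends, so "not between 1 AND 270" keeps the 0 rows but would also drop an exact 270. To mirror the strict > 0 & < 270 condition of the base R answer, the query could instead use explicit comparisons (a sketch, using the same dat):

library(sqldf)
sqldf("select * from dat
       where not (X1e > 0 and X1e < 270)
         and not (X4e > 0 and X4e < 270)")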

Create the column names of interest

cidx <- paste0("X", c(1, 4), "e")

Perform the logical operations on each column

test <- !(df[,cidx] > 0 & df[,cidx] < 270)

Sum across each row (a logical 'and') to find the rows where all selected columns are TRUE

ridx <- rowSums(test) == length(cidx)

Subset the original data.frame

df[ridx,]
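Put together, the same logic fits in a single subsetting expression (a sketch; df here is any data.frame with these columns, e.g. the one built in the next answer):

cidx <- paste0("X", c(1, 4), "e")
df[rowSums(!(df[, cidx] > 0 & df[, cidx] < 270)) == length(cidx), ]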

Like this?

library(tidyverse)
df <- read.table(text = "X1e  X2e  X3e  X4e
360    0    0    0
360    0    0    0
260    0    0    0
0      0    0    0
0      0    0    0
0      0    0    0
90     0    0    0
360    0  360    0
360    0  360  260", header = TRUE)

df %>%
  filter_at(vars(X1e, X4e), all_vars(. <= 0 | . > 270))
  X1e X2e X3e X4e
1 360   0   0   0
2 360   0   0   0
3   0   0   0   0
4   0   0   0   0
5   0   0   0   0
6 360   0 360   0
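As a side note, filter_at() has since been superseded in dplyr; assuming dplyr >= 1.0.0 is available, the same filter can be written with if_all() (a sketch, not part of the original answer):

df %>%
  filter(if_all(c(X1e, X4e), ~ .x <= 0 | .x > 270))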

Yet another solution. I usually don't like subset() because it uses non-standard evaluation and it is slow, but here it goes.

subset(df, (X1e <= 0 | X1e >= 270) & (X4e <= 0 | X4e >= 270))
#  X1e X2e X3e X4e
#1 360   0   0   0
#2 360   0   0   0
#4   0   0   0   0
#5   0   0   0   0
#6   0   0   0   0
#8 360   0 360   0

Comments
  • I only want to remove the rows (260 0 0 0), (90 0 0 0) and (360 0 360 260), but it removes every row. I think my code is right. Is there a problem with my R environment? I reinstalled R, going from 3.5.1 to 3.4.4, but the problem remains.
  • "Where is my problem?" You're not using assign correctly, and in fact, you should avoid using assign altogether! See Why is using assign bad.
  • Your code is definitely not right. In your loop the variable e is 1 in the first iteration, so every row gets deleted, because 1 > 0 & 1 < 270 is always TRUE (see the corrected sketch after these comments).
  • "select * from dat where X2e not between 0 AND 360 AND X3e not between 0 AND 360"
  • But isn't all_vars(. <= 0 | . > 270) meant to be (. <= 0 | . > 270) for [X1e, X4e] and (. <= 0 | . >= 360) for [X2e, X3e]?
  • all_vars there is used only for vars(X1e, X4e)
  • I had an error: "Error in all_vars(. = 0 | . >= 270) : unused argument (. = 0 | . >= 270)"
  • Can I use this style: filter_at(vars(a,h), all_vars(((a+h)<3.8)&((a+h)>0)))
  • a and h are two data columns
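For completeness, here is a sketch of how the original loop could be fixed. The bug is that assign() merely returns the value i (1, then 4), so e never holds the column; indexing dat by the constructed column name avoids assign() entirely (the 0-270 bounds are taken from the question):

for (i in c(1, 4)) {
  col <- paste0("X", i, "e")               # "X1e", then "X4e"
  vals <- dat[[col]]                       # the actual column values
  dat <- dat[!(vals > 0 & vals < 270), ]   # drop rows strictly between 0 and 270
}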