Given a NumPy array that may contain invalid values such as NaN, inf, or non-numeric entries, the task is to remove all rows that contain any such value and keep only rows with clean numerical data.
For Example:
Input: [[10.5, 22.5, 3.8], [41, nan, nan]]
Output: [[10.5, 22.5, 3.8]]
Let's explore different methods to remove rows that contain non-numeric values in Python.
Using np.isfinite()
This method keeps only those rows where every element is a valid finite number. It works by detecting and removing rows that contain NaN, inf, or -inf in one check.
import numpy as np
a = np.array([[10.5, 22.5, 3.8],
              [41, np.nan, np.nan]])
res = a[np.isfinite(a).all(axis=1)]
print(res)
Output
[[10.5 22.5  3.8]]
Explanation:
- np.isfinite(a) creates a mask where valid numbers = True, invalid values = False.
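- .all(axis=1) reduces the mask to one Boolean per row, which is True only when every value in that row is True.
- a[...] applies that row mask and returns only the rows in which every value is finite.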
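To make the intermediate masks visible, here is a minimal sketch on a slightly larger array. The sample data and the names finite_mask and row_mask are assumptions chosen for illustration; they are not part of the example above.

```python
import numpy as np

# Hypothetical sample data: one clean row, then rows with NaN, inf, and -inf.
a = np.array([[1.0, 2.0, 3.0],
              [4.0, np.nan, 6.0],
              [7.0, np.inf, 9.0],
              [-np.inf, 11.0, 12.0]])

finite_mask = np.isfinite(a)        # element-wise: True for finite values
row_mask = finite_mask.all(axis=1)  # row-wise: True only if every element is finite
res = a[row_mask]                   # Boolean indexing keeps the True rows

print(row_mask)  # [ True False False False]
print(res)       # [[1. 2. 3.]]
```

Only the first row survives, since each of the other rows has at least one False entry in finite_mask.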
Using ~np.isnan(a).any(axis=1)
This method filters rows by checking whether they contain any NaN. Rows without NaN are kept; rows with even one NaN are removed. Unlike np.isfinite(), it does not remove rows that contain inf or -inf.
import numpy as np
a = np.array([[10.5, 22.5, 3.8],
              [41, np.nan, np.nan]])
clean = a[~np.isnan(a).any(axis=1)]
print(clean)
Output
[[10.5 22.5  3.8]]
Explanation:
- np.isnan(a) returns a mask that is True where elements are NaN.
- .any(axis=1) checks whether any element in a row is NaN.
- ~ inverts the mask so that True marks the rows without NaN, and a[...] keeps only those rows.
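The difference between the two checks shows up as soon as the data contains an infinity. The following sketch uses made-up sample data, and the names nan_free and finite_only are illustrative only.

```python
import numpy as np

# Hypothetical sample data: the second row holds NaN, the third holds inf.
a = np.array([[1.0, 2.0, 3.0],
              [4.0, np.nan, 6.0],
              [7.0, np.inf, 9.0]])

nan_free = a[~np.isnan(a).any(axis=1)]       # drops only the NaN row
finite_only = a[np.isfinite(a).all(axis=1)]  # drops the NaN row and the inf row

print(nan_free.shape)     # (2, 3): the inf row is still present
print(finite_only.shape)  # (1, 3)
```

If infinities should count as invalid in your data, np.isfinite() is likely the safer choice; the NaN-only check fits cases where inf is a legitimate value.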
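Using a Manual Numeric Check
This method validates each row by checking whether every value, once converted to a string, looks like a plain decimal number. It works on arrays with dtype=object that contain strings or other Python objects, where np.isfinite() and np.isnan() raise a TypeError.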
import numpy as np
a = np.array([[10.5, 22.5, 3.8],
              ["41", "x", "50"]], dtype=object)
m = []  # rows that pass the check
for r in a:
    ok = True
    for v in r:
        s = str(v)
        # Drop one '.' and one '-', then require only digits to remain
        if not s.replace('.', '', 1).replace('-', '', 1).isdigit():
            ok = False
            break
    if ok:
        m.append(r.astype(float))
res = np.array(m)
print(res)
Output
[[10.5 22.5  3.8]]
Explanation:
- str(v) converts each value to a string so the numeric check never errors.
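- s.replace('.', '', 1).replace('-', '', 1).isdigit() removes one decimal point and one minus sign, then checks whether only digits remain. This rejects scientific notation such as "1e5" as well as "nan" and "inf", but it does not check where the minus sign sits, so a string like "5-3" passes.
- A row is valid only if all of its values pass the numeric check.
- r.astype(float) converts each valid row to float. For a string that slipped through the check, such as "5-3", this conversion raises a ValueError.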
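As an alternative to the string test above, one option is to let float() do the parsing and then reject non-finite results. This is a sketch of a different technique, not the method shown above; the helper name is_finite_number and the extra sample rows are hypothetical.

```python
import math
import numpy as np

def is_finite_number(v):
    """Hypothetical helper: True if v converts to a finite float."""
    try:
        return math.isfinite(float(v))
    except (TypeError, ValueError, OverflowError):
        # float() rejects None, "x", "5-3", and similar inputs
        return False

a = np.array([[10.5, 22.5, 3.8],
              ["41", "x", "50"],
              ["1e3", "+7", "-2.5"],
              ["nan", "5-3", None]], dtype=object)

# One Boolean per row: True only if every value in the row is a finite number
keep = [all(is_finite_number(v) for v in row) for row in a]

# Convert the surviving rows explicitly, value by value
res = np.array([[float(v) for v in row] for row, ok in zip(a, keep) if ok])

print(keep)  # [True, False, True, False]
print(res)
```

Under these assumptions, rows 0 and 2 are kept: float() accepts scientific notation and a leading sign, while "nan" parses but is then rejected by math.isfinite(). Note that float() also accepts Booleans, so True would count as 1.0 here.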