How do I select a subset of a DataFrame? — pandas 2.3.2 documentation
This section shows how to retrieve a specific row or column from a DataFrame.
Import the library:
import pandas as pd
Create the following DataFrame:
data = {
    "WorkerID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Age": [25, None, 35, 30, 24, 28, None, 32, 27, None],
    "Salary": [50000, 54000, None, 58000, 45000, 60000, 49000, None, None, 47000]
}
df = pd.DataFrame(data)
print("DataFrame:\n", df)
Output:
DataFrame:
    WorkerID   Age   Salary
0         1  25.0  50000.0
1         2   NaN  54000.0
2         3  35.0      NaN
3         4  30.0  58000.0
4         5  24.0  45000.0
5         6  28.0  60000.0
6         7   NaN  49000.0
7         8  32.0      NaN
8         9  27.0      NaN
9        10   NaN  47000.0
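The introduction mentions retrieving a specific row or column. As a minimal sketch of what that can look like on this DataFrame, the block below uses bracket indexing for columns and `iloc`/`loc` for rows. The variable names (`ages`, `age_salary`, `first_row_by_position`, `first_row_by_label`) are illustrative and not part of the original example.

```python
import pandas as pd

# Same data as in the example above.
data = {
    "WorkerID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Age": [25, None, 35, 30, 24, 28, None, 32, 27, None],
    "Salary": [50000, 54000, None, 58000, 45000, 60000, 49000, None, None, 47000],
}
df = pd.DataFrame(data)

# Select one column by label; the result is a Series.
ages = df["Age"]

# Select several columns with a list of labels; the result is a DataFrame.
age_salary = df[["Age", "Salary"]]

# Select one row by position (iloc) or by index label (loc).
# With the default RangeIndex, both refer to the same row here.
first_row_by_position = df.iloc[0]
first_row_by_label = df.loc[0]

print(ages.head(3))
print(age_salary.shape)  # (10, 2)
print(first_row_by_position)
```

Position-based and label-based selection only coincide because the index is the default 0..9 range; with a custom index they would generally differ.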
The df.values attribute returns a NumPy representation of the DataFrame. This attribute will be useful later.
# The df.values attribute returns a NumPy representation of the DataFrame
print("\nNumpy representation of DataFrame:\n", df.values)
Output:
Numpy representation of DataFrame:
 [[1.0e+00 2.5e+01 5.0e+04]
 [2.0e+00     nan 5.4e+04]
 [3.0e+00 3.5e+01     nan]
 [4.0e+00 3.0e+01 5.8e+04]
 [5.0e+00 2.4e+01 4.5e+04]
 [6.0e+00 2.8e+01 6.0e+04]
 [7.0e+00     nan 4.9e+04]
 [8.0e+00 3.2e+01     nan]
 [9.0e+00 2.7e+01     nan]
 [1.0e+01     nan 4.7e+04]]
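The pandas documentation recommends `DataFrame.to_numpy()` over the `values` attribute for new code. The sketch below, using the same data, suggests that both give the same array here, and that every entry is a float because the missing values are stored as NaN. The name `arr` is illustrative.

```python
import numpy as np
import pandas as pd

# Same data as in the example above.
data = {
    "WorkerID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Age": [25, None, 35, 30, 24, 28, None, 32, 27, None],
    "Salary": [50000, 54000, None, 58000, 45000, 60000, 49000, None, None, 47000],
}
df = pd.DataFrame(data)

# to_numpy() returns the same 2-D array as df.values for this DataFrame.
arr = df.to_numpy()

print(type(arr))
print(arr.shape)  # (10, 3): ten rows, three columns
print(arr.dtype)  # float64: the integer WorkerID column is upcast to match the float columns

# Compare with df.values, treating NaN entries in the same position as equal.
print(np.array_equal(arr, df.values, equal_nan=True))
```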
# Print the NumPy representation of the 'Age' column
print("\nNumpy representation of the 'Age' column:\n", df['Age'].values)
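To make the difference from the whole-DataFrame case concrete, the following sketch inspects the array obtained from a single column. It assumes the same data as above; `age_values` and `missing` are illustrative names, and the NaN check with `np.isnan` is just one possible way to use the array.

```python
import numpy as np
import pandas as pd

# Same data as in the example above.
data = {
    "WorkerID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Age": [25, None, 35, 30, 24, 28, None, 32, 27, None],
    "Salary": [50000, 54000, None, 58000, 45000, 60000, 49000, None, None, 47000],
}
df = pd.DataFrame(data)

# A single column is a Series, so its NumPy representation is one-dimensional.
age_values = df["Age"].to_numpy()
print(age_values.ndim)   # 1
print(age_values.shape)  # (10,)

# Boolean mask marking the entries that are NaN (the None values in the input).
missing = np.isnan(age_values)
print(missing.sum())  # 3
```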