1 file changed: +36 -0 lines changed
docs/source/user-guide/common-operations
@@ -129,3 +129,39 @@ The function :py:func:`~datafusion.functions.in_list` allows to check a column f
.limit(20)
.to_pandas()
)
+
+
+ Handling Missing Values
+ =======================
+
+ DataFusion provides methods to handle missing values in DataFrames:
+
+ fill_null
+ ---------
+
+ The ``fill_null()`` method replaces NULL values in specified columns with a provided value:
+
+ .. code-block:: python
+
+     # Fill all NULL values with 0 where possible
+     df = df.fill_null(0)
+
+     # Fill NULL values only in specific string columns
+     df = df.fill_null("missing", subset=["name", "category"])
+
+ The fill value will be cast to match each column's type. If casting fails for a column, that column remains unchanged.
+
+ fill_nan
+ --------
+
+ The ``fill_nan()`` method replaces NaN values in floating-point columns with a provided numeric value:
+
+ .. code-block:: python
+
+     # Fill all NaN values with 0 in floating-point columns
+     df = df.fill_nan(0)
+
+     # Fill NaN values in specific floating-point columns
+     df = df.fill_nan(99.9, subset=["price", "score"])
+
+ This only works on floating-point columns (float32, float64). The fill value must be numeric (int or float).
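To make the rules in the added documentation concrete, here is a minimal plain-Python sketch of the semantics it describes: per-column casting with "leave unchanged on failure" for ``fill_null``, and float-only replacement for ``fill_nan``. This is a model for illustration, not DataFusion's implementation. The free functions, the ``{name: (type, values)}`` column representation, and the ``TypeError`` raised for a non-numeric fill value are all assumptions made for the sketch.

```python
import math


def fill_null(columns, value, subset=None):
    """Model of the documented rule (hypothetical helper, not the DataFusion API).

    `columns` maps a column name to (python_type, list_of_values).
    The fill value is cast to each column's type; if the cast fails,
    that column is returned unchanged.
    """
    result = {}
    for name, (py_type, values) in columns.items():
        if subset is not None and name not in subset:
            result[name] = (py_type, list(values))
            continue
        try:
            cast_value = py_type(value)
        except (TypeError, ValueError):
            # Cast failed: the column stays as it was.
            result[name] = (py_type, list(values))
            continue
        result[name] = (py_type, [cast_value if v is None else v for v in values])
    return result


def fill_nan(columns, value, subset=None):
    """Model of the documented rule: only float columns are touched.

    Raising TypeError for a non-numeric fill value is an assumption;
    the documentation only states that the value must be int or float.
    """
    if not isinstance(value, (int, float)):
        raise TypeError("fill value must be numeric (int or float)")
    result = {}
    for name, (py_type, values) in columns.items():
        selected = subset is None or name in subset
        if py_type is float and selected:
            filled = [
                float(value) if isinstance(v, float) and math.isnan(v) else v
                for v in values
            ]
            result[name] = (py_type, filled)
        else:
            result[name] = (py_type, list(values))
    return result


# A tiny stand-in table: one int, one string, and one float column.
table = {
    "id": (int, [1, None, 3]),
    "name": (str, ["a", None, "c"]),
    "price": (float, [1.5, float("nan"), None]),
}

# "missing" cannot be cast to int or float, so only "name" changes.
by_string = fill_null(table, "missing")

# NaN is replaced in "price"; the NULL (None) there is left for fill_null.
by_nan = fill_nan(table, 0)
```

The sketch also shows that NULL and NaN are distinct in this model: ``fill_nan`` leaves the ``None`` in ``price`` untouched, and ``fill_null`` leaves the NaN untouched.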