Open
Description
regarding item: src/main/scala/com/sparkbyexamples/spark/dataframe/examples/DropColumn.scala
I am running these examples in Azure PySpark 3.3, and I noticed that df.drop('colname') does NOT drop the column from the df dataframe itself. It only removes the column from the dataframe returned by that statement.
Try these three lines in pyspark:
df.drop("first_name").printSchema()  # prints the schema without the first_name column, same as in your examples
df.drop("first_name")  # run this without displaying output
df.printSchema()  # prints the schema WITH the first_name column
Conclusion: the df.drop('col') statement does NOT change the df dataframe.