Adding Column In Python Spark Apache
drop takes a column name and returns a new DataFrame without that column. Related issue: SPARK-7509, "Add drop column to Python DataFrame API" (Resolved; links to GitHub Pull Request #5818 by rakeshchalasani). Assignee: Rakesh Chalasani. Reporter: Reynold Xin.
Apr 16, 2017: I have been using Spark's DataFrame API for quite some time, and I often want to add many columns to a DataFrame (for example, creating more features from existing features for a machine learning model). Writing that many withColumn statements by hand gets tedious.
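One way to avoid a long chain of withColumn calls is to fold over a mapping of new column names to expressions. The sketch below is a minimal illustration, not an official Spark utility: the helper name add_columns, the demo function, and the column names a, b, a_plus_b and a_times_b are all made up for this example. The demo assumes pyspark is installed and starts a local session.

```python
from functools import reduce


def add_columns(df, columns):
    """Apply df.withColumn(name, expr) once for each entry in `columns`.

    `columns` maps new column names to Column expressions. This helper is
    hypothetical (not part of the Spark API); it only relies on `df`
    having a withColumn(name, expr) method that returns a new DataFrame.
    """
    return reduce(
        lambda acc, item: acc.withColumn(item[0], item[1]),
        columns.items(),
        df,
    )


def demo():
    """Illustrative usage; requires pyspark and runs a local Spark session."""
    from pyspark.sql import SparkSession, functions as F

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("add-columns-sketch")
        .getOrCreate()
    )
    df = spark.createDataFrame([(1.0, 2.0), (3.0, 4.0)], ["a", "b"])

    # Derived features, each built from the existing columns a and b.
    features = {
        "a_plus_b": F.col("a") + F.col("b"),
        "a_times_b": F.col("a") * F.col("b"),
    }
    wider = add_columns(df, features)

    # drop returns a new DataFrame without the named column; df is unchanged.
    narrower = wider.drop("b")
    narrower.show()

    spark.stop()
```

Calling demo() runs the example end to end. Because every withColumn call returns a new DataFrame, the fold never mutates the input. Spark 3.3 and later also provide DataFrame.withColumns, which accepts a similar name-to-expression mapping in a single call, and a single select("*", ...) with aliased expressions is another common alternative.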
Hi, all.
I have a DataFrame read from a CSV file in Scala. I would like to add another column that is computed from two existing columns: take the two values, perform an operation on them, and write the result into the new column. Specifically, I have a latitude column and a longitude column, and I would like to convert each pair into a Geotrellis Point and return the point.
It seems like it shouldn't be that hard, but I have tried many things to no avail.
From here, I would like to use the Geotrellis Point class to add a column where the lat and lon columns are used to generate coord = Point($"lon", $"lat"). I tried:
and received the following error:
Note that .printSchema() verifies that lon and lat are both double.
Thanks in advance!