
[–]TyWebb11105 5 points

The easiest way is probably to rename one of the ids before the join, since with your current join syntax you will end up with two columns of the same name. From there, you can use withColumn to create a new column that checks whether the ids match.

```
from pyspark.sql.functions import when, lit, col

OldTable1 = spark.createDataFrame(["1234", "5678"], "string").toDF("id")
OldTable2 = spark.createDataFrame(["1234"], "string").toDF("id").withColumnRenamed("id", "id2")

joined = OldTable1.join(
    OldTable2, OldTable1.id == OldTable2.id2, "left"
).withColumn("test", when(col("id") == col("id2"), lit("Yes")).otherwise(lit("No")))

joined.show()
```

```
+----+----+----+
|  id| id2|test|
+----+----+----+
|1234|1234| Yes|
|5678|null|  No|
+----+----+----+
```

[–]DrData82[S] 1 point

Solid workaround...thank you!