Spark – Creating a DataFrame using a case class schema
// Creating DataFrames from RDDs of case classes; toDF needs the implicits in scope
import spark.implicits._

case class Employee(name: String, age: Int, depId: String)
case class Department(id: String, name: String)

val employeesRDD = sc.parallelize(Seq(
  Employee("Arjun", 33, "IT"),
  Employee("Mathi", 45, "IT"),
  Employee("Gautam", 26, "MKT"),
  Employee("Anand", 34, "MKT"),
  Employee("Sarav", 29, "IT"),
  Employee("Viji", 21, "Intern")
))
val departmentsRDD = sc.parallelize(Seq(
  Department("IT", "IT Department"),
  Department("MKT", "Marketing Department"),
  Department("FIN", "Finance & Controlling")
))

val employeesDF = employeesRDD.toDF()
val departmentsDF = departmentsRDD.toDF()
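The case-class fields become the DataFrame schema: each field name turns into a column, and Scala primitives such as Int are inferred as non-nullable. A quick way to check:

```scala
// Schema is derived from the Employee case class
employeesDF.printSchema()
// root
//  |-- name: string (nullable = true)
//  |-- age: integer (nullable = false)
//  |-- depId: string (nullable = true)
```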
------------------------------------------------------------
// Mark the (small) department data for broadcast, then join it with the employees
import org.apache.spark.sql.functions.broadcast

val tmpDepartments = broadcast(departmentsDF.as("departments"))

employeesDF.join(tmpDepartments, // already marked for broadcast, no need to wrap it again
  $"depId" === $"id", // join on employees.depId == departments.id
  "inner").show()
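To confirm that Spark actually chose a broadcast join, inspect the physical plan; it should contain a BroadcastHashJoin node rather than a SortMergeJoin:

```scala
// The physical plan lists the join strategy Spark selected
employeesDF.join(tmpDepartments, $"depId" === $"id", "inner").explain()
```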
------------------------------------------------------------
// Broadcast and accumulator variables:
// A broadcast variable is a read-only value shipped once to every executor
val broadcastVar = sc.broadcast(Array(1, 2, 3))
broadcastVar.value // Array(1, 2, 3)
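The heading also mentions accumulators, the write-only counterpart of broadcast variables: executors add to them, and only the driver reads the result. A minimal example using Spark's built-in LongAccumulator:

```scala
// Accumulators aggregate values from tasks back to the driver
val accum = sc.longAccumulator("My Accumulator")
sc.parallelize(Seq(1, 2, 3, 4)).foreach(x => accum.add(x))
accum.value // 10
```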
------------------------------------------------------------
SQL-style syntax (pseudocode; broadcast() is not a valid SQL function here):
SELECT customer.address, state_lookup.state_name FROM customer JOIN broadcast(state_lookup) ON customer.state_id = state_lookup.state_id;
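In real Spark SQL, a broadcast is requested with a hint comment rather than a broadcast() call. A sketch, assuming DataFrames named customerDF and stateLookupDF (hypothetical names) registered as temp views:

```scala
// Register the tables as temp views so they are visible to SQL
customerDF.createOrReplaceTempView("customer")
stateLookupDF.createOrReplaceTempView("state_lookup")

// The BROADCAST hint asks Spark to broadcast state_lookup in this join
spark.sql("""
  SELECT /*+ BROADCAST(state_lookup) */ customer.address, state_lookup.state_name
  FROM customer
  JOIN state_lookup ON customer.state_id = state_lookup.state_id
""").show()
```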
------------------------------------------------------------
// Broadcasting one small table that is joined against two larger ones
val df1 = spark.range(100) // big table 1
val df2 = spark.range(100) // small table (to be broadcast)
val df3 = spark.range(100) // big table 2

val broadcastedDF = broadcast(df2) // force an explicit broadcast
broadcastedDF.cache() // persist it, since it is reused in two joins

val broadcastedJoinDF1 = df1.join(broadcastedDF, Seq("id")) // join big table 1 with the broadcast DF
val broadcastedJoinDF2 = df3.join(broadcastedDF, Seq("id")) // join big table 2 with the broadcast DF
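Besides the explicit broadcast() call, Spark broadcasts a table automatically when its estimated size is below spark.sql.autoBroadcastJoinThreshold (10 MB by default). The setting can be read and adjusted at runtime:

```scala
// Current threshold in bytes; tables smaller than this are broadcast automatically
spark.conf.get("spark.sql.autoBroadcastJoinThreshold")

// Raise it to 50 MB, or set it to -1 to disable automatic broadcast joins
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", (50 * 1024 * 1024).toString)
```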