apache-spark


Scala: Defining Primary Key in Data Frame


Is it possible to define a primary key when using a DataFrame?
I have two DataFrames, which I have joined on "ID". Now I want to select "Date" and also receive the primary key "ID" in the output.
val join1 = df_2.join(df_3, df_3.col("ID") === df_2.col("APPLICATION2_ID"))
val joinFinal = join1.join(df_1, df_1.col("ID") === join1.col("ID"))
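Spark DataFrames have no notion of a primary key, so "ID" is just an ordinary column. Joining on an expression such as df_1.col("ID") === join1.col("ID") keeps both "ID" columns in the result, which makes a later select on "ID" ambiguous. To get rid of the duplicate column when joining on columns with the same name, use the Seq version of join: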
val joinFinal = join1.join(df_1, Seq("ID"))
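For illustration, here is a minimal sketch of the whole flow, written in spark-shell style and assuming Spark 2.x or later (for SparkSession). The sample rows and the "Name" and "Amount" columns are made up for the example; only "ID", "APPLICATION2_ID" and "Date" come from the question.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell a session already exists; getOrCreate() simply reuses it.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("join-on-shared-key-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data; the real frames come from the asker's sources.
val df_1 = Seq((1, "2016-01-01"), (2, "2016-02-01")).toDF("ID", "Date")
val df_2 = Seq((1, "a"), (2, "b")).toDF("APPLICATION2_ID", "Name")
val df_3 = Seq((1, 10.0), (2, 20.0)).toDF("ID", "Amount")

// The key columns have different names here, so an expression join is needed.
val join1 = df_2.join(df_3, df_3.col("ID") === df_2.col("APPLICATION2_ID"))

// Both sides have a column named "ID": the Seq version keeps a single copy.
val joinFinal = join1.join(df_1, Seq("ID"))

// "ID" is now unambiguous and can be selected together with "Date".
val result = joinFinal.select("ID", "Date")
result.show()
```

With the expression-based join from the question, joinFinal would carry two "ID" columns and select("ID", "Date") would fail with an ambiguous-reference error.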
