Merge tuples in Pig


I have two sets of tuples. I want to inner join them on the first element and merge the remaining parts into one tuple. How can I implement this in Pig on Hadoop?
The two input tuple sets (the first two lines are one set, the last two lines the other):
1,(1,2)
2,(2,3)
1,(b,c,b,c)
2,(c,d,c,d)
Expected output:
1,(1,2,b,c,b,c)
2,(2,3,c,d,c,d)
Thanks in advance,
Lin
One approach worth considering:
Inputs (the id and the tuple are separated by a tab, matching PigStorage('\t') in the script):
dataA:
1 (1,2)
2 (2,3)
dataB:
1 (b,c,b,c)
2 (c,d,c,d)
Pig script:
A = LOAD 'dataA' USING PigStorage('\t') AS (aid:long, atuple : tuple(af1:long, af2:long));
B = LOAD 'dataB' USING PigStorage('\t') AS (bid:long, btuple : tuple(bf1:chararray, bf2:chararray, bf3:chararray, bf4:chararray));
C = JOIN A BY aid, B BY bid;
D = FOREACH C GENERATE aid AS id, FLATTEN(atuple) AS (af1:long, af2:long) , FLATTEN(btuple) AS (bf1:chararray, bf2:chararray, bf3:chararray, bf4:chararray);
E = FOREACH D GENERATE id, (af1..bf4);
DUMP E;
Output of DUMP E:
(1,(1,2,b,c,b,c))
(2,(2,3,c,d,c,d))
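To sanity-check what the join-then-merge is expected to produce, the same logic can be modeled on a small in-memory sample. The following is only a rough Python sketch of the semantics, not Pig code and not how Pig executes the job; the names data_a, data_b and join_and_merge are hypothetical and exist only for this illustration.

```python
# Hypothetical in-memory stand-ins for the dataA and dataB inputs above.
data_a = [(1, (1, 2)), (2, (2, 3))]
data_b = [(1, ("b", "c", "b", "c")), (2, ("c", "d", "c", "d"))]


def join_and_merge(left, right):
    """Inner join two lists of (id, tuple) rows on id and concatenate the tuples."""
    # Index the right side by id; a list per id keeps duplicate keys,
    # so each matching pair yields one output row, as in an inner join.
    right_by_id = {}
    for rid, rtuple in right:
        right_by_id.setdefault(rid, []).append(rtuple)

    merged = []
    for lid, ltuple in left:
        # Ids that are missing on the right side produce no rows.
        for rtuple in right_by_id.get(lid, []):
            merged.append((lid, ltuple + rtuple))
    return merged


result = join_and_merge(data_a, data_b)
for row in result:
    print(row)
```

Each printed row pairs an id with the concatenation of its two tuples, which is the shape the Pig script builds by flattening both tuples and regrouping the fields.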
