In this post we discusses about Encoding and Decoding SQOOP. First i share some basic description of SQOOP.

Sqoop:

Sqoop is a tool designed to transfer data between Hadoop and relational database servers. It is used to import data from relational databases such as MySQL, Oracle to Hadoop HDFS, and export from Hadoop file system to relational databases.

Encode and Decode in Sqoop:

Create the table in mysql and insert the values to mysql table. Import the mysql table to hive and encode the values, which value is null in the mysql table.

Step 1

If you already have a database in MYSQL and you are willing to use same database goto to STEP 2 or continue the following steps

Create database databasename;

Create database geo;

Step 2

use databasename;

use geo;

sqoop,encoding,encode,decoding,decode

Step 3

Create a table in mysql.

Create table purchase(purchaseorder_id int,productname varchar(50),quantity bigint,prize bigint);

sqoop,encoding,encode,decoding,decode

Step 4

Insert the values into purchase table.

insert into purchase values(1,’Diesel’,500,5000),(2,’Petrol’,null,4000),(3,’Oil’,1000,7000),(4,null,450,9000);

sqoop,encoding,encode,decoding,decode

Step 5 Encoding

Now import data from mysql table to hive table. Also while importing if null value found for any string value then replace with ‘\NA’ and also for int value with ‘\NA’.

sqoop import \

–connect jdbc:mysql://localhost/geo \

–username root \

–password root \

–table purchase \

–hive-home /user/hive/warehouse-dir \

–hive-import \

–hive-overwrite \

–hive-table purchase_hive \

–null-string ‘\\NA’ \

–null-non-string ‘\\NA’ \

-m 1;

sqoop,encoding,encode,decoding,decode

Step 6.

View the result.

hadoop fs -cat /user/hive/warehouse/purchase_hive/part-m-00000

sqoop,encoding,encode,decoding,decode

Step 7

Create a new table in mysql to export the data from hive.

Create table newpurchase(purchaseorder_id int,productname varchar(50),quantity bigint,prize bigint);

sqoop,encoding,encode,decoding,decode

Step 8.Decoding

Now export data from hive table purchase_hive1.while exporting Wherever there is ‘\NA’ (string) and ‘\NA’ (non string) it should be loaded as null value in mysql.

sqoop export \

–connect jdbc:mysql://localhost/geo \

–username root \

–password root \

–table newpurchase \

–export-dir /user/hive/warehouse/purchase_hive \

–input-fields-terminated-by ‘\001’ \

–input-lines-terminated-by ‘\n’ \

–batch \

–input-null-string ‘\\NA’ \

–input-null-non-string ‘\\NA’ \

-m 1;

sqoop,encoding,encode,decoding,decode

Step 9

Finally view the output as decoded file.

Select * from newpurchase;

sqoop,encoding,encode,decoding,decode