SQOOP Encoding

SQOOP:

Sqoop is a tool designed to transfer data between Hadoop and relational database servers. It is used to import data from relational databases such as MySQL, Oracle to Hadoop HDFS, and export from Hadoop file system to relational databases.

Import MySQL table into hive and encode the values, which value is NULL in the table, also make sure that table file is partitioned in 3 files.

If u already have data in mysql directly goto STEP 4 or continue the following steps.

First create a table in MySQL and insert values into that table, and then import that table into HIVE using SQOOP and if the table contain any NULL values then, encode that NULL values and also split(Partitioned) the table into 3 files (like part-m-00000, part-m-00001, part-m-00002).

Step-1 : Create database in MySQL

create database sqoopdb;

Step-2 : Create table in sqoopdb;

Create table product_order_details (user_id int, order_id int, product_id varchar(20), order_amount bigint, order_delivery_date varchar(20), user_email_id varchar(20));

Step-3 : Insert values into product_order_details

Step-4 : Import table from MySQL to HIVE with the following conditions,

(i). Encode the values, which value is NULL in the table ,

(ii). Make sure that table file is partitioned in 3 files.

sqoop import

–connect jdbc:mysql://localhost/sqoopdb

–username root

–password password

–table product_order_details

–null-non-string ‘\\NA’

–null-string ‘\\NA’

–split-by user_id

-m 3

–target-dir /user/hive/warehouse/product_order_details;

Step-5 : List the table product_order_details

hdfs dfs -ls /user/hive/warehouse/product_order_details

Step-6 : View the content of the partitioned files

hdfs dfs -cat /user/hive/warehouse/product_order_details/part-m-00000

hdfs dfs -cat /user/hive/warehouse/product_order_details/part-m-00001

hdfs dfs -cat /user/hive/warehouse/product_order_details/part-m-00002

Description about command :

sqoop import -> keyword for importing

connect -> keyword

<path> -> path of the mysql

username -> keyword

root -> name of the user, which is stored in the mysql

password -> keyword

password -> password for the user

table -> keyword

product_order_details -> importing table

null-non-string -> keyword for Encoding non-string NULL values

null-string -> keyword for Encoding string NULL values

NA -> encoding value

split-by -> keyword for split the table

user_id -> name of the column (the table splited based on that column)

m 3 -> number of partitioned

target-dir -> keyword for target directory

/user/hive/warehouse/product_order_details -> path for where our table should be stored.

Latest News

Recent Posts

Categories