INCREMENTAL APPEND IN SQOOP

SQOOP

Sqoop is a tool designed to transfer data between Hadoop and relational database servers. It is used to import data from relational databases such as MySQL, Oracle to Hadoop HDFS, and export from Hadoop file system to relational databases.

Import MySQL table into hive, once import is done, insert some new records into mysql table and import the table again but import only new inserted records and append to existing direcotry,

First create a table in MySQL and insert values into that table, and then import that table into HIVE using SQOOP then insert some new values into the same mysql table when import is done, and then import the table again to the hive but import only new inserted records in the mysql table and append to existing direcotry,

Step-1

Create database in MySQL

create database sqoopdb;

Step-2

Create database in MySQL

use sqoopdb;

Step-3

Create table in sqoopdb;

create table purchase_details (s_no int, user_id int, purchase_id varchar(20), purchae_amonut bigint);

Step-4

Insert values into purchase_details table

insert into purchase_details values(1,10001526,’PS678945′,12600), (2,10001567,’PS786435′, 9000),(3,10001356,’PS096789′,13420),(4,10001467,’PS567839′,5030);

sqoop append, incremental append in sqoop

Step-5

Import table from MySQL to HIVE with the following conditions,

sqoop import \

–connect jdbc:mysql://localhost/sqoopdb \

–username root \

–password password \

–table purchase_details \

–target-dir ‘/user/hive/warehouse/purchase’\

-m 1;

sqoop append, incremental append in sqoop

Step-6

View the content of the file

hdfs dfs -cat /user/hive/warehouse/purchase/part-m-00000

sqoop append, incremental append in sqoop

Step-7

Insert new data into the same mysql table purchase_details

insert into purchase_details values (5,10001906,’PS678000′,22600), (6,10002467,’PS780035′,9070), (7,10002856,’PS045789′,20000), (8,10002444,’PS562732′,500);

sqoop append, incremental append in sqoop

Step-8:

Import the table again but import only new inserted record and append to existing directory

sqoop import \

–connect jdbc:mysql://localhost/sqoopdb \

–username root \

–password password \

–table purchase_details \

–target-dir ‘/user/hive/warehouse/purchase’

–check-column ‘s_no’ \

–incremental append \

–last-value 4 \

–m 1;

sqoop append, incremental append in sqoop

Step-9:

View the appended file

hdfs dfs -cat /user/hive/warehouse/purchase/part-m-00000 /user/hive/warehouse/ purchase/part-m-00001

sqoop append, incremental append in sqoop

Description about command:

sqoop import -> keyword for importing

connect -> keyword

<path> -> path of the mysql

username -> keyword

root -> name of the user, which is stored in the mysql

password -> keyword

password -> password for the user

table -> keyword

purchase_details -> importing table

target-dir -> keyword for target directory

/user/hive/warehouse/product_order_details -> path for where our table should be stored

check-column -> keyword for checking column

‘s_no’ -> checked column name

incremental append -> keyword for incremental append

last-value 4 -> last value for checked column in already imported table

m 3 -> number of partitioned.

Leave a Reply

Your email address will not be published. Required fields are marked *