This post walks through installing Apache Flume on Ubuntu, step by step.

Step 1 : Download Flume

Download the apache-flume-1.6.0-bin archive and extract it to the location of your choice.
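As a rough sketch, the download and extraction can also be done from the terminal. The URL assumes the 1.6.0 release is still hosted on the Apache archive server, and download_flume is a hypothetical helper name, not something shipped with Flume.

```shell
# Assumed values: adjust the version to the release you want.
FLUME_VERSION="1.6.0"
FLUME_TARBALL="apache-flume-${FLUME_VERSION}-bin.tar.gz"
# Assumption: older releases are kept on the Apache archive server.
FLUME_URL="https://archive.apache.org/dist/flume/${FLUME_VERSION}/${FLUME_TARBALL}"

# Hypothetical helper: download the tarball and unpack it into the given
# directory (defaults to the home directory).
download_flume() {
    target_dir="${1:-$HOME}"
    wget -O "/tmp/${FLUME_TARBALL}" "$FLUME_URL" || return 1
    tar -xzf "/tmp/${FLUME_TARBALL}" -C "$target_dir"
}
```

Calling download_flume "$HOME" should leave the files in $HOME/apache-flume-1.6.0-bin.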

Step 2 : Configure Flume

In the apache-flume-1.6.0-bin/conf folder, rename “flume-conf.properties.template” to “flume.conf”.
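From the terminal this could look like the sketch below. It copies the template instead of renaming it, which is a small deviation from the step above so that the original template stays available for reference. make_flume_conf is a hypothetical helper name.

```shell
# Hypothetical helper: create flume.conf from the shipped template.
# Copying (rather than renaming) keeps the template as a reference.
make_flume_conf() {
    conf_dir="$1"
    cp "$conf_dir/flume-conf.properties.template" "$conf_dir/flume.conf"
}
```

For example: make_flume_conf "$HOME/apache-flume-1.6.0-bin/conf" (the path assumes Flume was extracted into the home directory).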


Step 3 :

After that, paste the following properties into “flume.conf”:

TwitterAgent.sources = Twitter

TwitterAgent.channels = MemChannel

TwitterAgent.sinks = HDFS

#TwitterAgent.sources.Twitter.type=com.cloudera.flume.source.TwitterSource

TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource

TwitterAgent.sources.Twitter.channels = MemChannel

TwitterAgent.sources.Twitter.consumerKey = <consumerkey>

TwitterAgent.sources.Twitter.consumerSecret = <consumerSecret>

TwitterAgent.sources.Twitter.accessToken = <AccessToken>

TwitterAgent.sources.Twitter.accessTokenSecret = <AccessTokenSecret>

TwitterAgent.sources.Twitter.keywords = big data

TwitterAgent.sinks.HDFS.channel = MemChannel

TwitterAgent.sinks.HDFS.type = hdfs

TwitterAgent.sinks.HDFS.hdfs.path = hdfs://localhost:9000/user/flume/tweets/

TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream

TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text

TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000

TwitterAgent.sinks.HDFS.hdfs.rollSize = 0

TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000

TwitterAgent.channels.MemChannel.type = memory

TwitterAgent.channels.MemChannel.capacity = 1000

TwitterAgent.channels.MemChannel.transactionCapacity = 100
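Component names in this file are case-sensitive, so a channel declared as Memchannel does not match one referenced as MemChannel. The sketch below is one possible way to catch such a mismatch before starting the agent. undeclared_channels is a hypothetical helper, not part of Flume, and it only looks at the "channels" and "channel" keys used above.

```shell
# Hypothetical helper: print every channel name that a source or sink refers to
# but that is missing from the "<agent>.channels" declaration.
# Usage: undeclared_channels <conf-file> <agent-name>
undeclared_channels() {
    awk -v agent="$2" '
        BEGIN { FS = "=" }
        /^[ \t]*#/ || NF < 2 { next }          # skip comments and non key=value lines
        {
            key = $1
            gsub(/[ \t]+/, "", key)
            n = split($2, names, /[ \t]+/)
            if (key == agent ".channels") {
                # Names declared for the agent
                for (i = 1; i <= n; i++) if (names[i] != "") declared[names[i]] = 1
            } else if (key ~ ("^" agent "\\.(sources|sinks)\\.[^.]+\\.channels?$")) {
                # Names referenced by a source (channels) or a sink (channel)
                for (i = 1; i <= n; i++) if (names[i] != "") used[names[i]] = 1
            }
        }
        END {
            for (name in used) if (!(name in declared)) print name
        }
    ' "$1"
}
```

Running undeclared_channels flume.conf TwitterAgent prints nothing when every referenced channel is declared.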

Step 4 :

Set the following paths in the “flume-env.sh” file in apache-flume-1.6.0-bin/conf and save it.

export JAVA_HOME=/usr/lib/jvm/java-8-oracle

export HADOOP_HOME=/home/geouser/hadoop-2.7.1

Step 5 :

Set the following paths in the “flume-ng” file in apache-flume-1.6.0-bin/bin and save it.

export JAVA_HOME=/usr/lib/jvm/java-8-oracle

export HADOOP_HOME=/home/geouser/hadoop-2.7.1
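Both paths depend on where Java and Hadoop are installed on your machine, so it can help to confirm that they exist before going further. A minimal sketch follows. check_dir_var is a hypothetical helper name.

```shell
# Hypothetical helper: report whether a variable points at an existing directory.
# Usage: check_dir_var <variable-name> <directory>
check_dir_var() {
    var_name="$1"
    dir_path="$2"
    if [ -d "$dir_path" ]; then
        echo "$var_name OK: $dir_path"
    else
        echo "$var_name does not exist: $dir_path" >&2
        return 1
    fi
}
```

For example: check_dir_var JAVA_HOME /usr/lib/jvm/java-8-oracle and check_dir_var HADOOP_HOME /home/geouser/hadoop-2.7.1.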

[Flume can collect data from Twitter as well as from other social media. Here we use Twitter, so we need a Twitter account in order to generate the consumer key and access token.]

Step 6 :

First we need to create an app in Twitter in order to generate the consumer key and access token, but before creating an app, we must have a Twitter account.

Step 7 :

Log in to your Twitter account to create an app.

http://Twitter.com


If you do not have an account yet, sign up for a new one.


Next, create a new Twitter app:

http://apps.twitter.com


Create the application by filling in the required fields: Name, Description, and Website (any website URL starting with ‘http’).

The Callback URL can be left empty.

After that, tick “Yes, I agree” under the Developer Agreement.

Next, click “Create your Twitter application”.


Once the application is created, you will be directed to the application management page. On that page, select the “Keys and Access Tokens” tab.


On the “Keys and Access Tokens” tab, you can see the Consumer Key and Consumer Secret.


Further down the page there is a “Create my access token” button. Click it.


Access Token and Access Token Secret keys will be generated.


Copy all four keys (Consumer Key, Consumer Secret, Access Token, Access Token Secret) and paste them into the “flume.conf” file.


After copying the keys into the flume.conf file, enter your search keyword in that file and save it, for example:

TwitterAgent.sources.Twitter.keywords = bigdata
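A quick way to make sure no key was forgotten is to look for values that are still wrapped in angle brackets, such as <consumerkey>. The sketch below is one possible check. check_flume_keys is a hypothetical helper, and it assumes placeholders always have the form <...>.

```shell
# Hypothetical helper: fail if any non-comment line still has a <placeholder> value.
# Usage: check_flume_keys <conf-file>
check_flume_keys() {
    conf_file="$1"
    # Drop commented-out lines, then search for "= <something>".
    if grep -v '^[[:space:]]*#' "$conf_file" | grep -q '=[[:space:]]*<[^>]*>'; then
        echo "Unfilled placeholders found in $conf_file" >&2
        return 1
    fi
    echo "All keys in $conf_file look filled in"
}
```

For example: check_flume_keys "$HOME/apache-flume-1.6.0-bin/conf/flume.conf" (the path assumes Flume was extracted into the home directory).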


Step 8:

Open the terminal.

Step 9:

Open the .bashrc file in an editor:

gedit ~/.bashrc

Add the following lines and save the file:

export FLUME_HOME=$HOME/apache-flume-1.6.0-bin

export PATH=$PATH:$FLUME_HOME/bin
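To try the settings in the current terminal without reopening it, something like the following sketch can be used. It appends Flume's bin directory to the existing PATH and then checks whether flume-ng can be found. The install location under the home directory is an assumption.

```shell
# Assumption: Flume was extracted into the home directory.
export FLUME_HOME="$HOME/apache-flume-1.6.0-bin"
# Append to the existing PATH so that other commands stay reachable.
export PATH="$PATH:$FLUME_HOME/bin"

# Report whether flume-ng is now reachable through PATH.
if command -v flume-ng >/dev/null 2>&1; then
    flume-ng version
else
    echo "flume-ng not found on PATH; check FLUME_HOME=$FLUME_HOME"
fi
```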


Step 10:

Open the terminal and go to the following path:

cd apache-flume-1.6.0-bin/bin


Step 11:

Check that the flume-ng file is located in the bin directory using the ls command.

Step 12:

Then enter the following command:

./flume-ng agent --conf ../conf/ -f /home/geouser/apache-flume-1.6.0-bin/conf/flume.conf -Dflume.root.logger=DEBUG,console -n TwitterAgent
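The same start command can be wrapped in a small function that first checks that the configuration file exists. This is only a sketch: start_agent is a hypothetical helper, the default FLUME_HOME is an assumption, and the long options --conf, --conf-file and --name are the long forms of -c, -f and -n. The agent name has to match the prefix used in flume.conf (TwitterAgent here).

```shell
# Assumption: fall back to the home directory if FLUME_HOME is not set.
FLUME_HOME="${FLUME_HOME:-$HOME/apache-flume-1.6.0-bin}"
# Must match the prefix of the property names in flume.conf.
AGENT_NAME="TwitterAgent"

# Hypothetical helper: start the agent, refusing to run without a config file.
start_agent() {
    conf_file="$FLUME_HOME/conf/flume.conf"
    if [ ! -f "$conf_file" ]; then
        echo "Config file not found: $conf_file" >&2
        return 1
    fi
    "$FLUME_HOME/bin/flume-ng" agent \
        --conf "$FLUME_HOME/conf" \
        --conf-file "$conf_file" \
        --name "$AGENT_NAME" \
        -Dflume.root.logger=INFO,console
}
```

Calling start_agent then runs the agent in the foreground and logs to the console.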

Step 13:

After you enter the above command, Flume starts collecting the JSON data and stores it in HDFS.

Step 14:

You can watch the data arrive on the console. Once you have collected enough data, stop the process with “Ctrl + C” (“Ctrl + Z” only suspends it).


Step 15:

Finally, make sure Hadoop is running and open the NameNode web interface:

http://localhost:50070

Then browse the file system to the directory /user/flume/tweets.
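The same directory can also be inspected from the terminal with the HDFS command-line client. A minimal sketch, assuming the hdfs command from Hadoop's bin directory is on the PATH. list_tweets is a hypothetical helper name.

```shell
# Directory the HDFS sink writes to (taken from hdfs.path in flume.conf).
TWEET_DIR="/user/flume/tweets"

# Hypothetical helper: list the files Flume has written so far.
list_tweets() {
    if ! command -v hdfs >/dev/null 2>&1; then
        echo "hdfs command not found; is Hadoop's bin directory on the PATH?" >&2
        return 1
    fi
    hdfs dfs -ls "$TWEET_DIR"
}
```

Calling list_tweets should show one entry per file rolled by the sink.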


We have now finished the Flume installation and collected raw data from Twitter using the specified account.
