Twitter Live Feed with Oracle Database As A Service and Business Intelligence Cloud Service

Standard

 

In this post I am going to show how you can very quickly get a Twitter live feed streaming into your Oracle Database as a Service (DBAAS) instance.  Although this example will work on any machine with an Oracle client and for any Oracle Database instance putting it on DBAAS makes it extremely easy to utilize our streaming Twitter data inside of other Oracle Cloud applications such as Business Intelligence Cloud Service (BICS).

Acknowledgments:

  1. Thank you to geniuses at Tweepy for making it so I don’t have to deal with oauth or really anything related to the Twitter API.  I don’t know who you are but I love you! ♥
  2. Twitter so I never have to wonder what Kim or Kanye are thinking.
  3. My homey Przemyslaw Piotrowski (I don’t know him either but his examples sustain my laziness in coding) for writing a super helpful series called The Mastering Oracle+Python Series which I fully intend to finish someday.
  4. Who ever this dude is (https://pythonprogramming.net/twitter-api-streaming-tweets-python-tutorial/) whose example made this easy.

The great thing about DBAAS is I actually have access to the underlying server so I can take advantage of the operating system, and languages like python, as well as the database.  Some might be offended by such interloping but I’m not one of them and as I said above there isn’t any requirement to do this on the database server I’m just taking advantage of a machine that has the cycles to spare.

I will assume that you already have a DBAAS instance up and running and are familiar with how to administer it and are also familiar with using tools like MobaXterm to connect to it.

At the time of this example our Database as a Service is running Oracle Linux 6.7 and 12.0.1.2 of the database with all the bells and whistles.  I needed to install 5 things to make this work:

  1. plliblzma to make it so I can get the latest EPEL repository without having to deal with “expletive” Error: xz compression not available message

wget https://kojipkgs.fedoraproject.org//packages/pyliblzma/0.5.3/3.el6/x86_64/pyliblzma-0.5.3-3.el6.x86_64.rpm

yum install pyliblzma-0.5.3-3.el6.x86_64.rpm

  1. Latest EPEL repository for this version of linux

wget http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

yum install epel-release-6-8.noarch.rpm

  1. Then I can install python-pip with ease

yum install python-pip

  1. And then we can fetch Tweepy with equal ease

pip install tweepy

  1. And last but not least Oracle’s python library for interacting with the database

pip install cx_Oracle

Now that we have all the chunks make sure you can import the stuff you need with python and not get any errors.

[oracle@testdrive-01 ~]$ python
Python 2.6.6 (r266:84292, Jul 23 2015, 05:13:40)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tweepy
>>> import cx_Oracle
>>>

If you have issue with cx_Oracle you most likely just need to make sure the Oracle environment variables are set via bashrc or however you like to set them.

I created a table in my database to store the JSON document that is the API response

CREATE TABLE "TWITTER_USER"."EAT_MY_TWEET"
( "ID" NUMBER,
"USERNAME" VARCHAR2(500 BYTE),
"TIMEWHYUPUNISHME" TIMESTAMP (6) DEFAULT systimestamp,
"TWEET_JSON" CLOB,
CONSTRAINT "ENSURE_JSON" CHECK (TWEET_JSON is JSON) ENABLE
)

In order to use the Twitter API you will have to register your app with them here https://apps.twitter.com at which point they will give you the keys to make oauth work.


Then all we need to do is write (by which I mean copy paste and modify) a little python code and we are ready to rock:

#import libraries
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import cx_Oracle
import datetime
import json


#connection string for database
conn_str='WHODAT/Ap@ssw0rd@localhost:1521/MyPluggableDatabseServiceName'

#get me a connection
conn =cx_Oracle.connect(conn_str)

#turn on autocommit
conn.autocommit=1

#object for executing sql
c=conn.cursor()

#clob variable
bvar=c.var(cx_Oracle.CLOB)

#twitter api application keys
#consumer key, consumer secret, access token, access secret.
ckey='Dont'
csecret='Tell'
atoken='Anybody'
asecret='The Password'

#listen to the stream
class listener(StreamListener):

#get some
    def on_data(self, data):
         try:

            #barf response insto json object
            all_data = json.loads(data)

            #parse out tweet text, screenname and tweet id
            tweet = all_data["text"]
            if (all_data["user"]["screen_name"]) is not None:
                username = all_data["user"]["screen_name"]
            else:
                username = 'No User'
            tid = all_data["id"]

            #set clob variable to json doc
            bvar.setvalue(0,data)
            try:
                #create sql string with bind for clob var

                sql_str="INSERT INTO EAT_MY_TWEET (ID,USERNAME,TWEET_JSON) Values("+str(tid)+",q'["+username.encode('utf-8').strip()+"]',:EATIT)" 

                #insert into database 
                c.execute(sql_str,[bvar])                  

            except Exception: 
                sys.exc_clear() 

            #watch tweets go by in console 
            print((username,tweet)) 

            #in case you want to print response 
            #print(data) 
            return(True) 
        except Exception: 
                sys.exc_clear() 
def on_error(self, status): 
     print status 
     print(data) 
     print sql_str 

#log in to twitter api 
auth = OAuthHandler(ckey, csecret) 
auth.set_access_token(atoken, asecret)
 
#fire it up 
twitterStream = Stream(auth, listener()) 

#what to search for (not case sensitive) 
#comma separated for words use 
# for hashtag 
#phrases are words separated by spaces (like this comment) 
twitterS.filter(track=["ProveFilterWorks,Oracle,Vlamis,OBIEE,BICS,DVCS,Data Visualization Desktop"])

I named my file stream.py so to execute I just fire it up with nohup –

nohup python -u /home/oracle/stream.py>/home/oracle/stream.out 2>/home/oracle/stream.err &

and if I tail my stream.out file with tail -f stream.out I can watch the tweets go by as they are inserted into the database

stream

now that I am inserting the json document that the twitter api sends back to me into the database

json

I can use Oracle Database 12c support for json to expose the document in my table as a view using the following SQL

  CREATE OR REPLACE FORCE EDITIONABLE VIEW "TWITTER_USER"."LIVE_TWITTER_FEED" ("CST_DATE", "UTC_DATE", "UTC_HOUR", "UTC_MINUTE", "ID", "CREATED_ON", "SCREEN_NAME", "LOCATION", "FOLLOWERS_CNT", "FRIENDS_CNT", "LISTED_CNT", "FAVOURITES_CNT", "STATUSES_CNT", "RETWEET_CNT", "FAVOURITE_CNT", "URL", "PROFILE_IMAGE_URL", "BANNER_IMAGE_URL", "HASHTAGS", "TWEET", "EMT_TIMEWHYUPUNISHME") AS 
  SELECT cast(TO_TIMESTAMP_TZ(REPLACE(upper(b.created_on),'+0000','-00:00'),'DY MON DD HH24:MI:SS TZH:TZM YYYY')  at Time zone 'CST' as date) CST_DATE,
cast(TO_TIMESTAMP_TZ(REPLACE(upper(b.created_on),'+0000','-00:00'),'DY MON DD HH24:MI:SS TZH:TZM YYYY') as date) UTC_DATE,
cast(to_char(cast(TO_TIMESTAMP_TZ(REPLACE(upper(b.created_on),'+0000','-00:00'),'DY MON DD HH24:MI:SS TZH:TZM YYYY') as date),'HH24') as number) UTC_HOUR,
cast(to_char(cast(TO_TIMESTAMP_TZ(REPLACE(upper(b.created_on),'+0000','-00:00'),'DY MON DD HH24:MI:SS TZH:TZM YYYY') as date),'MI') as number) UTC_MINUTE,
a."ID",b."CREATED_ON",a."USERNAME",b."LOCATION",b."FOLLOWERS_CNT",b."FRIENDS_CNT",b."LISTED_CNT",b."FAVOURITES_CNT",b."STATUSES_CNT",
b."RETWEET_CNT",b."FAVOURITE_CNT",b."URL",b."PROFILE_IMAGE_URL",b."BANNER_IMAGE_URL",b."HASHTAGS",b."TWEET",
a.TIMEWHYUPUNISHME
FROM EAT_MY_TWEET a,
json_table(tweet_json,
'$' columns(
    id varchar(50) path '$.id',
    created_on varchar2(100) path '$.created_at',
    screen_name varchar2(200) path '$."user".screen_name',
    location varchar2(250) path '$."user"."location"',
    followers_cnt number path '$."user".followers_count',
    friends_cnt  number path '$."user".friends_count',
    listed_cnt  number path '$."user".listed_count',
    favourites_cnt  number path '$."user".favourites_count',
    statuses_cnt  number path '$."user".statuses_count',
    retweet_cnt number path '$.retweet_count',
    favourite_cnt number path '$.favorite_count',
     url varchar2(250) path '$."user"."url"',
    profile_image_url  varchar2(500) path '$."user".profile_image_url',
    banner_image_url varchar2(500) path '$."user".profile_banner_url',
     hashtags varchar2(500) format json with wrapper path '$.entities.hashtags[*].text',
      tweet varchar2(250) path '$.text'
    -- nested path '$.entities.hashtags[*]' columns (ind_hashtag varchar2(30) path '$.text'    )
   
)
) b

Then I can analyze the real time twitter data I’m interested in using a simple sql statement

SELECT * FROM LIVE_TWITTER_FEED ;

and while being able to hit real time twitter data with SQL is cool the real goal is being able to surface that data inside of our BI tools. Here we combine Answers and Visual Analyzer on a dashboard to show latest tweets and aggregate information over time utilizing Oracle Business Intelligence Cloud Service (BICS)

bics

Deploying Oracle APEX on EC2 against Oracle Database on EC2

Standard

YouTube Playlist – Oracle Database and APEX on AWS EC2

AWS (Amazon Web Services) supports some versions of APEX on its RDS service but if you really want to be able to have the full power and ability of the latest versions you need to learn how to install the database on an EC2 instance ja(as I show in this post – DEPLOYING ORACLE DATABASE 12C ON AWS EC2 INSTANCE) and install and configure APEX and ORDS on a separate EC2 instance.

In the accompanying YouTube video I will show the steps to delete APEX from the container 12c Database and then install on a pluggable database.  Lastly I’ll show how to configure and run the ORDS listener on a separate EC2 instance

First let me list some super helpful resources –

Posts from my new hero Tim Hall and other random stuff I encountered

Multitenant : Uninstall APEX from the CDB in Oracle Database 12c Release 1 (12.1)

Oracle Application Express (APEX) 5.0 Installation

Connection error after upgrade to 4.1

Register DB Service to Listener

REST Data Services Installation and Configuration Guide

Application Express Installation Guide

Workflow for Installing APEX and ORDS

This YouTube video follows the workflow I’ve created below –

  1. Set up Ec2 instance using OL7.2-x86_64-HVM-2015-12-10 Amazon Machine Image (AMI)
  2. Login via MobaXterm as ec2-user and change password
    • sudo passwd ec2-user
  3. Install packages needed for instance
    • sudo yum install wget zip unzip -y
    • sudo yum install perl-libwww-perl.noarch -y
    • sudo yum install java-1.8.0-openjdk.x86_64 -y
    • sudo yum install updates
  4. Create apex user and make it so user can connect
  5. Make directories for apexlistener, oracle client, and ORDS
    • mkdir /home/apexuser/apexlistener
    • mkdir /home/apexuser/orclclient
    • mkdir /home/apexuser/ORDS
  6. Upload files and unzip in respective directories
    • unzip apex_5.0.3_en.zip -d /home/apexuser/apexlistener
    • upload oracle-instantclient12.1-basic-12.1.0.2.0-1.x86_64.rpm into /home/apexuser/orclclient
    • upload oracle-instantclient12.1-sqlplus-12.1.0.2.0-1.x86_64.rpm into /home/apexuser/orclclient
    • unzip ords.3.0.3.351.13.24.zip -d /home/apexuser/ORDS
  7. Install oracle client
    • su ec2-user
    • cd /home/apexuser/orclclient
    • rpm -ivh oracle-instantclient12.1-basic-12.1.0.2.0-1.x86_64.rpm
    • rpm -ivh oracle-instantclient12.1-sqlplus-12.1.0.2.0-1.x86_64.rpm
  8. Update bash profile of apex user for client
    • su apexuser
    • cd ~
    • vi ~/.bash_profile
      • export LD_LIBRARY_PATH=/usr/lib/oracle/12.1/client64/lib:$LD_LIBRARY_PATH
      • export PATH=/usr/lib/oracle/12.1/client64/bin:$PATH
    • source ~/.bash_profile
    • sqlplus SYS/PASSWORD@HOSTNAME:1521/orcl as sysdba
  9. Go to your database server & uninstall apex from container database
    • !!!!!!!Make sure you uninstall or unplug any pluggable databases before you do this!!!!!!
    • /u01/app/oracle/product/12.1.0/dbhome_1/apex
    • sqlplus / sys as sysdba
    • @apxremov_con.sql
  10. Create pluggable database & add USERS tablespace
    • Go to SQL Developer for these tasks
    • Create pluggable database and open read / write
    • Connect to pluggable database
    • Create Users tablespace on pluggable database
      • CREATE TABLESPACE “USERS” DATAFILE
        ‘/u01/app/oracle/oradata/orcl/pdbseed/users01.dbf’ SIZE 5242880
        AUTOEXTEND ON NEXT 1310720 MAXSIZE 32767M
        LOGGING ONLINE PERMANENT BLOCKSIZE 8192
        EXTENT MANAGEMENT LOCAL AUTOALLOCATE DEFAULT
        NOCOMPRESS SEGMENT SPACE MANAGEMENT AUTO;
  11. Install apex on your container database, change apex admin user and create rest data services user
    • ALTER SESSION SET CONTAINER =PDBORCL (your pluggable database name)
    • @apexins.sql SYSAUX SYSAUX TEMP /i/
    • @apxchpwd.sql
    • @apex_rest_config.sql
    • ALTER USER APEX_PUBLIC_USER IDENTIFIED BY Password1 ACCOUNT UNLOCK;
  12. Go to your Apex server and Run ORDS set up
    • cd /home/apexuser/ORDS
    • java -jar ords.war
    • /home/apexuser/ORDS/config/ (configuration directory)
    • /home/apexuser/apex/images (static resources directory)
    • add tcp rule for 8080 (or port you used if you haven’t already when you created server)
    • stop server using ctrl c and start with nohup
      • nohup java -jar ords.war &
    • if you need to stop server issue top command and find java process running under apexuser and kill process number
      • top -u apexuser
      • kill [process]
  13. At this point your APEX UI should be available at http://HOSTNAME:8080/ords

Deploying Oracle Database 12c on AWS EC2 Instance

Standard

YouTube Playlist – Oracle Database and APEX on AWS EC2

See companion post Deploying Oracle APEX on EC2 against Oracle Database on EC2

So if you have ever used AWS (Amazon Web Services) before you probably know they have an RDS service that will allow you to auto-deploy several flavors of database.  In this youtube playlist I show my students how to do just that.

But if you have deployed Oracle databases via RDS you may have also noticed that you don’t really get the kind of access (no root and no sysdba) to the instance you really need if you want to truly be able to utilize the full power of the database.

So what are you to do?

Well thankfully AWS also allows you to deploy EC2 instances of various OS (Operating System) flavors which are really just VM’s (Virtual Machines).  Additionally, lots of helpful people have created VM’s that have OS’s, like Oracle Linux, already installed that you can use as the basis of new instances.  In this post I explore how to accomplish this feat using lots of great resources I’ve found around the net.  If you have never touched a Linux OS or AWS or an Oracle Database then don’t be scared this post is for you!  If you have done these things, but not on AWS, then this post is also for you but I will definitely do my best to assume you are a beginner.

This example is done on a t2.small instance size (1 vCPU + 2GB RAM) which is one step above AWS’s free tier but still a very reasonable 2 cents an hour for powered on instances (no charge when they are off).  It seems to run just fine and I have used even smaller instances to run APEX development work spaces for 15 students.  Makes me wonder about just how over licensed / resourced a lot of people might be.

First let me list some super helpful resources –

A post by Tim Hall explaining Installation of Oracle on EC2 and it’s companion Oracle Database 12c Release 1 (12.1) Installation On Oracle Linux 7 (OL7) which are really the basis for most of what I show you but with a few key changes, by me, for beginners.

Other posts by Tim Hall for auto-starting database and pdb’s –

See heading ‘The “runuser” Command’ on this post for auto-starting 12c

See ‘Pluggable Database (PDB) Automatic Startup’ on this post for auto-starting PDB’s

Also uber helpful –

MobaXterm Download

AWS On-Demand Instance Prices (remarkably cheap)

Managing User Accounts on Your Linux Instance

Retrieving the Public Key for Your Key Pair on Linux

Making an Amazon EBS Volume Available for Use

Oracle Database Software Downloads

20 Linux YUM (Yellowdog Updater, Modified) Commands for Package Management

YUM COMMAND CHEAT SHEET

Linux Add a Swap File – Howto

Amazon EC2 Pricing

Cannot connect to Oracle DB on EC2 instance (Firewall NOT the issue)

And finally a video on how to create AWS account I’ve used with my students in case you need it.

Workflow for creating Oracle EC2 database instance:

This YouTube video follows the workflow I’ve created below –

  1. Download MobaXterm
  2. Set up Ec2 instance using OL7.2-x86_64-HVM-2015-12-10 Amazon Machine Image (AMI) with three additional volumes of 5, 8 and 10 gb (this will make your life easier).
  3. Login via MobaXterm as ec2-user and change password
    • sudo passwd ec2-user
  4. Install packages needed for instance
    • sudo yum install wget zip unzip -y
    • sudo yum install perl-libwww-perl.noarch -y
    • sudo yum install oracle-rdbms-server-12cR1-preinstall -y
  5. Mount volumes, format disks, mkdir
    1. sudo mkfs -t ext4 /dev/xvdb (be super careful here)
    2. sudo mkfs -t ext4 /dev/xvdc (be super careful here)
    3. sudo mkdir -p /swapfile1
    4. sudo mount /dev/xvdb /swapfile1 (swap file)
    5. sudo mkdir -p /u01/software
    6. sudo mount /dev/xvdf /u01/software (zip files)
    7. sudo mkdir -p /u01/app/oracle/oradata/orcl
    8. sudo mount /dev/xvdc /u01/app/oracle/oradata/orcl (data files)
  6. Add swap file
    • sudo dd if=/dev/zero of=/swapfile1/swapfile  bs=1024 count=3145728 (creates 3G swapfile)
    • sudo chown root:root /swapfile1/swapfile
    • sudo chmod 0600 /swapfile1/swapfile
    • sudo  mkswap /swapfile1/swapfile
    • sudo swapon /swapfile1/swapfile
    • free -m
  7. Make fstab entries
    • lsblk (gives names of volumes)
    • sudo cp /etc/fstab /etc/fstab.orig (back up fstab)
    • sudo vi /etc/fstab
    • /swapfile1 none swap sw 0 0
    • /dev/xvdc /u01/app/oracle/oradata/orcl ext4 defaults 0 0
    • /dev/xvdf /u01/software ext4 defaults 0 0
    • sudo mount -a (remount everything to make sure it worked)
  8.  Give oracle user ownership of directories
    • sudo chown -R oracle.oinstall /u01/app/oracle/oradata/orcl
    • sudo chown -R oracle.oinstall  /u01/software
    • sudo chown -R oracle.oinstall /u01
  9. Change hostfile for instance by adding hostname to localhost
    • hostname
    • sudo vi /etc/hosts
  10. Change password for oracle user and make it possible for user to connect remotely (Managing User Accounts on Your Linux Instance)
    • sudo passwd oracle
    • su oracle (switch to oracle user)
    • cd ~(make sure your are in oracle user home)
    • mkdir .ssh (create location for key file)
    • chmod 700 .ssh (set permissions)
    • touch .ssh/authorized_keys (create file)
    • chmod 600 .ssh/authorized_keys (set permissions)
    • “GET http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key>.ssh/authorized_keys” (copy public key to file)
    • log out and login as oracle user
  11. Upload database zip files into /u01/software (this takes a while)
  12. Unzip files
    • cd /u01/software
    • unzip linuxamd64_12102_database_1of2.zip
    • unzip linuxamd64_12102_database_2of2.zip
  13. Start install
    • cd /ora_software/database
    • ./runInstaller
  14. Run scripts as root (careful here you need to open separate instance of MobaXterm)
    • sudo /u01/app/oraInventory/orainstRoot.sh
    • sudo /u01/app/oracle/product/12.1.0/dbhome_1/root.sh
  15. Say OK to run dbca to create a database
  16. Update tnsnames.ora, listener.ora files by replacing localhost with actual host name
    • cd /u01
    • find -name tnsnames.ora
    • find -name listener.ora
    • hostname
    • vi ./app/oracle/product/12.1.0/dbhome_1/network/admin/tnsnames.ora
    • vi ./app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora
  17. Restart listener and database
    • . oraenv
    • lsnrctl stop
    • lsnrctl start
    • sqlplus sys as sysdba
    • shutdown immediate;
    • startup;
  18. At this point you should be able to connect to your database remotely with SQL Developer or your favorite tool, remember if you are shutting down your EC2 instance to first stop the listener and the database.