Pages

Thursday 17 June 2021

HIVE-JOINS

Basically, there are 4 types of Hive Join. Such as:

HiveQL Select Joins 
Inner joins :  

The records common to the both tables will be retrieved by this inner join The simplest kind of join is the inner join, where each match in the input tables results in a row in the output.Consider the above said two tables  

Syntax:
SELECT table1.col1, table2.col1 
FROM table2 
JOIN table1 
ON (table1.matching_col = table2. matching_col);
Example 1:Using Create table method (multiple records can be loaded at the same time)
Table 1 (Students):
students.id students.name students.city
1	ABC	London 
2 BCD Mumbai
3 CDE Bangalore
4 DEF Mumbai
5 EFG         Bangalore

Table 2 (Dept): 

dept.did dept.name dept.sid 

100    CS    1
101    Maths    1
102    Physics    2
103    Chem    3
 
hive> create database joins ;
hive> use joins ;
 
TABLE1:
hive> create table student (sid int,sname string,city string)
    > row format delimited
    > fields terminated by '\t';
hive> load data local inpath '/home/hduser/Desktop/student.txt' into table student ;
hive> select * from student ;
 
TABLE2:
hive> create table dept (did int,dname string,dsid int)
    > row format delimited
    > fields terminated by '\t';
hive> load data local inpath '/home/hduser/Desktop/dept.txt' into table dept ;
hive> seelct * from dept ;
 
Example 1:
Query:

hive> desc student ;
OK
sid                     int                                        
sname               string                                     
city                    string                                     
Time taken: 0.071 seconds, Fetched: 3 row(s)
hive> desc dept ;
OK
did                     int                                        
dname               string                                     
dsid                   int                                        
Time taken: 0.072 seconds, Fetched: 3 row(s)
hive > select student.*,dept.*
    > from student JOIN dept
    > ON (student.sid = dept.dsid);

OK
1    ABC    London     100    CS    1
1    ABC    London     101    Maths    1
2    BCD    Mumbai     102    Physics    2
3    CDE    Bangalore     103    Chem    3

Example 2: Using Insert table method(only one record can be loaded into the table)

Table 1 (sales):
name    id 
ram        2
sita         4
raju        0     
rani        3
vani        

Table 2 (products): 

pid        pname
2        football
4        cricketball
3        hat     
1        bat 

Query:

hive> create database joins ;
hive> show databases ;
hive> use joins ; 
hive> create table sales (name string,id int);
hive> create table products (pid int ,pname string);
hive> show tables ;
hive> desc products ;
hive> desc sales ;
hive> desc products ;
hive> insert into table sales values('ram',2);
hive> insert into table sales values('sita',4);
hive> insert into table sales values('raju',0);
hive> insert into table sales values('rani',3;
hive> insert into table sales values('vani',2);
hive> select * from sales ;
OK
ram    2
sita    4
raju    0
rani    3
vani    2
hive> select * from products ;
OK
2    football
4    cricketball
3    hat
1    bat 
 
hive> select sales.*,products.*
    > from sales JOIN products
    > ON ( sales.id = products.pid);
OK
ram    2    2    football
vani    2    2    football
sita     4    4    cricketball
rani    3    3    hat
 

Left Outer Join

This type of join returns all rows from the left table even if there is no matching row in the right table. Table returns all rows from the left table and matching rows from right table. Unmatched right tables records will be NULL.

Syntax:

SELECT table1. col1, table2.col2 
from table1 LEFT OUTER JOIN table2
ON (table1.matching_col = table2.matching_col);

Example 1:

Query:

hive> select student.*,dept.*
    > from student LEFT OUTER JOIN dept
    > ON (student.sid = dept.dsid);

OK
1    ABC    London     100    CS    1
1    ABC    London     101    Maths    1
2    BCD    Mumbai     102    Physics    2
3    CDE    Bangalore     103    Chem    3
4    DEF    Mumbai     NULL    NULL    NULL
5    EFG    Bangalore    NULL    NULL    NULL

 

Example 2:

Query:

hive> create database joins ;
hive> show databases ;
hive> use joins ; 
hive> create table sales (name string,id int);
hive> create table products (pid int ,pname string);
hive> show tables ;
hive> desc products ;
hive> desc sales ;
hive> desc products ;
hive> insert into table sales values('ram',2);
hive> insert into table sales values('sita',4);
hive> insert into table sales values('raju',0);
hive> insert into table sales values('rani',3;
hive> insert into table sales values('vani',2);
hive> select * from sales ;
OK
ram    2
sita    4
raju    0
rani    3
vani    2
hive> select * from products ;
OK
2    football
4    cricketball
3    hat
1    bat

hive> select sales.*,products.*
    > from sales LEFT OUTER JOIN products
    > ON ( sales.id = products.pid);

OK
ram    2    2    football
sita    4    4    cricketball
raju    0    NULL    NULL
rani    3    3    hat
vani    2    2    football

 

Right Outer Join

It returns all rows from the right table even if there is no matching row in the left table. Table returns all rows from the right table and matching rows from left table. Unmatched left table records will be NULL.

Syntax:

SELECT table1. col1, table2.col2 
from table1 RIGHT OUTER JOIN table2
ON (table1.matching_col = table2.matching_col);

Query:

hive> select student.*,dept.*
    > from student RIGHT OUTER JOIN dept
    > ON (student.sid = dept.dsid);

OK
1    ABC    London     100    CS    1
1    ABC    London     101    Maths    1
2    BCD    Mumbai     102    Physics    2
3    CDE    Bangalore     103    Chem    3
 

Example 2:

Query:

hive> create database joins ;
hive> show databases ;
hive> use joins ; 
hive> create table sales (name string,id int);
hive> create table products (pid int ,pname string);
hive> show tables ;
hive> desc products ;
hive> desc sales ;
hive> desc products ;
hive> insert into table sales values('ram',2);
hive> insert into table sales values('sita',4);
hive> insert into table sales values('raju',0);
hive> insert into table sales values('rani',3;
hive> insert into table sales values('vani',2);
hive> select * from sales ;
OK
ram    2
sita    4
raju    0
rani    3
vani    2
hive> select * from products ;
OK
2    football
4    cricketball
3    hat
1    bat

hive> select sales.*,products.*
    > from sales RIGHT OUTER JOIN products
    > ON ( sales.id = products.pid);

OK
ram    2    2    football
vani    2    2    football
sita    4    4    cricketball
rani    3    3    hat
NULL    NULL    1    bat



Full Outer Join

It returns all rows from the both tables that fulfill the JOIN condition. The unmatched rows from both tables will be returned as a NULL.

Syntax:

SELECT table1. col1, table1.col2, table2.col1, table2.col2
FROM table1 
FULL OUTER JOIN table2 ON (table1.matching_col = table2.matching_col);

Results:

student_id student_name dept_id dept_name
1 ABC 100 CS 
1 ABC 101 Maths 
2 BCD 102 Physics 
3 CDE 103 Chem 
4 DEF NULL NULL
5 EFG NULL NULL
Example 2:

Query:

hive> create database joins ;
hive> show databases ;
hive> use joins ; 
hive> create table sales (name string,id int);
hive> create table products (pid int ,pname string);
hive> show tables ;
hive> desc products ;
hive> desc sales ;
hive> desc products ;
hive> insert into table sales values('ram',2);
hive> insert into table sales values('sita',4);
hive> insert into table sales values('raju',0);
hive> insert into table sales values('rani',3;
hive> insert into table sales values('vani',2);
hive> select * from sales ;
OK
ram    2
sita    4
raju    0
rani    3
vani    2
hive> select * from products ;
OK
2    football
4    cricketball
3    hat
1    bat

hive> select sales.*,products.*
    > from sales FULL OUTER JOIN products
    > ON ( sales.id = products.pid);

OK
raju    0    NULL    NULL
NULL    NULL    1    bat
vani    2    2    football
ram    2    2    football
rani    3    3    hat
sita    4    4    cricketball 

 
Example 3:

hive> desc sales  ;
OK
name                    string                                     
sid                     int                                        
Time taken: 0.107 seconds, Fetched: 2 row(s)
hive> desc products  ;
OK
pid                     int                                        
pname                   string                                     
Time taken: 0.17 seconds, Fetched: 2 row(s)
hive>

Inner Join:

Left outer Join:

 

Right outer Join:

Full outer Join:

 

 

No comments:

Post a Comment

Friends-of-friends-Map Reduce program

Program to illustrate FOF Map Reduce: import java.io.IOException; import java.util.*; import org.apache.hadoop.conf.Configuration; import or...