Impetus – Big Data Engineer Interview Experience

In this post, we will look at the questions asked in an Impetus Technologies interview for a candidate with 3+ years of experience in the big data field.

Let’s see the questions:
1) What is the core of Spark?
2) What is the difference between a Spark DataFrame and a Spark Dataset?
3) Explain wide and narrow transformations with an example.
4) What is the difference between SparkSession and SparkContext?
5) How do you initialize a SparkContext?
6) Explain the Spark architecture.
7) What is staging in Spark?
8) Write a SQL query to find the person with the second-highest salary in each department.
9) Write recursive code for the Fibonacci series.
10) Explain window functions and the difference between ROW_NUMBER, RANK and DENSE_RANK.
11) Explain the Hive architecture.
12) Explain the Hive partitioning and bucketing concepts.
13) Suppose we add partitions manually under a Hive table's path and then try to fetch those partitions with a Hive query. Will the query return them? If not, why?
14) Explain external and managed tables in Hive.
15) What is a map-side join, and why is it used?
16) Explain Python generators.
17) Explain inheritance in Python.
18) Explain exception handling in Python.
19) The table employee has 3 columns: EmpID, OfficeMobile and HomeMobile.
Some employees have given the same number for both; others have given different numbers. The output should have 2 columns, EmpID and ContactNo: one row per employee if both numbers are the same, otherwise two rows.
Example:
EmpID,OfficeMobile,HomeMobile
1,123,123
2,456,789

Output:
1,123
2,456
2,789
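
One possible approach to question 19 is sketched below. It is only an illustration: it runs the SQL against an in-memory SQLite database from Python so the query can be tried without a cluster, and the choice of SQLite and the integer column types are assumptions, not part of the interview question.

```python
import sqlite3

# In-memory SQLite database standing in for the real table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (EmpID INTEGER, OfficeMobile INTEGER, HomeMobile INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, 123, 123), (2, 456, 789)],
)

# UNION (unlike UNION ALL) removes duplicate rows, so an employee whose
# two numbers match produces a single (EmpID, ContactNo) row.
query = """
SELECT EmpID, OfficeMobile AS ContactNo FROM employee
UNION
SELECT EmpID, HomeMobile AS ContactNo FROM employee
ORDER BY EmpID, ContactNo
"""
contacts = conn.execute(query).fetchall()

for emp_id, contact_no in contacts:
    print(f"{emp_id},{contact_no}")
```

The same idea should carry over to a PySpark DataFrame by selecting each phone column separately, combining the two results with union, and then calling distinct.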

20) Write a SQL query and a PySpark DataFrame operation to find the employees whose salary is greater than the average salary of their department.
Sample Input:
EmpID | EmpName | Salary | Department
1 | ABC | 4000 | BANG
2 | MNL | 5000 | BANG
3 | XYZ | 7000 | HYD
4 | DEY | 8000 | HYD
5 | DFE | 2000 | DEL
6 | EGF | 9000 | DEL
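
A minimal sketch of the SQL half of question 20 follows, again run through an in-memory SQLite database as an assumption so it can be executed locally. It joins each employee to the per-department average computed in a subquery; the table name employee and the alias avg_salary are illustrative choices.

```python
import sqlite3

# In-memory SQLite database loaded with the sample input (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (EmpID INTEGER, EmpName TEXT, Salary INTEGER, Department TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "ABC", 4000, "BANG"),
        (2, "MNL", 5000, "BANG"),
        (3, "XYZ", 7000, "HYD"),
        (4, "DEY", 8000, "HYD"),
        (5, "DFE", 2000, "DEL"),
        (6, "EGF", 9000, "DEL"),
    ],
)

# The subquery computes one average per department; the outer query keeps
# only employees earning strictly more than their own department's average.
query = """
SELECT e.EmpID, e.EmpName, e.Salary, e.Department
FROM employee e
JOIN (
    SELECT Department, AVG(Salary) AS avg_salary
    FROM employee
    GROUP BY Department
) d ON e.Department = d.Department
WHERE e.Salary > d.avg_salary
ORDER BY e.EmpID
"""
above_average = conn.execute(query).fetchall()

for row in above_average:
    print(row)
```

In PySpark, one equivalent would be to add the department average as a column using avg over Window.partitionBy("Department") and then filter on Salary being greater than that column.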

21) Write a SQL query and a PySpark DataFrame operation to get the latest record of each customer.
Sample Input:
customer_id |ph_num|date |
1 |123 |2020-10-31|
2 |456 |2020-10-31|
3 |789 |2020-10-31|
1 |654 |2020-11-31|
2 |543 |2020-10-03|
1 |908 |2020-10-04|
4 |123 |2020-10-02|
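
The SQL half of question 21 could look like the sketch below, executed against an in-memory SQLite database as an assumption. The table name customer is hypothetical, and the dates are stored as text exactly as given in the sample input (including 2020-11-31) and compared as ISO-formatted strings, which sort in chronological order.

```python
import sqlite3

# In-memory SQLite database loaded with the sample input (assumption).
# The "date" column is plain text, copied verbatim from the sample.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE customer (customer_id INTEGER, ph_num INTEGER, "date" TEXT)')
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [
        (1, 123, "2020-10-31"),
        (2, 456, "2020-10-31"),
        (3, 789, "2020-10-31"),
        (1, 654, "2020-11-31"),
        (2, 543, "2020-10-03"),
        (1, 908, "2020-10-04"),
        (4, 123, "2020-10-02"),
    ],
)

# The subquery finds each customer's maximum date; joining back on
# (customer_id, date) returns the full row that carries that date.
query = """
SELECT c.customer_id, c.ph_num, c."date"
FROM customer c
JOIN (
    SELECT customer_id, MAX("date") AS max_date
    FROM customer
    GROUP BY customer_id
) m ON c.customer_id = m.customer_id AND c."date" = m.max_date
ORDER BY c.customer_id
"""
latest = conn.execute(query).fetchall()

for row in latest:
    print(row)
```

A common PySpark alternative is to number the rows with row_number over a window partitioned by customer_id and ordered by date descending, then keep only the rows numbered 1.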

22) Write a Python function to append one Python list to another.
Example:
l = ['foo', 'bar', 'bazz']
func_to_appnd_two_list_vrb(l, None)
print(l)

l = ['foo', 'bar', 'bazz']
func_to_appnd_two_list_vrb(l, ['hello'])
print(l)

l = ['foo', 'bar', 'bazz']
func_to_appnd_two_list_vrb(l, ['hello', 'world'])
print(l)

### Output: ###
1. ['foo', 'bar', 'bazz']
2. ['foo', 'bar', 'bazz', 'hello']
3. ['foo', 'bar', 'bazz', 'hello', 'world']
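
A minimal sketch of one way to answer question 22 is shown below. The function name comes from the example; the parameter names target and extra, the None default, and returning the list are assumptions made for illustration. Since the example prints l after the call, the sketch modifies the first list in place.

```python
def func_to_appnd_two_list_vrb(target, extra=None):
    """Append the items of `extra` to `target` in place.

    Passing None for `extra` means there is nothing to add.
    """
    if extra is not None:
        target.extend(extra)
    # Returning the list is a convenience (assumption); callers may ignore it.
    return target


# Reproduce the three calls from the example.
results = []
for extra in (None, ["hello"], ["hello", "world"]):
    l = ["foo", "bar", "bazz"]
    func_to_appnd_two_list_vrb(l, extra)
    print(l)
    results.append(l)
```

Checking for None explicitly, rather than relying on truthiness, keeps the intent clear and avoids calling extend with a value that is not iterable.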

The interview was conducted over a Microsoft Teams video call.

To learn more about the company, see its website: Home (impetus.com)
For the company's rating on Glassdoor, see: Impetus Technologies Reviews | Glassdoor
For the company's profile on LinkedIn, see: Impetus Technologies | LinkedIn

Thank you for reading this post.
