Big Data Archives - BioChemiThon

Big Data Engineer Interview Questions

April 14, 2024 (updated July 7, 2024) by Ankit Rai

Preparing for an interview in the Big Data field can be challenging, given the diverse range of technologies and methodologies involved. To help you prepare, I’ve compiled an extensive collection of Big Data interview questions asked by different companies in the industry.

Categories: Big Data, Interview Experience | Tags: Big data, Interview Experience

KPMG | Big Data Engineer Interview Questions

March 16, 2024 (updated April 15, 2024) by Ankit Rai

In this article, we will go through the questions asked in a KPMG India interview for a Big Data candidate with 2+ years of experience.
Let’s see the questions:

Categories: Big Data, Interview Experience | Tags: Big data, Interview Questions

Python | How to Set Up a Snowpark Environment on a Local Machine

March 9, 2024 by Ankit Rai

Setting up a Snowpark environment on your local machine allows you to leverage the power of Snowflake for data processing and analytics. Whether you’re a data engineer, data scientist, or data analyst, having a local Snowpark environment can significantly enhance your productivity and facilitate experimentation. In this post, we’ll walk you through the steps to set up a Snowpark environment on your local machine.

Categories: Big Data, Python | Tags: Big data, python, snowflake, snowpark

Deloitte | Big Data Engineer Interview Experience

February 11, 2024 by Ankit Rai

In this article, we will go through the questions asked in a Deloitte interview for a Big Data candidate with 3+ years of experience.

Let’s see the questions:
1) Briefly introduce yourself.
2) What is the difference between head() and take() in the Spark DataFrame API?
…

Categories: Big Data, Interview Experience | Tags: Data Engineer, Interview Experience, Interview Questions

Introduction to Apache Airflow: Simplifying Workflow Automation

August 13, 2023 by Ankit Rai

In the world of data and task automation, managing workflows efficiently is crucial. This is where Apache Airflow comes into play. Imagine having a tool that can help you automate and schedule tasks, coordinate data flows, and handle complex workflows seamlessly. This is exactly what Airflow does, making it an essential tool for modern data engineers and developers. In this article, we’ll take a beginner-friendly journey into the world of Airflow and explore its core concepts.

Categories: Apache-Airflow, Big Data, Python | Tags: apache-airflow, Big data, python

Python | Generating Fake Data using Python

April 29, 2023 by Ankit Rai

Data generation plays a critical role in fields such as machine learning, data mining, and artificial intelligence. However, collecting large amounts of real data can be time-consuming and expensive. Generating fake data with a tool like the Mimesis module in Python can therefore be an efficient alternative.

Categories: Big Data, Programs, Python

Databricks | How to Create a Free Account on Databricks?

April 22, 2023 by Ankit Rai

Databricks is a cloud-based data engineering platform that lets data scientists, analysts, and engineers collaborate to build and deploy data-driven applications. In this article, we will guide you through the process of creating a free Databricks Community Edition account. Community Edition is a limited Databricks environment for personal use and training.

Categories: Big Data | Tags: Big data, databricks

Apache Airflow | Write your first DAG in Apache Airflow

April 16, 2023 by Ankit Rai

Apache Airflow is an open-source platform that allows developers to programmatically create, schedule, and monitor workflows as directed acyclic graphs (DAGs). With Airflow, you can define complex workflows with dependencies and execute them automatically or manually. In this article, we will guide you through the process of setting up Airflow and creating your first DAG.
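To make the idea of a DAG concrete, here is a minimal sketch that uses only Python's standard-library `graphlib` module (Python 3.9+), not Airflow itself. The task names and the helper functions `execution_order` and `is_acyclic` are hypothetical and chosen for illustration. It shows how declared dependencies determine a valid run order, and why a cycle makes a workflow impossible to schedule.

```python
from graphlib import CycleError, TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each task maps to the set of tasks it depends on.
workflow = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
    "notify": {"load"},
}


def execution_order(dag):
    """Return one valid run order in which every task follows its dependencies."""
    return list(TopologicalSorter(dag).static_order())


def is_acyclic(dag):
    """Return True when the dependency graph contains no cycle."""
    try:
        execution_order(dag)
        return True
    except CycleError:
        return False


order = execution_order(workflow)
print(order)
```

In this sketch, "report" and "notify" both depend only on "load", so either may come first in the printed order. A scheduler such as Airflow works from the same kind of dependency information when it decides which tasks are ready to run.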

Categories: Apache-Airflow, Big Data, Programs, Python | Tags: apache-airflow, Big data, python

BigData | Difference between ELT and ETL

January 28, 2023 by Ankit Rai

As data professionals, one of the most important aspects of our job is to ensure that data is accurate, timely, and accessible for analysis. Two common approaches to data integration are ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform).
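The difference in ordering can be illustrated with a minimal sketch, assuming a tiny in-memory SQLite database stands in for the target system. The sample rows, table names, and helper functions are hypothetical. In the ETL path the rows are cleaned in Python before they are loaded. In the ELT path the raw rows are loaded first and then cleaned with SQL inside the database.

```python
import sqlite3

# Hypothetical raw records, as if already extracted from a source system.
RAW_ROWS = [("alice", "10.50"), ("bob", "  7.25 "), ("carol", "3")]


def run_etl(conn):
    """ETL: transform in application code first, then load the cleaned rows."""
    conn.execute("CREATE TABLE sales_etl (customer TEXT, amount REAL)")
    cleaned = [(name.title(), float(amount)) for name, amount in RAW_ROWS]
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", cleaned)


def run_elt(conn):
    """ELT: load the raw rows as-is, then transform inside the database."""
    conn.execute("CREATE TABLE sales_raw (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", RAW_ROWS)
    conn.execute(
        """
        CREATE TABLE sales_elt AS
        SELECT upper(substr(customer, 1, 1)) || substr(customer, 2) AS customer,
               CAST(trim(amount) AS REAL) AS amount
        FROM sales_raw
        """
    )


def totals():
    """Run both pipelines and return the total amount produced by each."""
    conn = sqlite3.connect(":memory:")
    run_etl(conn)
    run_elt(conn)
    etl = conn.execute("SELECT SUM(amount) FROM sales_etl").fetchone()[0]
    elt = conn.execute("SELECT SUM(amount) FROM sales_elt").fetchone()[0]
    conn.close()
    return etl, elt


print(totals())
```

Both paths end with the same cleaned data. What differs is where the transformation runs, and that in the ELT path the untouched raw table stays available in the database for later use.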

Categories: Big Data | Tags: Big data, ELT, ETL

Why Do We Need Big Data Technology?

December 10, 2022 by Ankit Rai

Why Big Data?

To process huge amounts of data that traditional systems (like your PC or laptop) are not capable of processing.
To process huge amounts of data, we first need to store it.

Example: Suppose we need to store 150 TB of data. Can a traditional system or laptop with 1 TB of capacity store that much data? No.
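The arithmetic behind this example can be sketched in a few lines of Python. The function name `machines_needed` is hypothetical, and the calculation assumes no replication and no space reserved for anything other than the data itself.

```python
import math


def machines_needed(data_tb, capacity_tb):
    """Minimum number of machines needed to hold data_tb terabytes of data."""
    return math.ceil(data_tb / capacity_tb)


# 150 TB of data spread over machines with 1 TB of storage each.
print(machines_needed(150, 1))  # 150
```

Under these assumptions, a single 1 TB machine holds less than 1% of the data, so the storage has to be spread across many machines.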

Categories: Big Data | Tags: Big data, Data Engineer
