Mahender Reddy

  • Hadoop Developer
  • Atlanta, GA

PROFESSIONAL SUMMARY:

·         8 years of overall IT experience in analysis, design, and development using Hadoop, Java, and J2EE.

·         3+ years of experience in Big Data technologies and Hadoop ecosystem projects such as MapReduce, YARN, HDFS, Apache Cassandra, Spark, NoSQL, HBase, Oozie, Hive, Tableau, Sqoop, Pig, Storm, Kafka, HCatalog, ZooKeeper, and Flume.

·         Worked with the Big Data distributions Cloudera CDH5, CDH4, and CDH3, and Hortonworks 2.5.

·         Used Ambari to configure the initial development environment on a standalone Hortonworks sandbox and to monitor the Hadoop ecosystem.

·         Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.

·         Knowledge of Data Analytics and Business Analytics processes. 

·         Hands-on experience with Spark Streaming to receive real-time data from Kafka.

·         Created Spark SQL queries for faster query processing.

·         Experience in ingesting streaming data into Hadoop using Spark, the Storm framework, and Scala.

·         Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.

·         Experienced in performance tuning of ETL processes.

·         Implemented various ETL solutions per business requirements using Informatica.

·         Experienced with test frameworks for Hadoop using MRUnit.

·         Performed data analytics using Pig, Hive, and R for the data scientists on the team.

·         Worked extensively with the data visualization tool Tableau and the graph database Neo4j.

·         Worked on a 32+ node Apache/Cloudera 5.9.2 Hadoop cluster for the PROD environment; used tools like Sqoop and Flume for data ingestion from different sources into Hadoop, and Hive/Spark SQL to generate reports for analysis.

·         Experience with various Python packages such as NumPy, SQLAlchemy, matplotlib, Beautiful Soup, pickle, PySide, SciPy, and PyTables.

·         Experience in managing and reviewing Hadoop log files.

·         Responsible for smooth, error-free configuration of the DWH ETL solution and its integration with Hadoop.

·         Good experience with Python frameworks like Flask and webapp2.

·         Extended Hive and Pig core functionality with custom user-defined functions (UDFs), user-defined table-generating functions (UDTFs), and user-defined aggregating functions (UDAFs); an illustrative sketch follows this summary.

·         Expertise in developing enterprise applications based on J2EE technologies like JDBC, Servlets, JSP, Struts, Stripes, EJB, Spring, and Hibernate.

·         Good understanding of RDBMS concepts through database design and writing queries against Oracle, SQL Server, DB2, and MySQL.

·         Worked extensively with dimensional modeling, data migration, data cleansing, data profiling, and ETL processes for data warehouses.

·         A team player and self-motivator with excellent analytical, communication, problem-solving, decision-making, and organizational skills.
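
As referenced in the UDF bullet above, a minimal sketch of a custom Hive UDF in Java, assuming the classic org.apache.hadoop.hive.ql.exec.UDF base class; the package, class, and function names are hypothetical:

    package com.example.hive.udf; // hypothetical package

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Upper-cases a string column; null-safe so bad rows pass through as NULL.
    public final class ToUpper extends UDF {
        public Text evaluate(final Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().toUpperCase());
        }
    }

Such a UDF would be registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION to_upper AS 'com.example.hive.udf.ToUpper'.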

TECHNICAL SKILLS:

Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, Cassandra, Impala, Oozie, ZooKeeper, MapR, Amazon Web Services, EMR, MRUnit, Spark, Storm, R, RStudio

Java & J2EE Technologies: Core Java, JDBC, Servlets, JSP, JNDI, Struts, Spring, Hibernate, Web Services (SOAP and RESTful)

IDEs: Eclipse, MyEclipse, IntelliJ

Frameworks: MVC, Struts, Hibernate, Spring

Programming Languages: C, C++, Java, Python, Linux shell scripts, R

Databases: Oracle 11g/10g/9i, MySQL, DB2, MS SQL Server, MongoDB, Graph DB (Neo4j)

Web Servers: WebLogic, WebSphere, Apache Tomcat

Web Technologies: HTML, XML, JavaScript, AJAX, RESTful web services

Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP

ETL Tools: Informatica, QlikView, Cognos

PROFESSIONAL EXPERIENCE:

Client: InterContinental Hotels Group (IHG), GA                                      July 2015 - July 2017

Hadoop Developer

Responsibilities:

·         Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke MapReduce jobs in the backend.

·         Wrote MapReduce jobs to parse web logs stored in HDFS.

·         Imported and exported data into HDFS and Hive using Sqoop.

·         Developed Hive queries for analysis to categorize different items.

·         Worked on Big Data integration and analytics based on Hadoop, Solr, Spark, Kafka, Storm, and webMethods technologies.

·         Created Hive queries to compare raw data with EDW reference tables and perform aggregations.

·         Good experience with all major Hadoop distributions (Cloudera, Hortonworks, MapR).

·         Worked with Impala for the data retrieval process.

·         Experience in partitioning big data according to business requirements using Hive indexing, partitioning, and bucketing.

·         Responsible for the design and development of Spark SQL scripts based on functional specifications.

·         Responsible for Spark Streaming configuration based on the type of input source.

·         Developed services to run MapReduce jobs on an as-needed basis.

·         Responsible for loading data from UNIX file systems to HDFS; installed and configured Hive and wrote Pig/Hive UDFs.

·         Used HCatalog to access Hive table metadata from MapReduce and Pig code.

·         Developed business logic using Scala.

·         Coordinated with end users on the design and implementation of analytics solutions for user-based recommendations in R, per project proposals.

·         Created Talend jobs to populate data into dimension and fact tables.

·         Loaded and transformed data into HDFS from large sets of structured data in Oracle/SQL Server using Talend Big Data Studio.

·         Used Python to extract weekly hotel availability information from XML files.

·         Experience with Python OpenStack APIs.

·         Wrote MapReduce (Hadoop) programs to convert text files into Avro and load them into Hive tables.

·         Implemented workflows using the Apache Oozie framework to automate tasks.

·         Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.

·         Helped the analytics team with Aster queries using HCatalog.

·         Developed design documents considering all possible approaches and identifying the best of them.

·         Managed and guided the deposit team during their move to Hadoop using the Syncsort tool; the deposit application was rewritten from mainframe ETL to Hadoop to load into W (the Teradata warehouse), using DMX-h (Syncsort) for ETL.

·         Involved in EDW mappings, sessions, and workflows.

·         Loaded data into HBase using bulk and non-bulk loads.

·         Developed scripts and automated data management end to end, including sync-up between all the clusters.

·         Used Spark Streaming to consume topics from the distributed messaging source Kafka and periodically push batches of data to Spark for real-time processing (see the streaming sketch after this list).

·         Imported data from different sources like HDFS/HBase into Spark RDDs.

·         Wrote Apache Spark Streaming API code on the Big Data distribution in the active cluster environment.

·         Worked with real-time streaming applications using tools like Spark Streaming, Storm, and Kafka.

·         Monitored and troubleshot the Kafka-Storm-HDFS data pipeline for real-time data ingestion into the data lake in HDFS.

·         Explored Spark for improving the performance and optimization of existing algorithms in Hadoop.

·         Experienced with SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.

·         Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Scala, and Python (see the Spark SQL sketch after this list).

·         Developed traits, case classes, etc. in Scala.
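
As referenced in the Kafka bullet above, a minimal sketch of consuming a Kafka topic with Spark Streaming in Java, assuming Spark 2.x with the spark-streaming-kafka-0-10 integration; the broker address, topic name, and HDFS path are hypothetical:

    import java.util.Collection;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;

    public class KafkaSparkIngest {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("KafkaSparkIngest");
            // Pull a micro-batch from Kafka every 10 seconds.
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "broker1:9092"); // hypothetical broker
            kafkaParams.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            kafkaParams.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            kafkaParams.put("group.id", "weblog-ingest");

            Collection<String> topics = Collections.singletonList("weblogs"); // hypothetical topic

            JavaInputDStream<ConsumerRecord<String, String>> stream =
                    KafkaUtils.createDirectStream(
                            jssc,
                            LocationStrategies.PreferConsistent(),
                            ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams));

            // Persist each micro-batch of message values to HDFS for downstream analysis.
            stream.map(ConsumerRecord::value)
                  .foreachRDD(rdd -> rdd.saveAsTextFile(
                          "hdfs:///data/weblogs/" + System.currentTimeMillis()));

            jssc.start();
            jssc.awaitTermination();
        }
    }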
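
Likewise, for the Spark SQL bullet above, a minimal sketch of running a Hive-style query through the Spark 2.x Java API with Hive support; the table and column names are hypothetical stand-ins for the EDW comparison queries:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class HiveToSparkSql {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("HiveToSparkSql")
                    .enableHiveSupport() // reuse tables registered in the Hive metastore
                    .getOrCreate();

            // A Hive-style aggregate executed as a Spark SQL query.
            Dataset<Row> counts = spark.sql(
                    "SELECT category, COUNT(*) AS cnt FROM weblogs GROUP BY category");

            counts.show();
            spark.stop();
        }
    }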

Environment: HDFS, MapReduce, Hive, Spark, Flume, Cloudera, Pig, HBase, HCatalog, Oozie, Sqoop, Java, Maven, Scala, R, Impala, Python, AngularJS, Splunk, Oracle, Syncsort, YARN, GitHub, JUnit, Tableau, Unix, Tomcat.

Client: American Medical Response (AMR), CA                                  August 2014 - June 2015

Hadoop Developer

Responsibilities: 

·         Involved in all phases of the Software Development Life Cycle (SDLC) and worked on all activities related to development, implementation, administration, and support for Hadoop.

·         Installed and configured Apache Hadoop clusters for application development, along with Hadoop tools like Hive, Pig, HBase, ZooKeeper, and Sqoop.

·         Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing (see the sketch after this list).

·         Worked with the team to grow the cluster from 28 to 42 nodes; the additional data nodes were configured through the Hadoop commissioning process.

·         Involved in creating a Spark cluster in HDInsight by creating Azure compute resources with Spark installed and configured.

·         Involved in implementing an HDInsight 3.3 cluster, which is based on Spark 1.5.1.

·         Good knowledge of the components used in the cluster, such as Spark Core, Spark SQL, and the Spark Streaming APIs.

·         Managed and scheduled Jobs on a Hadoop cluster. 

·         Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files.

·         Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters. 

·         Wrote Python scripts to parse XML documents and load the data into a database.

·         Experience in converting MapReduce applications to Spark. 

·         Used HCatalog to load data into Pig and wrote Pig Latin scripts.

·         Generated property lists for every application dynamically using Python.

·         Developed consumer-facing features and applications using Python and Django with test-driven development.

·         Involved in defining job flows, managing and reviewing log files. 

·         Installed the Oozie workflow engine to run multiple MapReduce, HiveQL, and Pig jobs.

·         Collected log data from web servers and integrated it into HDFS using Flume.

·         As Cassandra developer, set up, configured, and optimized the Cassandra cluster; developed a real-time Java-based application to work with the Cassandra database.

·         Involved in HDFS maintenance and administered it through the Hadoop Java API.

·         Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.

·         Constructed system components and developed the server side using Java, EJB, and the Spring Framework; involved in designing the data model for the system.

·         Used J2EE design patterns like DAO, Model, Service Locator, MVC, and Business Delegate.

·         Defined Interface Mapping between JDBC Layer and Oracle Stored Procedures. 

·         Experience in managing and reviewing Hadoop log files. 

·         Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.

·         Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop; worked on tuning the performance of Pig queries.

·         Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop. 

·         Implemented best income logic using Pig scripts and UDFs. 

·         Performed component unit testing using the Azure emulator.

·         Analyzed escalated incidents within the Azure SQL database.

·         Implemented test scripts to support test-driven development and continuous integration.
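
As referenced in the data-cleansing bullet above, a minimal sketch of a map-only MapReduce cleansing job in Java; the expected field count and the comma delimiter are hypothetical:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CleanseJob {

        public static class CleanseMapper
                extends Mapper<LongWritable, Text, NullWritable, Text> {

            private static final int EXPECTED_FIELDS = 5; // hypothetical record width

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String line = value.toString().trim();
                // Drop blank lines and rows with the wrong number of fields.
                if (line.isEmpty() || line.split(",", -1).length != EXPECTED_FIELDS) {
                    return;
                }
                context.write(NullWritable.get(), new Text(line));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "cleanse");
            job.setJarByClass(CleanseJob.class);
            job.setMapperClass(CleanseMapper.class);
            job.setNumReduceTasks(0); // map-only: cleansed rows go straight to output
            job.setOutputKeyClass(NullWritable.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }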

Environment: Hadoop, MapReduce, Spark, Cloudera, Shark, Kafka, HDFS, ZooKeeper, Hive, Pig, Oozie, Core Java, Eclipse, HBase, Sqoop, Flume, Oracle 11g, Cassandra, SQL, SharePoint, Azure 2015, UNIX shell scripting.

Client: PayPal, CA                                                                               January 2014 - July 2014

Hadoop Developer

Responsibilities: 

·         Evaluated the suitability of Hadoop and its ecosystem for the project and implemented various proof-of-concept (POC) applications to eventually adopt them and benefit from the Big Data Hadoop initiative.

·         Estimated software and hardware requirements for the NameNode and DataNodes and planned the cluster.

·         Extracted the needed data from the server into HDFS and bulk-loaded the cleaned data into HBase (see the sketch after this list).

·         Wrote MapReduce programs and Hive UDFs in Java where the functionality was too complex.

·         Involved in loading data from the Linux file system to HDFS.

·         Developed Hive queries for analysis to categorize different items.

·         Designed and created Hive external tables using a shared metastore instead of Derby, with partitioning, dynamic partitioning, and buckets.

·         Delivered a POC of Flume to handle real-time log processing for attribution reports.
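
As referenced in the HBase bullet above, a minimal sketch of writing cleaned records into HBase through the Java client API; the production path used HBase bulk loading, so this shows only the simpler client-side write for illustration, and the table, column family, and values are hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseWriter {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml

            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("txn_events"))) { // hypothetical table

                // One cleaned record: a row key plus a column in family "d".
                Put put = new Put(Bytes.toBytes("row-001"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("amount"), Bytes.toBytes("42.50"));
                table.put(put);
            }
        }
    }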
