
Vedavyas Java
- Hadoop Developer
- Atlanta, GA
Professional Summary
- Over 8 years of experience in analysis, development, testing, maintenance and user training of software applications, including over 4 years in Big Data, Hadoop and HDFS environments and around 4 years of experience in SQL, Java and J2EE.
- Experience in developing MapReduce programs using Apache Hadoop to analyze big data as per requirements.
- Hands-on experience using Sqoop to import data into HDFS from RDBMS and vice versa.
- Used different SerDes such as the Regex SerDe and the HBase SerDe.
- Experience in analyzing data using Hive, Pig Latin and custom MapReduce programs in Java.
- Hands-on experience writing Spark SQL scripts and implementing Spark RDD transformations and actions using Python/Scala (a brief sketch follows this summary).
- Well versed in developing and implementing Spark programs using Python/Scala and Spark Streaming to work with big data.
- Hands-on experience writing custom UDFs to extend Hive and Pig core functionality.
- Hands-on experience extracting data from log files and copying it into HDFS using Flume.
- Wrote Hadoop test cases for validating inputs and outputs.
- Hands-on experience integrating Hive and HBase.
- Experience in Elasticsearch and Solr.
- Experience in NoSQL databases: MongoDB, HBase and Cassandra.
- Good experience working with real-time streaming applications using tools like Spark Streaming, Storm and Kafka.
- Hands-on experience with job scheduling and monitoring tools like Oozie and ZooKeeper.
- Experience handling different file formats such as XML, JSON, Avro, ORC and Parquet in Hive using different SerDes.
- Experience in dimensional data modeling using star and snowflake schemas.
- Worked on reusable code known as tie-outs to maintain data consistency.
- Clear understanding of Hadoop architecture and its components such as HDFS, JobTracker and TaskTracker, NameNode and DataNode, Secondary NameNode and MapReduce programming.
- Experience in Hadoop administration activities such as installation and configuration of clusters using Apache and Cloudera distributions.
- Knowledge of installing, configuring and using Hadoop components like Hadoop MapReduce (MR1), YARN (MR2), HDFS, Hive, Pig, Flume and Sqoop.
- More than one year of experience in Java, J2EE, Web Services, SOAP, HTML and XML related technologies, demonstrating strong analytical and problem-solving skills, computer proficiency and the ability to follow projects through from inception to completion.
- Extensive experience working with Oracle, DB2, SQL Server and MySQL databases and Java core concepts such as OOP, multithreading, collections and IO.
- Hands-on experience with JAX-WS, JSP, Servlets, Struts, WebLogic, WebSphere, Hibernate, Spring, JBoss, JDBC, RMI, JavaScript, Ajax, jQuery, Linux, Unix, XML, HTML, Python, Scala and Vertica.
- Red Hat certified in Linux.
- Developed applications using Java, RDBMS and Linux shell scripting.
- Good understanding of data mining and machine learning techniques.
- Configured Git with Jenkins and scheduled jobs using the Poll SCM option.
- Used the Jenkins AWS CodeDeploy plugin for deployments and Chef for unattended bootstrapping in AWS.
- Good interpersonal and communication skills, strong problem-solving skills, able to explore and adopt new technologies with ease, and a good team member.
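The following is a minimal PySpark sketch of the kind of Spark RDD transformation and action work referenced in the summary above; the HDFS path, delimiter and field positions are illustrative assumptions, not details from a specific project.

from pyspark import SparkContext

# Illustrative sketch only: the input path and record layout are assumptions.
sc = SparkContext(appName="rdd-transformations-sketch")

# Load raw pipe-delimited records from HDFS (hypothetical path).
lines = sc.textFile("hdfs:///data/raw/events")

# Transformations: parse each line, drop malformed rows, key records by event type.
event_counts = (lines.map(lambda line: line.split("|"))
                     .filter(lambda fields: len(fields) >= 2)
                     .map(lambda fields: (fields[1], 1))
                     .reduceByKey(lambda a, b: a + b))

# Action: bring the per-type counts back to the driver.
for event_type, count in event_counts.collect():
    print("%s\t%d" % (event_type, count))

sc.stop()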
Technical Skills
Hadoop/Big Data Technologies: HDFS, MapReduce, YARN, Pig, HBase, Spark, ZooKeeper, Hive, Oozie, Sqoop, Flume, Kafka, Storm, Impala
Hadoop Distributions: Hortonworks, Cloudera, MapR
Programming Languages: Java (JDK 1.6/1.8), Python, Scala, C/C++, HTML, SQL, PL/SQL, AVS & JVS
Frameworks: Hibernate 2.x/3.x, Spring 2.x/3.x, Struts 1.x/2.x
Web Services: WSDL, SOAP, Apache CXF/XFire, Apache Axis, REST, Jersey
Operating Systems: UNIX, Windows, Linux
Web/Application Servers: IBM WebSphere, Tomcat, WebLogic, JBoss
Web Technologies: JSP, Servlets, JNDI, JDBC, JavaBeans, JavaScript
Databases: Teradata, Oracle, Netezza, MySQL
NoSQL Databases: HBase, Cassandra, MongoDB
Java IDEs: Eclipse 3.x, IBM WebSphere Application Developer, IBM RAD 7.0
Development Tools: SOAP UI, ANT, Jenkins, Nexus, Maven, Visio, Rational Rose
Work Experience
T-Mobile, GA Mar 2016 – Present
Hadoop Developer
Responsibilities:
- Installed, configured and maintained Apache Hadoop clusters for application development along with Hadoop tools like Hive, Pig, ZooKeeper and Sqoop.
- Configured, designed, implemented and monitored the Kafka cluster and its connectors.
- Implemented proofs of concept (PoCs) using Kafka, Storm and HBase for processing streaming data.
- Used Sqoop to import data into HDFS and Hive from multiple data systems.
- Developed complex queries using Hive and Impala.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
- Helped with sizing and performance tuning of the Cassandra cluster.
- Involved in converting Cassandra/Hive/SQL queries into Spark transformations using Spark RDDs (see the sketch following these responsibilities).
- Developed multiple PoCs using Spark, deployed them on the YARN cluster and compared the performance of Spark with Cassandra and SQL.
- Involved in Cassandra data modeling and building efficient data structures.
- Analyzed the Cassandra/SQL scripts and designed the solution.
- Extracted data from Teradata into HDFS using Sqoop.
- Analyzed the data by running Hive queries and Pig scripts to understand user behavior such as shopping patterns.
- Configured Oozie workflows to run multiple Hive and Pig jobs, which run independently based on time and data availability.
- Optimized MapReduce code and Pig scripts, and performed tuning and analysis.
- Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark.
- Developed Spark applications using Python (PySpark).
- Exported the aggregated data to Oracle using Sqoop for reporting on the Tableau dashboard.
- Involved in the design, development and testing phases of the software development life cycle.
- Performed Hadoop installation, updates, patches and version upgrades when required.
- Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
- Used AWS storage services EBS, S3 and Glacier and automated data sync to Glacier; used AWS database and related services such as RDS, DynamoDB, Elastic Transcoder, CloudFront and Elastic Beanstalk; migrated two instances from one region to another.
- Leveraged AWS cloud services such as EC2, Auto Scaling and VPC (Virtual Private Cloud) to build secure, highly scalable and flexible systems that handled expected and unexpected load bursts.
- Automated various infrastructure activities such as continuous deployment, application server setup and stack monitoring using Ansible playbooks, and integrated them with Jenkins.
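The sketch below illustrates converting a Hive/SQL style aggregation into Spark operations, as mentioned in the responsibilities above; the database, table, columns and output path are hypothetical names used only for illustration.

from pyspark.sql import SparkSession

# Illustrative sketch only: table, column and path names are assumptions.
spark = (SparkSession.builder
         .appName("hive-to-spark-sketch")
         .enableHiveSupport()
         .getOrCreate())

# Equivalent of a HiveQL aggregation expressed as Spark transformations.
usage = spark.table("telecom.daily_usage")        # hypothetical Hive table
summary = (usage.filter(usage.call_minutes > 0)
                .groupBy("subscriber_id")
                .sum("call_minutes", "data_mb"))

# Write the aggregate back to HDFS; from there it could be exported to
# Oracle with Sqoop for Tableau reporting, matching the flow described above.
summary.write.mode("overwrite").parquet("hdfs:///data/aggregates/usage_summary")

spark.stop()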
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Oozie, Java, Eclipse, Cloudera, Cassandra, AWS, Oracle 10g/11g, Flume, Kafka, Scala, Spark, Sqoop, Python.
Anthem Health Insurance, GA Aug 2015 – Feb 2016
Hadoop Developer
Responsibilities:
- Primary responsibilities included building scalable distributed data solutions using the Hadoop ecosystem.
- Loaded datasets daily from two different sources, Oracle and MySQL, into HDFS and Hive respectively.
- Installed and configured Hive on the Hadoop cluster.
- Worked with the HBase Java API to populate an operational HBase table with key-value data.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (a Hadoop Streaming sketch in Python follows these responsibilities).
- Developed and ran MapReduce jobs on YARN and Hadoop clusters to produce daily and monthly reports per user needs.
- Scheduled and managed jobs on the Hadoop cluster using Oozie workflows.
- Developed multiple MapReduce programs in Java for data extraction, transformation and aggregation from multiple file formats including XML, JSON and CSV.
- Used Sqoop to load data from MySQL into HDFS on a regular basis.
- Integrated Apache Storm with Kafka to perform web analytics and loaded clickstream data from Kafka into HDFS, HBase and Hive through Storm.
- Worked on migrating data from MongoDB to Hadoop.
- Developed Pig UDFs to pre-process the data for analysis.
- Designed and developed Pig Latin scripts to process data in batches for trend analysis.
- Developed Hive scripts to meet analysts' analysis requirements.
- Developed Java code to generate, compare and merge Avro schema files.
- Developed simple to complex MapReduce and streaming jobs, implemented using Java, Hive and Pig.
- Collected log data from web servers and ingested it into HDFS using Flume.
- Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
- Automated and scheduled the Sqoop jobs using Unix shell scripts.
- Analyzed the data by running Hive queries (HiveQL) and Pig Latin scripts to study customer behavior.
- Developed data cleansing techniques and UDFs using Pig scripts, HiveQL and MapReduce.
- Worked on NoSQL databases such as MongoDB.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
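The following is a minimal sketch of a data-cleaning mapper for Hadoop Streaming, illustrating the kind of cleaning and preprocessing described above; the record layout is hypothetical, and Python is used here to keep all sketches in one language even though the original jobs were written in Java.

#!/usr/bin/env python
# mapper.py: drop malformed records and emit cleaned key/value pairs.
# Submitted through Hadoop Streaming (hadoop jar hadoop-streaming.jar
#   -mapper mapper.py -reducer reducer.py -input ... -output ...).
import sys

for line in sys.stdin:
    fields = line.rstrip("\n").split(",")
    # Hypothetical layout: member_id, claim_date, amount
    if len(fields) != 3:
        continue                      # skip malformed rows
    member_id, claim_date, amount = fields
    try:
        amount = float(amount)
    except ValueError:
        continue                      # skip rows with a non-numeric amount
    # Emit a tab-separated key/value pair for the reducer.
    print("%s\t%.2f" % (member_id.strip(), amount))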
Environment: HDFS, Pig, Pig Latin, Storm, Kafka, Eclipse, Hive, MapReduce, Java, Avro, Sqoop, Linux, Cloudera, Big Data, MongoDB, JSON, XML and CSV.
Bank of America, NC Apr 2014 – Jul 2015
Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop and migrating legacy retail ETL applications to Hadoop.
- Accessed information from the equipment through mobile networks and satellites.
- Implemented ETL code to load data from multiple sources into HDFS using Pig scripts.
- Hands-on experience creating applications on social networking websites and obtaining access data.
- Wrote MapReduce jobs that used the access tokens to retrieve data from customers.
- Developed simple to complex MapReduce jobs using Hive and Pig for analyzing the data.
- Used different SerDes to convert JSON data into pipe-separated data.
- Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources.
- Used the Oozie workflow engine to run multiple Hive and Pig jobs.
- Exported the results to Teradata using Sqoop to generate reports for the BI team.
- Worked with application teams to install the operating system and apply Hadoop updates, patches and version upgrades as required.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
Environment: Hadoop, MapReduce, Cloudera Manager, HDFS, Hive, Pig, Sqoop, Oozie, Impala, SQL, Java (JDK 1.6), Eclipse and Informatica 9.1.
Mindtree - Bangalore, India Feb 2013 – Mar 2014
Java Hadoop Developer
Responsibilities:
- Involved in analyzing requirements and establishing development capabilities to support future opportunities.
- Involved in sharing data with teams that analyze and prepare reports on risk management.
- Handled importing of data from various data sources, performed transformations using Pig and MapReduce, loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
- Worked on streaming the analyzed data back to the existing relational databases using Sqoop, making it available for visualization and report generation by the BI team.
- Involved in loading and transforming large sets of structured, semi-structured and unstructured data, and analyzed them by running Hive queries and Pig scripts.
- Performed various performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins (see the sketch after this list).
- Involved in end-to-end implementation of ETL logic.
- Coordinated effectively with the offshore team and managed project deliverables on time.
- Worked on QA support activities, test data creation and unit testing activities.
- Developed Oozie workflows, scheduled through a scheduler on a monthly basis.
- Designed and developed read-lock capability in HDFS.
- Created Pig Latin scripts to sort, group, join and filter the enterprise-wide data.
- Analyzed web server log data using Apache Flume.
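The sketch below shows a map-side (broadcast) join and a partitioned write in PySpark; the original optimizations were done directly in Hive and Pig, so this is only an analogous illustration kept in Python for consistency with the other sketches, and the database, table and column names are assumptions.

from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

# Illustrative sketch only: database, table and column names are assumptions.
spark = (SparkSession.builder
         .appName("partition-mapjoin-sketch")
         .enableHiveSupport()
         .getOrCreate())

transactions = spark.table("risk.transactions")   # large fact table (hypothetical)
branches = spark.table("risk.branches")           # small dimension table (hypothetical)

# Map-side (broadcast) join: ship the small table to every executor
# instead of shuffling the large one.
joined = transactions.join(broadcast(branches), "branch_id")

# Persist the result partitioned by date, analogous to a partitioned Hive table.
(joined.write
       .mode("overwrite")
       .partitionBy("txn_date")
       .saveAsTable("risk.transactions_enriched"))

spark.stop()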
Environment: Hadoop, MapReduce, Hive, Pig, Sqoop, HBase, SQL, Oozie, Linux, UNIX.
Innova Infotech - Bangalore, India Dec 2010 – Jan 2013
Java Developer
Responsibilities:
- Worked with business analysts and helped represent the business domain details.
- Actively involved in setting coding standards and writing related documentation.
- Created the Preferred Vehicle web service using JAX-WS.
- Built the web service using the top-down approach and tested it using the SOAP UI tool.
- Used Hibernate 3.3.1 to interact with the database.
- Developed JSPs and Servlets to dynamically generate HTML and display data on the client side.
- Created an admin tool following the Struts MVC design pattern to add preferred vehicles to the database.
- Designed web applications using the MVC design pattern.
- Developed shell scripts to retrieve the vendor files dynamically and used crontab to execute these scripts periodically.
- Designed the batch process for processing vendor data files using the IBM WebSphere Application Server Task Manager framework.
- Performed unit testing using the JUnit testing framework and used Log4j to monitor the error log.
Environment: IBM RAD, IBM WebSphere Application Server 7.0, Java/J2EE, Spring 3.0, JDK 1.5, Web Services, SOAP, Servlets, JSP, ANT 1.6.x, Ajax, Hibernate 3.3.1, custom tags.
Indmax - Hyderabad, India Oct 2008 – Nov 2010
SQL Developer
Responsibilities:
- Created and modified database objects such as tables, views, procedures, functions, triggers, packages, indexes, synonyms and materialized views using Oracle tools like TOAD and SQL Navigator.
- Developed SQL and PL/SQL scripts to transfer tables across schemas and databases.
- Updated procedures, functions, triggers and packages based on change requests from users.
- Performed support activities such as job monitoring, enhancements and defect resolution.
- Worked with testing teams and performed UAT testing with business users.
- Worked with the release team on staging and production moves.
- Implemented an efficient error-handling process by capturing errors into user-managed tables.
- Pair-programmed with developers to enhance existing PL/SQL packages to fix production issues, build new functionality and improve processing time through code optimization.
Environment: Oracle, SQL Developer, TOAD, Windows 2000/XP, ASP.NET, Visual Studio.