
Ibraheem Fayemi
- Hadoop Data Engineer
- Atlanta, GA
Professional Summary
· 10+ years of professional experience in the IT industry, including 3 years of experience in Hadoop ecosystem implementation, maintenance, ETL and Big Data analytics operations.
· Experience in installation, configuration, monitoring and administration of Hadoop ecosystem components such as YARN, MapReduce, HDFS, Pig, Hive, HBase, Sqoop, Oozie, Flume, Spark and ZooKeeper for data storage and analysis.
· Extensive experience in cluster planning, installation, configuration and administration of Hadoop clusters for major Hadoop distributions such as Cloudera and Hortonworks.
· Experience in running Hadoop streaming jobs to process terabytes of XML and/or JSON data.
· In-depth knowledge of NoSQL technologies such as HBase, MongoDB, Cassandra and CouchDB.
· Experience in troubleshooting errors in HBase Shell/API, Pig, Hive and MapReduce.
· Experience in importing and exporting data between HDFS and relational database management systems using Sqoop.
· Hands-on experience writing complex MapReduce programs in Python to perform analytics based on common patterns including joins, sampling, data organization, filtering and summarization (a minimal Hadoop Streaming sketch follows this summary).
· Strong knowledge of Amazon Web Services and Microsoft Azure.
· Experience in real-time monitoring and alerting of applications deployed in AWS using CloudWatch, CloudTrail and Simple Notification Service.
· Experience in provisioning highly available, fault-tolerant and scalable applications using AWS Elastic Beanstalk, Amazon RDS, Elastic Load Balancing, Elastic MapReduce and Auto Scaling.
· Good understanding of building Big Data/Hadoop applications using AWS services such as Amazon S3, EMRFS, EMR and RDS.
· Hands-on experience creating real-time data streaming solutions using Apache Spark with Python (PySpark).
· Experience in handling messaging services using Apache Kafka.
· Experience working with different file formats such as Avro, SequenceFile and JSON.
· Good knowledge of and experience with Microsoft Business Intelligence (SSIS, SSRS, SSAS).
· Strong analytical and problem-solving skills with excellent oral and written communication skills.
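A minimal, illustrative sketch of the kind of Python Hadoop Streaming job described above (the summarization pattern). The tab-delimited record layout, script names and paths are assumptions for illustration, not details from any specific project:

    #!/usr/bin/env python
    # mapper.py - emits key<TAB>value pairs from tab-delimited input (field layout assumed)
    import sys
    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            print("%s\t%s" % (fields[0], fields[1]))

    #!/usr/bin/env python
    # reducer.py - sums values per key; Hadoop Streaming delivers reducer input sorted by key
    import sys
    current_key, total = None, 0.0
    for line in sys.stdin:
        key, value = line.rstrip("\n").split("\t", 1)
        if current_key is not None and key != current_key:
            print("%s\t%s" % (current_key, total))
            total = 0.0
        current_key = key
        try:
            total += float(value)
        except ValueError:
            pass  # skip non-numeric values
    if current_key is not None:
        print("%s\t%s" % (current_key, total))

Scripts like these would be submitted through the hadoop-streaming jar with -mapper and -reducer options and job-specific HDFS input and output paths.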
Technical Skills:
· Hadoop Ecosystems: MapReduce v1, YARN, HDFS, HBase, Zookeeper, Hive, Pig, Spark, Sqoop, Flume, Oozie, Impala, Kafka, Storm, MongoDB
· Programming Languages: T-SQL, PL/SQL, PostgreSQL, C, C++, Java.
· Scripting Languages: JavaScript, Python, Scala, PowerShell
· Databases: NoSQL, MS SQL Server
· Tools: Eclipse, MS Visual Studio
· Platforms: Windows, Linux
· NoSQL Technologies: MongoDB, Cassandra, CouchDB
· Application Servers: Apache Tomcat 5.x/6.0, JBoss 4.0
· Server Tools: SSMS, SSRS, SSIS, Database Tuning Advisor (DTA), SQL Profiler, DMVs.
WORK EXPERIENCE
Echo Global and Logistics – Atlanta, GA
April 2016 to Present
HADOOP DATA ENGINEER
Responsibilities:
· Developed and executed custom MapReduce programs, Pig Latin scripts and HQL queries.
· Used Hadoop FS scripts for HDFS (Hadoop Distributed File System) data loading and manipulation.
· Analyzed business requirements and cross-verified them against the functionality and features of NoSQL databases such as HBase and Cassandra to determine the optimal database.
· Monitored workload, job performance and node health using Cloudera Manager.
· Used Flume to collect and aggregate weblog data from different sources and pushed to HDFS.
· Automated and scheduled Sqoop jobs using Unix shell scripts.
· Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig and Sqoop.
· Developed Pig UDFs to pre-process data for analysis.
· Worked with business teams and created Hive queries for ad hoc access.
· Responsible for creating Hive tables, partitions, loading data and writing Hive queries.
· Created Pig Latin scripts to sort, group, join and filter the enterprise-wide data.
· Maintained cluster coordination services through ZooKeeper.
· Generated summary reports utilizing Hive and Pig and exported the results via Sqoop for business reporting and intelligence analysis.
· Imported millions of structured records from relational databases using Sqoop, processed them using Spark and stored the data in HDFS in CSV format.
· Created Hive, Phoenix and HBase tables, as well as HBase-integrated Hive tables, as per the design using the ORC file format and Snappy compression.
· Developed UDFs using both DataFrames/SQL and RDDs in Spark for data aggregation queries and exported the results back to OLTP systems through Sqoop.
· Developed servlets and JSPs with custom tag libraries to control business processes in the middle tier and was involved in their integration.
· Configured Spark Streaming to receive real-time data from Kafka and store the streamed data in HDFS (see the sketch after this list).
· Checkpointed RDDs to disk at various points for fault tolerance.
· Involved in writing test cases for the application using JUnit.
· Used core Java Collections Framework interfaces such as List, Set, Queue and Map.
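A minimal sketch of a Spark Streaming pipeline of the kind described above, using the DStream API with a direct Kafka stream. The broker address, topic name, HDFS paths and batch interval are placeholders, and the pyspark.streaming.kafka import assumes the matching spark-streaming-kafka package is available:

    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils

    sc = SparkContext(appName="kafka-to-hdfs")
    ssc = StreamingContext(sc, 10)                 # 10-second micro-batches
    ssc.checkpoint("hdfs:///checkpoints/weblogs")  # checkpoint to HDFS for fault tolerance

    # Broker and topic names below are placeholders.
    stream = KafkaUtils.createDirectStream(
        ssc, ["weblogs"], {"metadata.broker.list": "broker1:9092"})

    # Each record arrives as a (key, value) pair; persist the raw values to HDFS.
    stream.map(lambda kv: kv[1]).saveAsTextFiles("hdfs:///data/weblogs/raw")

    ssc.start()
    ssc.awaitTermination()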
InComm – Atlanta, GA
Jan 2014 to March 2016
HADOOP DEVELOPER
Responsibilities:
· Installed and configured Hadoop tools such as HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Flume, Kafka and Oozie.
· Loaded data from the Linux file system to HDFS. Imported and exported data into HDFS and Hive using Sqoop, and processed data in HDFS using Impala (in the Hue interface).
· Processed and analyzed data using MapReduce jobs.
· Created Hive tables to store the processed results in tabular format.
· Designed and implemented an Apache Spark streaming application.
· Pulled data from a MySQL database into HDFS using Sqoop.
· Collected and aggregated large amounts of log data using Flume and staged the data in HDFS for further analysis (see the sketch after this list).
· Developed MapReduce programs that ran on the cluster and processed unstructured data using Pig.
· Assisted in monitoring the Hadoop cluster using tools such as SSIS.
· Implemented test shell scripts to support test-driven development and continuous integration; scheduled the Oozie workflow engine to run multiple Hive and Pig jobs.
· Dumped data to Cassandra using Kafka and created Cassandra tables to store data in various formats coming from different portfolios.
· Involved in ETL of large (terabyte-scale) datasets of structured, semi-structured and unstructured data.
· Responsible for running Hadoop streaming jobs to process terabytes of XML data; utilized cluster coordination services through ZooKeeper.
· Developed Pig scripts to extract data from the web server and performed transformations, joins and some pre-aggregations before storing the data in HDFS.
· Dumped online transfer data to HBase using Kafka. Handled data imported from HBase and performed transformations using Hive; created Hive external tables, loaded the data into them and queried the data stored in HDFS using HQL.
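A minimal sketch of the kind of batch processing described above: reading Flume-staged weblog files from HDFS, summarizing them with the Spark DataFrame API and persisting the result for downstream querying. The paths, the tab-delimited field layout and the SparkSession entry point are assumptions for illustration only:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("weblog-summary").getOrCreate()

    # Paths and the tab-delimited field layout are placeholders.
    logs = spark.read.text("hdfs:///staging/flume/weblogs/*")
    parsed = (logs
              .select(F.split("value", "\t").alias("f"))
              .select(F.col("f")[0].alias("ts"),
                      F.col("f")[1].alias("url"),
                      F.col("f")[2].alias("status")))

    # Summarize hits per URL and status code, then persist the results.
    summary = parsed.groupBy("url", "status").count()
    summary.write.mode("overwrite").parquet("hdfs:///warehouse/weblog_summary")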
Merck – Atlanta, GA
August 2012 to December 2013
SQL Server Database Administrator
Responsibilities:
· Tasks included upgrading from SQL Server 2005 to SQL Server 2008, backup and restore procedures, migrating databases, testing replication, performance monitoring to resolve bottlenecks, resolving security issues at the enterprise level and locally, analyzing data via T-SQL queries, log shipping, utilizing DMVs for analysis, using SQL Profiler to monitor and measure queries, index optimization, and monitoring SQL Servers to improve performance.
· Involved in capacity planning, sizing and database growth projections
· Scheduled full and transaction log backups for user-created and system databases in the production environment using the Database Maintenance Plan wizard.
· Used Data Transformation Services (DTS)/SQL Server Integration Services (SSIS), SQL Server's Extract, Transform, Load (ETL) tooling, to populate data from source systems, creating packages for different data-loading operations for the application.
· Involved in maintaining, monitoring, and troubleshooting SQL Server performance issues.
· Assisted other DBAs in installing, testing and deploying SSRS for reporting.
· Worked closely with the network administrator and senior developer in resolving issues related to capacity planning and procedures enhancing overall performance.
· Responsible for implementing new methods of automating various maintenance processes related to the SQL Server environment.
· Involved in tasks like re-indexing, checking data integrity, backup and recovery.
· Experienced with T-SQL in writing procedures, triggers and functions.
Animal Care Services Inc. – Lagos
January 2009 to July 2012
Junior Database Administrator
Responsibilities:
· Involved in capacity planning, sizing and database growth projections
· Scheduled and maintained routine jobs, alerts and maintenance plans.
· Created and managed users, roles and groups and handled database security.
· Managed daily backups and performed recovery on request.
· Reviewed SQL Server and SQL Server Agent error logs.
· Troubleshot high-availability issues with log shipping and clustering.
· Created SSIS packages to keep data in sync between different domains and to migrate data from heterogeneous environments.
· Created and modified ETL packages using SSIS and DTS.
· Developed and managed SQL Server Reporting Services reports based on business requirements.
ComSol Computer Services Inc. – Lagos
October 2006 to November 2008
Computer Engineer
· Designed, installed and configured fundamental networks for small businesses and homes, sharing resources, addressing security issues, installing routers and hubs, and troubleshooting connection problems.
· Additionally, created small databases using Microsoft Access for small businesses and homes.
· Taught clients how to troubleshoot, configure and manage their small computer environments.
· Taught students how to write computer programs.
EDUCATION
Master of Science in Chemical Engineering, University of Lagos
Bachelor of Science in Chemical Engineering, OAU Ife
CERTIFICATIONS
Administering Microsoft SQL Server 2012/2014 Databases