
PremKalyan

premkalyank17@gmail.com

Big Data Engineer

Summary
Over 11 years of IT experience in the analysis, design, development, and maintenance of
critical web-based business applications and applications involving data warehousing
methodologies.
Over 4 years of extensive experience in Big Data Ecosystem.
Good exposure to consolidating, validating, and cleansing data from a wide range of sources, from
applications and databases to files and web services.
Good exposure in building RESTful APIs in front of different types of NoSQL storage engines.
Good experience in developing applications using Java and J2EE (Servlets, JSP, Struts, Spring,
JMS).
Good knowledge of Hadoop architecture and various Hadoop Stack elements.
Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as
Hadoop MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, and Flume.
Up-to-date, hands-on experience with Splunk (a log-based performance monitoring tool).
Exposure to Spark, Solr, Kafka, and Scala programming.
Extensively used ETL methodologies to support data extraction, transformation, and loading.
Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce
programs written in Java and Python.
Experience in working with, as well as hosting, Hadoop in cloud environments such as Amazon
AWS (EC2 and EMR).
Developed ETL processes (DataStage and Talend Open Studio) to load data from
multiple data sources into HDFS using Flume and Sqoop, and performed structural
modifications using MapReduce and Hive.
Experience in developing pipelines that ingest data from various sources and process it
with Hive and Pig.
Extended Hive and Pig core functionality by writing custom UDFs (a minimal UDF sketch follows this summary).
Experienced in building sophisticated distributed systems using REST/hypermedia web APIs
(SOA).
Good exposure to providing solutions using SOA, distributed computing, and Enterprise Service
Bus.
Domain experience in leasing, financing, telecom, retail, and healthcare, gained through work
on a variety of systems.
Authorized to work in the US for any employer
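The custom-UDF work noted above can be shown with a minimal sketch. The class below is hypothetical (its name, the column it targets, and the normalization rule are assumptions rather than details of any project listed here); it uses the classic org.apache.hadoop.hive.ql.exec.UDF API that matches the Hive 0.x versions named later in this resume.

// Hypothetical Hive UDF sketch: normalizes a free-text phone-number column to digits only.
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public final class NormalizePhoneUDF extends UDF {
    // Hive calls evaluate() once per row; returning null propagates SQL NULLs.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        String digitsOnly = input.toString().replaceAll("[^0-9]", "");
        return digitsOnly.isEmpty() ? null : new Text(digitsOnly);
    }
}

Packaged into a JAR, a class like this would typically be registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION pointing at its fully qualified class name.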

Work Experience

Hadoop Engineer, Big Data Analytics Team
Dell Inc
Round Rock, TX
Jan 2016 to Present
Responsibilities:
Understood the client's DW application, interfaces, and the business processes involved.
Involved in creating a data lake by extracting the customer's big data from various data
sources into Hadoop HDFS; this included data from Excel files, web services, databases, and
server log data.
Worked on data loads from various sources (Oracle, MySQL, DB2, MS SQL Server,
Cassandra, MongoDB, Hadoop) using Sqoop and Python scripts.

Responsible for production Hadoop cluster set-up, administration, maintenance, monitoring, and support.
Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous
data sources and make it suitable for ingestion into the Hive schema for analysis (a
cleansing-mapper sketch follows this role).
Back-end Java developer for the Data Management Platform (DMP), building RESTful
APIs in front of different types of NoSQL storage engines so that other groups could
quickly meet their big data needs.
Worked closely with architects and clients to define and prioritize their use cases and
iteratively develop APIs and architecture.
Used Hive data warehouse tool to analyze the unified historic data in HDFS to identify
issues and behavioral patterns.
Created Hive tables as per requirements, defining them as internal or external tables with
appropriate static and dynamic partitions for efficiency.
Worked with the business development team to generate customized reports and ETL
workflows in DataStage.
Monitored workload, job performance and capacity planning using Cloudera Manager.
Environment: CDH 4.5 (Apache Hadoop 2.0, Hive 0.10, Hue 2.1, Pig 0.10, Sqoop 1.4,
Oozie 3.2.0), Cassandra, MapReduce, HDFS, HBase, Splunk, Storm, Kafka.
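As a rough illustration of the MapReduce cleansing described in this role, the map-only sketch below drops malformed delimited records and trims each field before the data reaches Hive. The delimiter, field count, and class name are assumptions made for the example, not details taken from the Dell project.

// Hypothetical map-only cleansing job: skip malformed pipe-delimited records, trim fields.
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CleanseRecordsMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {

    private static final int EXPECTED_FIELDS = 8; // assumed column count

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\\|", -1); // assumed pipe-delimited input
        if (fields.length != EXPECTED_FIELDS) {
            context.getCounter("cleanse", "malformed").increment(1);
            return; // drop records that do not match the expected layout
        }
        StringBuilder cleaned = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                cleaned.append('|');
            }
            cleaned.append(fields[i].trim());
        }
        context.write(NullWritable.get(), new Text(cleaned.toString()));
    }
}

A driver for such a job would set the number of reduce tasks to zero so the cleaned records are written straight back to HDFS for Hive to pick up.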
Big Data/Hadoop Engineer, EDW Team
Verizon Communications Inc
Temple Terrace, FL
April 2013 to December 2015
Responsibilities:
Created HBase tables to load large sets of structured, semi-structured, and unstructured data
coming from UNIX, NoSQL (Cassandra), and a variety of portfolios (a client-API sketch follows this role).
Loaded the customer profiles data, customer usage information, billing information etc. onto
HDFS using Sqoop and Flume.
Integrated Hadoop into traditional ETL, accelerating the extraction, transformation, and loading
of massive structured and unstructured data.
Installed and configured Apache Hadoop, Hive and Pig environment on the prototype server.
Created data models in CQL for customer data.
Used the machine learning libraries of Mahout to perform advanced statistical procedures like
clustering and classification to determine the usage trends.
Created reports for the BI team using Sqoop to export data into HDFS and Hive.
Collaborated with the infrastructure, network, database, application and BI teams to ensure
data quality and availability.
Administrator for Pig, Hive, and HBase.
Performed various configuration tasks, including networking and iptables, hostname
resolution, user accounts and file permissions, HTTP, FTP, and SSH key-less login.
Environment: CDH4, Flume, Hive, Sqoop, Pig, Oozie, Cassandra, JDK 1.6, MapReduce, HDFS,
HBase, Storm.
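As a rough illustration of the HBase loading described in this role, the sketch below writes one customer-profile row through the HBase Java client. The table name, column family, and qualifiers are invented for the example, and it uses the newer ConnectionFactory-based client API for readability rather than the CDH4-era HTable API this project would actually have used.

// Hypothetical sketch: write one customer-profile row into HBase via the Java client API.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class CustomerProfileWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // picks up hbase-site.xml from the classpath
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("customer_profiles"))) {
            Put put = new Put(Bytes.toBytes("cust-0001")); // row key: assumed customer id
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("plan"), Bytes.toBytes("unlimited"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("region"), Bytes.toBytes("FL"));
            table.put(put);
        }
    }
}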

Hadoop Developer / DataPower Developer
Medco / Express Scripts Inc
Franklin Lakes, NJ
April 2011 to March 2013
Responsibilities:
Worked on Hadoop cluster set up, administration, maintenance, monitoring and support.
Analyzed requirements for optimization and tuned the Hadoop environment to meet the
business requirements.
Designed and documented REST/HTTP APIs, including JSON data formats and an API versioning
strategy (a versioned-resource sketch follows this role).
Loaded the customer profiles data, customer claims information, billing information etc. onto
HDFS using Sqoop and Flume.
Gathered requirements from Engineering (Statistical analysis) and Finance (Financial
Reporting) teams to design solutions on the Hadoop ecosystem.
Used Pig as an ETL tool to perform transformations, event joins, bot-traffic filtering, and some pre-aggregations before storing the data in HDFS.
Used Oozie to orchestrate MapReduce jobs and developed Pig scripts for analysis.
Used pattern-matching algorithms to recognize abnormal patterns across different sources,
built risk profiles for each customer using Hive, and stored the results in HBase.
Environment: Web services, DataPower, Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Linux,
Big Data, Mainframe, DB2, CICS.
Tools: JMeter (testing tool), MQ Testing Facility, Test Harness, InterTest, Labs tool
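As a rough illustration of the REST/HTTP API design and versioning described in this role, the JAX-RS sketch below carries the API version in the URI path and returns JSON. The resource path, class name, and stubbed payload are assumptions made for the example.

// Hypothetical JAX-RS resource: version carried in the URI path, JSON responses.
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/v1/claims")                      // bump to /v2 when the representation changes
@Produces(MediaType.APPLICATION_JSON)
public class ClaimsResource {

    @GET
    @Path("/{claimId}")
    public Response getClaim(@PathParam("claimId") String claimId) {
        // A real service would look the claim up in the backing store;
        // a stubbed JSON document keeps the sketch self-contained.
        String body = "{\"claimId\":\"" + claimId + "\",\"status\":\"OPEN\"}";
        return Response.ok(body).build();
    }
}

Keeping the version in the path lets /v1 and /v2 representations coexist while clients migrate.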

Business Intelligence Developer


Verizon, DC
Skills
HDFS * Design HDFS * Set up Hadoop Cluster * Hadoop Configuration * YARN Configuration * Hadoop in the Cloud
Map Reduce * Develop MapReduce Applications * Compression * Partitioner
PIG * Install & run Pig * Pig Latin (Structure, Statements, Expressions, Schemas) * Data Processing Operators
HIVE * Install & run Hive * HiveQL * Tables * Querying Data (Joins, Views)
Zookeeper * Install & run Zookeeper * Operations
HBASE * Install & run HBase * Operations
Splunk * Created data connectors to Splunk
Tableau * Development of dashboard reports
Sqoop * Database Imports * Performing an Export
Administration * Audit Logging * Monitoring & Maintenance
Full resume provided upon request.
