
Vishnu Shivaram

vishnushivaram108@yahoo.com
_____________
Professional Summary
Around 6 years of Information Technology experience, including 4 years in the Hadoop ecosystem.
Strong experience in using Pig/Hive for processing and analyzing large volumes of data.
Expert in creating PIG and Hive UDFs using Java in order to analyze the data efficiently.
Expert in using Sqoop to import data from different source systems into HDFS for analysis and to export results back to those systems for further processing.
Experience creating MapReduce jobs in Java as per business requirements.
Used HBase alongside Pig/Hive whenever real-time, low-latency queries were required.
Experience with in-memory computing frameworks such as Apache Spark, written in Scala.
Technical exposure to the Cassandra CLI, including creating keyspaces and column families and analyzing fetched data.
Worked with Big Data distributions like Cloudera (CDH 3 and 4) with Cloudera
Manager.
Worked with ETL tools like Talend to simplify MapReduce jobs from the front end; also knowledgeable in Pentaho and Informatica as ETL tools for Big Data.
Worked with BI tools like Tableau for report creation and further analysis from the front
end.
Extensive knowledge in using SQL queries for backend database analysis.
Expert in implementing advanced procedures like text analytics and processing using in-memory computing capabilities such as Apache Spark.
Strong knowledge of Software Development Life Cycle (SDLC).
Experienced in creating and analyzing Software Requirement Specifications (SRS) and
Functional Specification Document (FSD).
Extensive knowledge in creating PL/SQL stored procedures, packages, functions, cursors, etc. against Oracle (9i, 10g, 11g) and MySQL Server.
Strong technical skills and working knowledge of Core Java.
Knowledge of MS Azure, Platfora, and Datameer.
Involved in developing distributed enterprise and web applications using UML, Java/J2EE, and web technologies including EJB, JSP, Servlets, Struts 2, JMS, JDBC, JAX-WS, JPA, HTML, XML, XSL, XSLT, JavaScript, Tomcat, Spring, and Hibernate.
Expertise in defect management and defect tracking, including performance tuning, to deliver a high-quality product.
Experienced in providing training to team members as per project requirements.
Experienced in creating product documentation and presentations.
Worked on Windows and UNIX/Linux platforms with different technologies such as Big Data, SQL, PL/SQL, XML, HTML, Core Java, C#, and shell scripting.
Experienced in working with scripting languages such as Python.
Good communication and interpersonal skills; committed, result-oriented, and hardworking, with a quest to learn new technologies.
Experienced in working in multicultural environments, both within a team and individually, as per project requirements.

Technical Skills
Hadoop/Big Data Technologies: HDFS, MapReduce, Sqoop, Flume, Pig, Hive, Oozie, Impala, Spark, Zookeeper, Cloudera Manager
NoSQL Databases: HBase, Cassandra
Monitoring and Reporting: Tableau, custom shell scripts
Hadoop Distributions: Hortonworks, Cloudera, MapR
Build Tools: Maven, SQL Developer
Programming & Scripting: Java, C, SQL, Shell Scripting, Python
Java Technologies: Servlets, JavaBeans, JDBC, Spring, Hibernate, SOAP/REST services
Databases: Oracle, MySQL, MS SQL Server, Teradata, MongoDB, DB2
Web Dev. Technologies: HTML, XML, JSON, CSS, jQuery, JavaScript, AngularJS
Version Control: SVN, CVS, Git
Operating Systems: Linux, Unix, Mac OS X, Windows 8, Windows 7, Windows Server 2008/2003
Professional Experience
Fannie Mae, Reston, VA
Nov 2015 to present
Hadoop Developer
Fannie Mae serves the people who house America. They are a leading source of financing for
mortgage lenders, providing access to affordable mortgage financing in all markets at all
times. Its financing makes sustainable homeownership and workforce rental housing a reality
for millions of Americans.
Responsibilities:
Extracted and loaded data into HDFS using the Sqoop import and export command-line utilities.
Responsible for developing data pipeline using Flume, Sqoop, and Pig to extract the
data from weblogs and store in HDFS.
Involved in using HCatalog to access Hive table metadata from MapReduce and Pig code.
Involved in developing Hive UDFs for the needed functionality (a minimal UDF sketch follows this list).
Involved in creating Hive tables, loading with data and writing Hive queries.
Managed work including indexing data, tuning relevance, developing custom tokenizers and filters, and adding functionality such as playlists, custom sorting, and regionalization with the Solr search engine.
Used Hive to analyze the partitioned and bucketed data and compute various metrics for
reporting.
Used Pig for transformations, event joins, filtering bot traffic, and some pre-aggregations before storing the data on HDFS.
Implemented advanced procedures like text analytics and processing using in-memory computing capabilities like Spark.
Enhanced and optimized product Spark code to aggregate, group, and run data mining tasks using the Spark framework (see the aggregation sketch after this list).
Extended Hive and Pig core functionality by writing custom UDFs.
Experience in managing and reviewing Hadoop log files.
Developed data pipelines using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
Involved in emitting processed data from Hadoop to relational databases and external file
systems using Sqoop.
Orchestrated hundreds of Sqoop scripts, Pig scripts, and Hive queries using Oozie workflows and sub-workflows.
Loaded cache data into HBase using Sqoop.
Built custom Talend jobs to ingest, enrich, and distribute data in the MapR and Cloudera Hadoop ecosystems.
Created numerous external Hive tables pointing to HBase tables.
Analyzed HBase data in Hive by creating external partitioned and bucketed tables.
Worked with cache data stored in Cassandra.
Ingested data from external and internal flow organizations.
Used the external tables in Impala for data analysis.
Supported MapReduce programs running on the cluster.
Participated in Apache Spark POCs for analyzing sales data based on several business factors.
Participated in daily scrum meetings and iterative development.
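The following is a minimal, illustrative sketch of the kind of Hive UDF referenced above. It is not the project's actual code: the class name, the normalization rule, and the registration statement shown in the comment are assumptions.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF that normalizes free-text fields before analysis:
    // trims whitespace and lower-cases the value. Registered in Hive with
    // CREATE TEMPORARY FUNCTION normalize_text AS '...NormalizeTextUDF';
    public class NormalizeTextUDF extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }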
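Likewise, a minimal sketch of Spark-style aggregation in Java, in the spirit of the aggregation and grouping work described above; the HDFS paths, the CSV field layout, and the metric (loan amount by state) are hypothetical.

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class LoanAmountByState {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("LoanAmountByState");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // Hypothetical CSV layout: state,loanAmount,...
            JavaRDD<String> lines = sc.textFile("hdfs:///data/loans/*.csv");
            JavaPairRDD<String, Double> totals = lines
                    .mapToPair(line -> {
                        String[] f = line.split(",");
                        return new Tuple2<>(f[0], Double.parseDouble(f[1]));
                    })
                    .reduceByKey(Double::sum);

            totals.saveAsTextFile("hdfs:///output/loan_totals_by_state");
            sc.stop();
        }
    }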

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Impala, Sqoop, Flume,
Oozie, Apache Spark, Java, Linux, SQL Server, Zookeeper, Autosys, Tableau,
Cassandra.

Wells Fargo, San Francisco, CA
Oct 2014 to Oct 2015
Hadoop Developer
Wells Fargo & Company is an American multinational diversified financial services company. The CORE project deals with improving the end-to-end approach to real estate-secured lending and the overall customer experience, and with achieving the vision of satisfying all of the customers' financial needs.
The purpose of the project is to build an enterprise big data platform used to load, manage, and process terabytes of transactional data, machine log data, performance metrics, and other ad hoc data sets, and to extract meaningful information from them. The solution is based on Cloudera Hadoop.
Responsibilities:
Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing (a cleaning-mapper sketch follows this list).
Experience in installing, configuring and using Hadoop Ecosystem components.
Experience in Importing and exporting data into HDFS and Hive using Sqoop.
Experienced in defining job flows.
Knowledge in performance troubleshooting and tuning Hadoop clusters.
Experienced in managing and reviewing Hadoop log files.
Participated in development/implementation of Cloudera Hadoop environment.
Load and transform large sets of structured, semi structured and unstructured data.
Experience in working with various kinds of data sources such as HBase and Oracle.
Successfully loaded files to Hive and HDFS from HBase.
Installed the Oozie workflow engine to run multiple MapReduce programs that run independently based on time and data.
Performed Data scrubbing and processing.

Responsible for managing data coming from different sources.


Gained good experience with NoSQL databases.
Experience in working with Flume to load the log data from multiple sources directly into
HDFS.
Supported MapReduce programs running on the cluster.
Involved in loading data from UNIX file system to HDFS.
Installed and configured Hive and also wrote Hive UDFs.
Involved in creating Hive tables, loading them with data, and writing Hive queries, which run internally as MapReduce jobs.
Worked on installing the cluster, commissioning and decommissioning DataNodes, NameNode recovery, capacity planning, and slot configuration.
Implemented best income logic using Pig scripts.
Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
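A minimal sketch of the kind of data-cleaning MapReduce mapper referenced above; the delimiter, the expected field count, and the filtering rule are assumptions for illustration, not the actual job.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Hypothetical cleaning step: drop malformed rows (wrong field count)
    // and trim whitespace before downstream processing.
    public class CleanRecordsMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        private static final int EXPECTED_FIELDS = 12;

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1);
            if (fields.length != EXPECTED_FIELDS) {
                return; // skip malformed record
            }
            context.write(NullWritable.get(), new Text(value.toString().trim()));
        }
    }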

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Java, SQL, Datameer, Zookeeper, Sqoop, Oozie, HBase, CentOS, Solr.

Liberty Mutual, Bedford, NH
Nov 2013 to Sept 2014
Hadoop Developer
Liberty Mutual Group, more commonly known by the name of its primary line of business,
Liberty Mutual Insurance, is an American diversified global insurer, and the second-largest
property and casualty insurer in the United States. It ranks 76th on the Fortune 100 list of
largest corporations in the United States based on 2013 revenue. Based in Boston,
Massachusetts, it employs over 50,000 people in more than 900 locations throughout the
world.
Responsibilities

Worked on a live Hadoop production CDH3 cluster with 35 nodes.


Worked with highly unstructured and semi structured data of 25 TB in size.
Good experience in benchmarking the Hadoop cluster.
Designed data pipelines and ETL processes.
Implemented Flume (multiplexing) to stream data from upstream pipes into HDFS.
Worked on custom MapReduce programs using Java.
Designed and developed Apache Storm topologies for inbound and outbound data for real-time ETL to find the latest trends and keywords (a minimal topology sketch follows this list).
Designed and developed Pig data transformation scripts to work against unstructured data from multiple points and created a baseline.
Worked on creating and optimizing Hive scripts for data analysis based on the
requirements.
Very good experience in working with Sequence files and compressed file formats.

Good experience in troubleshooting performance issues and tuning Hadoop cluster.


Worked with performance issues and tuning the Pig and Hive scripts.
Exported the analyzed data to the relational databases using Sqoop for visualization and
to generate reports for the BI team.
Created databases using HBase and Python MapReduce to replace Oracle databases.
Documented a tool to perform chunked uploads of big data into Google BigQuery.
Worked with the infrastructure and the admin teams to set up monitoring probes to track
the health of the nodes.
Created and maintained technical documentation for launching Hadoop clusters and for
executing Hive queries and Pig scripts.
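A minimal sketch of how a Storm topology of the kind referenced above could be wired. The spout is a stub standing in for the real upstream feed, the keyword-extraction rule is hypothetical, and the package names follow the Apache Storm distribution.

    import java.util.Map;
    import org.apache.storm.Config;
    import org.apache.storm.LocalCluster;
    import org.apache.storm.spout.SpoutOutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.BasicOutputCollector;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.TopologyBuilder;
    import org.apache.storm.topology.base.BaseBasicBolt;
    import org.apache.storm.topology.base.BaseRichSpout;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Tuple;
    import org.apache.storm.tuple.Values;

    // Stub spout emitting raw log lines in place of the real upstream feed.
    class LogSpout extends BaseRichSpout {
        private SpoutOutputCollector collector;
        @Override
        public void open(Map conf, TopologyContext ctx, SpoutOutputCollector collector) {
            this.collector = collector;
        }
        @Override
        public void nextTuple() {
            collector.emit(new Values("GET /home 200 keyword=insurance"));
        }
        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("line"));
        }
    }

    // Bolt that extracts a hypothetical keyword field from each line.
    class KeywordBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple tuple, BasicOutputCollector collector) {
            String line = tuple.getStringByField("line");
            int idx = line.indexOf("keyword=");
            if (idx >= 0) {
                collector.emit(new Values(line.substring(idx + 8)));
            }
        }
        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("keyword"));
        }
    }

    public class TrendTopology {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("logs", new LogSpout(), 1);
            builder.setBolt("keywords", new KeywordBolt(), 2).shuffleGrouping("logs");

            // Run locally for a short while; a real deployment would use StormSubmitter.
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("trend-topology", new Config(), builder.createTopology());
            Thread.sleep(10000);
            cluster.shutdown();
        }
    }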
Environment: Java, Python, Oracle 10g, Cassandra, Hadoop, Flume, Storm, Kafka, Hive, HBase, Linux, MapReduce, Eclipse, HDFS, CDH3, SQL.

Becton Dickinson, New Jersey
Aug 2012 to Oct 2013
Hadoop Developer
Becton, Dickinson and Company (BD) is an American medical technology company that
manufactures and sells medical devices, instrument systems and reagents. Founded in 1897
and headquartered in Franklin Lakes, New Jersey, BD employs nearly 30,000 people in more
than 50 countries throughout the world.
Responsibilities

Worked with systems engineering team to plan and deploy new Hadoop environments
and expand existing Hadoop clusters with agile methodology.
Monitored multiple Hadoop cluster environments using Ganglia; monitored workload, job performance, and capacity planning using Cloudera Manager.
Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
Thorough hands-on experience with Hadoop, Java, SQL, and Python.
Used Flume to collect, aggregate, and store the web log data from different sources like
web servers, mobile and network devices and pushed to HDFS.
Participated in functional reviews, test specifications, and documentation review.
Developed MapReduce programs on log data to transform it into a structured form to find user location, age group, and time spent.
Analyzed the web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration, and the most purchased product on the website (see the query sketch after this list).
Exported the analyzed data to the relational databases using Sqoop for visualization and
to generate reports by Business Intelligence tools.
Used Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as MapReduce, Pig, Hive, and Sqoop) as well as system-specific jobs (such as Java programs and shell scripts).
Involved in installing and configuring Kerberos for the authentication of users and Hadoop daemons.
Implemented encryption mechanisms using Python.
Proactively monitored systems and services, architecture design and implementation of
Hadoop deployment, configuration management, backup, and disaster recovery systems
and procedures.
Experience using Talend as an ETL tool and extensive knowledge of Netezza.
Installed and configured Flume, Hive, Pig, Sqoop, and Oozie on the Hadoop cluster; analyzed system failures, identified root causes, and recommended courses of action.
Documented system processes and procedures for future reference; responsible for managing data coming from different sources.
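A sketch of how the unique-visitors analysis described above could be run as HiveQL over JDBC; the HiveServer2 URL, the table name (web_logs), and the column names are assumptions for illustration only.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Hypothetical HiveQL report executed against HiveServer2 via JDBC.
    public class UniqueVisitorsReport {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:hive2://hiveserver:10000/default", "hive", "");
                 Statement stmt = conn.createStatement()) {

                String sql =
                    "SELECT to_date(request_ts) AS visit_day, " +
                    "       COUNT(DISTINCT visitor_id) AS unique_visitors " +
                    "FROM web_logs " +
                    "GROUP BY to_date(request_ts) " +
                    "ORDER BY visit_day";

                try (ResultSet rs = stmt.executeQuery(sql)) {
                    while (rs.next()) {
                        System.out.println(rs.getString("visit_day")
                                + "\t" + rs.getLong("unique_visitors"));
                    }
                }
            }
        }
    }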
Environment: Hadoop, HDFS, MapReduce, Flume, Pig, Sqoop, Hive, Oozie, Ganglia, HBase, Shell Scripting
TMG Health, Jessup, PA
Aug 2010 to July 2012
Java Developer
TMG Health is a leading national provider of expert solutions for Medicare Advantage, Medicare Part D, and managed Medicaid plans. With more than 17 years of experience providing technology-enabled services exclusively to the government market, our knowledge of health plan processes, CMS requirements, and the daily challenges plans face within the government market is second to none.
Responsibilities

Created the database, user, environment, activity, and class diagrams for the project (UML).
Implemented the database using the Oracle database engine.
Designed and developed a fully functional, generic n-tiered J2EE application platform; the environment was Oracle technology driven. The entire infrastructure application was developed using Oracle Developer in conjunction with Oracle ADF-BC and the Oracle ADF rich client framework.
Created entity objects (business rules and policies, validation logic, default value logic, security).
Created view objects, view links, association objects, application modules with data
validation rules (Exposing linked views in an application module), LOV, dropdown, value
defaulting, transaction management features.
Web application development using J2EE, JSP, Servlets, JDBC, JavaBeans, Struts, Ajax, custom tags, EJB, Hibernate, Ant, JUnit, Apache Log4j, Web Services, and Message Queue (MQ).
Designed GUI prototypes using ADF 11g GUI components before finalizing them for development.
Experience in using version control systems such as CVS and PVCS.
Involved in consuming and producing RESTful web services using JAX-RS (a minimal resource sketch follows this list).

Collaborated with ETL/Informatica team to determine the necessary data modules and UI
designs to support Cognos reports.
JUnit was used for unit testing and as the integration testing tool.
Created modules using bounded and unbounded task flows.
Generated WSDLs (web services) and created workflows using BPEL.
Created the skin for the layout.
Performed integration testing for the application.
Created dynamic reports using JFreeChart.
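A minimal sketch of a JAX-RS resource of the kind referenced above; the resource path and the Plan payload are hypothetical, not the actual application code.

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;

    // Hypothetical JAX-RS resource exposing plan data as JSON.
    @Path("/plans")
    public class PlanResource {

        @GET
        @Path("/{planId}")
        @Produces(MediaType.APPLICATION_JSON)
        public Plan getPlan(@PathParam("planId") String planId) {
            // In the real application this would be looked up from the database.
            return new Plan(planId, "Medicare Advantage");
        }

        public static class Plan {
            public String id;
            public String type;

            public Plan(String id, String type) {
                this.id = id;
                this.type = type;
            }
        }
    }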
Environment: Java, Servlets, JSF, ADF rich client UI framework, ADF-BC (BC4J) 11g,
Web Services using Oracle SOA, Oracle WebLogic.

Education:
Bachelor of Engineering in Computer Science, Anna University, India
Master's in Technology, University of Pennsylvania, Philadelphia, PA
