You can do it comfortably (even in less than a month for the basics) by implementing the
following steps:-
1. Install the software on a system accessible to you.
2. Find someone who genuinely knows the tool, or has worked on it, for training.
3. Learn how to extract data from flat files & a few DB sources.
4. Learn how to load data to flat files & a few DB targets.
5. Learn how to use basic transformations such as Expression, Filter, Router, Update
Strategy, Aggregator, Sequence Generator and Sorter.
6. Practice some scenarios using these &, once comfortable, move on to learning slightly
more complex transformations such as Lookup, Rank, Normalizer, Stored Procedure,
XML, etc. (a hand-coded sketch of the flow in steps 3-6 follows this list).
7. If you still have time, learn workflow tasks & SCD / CDC / other concepts.
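To make steps 3-6 concrete, here is a minimal hand-coded sketch of the same flow in
Python. The file name, column names and target table are hypothetical, and Informatica
expresses all of this visually through mappings rather than code:

import csv
import sqlite3
from collections import defaultdict

def run_pipeline(src_csv: str, target_db: str) -> None:
    totals = defaultdict(float)

    # Extract: read rows from a flat-file source (like a Source Qualifier).
    with open(src_csv, newline="") as f:
        for row in csv.DictReader(f):
            amount = float(row["amount"])
            # Filter transformation: drop rows that fail a condition.
            if amount <= 0:
                continue
            # Expression transformation: derive a new value per row.
            region = row["region"].strip().upper()
            # Aggregator transformation: group by region, sum amounts.
            totals[region] += amount

    # Load: write the aggregated rows to a DB target (SQLite stands in here).
    con = sqlite3.connect(target_db)
    con.execute("CREATE TABLE IF NOT EXISTS sales_by_region"
                " (region TEXT PRIMARY KEY, total REAL)")
    con.executemany("INSERT OR REPLACE INTO sales_by_region VALUES (?, ?)",
                    totals.items())
    con.commit()
    con.close()

# Example usage with a hypothetical source file:
# run_pipeline("sales.csv", "warehouse.db")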
Alternatively, you can go for Informatica's official training (link: What are some
good online sources to learn Informatica PowerCenter?).
https://youtu.be/6P0OfwU2QbM
Informatica has connectors for Facebook, LinkedIn & Twitter, & Informatica Vibe /
Developer has connectors for social media as well as tools like Kapow Katalyst, which
can help collect / automate the collection of data from almost any website directly.
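As a rough illustration of what such a connector automates, here is a hedged Python
sketch that calls a REST endpoint and pages through results; the URL, parameters and
response shape are hypothetical stand-ins, not any real Facebook / LinkedIn / Twitter API:

import requests

def fetch_posts(base_url: str, token: str, pages: int = 3):
    records = []
    url = f"{base_url}/posts"
    params = {"access_token": token, "limit": 100}
    for _ in range(pages):
        resp = requests.get(url, params=params, timeout=30)
        resp.raise_for_status()
        payload = resp.json()
        records.extend(payload.get("data", []))
        # Follow the cursor to the next page, if the API provides one.
        next_url = payload.get("paging", {}).get("next")
        if not next_url:
            break
        url, params = next_url, {}
    return records

# Example usage against a hypothetical endpoint:
# posts = fetch_posts("https://api.example.com/v1", "MY_TOKEN")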
For completely unstructured data (e.g. PDF, Word, industry-specific formats like
HIPAA, etc.), you could use Informatica B2B; for semi-structured data, you can use
PowerCenter, Vibe, Cloud or B2B.
It also has an Informatica Connector Toolkit, using which you can create your own
connectors on an Eclipse-based coding platform. Once you create them, you can
integrate them with Informatica Developer / Vibe.
There are numerous parameters to analyse when deciding between an ETL tool and
hand coding:-
Visual flow
The single greatest advantage of an ETL tool is that it provides a visual flow of the
system's logic (if the tool is flow-based). Each ETL tool presents these flows differently,
but even the least appealing of these ETL tools compares favorably to custom systems
consisting of plain SQL, stored procedures and system scripts, and perhaps a handful of
other technologies.
Operational resilience
Many of the home-grown data warehouses we have evaluated are rather fragile: they
have many emergent operational problems. ETL tools provide functionality and
standards for operating and monitoring the system in production. It is certainly possible
to design and build a well-instrumented hand-coded ETL application. Nonetheless, it's
easier for a data warehouse / business intelligence team to build a resilient ETL system
on the features of an ETL tool.
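To see what "well instrumented" means in practice, here is a hedged Python sketch of
the logging and retry plumbing a hand-coded ETL step needs, which an ETL tool would
give you out of the box; run_step, load_batch and the retry policy are hypothetical names:

import logging
import time

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("etl")

def run_step(name, rows, load_batch, retries=3, backoff_s=5):
    for attempt in range(1, retries + 1):
        try:
            log.info("step=%s attempt=%d rows_in=%d", name, attempt, len(rows))
            loaded = load_batch(rows)           # do the actual work
            log.info("step=%s rows_loaded=%d", name, loaded)
            return loaded
        except Exception:
            log.exception("step=%s attempt=%d failed", name, attempt)
            if attempt == retries:
                raise                           # surface to the scheduler
            time.sleep(backoff_s * attempt)     # back off, then retry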
Performance
You might be surprised that performance is listed as one of the last advantages of ETL
tools. It's possible to build a high-performance data warehouse whether you use an ETL
tool or not. It's also possible to build an absolute dog of a data warehouse whether you
use an ETL tool or not. We've never been able to test whether an excellent hand-coded
data warehouse outperforms an excellent tool-based data warehouse; we believe the
answer is situational. But the structure imposed by an ETL platform makes it easier for a
novice ETL developer to build a high-quality system. Furthermore, many ETL tools
provide performance-enhancing technologies, such as Massively Parallel Processing,
Symmetric Multi-Processing and cluster awareness.
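As a rough illustration of the partition-level parallelism such technologies automate,
here is a minimal Python sketch that splits a workload into partitions and transforms
them on several cores; the partitions and the transformation are made-up placeholders:

from multiprocessing import Pool

def transform_partition(rows):
    # CPU-bound per-row work stands in for a real transformation.
    return [r * 2 for r in rows]

if __name__ == "__main__":
    # Split the workload into 8 partitions of 1000 rows each.
    partitions = [list(range(i, i + 1000)) for i in range(0, 8000, 1000)]
    with Pool(processes=4) as pool:
        results = pool.map(transform_partition, partitions)
    total = sum(len(p) for p in results)
    print(f"transformed {total} rows across {len(partitions)} partitions")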
Big Data
A lot of ETL tools are capable of combining structured data with unstructured data in
one mapping (sketched below). In addition, they can handle very large amounts of data
that do not necessarily have to be stored in data warehouses. Most ETL tools now
provide Hadoop connectors or similar interfaces to big data sources, and support for
Big Data is growing continually.
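A hedged Python sketch of that idea: pull fields out of unstructured log lines with a
regular expression and enrich them from a structured lookup in one pass. The log
format and the lookup data are invented for illustration:

import re

customer_region = {"C001": "EMEA", "C002": "APAC"}   # structured lookup

LOG_LINE = re.compile(r"customer=(\w+)\s+spent=(\d+\.\d{2})")

def parse_logs(lines):
    for line in lines:
        m = LOG_LINE.search(line)                # unstructured -> fields
        if not m:
            continue                             # skip unparseable lines
        cust, spent = m.group(1), float(m.group(2))
        region = customer_region.get(cust, "UNKNOWN")
        yield {"customer": cust, "region": region, "spent": spent}

logs = ["2015-06-01 customer=C001 spent=12.50 ok",
        "garbage line",
        "2015-06-02 customer=C002 spent=3.10 ok"]
print(list(parse_logs(logs)))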
Of course there are scenarios wherein hand-written code would be better (though not
faster to develop than existing ETL tools), but the challenge is to select the right ETL
tool for your scenario instead of thinking of writing it on your own.
2. For exploring the tool to the utmost level, or to solve the most difficult challenges
using Informatica: opt for a company with a variety of data sources or applications to
be integrated. If you get a chance, aim for the R&D or Centre of Excellence department
of a company that is a partner of Informatica Corp. You can find a category-wise list of
Informatica partners at the link: Partners - Technology Partners. (Most of them have at
least 6 months of access to the latest Informatica products & training / material.)
3. To keep doing stable / balanced Informatica work for a long time: join any large-cap
company / MNC with a number of decent Informatica projects. Examples include
Cognizant, Accenture, TCS, Infosys, Wipro, HCL, etc.
If you ask me, I'd always prefer the R&D department of a firm that has the top-most
level of partnership with Informatica, as it has its own benefits: access to training / the
latest products, the chance to win clients for your organisation, or even to build some
intellectual utilities for your organisation, plus an opportunity to sell them on the
Informatica Marketplace.
Didn't get it?
IoT:-
The Internet of Things (IoT) is the network of physical objects or "things" embedded
with electronics, software, sensors, and network connectivity, which enables these
objects to collect and exchange data.
Big Data as a Service (BDaaS) covers the outsourcing of a wide variety of Big Data
functions to the cloud.
Others might include hybrid approaches mixing Machine Learning & Artificial
Intelligence with Big Data / Cloud / real-time activities.
I'd take Informatica 9 out of 10 times, irrespective of whether you have an interest in
development, performance tuning or administration activities, as Informatica has its
own administration tool & features.
As data grew, people who used to store data in registers / sheets started off with data
storage in files; then came databases, and people started storing data in DBs such as
Oracle, which followed Codd's rules to the maximum extent.
Then data grew further, & databases had their own limitations & capabilities, which
brought about the introduction of data warehouses & NoSQL / Big Data-specific
databases.
Informatica processes data from all of the above (& Cloud & websites & the Internet of
Things, for OLTP as well as OLAP systems), while Oracle belongs to the era when SQL
DBs came into existence & was, to a large extent, considered outdated (also termed
legacy by a few data lovers) once NoSQL or Big Data-specific systems came into the
picture. If you are looking at an Oracle or SQL DBA role at this point of time, it means
you are going back (to a large extent) instead of going ahead with the world of IoT /
Big Data / Cloud / NoSQL DBs, etc.
In terms of career too, there'd always be a limitation in a DBA role at some point in a
person's career, while Informatica would always open up new windows for you.