
DEMO SCRIPT

E2E HANA HADOOP TECHNICAL DEMO

DOCUMENT CLASSIFICATION: INTERNAL

General Information:

HANA HADOOP Data Services 4.1
Authors: Lakshmi Narasimhan (I040723)
Date Last Updated: 8/24/2012

SAP AG 2011 / INTERNAL / SCENARIO ID:

COPYRIGHT 2012 SAP AG. ALL RIGHTS RESERVED.


No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice. Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors. Microsoft, Windows, Excel, Outlook, and PowerPoint are registered trademarks of Microsoft Corporation. IBM, DB2, DB2 Universal Database, System i, System i5, System p, System p5, System x, System z, System z10, System z9, z10, z9, iSeries, pSeries, xSeries, zSeries, eServer, z/VM, z/OS, i5/OS, S/390, OS/390, OS/400, AS/400, S/390 Parallel Enterprise Server, PowerVM, Power Architecture, POWER6+, POWER6, POWER5+, POWER5, POWER, OpenPower, PowerPC, BatchPipes, BladeCenter, System Storage, GPFS, HACMP, RETAIN, DB2 Connect, RACF, Redbooks, OS/2, Parallel Sysplex, MVS/ESA, AIX, Intelligent Miner, WebSphere, Netfinity, Tivoli and Informix are trademarks or registered trademarks of IBM Corporation. Linux is the registered trademark of Linus Torvalds in the U.S. and other countries. Adobe, the Adobe logo, Acrobat, PostScript, and Reader are either trademarks or registered trademarks of Adobe Systems Incorporated in the United States and/or other countries. Oracle is a registered trademark of Oracle Corporation. UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group. Citrix, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, and MultiWin are trademarks or registered trademarks of Citrix Systems, Inc. HTML, XML, XHTML and W3C are trademarks or registered trademarks of W3C, World Wide Web Consortium, Massachusetts Institute of Technology. Java is a registered trademark of Sun Microsystems, Inc. JavaScript is a registered trademark of Sun Microsystems, Inc., used under license for technology invented and implemented by Netscape. 
SAP, R/3, xApps, xApp, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP Business ByDesign, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary. These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.


1. Demo Story:
This is a technical demo showing the end-to-end Hadoop-HANA integration process using Data Services. It is a shorter version of the Real-Time Big Data Retail POS HANA HADOOP Integration Scenario, focused on the end-to-end technical process of a Hadoop Map/Reduce job and its integration with HANA using Data Services 4.1. Instead of 90 TB of weblogs, this technical demo works with 130 MB of weblogs.

The demo has two parts: i. A Hadoop Map/Reduce job that converts the weblogs to structured data in HDFS ii. A Data Services 4.1 job that loads the data from HDFS into HANA
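The map phase of part i can be sketched in Python. The log layout below (a standard Apache combined-log line) and the field names are assumptions for illustration, not the demo's actual mapper code:

```python
import re

# Assumed Apache common/combined log layout; the demo's actual
# weblog format may differ.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)

def map_weblog_line(line):
    """Emit a structured record (or None) from one raw weblog line."""
    m = LOG_PATTERN.match(line)
    if not m:
        return None  # skip malformed lines, as a real mapper would
    return (m.group("ip"), m.group("timestamp"),
            m.group("path"), int(m.group("status")))

sample = '10.0.0.1 - - [24/Aug/2012:10:15:32 +0000] "GET /item/42 HTTP/1.1" 200 512'
print(map_weblog_line(sample))
# → ('10.0.0.1', '24/Aug/2012:10:15:32 +0000', '/item/42', 200)
```

The real job applies this kind of line-to-record transformation across the whole access.log in parallel.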


2. Client Tools required for demo:


1. FileZilla - http://filezilla-project.org/download.php?type=client Download the zip file and extract it to your local desktop. Required to show the weblogs and the output of the Hadoop Map/Reduce job.

2. PuTTY - http://www.putty.org/. Required to run the Hadoop Map/Reduce job.

3. HANA Studio - Revision 26 is recommended for this demo. Required to show the data in HANA.


3. System details and User Access:


i. HANA (requires HANA Studio):
Host: usphlhana06b.phl.sap.corp
Instance Number: 13
UID/PASSWORD: HADOOP/Welcome123
Or just import hadoop_hana_landscape.xml and update the password mentioned above.

ii. HADOOP (requires PuTTY):
Host: usphlvm1939.phl.sap.corp
UID: <mailed on request>
PWD: <mailed on request>

iii. HDFS (requires FileZilla):
Host Name: usphlvm1939.phl.sap.corp
Username: <mailed on request>
Password: <mailed on request>
Port: 22222


4. Demo Preparation:
You may need to delete only the data in the HANA table using Studio before you execute the DS job.

Tip: Before the demo, set SYSTEM as the filter for Catalog and ITEM_SESSIONS2 as the filter for tables. Right-click the ITEM_SESSIONS2 table in the SYSTEM schema > Delete. In the pop-up, select Delete All Rows.


5. Run the demo: PART 1: Hadoop Map/Reduce Process


1. Simulated weblogs (HDFS file system), 1 week worth of data (130 MB): Log in with an FTP client such as FileZilla to show the weblogs location: /user/i040723/sessionLogDemo/access.log

If required, copy the file to your local desktop and show the unstructured data.

2. MapReduce job: the script to run it is /usr/local/mrJob/run.sh (this script is on the Linux file system). Log in to the Hadoop server using PuTTY to execute this job. The Hadoop job converts the unstructured data (access.log) and creates output in HDFS.

Wait until the job has finished.
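Conceptually, the reduce step of this job groups the mapped records by key and aggregates them. A minimal in-process sketch of that shuffle-and-reduce logic (the grouping key and the count aggregation are assumptions, since the demo does not document the job's internals):

```python
from collections import defaultdict

def reduce_item_sessions(records):
    """Group (session_key, item) pairs and count hits per pair.

    `records` is an iterable of (session_key, item) tuples, standing in
    for the mapper output that Hadoop would shuffle to this reducer.
    """
    counts = defaultdict(int)
    for session_key, item in records:
        counts[(session_key, item)] += 1
    # Emit one structured row per key, like the part-r-00000 file holds.
    return sorted((s, i, n) for (s, i), n in counts.items())

mapped = [("sess1", "/item/42"), ("sess1", "/item/42"), ("sess2", "/item/7")]
print(reduce_item_sessions(mapped))
# → [('sess1', '/item/42', 2), ('sess2', '/item/7', 1)]
```

Hadoop performs the same grouping at scale, writing the reducer output as the structured HDFS file shown in the next step.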


3. Output in HDFS (this file is on HDFS, not on the Linux file system): 475340 structured records in /user/i040723/sessionLogDemo/sessionItems/affinity/part-r-00000
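If you copy the output file to your local desktop (via FileZilla), you can sanity-check the record count before moving on to Part 2. A small helper for that; the expected total of 475340 is the figure quoted above:

```python
def count_records(path):
    """Count non-empty lines in a MapReduce output file."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())

# Example: count_records("part-r-00000") should report 475340
# for the demo's output file.
```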


PART 2: Run Data Services Job


To load the structured records from HDFS into HANA, run the Data Services job:

1. Launch the DS Management Console: http://ideshana04:8080/DataServices/launch/logon.do?LOGOUT=true

2. User Name/Password: hadoop/welcome -> Log On

3. Click on Administrator

4. Click on the HANA repository on the Status tab


5. Select Batch Job Configuration and Execute

6. On the Execute Batch Job page, click Execute

7. Check the logs to see whether the job completed successfully


8. Wait about 2 minutes for the job to complete successfully

9. Show the results in HANA Studio


10. Right-click the ITEM_SESSIONS2 table and select Open Data Preview

11. This data can be easily consumed by BO client tools for analysis.

