
Hitachi Vantara

Smart Data Center


Engineering Troubleshooting Guide
Document Revision Level

Revision      Date         Who             Description
Version 1.0   17 Dec 2019  ChennaKesav     Document created
Version 1.0   26 Dec 2019  Venkataramana   Document updated

Table of Contents

1. Introduction
   1.1 Prerequisites
       1.1.1 Prerequisites for Developers
       1.1.2 Prerequisites for Deployment Team Member
       1.1.3 Program Manager
   1.2 General Prevention
2. Resolving Known Issues
   2.1 Slices not displaying in UI
   2.2 No Data: Join Conditions Failed / Pools / Devices Missed
   2.3 Data available in Kafka but not in the Druid datasource
   2.4 New tenant job scheduling and verifying the device list
   2.5 Capacity job scheduling based on HDCA interval
   2.6 Scheduling a new feature in an old tenant
3. Acronyms
4. References
5. Appendix

1. Introduction

This Troubleshooting Guide is intended to provide guidance to SDC developers in the detection and correction of SDC development issues within an application. It may also be useful to IT/deployment professionals who support this application.

1.1 Prerequisites

The prerequisite skills and knowledge required for the various roles are detailed below.
1.1.1 Prerequisites for Developers

• Dual A/c
• MySQL Workbench
• Spark
• Hadoop
• Basic SQL knowledge

1.1.2 Prerequisites for Deployment Team Member

• Same as 1.1.1
• MCS a/c

1.1.3 Program Manager

• Same as 1.1.2
• MCS a/c
• SNOW

1.2 General Prevention

2. Resolving Known Issues


2.1 Slices not displaying in UI (getting the "your filters are excluding this info" error)

Problem: Slices are not displaying in the UI because of '0' values in the HDCA server.

Debug instructions:
1) Click on QueryBuilder, check whether data is coming for the current day, and take the datasource details for that slice.
2) If data is coming, check the values for all devices.
3) If all devices have '0' values, verify the data in the HDCA server using Postman. We need the details below to get data from HDCA:
   a) HDCA server URL details (in tbletldata we have the HOST, PORT, and Auth details).
   b) MQL query (from the datasource we can identify the conf file, and from there we take the MQL query).
   c) Start/end time (check with the current day), calculated as in the example below.
      Example: if data latency = 30 minutes and current time = 14:00 (GMT), check with
      start time = 13:20 (current time - latency - 10 minutes)
      end time   = 13:25 (current time - latency - 5 minutes)
   Possible scenarios:
   i) We may not get any datapoints for the time-series attribute in the 5-minute window specified above. In that case check 13:00 - 13:05; if we still do not get any datapoints for that time-series attribute, check with Kyle.
   ii) We are getting data, but all values are '0' or 'null'.
A sketch of this HDCA check is shown below.
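
The sketch below is a minimal, illustrative way to reproduce the Postman check from step 3 in Python: it computes the start/end window from the data latency and queries the HDCA server with the MQL query. The endpoint path, parameter names, credentials, and latency value are assumptions for illustration only; the real HOST/PORT/Auth come from tbletldata and the MQL query from the conf file of the affected datasource.

    from datetime import datetime, timedelta, timezone

    import requests

    LATENCY_MIN = 30                                    # data latency (assumed value; see common.conf)
    now = datetime.now(timezone.utc)
    start = now - timedelta(minutes=LATENCY_MIN + 10)   # current time - latency - 10 minutes
    end = now - timedelta(minutes=LATENCY_MIN + 5)      # current time - latency - 5 minutes

    fmt = "%Y%m%d_%H%M%S"
    resp = requests.get(
        "https://<hdca-host>:<port>/mql",               # hypothetical endpoint; host/port/auth come from tbletldata
        params={
            "query": "<MQL query taken from the conf file>",
            "start": start.strftime(fmt),
            "end": end.strftime(fmt),
        },
        auth=("<user>", "<password>"),
    )
    resp.raise_for_status()
    print(resp.json())   # inspect: no datapoints at all vs. datapoints that are all '0'/null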

Cause #1: We are not getting data from the HDCA server itself (probes issue).
Solution #1: Check with Kyle or the Cumulus team. (Resolution by team: TBF)

Cause #2: The data latency may be more than what is specified in common.conf.
Solution #2: Increase the latency in common.conf. (Resolution by team: TBF)

2.2 No Data: Join Conditions Failed / Pools / Devices Missed

Problem: There is no data in CAR, in HDCA, or in the mapping between CAR and HDCA.

Debug instructions:
1) Identify the conf file and datasource details.
2) Take the final CAR table used in output_query for the join between HDCA and CAR.
3) Using the Spark context, check whether data is available in that table. If we are checking for a specific device/pool, add those conditions. Here we have two possible scenarios:
   i) We do not have any data in that table, which means CAR data is not available.
   ii) We have some data, which means the CAR side is fine; just validate the CAR columns which are used in output_query.
4) If we have CAR data, check whether we are getting HDCA data or not. We can validate this in the ways below:
   i) Using Postman, take the MQL queries from the conf file that is hitting the join-condition error and specify the job-failure date details as the start/end dates.
   ii) Check the path /user/tenant/sdc-spark-etl2/hdca-flattened/(conf file)/* for all the MQL queries in the conf file; if we have data in this path, we are getting data from HDCA.
5) If we have data from the HDCA server as well, then the joins between CAR and HDCA are failing.
6) Take all the temp views (CAR and HDCA) used in output_query. As we do not store data for the temp views (which are in the job_queries array), check whether we are getting data for those temp views, substitute them into output_query, and verify the data, so that we can identify from which temp view we are not getting data. Inform the ETL team about the same and they will look into the issue.
A Spark sketch of steps 3 and 4(ii) is shown below.
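
The sketch below is a minimal PySpark illustration of steps 3 and 4(ii): count the rows in the final CAR table and in the flattened HDCA path for the conf file. The table name, filter column, path placeholders, and the Parquet format are assumptions; substitute the names from the output_query and conf file being debugged.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("join-debug").enableHiveSupport().getOrCreate()

    # Step 3: is there CAR data? (the filter is optional; column name is an assumption)
    car_df = spark.table("<final CAR table from output_query>")
    car_df = car_df.filter("device_id = '<device>'")
    print("CAR rows:", car_df.count())

    # Step 4(ii): did anything land in the flattened HDCA path for this conf file?
    hdca_df = spark.read.parquet("/user/tenant/sdc-spark-etl2/hdca-flattened/<conf file>/*")   # format assumed
    print("HDCA rows:", hdca_df.count())

    # If both sides have rows but output_query still returns nothing, the join condition
    # itself is failing -- validate the join columns and the temp views next (step 6).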

Cause #1: HDCA data is not available for that feature.
Solution #1: Contact Kyle/Cumulus; they can resolve the issue. (Resolution by team: TBF)

Cause #2: CAR data is not available.
Solution #2: Contact MarkVanBinjen. (Resolution by team: TBF)

Cause #3: The CAR and HDCA join condition failed.
Solution #3: Contact the ETL team. (Resolution by team: TBF)

2.3 Data available in Kafka but not in the Druid datasource

Problem: Sometimes we cannot see the data in the UI even though the job executed successfully, because the data is not in the corresponding Druid datasource although it is available in the Kafka topic.

Debug instructions:
1) Check the Hive table status for that conf; it should be in the completed state.
2) Check the data in /user/tenant/sdc-spark-etl2/output_data/(conf file); it should have at least one folder with the job execution date.
3) Check the data in all Kafka topics which are written from that conf file. They should have data with the job execution date.
   Example: HitachiEnterprise.conf writes data to the topics below:
   i) BlockLdevCPTY, ii) BlockParityGroupCPTY, iii) BlockPoolCPTY, iv) BlockDeviceCpty, v) BlockCrossServiceDeviceCPTY, vi) BlockCrossServiceTierCPTY
   A sketch of this Kafka check is shown after step 4.
4) If we check the Druid datasources which are listening to the Kafka topics mentioned above (the mapping is in the indexer JSON files), those datasources may not have data with the job execution date.
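
The sketch below is a minimal, illustrative way to do the Kafka part of step 3 with kafka-python: read the last message of one topic and check its timestamp against the job execution date. The broker address and topic/partition are placeholders, and the library choice is an assumption (the same check can be done with a console consumer).

    from kafka import KafkaConsumer, TopicPartition

    consumer = KafkaConsumer(bootstrap_servers="<kafka-broker>:9092")   # placeholder broker
    tp = TopicPartition("BlockPoolCPTY", 0)                             # one topic written by the conf file
    consumer.assign([tp])

    end_offset = consumer.end_offsets([tp])[tp]
    if end_offset == 0:
        print("Topic is empty -- the conf file did not write any data.")
    else:
        consumer.seek(tp, end_offset - 1)                # jump to the last message
        record = next(iter(consumer))
        # record.timestamp is epoch milliseconds; it should fall on the job execution date
        print("last offset:", record.offset, "timestamp (ms):", record.timestamp)
    consumer.close()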

Cause #1: When we scheduled the jobs newly, we forgot to run the curl commands for the datasources which are there in the conf files.
Solution #1: Execute the curl commands for all/missed datasources in that conf file. (Resolution by team: MS)

Cause #2: Sometimes, because of Druid issues, the datasource is not able to listen to the Kafka topics.
Solution #2: RESET the datasource. If that does not work, shut down the datasource and then execute the curl command once again. (Resolution by team: MS)
A sketch of Solution #2 is shown below.
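
The sketch below is a minimal illustration of Solution #2 using the Druid Overlord supervisor API (the RESET action in the Druid console calls the same endpoint). The Overlord URL and supervisor/datasource name are placeholders; the supervisor id normally matches the datasource name defined in the indexer JSON.

    import requests

    OVERLORD = "http://<druid-overlord-host>:8090"   # placeholder Overlord URL
    SUPERVISOR_ID = "<datasource name>"              # placeholder; taken from the indexer JSON

    # RESET the supervisor so it re-reads offsets from the Kafka topic.
    r = requests.post(f"{OVERLORD}/druid/indexer/v1/supervisor/{SUPERVISOR_ID}/reset")
    r.raise_for_status()
    print(r.json())

    # If the reset does not help: terminate the supervisor and re-submit its spec
    # (this is what re-running the curl command with the indexer JSON does).
    # requests.post(f"{OVERLORD}/druid/indexer/v1/supervisor/{SUPERVISOR_ID}/terminate")
    # with open("<indexer>.json") as f:
    #     requests.post(f"{OVERLORD}/druid/indexer/v1/supervisor",
    #                   data=f.read(), headers={"Content-Type": "application/json"})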
2.4 New tenant job scheduling and verifying the device list

Problem: Which jobs need to be scheduled for a new tenant?

Procedure:
1) Ask MarkCallan/the customer for the scope document for that tenant.
2) Open the scope document and take the list of vendors specified in the doc.
   Example: Hitachi, CISCO, IBM, and VNX vendors are in the doc. Then we need to schedule all Hitachi performance and capacity jobs, all VNX performance and capacity jobs, and so on. As we do not have performance jobs for CISCO, we schedule only the capacity job (ciscosan.conf).
3) After execution of the scheduled jobs, verify the device list displayed in the UI against the scope document.
   Example: the Hitachi device list has 10 devices in the scope doc but only 5 in the UI; then we need to debug as specified in 2.2. A small comparison sketch is shown below.
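
As a trivial illustration of step 3, the sketch below compares a device list taken from the scope document with the one shown in the UI and lists anything missing; the device names are made up for the example.

    # Device names are illustrative only.
    scope_devices = {"hitachi-01", "hitachi-02", "hitachi-03", "hitachi-04", "hitachi-05",
                     "hitachi-06", "hitachi-07", "hitachi-08", "hitachi-09", "hitachi-10"}
    ui_devices = {"hitachi-01", "hitachi-02", "hitachi-03", "hitachi-04", "hitachi-05"}

    missing = sorted(scope_devices - ui_devices)
    if missing:
        # Debug the missing devices as described in section 2.2.
        print("Devices in scope doc but not in UI:", missing)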

2.5 Capacity job scheduling based on HDCA interval

Problem: At what time do we need to schedule capacity jobs such as HitachiEnterprise.conf?

Procedure:
1) Take an MQL query from the conf file which has at least one time-series attribute (starting with @). Execute the MQL query from Postman, specifying start/end dates covering one complete day (Example: 20191226_000000 to 20191226_235959).
2) Verify the response for any time-series attribute (Example: physicalCapacityUsed for HitachiEnterprise.conf).
3) Schedule the job based on the logic below:
   interval (hours) = 24 / total number of values
   Example 1: if we get 4 datapoints (values) for that attribute, then 24/4 = 6 hours, meaning we get data every 6 hours (at the 0th, 6th, 12th, and 18th hour). So we have to schedule this job at the 18th hour.
   Example 2: if we get 24 datapoints, we get data every hour, so we have to schedule this job at the 23rd hour.
A small calculation sketch is shown below.
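
The sketch below simply restates the rule in step 3 as code: derive the reporting interval from the number of datapoints seen in one full day and schedule the job at the last reporting hour of that day.

    def capacity_schedule_hour(datapoints_per_day: int) -> int:
        interval_hours = 24 // datapoints_per_day   # e.g. 24 / 4 = 6 -> data at hours 0, 6, 12, 18
        return 24 - interval_hours                  # last reporting hour of the day

    print(capacity_schedule_hour(4))    # 18 -> schedule the job at the 18th hour
    print(capacity_schedule_hour(24))   # 23 -> schedule the job at the 23rd hour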

2.6 Scheduling a new feature in an old tenant

Problem: Scheduling a new feature in an existing tenant.

Prerequisites:
1) Conf file
2) Job property file
3) Druid indexer JSONs

Process:
1) Ask MarkVB to map the new feature in tbletldataetljob.
2) Run the CARRaw/join pipeline and verify that the HDCA server details are mapped for the new feature.
3) Copy the conf file to both the adhoc-runonly and Hadoop common paths (/opt/spark-etl2/deployment/adhoc-runonly, /data/spark-etl2/oozie/common).
4) Run the adhoc job for one day.
5) Copy the Druid indexer JSONs to the source path.
6) Copy the job property file to the (bin/hdca) path and modify the start date and time at which it has to trigger.
7) Update the start time/end time in common.conf, which is in the tenant-specific path (/user/initech/sdc-spark-etl2/oozie/common.conf); whatever we specify in common.conf will be inserted into the Hive table.
8) Execute the druid_indexer_invoker.sh file, specifying the tenant and destination path details.
9) After executing the above .sh file, it replaces the tenant/Kafka topic/broker details in the source JSON files and places the result in the destination path.
10) Move to the destination path and execute the curl commands for every datasource in that conf. A sketch of this step is shown below.
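
The sketch below is a minimal Python equivalent of step 10: submit every rendered indexer JSON in the destination path to the Druid Overlord, which is what the per-datasource curl commands do. The Overlord URL and destination path are placeholders.

    import glob

    import requests

    OVERLORD = "http://<druid-overlord-host>:8090"                       # placeholder
    DEST_PATH = "<destination path given to druid_indexer_invoker.sh>"   # placeholder

    for spec_file in sorted(glob.glob(f"{DEST_PATH}/*.json")):
        with open(spec_file) as f:
            spec = f.read()
        r = requests.post(f"{OVERLORD}/druid/indexer/v1/supervisor",
                          data=spec, headers={"Content-Type": "application/json"})
        print(spec_file, r.status_code)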

3. Acronyms

Acronym   Definition
CAR       Customer Asset Registry
HDCA

4. References

S.No   Document Name   Document Location

5. Appendix
[If any other details to be captured]
