
Oracle Enterprise Taxation Management Version 2.2

Performance and Scalability Report for Forms Upload Batch Processing

Tax & Utilities Global Business Unit

This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.

This document is the confidential and proprietary information of Oracle Corporation. It should not be disclosed in whole or in part to any third party, without the express written authorization of Oracle Corporation and should not be duplicated in whole or part, for any other purpose whatsoever, and shall be returned upon request. The information contained in all sheets of this document remains the property of Oracle Corporation. It is furnished in confidence, with the understanding that it will not be used or disclosed for any purpose other than evaluation. All rights are reserved by Oracle Corporation. No part of this document may be photocopied, stored in electronic or other form, reproduced, or translated to another language without the prior written consent of Oracle Corporation. This documentation may not be copied, photocopied, reproduced, translated, or reduced to any electronic medium or machine-readable form, in whole or in part, without the prior written consent of ORACLE.

The software described herein is subject to change. Where Oracle USA has suggested a system hardware configuration, such information is Oracle USA's suggestion only, based on its current understanding of the requirements. Where Oracle USA has described features or functionality that it anticipates will be included in future releases of its applications, the description and estimates of their availability are subject to change.

What is Forms Upload?
Objective
Introduction
System Setup and Execution
    Seed Data Volumes
    Setup
Operating System
Oracle Database
ETM Schema Changes
Cobjrun32 / Hibernate
Summary
Oracle Database
    Oracle Load Profile
    Oracle Top 5 Timed Foreground Events
    Top SQL
Operating System Metrics
    CPU
Memory
Network
Disk
Java Virtual Machine
Appendix A: Sales & Use Tax Form
Appendix B: Sales & Use Tax Form after transformation by C1-FUSDM
Appendix D: Bugs Filed

Reviewers and Approvers List


Ilya Klebaner Ilan Bensimhon Richard Finley

Revision History

Referenced Documents
1. \QA\ETM\PerformanceTest\ETM2.2\ETM 2.2 Perf Test High-level Strategy.doc
2. \QA\ETM\PerformanceTest\ETM2.2\ETM 2.2 Performance and Scalability Test Plan
3. ETM 2.2.0 User Guide
4. ETM 2.2.0 Installation Guide
5. Oracle Database 11g Reference
6. Oracle Performance Tuning Guide
7. Java HotSpot VM Options
8. Java Heap Size Parameters
9. /QA/ETM/PerformanceTest/FormUploadProcessX/New_FormsUploadProcessX.doc
10. //documentation/Design Doc/Tax Management/v2.2.0SF/40048 Forms Upload/40048 Forms Upload Blueprint.doc
11. Oracle Database Installation Guide 11g Release for Linux B32002-07

SUT: System Under Test. The SUT comprises hardware, OS and other non-application-specific software components (such as the database and application server).

NAS: Network Attached Storage. A disk array accessed over the network. The SUT uses a NAS storage array for the Oracle database.

GUI: Graphical User Interface. The application's online user interface via the browser.

OLTP: Online Transaction Processing. User transactions submitted via the browser GUI.

AWR: Automated Workload Repository. Oracle collects system utilization statistics at predefined intervals into the AWR, which can be used to generate an overview of the load, performance and scalability characteristics of the load running on the Oracle instance.

JVM: Java Virtual Machine.

CBO: Cost Based Optimizer.

QEP: Query Execution Plan. The data access method chosen by the CBO for a particular SQL statement.

ETM: Oracle Enterprise Taxation Management product.

P&I: Penalty and Interest calculation.

OUAF: Oracle Utilities Application Framework.

For most tax authorities, the ability to upload large volumes of tax forms from multiple channels into Oracle Enterprise Taxation Management (ETM) for processing is fundamental to daily operations. The ETM Forms Upload feature is the first step in ETM forms processing, so it is critical to ensure that this step performs and scales well. This report describes performance and scalability metrics of this feature, collected by the Oracle ETM Product Development team. These metrics may be used as reference points for ETM implementation planning. ETM Forms Upload performance and scalability metrics were collected on a system comprised of two servers:

Application Server: Sun X4170
Database Server: Dell PowerEdge

Both servers are based on Intel Xeon processors and list under US$10,000 each. Test results indicate that the Forms Upload feature performs well:

Two form validation and transformation batch processes, running in 10 threads each, were able to upload 100,000 forms in 20 minutes. This means that the Forms Upload batch should be able to upload up to 300,000 forms per hour on hardware configurations comparable to the one used in this test.

The ETM Forms Upload feature scales well: as throughput increased, CPU consumption exhibited near-linear scalability.

What is Forms Upload?


Forms Upload is responsible for staging batches of forms of all types and from all channels into ETM for processing. Tax forms may be received in many different formats, from many different channels. The most commonly used channels are:

- Batch Upload. Most forms are captured by internal (OCR / ICR) or third-party data capture solutions. Forms are usually received in batches; each batch contains one or more submissions.
- Mass data entry. The form data is keyed in by data entry clerks. Forms are grouped in batches; each batch has control information that is used to validate that there are no missing forms.
- Electronic file submission (e.g. E-file, WSS)

Forms that are received from these and other channels are first loaded into the ETM form upload staging tables for validation and reconciliation. This step validates that the data is readable, that the form type can be identified, and that the forms and payments in the batch reconcile to the forms batch header. Forms Upload is a process that reads forms in raw XML format from the staging tables, transforms and validates them, and inserts them into the ETM Tax Forms tables for further processing.

The principal audience for this document is engineers responsible for architecting, planning and implementing ETM 2.2 deployments. It is assumed that the reader is familiar with the ETM functionality under test and supporting technologies, understands how to use the tools to collect and interpret basic system metrics, and knows how to configure the systems as described. For further information the reader is referred to the relevant product documentation; a number of these documents are listed under Referenced Documents.

This report is not intended to be a tutorial on performance monitoring and tuning. The report provides an explanation of how specific conclusions and recommendations were made and the data used to corroborate these findings, but it is beyond the scope of this document to provide a comprehensive explanation of all the information provided by the monitoring tools. Nor is it the aim to provide comprehensive tuning recommendations for all platforms.

The report describes the configuration specific to the system under test. As the system scales, some of the configuration information will change; this is particularly true of the database. Configuration and tuning is an iterative process, and the configuration described in this document builds on the experience from previous performance tests. Where additional tuning was done specifically for this functionality, it is called out and explained.

The report calls out issues or enhancements that were observed while testing and remain unresolved at the time of publication. As it is a prerequisite to the publication of this report that performance meets or exceeds requirements, none of the issues described are considered serious enough to delay release of the software or publication of the report. As performance testing typically completes after the product release, there may be issues described in this report that were unknown when the release notes were published.

The performance tests are executed against base ETM functionality with no customizations unless specifically noted. ETM is highly customizable, and actual performance may vary significantly from the results described herein due to any number of factors. Therefore the results presented in this report should not be relied on as a substitute for thorough testing of the planned deployment implementation. Where specific factors are known or expected to affect performance, these are mentioned; however, due to time constraints it is not possible to quantify every one. The intent is to make the reader aware of factors to consider in planning their own deployment.

Objective
The objective of the performance testing was to confirm the performance and scalability of Forms Upload, specifically that the implementation is able to meet or exceed the requirement to upload 350,000 forms within two hours on the System Under Test.

As processing the Form Batch Header records is a fairly minor task, the main focus of this report is on the two jobs (C1-FUSxx) that process the Forms Upload staging records.

Introduction
Forms Upload is an ETM 2.2.0 feature designed to facilitate mass upload of forms into ETM. Forms Upload staging consists of two tables, ci_form_batch_hdr and ci_form_upld_stg. Batches of raw forms are populated into these tables via custom code, referred to as Process X. From there, the Forms Upload jobs included in the ETM product read forms from the staging tables, transform them into the required format using customer-provided XSL, and insert the data into the CI_TAX_FORM or CI_REG_FORM table. The following diagram shows an example of a Forms Upload process batch flow. Note that because processing for the objects involved is BO-driven, some of the steps in the flow are assumed to be iterative.

[Diagram: example Forms Upload batch flow. The boxes in the original figure include:
- Validate form batch headers / Complete form batch headers (C1-FBHMD, C1-FBHM), with a "Fix Suspense Issues" side step
- Validate FUS records whose FBH is in an "upload in progress" state (C1-FUSDM), with a "Fix Suspense Issues" side step
- Process FUS records whose FBH is pending completion (C1-FUSPC)
- Validate forms, post forms & create payments (1. C1-TXMTD, 2. C1-RGFBD), with a "Fix Suspense Issues" side step
- Balance FBH tender controls (C1-FBTC)]

The following notes, taken from the Forms Upload Blueprint [10], describe the processing in more detail.

Form upload staging records are not processed until the form batch headers they belong to have passed validation. Any errors during the validation of a form batch header cause it to suspend for user review. The user is responsible for resolving the issues and re-validating the form batch header. In some cases, the user may decide that the batch needs to be canceled instead - which also cancels the related form upload staging records. (Note that at this point, the form upload staging records are still in their initial states.)

When the batch header passes validation, its form upload staging records can be processed. The batch sits in that processing state until either all staging records have been processed successfully or some of the staging records are rejected (due to technical issues).

Each form upload staging record goes through a number of processing steps before the corresponding tax form or registration can be added. Any issues from these processing steps prevent the corresponding form from getting added. Any errors during form upload staging validation cause the form upload staging to get suspended for user review. The user is responsible for resolving the issues and re-validating the form upload record. In some cases, the user may decide that the form upload staging needs to be canceled instead. Note that at this point, the corresponding form is not created yet. Note also that when a form upload staging suspends, its batch header status stays the same - i.e. in the 'processing' state. The idea is that an ETM user may be able to fix the issue with the staging record or decide to cancel it. Either action on the current staging record should not prevent other staging records in the batch from being processed.

When the form upload staging passes validation, it goes through the important step of mapping the fields from the raw XML into the target Form BO. Any problems with this transformation are likely to be technical issues (e.g. malformed XML) that an ETM user would not be able to fix. Thus, the form upload staging record gets rejected. On the other hand, if the transformation is successful, the form upload staging goes into a state where the tax form or registration form can be added.

If all form upload staging records are in this 'ready to load' state (or are a mix of 'ready to load' and 'canceled'), the batch goes into a 'ready to complete' state. This interim state is one where batch cancellation is no longer possible. This ensures two things:
- Forms for that batch are added at the same time.
- Once the forms are added, the batch cannot be canceled.
Allowing batch cancellation when some forms are already added requires cleaning up the forms - i.e. pending/suspended/waiting forms get canceled and any posted tax forms get reversed.

When the batch is in a 'ready to complete' state, the forms in that batch can be added. If a form is successfully added, the form upload staging goes to the Form Added (final) state. When the form upload staging records are in a final state, the batch is completed.

If any of the form upload staging records is rejected, the batch header requires a user to review the batch and decide what to do next. A user may decide to either cancel the batch (e.g. if a majority of the records are rejected) or let the batch complete (e.g. if only a few records in the batch got rejected). If the batch is canceled, the form upload staging records are also canceled. Note that at this point, there are no forms existing yet because the batch has not gone into the 'pending complete' (i.e. non-cancelable) state.

System Setup and Execution


The row count for selected ETM tables is shown below:
Table                         Rows
CI_FORM_UPLD_STG_LOG_PARM     4,398,362
CI_FORM_UPLD_STG_LOG          1,598,320
CI_TAX_FORM_LOG_PARM          1,132,342
CI_TAX_FORM_LOG                 539,043
CI_PER_PHONE                    504,444
CI_PER_NAME                     451,215
CI_PER_K                        451,157
CI_PER_ID                       451,157
CI_PER                          451,157
CI_ACCT_PER                     451,112
CI_ACCT_K                       451,108
CI_ACCT                         451,108
CI_TAX_ROLE_K                   451,092
CI_TAX_ROLE_CAL                 451,092
CI_TAX_ROLE                     451,092
CI_FORM_UPLD_STG                361,698
CI_OD_EVT                       320,370
CI_OD_EVT_DEP                   266,972
CI_TAX_FORM_K                   266,554
CI_TAX_FORM                     266,554
CI_GEO_TYP_DFLT                 125,967
CI_FT_GL                        117,458
CI_OD_PROC_LOG                  106,800

These row counts were taken after the data was populated into the Forms Upload staging table but before Forms Upload transitioned the records into the ci_tax_form table. After the C1-FUSPC job completes, the ci_tax_form table would include another 100,000 records. As the 300,000 records were pre-populated by Process X, the number of records in the ci_form_upld_stg table is constant in these tests.
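Row counts like those above can be approximated from optimizer statistics with a query along these lines (a sketch; num_rows reflects the last statistics gathering, so exact figures require COUNT(*) per table):

```sql
-- Approximate row counts from optimizer statistics (assumes recently gathered stats)
SELECT table_name, num_rows
FROM   user_tables
WHERE  table_name LIKE 'CI\_%' ESCAPE '\'
ORDER  BY num_rows DESC;
```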

Forms used in this test:
- Stock Sales & Use Tax Form included with the base ETM install (initial size approx. 2KB)
- Payment Flag = C1NP (i.e. No Payment)
- All records valid
- 3 batches with 100,000 records per batch

The ETM Forms Upload process consists of four jobs executed as five stages, with C1-FBHMD being executed more than once; the jobs are executed in the order shown below. On successful completion the records transition state as shown. The final status codes on successful completion of the end-to-end Forms Upload processing are FORMADDED (staging) and COMPLETE (header). The names given here are as they appear in the database; the names displayed in the application may differ slightly. Note that the final status of the header and staging records is different.

Stage  Job Name   Table Name          BO_STATUS_CD Before Job  BO_STATUS_CD After Job
1      C1-FBHMD   CI_FORM_BATCH_HDR   PENDING                  FORMUPLOAD
2      C1-FUSDM   CI_FORM_UPLD_STG    PENDING                  READYTOLOAD
3      C1-FUSPC   CI_FORM_UPLD_STG    READYTOLOAD              FORMADDED
4      C1-FBHMD   CI_FORM_BATCH_HDR   FORMUPLOAD               PENDINGCOMPLETE
5      C1-FBHMD   CI_FORM_BATCH_HDR   PENDINGCOMPLETE          COMPLETE

Three batches of 100,000 records were created along with their associated taxpayer accounts. The schema was backed up using a schema export so that the tests could be repeated as often as required. The ci_tax_form table initially had approximately 250,000 records, which were a mix of business and individual taxpayers created in previous performance testing. The three batches of 100,000 records were run through the five stages shown above sequentially. After the final stage of each batch, the schema statistics were gathered as shown, on the assumption that batches would normally be run on different days.

DBMS_STATS.GATHER_SCHEMA_STATS (
    ownname          => 'CISADM',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    method_opt       => 'FOR ALL COLUMNS SIZE AUTO',
    cascade          => TRUE
);

Periodically, the database was restored from the export and the process repeated with a different number of batch threads. The batch run time published in this report is taken from the ETM Batch Run Statistics page. Throughput is calculated as: batch size / run time (seconds).
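Applied to these tests, the formula works out as in this hypothetical example (810 seconds is roughly the 13-14 minute single-process run time reported in the Summary):

```sql
-- Example: 100,000-record batch completing in ~810 seconds
SELECT ROUND(100000 / 810) AS records_per_second FROM dual;
```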

Disclaimer: The tuning parameters presented in this section describe how the SUT was configured for these specific tests. While every effort is made to identify and use tunable settings applicable in a wide range of system configurations, parameter values used in this particular test may not be optimal for other systems and/or batch jobs.

Operating System
The Operating System was configured based on recommendations in the Oracle Server Installation Guide for Linux [11].
/etc/sysctl.conf:

# Maximum number of open file handles
fs.file-max = 655360
kernel.msgmni = 2878
kernel.msgmax = 65535
kernel.sem = 250 256000 1024
kernel.shmmni = 4096
kernel.shmall = 3279547
kernel.shmmax = 8413390848
kernel.sysrq = 1
net.core.rmem_default = 262144
# For 11g the recommended value for net.core.rmem_max is 4194304
net.core.rmem_max = 4194304
# For 10g uncomment the following line, comment other entries for this parameter and re-run sysctl -p
# net.core.rmem_max = 2097152
net.core.wmem_default = 262144
net.core.wmem_max = 262144
fs.aio-max-nr = 3145728
net.ipv4.ip_local_port_range = 1024 65000

Oracle Database
The system was configured with 8 redo log groups; each log file was 1GB. Due to the volume of redo, it was necessary to significantly increase the size of the redo logs for these tests compared with previous performance tests. To determine appropriate redo log sizing, monitor the frequency of redo log switches and check that "Checkpoint not complete" messages do not appear in the Oracle alert log. Non-default Oracle Initialization parameters:
compatible                     11.1.0.7
db_block_size                  8192
db_file_multiblock_read_count  16
dbwr_io_slaves                 4
disk_asynch_io                 TRUE
log_buffer                     1703936
open_cursors                   1000
optimizer_index_cost_adj       1 (see Note 1 below)
optimizer_index_caching        100 (see Note 1 below)
parallel_max_servers           4
pga_aggregate_target           250M
processes                      300
sga_target                     3G
statistics_level               ALL
timed_statistics               TRUE
trace_enabled                  FALSE
undo_management                AUTO
undo_tablespace                UNDOTBS1
undo_retention                 900

Oracle Initialization parameters are described in [5]. The following parameters were enabled to try to improve IO throughput and reduce IO-related waits; they are the only DB initialization parameters specifically changed for these tests:
dbwr_io_slaves disk_asynch_io

Oracle database initialization parameters are expected to change as the system scales. The initialization parameters described here were appropriate for the System Under Test. Parameters that may change as the system scales include, but are not limited to: sga_target, pga_aggregate_target, open_cursors, processes.

Note 1: Oracle System Statistics. The results presented in this document were measured with optimizer_index_cost_adj and optimizer_index_caching values set as described above. As an alternative to hard-coded optimizer hints, Oracle 10g introduced System Statistics (see DBMS_STATS.GATHER_SYSTEM_STATS / DBMS_STATS.IMPORT_SYSTEM_STATS) to give the Cost Based Optimizer (CBO) better information when costing Query Execution Plans (QEPs). It is the project team's responsibility to determine the best choice for a specific ETM implementation. Please consult [6] for more detail on the System Statistics feature.
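Gathering workload system statistics, as referenced in Note 1, typically brackets a representative load window; a minimal SQL*Plus sketch:

```sql
-- Sketch: gather workload system statistics around a representative load window
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('START');
-- ... run a representative Forms Upload batch here ...
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('STOP');
```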

ETM Schema Changes


The following modifications were made to the ci_tax_form and ci_form_upld_stg table CLOBs to improve performance:

ALTER TABLE ci_tax_form MODIFY LOB (bo_data_area) (CACHE STORAGE (NEXT 100M));

ALTER TABLE ci_form_upld_stg MODIFY LOB (bo_data_area) (STORAGE (NEXT 100M));
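The effect of such changes can be confirmed from the data dictionary; a sketch (CACHE and IN_ROW are reported per LOB column):

```sql
-- Verify LOB caching and in-row storage settings after the ALTERs
SELECT table_name, column_name, cache, in_row
FROM   user_lobs
WHERE  table_name IN ('CI_TAX_FORM', 'CI_FORM_UPLD_STG');
```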

Cobjrun32 / Hibernate
No changes were made to the defaults set in the following properties files:
- spl.properties
- workersubmitterlog4j.properties
- threadpoolworker.properties
- hibernate.properties

The threadpoolworker.sh script was modified to increase the maximum JVM heap size to 1.5GB, which was established in previous batch performance testing. Explicit Garbage Collection was disabled, and Verbose GC logging was enabled so that garbage collection frequency and duration could be monitored. As there were no obvious issues with JVM tuning and Garbage Collection overhead was acceptable (see Java Virtual Machine), no further tuning of the JVM was attempted.
MEM_ARGS="-Xms512m -Xmx1536m -XX:MaxPermSize=192m -XX:+DisableExplicitGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/tmp/batchgc_1.log"

A full description of the Sun Java Hotspot VM Options can be found in [7],[8]. The Batch threadpoolworker process was started using the following command line arguments overriding the defaults in the threadpoolworker.properties file.
nohup ./threadpoolworker.sh -d Y -p DEFAULT=15 > $SPLOUTPUT/threadpool_out_1_$$.log &

Jobs were run in the DEFAULT queue. In the above example the queue has been started with 15 threads per process, which was the maximum tested.

Component                Software Version
ETM                      ETM 2.2.0 SP2, OUAF 2.2.0 SP5
Operating System         Oracle Enterprise Linux AS release 5
Database                 Oracle 11 (version 11.2.0.0), see note 1
J2EE Application Server  BEA WebLogic Server 10.0
Java                     Java HotSpot(TM) Server VM (build 1.5.0_19-b02, mixed mode)
JVM Monitoring           Jconsole
DB Monitoring            Oracle AWR & ASH Reports, SQLT, tkprof, Oracle Enterprise Manager
OS Monitoring            NMON and OS utilities
Load Test Tool           NA

Purpose             Quantity  Description
Application Server  1         Sun X4170. 2x quad-core Xeon L5520 2.27GHz, 48GB RAM, 2x 146GB 10K RPM SAS HDD
DB Server           1         Dell PowerEdge. 2x quad-core Xeon L5410 2.33GHz, 32GB RAM, 3x HDD (RAID5)
Disk Array          NA        NAS-attached Netapp Filer (shared)
Load Driver         NA        NA

Note: Although both servers in the SUT use Intel Xeon CPUs and the CPU speed of the database server is actually marginally higher than the application server's, the appserver CPU is a later-model Xeon and is significantly more powerful than the DB server's. When making sizing estimates for ETM, it is important to be specific about the exact make, model and CPU speed; specifying the clock speed by itself is inadequate. Modern x86-based Intel and AMD servers offer a significant price-performance advantage over proprietary CPU architectures, particularly for the stateless application and web servers.

Summary
This report focuses on the two high-volume batch jobs which process the Forms Upload staging records (C1-FUSDM and C1-FUSPC), as these ran for longer and consumed significantly more resources than C1-FBHMD, which processed the batch header records and completed in around a minute. Throughput of the Forms Upload jobs as tested was significantly higher than the requirement for a Medium Tax Agency (i.e. 82,000 records in 2 hours); furthermore, the 2-server SUT would only be classified as Small.

Table 1 shows the run time and throughput achieved processing the forms upload staging records. With 10 threads in a single batch threadpoolworker, C1-FUSDM and C1-FUSPC were each able to process the 100,000 records in 13-14 minutes. With 15 threads both jobs completed in around 11 minutes. Processing the batch headers with C1-FBHMD completed in under a minute. Run times for both jobs were generally similar, although the first job was always slightly faster. DB statistics also show both jobs doing a fairly similar amount of work. As the initial values of 10 threads and commit = 10 produced throughput well in excess of the target, no significant time was spent tuning the jobs further.

Adding a second process and running the jobs with 20 threads (i.e. 10 threads per process), throughput peaked at around 180 RPS and each job completed in less than 10 minutes. At this load, the first job, C1-FUSDM, consumed over 70% of the appserver CPU, which is considered a safe maximum load.

Though measured throughput was more than sufficient to meet the throughput objectives, both C1-FUSDM and C1-FUSPC generated exceptionally high database disk IO, particularly DB writes. For example, to process 100,000 records, C1-FUSDM and C1-FUSPC combined generated almost 65GB of DB IO, of which 49GB were writes. The root cause of the high IO is that the job was designed to use multiple plug-ins for flexibility and code re-use, and as a consequence each record is saved back to the database as control passes between plug-ins. This is described in more detail in the following section. Database CPU utilization was approximately the same for both jobs; however, C1-FUSPC suffered more from IO wait issues. C1-FUSDM was more CPU intensive on the appserver side. This may be due to the XML transformation, but the application was not profiled to confirm this.

Oracle Database
Tables 2a and 2b show the database IO statistics taken from the AWR report. It can be seen that the two jobs generated a combined total of 49GB of DB writes, of which 16GB was redo logging. Disk reads were significantly lower than writes, totaling approx 16GB.
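Outside of AWR, comparable instance-wide IO volumes can be sampled before and after a job from v$sysstat; a sketch (values are cumulative since instance startup, so take the delta between two samples):

```sql
-- Cumulative instance IO volumes; difference two samples to get per-job totals
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('physical read total bytes',
                'physical write total bytes',
                'redo size');
```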

The principal cause of the high disk IO, as mentioned previously, was a consequence of the decision to use multiple plug-ins. Splitting the job into smaller units of work using plug-ins is advantageous for code reuse and flexibility; however, in order to pass control between plug-ins it is necessary to save the record back to the database, which increases the overhead on both the database and the appserver. To process 100,000 records, C1-FUSDM updated the staging table 400,000 times and C1-FUSPC a further 200,000 times. When designing custom applications, the advantages of using multiple plug-ins should be weighed against the cost in terms of performance and scalability. CLOB columns are notably more expensive than other, less complex datatypes.

As a consequence of these updates, the size of the database swelled quickly. The Used (M) column in Figure 2 shows the amount of space used in the CISTS_01 tablespace as it grows from approx 4GB to 9.5GB. Over 50% of the additional space comes from the first job. In production, growth would be related to the size of the form.

Before C1-FUSDM
TABLESPACE_NAME   Size (M)   Blocks      Pieces   Free Blocks   Used (M)   PCT_FREE
CISTS_01          14,278     1,309,760   1        1,309,760     4,045.00   71.67
UNDOTBS1          7,608      21,504      314      809,488       7,440.00   83.12

After C1-FUSDM
TABLESPACE_NAME   Size (M)   Blocks      Pieces   Free Blocks   Used (M)   PCT_FREE
CISTS_01          14,278     875,104     1        875,104       7,440.75   47.88
UNDOTBS1          7,608      21,504      98       498,704       7,440.00   51.21

After C1-FUSPC
TABLESPACE_NAME   Size (M)   Blocks      Pieces   Free Blocks   Used (M)   PCT_FREE
CISTS_01          14,278     601,344     1        601,344       9,579.50   32.90
UNDOTBS1          7,608      21,504      145      688,032       7,440.00   70.65

Figure 3 identifies which segments in the database are growing. Initially, the LOB is stored inline in the staging table; the XML transforms cause the LOB to grow to around 4KB, at which point it migrates to the LOB segment. Consequently the CI_FORM_UPLD_STG table size is static, but the LOB segment grows by about 1.5GB during the first job and by over 3GB in total after the second job. During the second job, the ci_tax_form table grows by about 400MB as the records are inserted, which seems about right for 100,000 4KB forms. Overall, the log and log_parm tables each grow by about 50MB; logging can be disabled if required.
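Segment sizes such as those shown in Figure 3 can be listed with a query along these lines (a sketch against dba_segments; the SYS_LOB segment names are specific to this SUT):

```sql
-- Largest segments in the application tablespace
SELECT segment_name, segment_type,
       bytes / 1024 AS kbytes, blocks, extents
FROM   dba_segments
WHERE  tablespace_name = 'CISTS_01'
ORDER  BY bytes DESC;
```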

Before C1-FUSDM

TABLESPACE  SEGMENT_NAME               TYPE              KBYTES     BLOCKS  EXTENTS
----------  -------------------------  ----------  ------------  ---------  -------
CISTS_01    SYS_LOB0000152596C00018$$  LOBSEGMENT     5,098,240    637,280   19,915
CISTS_01    CI_FORM_UPLD_STG           TABLE          1,426,688    178,336    5,573
CISTS_01    CI_TAX_FORM                TABLE            653,824     81,728    2,554
CISTS_01    CI_FORM_UPLD_STG_LOG       TABLE            135,680     16,960      530
CISTS_01    CI_FORM_UPLD_STG_LOG_PARM  TABLE            135,424     16,928      529
CISTS_01    SYS_LOB0000154671C00011$$  LOBSEGMENT       100,352     12,544      392

After C1-FUSDM

TABLESPACE  SEGMENT_NAME               TYPE              KBYTES     BLOCKS  EXTENTS
----------  -------------------------  ----------  ------------  ---------  -------
CISTS_01    SYS_LOB0000152596C00018$$  LOBSEGMENT     6,625,024    828,128   25,879
CISTS_01    CI_FORM_UPLD_STG           TABLE          1,426,688    178,336    5,573
CISTS_01    CI_TAX_FORM                TABLE            653,824     81,728    2,554
CISTS_01    CI_FORM_UPLD_STG_LOG_PARM  TABLE            173,312     21,664      677
CISTS_01    CI_FORM_UPLD_STG_LOG       TABLE            169,728     21,216      663
CISTS_01    SYS_LOB0000154671C00011$$  LOBSEGMENT       105,728     13,216      413

After C1-FUSPC

TABLESPACE  SEGMENT_NAME               TYPE              KBYTES     BLOCKS  EXTENTS
----------  -------------------------  ----------  ------------  ---------  -------
CISTS_01    SYS_LOB0000152596C00018$$  LOBSEGMENT     8,279,296  1,034,912   32,341
CISTS_01    CI_FORM_UPLD_STG           TABLE          1,426,688    178,336    5,573
CISTS_01    CI_TAX_FORM                TABLE          1,055,488    131,936    4,123
CISTS_01    CI_FORM_UPLD_STG_LOG_PARM  TABLE            186,368     23,296      728
CISTS_01    CI_FORM_UPLD_STG_LOG       TABLE            180,992     22,624      707
CISTS_01    SYS_LOB0000154671C00011$$  LOBSEGMENT       139,520     17,440      545

It should be possible to reclaim the space associated with the staging records by truncating the table or deleting the records after the Forms Upload processing completes successfully. The before and after states of the form are both stored in the same LOB field, so even a fairly small LOB, as used here, ends up exceeding 4KB and migrating to the LOB segment. A further functional issue, noted and filed as a bug, was that the XML transform appears to add extra newline characters to every line of the form; however, this is not considered a problem as far as performance is concerned.

The Oracle Load Profile from the AWR report shows that both jobs are doing about the same amount of work. The frequency of Hard Parses is high, but analysis of SQL trace files shows that the hard parses are not caused by application SQL.

Tables 4a and 4b show the Top Timed Wait Events from the AWR report for a 15-minute snapshot; the DB server has 8 CPUs. Table 4a shows that the top wait event for C1-FUSDM is db file sequential read (i.e. a single-block index read). Although the average wait time is only 6ms, indicating that the array is responsive, there are over 75,000 waits for a total of 468 seconds. The root cause is the high number of single-row SQL executions, i.e. there is no array processing. By comparison, the server only consumed 528 seconds of CPU time. Log file syncs are also high; the root cause of this event is the high frequency of commits (the job committed every 10 records). The IO issue is significantly worse for C1-FUSPC. Although the average wait is still only 10ms, i.e. the responsiveness of the array is not the issue, the total time waited was over 2,600 seconds, accounting for 80% of the DB time, so the problem is the overall number of physical IO requests. Again, by contrast, the job only consumed 525 seconds of CPU time (from an available total of 900s * 8 CPUs = 7,200 CPU-seconds). The root cause, as above, is the high frequency of individual SQL executions.
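The capacity arithmetic in this paragraph can be checked with a short sketch (all figures are taken from the report; the snapshot length and CPU count are as stated above):

```python
# Capacity check for the 15-minute AWR snapshot (8-CPU DB server).
# All figures are taken from the report text above.
snapshot_s = 15 * 60             # 900 s of wall clock
cpu_capacity_s = snapshot_s * 8  # 7,200 CPU-seconds available

fuspc_cpu_s = 525        # CPU consumed by C1-FUSPC in the snapshot
fuspc_io_wait_s = 2_600  # time waited on db file sequential read

print(f"CPU busy: {fuspc_cpu_s / cpu_capacity_s:.1%}")         # ~7% of capacity
print(f"IO wait / CPU: {fuspc_io_wait_s / fuspc_cpu_s:.1f}x")  # ~5x
# The server is nowhere near CPU-bound; the time goes to a large
# number of small physical reads, i.e. the workload is IO-bound.
```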

Tables 5a and 5b show enq: HW contention waits, which were frequently encountered. This event is caused by concurrent requests to grow LOBs, so the root cause is the frequent updates to the CLOB already described. Pre-allocating space above the high-water mark seems to help. My Oracle Support suggests partitioning to spread the data over more segments, but to be effective the partitioning strategy would need to use the same algorithm the batch process uses to allocate records to threads; once the partitions are chosen, changing the number of threads would change the thread distribution, potentially neutralizing the benefit of partitioning.
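The coupling between a partitioning scheme and the thread-allocation algorithm can be illustrated with a toy model. This is a hypothetical sketch: it assumes records are assigned to threads by record id modulo the thread count, which is only a stand-in for the real ETM allocation algorithm, but the mismatch effect it demonstrates is the same.

```python
# Hypothetical sketch of why a partitioning scheme must mirror the
# batch thread-allocation algorithm. Assumes records go to thread
# (id % thread_count) -- a stand-in for the real ETM algorithm.
def thread_for(record_id: int, thread_count: int) -> int:
    return record_id % thread_count

def partition_for(record_id: int, partition_count: int) -> int:
    return record_id % partition_count

ids = range(1_000)
partitions = 10

# 10 threads against 10 matching partitions: each thread writes to
# exactly one partition, so threads never contend on a LOB segment.
touched = {t: {partition_for(i, partitions)
               for i in ids if thread_for(i, 10) == t} for t in range(10)}
assert all(len(p) == 1 for p in touched.values())

# Change the thread count to 15 without repartitioning: every thread
# now spans two partitions, reintroducing cross-thread contention.
touched = {t: {partition_for(i, partitions)
               for i in ids if thread_for(i, 15) == t} for t in range(15)}
print(sorted(len(p) for p in touched.values()))
```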

The AWR report presents top SQL sorted by various criteria, which helps to identify potential issues consuming different resources within the database. Often the same SQL will turn up in several of the top SQL reports; for example, high buffer gets will consume more CPU and increase the elapsed time of the job. In seeking to reduce the run time of the job, SQL ordered by Elapsed Time is probably the most useful view of the SQL data. If one SQL stands out as contributing to the total elapsed time, as the first update does here, then focusing on improving that SQL will pay the highest dividends. Almost 62% of the total elapsed time that C1-FUSDM spent in the database comes from the update to the CI_FORM_UPLD_STG table, which, as already noted, is executed 4 times for every record processed. The same SQL also appears at the top of the top SQL by CPU Time and by Gets, and has the second highest Physical Reads. This update is by far the most significant contributor to the overall time; consequently, reducing the number of plug-ins would have the greatest impact on run time and resource utilization. Several SQL statements are executed many times against staging table records. The fourth SQL statement in the table below is executed 900,000 times but processes no rows. A bug has been filed to correct this, as it appears to be unnecessary work, but compared to the update SQL at the top of the list, this SQL contributed less than 2% of the overall elapsed time. SQL ID 9u1ph53g2pw2g (select distinct FUS.C1_FORM_UP..) also stands out; it is discussed further under SQL ordered by Gets.

It can be seen that several of the SQL statements are common to both jobs. The top SQL in both cases is the update against the form upload staging table. The update appears to be more expensive for the second job: although there are only half as many updates, the elapsed time is greater. With the exception of 3 SQL statements, which are executed multiple times per record, the others are executed once per batch record. This indicates that the application is not batching requests (array processing), which would be more efficient.
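The difference between the observed row-at-a-time pattern and array processing can be illustrated with a small sketch. This is not ETM code; it uses an in-memory SQLite table as a stand-in for the staging table, and even without a network round trip per statement the batched form is typically faster. Against a remote Oracle instance the gap would be far larger.

```python
# Not ETM code: an in-memory SQLite table stands in for the staging
# table to contrast row-at-a-time execution with array processing.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg (id INTEGER PRIMARY KEY, status TEXT)")
rows = [(i, "PENDING") for i in range(100_000)]

# Row at a time: one execute call (and, against a real server, one
# network round trip) per record -- the pattern seen in the trace.
t0 = time.perf_counter()
for row in rows:
    conn.execute("INSERT INTO stg VALUES (?, ?)", row)
conn.commit()
row_at_a_time = time.perf_counter() - t0

conn.execute("DELETE FROM stg")
conn.commit()

# Array processing: the whole batch is bound and sent in one call.
t0 = time.perf_counter()
conn.executemany("INSERT INTO stg VALUES (?, ?)", rows)
conn.commit()
batched = time.perf_counter() - t0

print(f"row-at-a-time: {row_at_a_time:.3f}s, batched: {batched:.3f}s")
```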

Also worth noting is that there is an insert and an update against the CI_TAX_FORM table. The update changes all columns in the table except the two referenced in the WHERE clause, including 10 columns which are indexed. Another bug has been filed to investigate whether this can be changed. This is a consequence of separating the functionality into separate plug-ins.

The high buffer gets are principally caused by the high execution count already mentioned; however, the third SQL (SQL ID 9kpy9y872356n, select todotype) has high gets per execution and appears to be missing an index on CI_TD_TYPE.CRE_BATCH_CD. This SQL is present for both jobs, and a bug has been raised to address it. The last SQL in this list appears to be inefficient (the gets per execution is extremely high) and was also seen in the top SQL by Elapsed Time. Although it is only executed once, it consumes over 16 seconds of elapsed time and 10.8s of CPU time. It may benefit from additional indexing, but the improvement in performance of the query needs to be weighed against any degradation to updates that the additional indexes would cause, taking into account the frequency of updates on the staging table versus the benefit to this select, which is only executed once. This SQL may pose a scalability issue: as the staging table grows larger, it will likely take longer to run. Regular clean-up of forms upload staging records to remove completed batches should be performed to keep the number of records in the staging table to a minimum and reclaim space.
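The effect of the suggested index can be illustrated outside Oracle. This sketch uses SQLite's EXPLAIN QUERY PLAN on a mocked-up ci_td_type table (the table contents and the index name ix_cre_batch are invented for the example); the plan changes from a full scan to an index search once the index exists, which is what would cut the gets per execution.

```python
# Not Oracle: SQLite's EXPLAIN QUERY PLAN illustrates the effect of
# the missing index suggested above. The table contents and the
# index name ix_cre_batch are invented for this example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ci_td_type (td_type_cd TEXT, cre_batch_cd TEXT)")
conn.executemany(
    "INSERT INTO ci_td_type VALUES (?, ?)",
    [(f"T{i}", f"B{i % 50}") for i in range(1_000)],
)

query = "SELECT td_type_cd FROM ci_td_type WHERE cre_batch_cd = 'B7'"

# Without an index, the plan's detail column shows a full table scan.
print(conn.execute("EXPLAIN QUERY PLAN " + query).fetchone()[3])

# With an index on cre_batch_cd the plan becomes an index search,
# so the gets per execution drop accordingly.
conn.execute("CREATE INDEX ix_cre_batch ON ci_td_type (cre_batch_cd)")
print(conn.execute("EXPLAIN QUERY PLAN " + query).fetchone()[3])
```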

The number of buffer gets required per update to the form upload staging table has almost doubled for the second job, which explains why the elapsed time has increased. This may be a consequence of all the preceding updates.

The first SQL here was previously identified in the top SQL by Elapsed Time and SQL ordered by Gets. As previously noted, it may benefit from additional indexes, but as the staging table is heavily updated, the penalty to updates may mean that this is not worth the expense. This needs to be traded off against the increased cost of this query as the staging table grows larger. Its overall impact on the elapsed time, at the staging table size used here, is less than 1%.

Operating System Metrics


Chart 1a shows database CPU for C1-FUSDM while running with 10 and 15 threads. With 10 threads, the average CPU is less than 10%; with 15 threads it still only averages about 12% after a short initial spike when the job starts. C1-FUSDM has a significant amount of CPU wait throughout the job, although not as much as C1-FUSPC. This may correspond to the IO-related wait events in the AWR report (see Table 3a). The regular peaks in CPU wait correspond with the redo log switches (i.e. archiving).

Appserver CPU averaged approximately 43% for C1-FUSDM with 10 threads, and the load is fairly even. With 15 threads the average is 55%, but this is a little misleading due to the variability of the load: there are regular periods where the load is over 60%, interspersed with stalls where the load drops significantly. The stalls are not related to garbage collection. The explanation for the stalls is the high wait IO on the database seen in Chart 1a and the DB wait events seen in Table 5a, which indicates that the database is the bottleneck.

Database CPU utilization for C1-FUSPC is similar to C1-FUSDM; however, CPU wait is significantly higher and is evident even with 10 threads. This is also reflected in the IO-related waits section of the AWR report (Table 3b), with over 2,600 seconds attributed to db file sequential read, accounting for 80% of the wait time.

Appserver CPU utilization for C1-FUSPC is significantly lower than for C1-FUSDM. CPU utilization is erratic even with 10 threads and becomes more so with 15 threads; the variability is significantly more pronounced than with C1-FUSDM. The same explanation applies: the bottleneck for C1-FUSPC is also the database.

Memory
Neither the DB nor the application server suffered from a memory shortfall. In the case of the appserver this is hardly surprising, since there was only one batch threadpoolworker process and one WebLogic instance running on a server with 48GB of RAM. The database server had 32GB of RAM, and memfree dropped from around 12GB to approximately 6GB with 10 threads and 5GB with 15 threads. Memory drops consistently throughout the test and then stabilizes when the test completes. As the job completes in less than twelve minutes, this is insufficient time to determine whether there may be a memory issue if the job runs for significantly longer.

Typically, the Unix filesystem cache can account for a large chunk of memory which the OS will free as it is required, so there is potentially more memory available than memfree shows.

Network
As the DB disk array is accessed over the network and DB IO has already been identified as a significant load, it is not surprising that the network load on the DB server is high; however, it did not appear to be a bottleneck at this load. Writes averaged almost 40MB/s but were fairly constant. Reads were much lower at about 6MB/s but had substantial peaks as high as 80MB/s about once per minute; these are probably the redo logs being archived. Writes appear to continue at a lower rate for about 2 minutes after the batch job completes.

Note that the scales on Charts 3a and 3b are different. With 15 threads the system was processing approximately 160 records per second, and network IO on the appserver averaged about 4MB/s for reads and approximately 5MB/s for writes.

Disk
Local disk IO on the database server was insignificant, as the disk array traffic is all accounted for under network IO.

Note that the scale on this graph is in KB/sec. Logging accounts for the majority of the disk IO on the appserver. There is an initial peak when the job starts, spiking to ~17MB/sec before settling down to a fairly steady rate of about 1.5MB/s. Although the IO rate is not high in absolute terms, the ETM batch logs are overly verbose and most of the information does not seem to be particularly helpful. Conversely, useful information such as the parameters used to start the job is not logged. A bug has been filed.

The scale on this graph is in MB/sec

Java Virtual Machine


As previously noted, Garbage Collection did not appear to be an issue in any of the tests. With 15 threads, garbage collections occurred every 2 to 4 minutes and consistently took less than one second (an overhead of less than 1%). The heap grew to a little over 1GB although the maximum heap size was set to 1.5GB, which indicates that the JVM heap size is sufficient for at least 15 threads.

Note the X-axis is not proportional to time.

Overall performance exceeded the requirement to process 350,000 records in less than 2 hours: With 2 processes and 20 threads, 100,000 records were processed in less than 20 minutes (less than 10 minutes per job). This extrapolates to 300,000 records per hour.
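The extrapolation above is simple arithmetic:

```python
# Arithmetic behind the headline result above.
records = 100_000
minutes = 20  # both jobs complete in under 20 minutes

per_hour = records * 60 / minutes
print(f"{per_hour:,.0f} records/hour")  # 300,000 records/hour

required_per_hour = 350_000 / 2  # requirement: 350,000 in 2 hours
print(per_hour >= required_per_hour)  # True, with ample headroom
```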

As throughput increased, CPU consumption exhibited near-linear scalability. To achieve these throughput levels, application server CPU averaged 70% for C1-FUSDM, which was the more expensive of the two jobs. Database server CPU averaged less than 15% for both jobs.

The disk IO bandwidth required by the database is significant. Between them, the two jobs generated 49GB of writes (16GB of redo logging) and 16GB of reads in less than 25 minutes. This was evident in the high CPU wait% statistic recorded by vmstat on the database server and the number of IO-related wait events recorded in the Oracle AWR report.

The size and business rules of the tax form are expected to be significant factors in performance. The form used in these tests is described in Appendix A.

The design chosen for Forms Upload uses multiple plug-ins for flexibility and reuse, but this comes with a performance overhead, as each additional plug-in requires an update and select against the forms upload staging table. When designing performance-critical functionality, consider minimizing the number of plug-ins, particularly where a plug-in is manipulating a CLOB.

Although several performance issues and enhancements are identified in this report, none are considered serious enough to prevent Forms Upload meeting performance requirements. The single most effective enhancement would be to use fewer plug-ins to reduce the number of updates on the staging table.

Appendix A: Sales & Use Tax Form


<suspenseIssueList>
  <issuesList>
    <messageParms/>
    <messageParmTypes/>
  </issuesList>
</suspenseIssueList>
<formData>
  <salesUseShortForm>
    <externalFormSourceID>EX-FUSUFFSQA</externalFormSourceID>
    <formType>SU-SHORT</formType>
    <documentLocator>EXTFEXTC1266017154437</documentLocator>
    <periodStartDate>2009-12-01</periodStartDate>
    <periodEndDate>2009-12-31</periodEndDate>
    <dateReceived>2010-01-15</dateReceived>
    <taxpayerName>Fus.Fname4062, Fus.Lname4062</taxpayerName>
    <taxpayerIdType>SSN</taxpayerIdType>
    <taxpayerIdNumber>017-15-4062</taxpayerIdNumber>
    <address1>4062, Address Line1</address1>
    <city>San Jose</city>
    <state>CA</state>
    <country>USA</country>
    <postal>95112</postal>
    <accountNumber>0497269484</accountNumber>
    <filingType>C1NP</filingType>
    <returnType>C1OR</returnType>
    <totalSales>543238.00</totalSales>
    <purchasesSubjectToUseTax>34500.00</purchasesSubjectToUseTax>
    <totalSalesPurchases>577738.00</totalSalesPurchases>
    <nonTaxableSales>0.00</nonTaxableSales>
    <nonTaxableLabor>0.00</nonTaxableLabor>
    <salesToGovernment>23890.00</salesToGovernment>
    <salesTaxDueInState>0.00</salesTaxDueInState>
    <salesTaxDueOutOfState>0.00</salesTaxDueOutOfState>
    <otherDeductions>0.00</otherDeductions>
    <totalExemptionsDeductions>23890.00</totalExemptionsDeductions>
    <taxableSales>553848.00</taxableSales>
    <taxRate>8.25</taxRate>
    <totalTaxDue>45692.46</totalTaxDue>
    <prepayments>0.00</prepayments>
    <remainingTaxDue>45692.46</remainingTaxDue>
    <interestDue>0.00</interestDue>
    <penaltyDue/>
    <amountDueOrRefund>45692.46</amountDueOrRefund>
    <paymentAmount>0</paymentAmount>
    <taxpayerTelephoneNumber>266-715-4078</taxpayerTelephoneNumber>
  </salesUseShortForm>
</formData>
<transformedData/>

Appendix B: Sales & Use Tax Form after transformation by C1-FUSDM


After transformation by C1-FUSDM, the original record has been reformatted and the transformed record is appended to the end of the record within <transformedData> tags. The original record also has additional whitespace added, which has been filed as a bug.
<suspenseIssueList>
  <issuesList>
    <messageParms/>
    <messageParmTypes/>
  </issuesList>
</suspenseIssueList>
<formData>
  <salesUseShortForm>
    <externalFormSourceID>EX-FUSUFFSQA</externalFormSourceID>
    <formType>SU-SHORT</formType>
    <documentLocator>EXTFEXTC1268530752562</documentLocator>
    <periodStartDate>2009-12-01</periodStartDate>
    <periodEndDate>2009-12-31</periodEndDate>
    <dateReceived>2010-01-15</dateReceived>
    <taxpayerName>Fus.Fname2218, Fus.Lname2218</taxpayerName>
    <taxpayerIdType>SSN</taxpayerIdType>
    <taxpayerIdNumber>530-75-2218</taxpayerIdNumber>
    <address1>2218, Address Line1</address1>
    <city>San Jose</city>
    <state>CA</state>
    <country>USA</country>
    <postal>95112</postal>
    <accountNumber>0967957492</accountNumber>
    <filingType>C1NP</filingType>
    <returnType>C1OR</returnType>
    <totalSales>543238.00</totalSales>
    <purchasesSubjectToUseTax>34500.00</purchasesSubjectToUseTax>
    <totalSalesPurchases>577738.00</totalSalesPurchases>
    <nonTaxableSales>0.00</nonTaxableSales>
    <nonTaxableLabor>0.00</nonTaxableLabor>
    <salesToGovernment>23890.00</salesToGovernment>
    <salesTaxDueInState>0.00</salesTaxDueInState>
    <salesTaxDueOutOfState>0.00</salesTaxDueOutOfState>
    <otherDeductions>0.00</otherDeductions>
    <totalExemptionsDeductions>23890.00</totalExemptionsDeductions>
    <taxableSales>553848.00</taxableSales>
    <taxRate>8.25</taxRate>
    <totalTaxDue>45692.46</totalTaxDue>
    <prepayments>0.00</prepayments>
    <remainingTaxDue>45692.46</remainingTaxDue>
    <interestDue>0.00</interestDue>
    <penaltyDue/>
    <amountDueOrRefund>45692.46</amountDueOrRefund>
    <paymentAmount>0</paymentAmount>
    <taxpayerTelephoneNumber>268-075-2218</taxpayerTelephoneNumber>
  </salesUseShortForm>
</formData>
<transformedData>
  <C1-SalesAndUseTaxForm>
    <formType>SUSHORT</formType>
    <bo>C1-SalesAndUseTaxForm</bo>
    <totalAmountDueOverpaid>45692.46</totalAmountDueOverpaid>
    <remittanceAmount>0.00</remittanceAmount>
    <formSource>FUSUFFSQA</formSource>
    <taxpayerDemographicInformation>
      <taxpayerName>Fus.Fname2218, Fus.Lname2218</taxpayerName>
      <taxpayerIdType>SSN</taxpayerIdType>
      <taxpayerIdValue>530-75-2218</taxpayerIdValue>
      <address1>2218, Address Line1</address1>
      <city>San Jose</city>
      <state>CA</state>
      <country>USA</country>
      <postal>95112</postal>
    </taxpayerDemographicInformation>
    <receiveDate>2010-01-15</receiveDate>
    <taxFormFilingType>C1OR</taxFormFilingType>
    <documentLocator>EXTFEXTC1268530752562</documentLocator>
    <accountNumber>0967957492</accountNumber>
    <filingPeriodStartDate>2009-12-01</filingPeriodStartDate>
    <filingPeriodEndDate>2009-12-31</filingPeriodEndDate>
    <totalGrossSales>543238.00</totalGrossSales>
    <purchasesSubjectToTax>34500.00</purchasesSubjectToTax>
    <total>577738.00</total>
    <salesTaxInTotalGrossSales>0.00</salesTaxInTotalGrossSales>
    <nonTaxableSales>0.00</nonTaxableSales>
    <nonTaxableLabor>0.00</nonTaxableLabor>
    <salesToGovernment>23890.00</salesToGovernment>
    <salesInInterstateForeignCommerce>0.00</salesInInterstateForeignCommerce>
    <otherDeductions>0.00</otherDeductions>
    <totalExemptTransactions>23890.00</totalExemptTransactions>
    <taxableTransactions>553848.00</taxableTransactions>
    <taxRate>8.25</taxRate>
    <totalAssessedTaxAmount>45692.46</totalAssessedTaxAmount>
    <taxPrepayments>0.00</taxPrepayments>
    <remainingTaxDue>45692.46</remainingTaxDue>
    <penalty/>
    <interest>0.00</interest>
    <totalTaxAmountOwed>45692.46</totalTaxAmountOwed>
    <taxpayerTelephoneNumber>268-075-2218</taxpayerTelephoneNumber>
  </C1-SalesAndUseTaxForm>
</transformedData>

Appendix D: Bugs Filed


Num      Reported   Sev  Ver      Subject
-------  ---------  ---  -------  ---------------------------------------------------------------------------
9604450  19-APR-10  3    2.2.0SF  EXTRA BLANK LINES IN STAGING RECORDS TRANSFORMATION
9589161  15-APR-10  3    2.2.0    C1-FUSPC SEVERAL SQL WHICH NEVER PROCESS ANY ROWS
9588829  15-APR-10  3    2.2.0    C1-FUSPC UPDATES CI_TAX_FORM CHANGING ALL COLUMNS INCLUDING 10 INDEXED COLS
9588655  15-APR-10  3    2.2.0    C1-FUSDM SQL EXECUTED 9 TIMES/REC, DOES 9 MILLION GETS BUT PROCESSES NO RECS
9564316  08-APR-10  3    2.2.0    C1-FUSDM UPDATES CI_FORM_UPLD_STG 4 TIMES PER RECORD
9493018  19-MAR-10  3    2.2.0.0  SQL ON CI_TD_TYPE TABLE SHOULD BE CACHED AND INDEXED (C1-FUSDM & C1-FUSPC)
9368627  10-FEB-10  3    2.2.0.0  DISPLAYING FBH FOR A BATCH OF 1000 RECORDS TAKES OVER 60 SECONDS
