Performance Tuning for Oracle WebCenter Content 11g: Strategies & Tactics

Chris Rothwell, Fishbowl Solutions
Paul Heupel, Fishbowl Solutions

Introduction
Oracle WebCenter Content 10g functionality was effectively contained in one container: the Content Server. This fact alone made it easy to deploy, administer, and customize. However, for all of this ease of use, the product was somewhat lacking in scalability and performance. With Oracle WebCenter Content 11g, and the inclusion of Oracle WebLogic Server, SOA, and BPM, the product has expanded its functionality to achieve best-in-class performance and scalability. The new tradeoff is that these additional infrastructure components have created a layer of complexity that often leads to delayed deployments and non-optimized systems. The good news is that with the right tuning strategies and appropriate use of reverse proxies and load balancing, you can truly optimize WebCenter Content 11g and maximize your technology investment.

Tuning WebCenter Content


Oracle's ECM solution has its roots in the Stellent Content Management offering. From Xpedio 4.5 to Oracle 10gR3, the content management system's core was a Java Standard Edition based solution. WebCenter Content 11g is deployed as a Fusion Middleware solution. Optimization techniques that held true from the Xpedio days to UCM 10gR3 may no longer apply to the latest incarnation of Fusion Middleware's Enterprise Content Management (ECM) system.

Content Server architecture

Memory and Java Virtual Machine (JVM) Tuning


Memory is still one of the most significant performance tuning areas with Oracle WebCenter Content. As part of the Fusion Middleware stack, WebCenter Content requires a Java Enterprise Edition container. Prior versions of the Content Server ran as a Java Standard Edition application. Under UCM 10gR3 and earlier, you could specify JVM tuning in the $UCM_HOME/bin/intradoc.cfg configuration file. Tuning was somewhat limited, since the JAVA_OPTIONS would append custom parameters with computed values.
2012. Fishbowl Solutions, Inc.

In WebCenter Content 11g, the content management solution is deployed inside WebLogic Server (WLS), with any JVM tuning performed on the application server. You have full control, either by modifying the managed server using the administrative console or by modifying the USER_MEM_ARGS environment variable in the startup scripts. Oracle's documentation suggests the following on Unix and Windows with the JRockit JVM:

-Xms256m -Xmx1024m -XnoOpt

The -Xmx flag specifies the maximum heap size, with this example specifying 1 GB of memory. Best practice is to keep your JVM heap settings under 75-80% of the available physical RAM, within limits for machines with excessive amounts of memory. As heap size increases, CPU load also increases because garbage collections become larger. Under 32-bit operating systems, 1.5 GB is the practical maximum, assuming other services are not consuming resources. The -Xms flag specifies the minimum heap size on initial startup. Increasing the heap takes considerable time, so it is best to set the -Xms and -Xmx parameters to the same size. For example:

-Xms1024m -Xmx1024m -XnoOpt -XgcPrio:throughput

On x86 and x64 hardware, JRockit should be the preferred JVM. JRockit was a Java virtual machine optimized for x86 hardware by Intel, purchased by BEA, and acquired by Oracle. The JRockit JVM performs significantly faster on x86 or x64 Windows and Linux architectures than Sun's architecturally neutral JVM implementation.

An example of JVM tuning, from another Oracle whitepaper, started with:

-Xms3g -Xmx3g -XX:PermSize=512m -XX:MaxPermSize=512m -XX:+UseParallelGC -XX:ParallelGCThreads=8 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:NewRatio=3 -XX:+UseAdaptiveSizePolicy -XX:+AggressiveHeap -XX:+DisableExplicitGC -Xnoclassgc -Xloggc:<file_name>

and continued to tune WebCenter as:

-d64 -server -Xms3g -Xmx3g -XX:PermSize=512m -XX:MaxPermSize=1024m -XX:+AggressiveOpts -XX:+UseParallelGC -XX:ParallelGCThreads=16 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:NewRatio=4 -Xnoclassgc -Xloggc:<file_name> -Dweblogic.threadpool.MinPoolSize=72 -Dweblogic.threadpool.MaxPoolSize=72 -Dweblogic.SocketReaders=12 -Djps.auth.debug=false
Operating system architecture does not on its own provide enough information to properly tune the Content Server. As seen in the above example, repeated tuning and testing was required to find an optimum configuration. The content repository has the additional complexity of requiring different performance configurations for contribution and consumption environments. A heavy ingestion pattern will benefit from -XgcPrio:throughput garbage collection, while searching may benefit from other GC models. Confirm that your capitalization is correct. In many cases, command-line options are case sensitive unless explicitly stated. An improperly set configuration flag may be ignored or cause unintended consequences.
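The heap-sizing guidance in this section can be turned into a small helper. The following is a minimal sketch in Python, not an Oracle-provided tool: the function name, the 0.75 default fraction, and the 4096 MB ceiling for large machines are illustrative assumptions, and the result is only a starting point for the repeated tuning and testing described above.

```python
def build_user_mem_args(physical_ram_mb, is_32bit=False, ram_fraction=0.75,
                        cap_mb=4096, gc_priority="throughput"):
    """Return a JRockit-style memory argument string with equal -Xms and -Xmx."""
    # Stay under roughly 75-80% of physical RAM, as the guidance suggests.
    heap_mb = int(physical_ram_mb * ram_fraction)
    # cap_mb is a hypothetical ceiling for machines with very large memory.
    heap_mb = min(heap_mb, cap_mb)
    if is_32bit:
        # 1.5 GB is the practical maximum on 32-bit operating systems.
        heap_mb = min(heap_mb, 1536)
    # Equal minimum and maximum sizes avoid costly heap growth at runtime.
    return f"-Xms{heap_mb}m -Xmx{heap_mb}m -XnoOpt -XgcPrio:{gc_priority}"


print(build_user_mem_args(2048))
print(build_user_mem_args(8192, is_32bit=True))
```

The returned string could be assigned to USER_MEM_ARGS in a startup script. Generating the flags from one place also reduces the risk of the capitalization mistakes mentioned above.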

Disk Usage
WebCenter Content, like the earlier versions of the content repository, has a variety of disk mounting options, with implications for what type of storage may be appropriate for each area. Directories within the content repository may have different service level agreements and performance requirements. Using a single storage system does not produce an optimal balance of performance and cost.

In the latest incarnation of the Oracle Content Repository, a shared file system is still required for clustering. The ECM services run as Java processes. Prior to 11g, these services took the strategy of keeping a memory cache, writing to a shared file system or database, and having the other nodes update their local cache. All content management services continue to be stateless and utilize the same concurrency mechanism even though they are living in a Java Enterprise Edition world.

Content Server in a clustered configuration


High-performance, low-latency shared disk space is critical for performance in the shared directory. When a file is ingested into the content repository, it is placed in the

<MiddlewareHome>/user_projects/domains/<FMW_Domain>/ucm/cs/vault/~temp
directory. From that directory, the file is copied to any refineries and copied for full-text indexing, any necessary transformations are created, and the file is moved to the appropriate vault and weblayout locations. File I/O is key to that ~temp directory, with five or more read operations as part of a standard check-in. All other subdirectories within the vault hold the native or original file checked into the repository. The vault directory is a long-term archive for the asset and should be viewed from a disaster recovery perspective. A copy of the file, or a version intended for heavy consumption, is typically placed in the weblayout directory. Any file in the weblayout directory could be recreated, so emphasis should be on performance rather than reliability.

In 10gR4 and below, Content IDs, or dDocNames, required optimizations like the Fast Checkin component to get around row locking on the counter tables under heavy ingestion. The 11g repository changed the way the identifiers are generated, caching a block of content identifiers. There may be minor gaps in the sequence of content identifiers, which can be ignored.

Prior to 11g, a typical installation would have data, search, shared, and weblayout directories that were typically excluded from virus scanning. These directories still exist in 11g, but are now found in the domain directory rather than the base UCM path. For example, in 10g:

<UCM_HOME>/server/weblayout
became

<MiddlewareHome>/user_projects/domains/<FMW_Domain>/ucm/cs/weblayout
WebLogic logging directories should also be excluded from virus scanning in version 11g and later.
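The block caching of content identifiers described earlier in this section, and the reason it leaves harmless gaps, can be illustrated with a short simulation. This is a sketch of the general technique in Python, not the Content Server's actual implementation: the class name, the block size, and the in-memory counter standing in for the database counter table are all assumptions.

```python
class BlockIdAllocator:
    """Hands out IDs from a cached block, touching the counter once per block."""

    def __init__(self, block_size=10):
        self.block_size = block_size
        self.counter_table = 0     # stands in for the shared counter row
        self.counter_updates = 0   # how often the shared row was locked
        self._next = 0
        self._limit = 0            # zero means no block is currently cached

    def _reserve_block(self):
        # A single counter update reserves block_size identifiers at once.
        self._next = self.counter_table + 1
        self.counter_table += self.block_size
        self._limit = self.counter_table
        self.counter_updates += 1

    def next_id(self):
        if self._limit == 0 or self._next > self._limit:
            self._reserve_block()
        value = self._next
        self._next += 1
        return value

    def restart(self):
        # A restart discards the unused remainder of the cached block,
        # which is where gaps in the identifier sequence come from.
        self._next = 0
        self._limit = 0


allocator = BlockIdAllocator(block_size=10)
before = [allocator.next_id() for _ in range(3)]
allocator.restart()
after = allocator.next_id()
print(before, after, allocator.counter_updates)
```

In this sketch, three check-ins followed by a restart consume only two counter updates, at the cost of skipping the identifiers left in the discarded block.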

Logging
11g uses WebLogic logging. The granularity of information sent to the logging system ranges from TRACE, DEBUG, INFO, NOTICE, WARNING, ERROR, CRITICAL, and ALERT to EMERGENCY. In a production environment, change the logging level to ERROR. You can either modify <MiddlewareHome>/user_projects/domains/<FMW_Domain>/config/servers/UCM_server1/logging.xml or modify the logging levels using the WebLogic administrative console.
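To make the effect of the ERROR threshold concrete, the sketch below orders the severities as listed above and shows which messages would still be written. It is an illustration in Python only, not WebLogic code; the function names and sample messages are hypothetical.

```python
# Severity levels in the order given above, least to most severe.
SEVERITIES = ["TRACE", "DEBUG", "INFO", "NOTICE", "WARNING",
              "ERROR", "CRITICAL", "ALERT", "EMERGENCY"]


def is_logged(message_severity, threshold="ERROR"):
    """A message is written when it is at least as severe as the threshold."""
    return SEVERITIES.index(message_severity) >= SEVERITIES.index(threshold)


def filter_messages(messages, threshold="ERROR"):
    """Return only the (severity, text) pairs that pass the threshold."""
    return [(sev, text) for sev, text in messages if is_logged(sev, threshold)]


sample = [
    ("DEBUG", "cache refreshed"),
    ("WARNING", "slow query"),
    ("ERROR", "check-in failed"),
    ("CRITICAL", "database unreachable"),
]
print(filter_messages(sample))
```

With the threshold at ERROR, the DEBUG and WARNING entries are dropped, which is what reduces logging overhead in production.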

File Store Providers


Oracle's ECM solution moved to a File Store Provider to accommodate different usage patterns. The default file store provider in 11g continues to use the vault/weblayout file structure. Classically, the Oracle ECM solution would store relational data in a database and files in a file system. As the number of managed assets increased, some scalability issues became apparent. Three metadata fields (dDocType, dSecurityGroup, and dSecurityAccount) were used to spread the assets out to multiple directory structures. There is a limit to how many files can go into a directory structure, and as the number of assets grew into the tens of millions, hundreds of millions, and eventually billions, inode issues and disk management became a bottleneck. UCM updated the default file store provider to add additional dispersion directories to spread out the files.

A database file store provider was added, where the assets are persisted in the database rather than a file system. The Oracle 11gR2 Database SecureFiles API improved performance by over 40% compared to the 10g implementation. Performance matches, and in some cases exceeds, major networked file systems. In addition to the I/O gains, repositories that have Database Compression will automatically have de-duplication performed against content stored in the repository.

When content is uploaded to the repository, a temporary file is placed in the vault/~temp location, with a cache cleanup eventually clearing out that disk space. The current version allows that cache to be limited to one day, so care must be taken when ingesting very large volumes of content. Content must also be indexed before that temporary area becomes a candidate for cleanup.
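The idea behind dispersion directories can be sketched as follows. This Python example is a hypothetical hash-based layout, not the dispersion rule the file store provider actually uses: the function name, the use of SHA-256, and the two-level, two-character bucket scheme are all assumptions chosen to show how extra directory levels keep any single directory small.

```python
import hashlib


def dispersed_path(doc_type, security_group, content_id,
                   account="", levels=2, width=2):
    """Build a storage path with extra hash-derived dispersion directories."""
    # Hash the content ID so files spread evenly regardless of naming patterns.
    digest = hashlib.sha256(content_id.encode("utf-8")).hexdigest()
    # Each level takes `width` hex characters: 2 levels x 2 characters gives
    # 256 * 256 = 65,536 leaf directories per metadata combination.
    buckets = [digest[i * width:(i + 1) * width] for i in range(levels)]
    parts = [doc_type, security_group]
    if account:
        parts.append(account)
    return "/".join(parts + buckets + [content_id])


print(dispersed_path("Document", "Public", "ID_000123"))
```

Because the bucket names are derived from the content ID, the path can be recomputed at read time without a lookup, and adding a level multiplies the number of leaf directories without changing the metadata-based prefix.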

Virtualization
Oracle differentiates between hard and soft partitioning from a licensing perspective. With hard partitioning in use, one only licenses the CPU used by the virtual machine. Soft partitioning requires licensing for all CPUs in the host machine. Oracle VM can be configured to qualify as hard partitioning, but EMC VMware is considered soft partitioning. Hardware prices are trivial compared to software, so optimize the virtual hosts to consolidate licenses. Typically, multiple smaller instances perform better than fewer larger instances. Attempt to optimize CPU utilization, adding additional CPUs to the host servers as needed. While CPU architecture, sockets, and cores impact the licensing costs, memory does not. A physical CPU may be shared among multiple virtual machines, but memory should not be a pooled resource.

Services and Components


WebCenter Content continues the service-based architecture introduced in earlier versions of the content repository. Services that return search results, metadata, or actual assets can be extended or overridden. GET_SEARCH_RESULTS, for example, can return a large amount of data if a repository has many custom metadata fields. The content repository will cache the search results, but network traffic can be significantly reduced by creating a template that returns only the fields and result sets needed.
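The payload reduction from returning only the needed fields can be estimated with a quick experiment. The sketch below is plain Python operating on made-up result rows, not a Content Server template: the x-prefixed custom fields and their values are hypothetical, and JSON size is used only as a rough stand-in for network traffic.

```python
import json


def trim_results(rows, wanted_fields):
    """Keep only the requested fields from each search result row."""
    return [{name: row[name] for name in wanted_fields if name in row}
            for row in rows]


# Hypothetical result rows; the x-prefixed custom fields are made up.
full_rows = [
    {
        "dDocName": f"ID_{n:06d}",
        "dDocTitle": f"Quarterly report {n}",
        "dDocType": "Document",
        "dSecurityGroup": "Public",
        "xDepartment": "Finance",
        "xRegion": "EMEA",
        "xRetentionNotes": "Retain for seven years after fiscal close.",
    }
    for n in range(1, 51)
]

slim_rows = trim_results(full_rows, ["dDocName", "dDocTitle"])

full_size = len(json.dumps(full_rows))
slim_size = len(json.dumps(slim_rows))
print(f"full: {full_size} bytes, trimmed: {slim_size} bytes")
```

The more custom metadata fields a repository defines, the larger the gap between the two sizes becomes.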

Idoc Script includes can be cached, so the HTML is dynamically rendered and then placed in session scope for a specific user or application scope for all users. The cacheInclude method takes the include name, scope, and life span as required parameters. For example, the std_page_begin include would be cached for ten minutes for each user:

<$cacheInclude("std_page_begin", "session", 600)$>

11g continues to lack a default success status code or message returned with all services. Content Server services typically indicate an error by setting the StatusCode property to a non-zero number. CIS, RIDC, and several other integration methods will potentially throw an exception when there are problems, but will absolutely throw an exception if you look for the StatusCode property and it is missing. One can either trust that the service will throw an exception on failure and otherwise assume it worked, or modify the Content Server to set a default success status code.
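A third option on the client side is to read the status defensively, treating a missing StatusCode as success. The following Python sketch shows the pattern on a plain dictionary of response properties; it does not use the CIS or RIDC APIs, and the function name, exception class, and use of a StatusMessage property for the error text are assumptions for illustration.

```python
class ServiceError(Exception):
    """Raised when a service response carries a non-zero status code."""


def check_status(response_properties):
    """Return the status code, defaulting to 0 when the property is absent."""
    # A missing StatusCode is treated as success instead of raising a lookup
    # error, since successful responses may not include the property at all.
    code = int(response_properties.get("StatusCode", "0"))
    if code != 0:
        message = response_properties.get("StatusMessage", "")
        raise ServiceError(f"service failed with status {code}: {message}")
    return code


print(check_status({"dDocName": "ID_000123"}))
```

The same default-then-compare approach carries over to other client libraries: look the property up in a way that tolerates its absence, then raise only on a non-zero value.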

Reverse Proxy
When architecting a website for high performance using WebCenter Content, a reverse proxy can be used to improve both performance and security for your site. A reverse proxy functions as a gateway to your network and adds an additional layer of caching for site visitors that helps improve page response time, particularly under heavy load. Typically, a reverse proxy resides in the DMZ of your network and is the entry point for users accessing your site. The standard process flow for a user accessing a site behind a reverse proxy is as follows:

1. A user enters http://www.mysite.com in a browser.
2. DNS directs the user to the reverse proxy server that is sitting in your DMZ.
3. The reverse proxy determines if the request is being made for static content or dynamic content.
4. If static content is being requested, the reverse proxy will check its cache and will return the cached page to the user if the page is found in the cache.
5. If dynamic content is being requested or if the reverse proxy does not have the page in its cache, it will send a request through the firewall to a web server inside your network to retrieve the requested page and will return that page to the user.

The performance gains from caching at the reverse proxy level are contingent on a number of factors, including the number of static resources and pages that users are requesting, the frequency at which those items are accessed, and the hardware and network infrastructure that is being used. One popular reverse proxy caching application, Varnish Cache, claims to improve delivery by a factor of 300-1000x, depending on architecture, when serving a page from cache (www.varnish-cache.org).

Besides the caching advantage of this model, your site also gains an additional level of security by implementing a reverse proxy. All requests made to your site are filtered through the reverse proxy server, which limits an end user's ability to discover server names or other network architecture information that could potentially be used to compromise your systems. Additionally, there is only a single entry point through your firewall, namely between your proxy server and your web server, so network administrators have considerably more control over limiting the traffic that is allowed past the firewall.
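The request flow in the numbered steps above can be sketched as a small simulation. This is illustrative Python, not the configuration of any particular proxy product: the class name, the suffix-based test for static content, and the choice to store a fetched static page in the cache for later requests are assumptions.

```python
# Hypothetical rule for step 3: treat these file suffixes as static content.
STATIC_SUFFIXES = (".html", ".css", ".js", ".png", ".jpg", ".gif")


class ReverseProxy:
    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # call to the inner web server
        self.cache = {}
        self.origin_requests = 0  # requests that crossed the firewall

    def is_static(self, path):
        return path.lower().endswith(STATIC_SUFFIXES)

    def handle(self, path):
        static = self.is_static(path)
        # Step 4: serve static content straight from the cache when present.
        if static and path in self.cache:
            return self.cache[path]
        # Step 5: dynamic content or a cache miss goes through the firewall.
        self.origin_requests += 1
        body = self.fetch_from_origin(path)
        if static:
            self.cache[path] = body
        return body


proxy = ReverseProxy(lambda path: f"content of {path}")
for requested in ["/index.html", "/index.html", "/search?q=tuning"]:
    proxy.handle(requested)
print(proxy.origin_requests)
```

Counting the requests that reach the origin makes the benefit measurable: every repeated static request that stays at the proxy is load removed from the Content Server and the web tier behind the firewall.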

Under a contribution-consumption site architecture model, utilizing a reverse proxy allows your network administrator to keep both the contribution and the consumption Content Server instances inside the firewall. The network architecture diagram below demonstrates using multiple reverse proxies with a load balancer for added performance in a contribution-consumption Site Studio web site model.
