Copyright 2010 by Alfresco and others. Information in this document is subject to change without notice. No part of this document may be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without the express written permission of Alfresco. The trademarks, service marks, logos, or other intellectual property rights of Alfresco and others used in this documentation ("Trademarks") are the property of Alfresco and their respective owners. The furnishing of this document does not give you license to these patents, trademarks, copyrights, or other intellectual property except as expressly provided in any written agreement from Alfresco. The United States export control laws and regulations, including the Export Administration Regulations of the U.S. Department of Commerce, and other applicable laws and regulations apply to this documentation which prohibit the export or re-export of content, products, services, and technology to certain countries and persons. You agree to comply with all export laws, regulations, and restrictions of the United States and any foreign agency or authority and assume sole responsibility for any such unauthorized exportation. You may not use this documentation if you are a competitor of Alfresco, except with Alfresco's prior written consent. In addition, you may not use the documentation for purposes of evaluating its functionality or for any other competitive purposes. This copyright applies to the current version of the licensed program.
Document History

VERSION  DATE        AUTHOR
0.1      2010-02-23  Peter Monks
0.2      2010-02-24  Peter Monks
0.3      2010-03-01  Peter Monks
0.4      2010-03-25  Peter Monks
0.5      2010-04-12  Peter Monks
0.6      2010-04-29  Peter Monks
0.7      2010-05-06  Helen Mullally
0.8      2010-05-06  Peter Monks
0.9      2010-06-14  Peter Monks
0.10     2010-08-17  Peter Monks
0.11     2010-08-25  Peter Monks
0.12     2010-11-22  Peter Monks
0.13     2010-11-23
0.14     2010-12-16

DESCRIPTION OF CHANGE (in chronological order)
o First draft released for review by Mark Lugert
o Second draft released for review by US support, SE and consulting teams
o Added Alfresco support logo, updated port lists, solidified RHEL validation instructions, third draft released for WW review
o Added more detailed information on tuning databases
o Added information on 3rd party app (OpenOffice, ImageMagick, pdf2swf) configuration
o Added extra UTF8 checks for MySQL, courtesy of Scott Ashcraft
o Added version check SQL for MySQL
o Added SQL validation statements for PostgreSQL, Oracle and MS SQL Server
o Migrated to Alfresco Documentation Template
o Submitted to docs team
o Copy edit and comments. Please use Track Changes to accept or reject changes.
o Reviewed changes and accepted / rejected as appropriate
o Added links to Environment Validation tool
o Removed (now redundant) appendices
o Added note about clustering and db.pool.max
o Added section on hibernate.jdbc.fetch_size
o Updated virtual file servers thread pool configuration for v3.2+
o Added note on JIT compiler exclusions
o Added recommendation for db.pool.idle
o Added notes on DB2
o Added recommendation for in-transaction indexing
o Added recommendation for quota calculation
o Added recommendation for db.pool.validate.query
o Added recommendation for JVM stack size
o Added recommendation for index.recovery.mode
Table of Contents
INTRODUCTION
  DOCUMENT PURPOSE
  INTENDED AUDIENCE
  GLOSSARY
ARCHITECTURE VALIDATION
  SUPPORTED STACKS FOR ALFRESCO
  HARDWARE
    I/O
    CPU
  DATABASE
    Maintenance and tuning
  OPERATING SYSTEM
  JAVA VIRTUAL MACHINE
ENVIRONMENT VALIDATION
DAY ZERO CONFIGURATION
  JVM TUNING
    Increase JVM heap
    Reduce JVM stack
    Remove JIT exclusions
  SET DIR.ROOT TO ABSOLUTE PATH
  ENABLE AUTOMATIC SEARCH INDEX RECOVERY
  DATABASE CONNECTION POOL
    Maximum Size
    Idle Size
    Validation Query
  DATABASE FETCH SIZE
  IN-TRANSACTION FULL TEXT INDEXING (OPTIONAL)
  QUOTA CALCULATIONS (OPTIONAL)
  APPLICATION SERVER WORKER THREAD POOL (OPTIONAL)
  VIRTUAL FILE SERVER (VFS) WORKER THREAD POOL (OPTIONAL)
  SHAREPOINT PROTOCOL WORKER THREAD POOL (OPTIONAL)
  JODCONVERTER-BASED OPENOFFICE INTEGRATION
  CONFIGURE OTHER THIRD PARTY APPLICATIONS
Introduction
Document purpose
By default, Alfresco's configuration is optimized for single-user evaluation. This configuration minimizes resource usage at the expense of scalability, particularly under large volumes of concurrent traffic. For any other use of Alfresco (including but not limited to QA, performance / scalability testing, production, production mirror, and disaster recovery), Alfresco strongly recommends that additional configuration be performed.
This document describes the universal configuration steps that should be taken to achieve this, regardless of the specific Alfresco use case, before Alfresco is started for the first time. It does not describe the full breadth of configuration options that can be used to scale Alfresco in use-case-specific ways; those are described in detail elsewhere (for example, in the product documentation and the knowledge base).
This document is currently focused on Alfresco 3.3 installations, although many of the recommendations also apply to earlier versions (provided the associated Supported Stack is used, rather than the 3.3 Supported Stack).
Intended audience
This document is intended for developers, system administrators, and anyone who is tasked with installing an Alfresco instance, regardless of the intended use of that instance (evaluation, development, test / QA, production).
Glossary
The following table describes the terms that are used within this document, each of which has a specific meaning within the context of Alfresco:
TERM   DEFINITION
DBA    DataBase Administrator: someone who has been trained and certified to administer a specific relational database product. Note: relational databases vary greatly in their capabilities, so it is critical that any DBA be experienced with exactly the database product you are intending to use for Alfresco.
I/O    Input/Output: in this document, refers to I/O performed by Alfresco to some external software or device (such as the network or a disk subsystem).
OS     Operating System
RAM    Random Access Memory
CPU    Central Processing Unit
VFS    Virtual File Server: specifically, the functionality in Alfresco that provides access to the repository via CIFS, FTP, NFS and WebDAV
SMTP   Simple Mail Transfer Protocol: a widely used protocol for sending email
IMAP   Internet Message Access Protocol: a more modern protocol used for interacting with email servers
JVM    Java Virtual Machine
Architecture validation
This section describes the steps required to validate the architecture to ensure that it meets the prerequisites for an Alfresco installation. The following summary shows the steps that are required to validate the architecture:
1. Check the supported stacks list.
2. Optimize the hardware settings.
3. Validate the database.
4. Validate the Operating System.
5. Validate and tune the JVM.
Supported stacks for Alfresco
o http://www.alfresco.com/services/support/stacks/ (summary matrix)
o https://network.alfresco.com/?f=default&o=workspace://SpacesStore/4defa35168cb-4491-9f23-46fb861ddd05 (comprehensive matrix; requires a subscription to the Alfresco Enterprise Network)
Hardware
This section describes how to validate your I/O subsystems and CPU.
I/O
One of the primary determinants of Alfresco's performance is I/O. Optimize the following, in priority order:
1. I/O to the relational database Alfresco is configured to use.
2. I/O to the disk subsystem on which the Lucene indexes are stored.
3. I/O to the disk subsystem on which the content is stored.
In each case, the goal is to minimize the latency (response time) between Alfresco and the storage system, while also maximizing bandwidth. Low latency is particularly important for database I/O. One rudimentary test is to ping the database server from the Alfresco server: round trip times greater than 1 ms indicate a suboptimal network topology or configuration that will adversely impact Alfresco performance. Jitter (highly variable round trip times) is also of concern, as it increases the variability of Alfresco's performance: the standard deviation of round trip times should be less than 0.1 ms.
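The ping test can be evaluated mechanically once the round trip times have been collected. The following Python sketch is illustrative only: the function name assess_rtt and its parameters are hypothetical, it assumes the round trip times have already been copied out of the ping output as milliseconds, and it applies the 1 ms threshold to the mean of the samples, which is an interpretation rather than something this guide prescribes.

```python
import statistics


def assess_rtt(samples_ms, max_mean_ms=1.0, max_stdev_ms=0.1):
    """Compare ping round trip times (in ms) with the thresholds quoted above.

    Assumption: the 1 ms latency threshold is applied to the mean of the
    samples; the 0.1 ms jitter threshold is applied to their sample
    standard deviation. At least two samples are required.
    """
    mean = statistics.mean(samples_ms)
    stdev = statistics.stdev(samples_ms)
    return {
        "mean_ms": mean,
        "stdev_ms": stdev,
        "latency_ok": mean <= max_mean_ms,   # not greater than 1 ms
        "jitter_ok": stdev < max_stdev_ms,   # less than 0.1 ms
    }


# Hypothetical samples, as might be read from a few pings to the database host.
print(assess_rtt([0.40, 0.50, 0.45, 0.50]))
```

A result with latency_ok or jitter_ok set to False suggests that the network path between the Alfresco server and the database server deserves a closer look before installation.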
CPU
Alfresco will function correctly on virtually all modern 32-bit and 64-bit CPUs. For production use, however, Alfresco recommends a clock speed greater than 2.5 GHz to ensure reasonable response times for the end user. Although it is not strictly necessary, a 64-bit architecture is also recommended, primarily because it allows the JVM to use more memory (RAM) than a 32-bit architecture.
Note: CPU clock speed is of particular concern for the Sun UltraSPARC architecture, as some current UltraSPARC-based servers ship with CPUs that have clock speeds as low as 900 MHz, well below what is required for adequate Alfresco performance. If you intend to use Sun servers for hosting Alfresco, ensure that all CPUs have a clock speed of at least 2.5 GHz. At the time of writing, this implies that:
o an X or M class Sun server is required, with careful CPU selection to ensure a clock speed of 2.5 GHz or better
o T class servers should not be used, as they do not support CPUs faster than approximately 2 GHz
Understandably, Alfresco is unable to provide specific guidance on Sun server classes, models or configurations, so you should talk with your Sun reseller to confirm that minimum CPU clock speed recommendations will be met.
Database
Disclaimer: Alfresco is unable to provide specialized support for maintaining or tuning your relational database. You MUST have an experienced, certified DBA on staff to support your Alfresco installation(s) [1].
[1] Typically this will not be a full time role once the database is configured and tuned and automated maintenance processes are in place. However, an experienced, certified DBA is required to get to this point.
[2] Unless your DBA recommends otherwise, Alfresco suggests performing this maintenance daily.
[3] Note: Relying on your database's automated statistics gathering mechanism may not be optimal; consult an experienced, certified DBA for your database to confirm this.
4 Alfresco Day Zero Configuration Guide
DATABASE        EXAMPLE MAINTENANCE COMMANDS
MySQL           ANALYZE: consult with an experienced, certified MySQL DBA who has InnoDB experience (Alfresco cannot use a MyISAM database and hence an InnoDB-experienced MySQL DBA is required)
PostgreSQL      VACUUM and ANALYZE: consult with an experienced, certified PostgreSQL DBA
Oracle          Depends on version: consult with an experienced, certified Oracle DBA
MS SQL Server   ALTER INDEX REBUILD, UPDATE STATISTICS: consult with an experienced, certified MS SQL Server DBA
DB2             REORGCHK, RUNSTATS: consult with an experienced, certified DB2 LUW DBA
Operating System
You should ensure that your chosen OS has been officially certified for use with Alfresco (refer to the Supported Stacks list for details). Alfresco is not sensitive to changes to the OS configuration, beyond their impact on I/O performance (see the I/O section). That said, a 64-bit OS is recommended if the hardware (CPU, and so on) is 64-bit capable.
Java Virtual Machine
For information on configuring and tuning the JVM, refer to the product documentation or the following wiki page: http://wiki.alfresco.com/wiki/JVM_Tuning
Note that Alfresco requires an official Sun 1.6 JDK (or the IBM JDK, if using WebSphere). Other JVMs (earlier versions, Harmony, gcj, JRockit, HP, and so on) are NOT supported and are known to cause issues in various parts of the product. Alfresco recommends using a 64-bit JVM if the underlying platform (OS and hardware) is 64-bit capable.
Environment validation
The following environment-specific items must be validated prior to installing Alfresco. Note that Alfresco now provides an Environment Validation tool that can validate most of the following requirements. This tool is available at:
https://network.alfresco.com/?f=default&o=workspace://SpacesStore/f98ad411510d-444f-8166-432a66fe172a
1. Validate that the hostname of the server is resolvable in DNS. [11]
2. Validate that the user Alfresco will run as can open sufficient file descriptors (4096 or more).
3. Validate that the ports on which Alfresco listens are available [12]:
   o FTP: TCP 21 [13]
   o SMTP: TCP 25 [14]
   o SMB / NetBT: UDP 137, 138; TCP 139, 445 [15]
   o IMAP: TCP 143 [16]
   o SharePoint Protocol: TCP 7070 [17]
   o Tomcat Administration: TCP 8005
   o HTTP: TCP 8080
   o RMI: TCP 50500
4. Validate that the installed JVM is Sun version 1.6.
5. Validate that the directory in which the JVM is installed does not contain spaces.
6. Validate that the directory in which Alfresco is installed does not contain spaces.
7. Validate that the directory Alfresco will use for the repository (typically called alf_data) is both readable and writeable by the OS user that the Alfresco process will run as.
8. Validate that you can connect to the database as the Alfresco database user, from the Alfresco server. [18]
[11] This is required if Alfresco is going to be configured in a cluster.
[12] Note: the ports listed here are the defaults. If you're planning on reconfiguring Alfresco to use different ports, or are enabling additional protocols (such as HTTPS, SMTP, IMAP or NFS), you should update this list with those port numbers.
[13] On Unix-like OSes that offer so-called privileged ports, Alfresco will normally be unable to bind to this port unless run as the root user (which is not recommended). In this case, even if this port is available, Alfresco will still fail to bind to it; however, for FTP services this is a non-fatal error: Alfresco's FTP functionality will simply be disabled in the repository.
[14] SMTP is not enabled by default.
[15] On Unix-like OSes that offer so-called privileged ports, Alfresco will normally be unable to bind to this port unless run as the root user (which is not recommended). In this case, even if this port is available, Alfresco will still fail to bind to it; however, for CIFS services this is a non-fatal error: Alfresco's CIFS functionality will simply be disabled in the repository.
[16] IMAP is not enabled by default.
[17] Some of the Alfresco bundles (specifically the WAR, EAR and Tomcat bundles) don't ship with the SharePoint Protocol enabled by default. If you're using one of these bundles you can ignore this port until/unless you install support for the SharePoint Protocol.
[18] Note: this will require installation of the database vendor's client tools on the Alfresco server.
9. Validate that the character encoding for the Alfresco database is UTF-8.
10. (MySQL only) Validate that the default storage engine for the database server that Alfresco will use is InnoDB.
11. Validate that the following third-party software is installed, at the correct versions:
   o OpenOffice v3.1 or newer
   o ImageMagick v6.2 or newer
12. (RHEL and Solaris only) Validate that OpenOffice is able to run in headless mode.
Refer to the appendices in this document for OS and database-specific commands that can be used to perform these validations.
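Some of the checks above, such as port availability and directory permissions, are easy to script. The Python sketch below is a hedged illustration and not a substitute for the Environment Validation tool: the names DEFAULT_TCP_PORTS, tcp_port_available, unavailable_ports and directory_ok are hypothetical, only the default TCP ports listed above are covered, and the script must be run as the OS user that Alfresco will run as for the results to be meaningful.

```python
import os
import socket

# Default TCP ports from the list above. UDP 137/138 are left out to keep
# the sketch short; they would need a SOCK_DGRAM socket instead.
DEFAULT_TCP_PORTS = {
    "FTP": 21,
    "SMTP": 25,
    "SMB / NetBT (139)": 139,
    "SMB / NetBT (445)": 445,
    "IMAP": 143,
    "SharePoint Protocol": 7070,
    "Tomcat Administration": 8005,
    "HTTP": 8080,
    "RMI": 50500,
}


def tcp_port_available(port, host="0.0.0.0"):
    """Return True if the current user can bind a TCP socket to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Port already in use, or a privileged port and we are not root.
            return False
    return True


def unavailable_ports(ports=None, host="0.0.0.0"):
    """Return the subset of ports (name -> number) that cannot be bound."""
    ports = DEFAULT_TCP_PORTS if ports is None else ports
    return {name: p for name, p in ports.items()
            if not tcp_port_available(p, host)}


def directory_ok(path):
    """True if path has no spaces and is a readable, writeable directory."""
    return (" " not in path and os.path.isdir(path)
            and os.access(path, os.R_OK | os.W_OK))


if __name__ == "__main__":
    print("Ports that could not be bound:", unavailable_ports())
```

On Unix-like systems, ports below 1024 will normally be reported as unavailable when the script is run as a non-root user, which matches the privileged port behaviour described for FTP and CIFS.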
JVM tuning
Note: the following recommendations are the bare minimum reconfiguration required by Alfresco; further tuning of the JVM may be necessary depending on your use of Alfresco. Refer to the product documentation or the following wiki page: http://wiki.alfresco.com/wiki/JVM_Tuning
With the exception of the settings listed here, do not set any JVM parameter without first analyzing the running JVM and experimentally verifying that the change improves the behavior of Alfresco for your use case. JVM tuning is highly specific to the environment and use case, and it is very easy to destroy the JVM's inherent reliability and scalability with uninformed changes to the JVM settings.
If, as a result of making this change, you start seeing java.lang.StackOverflowError exceptions in the Alfresco log, you may increase this value in 128k increments until the exceptions disappear.
[20] ${ALFRESCO_HOME}/alfresco.sh or %ALFRESCO_HOME%\alfresco.bat in versions up to and including 3.3SP2; ${ALFRESCO_HOME}/tomcat/scripts/ctl.sh or %ALFRESCO_HOME%\tomcat\scripts\ctl.bat in versions 3.3SP3 and above that use Tomcat.
these classes, something that is no longer relevant now that Alfresco only supports JDK 1.6 and above. Double check that these JIT exclusions are commented out in the startup script, as follows (note the highlighted comment symbol):
# Following only needed for Sun JVMs before to 1.5 update 8
#export JAVA_OPTS="${JAVA_OPTS} -XX:CompileCommand=exclude,org/apache/lucene/index/IndexReader\$1,doBody -XX:CompileCommand=exclude,org/alfresco/repo/search/impl/lucene/index/IndexInfo\$Merger,mergeIndexes -XX:CompileCommand=exclude,org/alfresco/repo/search/impl/lucene/index/IndexInfo\$Merger,mergeDeletions"
On Windows, the rem command should be used in place of the Unix shell # comment symbol.
Note: newer versions of Alfresco (3.3+) no longer include this option in the startup script, so don't be surprised if it is not present.
Set dir.root to absolute path
It is strongly recommended that you always set dir.root to an absolute file system path before starting Alfresco for the first time. This ensures that no matter how the Alfresco instance is started, it will always find the directories where content has previously been written. With Tomcat, this property is found in:
${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties [21]
If you do not set dir.root to an absolute path, you may see a CONTENT INTEGRITY ERROR message in the alfresco.log file during a second or subsequent startup of the server. Other than being an absolute path, Alfresco has no specific requirements for where this directory resides or what it is called. You should optimize the location of the file system portion of the Alfresco repository to maximize I/O performance (as mentioned in the I/O section).
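Whether dir.root is absolute can be checked before the first startup. The following Python sketch is an illustration under stated assumptions: read_property and dir_root_is_absolute are hypothetical helper names, the parser handles only simple key=value lines (no line continuations or escape sequences), and both Unix-style and Windows-style absolute paths are accepted.

```python
import ntpath
import posixpath


def read_property(text, key):
    """Return the last value assigned to key in simple key=value text, or None.

    Simplification: comment lines starting with # or ! are skipped, and
    line continuations and escape sequences are not interpreted.
    """
    value = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        name, sep, rest = line.partition("=")
        if sep and name.strip() == key:
            value = rest.strip()
    return value


def dir_root_is_absolute(properties_text):
    """True if dir.root is set and is an absolute Unix or Windows path."""
    value = read_property(properties_text, "dir.root")
    if value is None:
        return False
    return posixpath.isabs(value) or ntpath.isabs(value)


# Hypothetical file contents; the paths are examples, not recommendations.
print(dir_root_is_absolute("dir.root=/srv/alfresco/alf_data\n"))  # absolute
print(dir_root_is_absolute("dir.root=./alf_data\n"))              # relative
```

In practice the text would be read from alfresco-global.properties at the location given above.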
[21] ${ALFRESCO_HOME}/tomcat/shared/classes/alfresco/extension/custom-repository.properties
If the index.recovery.mode property is not visible in alfresco-global.properties, it can be added anywhere in the file.
You may add this property anywhere in the file, although for clarity you should place it immediately after the other database properties.
Important note: after increasing the size of the Alfresco database connection pools, you must also increase the number of concurrent connections your database can handle to at least the size of the cumulative Alfresco connection pools [24]. In fact, Alfresco recommends configuring at least 10 more connections to the database than are configured cumulatively across all of the Alfresco connection pools, to ensure that you can still connect to the database even if Alfresco saturates its own connection pools. Do not forget to factor in cluster nodes (which can each use up to 275 database connections) as well as connections required by other applications that are using the same database server as Alfresco. The precise mechanism for reconfiguring your database's connection limit depends on the relational database product you are using; your DBA should be able to readily configure this.
[22] As of Alfresco Enterprise 3.2r; this number may change in future versions.
[23] Tomcat 6.0, for example, allows up to 200 concurrent HTTP requests by default.
[24] In a cluster, each node has its own independent database connection pool. You must configure sufficient database connections for all of the Alfresco cluster nodes to be able to connect simultaneously.
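The sizing rule above is simple arithmetic: the sum of all Alfresco connection pools, plus connections used by other applications, plus a margin of at least 10. A minimal Python sketch follows; the function name required_db_connections and its parameters are hypothetical, and the figure of 20 connections for another application is an invented example.

```python
def required_db_connections(pool_sizes, other_app_connections=0, headroom=10):
    """Minimum connection limit to configure on the database server.

    pool_sizes: maximum pool size of each Alfresco node (one entry per node).
    other_app_connections: connections used by other applications sharing
        the same database server.
    headroom: spare connections so that the database remains reachable even
        when Alfresco saturates its pools (at least 10 per the text above).
    """
    if any(size < 0 for size in pool_sizes):
        raise ValueError("pool sizes must be non-negative")
    return sum(pool_sizes) + other_app_connections + headroom


# Two-node cluster, each node allowed up to 275 connections, plus a
# hypothetical reporting tool holding 20 connections on the same server.
limit = required_db_connections([275, 275], other_app_connections=20)
print(limit)  # 275 + 275 + 20 + 10 = 580
```

The resulting number is the lower bound to give your DBA when the database's connection limit is being configured.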
Idle Size
By default, each Alfresco instance will, when idle, reduce the size of the database connection pool to no more than 8 open connections at any time, in order to minimize resource usage in both the JVM and the database. While appropriate for evaluation and individual developer environments, this setting is not appropriate for any kind of multi-user or high traffic installation, including but not limited to QA, performance / scalability test, production mirror and production environments. For these environments Alfresco recommends disabling the idle connection reclamation logic in the database connection pool, by adding the db.pool.idle property to:
${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties
Validation Query
By default Alfresco does not periodically validate each database connection retrieved from the database connection pool. Validating connections is, however, very important for long running Alfresco servers, since there are various ways database connections can unexpectedly be closed (for example by transient network glitches and database server timeouts). Enabling periodic validation of database connections involves adding the db.pool.validate.query property to:
${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties
and setting it to one of the following values, depending on the database that's in use:
SELECT VERSION()
SELECT 1
DATABASE DB2
You may add this property anywhere in the file, although for clarity you should place it immediately after the other database properties.
and set it to 0:
lucene.maxAtomicTransformationTime=0
Important note: this setting increases the chances of short-term staleness in the full text indexes. This possibility exists anyway (any full text extraction that takes 20ms or more will, by default, occur asynchronously anyway), but this setting increases that behavior.
[25] Notably Oracle.
Important note: this setting globally disables quota calculations; the functionality is completely disabled in this installation of Alfresco. For that reason, this setting should not be used if there is any requirement to use content quotas in this Alfresco instance. It can, however, be turned back on at a later date with no side effects (beyond the expected impact on Alfresco performance).
to:
${ALFRESCO_HOME}/tomcat/shared/classes/alfresco/extension/custom-fileservers-context.xml
Remove all of the <bean> definitions except for the bean with the id fileServerConfiguration. Add the following property block to the fileServerConfiguration bean:
<property name="coreServerConfigBean" ref="coreServerConfig" />
Note carefully that the first property points to the directory in which ImageMagick is installed, whereas the second property points to the pdf2swf executable file.