
A Developer's Guide to Accelerating Microsoft Excel Models with Platform Symphony

PLATFORM COMPUTING WHITEPAPER - APRIL 2011 Gord Sissons

Platform Computing
gsissons@platform.com

Contents
1. Introduction
2. Overview
3. Common Excel Integration Approaches
   Pattern #1: Invoking a Custom Service
   Pattern #2: Scripts or Executables as Tasks
   Pattern #3: Excel Instances on the Grid
   Pattern #4: Extending Excel Services UDFs
   Pattern #5: Hybrid Approaches
4. Summary

4/6/11

1. Introduction
This white paper illustrates how Platform Symphony can help developers improve the performance of Microsoft Excel-based computations. After providing some context on the business challenge, this paper examines several alternative approaches to adapting Excel applications to a distributed grid-computing environment. Developers may also wish to review the Platform whitepaper "A Developer's Guide to Building High Performance Service-Oriented Applications" for a more complete discussion of Platform Symphony, its unique architecture, and the Platform Symphony Developer Edition.


2. Overview
Microsoft Excel is widely used in a range of business applications. With a rich set of built-in functions, a straightforward programming model, and a large library of available third-party add-ins, Excel has become a staple tool for analysts running numerical simulations in areas including risk management, actuarial analysis and Monte Carlo simulation. The accessibility and ease of use of Excel have resulted in many financial services organizations making extensive use of Excel-based models to automate complex repetitive tasks such as extracting market data from real-time sources and computing pricing scenarios and risk positions.

With increased trading volumes, increased electronification of exchanges, and ever more complex financial products, the ability to run more sophisticated analysis in a short time, taking into account more risk factors, has become a critical source of competitive advantage. This is particularly true for front-office applications, where response times are of critical importance.

With Excel 2007 and Excel Services (a SharePoint technology allowing Excel 2007 workbooks to be accessed through a browser), Microsoft has recognized these uses of Excel and has introduced a number of powerful features intended to make it more practical to deploy Excel-based worksheets as shared calculation services. While there is rich functionality in .NET for developers creating new applications, a great number of existing models written in Excel 2003 and older versions are in widespread use. These models often make extensive use of VBA (Visual Basic for Applications) to perform calculations and integrate custom code implemented as Excel Link Library add-ins (XLLs).

While flexible and powerful, VBA is an interpreted language and runs slowly. It is not uncommon for a complex spreadsheet to take literally hours on a single computer to complete a complex risk calculation. While it would be nice to be able to simply re-write older spreadsheet models using compiled languages and more modern tools and approaches, it's seldom that simple. Regulatory requirements dictate the need for calculation repeatability, meaning that firms are required to retain their legacy Excel models along with historical snapshots of data. Also, even if it were practical to re-architect these spreadsheet-based models around a single framework, it would be both expensive and time-consuming to do so.

Most firms employ a variety of calculators and simulation tools, both commercial and developed in-house. While many run on Microsoft platforms, others run on commercial UNIX or Linux environments, and calculation services are implemented in a variety of programming languages including Java, C, C++ and C#, making flexibility critical. Platform Symphony integrates seamlessly into a .NET environment, but also offers a flexible, language- and platform-agnostic programming model. This open and flexible approach makes Platform Symphony the industry's best solution for clients needing practical ways to accelerate a variety of Excel models implemented using multiple Excel versions and technology approaches.


3. Common Excel Integration Approaches


Table 1 shows some common design patterns used to accelerate Excel-based models by running computations in parallel. While there are other potential integration approaches, the remainder of this paper focuses on explaining the patterns below and showing how a developer can easily parallelize computations and improve performance using Platform Symphony. In some cases we have provided coding examples as well, so that the reader can appreciate how straightforward the integration effort is.

Pattern #1: Custom Developed Services
    Description: Excel spreadsheet calls a distributed compute service via the Symphony COM API
    Excel required on compute nodes: no
    Server-side environment: Windows, UNIX or Linux
    Client-side developer environment: Excel
    Developer effort required: low

Pattern #2: Command Line Utilities as Tasks
    Description: An Excel client invokes services that are simple scripts or binaries running on compute nodes
    Excel required on compute nodes: no
    Server-side environment: Windows, UNIX or Linux
    Client-side developer environment: Excel
    Developer effort required: very low

Pattern #3: Excel Instances on the Grid
    Description: Multiple Excel instances run in parallel, called by client spreadsheets or other clients
    Excel required on compute nodes: yes
    Server-side environment: Windows (running Excel)
    Client-side developer environment: Excel, C, C++, VBA, VB6, Java
    Developer effort required: medium

Pattern #4: Extending Excel Services via UDFs
    Description: Web-based clients access Excel Services via SharePoint, employing UDFs to distribute computations to a grid
    Excel required on compute nodes: no
    Server-side environment: Windows, UNIX or Linux; Excel running on a SharePoint server
    Client-side developer environment: Excel, C, C++, VBA, VB6, Java
    Developer effort required: low

Pattern #5: Hybrid Deployment Scenarios
    Description: Combined approaches; SharePoint services using UDFs to invoke multiple services written in Java, C++ or running as other Excel services
    Excel required on compute nodes: optional
    Server-side environment: Windows, UNIX or Linux, or compute hosts running Excel
    Client-side developer environment: Excel, C, C++, VBA, VB6, Java
    Developer effort required: low

Table 1: Common approaches to accelerating Microsoft Excel models


Pattern #1: Invoking a Custom Service


To avoid the relatively poor performance of VBA scripts, Excel add-ins are often written in languages such as C++ or C# using Visual Studio or other tools. Once these compiled add-in functions (referred to as XLLs) are associated with a spreadsheet, calculations run at native speed, providing a significant performance advantage over code written in VBA. Performance is still constrained, however, by the resources of a single machine, and multiple calls to these compiled XLLs will run serially within a spreadsheet. Excel users seeking greater performance will inevitably want to distribute the calculations performed by these high-performance XLLs and run them in a service-oriented model where calculations can take place in parallel, harnessing the power of several machines running concurrently. This approach is depicted in Figure 1.

Figure 1: Distributing XLL / add-in Services to a Grid

The code fragment below illustrates how VBA code in the Excel worksheet can directly open a connection to a Platform Symphony Service, start a session and send tasks to a hypothetical Platform Symphony managed compute service called OptionChain() running on the grid. This OptionChain() function may have been previously implemented in an XLL. Platform Symphony hides considerable complexity from the developer, starting the compute service on appropriate nodes based on demand and in accordance with policy, and handling error conditions that inevitably arise when running a service in production and making them transparent to both the developer and the application user.
    ' Open a connection to Platform Symphony
    Dim connection As CSoamConnection
    Set connection = soamApi.Connect("OptionChain", callback)

    ' Open a session on the grid
    Dim session As CSoamSession
    Set session = connection.CreateSession(attributes)

    ' Send each parameter set to the compute service
    For k = 0 To taskToSend - 1
        Dim message As MyMessage
        Set message = New MyMessage
        ' ..
        Dim inputHandler As CSoamTaskInputHandle
        Set inputHandler = session.SendTaskInput(message)
    Next k

Once all of the tasks have been sent to the compute service, the client can retrieve the results of these calculations from Platform Symphony using a few more lines of code.

Adapting existing code to run as a Platform Symphony service is similarly straightforward. The code example below shows how our sample OptionChain() service written in C++ is implemented as a Platform Symphony service. The C++ code is wrapped in a Platform Symphony Service Container class. When Platform Symphony invokes the calculation service, it passes a taskContext to the service. The onInvoke() method retrieves the calculation parameters from Platform Symphony, calls the existing OptionChain() code with the received parameter set, and then returns the results of the calculation to Platform Symphony via the setOutputMessage() method. The code modifications required to grid-enable the service are shown below. Note that the main method simply creates an instance of the service class and invokes the run() method so that the new service is ready to receive work from Platform Symphony. Platform Symphony automatically handles starting and stopping service instances based on demand and on sharing policies that reflect how various users and groups are entitled to share the virtualized resources of the grid.

A Sample Compute Service Adapted to Platform Symphony

    #include <iostream>

    class MyServiceContainer : public ServiceContainer
    {
    public:
        // Called by Platform Symphony for each task sent to the service
        virtual void onInvoke(TaskContextPtr& taskContext) throw(SoamException)
        {
            MyMessage inMsg, outMsg;

            // Retrieve the calculation parameters for this task
            taskContext->populateTaskInput(inMsg);
            ..
            // Call the existing calculation code
            OptionChain optc(istr);
            optc.Value();
            ..
            // Return the result to Platform Symphony
            taskContext->setOutputMessage(outMsg);
        }
    };

    int main(int argc, char* argv[])
    {
        int retVal = 0;
        try
        {
            // Create the container and run it
            MyServiceContainer myContainer;
            myContainer.run(argc, argv);
        }
        catch (SoamException& exp)
        {
            // Report exception to stdout
            std::cout << "exception caught ... " << exp.what() << std::endl;
            retVal = -1;
        }
        return retVal;
    }

The above example, while simplified slightly, shows how straightforward it is to adapt existing code to a grid-computing environment. The freely downloadable Platform Symphony Developer Edition provides numerous coding examples in different languages and programming environments.


Pattern #2: Scripts or Executables as Tasks


Platform Symphony execution tasks are a unique and powerful feature that allows existing executables to be made callable as tasks in a grid computing environment. Using this approach, organizations can realize the benefits of Platform Symphony without changing application code. This is particularly useful in cases where source code for a service is not available, or developers simply do not have the time to integrate their application using the Platform Symphony API.

Workload management systems including Platform LSF and Sun Grid Engine have long supported the distribution of command-line oriented workloads to compute nodes on a cluster. Unlike these solutions, however, Platform Symphony fully abstracts these binary applications or scripts as service-oriented tasks, making them callable in a variety of ways including via the client API, through the Platform Symphony Management console, or via a special command line utility provided for this purpose called symexec.

Using Platform Symphony execution tasks, developers can avoid changing their server-side code, yet still realize all of the benefits of a service-oriented application model including session management, service reliability and automated service management. The option to use the client-side symexec component to invoke the service lets developers avoid software development on the client side as well. Parameters can be sent to the execution service in a variety of ways, including as scriptable name-value pairs configured in environment variables.

Figure 2 below illustrates how this works conceptually. The Platform Symphony client opens a connection to the grid as before, but connects to a service that is implemented as a Platform Symphony Execution Service to run tasks.

Figure 2 : Invoking Server Side Scripts or Binaries from an Excel application


The steps involved from the client perspective are:

- The client-side application opens a connection to the Platform Symphony service via the Platform Symphony client-side API or the provided symexec utility.
- A client-side Excel application can avoid the use of the Platform Symphony API by using VBA functions to invoke symexec with values extracted from spreadsheet cells and constructed dynamically at run-time. symexec can be called with arguments including create, open, send, fetch, close and run, so that execution tasks can be run on the service side without needing to use the client-side API at all. For example, in VB6:

    Call Shell("cmd.exe /c \\server\path_to_symphony\symexec send " & args)

  Or in VB.NET:

    System.Diagnostics.Process.Start("cmd.exe", "/c \\server\path_to_symphony\symexec send " & args)

- The execution task can have an optional pre-execution command configured via the application profile, useful for setting up the environment prior to running the service.
- The client then sends tasks to the execution service, including an executable string with arguments or an optional execution task context including environment variables set as name-value pairs.
- Upon receiving input from the client, the Platform Symphony execution service spawns the execution task as a process and executes it with the input arguments provided.
- When the execution task has completed, the exit code of the process and the task output are sent back to Platform Symphony, and an optional post-execution script may be run.
- Error handling is configurable in the XML-based application profile.
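The symexec route described in the steps above can be sketched from the client side as a small helper that assembles the command line before it is handed to a shell. This is a minimal sketch under stated assumptions, not Platform's implementation: the names buildSymexecCommand and quoteIfNeeded, the quoting rule, and the idea that task arguments follow the subcommand as space-separated tokens are all hypothetical. Only the subcommand names (create, open, send, fetch, close, run) come from the text.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Subcommands named in the text; anything else is rejected.
static const std::array<const char*, 6> kSymexecVerbs = {
    "create", "open", "send", "fetch", "close", "run"};

// Wrap a token in double quotes if it contains whitespace.
// Hypothetical quoting rule; embedded quotes are not handled.
std::string quoteIfNeeded(const std::string& token) {
    if (token.find_first_of(" \t") == std::string::npos) return token;
    return "\"" + token + "\"";
}

// Hypothetical helper: assemble "<symexec path> <verb> <args...>".
// The argument layout after the verb is an assumption for illustration.
std::string buildSymexecCommand(const std::string& symexecPath,
                                const std::string& verb,
                                const std::vector<std::string>& args) {
    assert(!symexecPath.empty());
    const bool known =
        std::any_of(kSymexecVerbs.begin(), kSymexecVerbs.end(),
                    [&](const char* v) { return verb == v; });
    if (!known) throw std::invalid_argument("unknown symexec verb: " + verb);

    std::string cmd = quoteIfNeeded(symexecPath) + " " + verb;
    for (const std::string& a : args) cmd += " " + quoteIfNeeded(a);
    return cmd;
}
```

A caller would pass values read from spreadsheet cells as the args vector and hand the resulting string to whatever process-launching facility the client uses. Rejecting unknown subcommands before launching anything keeps a mistyped cell value from turning into a confusing shell error.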

Platform Symphony execution tasks are a unique and powerful capability that gives developers the freedom to make virtually any application callable on the service side. Moreover, while we've described the case of calling service-side tasks from an Excel client above, in practice these tasks can be invoked from the client in a variety of ways, including through scripts on Windows or Linux or via .NET applications.


Pattern #3: Excel Instances on the Grid


While Excel Services and Excel 2007 provide exciting new functionality allowing Excel to be run as a service, many organizations have significant investments in complex Excel worksheets written in older environments including Excel 2000 and Excel 2003. These legacy spreadsheets often make extensive use of VBA, database access methods, and in-house or commercially developed add-in components. While there is a need to make these existing models run more quickly, organizations simply do not have the time or resources to re-write and re-validate all of these models.

To address the need to grid-enable complex spreadsheet models making extensive use of VBA, Platform offers a grid integration option referred to as the Platform Symphony Connector for Microsoft Excel. This integration option allows Excel 2000, 2002 and 2003 worksheets to run as encapsulated services in Platform Symphony. This means that calculations can be performed in parallel on compute hosts on a cluster, in most cases without the need to modify the spreadsheets themselves.

In this grid deployment model, Excel runs as a managed Platform Symphony service on the cluster. The Platform Symphony Connector for Microsoft Excel includes a Platform Symphony service that acts as a wrapper for Microsoft Excel, and allows information to be passed from a Platform Symphony client including the spreadsheet name, a macro name, and execution-related parameters for the macro. Using this approach, spreadsheets may be implemented as services and called by a client-side instance of Microsoft Excel acting as a control sheet, calling server-side macros with subsets of the data that needs to be computed. Because the spreadsheet models can work on subsets of the data in parallel, the time to run these models is dramatically reduced.
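The control-sheet idea described above, where each server-side macro receives a subset of the data, comes down to partitioning the input rows across tasks. The following is a minimal sketch of one way to do that, assuming the data is a contiguous block of rows. The names RowRange and splitRows are hypothetical and are not part of the Platform Symphony Connector for Microsoft Excel.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open row range [first, second) handed to one server-side task.
using RowRange = std::pair<std::size_t, std::size_t>;

// Split totalRows into at most taskCount contiguous, near-equal ranges.
// Earlier ranges absorb the remainder, so sizes differ by at most one row.
// No empty ranges are produced when there are fewer rows than tasks.
std::vector<RowRange> splitRows(std::size_t totalRows, std::size_t taskCount) {
    assert(taskCount > 0);
    std::vector<RowRange> ranges;
    const std::size_t base = totalRows / taskCount;
    const std::size_t extra = totalRows % taskCount;
    std::size_t begin = 0;
    for (std::size_t i = 0; i < taskCount && begin < totalRows; ++i) {
        const std::size_t size = base + (i < extra ? 1 : 0);
        ranges.emplace_back(begin, begin + size);
        begin += size;
    }
    return ranges;
}
```

Each resulting range would then be sent as the parameters of one task, alongside the spreadsheet name and macro name. Near-equal ranges are a reasonable default when rows cost about the same to compute. Models with uneven per-row cost may do better with more, smaller ranges so that the grid can balance the load.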

Figure 3 : Wrapping Worksheets as callable Platform Symphony Services

The Platform Symphony Connector for Microsoft Excel includes numerous coding examples showing how services implemented as spreadsheets can be made callable from client-side applications. While the client-side application is frequently a spreadsheet, it could just as easily be written in C++. Platform has gone to significant effort to address the practical issues associated with deploying Excel instances as encapsulated services. A feature-rich dialog sniffer makes it straightforward to debug issues that can impact Excel instances, such as popup dialogs warning of particular error conditions (out-of-memory errors or security-related macro settings, for example) that, left unaddressed, would simply cause service instances to freeze and become non-responsive. While newly developed Excel-based models may use Excel Services approaches, the Platform Symphony Connector for Microsoft Excel, with its ability to run existing spreadsheets as services without modification, is a critical part of a developer's portfolio of solutions to grid-enable Excel-based models.

Pattern #4: Extending Excel Services UDFs


As described earlier, Microsoft has implemented exciting new features in Excel 2007 that make it easy to deploy Excel-based applications to the web. Based on Microsoft Office SharePoint Server 2007, the Excel Services deployment model offers a number of significant advantages. Among them are:

- Users are no longer required to have access to Microsoft Excel or particular worksheets to run Excel-based models.
- Worksheets are made inherently multi-user when deployed via SharePoint.
- Issues such as version control are simplified, since spreadsheet developers can control how spreadsheets are presented and avoid the problem of multiple versions of spreadsheets floating around an enterprise.

Figure 4 depicts the SharePoint-based Excel Services architecture. Clients may interact with Excel either via a web browser interface or programmatically via an exposed web services interface. The actual Excel calculation services run inside an application server container and draw content from the SharePoint content database as well as other external sources. In the Excel Services architecture, user-defined functions (UDFs) can be written to extend spreadsheet functionality and run in the context of the Excel Services Application Container.

Figure 4 : Integrating Platform Symphony in an Excel Service Model

The use of user-defined functions has several advantages. Excel functionality can be extended via a library of user-defined functions that can be stored in the SharePoint database (referred to as managed functions) so that they are callable by multiple Excel spreadsheets. Also, user-defined functions are very straightforward to call from within Excel worksheets and can be called in exactly the same manner as other functions, e.g. =MyFunction($A$1*3.6)

Microsoft Visual Studio includes a class library template that makes it easy to build your own user-defined functions. Managed-code UDFs require a reference to a dynamic link library (DLL) named Microsoft.Office.Excel.Server.Udf.dll, which defines the [UdfClass] and [UdfMethod] attributes. Classes and methods not marked with these attributes are ignored by the Excel Calculation Service.

Sample .NET C# code below illustrates how a user-defined function called MyFunction() can be written that calls a Platform Symphony service deployed to the grid. The Platform Symphony API for .NET is in the namespace Platform.Symphony.Soam, as shown in the example below. This example code contains the [UdfClass] and [UdfMethod] attributes that make this a managed function known to the Excel Calculation Service. Although some details of the code have been omitted below, it is straightforward to write a UDF using the Platform Symphony .NET client API that will invoke any Platform Symphony service. A key benefit of Platform Symphony is that the service can be running virtually anywhere and may be implemented using a different software framework.
    using System;
    using Platform.Symphony.Soam;
    using Microsoft.Office.Excel.Server.Udf;

    namespace Platform.Symphony.Clients
    {
        [UdfClass]
        class SyncClient
        {
            [UdfMethod]
            static void MyFunction(string[] args)
            {
                try
                {
                    SoamFactory.Initialize();
                    String applicationName = "SampleApplication";
                    try
                    {
                        // Connect to the application on the grid
                        connection = SoamFactory.Connect(applicationName, securityCb);
                        try
                        {
                            // Open a session and send the tasks
                            session = connection.CreateSession(attributes);
                            for (int taskCount = 0; taskCount < numTasksToSend; taskCount++)
                            {
                                MyMessage inputMessage = new MyMessage(taskCount, true, senddata);
                                TaskInputHandle input = session.SendTaskInput(taskAttr);
                            }

                            // Collect the results of all tasks
                            EnumItems enumItems = session.FetchTaskOutput((ulong) numTasksToSend);
                            foreach (TaskOutputHandle output in enumItems)
                            {
                                if (output.IsSuccessful == true)
                                {
                                    MyMessage outputMessage = output.GetTaskOutput() as MyMessage;
                                }
                                else
                                {
                                    SoamException ex = output.Exception;
                                }
                            }
                        }
                        finally
                        {
                            // Mandatory session close
                            if (session != null)
                            {
                                session.Close();
                            }
                        }
                    }
                    finally
                    {
                        if (connection != null)
                        {
                            connection.Close();
                        }
                    }
                }
                catch (Exception ex)
                {
                    Console.WriteLine("Exception caught");
                }
                finally
                {
                    SoamFactory.Uninitialize();
                }
            }
        }
    }

Some of the key benefits of using Excel Services in conjunction with Platform Symphony are:

- A single user running a compute-intensive spreadsheet-based model via SharePoint has the potential to consume all of the CPU resources on a SharePoint server, making it unusable to others. By implementing UDFs that use the Platform Symphony client API to call Platform Symphony services, calculations can be moved to the grid, leaving the SharePoint server free to support a greater number of concurrent users and applications.
- Using this approach, a variety of services implemented in different languages and running on different technology platforms may be abstracted to users as simple spreadsheet services. Users may interact with the familiar spreadsheet interface, while developers and system administrators benefit from an ability to better manage code and simplify new service deployment.
- Finally, unlike other approaches, this is an architecture that will scale. Platform Symphony's proven ability to support large numbers of concurrent users on multi-application grids, its ability to shift resources in accordance with application demand and site-defined policies, and its ability to automatically provision, start and stop application services make it straightforward for administrators to scale services in multiple dimensions, i.e. more users, more departments, more compute resources and more applications.


Pattern #5: Hybrid Approaches


While not really a design pattern on its own, we'd be remiss if we didn't point out that there are multiple other ways to integrate Excel-based calculation services. These tend to be hybrids of the approaches discussed thus far. With its any-to-any programming model, Platform Symphony provides developers with freedom of choice. This is of critical importance since it means that technology limitations will not get in the way of the needs of the business, and developers can often choose from among several technology solutions and approaches for a particular problem. Clients and services may be written in multiple software frameworks and may run on multiple hardware and operating system platforms. In addition to supported APIs for C++, Java and .NET languages, the COM API allows VBA or VB6 scripts to call Platform Symphony services directly, and the Platform Symphony Execution Task facility allows scripts and binaries to be easily wrapped as services. All of these deployment choices, coupled with the ability to wrap entire spreadsheets as managed calculation services, combine to provide flexibility that is simply unmatched in the industry.

Figure 5 - Hybrid Implementation Approaches

Figure 5 shows a scenario where a spreadsheet model is deployed to the web using SharePoint-based Excel Calculation Services, but behind the scenes we may want to invoke multiple instances of a compute service running in parallel, provided by a legacy spreadsheet application running tightly coupled macros and VBA code implemented in Excel 2003. Similarly, we may want the UDFs running in our SharePoint calculation service to simultaneously invoke existing C++ or Java based services written to run on Linux or UNIX systems. Although not shown here, these calculations in turn may want to invoke other Platform Symphony services themselves via the client API or the symexec facility discussed earlier.

Using Excel Services, the user interface is simplified so that the end user need only interact with a simple spreadsheet. Platform Symphony handles all of the details of provisioning calculation services on appropriate nodes, providing session management and session recovery capability and handling any transient errors that may occur with particular task calculations. The end user sees exceptionally fast response time owing to Platform Symphony's low-latency design. With 1.6 millisecond latency, 100 synchronous transactions involving the grid can be completed with overhead of less than two tenths of a second. This level of performance and flexibility in deployment approaches is simply not possible with other Excel-friendly grid computing solutions.
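The overhead figure quoted above follows from simple arithmetic: 100 synchronous round trips at 1.6 ms each add 160 ms, which is under 0.2 s. The sketch below spells out that calculation. The function name serialOverheadMs is hypothetical, and the model assumes the calls are issued strictly one after another and that the quoted latency applies once per call.

```cpp
#include <cassert>

// Overhead added by the grid when calls are issued one after another:
// each synchronous round trip pays the per-task latency once.
// Simplified model; it ignores the compute time of the tasks themselves.
double serialOverheadMs(double latencyMs, int transactions) {
    assert(latencyMs >= 0.0 && transactions >= 0);
    return latencyMs * transactions;
}
```

Calling serialOverheadMs(1.6, 100) gives roughly 160 ms of grid overhead on top of the actual calculation time.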

4. Summary
We've examined a number of scenarios where Excel-based models can be accelerated with Platform Symphony, both for spreadsheets making use of VBA tightly coupled with Excel 2000-2003 worksheets and for the more contemporary Excel Services models where parallelizable Excel computations are deployed via a web interface. Platform Symphony provides several unique features and capabilities making it ideally suited as a grid development and deployment platform for Excel spreadsheets and other grid-friendly workloads:

- The freely downloadable Developer Edition (obtained from http://my.platform.com/products/platform-symphony-de) provides developers and I.T. managers with everything needed to grid-enable spreadsheet models.
- Platform Symphony, with the Platform Symphony Connector for Microsoft Excel component (not included in the Developer Edition), provides support for running Excel 2000 to Excel 2003 spreadsheet instances as encapsulated services, with a rich set of tools to aid in debugging and managing encapsulated spreadsheet models.
- Platform Symphony makes it easy to off-load compute-intensive functions from the Excel Calculation Service running on the SharePoint server by calling Platform Symphony services via UDFs, as shown in our example.
- Platform Symphony works well with 32 and 64 bit .NET environments, but also integrates seamlessly with a variety of UNIX and Linux based platforms and in-house developed software clients or services written in C++, Java and other supported languages.
- Platform Symphony provides very low latency operation, in the range of 1.6 milliseconds. This means that Platform Symphony can scale with much higher efficiency than competing grid solutions. Low latency is an exceptionally important requirement, as it directly impacts the performance observed in front-office environments where traders and analysts need sub-second response times.
- Organizations can start with just a few compute nodes with the Platform Symphony Developer Edition or deploy applications at very large scale. Platform Symphony has been validated to 20,000 CPUs, with single applications scaling with 95% efficiency to 4,000 CPUs, taking advantage of multi-core nodes seamlessly.
- Although not dealt with in this paper, Platform Symphony provides grid administrators with a rich set of tools required to manage production grids at scale, including innovations such as XML-based application profiles and a sophisticated consumer-lender model that allows grid-based resources to be shared equitably according to policy and in response to demand for resources.

For organizations running compute-intensive spreadsheet-based models, Platform Symphony can provide needed flexibility and agility, making it easier to respond to changing business needs. By accommodating a variety of different use cases and design patterns, Platform Symphony minimizes the need for new code development, reducing risk, increasing developer productivity and accelerating your ability to gain business advantage. Finally, organizations have the strategic benefit of being on a technology platform that will scale both in terms of the number of nodes under management and in terms of management tools. By supporting all of this functionality across heterogeneous grids, enterprises enjoy maximum flexibility, helping ensure that business priorities are achieved.

Platform Computing is the leader in cluster, grid and cloud management software, serving more than 2,000 of the world's most demanding organizations for over 18 years. Our workload and resource management solutions deliver IT responsiveness and lower costs for enterprise and HPC applications. Platform has strategic relationships with Cray, Dell, HP, IBM, Intel, Microsoft, Red Hat, and SAS. Visit www.platform.com.

World Headquarters: Platform Computing Corporation, 3760 14th Avenue, Markham, Ontario, Canada L3R 3T7. Tel: +1 905 948 8448. Fax: +1 905 948 9975. Toll-free Tel: 1 877 528 3676. info@platform.com
Sales - Headquarters: Toll-free Tel: 1 877 710 4477. Tel: +1 905 948 8448
North America: New York: +1 646 290 5070. San Jose: +1 408 392 4900
Europe: Bramley: +44 (0) 1256 883756. London: +44 (0) 20 3206 1470. Paris: +33 (0) 1 41 10 09 20. Düsseldorf: +49 2102 61039 0. info-europe@platform.com
Beijing: +86 10 82276000. Xi'an: +86 029 87607400. asia@platform.com
Tokyo: +81(0)3 6302 2901. info-japan@platform.com
Singapore: +65 6307 6590. wliaw@platform.com

trademarks of their respective owners, errors and omissions excepted. Printed in Canada. Platform and Platform Computing refer to Platform Computing Corporation and each of its subsidiaries. 040611
