Platform Computing
gsissons@platform.com
Contents
1. Introduction
2. Overview
3. Common Excel Integration Approaches
   Pattern #1: Invoking a Custom Service
   Pattern #2: Scripts or Executables as Tasks
   Pattern #3: Excel Instances on the Grid
   Pattern #4: Extending Excel Services UDFs
   Pattern #5: Hybrid Approaches
4. Summary
4/6/11
1. Introduction
This white paper illustrates how Platform Symphony can help developers improve the performance of Microsoft Excel-based computations. After providing some context on the business challenge, this paper examines several alternative approaches to adapting Excel applications to a distributed grid-computing environment. Developers may also wish to review the Platform white paper "A Developer's Guide to Building High Performance Service-Oriented Applications" for a more complete discussion of Platform Symphony, its unique architecture, and the Platform Symphony Developer Edition.
2. Overview
Microsoft Excel is widely used in a range of business applications. With a rich set of built-in functions, a straightforward programming model, and a large library of available third-party add-ins, Excel has become a staple tool for analysts running numerical simulations in areas including risk management, actuarial analysis and Monte Carlo simulation.

The accessibility and ease of use of Excel have led many financial services organizations to make extensive use of Excel-based models to automate complex, repetitive tasks such as extracting market data from real-time sources and computing pricing scenarios and risk positions. With increased trading volumes, increased electronification of exchanges, and ever more complex financial products, the ability to run more sophisticated analysis in a short time, taking more risk factors into account, has become a critical source of competitive advantage, particularly for front-office applications where response times are of critical importance.

With Excel 2007 and Excel Services (a SharePoint technology allowing Excel 2007 workbooks to be accessed through a browser), Microsoft has recognized these uses of Excel and has introduced a number of powerful features intended to make it more practical to deploy Excel-based worksheets as shared calculation services. While there is rich functionality in .NET for developers creating new applications, a great number of existing models written in Excel 2003 and older versions are in widespread use. These models often make extensive use of VBA (Visual Basic for Applications) to perform calculations and integrate custom code implemented as Excel Link Library add-ins (XLLs).

While flexible and powerful, VBA is an interpreted language and runs slowly. It is not uncommon for a complex spreadsheet to take hours on a single computer to complete a complex risk calculation. While it would be nice to simply rewrite older spreadsheet models using compiled languages and more modern tools and approaches, it is seldom that simple. Regulatory requirements dictate the need for calculation repeatability, meaning that firms are required to retain their legacy Excel models along with historical snapshots of data. Also, even if it were practical to re-architect these spreadsheet-based models around a single framework, it would be both expensive and time-consuming to do so.

Most firms employ a variety of calculators and simulation tools, both commercial and developed in-house. While many run on Microsoft platforms, others run on commercial UNIX or Linux environments, and calculation services are implemented in a variety of programming languages including Java, C, C++ and C#, making flexibility critical. Platform Symphony integrates seamlessly into a .NET environment, but also offers a flexible, language- and platform-agnostic programming model. This open and flexible approach makes Platform Symphony the industry's best solution for clients needing practical ways to accelerate a variety of Excel models implemented using multiple Excel versions and technology approaches.
Pattern #1: Custom Developed Services
Description: An Excel spreadsheet calls distributed compute services via the Symphony COM API.

Pattern #2: Command Line Utilities as Tasks
Description: An Excel client invokes services that are simple scripts or binaries running on compute nodes.

Pattern #3: Excel Instances on the Grid
Description: Multiple Excel instances run in parallel, called by client spreadsheets or other clients.

Pattern #4: Extending Excel Services via UDFs
Description: Web-based clients access Excel Services via SharePoint, employing UDFs to distribute computations to a grid.

Pattern #5: Hybrid Deployment Scenarios
Description: Combined approaches: SharePoint services using UDFs to invoke multiple services written in Java or C++, or running as other Excel services.

optional
no
no
yes
no
Windows, UNIX or Linux Excel running on a Sharepoint server Excel, C, C++, VBA, VB6, Java
Windows, UNIX or Linux or compute hosts running Excel Excel, C, C++, VBA, VB6, Java
Excel
Excel
low
very low
medium
low
low
The code fragment below illustrates how VBA code in the Excel worksheet can directly open a connection to a Platform Symphony Service, start a session and send tasks to a hypothetical Platform Symphony managed compute service called OptionChain() running on the grid. This OptionChain() function may have been previously implemented in an XLL. Platform Symphony hides considerable complexity from the developer, starting the compute service on appropriate nodes based on demand and in accordance with policy, and handling error conditions that inevitably arise when running a service in production and making them transparent to both the developer and the application user.
    ' Open a connection to Platform Symphony
    Dim connection As CSoamConnection
    Set connection = soamApi.Connect("OptionChain", callback)

    ' Open a session on the grid
    Dim session As CSoamSession
    Set session = connection.CreateSession(attributes)

    ' Send each parameter set to the compute service
    Dim k As Long
    For k = 0 To taskToSend - 1
        Dim message As MyMessage
        Set message = New MyMessage
        ' ..
        Dim inputHandler As CSoamTaskInputHandle
        Set inputHandler = session.SendTaskInput(message)
    Next k
Once all of the tasks have been sent to the compute service, the client can retrieve the results of these calculations from Platform Symphony using a few more lines of code.
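The send-then-collect flow described above can be sketched in C++. This is only an illustration under stated assumptions and is not the Platform Symphony API: MockSession, sendTaskInput(), fetchTaskOutput() and runAllTasks() are hypothetical names, and the "service" is a local function standing in for work that a grid would distribute across compute hosts.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a grid session. It is NOT the Platform Symphony
// API; it only mimics the "send every input, then fetch every output" flow.
class MockSession {
public:
    using Service = std::function<double(double)>;

    explicit MockSession(Service service) : service_(std::move(service)) {}

    // Queue one task input and return a handle (here: its index).
    std::size_t sendTaskInput(double input) {
        pending_.push_back(input);
        return pending_.size() - 1;
    }

    // Run the queued tasks and return outputs in submission order.
    // A real grid would run these in parallel on compute hosts.
    std::vector<double> fetchTaskOutput() {
        std::vector<double> outputs;
        outputs.reserve(pending_.size());
        for (double input : pending_) {
            outputs.push_back(service_(input));
        }
        pending_.clear();
        return outputs;
    }

private:
    Service service_;
    std::vector<double> pending_;
};

// Client-side flow: submit all parameter sets first, then collect results.
std::vector<double> runAllTasks(MockSession& session,
                                const std::vector<double>& parameterSets) {
    for (double p : parameterSets) {
        session.sendTaskInput(p);
    }
    return session.fetchTaskOutput();
}
```

Submitting every input before collecting any output is what lets a grid work on many tasks at once; a client that waited for each result before sending the next input would serialize the calculation.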
Adapting existing code to run as a Platform Symphony service is similarly straightforward. The code example below shows how our sample OptionChain() service, written in C++, is implemented as a Platform Symphony service. The C++ code is wrapped in a Platform Symphony service container class. When Platform Symphony invokes the calculation service, it passes a taskContext to the service. The onInvoke() method retrieves the calculation parameters from Platform Symphony, calls the existing OptionChain() code with the received parameter set, and then returns the results of the calculation to Platform Symphony via the setOutputMessage() method. The code modifications required to grid-enable the service are shown below. Note that the main() function simply creates an instance of the service class and invokes the run() method so that the new service is ready to receive work from Platform Symphony. Platform Symphony automatically handles starting and stopping service instances based on demand and on sharing policies reflecting how various users and groups are entitled to share the virtualized resources of the grid.

A Sample Compute Service Adapted to Platform Symphony
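    #include <iostream>
    // Platform Symphony API and application headers are omitted here.

    class MyServiceContainer : public ServiceContainer {
    public:
        // Called by Platform Symphony for each task sent to the service
        virtual void onInvoke(TaskContextPtr& taskContext) throw(SoamException) {
            MyMessage inMsg, outMsg;

            // Retrieve the calculation parameters for this task
            taskContext->populateTaskInput(inMsg);
            // ..
            // Call the existing calculation code
            OptionChain optc(istr);
            optc.Value();
            // ..
            // Return the result to Platform Symphony
            taskContext->setOutputMessage(outMsg);
        }
    };

    int main(int argc, char* argv[]) {
        int retVal = 0;
        try {
            // Create the container and run it
            MyServiceContainer myContainer;
            myContainer.run(argc, argv);
        } catch (SoamException& exp) {
            // Report the exception to stdout
            std::cout << "exception caught ... " << exp.what() << std::endl;
            retVal = -1;
        }
        return retVal;
    }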
The above example, while slightly simplified, shows how straightforward it is to adapt existing code to a grid-computing environment. The freely downloadable Platform Symphony Developer Edition provides numerous coding examples in different languages and programming environments.
The steps involved from the client perspective are:

The client-side application opens a connection to the Platform Symphony service via the Platform Symphony client-side API or the provided symexec utility. A client-side Excel application can avoid the use of the Platform Symphony API by using VBA functions to invoke symexec with values extracted from spreadsheet cells and constructed dynamically at run time. symexec can be called with arguments including create, open, send, fetch, close and run, so that execution tasks can be run on the service side without needing to use the client-side API at all. For example, in VB6:
    Call Shell("cmd.exe /c \\server\path_to_symphony\symexec send " & args)

Or in VB.NET:

    System.Diagnostics.Process.Start("cmd.exe", "/c \\server\path_to_symphony\symexec send " & args)
The execution task can have an optional pre-execution command configured via the application profile, which is useful for setting up the environment prior to running the service.

The client then sends tasks to the execution service, including an executable string with arguments or an optional execution task context with environment variables set as name-value pairs.

Upon receiving input from the client, the Platform Symphony execution service spawns the execution task as a process and executes it with the input arguments provided.

When the execution task has completed, the exit code of the process and the task output are sent back to Platform Symphony, and an optional post-execution script may be run.

Error handling is configurable in the XML-based application profile.
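As a rough illustration of constructing a symexec command line dynamically from run-time values, the following C++ sketch assembles and quotes the command string. The subcommand names are the ones listed above; everything else is an assumption made for illustration: buildSymexecCommand() and quoteArg() are hypothetical helpers, the layout of the arguments after the subcommand is not specified in this paper, and the quoting rule is deliberately simplistic.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Wrap an argument in double quotes when it is empty or contains a space.
// Embedded double quotes are rejected to keep this sketch simple.
std::string quoteArg(const std::string& arg) {
    if (arg.find('"') != std::string::npos) {
        throw std::invalid_argument("embedded quotes are not supported");
    }
    if (arg.empty() || arg.find(' ') != std::string::npos) {
        return "\"" + arg + "\"";
    }
    return arg;
}

// Build "<symexecPath> <subcommand> <args...>" as a single command string.
// The subcommand names come from the text; the meaning and order of the
// trailing arguments are assumptions and are passed through unchanged.
std::string buildSymexecCommand(const std::string& symexecPath,
                                const std::string& subcommand,
                                const std::vector<std::string>& args) {
    static const std::set<std::string> known = {
        "create", "open", "send", "fetch", "close", "run"};
    if (known.count(subcommand) == 0) {
        throw std::invalid_argument("unknown symexec subcommand: " + subcommand);
    }
    std::string command = quoteArg(symexecPath) + " " + subcommand;
    for (const std::string& arg : args) {
        command += " " + quoteArg(arg);
    }
    return command;
}
```

Validating the subcommand and quoting values before handing the string to a shell helps avoid the subtle failures that arise when cell contents contain spaces.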
Platform Symphony execution tasks are a unique and powerful Platform Symphony capability that gives developers the freedom to make virtually any application callable on the service side. Moreover, while we've described the case of calling service-side tasks from an Excel client above, in practice these tasks can be invoked from the client in a variety of ways, including through scripts on Windows or Linux or via .NET applications.
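The service-side lifecycle of an execution task described earlier (an optional pre-execution command, the task process itself, and an optional post-execution script) can be modeled with a small C++ sketch. This is a conceptual model only: ExecutionTask, TaskResult and execute() are hypothetical names, and the real hooks are configured in the application profile rather than in code like this.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Result reported back to the grid middleware once the task finishes.
struct TaskResult {
    int exitCode;
    std::string output;
};

// Hypothetical model of an execution task. The hook names are illustrative
// only; the real settings live in the application profile.
struct ExecutionTask {
    std::function<void()> preExec;                    // optional environment setup
    std::function<TaskResult()> run;                  // stands in for the spawned process
    std::function<void(const TaskResult&)> postExec;  // optional post-execution step
};

// Run the hooks in the order the text describes and record each stage.
TaskResult execute(const ExecutionTask& task, std::vector<std::string>& log) {
    if (task.preExec) {
        task.preExec();
        log.push_back("pre");
    }
    TaskResult result = task.run();
    log.push_back("run");
    if (task.postExec) {
        task.postExec(result);
        log.push_back("post");
    }
    return result;
}
```

Keeping the pre- and post-execution hooks optional mirrors the description above: a task with no setup requirements simply leaves them unset.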
The Platform Symphony Connector for Microsoft Excel includes numerous coding examples showing how services implemented as spreadsheets can be made callable from client-side applications. While the client-side application is frequently a spreadsheet, it could just as easily be written in C++. Platform has gone to significant effort to address the practical issues associated with deploying Excel instances as encapsulated services. A feature-rich dialog sniffer makes it straightforward to debug issues that can affect Excel instances, such as pop-up dialogs warning of particular error conditions (out-of-memory errors or security-related macro settings, for example) that, left unaddressed, would simply cause service instances to freeze and become non-responsive. While newly developed Excel-based models may use Excel Services approaches, the Platform Symphony Connector for Excel, with its ability to run existing spreadsheets as services without modification, is a critical part of a developer's portfolio of solutions to grid-enable Excel-based models.
Figure 4 depicts the SharePoint-based Excel Services architecture. Clients may interact with Excel either via a web browser interface or programmatically via an exposed web services interface. The actual Excel calculation services run inside an application server container and draw content from the SharePoint content database as well as other external sources. In the Excel Services architecture, user-defined functions (UDFs) can be written to extend spreadsheet functionality and run in the context of the Excel Services application container.
The use of user-defined functions has several advantages. Excel functionality can be extended via a library of user-defined functions stored in the SharePoint database (referred to as managed functions) so that they are callable by multiple Excel spreadsheets. Also, user-defined functions are very straightforward to call from within Excel worksheets and can be called in exactly the same manner as other functions, e.g. =MyFunction($A$1*3.6)

Microsoft Visual Studio includes a class library template that makes it easy to build your own user-defined functions. Managed-code UDFs require a reference to the assembly Microsoft.Office.Excel.Server.Udf.dll, which defines the [UdfClass] and [UdfMethod] attributes. Classes and methods that do not carry these attributes are ignored by the Excel Calculation Service.

The sample .NET C# code below illustrates how a user-defined function called MyFunction() can be written that calls a Platform Symphony service deployed to the grid. The Platform Symphony API for .NET is in the namespace Platform.Symphony.Soam, as shown in the example below. The example code contains the [UdfClass] and [UdfMethod] attributes that make this a managed function known to the Excel Calculation Service. Although some details of the code have been omitted below, it is straightforward to write a UDF using the Platform Symphony .NET client API that will invoke any Platform Symphony service. A key benefit of Platform Symphony is that the service can be running virtually anywhere and may be implemented using a different software framework.
    using System;
    using Platform.Symphony.Soam;
    using Microsoft.Office.Excel.Server.Udf;

    namespace Platform.Symphony.Clients {
        [UdfClass]
        class SyncClient {
            [UdfMethod]
            static void MyFunction(string[] args) {
                // Declarations of connection, session, securityCb, attributes,
                // taskAttr, numTasksToSend and senddata are omitted here.
                try {
                    SoamFactory.Initialize();
                    String applicationName = "SampleApplication";
                    try {
                        // Connect to the application deployed on the grid
                        connection = SoamFactory.Connect(applicationName, securityCb);
                        try {
                            session = connection.CreateSession(attributes);

                            // Send all task inputs to the service
                            for (int taskCount = 0; taskCount < numTasksToSend; taskCount++) {
                                MyMessage inputMessage = new MyMessage(taskCount, true, senddata);
                                TaskInputHandle input = session.SendTaskInput(taskAttr);
                            }

                            // Collect the output of every task
                            EnumItems enumItems = session.FetchTaskOutput((ulong) numTasksToSend);
                            foreach (TaskOutputHandle output in enumItems) {
                                if (output.IsSuccessful == true) {
                                    MyMessage outputMessage = output.GetTaskOutput() as MyMessage;
                                } else {
                                    SoamException ex = output.Exception;
                                }
                            }
                        } finally {
                            // Mandatory session close
                            if (session != null) {
                                session.Close();
                            }
                        }
                    } finally {
                        // Always close the connection
                        if (connection != null) {
                            connection.Close();
                        }
                    }
                } catch (Exception ex) {
                    Console.WriteLine("Exception caught");
                } finally {
                    SoamFactory.Uninitialize();
                }
            }
        }
    }
Some of the key benefits of using Excel Services in conjunction with Platform Symphony are:

A single user running a compute-intensive spreadsheet-based model via SharePoint has the potential to consume all of the CPU resources on a SharePoint server, making it unusable to others. By implementing UDFs that use the Platform Symphony client API to call Platform Symphony services, calculations can be moved to the grid, leaving the SharePoint server free to support a greater number of concurrent users and applications.

Using this approach, a variety of services implemented in different languages and running on different technology platforms may be presented to users as simple spreadsheet services. Users interact with the familiar spreadsheet interface, while developers and system administrators benefit from an ability to better manage code and simplify new service deployment.

Finally, unlike other approaches, this is an architecture that will scale. Platform Symphony's proven ability to support large numbers of concurrent users on multi-application grids, its ability to shift resources in accordance with application demand and site-defined policies, and its ability to automatically provision, start and stop application services make it straightforward for administrators to scale services in multiple dimensions, i.e. more users, more departments, more compute resources and more applications.
Figure 5 shows a scenario where a spreadsheet model is deployed to the web using SharePoint-based Excel Calculation Services, but behind the scenes we may want to invoke multiple instances of a compute service running in parallel, provided by a legacy spreadsheet application running tightly coupled macros and VBA code implemented in Excel 2003. Similarly, we may want the UDFs running in our SharePoint calculation service to simultaneously invoke existing C++ or Java-based services written to run on Linux or UNIX systems. Although not shown here, these calculations may in turn invoke other Platform Symphony services themselves via the client API or the symexec facility discussed earlier.

Using Excel Services, the user interface is simplified so that the end user need only interact with a simple spreadsheet. Platform Symphony handles all of the details of provisioning calculation services on appropriate nodes, providing session management and session recovery capability, and handling any transient errors that may occur with particular task calculations. The end user sees exceptionally fast response times owing to Platform Symphony's low-latency design. With 1.6 millisecond latency, 100 synchronous transactions involving the grid can be completed with overhead of less than two tenths of a second. This level of performance and flexibility in deployment approaches is simply not possible with other Excel-friendly grid computing solutions.
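The overhead figure quoted above follows from simple arithmetic, assuming the 100 transactions run strictly one after another and each incurs the full 1.6 ms of latency:

```latex
\[
100 \times 1.6\ \text{ms} = 160\ \text{ms} = 0.16\ \text{s} < 0.2\ \text{s}
\]
```

This counts middleware latency only; the time spent in the calculations themselves is additional.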
4. Summary
We've examined a number of scenarios where Excel-based models can be accelerated with Platform Symphony, both for spreadsheets making use of VBA tightly coupled with Excel 2000-2003 worksheets and for the more contemporary Excel Services models where parallelizable Excel computations are deployed via a web interface. Platform Symphony provides several unique features and capabilities making it ideally suited as a grid development and deployment platform for Excel spreadsheets and other grid-friendly workloads.

The freely downloadable Developer Edition (obtained from http://my.platform.com/products/platform-symphony-de) provides developers and I.T. managers with everything needed to grid-enable spreadsheet models.

Platform Symphony, with the Platform Symphony Connector for Microsoft Excel component (not included in the Developer Edition), provides support for running Excel 2000 to Excel 2003 spreadsheet instances as encapsulated services, with a rich set of tools to aid in debugging and managing encapsulated spreadsheet models.

Platform Symphony makes it easy to off-load compute-intensive functions from the Excel Calculation Service running on the SharePoint server by calling Platform Symphony services via UDFs, as shown in our example.

Platform Symphony works well with 32- and 64-bit .NET environments, but also integrates seamlessly with a variety of UNIX and Linux-based platforms and in-house developed software clients or services written in C++, Java and other supported languages.

Platform Symphony provides very low latency operation, in the range of 1.6 milliseconds. This means that Platform Symphony can scale with much higher efficiency than competing grid solutions. Low latency is an exceptionally important requirement, as it directly impacts the performance observed in front-office environments where traders and analysts need sub-second response times.
Organizations can start with just a few compute nodes with the Platform Symphony Developer Edition or deploy applications at very large scale. Platform Symphony has been validated to 20,000 CPUs, with single applications scaling with 95% efficiency to 4,000 CPUs, taking advantage of multi-core nodes seamlessly.

Although not dealt with in this paper, Platform Symphony provides grid administrators with a rich set of tools required to manage production grids at scale, including innovations such as XML-based application profiles and a sophisticated consumer-lender model that allows grid-based resources to be shared equitably according to policy and in response to demand for resources.
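To put the scaling figure in perspective, and assuming "efficiency" is meant in the usual sense of speedup divided by CPU count (this paper does not define the term), 95% efficiency on 4,000 CPUs corresponds to:

```latex
\[
E = \frac{S}{N}
\quad\Longrightarrow\quad
S = E \times N = 0.95 \times 4000 = 3800
\]
```

Under that assumption, the application would run roughly 3,800 times faster than on a single CPU.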
For organizations running compute-intensive spreadsheet-based models, Platform Symphony can provide needed flexibility and agility, making it easier to respond to changing business needs. By accommodating a variety of different use cases and design patterns, Platform Symphony minimizes the need for new code development, reducing risk, increasing developer productivity and accelerating your ability to gain business advantage. Finally, organizations have the strategic benefit of being on a technology platform that will scale both in terms of the number of nodes under management and in terms of management tools. By supporting all of this functionality across heterogeneous grids, enterprises enjoy maximum flexibility, helping ensure that business priorities are achieved.
Platform Computing is the leader in cluster, grid and cloud management software, serving more than 2,000 of the world's most demanding organizations for over 18 years. Our workload and resource management solutions deliver IT responsiveness and lower costs for enterprise and HPC applications. Platform has strategic relationships with Cray, Dell, HP, IBM, Intel, Microsoft, Red Hat, and SAS. Visit www.platform.com.

World Headquarters
Platform Computing Corporation
3760 14th Avenue
Markham, Ontario
Canada L3R 3T7
Tel: +1 905 948 8448
Fax: +1 905 948 9975
Toll-free Tel: 1 877 528 3676
info@platform.com

Sales - Headquarters
Toll-free Tel: 1 877 710 4477
Tel: +1 905 948 8448

North America
New York: +1 646 290 5070
San Jose: +1 408 392 4900

Europe
Bramley: +44 (0) 1256 883756
London: +44 (0) 20 3206 1470
Paris: +33 (0) 1 41 10 09 20
Düsseldorf: +49 2102 61039 0
info-europe@platform.com

Beijing: +86 10 82276000
Xi'an: +86 029 87607400
asia@platform.com

Tokyo: +81(0)3 6302 2901
info-japan@platform.com

Singapore: +65 6307 6590
wliaw@platform.com
trademarks of their respective owners, errors and omissions excepted. Printed in Canada. Platform and Platform Computing refer to Platform Computing Corporation and each of its subsidiaries. 040611