LYQUIDITY SOLUTIONS.
www.lyquidity.com/strongtyping/.
Executive summary
Studies have shown that any moderately complex new spreadsheet model is likely to
contain errors. Moreover, those studies show that even expert users are able to find
barely half the errors.
A spreadsheet model created by a user is a form of program: the spreadsheet tool is a
programming environment and its cells are the lines of code. The design of spreadsheet
applications makes them easy to use, but that ease of use comes at the price of
omitting most of the functionality a programmer would expect to find in a programming
environment, in particular tools that help to identify and mitigate errors.
Is it any wonder, then, that spreadsheet models contain errors?
Strong Typing adds to Excel key features a
programmer would take for granted but which do
not exist in any mainstream spreadsheet
environment. Using these features, the author of
a spreadsheet model can more easily find,
identify and fix errors such as invalid or incorrect
cell references.
[1] http://www.eusprig.org/horror-stories.htm
[2] Panko, What we know about spreadsheet errors
Strong Typing for Excel from Lyquidity: create more robust spreadsheet models
Best practices
Responsible spreadsheet users take steps to minimize the incidence of errors, for
example separating input, calculation and output cells; using external reviewers;
documenting the spreadsheet models; and testing extensively. But even so, errors persist.
Strong Typing
If a spreadsheet model is a program then its cells are the lines of code and formulas are
its language. There are many languages used by software engineers and there are
different ways to characterize these languages. For example, declarative vs imperative,
functional vs non-functional, general vs domain-specific. Another relevant way to
characterize languages is as dynamically vs statically typed.
Dynamic
Dynamic languages include JavaScript and PHP. Most web pages contain some
JavaScript, while PHP powers many web sites and is used to create the popular blog
software WordPress.
Dynamic languages are considered easy to use because there's no compile step. Just
write code and run.
Statically Typed
Statically typed languages include C, C++, Java and C#. Most mainstream operating
systems have been developed largely in C or C++, Microsoft Office is created using
C++, and much of Android, a mobile operating system, is created using Java.
Dynamic vs Static
Dynamic languages are considered easy to use not only because there's no compile
step but also because they don't require the programmer to do as much abstract
thinking. For example, there is no need to define more types than the small number of
built-in ones provided by the language, such as Number, String and Boolean.
Dynamic languages are also more forgiving. For example, they will coerce values from
one type to
another. If a PHP program is asked to add a piece of text containing the digits 1 and 2
to the number 8, it will coerce the text to the number 12 and then add it to the number 8
to yield a result of 20. JavaScript, by contrast, would join the two values as text and
yield "128".
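A minimal sketch of this kind of coercion, written in TypeScript using the `any` type so
that the values behave as they would in plain JavaScript. The variable names are
illustrative only. Note that JavaScript coerces text to a number for multiplication but
joins values as text for `+`, which is one way this forgiving behaviour can surprise.

```typescript
// `any` switches off type checking, so these values behave as in plain JavaScript.
const text: any = "12";
const eight: any = 8;

// Multiplying by 1 silently coerces the text "12" to the number 12,
// so the following addition is numeric: 12 + 8.
const viaMultiply: number = text * 1 + eight;

// With `+` alone, the number 8 is coerced to text and the values are joined.
const viaPlus: string = text + eight;

console.log(viaMultiply); // 20
console.log(viaPlus);     // "128"
```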
Dodging the need to add types sounds like a great way to go, but the result is usually
error-prone code. Adding Type information to program code allows programming tools
to warn the programmer when the code includes logical inconsistencies. The
compile step performed by statically typed languages is responsible for identifying these
inconsistencies and for checking all routes through the code, including code that, in a
dynamic language, may never be exercised during testing.
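As an illustration, the sketch below shows a rarely used branch containing a misspelt
property name. The `LedgerAccount` type and `summarize` function are hypothetical names
chosen for this example. With type information present, the compiler rejects the
commented-out line even if no test ever runs that branch.

```typescript
// Hypothetical record type used only for this illustration.
interface LedgerAccount {
  label: string;
  balance: number;
}

function summarize(account: LedgerAccount, verbose: boolean): string {
  if (verbose) {
    // A rarely exercised branch. The misspelling below would be rejected at
    // compile time because `balanse` is not a property of LedgerAccount:
    // return `${account.label}: ${account.balanse.toFixed(2)}`;
    return `${account.label}: ${account.balance.toFixed(2)}`;
  }
  return account.label;
}

const summary = summarize({ label: "Revenue", balance: 1000 }, true);
console.log(summary); // "Revenue: 1000.00"
```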
For these reasons it is uncommon for significant programming projects to depend upon a
dynamic language if there is an alternative. Where a dynamic language cannot be
avoided, there are often tools to retrofit a type system.
For example, eBay, which must rely on JavaScript because it is the only practical choice
for browser-based applications, created VJET[3] so that its programmers can write
JavaScript in a way that allows them to include type information. Microsoft has
recently released a tool called TypeScript[4], another tool that allows JavaScript to
be written in a way that can include type information.
What is a Type?
In a spreadsheet a value representing, say, "Revenues" is just a number. Likewise, a
value representing the concept "Cost of Goods Sold" is also just a number. We all know
what numbers are but, in this context, "number" is a Type.
Because both concepts are numbers, a spreadsheet user cannot distinguish the value for
Revenue from the value for Cost of Goods Sold unless the user already knows which
concept each value represents. This is especially so if any local label is removed or
wrong.
In a language with a Type system, such as C++ or Java, values can be tagged with
custom Types. For example, there might be a Type called "revenue" and another called
"cogs", both of which are extensions of the more fundamental Type "number".
If the value representing Revenues is assigned the Type "revenue" and the value
representing Cost of Goods Sold is assigned the Type "cogs", then the language tool
being used is able to generate warnings if the user attempts to use Revenues where
Cost of Goods Sold is expected. This eliminates a whole class of potential errors that
plague applications created using dynamic languages.
In the context of a spreadsheet model, this means it becomes possible to ensure that
when the formula for Gross Profit includes a reference to the value for Revenue, it
really is the value for Revenue and not some other, invalid account.
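The idea can be sketched in TypeScript using so-called branded types. This is an
illustration of the general technique only, not of how Strong Typing itself works. The
names `Revenue`, `Cogs` and `grossProfit` are assumptions made for the example.

```typescript
// Branded types: ordinary numbers at run time, but distinct Types to the compiler.
type Revenue = number & { readonly __brand: "Revenue" };
type Cogs = number & { readonly __brand: "Cogs" };

// Helper functions that tag a plain number with its Type.
const asRevenue = (value: number): Revenue => value as Revenue;
const asCogs = (value: number): Cogs => value as Cogs;

// Gross Profit only accepts a Revenue followed by a Cost of Goods Sold.
function grossProfit(revenue: Revenue, cogs: Cogs): number {
  return revenue - cogs;
}

const sales = asRevenue(1000);
const costOfSales = asCogs(600);

const profit = grossProfit(sales, costOfSales);
console.log(profit); // 400

// Swapping the arguments is rejected at compile time, because a Cogs value
// is not a Revenue value even though both are numbers:
// grossProfit(costOfSales, sales);
```

Both values remain plain numbers when the program runs. The distinction exists only
for the checking tool, which is the role Type information plays in the argument above.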
[3] http://eclipse.org/vjet/
[4] http://www.typescriptlang.org/
Can't a Name be used to provide Type information?
Names are commonly used to provide a human-readable tag for a cell and, when used
in cell references within formulas, they can make the whole formula more readable.
The formula =A1-A2 might become =Revenue-CostOfGoodsSold
Using Names to identify a cell or range of cells is a great idea. But the use of Names
doesn't provide any validation that the reference is correct. If the Name definition is
changed, the changed definition may or may not reference a valid cell in the context of
the formula.
Scopes
When creating a cell reference in a formula, the scope of the reference is the sheet.
That is, a reference to any cell in the same sheet is valid. But what if you
want to ensure the reference is to a cell or cells in a column of Actuals, or that the
reference is to a cell in the set of input values for this year's Budget?
In a programming language, a scope defines where variables are valid to use. In an
analogous sense, Strong Typing Scopes allow you to define a set of cells that are valid
in a specific context. As a result, Scopes allow formula validations to be fine-tuned.
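One way to picture a Scope is as a named set of cell addresses against which the
references in a formula are checked. The sketch below is a hypothetical model written
for illustration. The `CellScope` type, the `outOfScope` function and the cell addresses
are assumptions and do not describe the actual Strong Typing implementation.

```typescript
// Hypothetical model: a scope is a named set of valid cell addresses.
interface CellScope {
  label: string;
  cells: ReadonlySet<string>;
}

// An assumed "Actuals" column occupying cells B2 to B4.
const actuals: CellScope = {
  label: "Actuals",
  cells: new Set(["B2", "B3", "B4"]),
};

// Returns the references that fall outside the given scope.
function outOfScope(references: string[], scope: CellScope): string[] {
  return references.filter((ref) => !scope.cells.has(ref));
}

// A formula referring to B2 and C3: C3 is not part of the Actuals scope.
const violations = outOfScope(["B2", "C3"], actuals);
console.log(violations); // ["C3"]
```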
Assertions
It's common for programmers to want to test that values being passed to functions in a
program meet specific requirements, such as being positive or having a value between
two bounds. These tests are called Assertions. A failed assertion causes a message to
be output, alerting the programmer to a potential error.
Assertions in Strong Typing fulfil this role. Assertions apply to cell references and allow
the model builder to make assertions about characteristics of the reference. There are
many assertions; examples include checking that the referenced cell is in a specific
column or row, is not blank, has a positive value, does or does not contain a formula, or
is formatted in a specific way.
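A minimal sketch of how assertions on a referenced cell might be modelled. The
`CellRef` shape, the assertion names and the message wording are all hypothetical and
are not taken from the Strong Typing product. Each assertion returns a message when it
fails and `null` when it passes.

```typescript
// Hypothetical description of a referenced cell.
interface CellRef {
  address: string;
  value: number | string | null;
  hasFormula: boolean;
}

// An assertion yields a message on failure, or null on success.
type CellAssertion = (cell: CellRef) => string | null;

const notBlank: CellAssertion = (cell) =>
  cell.value === null || cell.value === "" ? `${cell.address} is blank` : null;

const isPositive: CellAssertion = (cell) =>
  typeof cell.value === "number" && cell.value > 0
    ? null
    : `${cell.address} is not a positive number`;

const noFormula: CellAssertion = (cell) =>
  cell.hasFormula ? `${cell.address} contains a formula` : null;

// Runs every assertion and collects the failure messages.
function checkCell(cell: CellRef, assertions: CellAssertion[]): string[] {
  return assertions
    .map((assertion) => assertion(cell))
    .filter((message): message is string => message !== null);
}

const messages = checkCell(
  { address: "B2", value: -5, hasFormula: false },
  [notBlank, isPositive, noFormula],
);
console.log(messages); // ["B2 is not a positive number"]
```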
Excel and other spreadsheet tools include facilities that can be used to implement some
assertions. For example, conditional formats can be used to color-code cells based on
their result value. However, this is limited to the end result value. Strong Typing
assertions apply at the level of the cell reference, so they can be used to guard the
values being supplied to, say, built-in functions.
Componentization
Most software applications are composed of components. The use of components
allows code to be re-used so that multiple applications can benefit from using code that
has been tested and will be updated to address any issues identified in the future.
Excel supports the use of templates, which also promotes the re-use of proven
spreadsheets. Strong Typing meta-data is stored within a workbook, so this meta-data
will be part of a template if that template is based on a workbook that has been marked
up using Strong Typing. But an Excel template is a whole spreadsheet, much like a
whole application. Sometimes it is better to be able to re-use just portions of a
spreadsheet that has been marked up with meta-data.