For the past few weeks, one of my weekend side-projects has been improving TorqueScript, the scripting language behind the Torque3D game engine.
This time I decided to tackle TorqueScript’s type conversion system. This handles converting strings such as “0.1” into native values such as 0.1f, and vice versa. Since everything must be converted to and from strings whenever object fields and variables are manipulated, it becomes a major performance bottleneck when scripting is used heavily.
A Little Background on the Type System…
In TorqueScript, values can be one of four types:
- Strings
- Floats
- Unsigned Integers
- Custom Types
The first three are self-explanatory. The fourth is more involved, as it is used to expose engine variables and object fields as strings. Getters are implemented using the ConsoleGetType macro, which defines a function that returns a string built from the value. Setters are implemented with the ConsoleSetType macro, which defines a function that parses a string and sets the value.
All access to registered engine variables and objects goes through this type system, so any improvement to it has the potential to improve overall performance. For instance, point values in TorqueScript are always manipulated as strings, so any complex vector math incurs the overhead of string conversion. For this reason I also decided to implement value-based arrays.
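To make the getter/setter idea concrete, here is a minimal C++ sketch of what such a pair does for a three-component point. It does not use the actual ConsoleGetType/ConsoleSetType macros or reproduce their signatures; `Point3`, `getPointAsString` and `setPointFromString` are hypothetical names made up for illustration.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for an engine-side point type.
struct Point3 {
    float x = 0.0f, y = 0.0f, z = 0.0f;
};

// "Getter": format the native value as a string that script code can read.
std::string getPointAsString(const Point3& p) {
    char buf[96];
    std::snprintf(buf, sizeof(buf), "%g %g %g", p.x, p.y, p.z);
    return std::string(buf);
}

// "Setter": parse a string coming from script and store the native value.
// Returns false and leaves the target untouched if parsing fails.
bool setPointFromString(Point3& p, const std::string& text) {
    Point3 parsed;
    if (std::sscanf(text.c_str(), "%f %f %f",
                    &parsed.x, &parsed.y, &parsed.z) != 3)
        return false;
    p = parsed;
    return true;
}
```

Every field read or write from script pays for one of these two functions, which is why the conversion path matters so much.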
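As a rough illustration of that overhead, the sketch below adds two points twice: once on native floats, and once the way a string-only value forces you to do it (parse, add, format). `Vec3`, `addNative` and `addAsStrings` are hypothetical names; this is not engine code.

```cpp
#include <cstdio>
#include <string>

// Hypothetical native vector type.
struct Vec3 {
    float x, y, z;
};

// Native addition: three float adds, no parsing and no allocation.
Vec3 addNative(const Vec3& a, const Vec3& b) {
    return Vec3{a.x + b.x, a.y + b.y, a.z + b.z};
}

// String-based addition: parse both operands, add, then format the result.
// Components that fail to parse are left at zero.
std::string addAsStrings(const std::string& a, const std::string& b) {
    Vec3 va{0.0f, 0.0f, 0.0f};
    Vec3 vb{0.0f, 0.0f, 0.0f};
    std::sscanf(a.c_str(), "%f %f %f", &va.x, &va.y, &va.z);
    std::sscanf(b.c_str(), "%f %f %f", &vb.x, &vb.y, &vb.z);
    const Vec3 sum = addNative(va, vb);
    char buf[96];
    std::snprintf(buf, sizeof(buf), "%g %g %g", sum.x, sum.y, sum.z);
    return std::string(buf);
}
```

The string version does the same three additions, plus two parses and a format for every single operation, and a chain of vector operations repeats that cost at each step.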
What went wrong
One big problem with altering something as monolithic as TorqueScript is that small changes can cause big problems, since the whole implementation is essentially an interconnected mess. Changed the syntax? Be prepared to debug the AST. Changed the AST? Be prepared to debug the bytecode. Changed the bytecode? Be prepared to debug the interpreter.
For instance, when changing parts of the AST I frequently found that it wouldn’t generate the correct opcodes. The issue was that the AST nodes had been designed assuming the other statement nodes behaved in a specific way, so when I added new variable types to support arrays, code generation failed completely.
In another case simple changes to the interpreter and the AST caused the float stack to become unbalanced, which required painfully tedious step-by-step debugging in order to resolve the issue.
Problems from the ground up
TorqueScript suffers from a key design issue: stack values are split into three pools (strings, integers and floats), and the only generic value type is a string. When calling functions, for example, the compiler has to assume the return value is a string.
There is some code present in the interpreter which skips the conversion if the next opcodes indicate the value will be converted to an integer or float, but this is more of an elaborate hack around a poorly designed system.
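The general shape of that trick can be sketched as a peephole check inside the interpreter loop. Everything here is hypothetical (the `Op` values, the `Stacks` struct, and a fake call that always returns 0.1); the real interpreter's opcodes and stack layout differ.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical opcodes for illustration only.
enum class Op { Call, StrToFloat, End };

// Two of the separate value pools, plus a counter for the demo.
struct Stacks {
    std::vector<std::string> strings;
    std::vector<double> floats;
    int conversionsSkipped = 0;
};

void run(const std::vector<Op>& code, Stacks& st) {
    for (std::size_t ip = 0; ip < code.size(); ++ip) {
        switch (code[ip]) {
        case Op::Call: {
            const double result = 0.1; // the fake function's native result
            // Peek ahead: if the value is about to be converted to a float
            // anyway, push it straight onto the float stack.
            if (ip + 1 < code.size() && code[ip + 1] == Op::StrToFloat) {
                st.floats.push_back(result);
                ++st.conversionsSkipped;
                ++ip; // consume the conversion opcode as well
            } else {
                // Generic path: the return value must become a string.
                char buf[32];
                std::snprintf(buf, sizeof(buf), "%g", result);
                st.strings.push_back(buf);
            }
            break;
        }
        case Op::StrToFloat:
            if (!st.strings.empty()) {
                st.floats.push_back(
                    std::strtod(st.strings.back().c_str(), nullptr));
                st.strings.pop_back();
            }
            break;
        case Op::End:
            return;
        }
    }
}
```

It works, but the call opcode now has to know about the opcodes that may follow it, which is exactly the kind of coupling that makes later changes fragile.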
Since there is no generic stack value, operator overloading is impossible (thus the need for specific string concatenation operators). Opcodes relating to field access need to be duplicated in order to handle different types. Object references need to be resolved every time a field is accessed or when an object function is invoked.
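For contrast, here is a minimal sketch of what a generic, tagged stack value could look like, and how it would let a single `+` dispatch on the operand types. This is not how TorqueScript works and not what I implemented; `Value` and its helpers are hypothetical, and the "string plus anything concatenates" rule is just one possible design.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical generic stack value: one slot that remembers its own type.
struct Value {
    enum class Kind { Int, Float, String };
    Kind kind;
    std::uint32_t i;
    double f;
    std::string s;

    static Value fromInt(std::uint32_t v) {
        return Value{Kind::Int, v, 0.0, std::string()};
    }
    static Value fromFloat(double v) {
        return Value{Kind::Float, 0u, v, std::string()};
    }
    static Value fromString(const std::string& v) {
        return Value{Kind::String, 0u, 0.0, v};
    }
};

// Numeric view of a value; strings are parsed on demand.
double asNumber(const Value& v) {
    switch (v.kind) {
    case Value::Kind::Int:   return static_cast<double>(v.i);
    case Value::Kind::Float: return v.f;
    default:                 return std::strtod(v.s.c_str(), nullptr);
    }
}

// String view of a value; numbers are formatted on demand.
std::string asString(const Value& v) {
    if (v.kind == Value::Kind::String) return v.s;
    if (v.kind == Value::Kind::Int) return std::to_string(v.i);
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%g", v.f);
    return std::string(buf);
}

// One '+' for every type: concatenate if either side is a string,
// stay in integers if both sides are integers, otherwise add as floats.
Value operator+(const Value& a, const Value& b) {
    if (a.kind == Value::Kind::String || b.kind == Value::Kind::String)
        return Value::fromString(asString(a) + asString(b));
    if (a.kind == Value::Kind::Int && b.kind == Value::Kind::Int)
        return Value::fromInt(a.i + b.i);
    return Value::fromFloat(asNumber(a) + asNumber(b));
}
```

With a value like this, conversions happen only when an operation actually needs a different representation, and field-access opcodes would not need one variant per type.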
TorqueScript also lacks tests to verify that it functions correctly, which ultimately means everything needs to be tested manually after each change… unless you have the time and resources to create a complete test suite.
While this project has been interesting, ultimately I’ve decided to shelve it for the time being.
It turns out fixing a scripting language is harder than it first looks. While I managed to implement arrays and improve the way certain types are set, fixing all the bugs caused by the changes turned out to be an even bigger hurdle. In the end, I was stuck trying to resolve some tricky killer bugs in the editor scripts.
In light of my recent experience, I have the utmost respect for anyone who maintains a scripting language.
For anyone interested, this code is available in my Torque3D fork.