Building One Statically Typed System atop Another
One of the most challenging (and definitely most fun) problems that I've attacked in the course of writing my own script interpreter has been allowing scripts to call native C++ functions[1]. What makes it tricky is that the arguments for the call won't actually exist until run-time, but the C++ compiler won't (with good reason) allow you to call a function unless you can convince it at compile-time that you're calling with the right number and type of arguments. The C++ objects that my script interpreter uses to represent script types do have a well-defined C++ type (cleverly called value), but that's not a type that any standard function accepts. If I want to call abs, sin, or pow I need arguments that are ints, floats, or suchlike, nothing more or less.
Let's suppose that I want to call long abs(long), specifically. To do so, I need to have a long, or an expression whose type is long. The input I have is an object (or expression) of type value. Given
struct value{
    VariableType type;
    union{
        long iVal;
        double fVal;
        stringWrapper* sVal;
        //some other union members omitted
    };
    //various operator declarations omitted
};
what I need to do seems pretty obvious. Assuming the input is v, all I have to do is check its type and if it's okay to do so, pull out its iVal:
result = abs( v.iVal );
That was easy enough. If I want to call double abs(double) I can write almost the same thing. This is all great, but the point is that I won't know what I want to call until run-time when I read the script. I could write a wrapper function for every C++ function I might want to call, which wouldn't be hard but it would be a lot of typing for a rather inelegant solution. I want to make the compiler write the code for me, which is another way of saying that I plan to use templating.
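For comparison, here is a minimal sketch of what one such hand-written wrapper might look like, with the type check spelled out. The enumerators TYPE_INT and TYPE_FLOAT and the function name callAbsLong are assumptions made for illustration (the definition of VariableType isn't shown here), and the value struct is cut down to two members.

```cpp
#include <cstdlib>    // std::labs
#include <stdexcept>  // std::runtime_error

// Hypothetical enumerators; the real VariableType is not shown in the post.
enum VariableType { TYPE_INT, TYPE_FLOAT };

// Cut-down version of the value struct (string member and operators omitted).
struct value {
    VariableType type;
    union {
        long iVal;
        double fVal;
    };
};

// Hand-written wrapper for long abs(long): check the script type,
// unwrap the argument, call the native function, wrap the result.
value callAbsLong(const value& v) {
    if (v.type != TYPE_INT)
        throw std::runtime_error("abs(long) called with a non-integer value");
    value result;
    result.type = TYPE_INT;
    result.iVal = std::labs(v.iVal);
    return result;
}
```

Every native function exposed this way would need its own copy of this pattern, which is exactly the repetition the template approach is meant to remove.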
What should go into the template? Well, it's going to need a pointer to the function I want to call, and it's going to need to know about the type of that function pointer. The type could be a single template parameter, but if it's broken up into the return type and the arguments life will be easier later on. Again, the tricky bit is that the types of the inputs won't be the same as the types the function pointer expects. It's easy to write manually the necessary code to transform the generic value type into a C++ type in any specific case, so what's needed is to teach the compiler a recipe to do that. However, we can't just blithely say to the compiler 'oh, you just look in the struct and pull out the member whose type is the same as the type you want'. The compiler doesn't dirty its hands composing code; it's not a typist. What it will do is fill in the blanks of a template, like a human filling out Mad Libs. What needs to be done is to give it the Mad Libs to fill out.
Okay, so another template is needed to convert a value into an arbitrary type. Something like:
template< class T >
class TypeExtractor{
public:
    T evaluate(value v);
};
Great, but how to write TypeExtractor<T>::evaluate()? It needs to be written differently depending on whether T is long or double, and that points to template specialization:
template<>
class TypeExtractor<long>{
public:
    long evaluate(value v){
        //if v meets certain criteria
        return(v.iVal);
    }
};
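Filled out a little further, a pair of specializations with the "certain criteria" written as run-time checks might look like the sketch below. The enumerator names, the choice to throw std::runtime_error, and the policy of letting integers widen to double are all assumptions for illustration. The sketch also leaves the primary template undefined, a small variation on the declaration above, so that asking for an unsupported type fails at compile time.

```cpp
#include <stdexcept>  // std::runtime_error

// Hypothetical enumerators; the real VariableType is not shown in the post.
enum VariableType { TYPE_INT, TYPE_FLOAT };

// Cut-down version of the value struct.
struct value {
    VariableType type;
    union {
        long iVal;
        double fVal;
    };
};

// Primary template: declared but never defined, so only the
// specializations below can be instantiated.
template <class T>
class TypeExtractor;

template <>
class TypeExtractor<long> {
public:
    long evaluate(const value& v) const {
        if (v.type != TYPE_INT)
            throw std::runtime_error("expected an integer");
        return v.iVal;
    }
};

template <>
class TypeExtractor<double> {
public:
    double evaluate(const value& v) const {
        // Assumed policy: an integer may widen to a double.
        if (v.type == TYPE_INT)
            return static_cast<double>(v.iVal);
        if (v.type != TYPE_FLOAT)
            throw std::runtime_error("expected a number");
        return v.fVal;
    }
};
```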
What this amounts to is that we manually write the conversion for each type we care about, but having done so we can get the compiler to mix and match them for us as needed. If it needs to prepare a function long schnobble(long, double, string), we can just tell it that since the argument types are long, double, and string, it just needs to make a TypeExtractor with each of those types as a parameter. Then, assuming we've written all of the necessary TypeExtractor specializations, the compiler will be satisfied that it can take three values and call the schnobble function. A way to do it would look something like this:
template <class R, class T1, class T2, class T3>
class BuiltinFunctionExpression3:public FunctionExpression{
protected:
    R (*func)(T1,T2,T3);
public:
    BuiltinFunctionExpression3(R (*f)(T1,T2,T3)):func(f){}
    value evaluate(value v1, value v2, value v3){
        return(TypeWrapper<R>::wrap(
            func(TypeExtractor<T1>().evaluate(v1),
                 TypeExtractor<T2>().evaluate(v2),
                 TypeExtractor<T3>().evaluate(v3))
        ));
    }
};
Note that I've assumed that there is a TypeWrapper template which performs the opposite service of the TypeExtractor. Regrettably, until variadic templates are well supported[2], the best we can do is to write a separate template for each arity of function we wish to wrap[3]. While all this is a lot of code to start with, it won't grow, unlike writing a wrapper for every additional function.
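Where variadic templates are available (C++11 and later), the per-arity templates could in principle collapse into one. The following is only a sketch under several assumptions: the enumerators, the TypeWrapper specializations, the ValueFor helper, and the scaleAndShift test function are all invented for illustration, the FunctionExpression base class is left out, and functions returning void are not handled.

```cpp
#include <stdexcept>  // std::runtime_error

enum VariableType { TYPE_INT, TYPE_FLOAT };  // hypothetical enumerators

struct value {  // cut-down version of the value struct
    VariableType type;
    union { long iVal; double fVal; };
};

template <class T> class TypeExtractor;  // value -> native type
template <> class TypeExtractor<long> {
public:
    long evaluate(const value& v) const {
        if (v.type != TYPE_INT) throw std::runtime_error("expected an integer");
        return v.iVal;
    }
};
template <> class TypeExtractor<double> {
public:
    double evaluate(const value& v) const {
        if (v.type != TYPE_FLOAT) throw std::runtime_error("expected a float");
        return v.fVal;
    }
};

template <class T> struct TypeWrapper;  // native type -> value
template <> struct TypeWrapper<long> {
    static value wrap(long x) { value v; v.type = TYPE_INT; v.iVal = x; return v; }
};
template <> struct TypeWrapper<double> {
    static value wrap(double x) { value v; v.type = TYPE_FLOAT; v.fVal = x; return v; }
};

// Maps every native parameter type to `value`, so that evaluate()
// takes exactly one value per parameter of the wrapped function.
template <class> struct ValueFor { typedef value type; };

template <class R, class... Ts>
class BuiltinFunctionExpression {
    R (*func)(Ts...);
public:
    explicit BuiltinFunctionExpression(R (*f)(Ts...)) : func(f) {}
    value evaluate(const typename ValueFor<Ts>::type&... vs) const {
        // The pack expansion pairs each value with the extractor
        // for the corresponding parameter type.
        return TypeWrapper<R>::wrap(func(TypeExtractor<Ts>().evaluate(vs)...));
    }
};

// Hypothetical native function, used only to exercise the wrapper.
double scaleAndShift(long n, double x, double shift) { return n * x + shift; }
```

With this in place, BuiltinFunctionExpression<double, long, double, double> wrapping scaleAndShift accepts three value arguments, and passing the wrong number of them is rejected at compile time just as with the fixed-arity version.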
Okay, that was all very nice, and it does in fact work well in practice. But what happens if I decide to be really ambitious and allow templated types to exist in script? Suppose I want to let script writers create variables of type list<int>, list<string>, list<list<float>> and so on? They might all actually be vector<value> under the hood. Things now get tricky if I want to write native C++ functions which need to work with these script templated types. It's possible to use the above system to do things like pass a value whose script type is list<int> into a C++ function which accepts an argument of type vector<long> and so on, but what about writing a C++ function which works generally on list<T>, say in order to be able to print any list? Literally speaking, doing so is impossible; the script type of the argument can be a type that does not exist until it gets defined at run-time, so there's no way to write a C++ template for it, since templates are read and instantiated at compile-time. If that can't be done, what about writing a C++ function that accepts a value object, but expects that value to have a certain sort of script type, like list<T>? I'm still groping toward a solution, but I think it can be done. What I envision roughly is a TypeExtractor<value> which takes a value and gives back another value, but gives a guarantee about the script type of the value returned. That way type safety will be fully observed throughout both the script and C++ layers.
The detail that I'm still trying to figure out is how to communicate to a TypeExtractor<value> a specification for which script types to permit. Such a thing can be easily expressed as an object, but there isn't a good way to send an object into the BuiltinFunctionExpressionN template so that it could be passed to the TypeExtractor<value> constructor. What I'm now trying to wrap my mind around is expressing a script type requirement as a C++ type. After all, types can be used as template parameters, which would entirely dodge the need to create and hand around extra objects. I think it should be possible, but the whole idea hasn't yet quite fallen into place in my mind.
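To make the idea a bit more concrete, here is one speculative way a script type requirement could be spelled as a C++ type. It is not a worked-out design: ScriptType, the tag types Int, Float, Any and ListOf, the Matches checker, and the Checked wrapper are all hypothetical names invented for this sketch, and value is reduced to a stand-in that carries only a run-time type description.

```cpp
#include <memory>     // std::shared_ptr, std::make_shared
#include <stdexcept>  // std::runtime_error

// Hypothetical run-time description of a script type, e.g. list<list<int>>.
struct ScriptType {
    enum Kind { INT, FLOAT, LIST } kind;
    std::shared_ptr<ScriptType> element;  // set only when kind == LIST
};

ScriptType makeInt() { ScriptType t; t.kind = ScriptType::INT; return t; }
ScriptType makeList(const ScriptType& elem) {
    ScriptType t;
    t.kind = ScriptType::LIST;
    t.element = std::make_shared<ScriptType>(elem);
    return t;
}

// Stand-in for the interpreter's value; payload omitted for brevity.
struct value { ScriptType type; };

// Tag types: a requirement such as "any list of lists" is written
// as the C++ type ListOf<ListOf<Any> >.
struct Int {};
struct Float {};
struct Any {};
template <class Elem> struct ListOf {};

// Matches<Spec>::check turns the compile-time spec into a run-time test.
template <class Spec> struct Matches;
template <> struct Matches<Int> {
    static bool check(const ScriptType& t) { return t.kind == ScriptType::INT; }
};
template <> struct Matches<Float> {
    static bool check(const ScriptType& t) { return t.kind == ScriptType::FLOAT; }
};
template <> struct Matches<Any> {
    static bool check(const ScriptType&) { return true; }
};
template <class Elem> struct Matches<ListOf<Elem> > {
    static bool check(const ScriptType& t) {
        return t.kind == ScriptType::LIST && t.element
            && Matches<Elem>::check(*t.element);
    }
};

// A value whose script type has been verified against Spec.
template <class Spec> struct Checked { value v; };

template <class T> class TypeExtractor;
template <class Spec>
class TypeExtractor<Checked<Spec> > {
public:
    Checked<Spec> evaluate(const value& v) const {
        if (!Matches<Spec>::check(v.type))
            throw std::runtime_error("script type mismatch");
        Checked<Spec> c;
        c.v = v;
        return c;
    }
};
```

Under this scheme a native function could declare a parameter of type Checked<ListOf<Any> >, and because the extractor is selected by parameter type, the BuiltinFunctionExpressionN templates would not need any extra objects passed in. Whether this scales to user-defined script types is an open question that the sketch does not address.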
[1] Only those that the script interpreter exposes access to, of course. We can't just have people calling system willy-nilly.
[2] Read that: "Until Apple updates the version of gcc in the Developer Tools to 4.3 or higher..." Fingers are crossed for that happening along with the release of Snow Leopard.
[3] And a duplicate of each for functions returning void, unfortunately, since the given template breaks in that case.