A Problem Which Turned Interesting
I've been working on trying to put together a new library for storing game object definition data in the Blades of Avernum Editor. The existing system is very bare-bones: a struct type for each object type and a lot of ugly hard-coded switch statements to parse the various object properties from a definition script file. Specifically, the main definition parsing function (a 300-line brute) has multiple repetitive switches over the type of object definition being read, which then call helper functions that are themselves massive switch statements over arbitrary id codes assigned to the various property names, and those finally store the parsed data into the structs. Since most of it is switches over densely packed integers, I'd bet it's about the fastest way the job could be done, but it involves a lot of repetitive, hard-to-read, and tedious-to-extend code. Naturally, based on the arrogant belief that I can solve any problem better, I've been trying to improve upon this method of structuring the parsing.
After a bit of work, I think I've found a way to do it more elegantly, although perhaps sacrificing some of the brute-force method's raw speed. I began by assuming that the object definitions themselves will be in the form of very basic textbook object-oriented design: classes with member variables, and 'getters' and 'setters' to manipulate them. So obviously, it's not the parser's problem to figure out how to store the maximum health property for a creature definition; it should hand the work off to the creature definition object by calling the appropriate setter function. Ideally, the creature definition object should be able to inform the parser about the properties it has which can be set, and which functions to call to set them. That way, the parser doesn't need to have built into it the details of every possible definition type, only the basic syntax of how definitions and properties are structured. This all sounds pretty good: the parser will read an identifier in the form of a string. Then, it'll need to look up what function to call, which sounds like we'll need a map. The map needs to map strings to setter functions, or rather setter function pointers. Here's the tricky bit though: the type of the function pointer which should result will depend on the type of definition being read. Here's what I mean, in a simplified pseudo-definition script format:
define monster 1
name = "ogre"
define monster 2
name = "dragon"
define weapon 1
name = "rusty dagger"
Here we assume we have two kinds of definitions, monsters and weapons, and that among other properties, each definition type has a name property. Players like that sort of thing, after all. So, anyway, we imagine our parser reading this file. It starts reading the definition for the ogre, and finds the name property being set. Presumably our program already contains datatypes monsterDef_t and weaponDef_t. So, the parser one way or another causes a lookup to be done in a map of identifier strings to setter function pointers. Obviously it's going to need to get a function pointer for a member function of the monsterDef_t type which takes one string argument in order to set that name. That all sounds fine. But look what happens later on: when the parser reads the dagger definition, it will need to do a lookup in another map and get back a very different function pointer, namely a member function of the weaponDef_t type which takes a string argument. This is not the same kind of pointer as the one from before. In properly awful function pointer syntax they will look like void(monsterDef_t::*)(std::string) and void(weaponDef_t::*)(std::string).
It might not seem problematic that we have multiple types of pointers in play. After all, it's not like we want to jam them all into the same collection; in order to distinguish the properties of different kinds of definitions, each definition type will need its own map of identifiers to pointers to setters. But one piece of code, somewhere in the guts of the parser, is going to have to handle any pointer that results from looking up which setter to call. There's no type we can choose [1] to, say, store the result of the lookup, because there's nothing in common between the two types of pointers we need to deal with. The textbook OO approach of 'make them both/all derive from a common base class' can't [2] get you out of this either; you can't define a pointer type to be derived from another type!
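To make the mismatch concrete, here is a minimal sketch using stand-in versions of monsterDef_t and weaponDef_t. The setName members and the typedef names are assumptions made up for illustration; the real types would carry many more properties. The static_assert uses C++11, which the rest of this post doesn't otherwise need.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Stand-ins for the real definition types; only the name property is modeled.
struct monsterDef_t {
    std::string name;
    void setName(std::string n) { name = n; }
};
struct weaponDef_t {
    std::string name;
    void setName(std::string n) { name = n; }
};

// The two setter pointer types the parser would have to juggle.
typedef void (monsterDef_t::*monsterStrSetter)(std::string);
typedef void (weaponDef_t::*weaponStrSetter)(std::string);

// Same signature, different class: the compiler treats them as unrelated types.
static_assert(!std::is_same<monsterStrSetter, weaponStrSetter>::value,
              "setter pointers for different classes are distinct types");

// Calling through a member function pointer needs an object of the matching class.
inline std::string nameViaPointer() {
    monsterDef_t ogre;
    monsterStrSetter set = &monsterDef_t::setName;
    (ogre.*set)("ogre");
    // weaponStrSetter bad = &monsterDef_t::setName;  // would not compile
    assert(ogre.name == "ogre");
    return ogre.name;
}
```

So a single variable in the parser cannot hold "whichever setter the lookup returned" without some extra layer in between.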
So how can this be solved? The way I thought about it was by working in from both ends of the problem. On one side are the various definition datatypes. They want to only have variables, getters, and setters. Let's assume that they are in fact not part of any given class hierarchy, each just existing independently of the others. So, the extent of the interface at this level is that there are a bunch of types T, each of which has some functions which take arguments like single integers or strings. At the other end we have the parser, which wants to be able to call a function and get back something like an enum telling it what type of property it's in the process of reading, and then, after having read the property value, call another function with the property name and the value (which will be one of several types) to set the value and be done with it. Obviously even at the parser level there will have to be multiple functions to call to store away data, since the data to be stored will be of different types like integers and strings which can't go as parameters to the same function. So the parser expects to have a separate function to call for each type of property.
We could in theory patch this together with a single chunk of code stuck into the middle. The lazy way to do it would be to jam in a common base class for all of the definition datatypes which presents a set of virtual functions matching the interface the parser is prepared to work with. By doing that, though, we would force every definition type to implement several virtual functions which accept property names along with integer, string, and other value types, and set the corresponding members if the names match. This would in principle work, but it has the ugly feature that it is rather invasive, infecting the basic data storage types with a bunch of logic and structure related to the problems of parsing.
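A rough sketch of what that lazy approach might look like follows. The definitionBase class, the bool return values, and the "max_health" property name are all hypothetical, chosen only to show how the parsing logic ends up living inside the storage type.

```cpp
#include <cassert>
#include <string>

// Hypothetical parser-facing interface baked directly into the data types.
class definitionBase {
public:
    virtual ~definitionBase() {}
    // Each overload returns true if the property name was recognized and stored.
    virtual bool setProperty(const std::string& propName, int val) = 0;
    virtual bool setProperty(const std::string& propName, const std::string& val) = 0;
};

// The storage type now has to know about property names and parsing concerns.
struct monsterDef_t : public definitionBase {
    std::string name;
    int maxHealth;
    monsterDef_t() : maxHealth(0) {}
    virtual bool setProperty(const std::string& propName, int val) {
        if (propName == "max_health") { maxHealth = val; return true; }
        return false;
    }
    virtual bool setProperty(const std::string& propName, const std::string& val) {
        if (propName == "name") { name = val; return true; }
        return false;
    }
};

// Small usage check: the parser would only ever see definitionBase.
inline bool invasiveDemo() {
    monsterDef_t ogre;
    definitionBase& iface = ogre;
    bool ok = iface.setProperty("name", std::string("ogre"))
           && iface.setProperty("max_health", 40)
           && !iface.setProperty("bogus", 1);
    assert(ok);
    return ok && ogre.name == "ogre" && ogre.maxHealth == 40;
}
```

Every definition type would need its own copy of those if-chains, and every monsterDef_t would carry a vtable pointer purely for the parser's benefit.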
So as an alternative, why not go ahead and make the common base class, but make its child classes entirely separate from the data storage classes? This would, according to my vague understanding, be more or less an instance of the adapter pattern. It's attractive here because the storage types can remain lean-and-mean with only statically dispatched (non-virtual) accessors that the majority of the editor code can call [3], the base adapter class can present a unified interface to the parser, and the adapter classes can encapsulate all of the details of what property names are valid for a given definition type and to which setter functions they should be mapped. Again, this would solve the problem, and much better than the previous attempt, but it still involves writing a lot of boilerplate code for each concrete adapter type: it will need to juggle multiple maps and functions to do lookups in those maps. So, there's one more improvement I wanted to make.
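To show the kind of boilerplate involved, here is a sketch of one adapter written entirely by hand. The names definitionAdapter, weaponAdapter, and setName are hypothetical, and only string properties are handled to keep it short; a real adapter would repeat the map and the lookup function for each value type.

```cpp
#include <cassert>
#include <map>
#include <string>

// Storage type stays free of any parsing logic (stand-in for the real weaponDef_t).
struct weaponDef_t {
    std::string name;
    void setName(const std::string& n) { name = n; }
};

// Parser-facing base class; only string properties are shown here.
class definitionAdapter {
public:
    virtual ~definitionAdapter() {}
    // Returns true if propName is known and the value was stored.
    virtual bool setProperty(const std::string& propName, const std::string& val) = 0;
};

// Hand-written adapter: owns the name-to-setter map for exactly one storage type.
class weaponAdapter : public definitionAdapter {
    typedef void (weaponDef_t::*strSetter)(const std::string&);
    weaponDef_t& target;
    std::map<std::string, strSetter> strSetters;
public:
    explicit weaponAdapter(weaponDef_t& t) : target(t) {
        strSetters["name"] = &weaponDef_t::setName;
    }
    virtual bool setProperty(const std::string& propName, const std::string& val) {
        std::map<std::string, strSetter>::const_iterator it = strSetters.find(propName);
        if (it == strSetters.end())
            return false;
        (target.*(it->second))(val);  // the typed pointer never leaves this class
        return true;
    }
};

// Small usage check: the parser only ever sees definitionAdapter.
inline bool adapterDemo() {
    weaponDef_t dagger;
    weaponAdapter adapter(dagger);
    definitionAdapter& iface = adapter;
    bool stored = iface.setProperty("name", "rusty dagger");
    assert(stored);
    return stored && dagger.name == "rusty dagger";
}
```

Nothing in weaponAdapter except the constructor body is specific to weapons, which is exactly the part worth handing to the compiler.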
This is my favorite part. Boilerplate code is boring, so can we get the compiler to write it for us? That means it's time for templates! In the preceding discussion templates haven't appeared, although I thought a lot about them as I was bumbling through deriving my ideas up to this point. For the earlier problems templates couldn't help because they don't unify types; a template wraps a type (or combination of types) into a unique new type. Now though, we've safely buffered the unique types away by themselves. We know that every adapter is going to have to have several variables of its own, like a map of identifier strings to enums representing property types, a map of identifier strings to integer setter functions, a map of identifier strings to string setter functions, and so on, so we can bundle that all up into a class together. That class should be a superclass of the actual adapters, but it can't be the base class we already have, since that has to be a single, storage-type-agnostic class, so we certainly can't make it a template parameterized with the storage type. So, we should insert it into the inheritance hierarchy between the adapter base and the actual adapters. It can then handle the boring work of actually doing lookups in the various maps, since the mechanisms for doing that will be common to all adapters. All the adapters actually need to do is populate the maps with the appropriate identifiers and pointers, then sit back and let their superclass do all of the actual work. Here's what my prototype of the adapter framework came out looking like:
#include <map>
#include <string>
#include <utility>

using std::map;
using std::string;
using std::make_pair;

//storage-type-agnostic interface that the parser talks to
class objectManager{
public:
    enum propertyType{INVALID, INTEGER, STRING};
    virtual ~objectManager(){} //managers may be destroyed through base pointers
    virtual void beginDefinition(unsigned int index)=0;
    virtual propertyType checkPropertyType(const string& propName)=0;
    virtual void setProperty(const string& propName, int val)=0;
    virtual void setProperty(const string& propName, const string& val)=0;
};

//shared lookup machinery; T is the storage type being adapted
template <typename T>
class objectManagerImpl : public objectManager{
protected:
    unsigned int lastItem; //index of the current definition; (unsigned int)-1 means none yet
    map<unsigned int,T>& objectList;
    map<string,propertyType> propertyList;
    map<string,void(T::*)(int)> intProperties;
    map<string,void(T::*)(const string&)> strProperties;
public:
    //initializers are listed in declaration order
    objectManagerImpl(map<unsigned int,T>& m):lastItem((unsigned int)-1),objectList(m){}
    virtual propertyType checkPropertyType(const string& propName){
        typename map<string,propertyType>::const_iterator it
            =propertyList.find(propName);
        if(it!=propertyList.end())
            return(it->second);
        return(INVALID);
    }
    virtual void beginDefinition(unsigned int index){
        typename map<unsigned int,T>::iterator it=objectList.find(index);
        if(it==objectList.end())
            objectList[index]=T();
        lastItem=index;
    }
    virtual void setProperty(const string& propName, int val){
        if(lastItem==(unsigned int)-1)
            return; //eventually should actually indicate error
        typename map<string,void(T::*)(int)>::const_iterator it
            =intProperties.find(propName);
        if(it==intProperties.end())
            return; //eventually should actually indicate error
        (objectList[lastItem].*(it->second))(val);
    }
    virtual void setProperty(const string& propName, const string& val){
        if(lastItem==(unsigned int)-1)
            return; //eventually should actually indicate error
        typename map<string,void(T::*)(const string&)>::const_iterator it
            =strProperties.find(propName);
        if(it==strProperties.end())
            return; //eventually should actually indicate error
        (objectList[lastItem].*(it->second))(val);
    }
    //register an integer property and the setter to call for it
    void addProperty(const string& propName, void(T::*propSet)(int)){
        typename map<string,propertyType>::iterator it
            =propertyList.find(propName);
        if(it!=propertyList.end())
            return; //need to indicate collision with an error/exception
        propertyList.insert(make_pair(propName,INTEGER));
        intProperties.insert(make_pair(propName,propSet));
    }
    //register a string property and the setter to call for it
    void addProperty(const string& propName, void(T::*propSet)(const string&)){
        typename map<string,propertyType>::iterator it
            =propertyList.find(propName);
        if(it!=propertyList.end())
            return; //need to indicate collision with an error/exception
        propertyList.insert(make_pair(propName,STRING));
        strProperties.insert(make_pair(propName,propSet));
    }
};
This is designed with the assumption that I'm going to be storing the actual data objects in maps of integers to object definitions. Then, supposing I want to work with two types of game objects, fish and jars of marmalade:
struct fish{
    int length;
    int angriness;
    fish():length(0),angriness(0){}
    void setLength(int len){
        length=len;
    }
    int getLength() const{
        return(length);
    }
    void setAngriness(int a){
        angriness=a;
    }
    int getAngriness() const{
        return(angriness);
    }
};
struct marmalade{
    int calories;
    std::string labelText;
    marmalade():calories(0),labelText(""){}
    void setCalories(int c){
        calories=c;
    }
    int getCalories() const{
        return(calories);
    }
    void setLabelText(const std::string& s){
        labelText=s;
    }
    const std::string& getLabelText() const{
        return(labelText);
    }
};
The entire interface for the parser will be pretty much taken care of with:
class fishManager : public objectManagerImpl<fish>{
public:
    fishManager(std::map<unsigned int,fish>& m)
    :objectManagerImpl<fish>(m){
        addProperty("length",&fish::setLength);
        addProperty("angriness",&fish::setAngriness);
    }
};
class marmaladeManager : public objectManagerImpl<marmalade>{
public:
    marmaladeManager(std::map<unsigned int,marmalade>& m)
    :objectManagerImpl<marmalade>(m){
        addProperty("calories",&marmalade::setCalories);
        addProperty("label_text",&marmalade::setLabelText);
    }
};
I still haven't written the actual parser, the above prototype code lacks error handling, and I'd like to include provisions for limiting properties to specified ranges, but overall I'm pretty happy with how neatly and compactly this worked out. Not least because I got to use templates and function pointers.
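For what it's worth, here is a rough guess at the shape the parser's main loop might take once it exists. This is not the finished parser: parseDefinitions is a hypothetical function, the line format simply follows the pseudo-definition script from earlier, only integer properties are handled, and the interface and fish manager are restated in trimmed, hand-written form so that the sketch compiles on its own.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Trimmed restatement of the parser-facing interface (integer properties only).
class objectManager {
public:
    enum propertyType { INVALID, INTEGER };
    virtual ~objectManager() {}
    virtual void beginDefinition(unsigned int index) = 0;
    virtual propertyType checkPropertyType(const std::string& propName) = 0;
    virtual void setProperty(const std::string& propName, int val) = 0;
};

struct fish {
    int length;
    fish() : length(0) {}
    void setLength(int len) { length = len; }
};

// Hand-rolled stand-in for the templated fishManager, to keep this block short.
class fishManager : public objectManager {
    std::map<unsigned int, fish>& objectList;
    unsigned int lastItem;
public:
    explicit fishManager(std::map<unsigned int, fish>& m) : objectList(m), lastItem(0) {}
    virtual void beginDefinition(unsigned int index) {
        objectList[index];  // creates a default fish if none exists yet
        lastItem = index;
    }
    virtual propertyType checkPropertyType(const std::string& propName) {
        return propName == "length" ? INTEGER : INVALID;
    }
    virtual void setProperty(const std::string& propName, int val) {
        if (propName == "length")
            objectList[lastItem].setLength(val);
    }
};

// Hypothetical parser loop: handles "define <type> <index>" and "<prop> = <int>" lines.
// It knows nothing about fish; it only talks to objectManager.
inline void parseDefinitions(std::istream& in,
                             const std::map<std::string, objectManager*>& managers) {
    objectManager* current = 0;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ls(line);
        std::string first;
        if (!(ls >> first))
            continue;  // blank line
        if (first == "define") {
            std::string typeName;
            unsigned int index;
            current = 0;
            if (!(ls >> typeName >> index))
                continue;  // malformed header; skip its properties
            std::map<std::string, objectManager*>::const_iterator it = managers.find(typeName);
            if (it != managers.end()) {
                current = it->second;
                current->beginDefinition(index);
            }
        } else if (current) {
            std::string eq;
            int val;
            if (current->checkPropertyType(first) == objectManager::INTEGER
                && (ls >> eq >> val) && eq == "=")
                current->setProperty(first, val);
        }
    }
}
```

The map from type names to managers is the only place where the set of definition types would need to be listed; unknown types and unknown properties are silently skipped here, which a real parser would presumably report as errors.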
[1] void*, besides being generally horrible, isn't properly applicable to member function pointers anyway.
[2] At least not superficially.
[3] After all, for much of the editor code there's no use for polymorphism of definition data; floors and monsters are very rarely interchangeable. In an unrelated aside, I've just had a brilliant idea for the defense of a future evil base/laboratory.