A Refined Visitor

I had an idea a couple of hours ago that for someting I was doing the Visitor Pattern was probably what I should be using: I had a tree of objects I wanted to scan in different ways.

There were two concerns that sprang to mind. Firstly, the Visitor pattern is usually written with a lot of virtual member functions. The nodes being visited need to expose a virtual Accept function so that double dispatch can work properly in the presence of polymorphism. The visitors need expose virtual Visit functions because they must derive from a specific base class so that the nodes know what to accept in their Accept functions. While polymorphism and the use of virtual functions was entirely idiomatic when the GoF wrote their book, it has fallen out of favor in current C++ practice1. Also, while a large part of the point of the pattern is to require only that each visitable object expose a single Accept function, it works partially by shifting the burden onto the visitor, which must have a separate Visit function for each visitable type2. It's not too hard to guess how I'll try to eliminate virtual functions and extra typing at the same time.

The second concern is that I'd like different visitors to be able to visit the children of various node in different orders. I've seen this sort of thing done before via tricks like doubling interface of the visitors so that for each Visit function (which will be called after the node's children have been visited) there is also a PreVisit function which is called before the node's children are visited. This way the visitor can implement either or both and mostly get the control it wants. The two disadvantages of this system are that it involves even more typing, and that the visitor still doesn't get to choose in what order to visit the nodes children; it can only move them, as a block, before of after (or in between) the node itself.

To address the first source of bother, I will of course attempt to use templates. If the nodes' Accept function is templated on the visitor type we no longer need a common base class for visitors. In turn, this frees up the visitors to supply their Visit functions as templates, since they no longer have any reason to make them virtual. That means that the visitor doesn't have to enumerate all the things it can visit: it can supply a generic Visit which does nothing, and provide specializations which actually do something for the cases it cares about. It's all very neat and tidy, and requires very little typing. It comes, however with a substantial downside, for which I can think of no viable solution: If the nodes' Accept is a template, it cannot be virtual. This means that if a polymorphic node* is ever encountered, the doble dispatch will happen based on its static type, rather than its dynamic type, which is almost certainly wrong. For my intended use this is no problem, as I intend to use no polymorphism anyway, just deeply nested objects of statically known types. In cases where polymorphism is needed however, this seems to be a deal breaker. If, however, polymorphism isn't needed, one can simply do:

class SomeNode{
public:
    template<typename VisitorType>
    void Accept(VisitorType& visitor){
        visitor.Visit(*this);
    }
};

class SomeVisitor{
public:
    template<typename NodeType>
    void Visit(NodeType& node){}
};

template<>
void SomeVisitor::Visit<SomeNode>(SomeNode& node){
    //do something with node
}

To addres the second annoyance I'd like to give each node a way to define a default visiting order for its children (which can be used by visitors which don't care much about order), and give each visitor a way to specify a nondefault visiting order for the children of specific node types. So, a visitor should have the choice between, let's say, supplying a Visit function in which it takes responsibility for choosing the order of visiting for the node's children, and a VisitOrdered3 in which it doesn't worry about the children, knowing that the node will take care of taking it to its children at some unspecified point. Since if the visitor exposes a non-trivial Visit we don't want to also call VisitOrdered, so the node should call Visit first. Rather than requiring that Visit somehow indicate whether or not VisitOrdered should be called, why not just give it a default implementation that just triggers the call to VisitOrdered directly? That way, when it is overridden the call to VisitOrdered will just never happen. There's a wrinkle here, in that the default Visit can't just call VisitOrdered directly, since it needs to tell the node to take care of visiting the children. We can solve this by adding a second version of Accept to node, AcceptOrdered, which send the visitor to the node's children in some order, and also invokes VisitOrdered. So, we now have a chain of calls Accept->Visit->AcceptOrdered->VisitOrdered, which the visitor can choose to stop early, at the Visit stage. It ends up looking like this:

class SomeNode{
public:
    template<typename VisitorType>
    void Accept(VisitorType& visitor){
        visitor.Visit(*this);
    }
    template<typename VisitorType>
    void AcceptOrdered(VisitorType& visitor){
        //for each child, call:
        //child.Accept(visitor)
        visitor.VisitOrdered(*this);
    }
};

class SomeVisitor{
public:
    template<typename NodeType>
    void Visit(NodeType& node){
        node.AcceptOrdered(*this);
    }
    template<typename NodeType>
    void VisitOrdered(NodeType& node){}
};

//If the visitor wants to control the order
//of visiting the children of SomeNode:
template<>
void SomeVisitor::Visit<SomeNode>(SomeNode& node){
    //for each interesting child of node, explicitly call:
    //child.Accept(*this)
    //also, do whatever needs to be done with node itself
}

//Otherwise, let the node choose the order:
template<>
void SomeVisitor::VisitOrdered<SomeNode>(SomeNode& node){
    //do something with node
}

It isn't much longer than before (most of the length above is because I showed both the Visit and VisitOrdered cases, which would actually exist at the same time), and greatly increases the pattern's flexibility. Also, while instead of a pair of calls for each node visited we now have as many as four calls ping-ponging back and forth, the compiler should be able to easily see which are trivial and eliminate them by inlining4. This refinement could, of course be perfectly well applied while leaving all interfaces virtual; it would mostly just entail some extra typing in the visitor base class.

Here, finally, is an example of using all of this on a somwhat stupid tree of objects with statically known types: a linked list of pairs of wrapped integers. We define one visitor that just goes through an prints all of the integers, and another which makes a deliberate effort to print them all in the reverse order.

#include <iostream>

#define VISITABLE \
    template<typename VistorType> \
    void AcceptVisitor(VistorType& visitor){ \
        visitor.Visit(*this); \
    } \
    template<typename VistorType> \
    void AcceptVisitorOrdered(VistorType& visitor)
#define TRIVIALLY_VISITABLE \
    VISITABLE{ \
        visitor.VisitOrdered(*this); \
    }
#define VISITOR \
    template<typename NodeType> \
    void Visit(NodeType& node){ \
        node.AcceptVisitorOrdered(*this); \
    } \
    template<typename NodeType> \
    void VisitOrdered(NodeType& node){}

class Foo{
int i;
public:
    Foo(int i):i(i){}
    int getValue(){ return(i); }

    TRIVIALLY_VISITABLE;
};

class Bar{
Foo a,b;
public:
    Bar(int a, int b):a(a),b(b){}
    Foo& getA(){ return(a); }
    Foo& getB(){ return(b); }

    VISITABLE{
        a.AcceptVisitor(visitor);
        b.AcceptVisitor(visitor);
        visitor.VisitOrdered(*this);
    }
};

class Baz{
Baz* next;
Bar b;
public:
    Baz():next(NULL),b(0,0){}
    void setB(Bar nb){ b=nb; }
    Bar& getB(){ return(b); }
    void setNext(Baz* n){ next=n; }
    Baz* getNext(){ return(next); }

    VISITABLE{
        b.AcceptVisitor(visitor);
        visitor.VisitOrdered(*this);
        if(next)
            next->AcceptVisitor(visitor);
    }
};

class PlainVisitor{
public:
    VISITOR;
};

template<>
void PlainVisitor::VisitOrdered<Foo>(Foo& node){
    std::cout << node.getValue() << std::endl;
}

class ReverseVisitor{
public:
    VISITOR;
};

template<>
void ReverseVisitor::Visit<Baz>(Baz& node){
    if(node.getNext())
        node.getNext()->AcceptVisitor(*this);
    node.getB().AcceptVisitor(*this);
}
template<>
void ReverseVisitor::Visit<Bar>(Bar& node){
    node.getB().AcceptVisitor(*this);
    node.getA().AcceptVisitor(*this);
}
template<>
void ReverseVisitor::VisitOrdered<Foo>(Foo& node){
    std::cout << node.getValue() << std::endl;
}

int main(){
    Baz bz1, bz2;
    bz1.setB(Bar(1,2));
    bz2.setB(Bar(3,4));
    bz1.setNext(&bz2);

    PlainVisitor v1;
    ReverseVisitor v2;

    bz1.AcceptVisitor(v1);
    bz1.AcceptVisitor(v2);

    return(0);
}

Output:

1
2
3
4
4
3
2
1

  1. Which is not to say that it isn't used, but rather that it is used only as a method of last resort. 

  2. Although, to be fair, when polymorphism is used for the visitors only the base class needs to contain all of the boiler plate for these; the derived visitors can usually implement just the subset of versions of Visit that they need for their actual task. 

  3. This naming doesn't feel quite right, but I'm too tired or uncreative to think of a better one. 

  4. An added benefit of all types involved being statically known! 

No Comments

Comment on this post