December 31, 2004

Java or C#?

Now that I have successfully implemented prototype translators from both Java and C# to C++ (see my previous post), I have some thoughts on what would make a good language to use in combination with C++ for higher level game code. I am not proposing the replacement of C++. C++ is an excellent language for all kinds of games related problems.

Rather, I am proposing using a reflection-oriented language like C# where it is clearly beneficial to do so. The clear win is the ability of an alternative language to integrate more cleanly with the asset pipeline than is possible with C++. I have also considered the idea of using scripting languages. I am going to eliminate scripting languages from my shortlist. Not because they don't have a place in games development, quite the opposite, but because they cannot be used in all the cases where there is a need for a clean binding between game code and game assets.

I want a language that can fill the role that UnrealScript fills in Unreal titles, but in a console friendly way.

I am left with Java and C# on my shortlist. What other possibilities are there? I want a high performance statically-typed language that can compete head-to-head with C++ in most areas of game development. I envision maybe 50% of the code being written in C++, with the rest being written partially in Java or C# and partially in a scripting language like Lua or Python.

Lisp is an option. Naughty Dog seemed to use it effectively for a number of projects before losing their Lisp guru. They are also searching for an alternative to C++. I have a broad understanding of Lisp but I have never written any program of significant size in it so I am hardly an expert. But I think this is a good point. Few games programmers have anything but a very basic understanding of Lisp. Why rock the boat? The key role I am trying to fill is for a language that can bridge the gap between code and assets. There is no need to switch to a whole new programming paradigm. Java and C# are ideal because, when used in the way that I envision, they will be like C++ with reflection. An experienced C++ programmer will become an expert in either of these languages in a matter of weeks.

With respect to the role I want it to fill, Java has a killer flaw. It does not allow objects to be allocated on the stack or embedded within other objects. This would not be a problem for a scripting language. But it means that Java cannot go head-to-head with C++ in the areas where I need it. The memory management overhead would be too much. Compare:

struct GameObject1
{
    Matrix LocalToWorld;
    Vector Velocity;
    Vector Acceleration;
};

struct GameObject2
{
    Matrix *LocalToWorld; // all allocated separately on the heap
    Vector *Velocity;
    Vector *Acceleration;
};
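To put a rough number on the difference, here is a minimal C++ sketch that counts heap allocations for the two layouts. The `Counted` base class, the `g_allocations` counter and the two helper functions are my own illustrative names, and the `Matrix` and `Vector` definitions are simple stand-ins rather than real math types.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

int g_allocations = 0;  // hypothetical counter, for illustration only

// Any type derived from Counted bumps the counter when it is heap-allocated.
struct Counted {
    static void* operator new(std::size_t size) {
        ++g_allocations;
        return ::operator new(size);
    }
    static void operator delete(void* p) { ::operator delete(p); }
};

struct Vector : Counted { float x = 0, y = 0, z = 0, w = 0; };
struct Matrix : Counted { float m[16] = {}; };

// Members embedded by value: the whole object is one block of memory.
struct GameObject1 : Counted {
    Matrix LocalToWorld;
    Vector Velocity;
    Vector Acceleration;
};

// Members held by pointer, which is what Java forces for class-typed fields.
struct GameObject2 : Counted {
    std::unique_ptr<Matrix> LocalToWorld = std::make_unique<Matrix>();
    std::unique_ptr<Vector> Velocity = std::make_unique<Vector>();
    std::unique_ptr<Vector> Acceleration = std::make_unique<Vector>();
};

int allocations_for_embedded() {
    g_allocations = 0;
    auto object = std::make_unique<GameObject1>();
    assert(object != nullptr);
    return g_allocations;
}

int allocations_for_indirect() {
    g_allocations = 0;
    auto object = std::make_unique<GameObject2>();
    assert(object != nullptr);
    return g_allocations;
}
```

In this sketch the embedded layout costs one allocation per game object and the pointer layout costs four, before even counting the extra cache misses from chasing the pointers.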

Another significant failing in Java is its inability to use anything other than reference (pointer) types as generic type parameters. You can't do the equivalent of this:

template <typename T>
class Vector
{
    size_t size;
    T* array;
    // ...
};

void foo()
{
    Vector<int> intArray;
}

The best you can do in Java is a vector of pointers to ints, where each int is allocated on the heap separately. Massive overhead. C# has neither of these flaws.
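As a rough illustration of the two storage strategies, here is a C++ sketch using the standard library. The function names are mine, and `std::unique_ptr<int>` is only an approximation of a boxed Java integer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Value-typed elements: the ints live side by side inside one array.
bool is_contiguous(const std::vector<int>& values) {
    for (std::size_t i = 1; i < values.size(); ++i) {
        if (&values[i] != &values[i - 1] + 1) return false;
    }
    return true;
}

// Pointer-typed elements: one extra heap allocation per int.
std::vector<std::unique_ptr<int>> make_boxed(const std::vector<int>& values) {
    std::vector<std::unique_ptr<int>> boxed;
    for (int v : values) boxed.push_back(std::make_unique<int>(v));
    return boxed;
}

long sum_values(const std::vector<int>& values) {
    long total = 0;
    for (int v : values) total += v;  // sequential reads from one block
    return total;
}

long sum_boxed(const std::vector<std::unique_ptr<int>>& boxed) {
    long total = 0;
    for (const auto& p : boxed) {
        assert(p != nullptr);
        total += *p;  // one pointer dereference per element
    }
    return total;
}
```

Both sums give the same answer, but the boxed version pays for an allocation and an indirection on every element.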

Having taken garbage collection out of C#, are there any remaining reasons why it should not perform as well as C++? That makes a good subject for a future post...


It lives!

I finally got my C# to C++ translator doing something useful! It's still early days but I successfully translated a C# program into C++. The C# program was this:

class Program
{
    static void Main(string[] args)
    {
        for (int i = 0; i != 10; ++i)
        {
            Console.WriteLine("Hello world!");
        }
    }
}


I compiled this with the Microsoft C# compiler, which generated Program.exe and Program.pdb. My translator loaded these two files, extracted all the information (metadata, debug information and MSIL assembly code) and then output this:

void Main()
{
NativeInt si0;
void* sr0;
NativeInt si1;

#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
L0: {
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
{
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
Int8 CS_36_4_36_0000;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
{
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
Int32 i;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
si0 = 0;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
L1: i = si0;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
L2: goto L20;
#line 17 "f:/permanent/dev/ilvisitor/test1/program.cs"
L4: sr0 = InternString("Hello world!");
#line 17 "f:/permanent/dev/ilvisitor/test1/program.cs"
L9: ::System::Console::WriteLine((::System::String*) sr0);
#line 17 "f:/permanent/dev/ilvisitor/test1/program.cs"
L14: void(0);
#line 18 "f:/permanent/dev/ilvisitor/test1/program.cs"
L15: void(0);
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
L16: si0 = i;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
L17: si1 = 1;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
L18: si0 = si1 + si0;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
L19: i = si0;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
L20: si0 = i;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
L21: si1 = 10;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
L23: si0 = si1 == si0;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
L25: si1 = 0;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
L26: si0 = si1 == si0;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
L28: CS_36_4_36_0000 = si0;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
L29: si0 = CS_36_4_36_0000;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
L30: if (si0) goto L4;
#line 15 "f:/permanent/dev/ilvisitor/test1/program.cs"
}
#line 19 "f:/permanent/dev/ilvisitor/test1/program.cs"
L32: return;
#line 19 "f:/permanent/dev/ilvisitor/test1/program.cs"
}
#line 19 "f:/permanent/dev/ilvisitor/test1/program.cs"
}
}


It looks like I was right when I said it would not be comprehensible to humans! Anyway, I compiled this C++ function, which generated another executable. I ran that and got the expected output:

Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!


Notice all the "#line" directives. These allow me to debug the C# program using a C++ debugger. This is working pretty well. I can set breakpoints in the C# source file (rather than the translated C++ one) and single step through the C# code. I also use the same variable names in the translation so I can see their values in the debugger as I step through. It's just as good as using a C# debugger.
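For anyone who has not used "#line" before, here is a tiny standalone C++ sketch of what the directive does. The `SourceLocation` struct and the file name "program.cs" are just placeholders for this example, not part of my translator.

```cpp
#include <cassert>
#include <cstring>

struct SourceLocation {
    int line;
    const char* file;
};

// The directive tells the compiler to pretend that the next line is
// line 17 of "program.cs". Debug information and the __LINE__ and
// __FILE__ macros all follow that claim from here on.
SourceLocation reported_location() {
#line 17 "program.cs"
    return SourceLocation{__LINE__, __FILE__};
}
```

The function reports line 17 of program.cs even though it lives in a C++ file, and that is the same information the debugger uses to map a breakpoint back to the C# source.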

For fun I might try running it on a console. It would be cool to get a screenshot of CodeWarrior debugging a C# program running on a non-Microsoft platform :)


Garbage collection for robust game code

In an earlier post, I considered the performance issues associated with garbage collection in game code. I came to the conclusion that it was inappropriate for use during gameplay because it would burn a considerable amount of CPU time and make the frame rate stutter. I also thought that it might be useful for detecting memory leaks during development. Now I think it might actually have wider application.

As a game is developed, people make mistakes. Artists and designers check in buggy assets and programmers check in buggy code. The impact these errors have on development depends on the robustness of the game code. Ideally such errors will not stop the game from running and their impact will be limited to the parts of the game that have an essential dependency on them.

For example, if an artist checks in an animation with the wrong name, the game should not crash. It should complain loudly that the animation is wrongly named but it should run and all the other animations should play correctly. Likewise, if a programmer introduces a bug into one of the characters, that bug should ideally not make the game crash and all the other characters should work correctly.

This allows development to continue relatively unhampered while the asset or code is fixed.

It is difficult to achieve this kind of robustness in the face of certain kinds of memory bug: memory fragmentation and memory leaks. The problem is that memory is a resource that is shared between all parts of the game code. If fragmentation or leaks allow the memory to get into a state where allocations cannot be satisfied by the memory manager, all the code comes to a grinding halt and the game has no option but to crash.

Garbage collection could help here. I still hold the opinion that garbage collection is too slow for use during gameplay. But what if it was held in reserve until leaks and fragmentation actually caused the memory manager to fail? It is better to drop a few frames and let the garbage collector sort it out than to have the game crash. Then although the rest of the team might be frustrated with the frame rate, at least they can still get on with their work.

My idea is to use a garbage collector as a secondary disaster recovery system. The primary memory management methods would be the kind we already use: manual memory management, reference counting, weak references, etc. If these techniques were applied correctly, as one would hope they would be in the final release of a game, the garbage collector would never be activated and would thus have absolutely no impact on performance. Of course, should a memory bug make it into the final release, the difference between an occasional frame drop and a crash would be the difference between TRC pass and failure!
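Here is a minimal C++ sketch of that fallback policy. It only does bookkeeping on byte counts: `BudgetedHeap` is a hypothetical name, and the `collect` callback stands in for a real garbage collector by returning how many bytes it managed to reclaim.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

class BudgetedHeap {
public:
    BudgetedHeap(std::size_t budget, std::function<std::size_t()> collect)
        : budget_(budget), collect_(std::move(collect)) {}

    // Returns false only if the request fails even after a collection.
    bool allocate(std::size_t bytes) {
        if (fits(bytes)) {
            used_ += bytes;
            return true;
        }
        // The primary scheme has failed, so run the collector as a last
        // resort and retry once. A healthy build never reaches this path.
        ++collections_;
        std::size_t reclaimed = collect_();
        used_ -= (reclaimed < used_) ? reclaimed : used_;
        if (fits(bytes)) {
            used_ += bytes;
            return true;
        }
        return false;
    }

    void release(std::size_t bytes) {
        assert(bytes <= used_);
        used_ -= bytes;
    }

    int collections() const { return collections_; }
    std::size_t used() const { return used_; }

private:
    bool fits(std::size_t bytes) const { return bytes <= budget_ - used_; }

    std::size_t budget_;
    std::size_t used_ = 0;
    int collections_ = 0;
    std::function<std::size_t()> collect_;
};
```

The `collections()` counter is the useful diagnostic here: if it is ever non-zero, something leaked or fragmented, and the team gets a dropped frame and a warning instead of a crash.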

Here are some other potential uses of garbage collection that would not affect the frame rate:

The point of this post is not to demonstrate another useful feature of C#. There is a garbage collector available for C++: http://www.hpl.hp.com/personal/Hans_Boehm/gc/.


December 28, 2004

The key benefit of C# for game code

I am a little unsatisfied with the answer I gave to a comment on my previous post. It was somewhat abstract and hand-wavy. So I thought a concrete example might be a good idea.

The reason I am writing a C# to C++ translator prototype is because there is one feature missing from C++ that I think would be extremely beneficial for game development. That feature is the ability for a program to examine its own structure, or the structure of another program, through reflection. This feature is not unique to C#. But I'm not trying to start a revolution here. I want a language that is very similar to C++ that has this additional feature. C# is ideal in this regard.

As a concrete example, consider the way most games tie together a game's asset editing tools, the assets themselves and the game code. We have some tools: level editors, modeling tools, animation tools, texturing tools, etc. And we have a game that needs to load the assets produced by those tools. In order to accomplish this we need various data formats known to both the game and the tools. The tools need to be able to output these formats and the game needs to be able to load these formats.

An effective practice is to have an intermediate format description language. These have become commonplace in recent years and are key to the idea of data-driven development. They are typically some kind of structured text file, often XML, that describes a tree of data that both the tools and the game understand.

One of the advantages is that they decouple the game from the tools so that a change in one does not require an immediate change in the other. It allows the various teams to work on slightly different timelines, thus eliminating task dependencies. Another advantage is that they can contain other useful stuff, such as how the information should be displayed to a designer or artist so that they can edit it. They can also contain information about legacy versions of the format to allow backwards compatibility. All good stuff.

We do not use format description languages for all our formats. But that does not mean they would not all benefit from one. How often do we hear a programmer say to an artist, "Yeah you need to re-export the file"?

However, these schema languages are only necessary because C++ is incapable of reflection. Unlike C++ programs, C# programs are also schemas. A game written in C# would have no need for a separate format description language. Here is a concrete example. Suppose we have a game where an avatar walks around a maze collecting fruit. We might express the level format using a schema like this:

[levelSchema]
    [element name="Fruit" abstract="true"]
        [attribute name="Position" type="Vector2" default="0,0"
            edittingTool="DragAndDrop"/]
        [attribute name="Points" type="int" default="1"/]
    [/element]
    [element name="Apple" extends="Fruit"]
        [attribute name="Speedup" type="float" default="1.5"/]
    [/element]
[/levelSchema]


Okay I admit it won't win any contests for best schema language but it serves as a simple example. In a C++ game, we would most likely have an object model that followed the structure of the format and some code to load data of that format. The code might be written by hand or somehow automatically generated. Whenever a programmer wanted to change the object model or the format, they would update the schema, the object model and the loading code. If they are lucky they won't need to change the tools. That's a key advantage of using a schema language.
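To make the redundancy visible, here is a rough C++ sketch of the hand-maintained version. I am assuming the level data has already been parsed into a simple name-to-string map (the `Attributes` alias), and the loader function names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

using Attributes = std::map<std::string, std::string>;

struct Vector2 { float x = 0, y = 0; };

// Object model: the field names, types and defaults repeat the schema.
struct Fruit {
    Vector2 Position;
    int Points = 1;
    virtual ~Fruit() = default;
};

struct Apple : Fruit {
    float Speedup = 1.5f;
};

// Loading code: the same names and types appear a third time.
void load_fruit(Fruit& fruit, const Attributes& attrs) {
    auto it = attrs.find("Points");
    if (it != attrs.end()) fruit.Points = std::stoi(it->second);

    it = attrs.find("Position");
    if (it != attrs.end()) {
        const std::string& text = it->second;  // expected form: "x,y"
        std::size_t comma = text.find(',');
        if (comma != std::string::npos) {
            fruit.Position.x = std::stof(text.substr(0, comma));
            fruit.Position.y = std::stof(text.substr(comma + 1));
        }
    }
}

void load_apple(Apple& apple, const Attributes& attrs) {
    load_fruit(apple, attrs);
    auto it = attrs.find("Speedup");
    if (it != attrs.end()) apple.Speedup = std::stof(it->second);
}
```

Renaming "Speedup" means touching the schema, the struct and the loader, and nothing checks that all three still agree.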

So how does C# help us? In C#, the game code might look like this:

[GameObject]
public abstract class Fruit
{
    [Edittable(true)]
    [Default(1)]
    public int Points;

    [Edittable(true)]
    [Default(Vector2(0, 0))]
    [EdittingTool("DragAndDrop")]
    public Vector2 Position;
}

[GameObject]
public class Apple : Fruit
{
    [Edittable(true)]
    [Default(1.5f)]
    public float SpeedUp;
}


This is contrived but, as you can see, the schema is embedded in the C# code, which eliminates the redundancy of expressing the format in both C++ (in the object model and the loading code) and in the schema. Programmers have less work to do and there is less possibility of error or things getting out-of-sync.

Furthermore, because the program can examine its own structure, i.e. examine the schema embedded in itself, it can automatically determine how to load data of the format described by the schema. Alternatively, an offline compiler plugin can examine the structure of a partially compiled game and automatically generate all the loading code.

Additionally, in the same way as a compiler plugin can examine compiled code, a tool can simply load the compiled game as an input file and deduce what the file format is. Alternatively, if the tool and the game are connected over a network so that a designer can edit the level as the game is running, the game can simply use the reflection API to query its schema and send it to the tool just after the network connection is established. There is no need for the tool to know anything about the game.
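C++ can only approximate this with a hand-written table of field descriptors, which is roughly what reflection metadata gives you for free. In the sketch below, `FieldInfo`, `apple_fields`, `load` and `describe_schema` are all hypothetical names of mine, and the schema string format is invented for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Apple {
    int Points = 1;
    float Speedup = 1.5f;
};

// One entry per field: a name, a type name and a way to set the field.
// With real reflection the runtime would supply this table.
struct FieldInfo {
    std::string name;
    std::string type;
    void (*set)(Apple&, const std::string&);
};

const std::vector<FieldInfo>& apple_fields() {
    static const std::vector<FieldInfo> fields = {
        {"Points", "int",
         [](Apple& a, const std::string& v) { a.Points = std::stoi(v); }},
        {"Speedup", "float",
         [](Apple& a, const std::string& v) { a.Speedup = std::stof(v); }},
    };
    return fields;
}

// Generic loader: driven entirely by the table, with no per-field code.
void load(Apple& apple, const std::map<std::string, std::string>& attrs) {
    for (const FieldInfo& field : apple_fields()) {
        auto it = attrs.find(field.name);
        if (it != attrs.end()) field.set(apple, it->second);
    }
}

// What the game could send to an editing tool when it connects.
std::string describe_schema() {
    std::string out;
    for (const FieldInfo& field : apple_fields()) {
        out += field.name + ":" + field.type + ";";
    }
    return out;
}
```

The loader and the schema description both fall out of the one table, but somebody still has to keep that table in step with the struct by hand, which is exactly the chore I want the language to do for me.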

There are many other ways reflection could be used effectively. For example, it might be used to automatically generate network protocols. Certain fields could be annotated in the schema to indicate how they are synchronized.

What appeals to me most is it would make things just work. Imagine, you add two lines of code to your game:

[Edittable]
public float MaximumSpeed;


In C++, all this would do is add a member variable to a class. In C#, in addition to this, it automatically introduces a new GUI element into a designer's level editor, extends the level file format and automatically generates all the loading and saving code for the new field.

Don't Repeat Yourself!


December 25, 2004

.NET disassembler

I made a lot of progress on my C# to C++ translator today. I have written a simple disassembler that loads the EXE and PDB for a .NET assembly and outputs all the interesting metadata, debug information (C# source line numbers and variable names) and IL assembly using the new .NET 2.0 reflection API. It was surprisingly easy, only 300 lines of code. All I need to do now is modify it to output C++ code instead of IL disassembly. I did have some problems disassembling C# code that uses generics. I suspect the new reflection API may not be quite finished yet; it is only a beta release after all.

I have been thinking further about where C# might fit into games development. We are already using it for tools; now I am considering whether it is useful for actual game code. Apart from the need to write a translator in order to use C# on a console, I think my next biggest concern is the .NET garbage collector. Although garbage collection is one of the features of the .NET framework that makes it so easy to use, I don't think consoles are sufficiently powerful to use it yet. It isn't just the number of cycles it burns in order to free memory. Another problem is that it doesn't spread the CPU cost evenly, which could make the frame rate stutter.

I think the simplest solution is simply to abandon garbage collection and use something simpler like reference counting, combined with manual memory management where necessary. This will probably also mean abandoning a large portion of the .NET class library.
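For comparison, this is roughly what reference counting looks like on the C++ side, sketched with the standard `std::shared_ptr` and `std::weak_ptr`. The `Texture` type and its live-object counter are placeholders I made up so the example can observe when destruction happens.

```cpp
#include <cassert>
#include <memory>

// Hypothetical asset type that records how many instances are alive.
struct Texture {
    explicit Texture(int& live_count) : live(live_count) { ++live; }
    ~Texture() { --live; }
    int& live;
};

// Returns true if the texture stays alive while it has an owner and is
// destroyed at the moment the last owner lets go.
bool released_deterministically() {
    int live = 0;
    std::shared_ptr<Texture> a = std::make_shared<Texture>(live);
    std::shared_ptr<Texture> b = a;       // second owner, count is now 2
    std::weak_ptr<Texture> observer = a;  // observes without owning
    assert(a.use_count() == 2);

    a.reset();
    bool alive_while_owned = (live == 1) && !observer.expired();

    b.reset();  // last owner gone: the destructor runs right here
    bool freed_at_once = (live == 0) && observer.expired();

    return alive_while_owned && freed_at_once;
}
```

The cost of freeing is paid at a predictable point rather than in a collector pause. The usual caveat applies: cycles of strong references never reach zero, which is where weak references and manual management come in.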

Fortunately the C# language is sufficiently independent from the rest of the .NET framework that this will not be a problem. I will probably look into taking the C# source code for the Mono implementation of the .NET class library and just pick out the useful bits that do not rely on garbage collection.

A garbage collector of some sort might be useful for diagnostics of course. So long as the game does not need to run it in release builds it might be a useful way of finding memory leaks.

To anyone concerned that I spent Christmas day messing around programming, be assured that I did do Christmas things. I had a very tasty prime rib with some of my friends from work. Then I went to the beach and did some reading. This is the first Christmas where I haven't had to wear a big winter coat. I love California!



December 24, 2004

Powerpoint in powerpoint

This blog post made me laugh. If you have ever had the misfortune of sitting through a poor powerpoint presentation, you will understand. The author presents the failings of powerpoint as a written form using the very form that he ridicules. Hilariously circular. But he does such a good job, I can only conclude that it is not the form itself at fault but its users.

Atom, RSS, feeds and feed aggregators

What do these funny words mean?

Last week I heard mention of the word "feed" in the context of the Internet in two different conversations. My brain, being trained to look out for this sort of thing, advised me that I should probably investigate further. I was pleasantly surprised to discover that the Internet continues to evolve in new and interesting ways.

A feed is a simple thing. Most sites where people make regular postings, e.g. news sites, blogs, forums, etc, support feeds. A feed is simply a mechanism that a web site uses to tell the outside world that a new post has become available to read. It doesn't sound like such a big deal and that's probably why I didn't put a lot of effort into researching it until now.

What makes feeds a big deal is feed aggregators. Feed aggregators are programs that let people manage feeds. They let you choose which feeds you are interested in (subscribe) and which you are not (unsubscribe) and they notify you when someone posts on a site that you track.

Now rather than visiting lots of sites every day to see if there are any interesting new posts, I just wait for my feed aggregator program to display a little red flag in my computer's system tray. This will save me precious minutes.

Another useful aspect of feed aggregators is the way they can link people's feed preferences together. Let's say I am interested in cooking so I subscribe to some cooking feeds. The most obvious benefit is now my feed aggregator will deliver some new recipes to me every day. But having subscribed to a feed, I can also find out who else has subscribed to it and which feeds they, in turn, have subscribed to. If I like any of the feeds that they subscribe to then I can choose to subscribe to them as well.

It's a bit like a self-regulating filter that finds the best bits of the Internet and abandons the others. If a site does not have content that interests people then nobody will subscribe to it and thus nobody will learn about it. On the other hand, if a site has content that people like then they will subscribe to it. Then other people will see that they have subscribed and add it to their own subscriptions. And from there the site will become widely known.

The feed aggregator I use is actually a web site called http://www.bloglines.com. I can log on to bloglines wherever I can find an Internet connection and check my feeds. They also have a little program that you can install on your computer, which monitors your feeds and notifies you whenever a feed has new content.

You will notice that I have added Atom and RSS feed links to this site's sidebar. These are the URLs that you would copy and paste into a feed aggregator in order to subscribe to this blog. A lot of sites have them and now I know what they're for!

Happy Christmas everyone!


December 23, 2004

C# 2.0 to C++ translator

Generics are just one of the many new features of C# 2.0. Other notable additions are anonymous methods and iterators. C# has always had language features without direct C++ equivalents. But these latest additions, and the possibility of subtle differences between the semantics of C# generics and C++ templates, are giving me second thoughts about generating C++ by traversing a C# syntax tree. For example, consider this iterator in C#:


class Stack<T>: IEnumerable<T>
{
    T[] items;
    int count;
    public void Push(T item) {
        if (items == null) {
            items = new T[4];
        }
        else if (items.Length == count) {
            T[] newItems = new T[count * 2];
            Array.Copy(items, 0, newItems, 0, count);
            items = newItems;
        }
        items[count++] = item;
    }
    public T Pop() {
        T result = items[--count];
        items[count] = default(T);
        return result;
    }
    public IEnumerator<T> GetEnumerator() {
        for (int i = count - 1; i >= 0; --i) yield return items[i];
    }
}

How would the "GetEnumerator" method be translated into C++? Notice the use of the new "yield" keyword. This new feature automatically generates an iterator class that can be used, for example, with the existing foreach construct:

Stack<int> stack = new Stack<int>();
stack.Push(1);
stack.Push(2);
stack.Push(4);
foreach(int i in stack)
{
    Console.WriteLine(i);
}
// Outputs 4, 2, 1

Essentially, the body of the "GetEnumerator" method and the body of the foreach construct are woven together. The yield instruction is substituted with the body of the foreach loop and the foreach loop itself is then substituted with the body of the "GetEnumerator" method. This is very powerful stuff! Unfortunately C++ has no equivalent. My C# to C++ translator would be responsible for all the substitutions. Too much hard work for a lazy programmer like me! And this is just one of the examples of C# constructs without direct C++ equivalents. Delegates?
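To give a feel for the work involved, here is a hand-written C++ sketch of the kind of iterator class a translator would have to produce for "GetEnumerator". It is only an approximation under my own assumptions: I use `std::vector` for storage, and the `MoveNext` and `Current` names simply mirror the .NET enumerator interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
class Stack {
public:
    void Push(const T& item) { items_.push_back(item); }

    T Pop() {
        assert(!items_.empty());
        T result = items_.back();
        items_.pop_back();
        return result;
    }

    // The loop variable of the C# method becomes a member of this class,
    // so iteration can pause at each "yield" and resume later.
    class Enumerator {
    public:
        explicit Enumerator(const std::vector<T>& items)
            : items_(items), index_(items.size()) {}

        // Advances to the next element, from the top of the stack down.
        bool MoveNext() {
            if (index_ == 0) return false;
            --index_;
            return true;
        }

        const T& Current() const { return items_[index_]; }

    private:
        const std::vector<T>& items_;
        std::size_t index_;
    };

    Enumerator GetEnumerator() const { return Enumerator(items_); }

private:
    std::vector<T> items_;
};

// The equivalent of the foreach loop, collecting instead of printing.
std::vector<int> enumerate(const Stack<int>& stack) {
    std::vector<int> seen;
    Stack<int>::Enumerator e = stack.GetEnumerator();
    while (e.MoveNext()) seen.push_back(e.Current());
    return seen;
}
```

This case is easy because the loop has a single yield. A method with several yields inside nested control flow needs a proper state machine, and that is the part I would rather not write by hand.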

Fortunately I think C# 2.0 might come to my rescue, or at least the new .NET class library supplied with C# 2.0. The reflection API has been extended with an awesome new feature. The program structure is now exposed all the way down to the level of individual MSIL assembly instructions. It should be relatively easy for me to go the MSIL byte code to C++ translator route, as I did for my second Java compiler. The process would be:
  1. Use any C# compiler (e.g. Microsoft's) to compile C# 2.0 source code into byte-code assemblies.
  2. My translator loads the assemblies, examines them using the new reflection API and outputs new C++ source code for those that have changed.
  3. Any C++ compiler compiles the C++ code into native code for whatever target platform.
The main advantage of translating from byte-code is I avoid having to deal with any higher-level language features not present in C++. I can treat C++ as a kind of structured assembly language. The main disadvantage is the generated C++ code will lose most of its high level structure and will most likely be practically unreadable to a human.


December 19, 2004

Visitor pattern through runtime code generation

I admit this is a little silly. It could be described as overkill for implementing the visitor pattern. However, I thought it might be fun to try out some of the .NET framework's runtime code generation facilities. Specifically, I found out how to instantiate generic methods by binding their generic parameters at runtime. Try and do that in C++!

This implementation is based on the previous one, which used reflection to locate the appropriate Visit method and invoke it. This time, I am using runtime code generation to instantiate a new method that will call it directly. A delegate reference to the new method is cached in a dictionary so that it can be invoked again if the same Visit method is required again.

For some things, this might be faster than the reflection based approach because it is not necessary to search the metadata at runtime every time Visit needs to be invoked.
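The caching half of the idea, at least, carries over to C++. Below is a rough sketch that keys a table of invokers on `std::type_index`. The `CachingVisitor` class and its `Resolve` function are my own stand-ins: `Resolve` plays the part of the slow metadata search, and there is no runtime code generation here at all.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

struct Modem { virtual ~Modem() = default; };
struct HayesModem : Modem {};
struct ZoomModem : Modem {};

class CachingVisitor {
public:
    using Invoker = std::function<std::string(Modem&)>;

    std::string Accept(Modem& modem) {
        std::type_index key(typeid(modem));  // dynamic type of the modem
        auto it = invokers_.find(key);
        if (it == invokers_.end()) {
            // First visit of this type: do the slow search once and
            // remember the result for every later call.
            ++slow_lookups_;
            it = invokers_.emplace(key, Resolve(modem)).first;
        }
        return it->second(modem);
    }

    int slow_lookups() const { return slow_lookups_; }

private:
    // Stand-in for the reflective search for a suitable Visit method.
    static Invoker Resolve(Modem& modem) {
        if (dynamic_cast<HayesModem*>(&modem)) {
            return [](Modem&) { return std::string("hayes"); };
        }
        if (dynamic_cast<ZoomModem*>(&modem)) {
            return [](Modem&) { return std::string("zoom"); };
        }
        return [](Modem&) { return std::string("unhandled"); };
    }

    std::unordered_map<std::type_index, Invoker> invokers_;
    int slow_lookups_ = 0;
};
```

Visiting the same kind of modem twice only pays for the search once, which is the same trade the delegate dictionary makes in the C# version.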

Also, over the course of several refactors, I reorganized the visitor code so that it is entirely independent of the modem code and vice-versa. The visitor code is now fully generic and could be used to apply the visitor pattern to any object hierarchy (which is good because I need to use it to traverse ASTs for my C# to C++ compiler!).

I am sure there is a less complicated way of doing this, which I hope to uncover in time.


public abstract class Modem
{ }
public class HayesModem : Modem
{ }
public class Hayes2Modem : HayesModem
{ }
public class ZoomModem : Modem
{ }
public class VroomModem : Modem
{ }

public class ConfigureDOSModemVisitor: Visitor
{
public void Visit(HayesModem modem)
{ }

public void Visit(ZoomModem modem)
{ }
}

public class VisitorBindException : Exception
{
public VisitorBindException(Type visitorType, Type visitedType):
base(String.Format("Visitor {0} has no suitable Visit method for visited {1}.",
visitorType, visitedType))
{
}
}

public class Visitor
{
public void Accept(object visited)
{
Type visitedType = visited.GetType();
VisitInvoker invoker;
if (!invokers.TryGetValue(visitedType, out invoker))
{
MethodInfo visitMethod = FindCompatibleVisitMethod(visitedType);
invoker = CreateVisitDelegate(visitMethod.GetParameters()[0].ParameterType);
invokers[visitedType] = invoker;
}
invoker(visited);
}

private MethodInfo FindCompatibleVisitMethod(Type visitedType)
{
Type visitorType = GetType();
MethodInfo visitMethod = visitorType.GetMethod("Visit",
BindingFlags.Public | BindingFlags.Instance,
null, new Type[] { visitedType }, null);
if (visitMethod == null)
{
throw new VisitorBindException(visitorType, visitedType);
}
return visitMethod;
}

private VisitInvoker CreateVisitDelegate(Type paramType)
{
MethodInfo genericMethod = typeof(Visitor).GetMethod("CreateVisitDelegate",
BindingFlags.NonPublic | BindingFlags.Instance,
null, new Type[0], null);
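// BindGenericParameters is the .NET 2.0 beta name for this call;
// the released framework calls it MakeGenericMethod.
MethodInfo method = genericMethod.BindGenericParameters(
new Type[] { paramType });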
return (VisitInvoker)method.Invoke(this, new object[0]);
}

private VisitInvoker CreateVisitDelegate<VisitParamType>()
where VisitParamType : class
{
Visit<VisitParamType> visit = (Visit<VisitParamType>)
Delegate.CreateDelegate(typeof(Visit<VisitParamType>),
this, "Visit");

return delegate(object visited)
{
visit((VisitParamType)visited);
};
}

private delegate void Visit<VisitParamType>(VisitParamType visited);
private delegate void VisitInvoker(object visited);
private Dictionary<Type, VisitInvoker> invokers = new Dictionary<Type, VisitInvoker>();
}

public class Program
{
public static void Main(string[] args)
{
ConfigureDOSModemVisitor visitor = new ConfigureDOSModemVisitor();
visitor.Accept(new HayesModem());
visitor.Accept(new Hayes2Modem());
visitor.Accept(new ZoomModem());

// Throws exception
visitor.Accept(new VroomModem());
}
}


Visitor pattern using reflection

As I said in this post, I think reflection is one of the most powerful features of languages like Java and C#. By using reflection, I can eliminate all of the redundancy from my simple implementation of the visitor pattern. Admittedly, there might be some performance issues if I make heavy use of it.

Notice that there is absolutely no code needed in the derived modem classes to support their visitors. The modem base class does all the work. It uses reflection to search the visitor class for an appropriate Visit method at runtime. Also, there is no need to implement visitor interfaces in the visitor class.


public abstract class Modem
{
public void Accept(object visitor)
{
Type modemType = GetType();
Type visitorType = visitor.GetType();
MethodBase acceptMethod = visitorType.GetMethod("Visit",
BindingFlags.Public | BindingFlags.Instance, null,
new Type[] { modemType }, null);
if (acceptMethod != null)
{
acceptMethod.Invoke(visitor, new object[] { this });
}
else
{
// Throw exception or do nothing as appropriate
}
}
}

public class HayesModem : Modem
{
}

public class ZoomModem : Modem
{
}

public class ConfigureDOSModemVisitor
{
public void Visit(HayesModem modem)
{ }

public void Visit(ZoomModem modem)
{ }
}


Revisiting the Visitor pattern

The visitor pattern is a very useful technique. I intend to use it for implementing the various passes over the C# abstract syntax tree (AST) that my C# to C++ translator will need to make.

One of the key benefits of the visitor pattern is that it gives you the effect of injecting new virtual functions into a class hierarchy without actually having to modify those classes. The big win for me is I will need to make only minimal changes to the C# compiler code, which will make it easy for me to merge in future versions of the code when they become available.
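For readers who have not met the pattern, here is the textbook form sketched in C++. The modem names match the examples used elsewhere on this blog, while `NameVisitor` and `name_of` are invented for this illustration.

```cpp
#include <cassert>
#include <string>

struct HayesModem;
struct ZoomModem;

// One Visit overload per concrete class in the hierarchy.
struct ModemVisitor {
    virtual ~ModemVisitor() = default;
    virtual void Visit(HayesModem& modem) = 0;
    virtual void Visit(ZoomModem& modem) = 0;
};

struct Modem {
    virtual ~Modem() = default;
    virtual void Accept(ModemVisitor& visitor) = 0;
};

// Accept is the only code the hierarchy needs. Inside it, *this has its
// concrete static type, so the right Visit overload is chosen.
struct HayesModem : Modem {
    void Accept(ModemVisitor& visitor) override { visitor.Visit(*this); }
};

struct ZoomModem : Modem {
    void Accept(ModemVisitor& visitor) override { visitor.Visit(*this); }
};

// A new operation, added without touching the modem classes again.
struct NameVisitor : ModemVisitor {
    std::string name;
    void Visit(HayesModem&) override { name = "Hayes"; }
    void Visit(ZoomModem&) override { name = "Zoom"; }
};

std::string name_of(Modem& modem) {
    NameVisitor visitor;
    modem.Accept(visitor);
    return visitor.name;
}
```

`name_of` behaves like a virtual function on `Modem`, yet it lives entirely outside the hierarchy, which is why the pattern suits code I do not want to modify.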

This article addresses some issues with the visitor pattern.

It occurred to me that by using the new features of C# 2.0, I can do a little better than the improved visitor pattern proposed in this article. Following the Don't Repeat Yourself principle, I can use C# generics to eliminate some of the redundancy.

Compare this example code to that in the aforementioned article. The individual multiply-inherited visitor base classes for specific kinds of modem are replaced by a single generic interface. The Accept methods in each modem class are simplified with most of the logic moving into the generic Dispatch method of the base class.

Sorry about the code formatting. I'm having to type it in by hand as HTML. There must be an easier way!


public abstract class Modem
{
public abstract void Accept(object visitor);

protected static void Dispatch<M>(M modem, object visitor) where M : Modem
{
IModemVisitor<M> iface = visitor as IModemVisitor<M>;
if (iface != null)
{
iface.Visit(modem);
}
else
{
// throw exception or do nothing as appropriate
}
}
}

public class HayesModem : Modem
{
public override void Accept(object visitor) { Dispatch(this, visitor); }
}

public class ZoomModem : Modem
{
public override void Accept(object visitor) { Dispatch(this, visitor); }
}

public interface IModemVisitor<M> where M : Modem
{
void Visit(M modem);
}

public class ConfigureDOSModemVisitor:
IModemVisitor<HayesModem>,
IModemVisitor<ZoomModem>
{
public void Visit(HayesModem modem)
{ }

public void Visit(ZoomModem modem)
{ }
}


December 18, 2004

C# to C++ translator

I've had this idea running around my brain for a couple of years now. As you may have guessed from previous posts, I am not the world's biggest fan of C++. That has not always been the case. Let's start from the beginning...

Some time ago (oh boy I can't quite remember exactly how long but I think it was about 13 years ago) I started to learn C++. Before that I had been using C. Then in a typical "imperatively thinking programmer tries to learn OO" manner, over the next 13 years, I gradually started to "get OO". I'm still not there but I'm making continued progress.

Now, a couple of years ago I realized that C++ doesn't "get OO" either. To continue my journey in search of OO nirvana, I would have to switch languages. So I actually switched to two languages: Java and C#. I had the "now I get it" moment while reading Martin Fowler's excellent book called "Refactoring", which I think every programmer should read.

For games development, C# appeals to the pragmatist in me. It's sufficiently like C++ that it won't scare a less open-minded programmer in the way a pure-OO language like Smalltalk would. Also, for most applications, it generates code with similar performance to that generated by C++. And performance is, I think, the line of argument most likely to be used against the idea of using a different language. That and "what's wrong with C++, I know it already?".

How about Java? Java would be good as well but I wrote a couple of Java compilers already and I want to try something new.

And hey it's the weekend and I can do whatever I want!

My overall strategy is to take the C# compiler that comes with the Mono project and modify the back-end to output C++ source code instead of MSIL byte-code. This way most of the work is done for me. The Mono C# compiler already does most of the front-end work and any C++ compiler can act as the back-end. All I have to do is glue them together in the middle.

Another option would be to translate byte-code into C++. In fact that was the way my Java compilers worked. The big disadvantage is that a lot of the program structure is lost in the byte-code, so I would have little option but to generate spaghetti code.

Then I can write C# programs for any platform with a C++ compiler, including game consoles.


Other games industry blogs

I discovered this games industry blog. In addition to being interesting in its own right, it is also cool because it links to lots of other industry blogs.

One thing I notice is very few of my fellow game developer bloggers are programmers. Well, I am going to do my best to change that! Expect more posts on the technical aspects of games development in the future.

I also need to think about why programmers are not blogging as much as those in other disciplines. It doesn't seem healthy. We need to communicate with each other more. Of course, blogging is just one way of communicating ideas.

For example, there are some excellent games programming mailing lists. But unfortunately these can get quite competitive and often degenerate into "I'm smarter", "No you're wrong, I'm smarter" type exchanges.

There are the annual games development conferences like GDC. These are great for one-way communication where the speaker is (hopefully) very knowledgeable about a certain subject and can perhaps impart some useful information to those of us attending the conference. There isn't usually very much ad-hoc peer-to-peer communication though, between programmers at least.

I can highly recommend visiting Brian Hook's site. It isn't strictly speaking a blog but I think it serves a similar purpose. Brian programs games but I am not sure if he would appreciate being categorized as a "games programmer". He's more of a renaissance man I think :)


December 16, 2004

Progress bars and why they are awful

The most interesting thing I can say about progress bars is that they start on the left and finish (eventually) on the right.

Brief analysis of "Salutation to the dawn"

I posted this poem recently. I have been thinking about what it means to me.

It focuses on a single day: "Look to this Day". Or alternatively, the time between "For Yesterday is but a Dream" and "Tomorrow is only a Vision".

The metaphor, "For it is Life", means that life and today are the same. This is true in the sense that we experience life in the present. As we experience life, we are experiencing today. Everything prior to today is a memory or a "dream". Yesterday cannot be changed. And everything beyond today is a "vision". Tomorrow cannot yet be lived. The only point in time that we can experience or change is now or today.

The poem is fast paced. The poem as a whole is short, as are the lines. This reflects the fast pace of life as lived by a person who focuses on today.

It also casts today in a positive light. After all, no matter how bad the past has been, it doesn't matter. We can always make things better today. The past can be viewed as the path that led us to the point where we can make things better today, a "Dream of Happiness". And for a person who makes things better one day at a time, the future is a "Vision of Hope".

Even if the world around us won't give us a clean slate, we can always give ourselves a clean slate.

So I agree that we should recognize the potential of the dawn and the opportunities it opens. Or as the Romans said, "carpe diem" or "seize the day".


December 14, 2004

What is failure?

What does failure mean? A recent experience at work taught me that failure means different things to different people.

Some people have a natural curiosity. They delight in discovering new things: in learning. They have no fixed goal, or at least, with every success or failure, their goals change. They are motivated simply by the desire to solve problems and to improve themselves and those around them. For these people, failure is not something to be feared. It is a regular and necessary part of their approach to making progress.

On the other hand, some people have a different motivation. They feel a need to succeed in the eyes of others. In fact, to do otherwise would be a failure. And to these people, failure is something to be feared. They need to show results now. The skills that are familiar to them are a safer bet because, in the short term, their existing knowledge will get them the results they want in the shortest possible time. These people resist new ideas because things that are unknown to them open the door to the possibility of failure.

In a nutshell, some people pave the way forward and others follow their path.

I am not saying that everybody is one of these two kinds of people. We all have some of both. And it also depends on the context. For example, I think of myself as an open minded programmer. I like to try new kinds of food at every opportunity. But when I go shopping for clothes, I am quite conservative. I try and stick to wearing what I have found is acceptable to others. I have little desire to experiment.

Getting back to games development, where would the ideal programmer fit into this model? I initially thought that she would be somewhere in the middle: somewhat motivated by learning and somewhat motivated by getting immediate results. But now I am not so sure. Getting results is obviously important. We work for businesses that want commercial success.

But I think that getting results by blinding ourselves to the world around us is the wrong way of going about it. Instead it is better to be as open to new ideas as possible. But to counter this, we need to appreciate what our real motivations are. Then we must apply the discipline necessary to get the job done. It is much easier for the curious programmer to be disciplined than for the closed minded programmer to accept new ideas.

We must sometimes pave the way forward but also stand on the shoulders of giants.


A poem I liked

Salutation to the Dawn

Listen to the Exhortation of the Dawn!
Look to this Day!
For it is Life,
The very Life of Life.
In its brief course lie all
The verities and realities
Of your Existence;
The Bliss of Growth,
The Glory of Action,
The Splendor of Beauty;
For Yesterday is but a Dream,
And Tomorrow is only a Vision;
But Today well lived makes every
Yesterday a Dream of Happiness,
And every Tomorrow
A Vision of Hope.
Look well therefore to this Day!
Such is the Salutation of the Dawn....

- Kalidasa
(Indian dramatist)


December 12, 2004

Hypocrisy

Yesterday I said that "if it's not C++ then it must be a scripting language" was a false dichotomy. I think I am guilty of a false dichotomy, actually a "false trichotomy". I have been thinking in terms of "C++ and higher level languages". Of course it follows that there should be languages that are lower level than C++. So I am categorizing languages in terms of being C++, being lower level or being higher level. This is clearly wrong.

Is Pascal lower level than C++? In a sense it is, because C++ has higher-level features like OO and templates that Pascal does not. However, Pascal has features like runtime checking on array bounds that C++ does not. According to the model I have been using, this would make Pascal a higher level language. So we have a contradiction. It is clear to me now that my model is wrong. Perhaps because I spend so much time programming in C++, I have a very C++-centric perspective.

From now on I am going to try to think in terms of "C++ and other languages". And when my understanding of other languages is sufficient, I will try and just think in terms of languages and how they compare. C++ will just be one of the other languages.

Does "higher-level language" mean anything? It is certainly part of our lexicon. In a sense it means something. Who would argue that 68000 assembly was a higher-level language than Lisp? But perhaps that is because there is not a single feature of 68000 assembly that makes it higher-level than Lisp.

I think it is reasonable to say that one language is higher-level than another with respect to a particular feature set. So we can say that C++ is higher-level than Pascal with respect to OO and that Pascal is higher-level than C++ with respect to arrays.

Critique of C++

I read this critique of C++. I agree with some of it. I don't think anyone would argue that C++ is perfect. C++ has its problems. And this critique digs deep into every aspect of C++ that might be considered as a failing. However, it was very negative and the author is clearly a highly biased Eiffel advocate so I didn't take it terribly seriously.


Introspection kicks ass

Roughly 5 years ago I experienced an epiphany. Until that point I had been using C++ for a few years and I was quite confident that although other languages, like Java, had some advantages over C++, they were pretty minor and the better performance of the resulting code made C++ the best language for my problem (writing games). The advantages of Java, as I viewed them then, were essentially these:

I was also unhappy to lose C++ templates. Constantly having to use explicit dynamic type conversions when using container objects really irritated me and I was aware of their performance cost. That said, I preferred Java for non-games projects, where performance was not a concern for me.

But I was completely missing the point. One thing that lifts Java above languages like C and C++ (and a feature present in many other higher level languages) is introspection (also called reflection). This is an awesome language feature. Since then I have tried to emulate this feature in C++ with varying degrees of success.

Introspection is where a program is able to examine its own structure as it runs. So for example, if it has a reference to an object it can iterate over all the properties of that object, get their names and values and then save them to a file. This has all kinds of useful applications for games. It can be used for loading and saving game state, meaning you don't need to write any per-class serialization code. A level editor can use it to provide a view of the game state to a designer, who can then edit it. It can be used to automatically synchronize game state in a network game, etc, etc.
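As a rough illustration of the "iterate over all the properties" idea, here is a minimal Java sketch using the standard java.lang.reflect API. The Player class, its fields and the FieldDumper helper are hypothetical names invented for this example; a real save system would also need to handle inherited fields, nested objects and references.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: knows nothing about Player at compile time.
class FieldDumper {
    // Returns field name -> current value for every non-static field
    // declared directly on the object's class (inherited fields are skipped).
    static Map<String, Object> dump(Object obj) {
        Map<String, Object> out = new TreeMap<>();
        for (Field field : obj.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            field.setAccessible(true); // allow reading non-public fields
            try {
                out.put(field.getName(), field.get(obj));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Print each field as "name = value"; this could equally be written to a file.
        for (Map.Entry<String, Object> entry : dump(new Player()).entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}

// Hypothetical game object, invented for illustration.
class Player {
    String name = "hero";
    int health = 100;
    float speed = 2.5f;
}
```

Note that Player contains no serialization code of its own; adding a field to it changes what is dumped without touching FieldDumper.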

This may not seem like such a big deal. Writing explicit code to do all this stuff is pretty straightforward. But having used introspection, I understand that this is the code that causes a lot of the friction when you are mid-way through a large project with a large team. When that code just works, everything goes much more smoothly.

It's like inheritance on steroids. When I started out with OO programming, the idea that I could just derive from another class and get all of its features for free amazed me. With introspection you can reuse other code but in a way that cuts straight across the class hierarchy. For example, if you want to be able to load game state from a file, you write some general purpose code that knows how to serialize a graph of objects by examining their state through an introspection API. Having written that general purpose code, any object that conforms to the object model expected by the serialization code can be loaded from a file.
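A small Java sketch of that "cuts across the class hierarchy" point, under some loud assumptions: the saved state is a flat map of field name to text rather than a real file or object graph, only a handful of field types are supported, and Door, Enemy and StateLoader are hypothetical names made up for the example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;

// Hypothetical general purpose loader: works on any class whose fields
// use the supported types, with no shared base class required.
class StateLoader {
    // Copies matching entries from 'saved' into the target's declared fields.
    static <T> T load(T target, Map<String, String> saved) {
        for (Field field : target.getClass().getDeclaredFields()) {
            String text = saved.get(field.getName());
            if (text == null || Modifier.isStatic(field.getModifiers())) {
                continue; // nothing saved for this field, or not instance state
            }
            field.setAccessible(true);
            try {
                field.set(target, parse(field.getType(), text));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return target;
    }

    // Converts saved text to the field's declared type (assumed set of types).
    static Object parse(Class<?> type, String text) {
        if (type == int.class) return Integer.parseInt(text);
        if (type == float.class) return Float.parseFloat(text);
        if (type == boolean.class) return Boolean.parseBoolean(text);
        if (type == String.class) return text;
        throw new IllegalArgumentException("unsupported field type: " + type);
    }

    public static void main(String[] args) {
        Map<String, String> saved = new HashMap<>();
        saved.put("locked", "true");
        saved.put("hitPoints", "40");
        saved.put("name", "grunt");
        saved.put("speed", "1.5");
        Door door = load(new Door(), saved);
        Enemy enemy = load(new Enemy(), saved);
        System.out.println(door.locked + " " + door.hitPoints);
        System.out.println(enemy.name + " " + enemy.speed + " " + enemy.hitPoints);
    }
}

// Two unrelated hypothetical game classes; neither knows about StateLoader.
class Door {
    boolean locked;
    int hitPoints;
}

class Enemy {
    String name;
    float speed;
    int hitPoints;
}
```

The same load method fills in both Door and Enemy even though they share no base class or interface, which is the kind of reuse inheritance alone does not give you.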

Some things have changed since then. We are far less concerned about performance than we were then. The higher level languages have matured a lot. They have really good IDEs and tools. Java even has templates now!
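To make the container complaint concrete, here is a short Java sketch contrasting the older cast-heavy style with the generic collections that arrived in Java 5. The ContainerCasts class and its method names are hypothetical, chosen only for this comparison.

```java
import java.util.ArrayList;
import java.util.List;

class ContainerCasts {
    // Older style: the list holds plain Objects, so every read needs an
    // explicit cast that is only checked at run time.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int sumWithCasts() {
        List scores = new ArrayList();
        scores.add(Integer.valueOf(10));
        scores.add(Integer.valueOf(32));
        int total = 0;
        for (Object item : scores) {
            total += ((Integer) item).intValue();
        }
        return total;
    }

    // Generic style: the element type is part of the list's type, so the
    // compiler rejects wrong element types and no cast appears in the source.
    static int sumWithGenerics() {
        List<Integer> scores = new ArrayList<>();
        scores.add(10);
        scores.add(32);
        int total = 0;
        for (int score : scores) {
            total += score;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumWithCasts());
        System.out.println(sumWithGenerics());
    }
}
```

One caveat: Java generics are implemented by type erasure, so the compiler still inserts checked casts behind the scenes. The gain is in safety and readability rather than in the performance cost mentioned earlier, which is one way they differ from C++ templates.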

So I'm not saying we should all abandon C++ for Java. That is not a good idea or even feasible. C++ is an excellent language for all kinds of things:

Neither am I saying we should immediately start using Java wherever it is appropriate. Java was just an example. There are many excellent languages that could fill roles as higher level languages in games.



Reinventing the wheel

In thinking about using other languages in games, I can't help but consider inventing a new language tailor made for the higher level aspects of game development. This is exactly what Tim Sweeney did with UnrealScript for the Unreal engine. I have worked on a title that used the Unreal engine and it certainly improved programmer productivity for that project. But what worked for Epic might not work out so well for the rest of us.

Let's weigh up the pros and cons. The obvious advantage is that you get a language that is exactly what you need, or at least exactly what you think you need as you develop the language. Only the test of time will tell you whether it really is exactly what you need. And needs always change. To my mind an advantage of using an existing language is that, although it is not exactly what you need, at least it has been proven on a lot of other projects in the broader domain of IT. So although it might not be perfect, at least you are less likely to discover that it is really bad at something when you get further into game development.

It is also worth considering what happened to Naughty Dog. They developed their own Lisp variant and used it for a number of projects. But when their Lisp guru moved on to other things they were no longer able to continue using that language. Like me, they are currently looking into alternatives to C++. Depending so heavily on one programmer who really understands the language is not a good idea.

An obvious disadvantage of developing a language in house is that it is a lot of effort. Not only do you need to implement a new compiler or interpreter, you potentially need to implement a runtime environment, a debugger and IDE support. You also need to document the language. By using an existing language, you are reusing not only the language itself but all the supporting tools and documentation.

You will also find that you are debugging your new language throughout development. If there is a bug, is it a bug in the game code or the language?

If you have succumbed to the temptation to write a new language, you may be tempted to add new features to the language throughout development of the game. That is both good and bad.

All of the other programmers on the team will need to learn the new language. If you use an existing language then some of them might know it already and you might be able to hire programmers who already know it.

There will be no third party libraries available for your new language. If you want an XML parser then you have to write a new one. If you used Python or Java you would have 100s (slight exaggeration) of options.

An advantage of rolling your own is that there are absolutely no licensing issues. On the other hand, there are plenty of existing languages with unrestrictive licenses.

There is another option. I am still on the fence on this one. I have written a prototype that translates Java byte code (.class files) into C++ source code. Now although I have written a compiler, I haven't invented a new language. I can reuse existing IDEs, debuggers, 3rd party libraries and language documentation.


An empirical comparison of C, C++, Java, Perl, Python, Rexx, and Tcl

If only it was that easy. But, in his paper, this fellow has attempted just that. It's really quite interesting but flawed. I don't want to spoil it for you but what basically happens is he specifies a simple problem and has 80 different programmers independently solve it using whatever language they want. Then he collates the results and analyzes them in terms of programmer productivity, program reliability, program performance, etc.

It sounds like a really objective way of comparing languages right? Well almost. Unfortunately he chooses a problem that is ideally suited to scripting languages so it's hardly any surprise when they come out best. I don't think that this kind of experiment is going to produce useful results unless the problem being specified is reasonably large scale. That is not because I think scripting languages are bad for larger scale programming. It is because languages like Java, C and C++ are especially bad for very simple programs. But I don't really care about simple programs because I spend virtually no time writing them.

But it's not a complete loss. I am interested in how much more productive programmers can become when they use a scripting language. So any productivity boost for a problem tailored to scripting languages is a kind of upper bound on the improvement I might see for a problem that was not so tailored. I think.

The conclusion was that the script programmers took half as long as the Java / C / C++ programmers and produced half as many lines of code to solve the same (tailor made) problem. That comes as a relief to me. Some scripting language advocates will try and make you believe scripting language X is going to make you 10 times faster.

So it looks as though there might be some potential benefit in using a scripting language for game code, but it is by no means a silver bullet.


December 11, 2004

Scripting and false dichotomy

I read this article about scripting languages for games and I thought there was some useful content and some not so useful. Specifically, I found the comparison of Lua, Ruby and Python quite interesting. But I don't think the author's experience of some of the other languages, or understanding of their problem domain, is sufficient to include them in the comparison.

When discussing the topic of higher-level languages for games with colleagues, an attitude I commonly come across, and one which is implied by this article, goes something like, "if it's not C++ then it must be a scripting language". I think this is wrong for a number of reasons and it is a barrier to objectively considering which languages are valuable for use in games development.

First of all, what exactly is scripting and what is a scripting language? I would argue that scripting is what you do when you write scripts. A script is a simple program that automates a process, often gluing together a sequence of steps that might be individually carried out by hand.

Many people would describe Python, Ruby, Lua, Rexx and JavaScript as scripting languages. I think that is a good description. The inventors of these languages would, I think, be happy with the description because, when designing these languages, they considered scripting as one of the major applications. This doesn't mean that these languages can only be used for scripting or that scripts cannot be written in other languages.

Whole games can be written in Python, although I don't think we will see a console FPS written exclusively in Python for some time! And if you wanted to write scripts in C, you could do it, but it is hardly the ideal language.

When a technical designer uses a language to customize the behavior of game objects, that could accurately be described as scripting. And a scripting language, like Python, is a good language for them to use. But by restricting ourselves to that application, we are missing opportunities both in terms of what we use higher-level languages for and which higher-level languages we use.

Also, "if it's not C++ then it must be a scripting language" is a false dichotomy. If Java is not C++, does that make it a scripting language? How about SQL? Java is a statically typed object oriented language with more in common with C++ than with a scripting language. SQL is a database query language and too domain specific to be a scripting language.

So when a games programmer thinks about using a language other than C or C++, they often automatically think about a single application: customizing the behaviour of game objects. Also, such a programmer will tend to evaluate the suitability of a higher-level language in those terms. No surprise then that C#, Lisp and all the other languages got such a bad "review" in the aforementioned article.

For no particular reason, let's take C# as an example. The author of the article is right in a sense that C# would make a bad scripting language. It would not be a good language for a technical designer to use to customize the behavior of game objects. A language like Python would be a better choice. It is easier to learn, has faster turnaround time, etc, etc.

But that does not mean C# is inappropriate for use in games. It means that C# would be best used to solve different problems. In terms of syntax and semantics, C# is a lot like C++. At the risk of oversimplifying, C# trades off some performance for safer, cleaner and more expressive semantics than C++. It is also typically compiled using a JIT compiler rather than a stand-alone compiler but that is really just an implementation detail. Someone could write a stand-alone C# to native code compiler if they wanted to.

So my point is that a question like, "what is the best scripting language to use in my game?", is dangerous because it is a loaded question. A better question would be "given these game problems, what are the most appropriate languages to use?". There is no single correct answer. It depends on the game, platform(s), availability and quality of tools, ability of the chosen languages to inter-operate, experience of other programmers on the project and all kinds of other factors.


December 05, 2004

The state of the art

Before delving too deeply into the possibilities of using higher-level languages in games, I'm going to attempt to analyze what the current state of the art is in this area. I'm not going to consider the exceptions like Humongous or Naughty Dog. I'm going to try and focus on which languages the majority of teams are using and where they are applying them. Obviously every team is going to do things differently and I have only worked in two studios over the past 5 years so this can't all come from my first hand experience. There is a lot of information available on the Internet and clues as to what other teams are doing can be found on various industry mailing lists. I am going to focus on console games because that is what I am most familiar with.

So I think everyone will agree that the primary language is C or C++. C++ has probably taken the lead these days. The majority of programmers in the industry are, I think, using C or C++ as their exclusive imperative language for game code. There is a little more diversity when it comes to tools. In particular, I have noticed increased use of C#.

We see the most variety when it comes to non-imperative languages, most commonly schema languages. These are XML or other structured text files that describe the format of the game data structures. They are used to help the game code bind to its data, whether loaded from a file or pulled over a network or whatever. They are also frequently used to specify how the data should be presented to designers in their level editing tool and to drive the level editing tool when it saves the data. These languages are usually proprietary languages designed specifically for use with a particular engine.

Many teams have some kind of special purpose state machine language, usually some kind of proprietary structured text syntax. It is also common to code state machines directly in C or C++.

A lot of teams allow technical designers to customize game behavior using some kind of scripting language like Lua or Python or a graphical waypoint or flow-chart language. In most cases this is entirely the domain of the designer. The programmers do not use the design language, unless they are responsible for dealing with C++ / design language bindings.

Schema languages are also sometimes used to automatically generate bindings between scripting languages and C++.

I will add to this page as I gain a better overall understanding of the languages we currently use.


December 01, 2004

Languages and their uses in video games

The more I think about it, the more I think the time is right to start using higher-level languages for games. As an industry, we predominantly use C/C++ as our primary programming language. It is also common to see scripting languages, such as Lua, used as a mechanism to allow technical designers to customize the behavior of game objects. But these are, on the whole, treated as second class languages.

There are of course some exceptions. For example, Humongous Entertainment embedded Python in their game engine and claimed that it made their programmers more productive. Naughty Dog implemented Jak and Daxter II in Lisp, but they had to stop using it when the programmer who understood their Lisp compiler moved on to other things.

So I think there is potentially a lot to be gained by leveraging a higher-level language. But of course there are also some pretty scary pitfalls. I want to better understand the potential benefits and the pitfalls.


So I created a blog

That is all.
