December 23, 2004

C# 2.0 to C++ translator

Generics are just one of the many new features of C# 2.0. Other notable additions are anonymous methods and iterators. C# has always had language features without direct C++ equivalents. But these latest additions, and the possibility of subtle differences between the semantics of C# generics and C++ templates, are giving me second thoughts about generating C++ by traversing a C# syntax tree. For example, consider this iterator in C#:


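using System;
using System.Collections;
using System.Collections.Generic;

class Stack<T> : IEnumerable<T>
{
    T[] items;
    int count;

    public void Push(T item) {
        if (items == null) {
            items = new T[4];
        }
        else if (items.Length == count) {
            // Grow the backing array when it is full.
            T[] newItems = new T[count * 2];
            Array.Copy(items, 0, newItems, 0, count);
            items = newItems;
        }
        items[count++] = item;
    }

    public T Pop() {
        T result = items[--count];
        items[count] = default(T);
        return result;
    }

    // Enumerates from the top of the stack down to the bottom.
    public IEnumerator<T> GetEnumerator() {
        for (int i = count - 1; i >= 0; --i) yield return items[i];
    }

    // IEnumerable<T> also requires the non-generic overload.
    IEnumerator IEnumerable.GetEnumerator() {
        return GetEnumerator();
    }
}
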
How would the "GetEnumerator" method be translated into C++? Notice the use of the new "yield" keyword. This new feature automatically generates an iterator class that can be used, for example, with the existing foreach construct:

Stack<int> stack = new Stack<int>();

stack.Push(1);
stack.Push(2);
stack.Push(4);
foreach (int i in stack)
{
    Console.WriteLine(i);
}
// Outputs 4, 2, 1 (one value per line)

Conceptually, the body of the "GetEnumerator" method and the body of the foreach construct are woven together: each yield hands one value to the body of the foreach loop, and when that body finishes, execution resumes in "GetEnumerator" right after the yield. To make this work, the compiler rewrites the method into a state machine class behind the scenes. This is very powerful stuff! Unfortunately C++ has no equivalent, so my C# to C++ translator would be responsible for all of that rewriting. Too much hard work for a lazy programmer like me! And this is just one example of a C# construct without a direct C++ equivalent. Delegates are another.
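To give a feel for the work involved, here is a rough C++ sketch of the kind of iterator class a translator would have to emit for "GetEnumerator". It is only an illustration, not the output of any real tool: the names StackEnumerator, MoveNext, Current and enumerate_demo are my own assumptions, and a std::vector stands in for the C# array.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical hand-written counterpart of the iterator class that the
// C# compiler generates for GetEnumerator. The loop variable becomes a
// field so that its value survives between calls to MoveNext.
template <typename T>
class StackEnumerator {
public:
    StackEnumerator(const std::vector<T>& items, std::size_t count)
        : items_(items), remaining_(count), current_() {}

    // Advances to the next "yield return"; returns false once the loop ends.
    bool MoveNext() {
        if (remaining_ == 0) return false;
        current_ = items_[--remaining_];
        return true;
    }

    // The value produced by the most recent successful MoveNext.
    const T& Current() const { return current_; }

private:
    const std::vector<T>& items_;
    std::size_t remaining_;  // number of items not yet yielded
    T current_;
};

// What a translated foreach loop over the stack might boil down to.
// Here the loop body just records each value it sees.
std::vector<int> enumerate_demo() {
    std::vector<int> items = {1, 2, 4};  // same pushes as the C# example
    StackEnumerator<int> e(items, items.size());
    std::vector<int> seen;
    while (e.MoveNext()) {
        seen.push_back(e.Current());
    }
    return seen;
}
```

Even for this trivial loop the translator has to turn a local variable into a field and split the method at the yield. Iterators with several yields, nested loops or try/finally blocks would need a full state machine.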

Fortunately I think C# 2.0 might come to my rescue, or at least the new .NET class library supplied with C# 2.0. The reflection API has been extended with an awesome new feature: the program structure is now exposed all the way down to the level of individual MSIL instructions. It should be relatively easy for me to translate from MSIL byte-code to C++ instead, the same route I took for my second Java compiler. The process would be:
  1. Use any C# compiler (e.g. Microsoft's) to compile C# 2.0 source code into byte-code assemblies.
  2. My translator loads the assemblies, examines them using the new reflection API and outputs new C++ source code for those that have changed.
  3. Any C++ compiler compiles the C++ code into native code for whatever target platform.
The main advantage of translating from byte-code is that I avoid having to deal with any higher-level language features not present in C++. I can treat C++ as a kind of structured assembly language. The main disadvantage is that the generated C++ code will lose most of its high-level structure and will most likely be practically unreadable to a human.
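As a rough idea of what "structured assembly" could look like, here is a hand-made C++ sketch of translator output for a small summing loop. Everything about it is an assumption for illustration: the naming scheme (arg0, loc0, stack0, IL_body) is invented, and the MSIL in the comments is a plausible instruction sequence for such a loop rather than the output of a particular compiler.

```cpp
#include <cstdint>

// Hypothetical translator output for a C# method equivalent to:
//   static int Sum(int n) { int s = 0; for (int i = 0; i < n; i++) s += i; return s; }
// Each local and each evaluation-stack slot becomes a plain C++ variable,
// and each branch instruction becomes a goto.
std::int32_t Sum(std::int32_t arg0) {
    std::int32_t loc0 = 0;    // s
    std::int32_t loc1 = 0;    // i
    std::int32_t stack0 = 0;  // evaluation stack slot 0
    std::int32_t stack1 = 0;  // evaluation stack slot 1

    stack0 = 0; loc0 = stack0;          // ldc.i4.0, stloc.0
    stack0 = 0; loc1 = stack0;          // ldc.i4.0, stloc.1
    goto IL_check;                      // br
IL_body:
    stack0 = loc0; stack1 = loc1;       // ldloc.0, ldloc.1
    stack0 = stack0 + stack1;           // add
    loc0 = stack0;                      // stloc.0
    stack0 = loc1; stack1 = 1;          // ldloc.1, ldc.i4.1
    stack0 = stack0 + stack1;           // add
    loc1 = stack0;                      // stloc.1
IL_check:
    stack0 = loc1; stack1 = arg0;       // ldloc.1, ldarg.0
    if (stack0 < stack1) goto IL_body;  // blt
    stack0 = loc0;                      // ldloc.0
    return stack0;                      // ret
}
```

The for loop has dissolved into labels and gotos, which is exactly the loss of structure I mean. A decent C++ optimizer should still be able to clean up the temporaries.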

