Rules for Translating Code from C# to C++: Object Creation and Method Calls

Sometimes the behavior of code written in C# and C++ can differ. Let's take a closer look at how CodePorting.Translator Cs2Cpp handles such differences and ensures the correctness of the code translation. We will also learn how the conversion of unit tests is carried out.

Object Creation and Initialization

To create objects of reference types, we use the MakeObject function (analogous to std::make_shared), which creates an object with the new operator and immediately wraps it in a SharedPtr. This avoids the problems that raw pointers introduce, but it raises an access-rights issue: as a free function defined outside all classes, MakeObject has no access to private constructors. Declaring it a friend of classes with non-public constructors was not an option, because that would make it possible to create such classes from any context. Instead, the external version of the function is limited to public constructors, and static MakeObject methods are added for non-public constructors, with the same access level and the same arguments as the constructor they proxy.
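The split between the external function and the static proxy can be sketched with std::shared_ptr. This is only an illustration under stated assumptions: MakeObjectSketch, PublicCtor, PrivateCtor and Create are hypothetical names, and the real MakeObject and SharedPtr are library types that are not reproduced here.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for the external MakeObject: a free function,
// so it can only reach public constructors.
template <typename T, typename... Args>
std::shared_ptr<T> MakeObjectSketch(Args&&... args)
{
    return std::shared_ptr<T>(new T(std::forward<Args>(args)...));
}

class PublicCtor
{
public:
    explicit PublicCtor(int v) : value(v) {}
    int value;
};

class PrivateCtor
{
public:
    // Public factory that goes through the private static proxy.
    static std::shared_ptr<PrivateCtor> Create(int v) { return MakeObject(v); }
    int value;

private:
    explicit PrivateCtor(int v) : value(v) {}

    // Static proxy with the same access level and arguments as the constructor.
    static std::shared_ptr<PrivateCtor> MakeObject(int v)
    {
        return std::shared_ptr<PrivateCtor>(new PrivateCtor(v));
    }
};
```

In this sketch, MakeObjectSketch<PrivateCtor>(1) would not compile, because the free function cannot call the private constructor, while PrivateCtor::Create(1) works because the static proxy is a member of the class.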

C# programmers often use object initializers as part of the object creation expression. In C++, the corresponding code has to be wrapped in an immediately invoked lambda function, because otherwise the initializers cannot be written as part of a single expression:

Foo(new MyClass() { Property1 = "abc", Property2 = 1, Field1 = 3.14 });

After translation, it looks like this:

Foo([&]{ auto tmp_0 = System::MakeObject<MyClass>();
        tmp_0->set_Property1(u"abc");
        tmp_0->set_Property2(1);
        tmp_0->Field1 = 3.14;
        return tmp_0;
    }());

Calls, Delegates, and Anonymous Methods

Method calls are translated as is. When dealing with overloaded methods, it is sometimes necessary to cast argument types explicitly, because the overload resolution rules in C++ differ from those in C#. Consider, for example, the following code:

class MyClass<T>
{
    public void Foo(string s) { }
    public void Bar(string s) { }
    public void Bar(bool b) { }
    public void Call()
    {
        Foo("abc");
        Bar("def");
    }
}

After translation, it looks like this:

template<typename T>
class MyClass : public System::Object
{
public:
    void Foo(System::String s)
    {
        CODEPORTING_UNUSED(s);
    }
    void Bar(System::String s)
    {
        CODEPORTING_UNUSED(s);
    }
    void Bar(bool b)
    {
        CODEPORTING_UNUSED(b);
    }
    void Call()
    {
        Foo(u"abc");
        Bar(System::String(u"def"));
    }
};

Note: The calls to Foo and Bar inside the Call method are written differently. Without an explicit call to the System::String constructor, the Bar overload that accepts a bool would be selected, because converting a string literal (a pointer) to bool is a standard conversion, which C++ ranks above the user-defined conversion to System::String. In the case of the Foo method, there is no such ambiguity, and the translator generates simpler code.
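The effect can be reproduced outside the translator's library. The sketch below assumes a hypothetical Str type with a converting constructor as a stand-in for System::String; the overload names mirror the example above.

```cpp
#include <string>

// Hypothetical minimal string type standing in for System::String.
struct Str
{
    Str(const char16_t* s) : data(s) {}
    std::u16string data;
};

std::string Bar(Str) { return "string"; }
std::string Bar(bool) { return "bool"; }

// The literal decays to a pointer; pointer-to-bool is a standard conversion
// and beats the user-defined conversion to Str.
std::string CallWithLiteral() { return Bar(u"def"); }

// Constructing the argument explicitly selects the intended overload.
std::string CallWithExplicitType() { return Bar(Str(u"def")); }
```

Here CallWithLiteral() returns "bool" and CallWithExplicitType() returns "string", which is why the translator emits the explicit constructor call for Bar.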

Another example of C# and C++ behaving differently is the instantiation of generic code. In C#, type-parameter substitution occurs at runtime and does not affect the resolution of calls within generic methods: the overload is chosen once, when the generic method is compiled. In C++, template argument substitution happens at compile time and overload resolution is performed again for each instantiation, so C# behavior must be emulated. For instance, consider the following code:

class GenericMethods
{
    public void Foo<T>(T value) { }
    public void Foo(string s) { }
    public void Bar<T>(T value)
    {
        Foo(value);
    }
    public void Call()
    {
        Bar("abc");
    }
}

After translation, it looks like this:

class GenericMethods : public System::Object
{
public:
    template <typename T>
    void Foo(T value)
    {
        CODEPORTING_UNUSED(value);
    }
    void Foo(System::String s);
    template <typename T>
    void Bar(T value)
    {
        Foo<T>(value);
    }
    void Call();
};
void GenericMethods::Foo(System::String s)
{
}
void GenericMethods::Call()
{
    Bar<System::String>(u"abc");
}

Here, it is important to note the explicit template arguments in the calls to Foo and Bar. In the first case, the argument is necessary because otherwise the instantiation for T=System::String would call the non-template overload of Foo, which differs from C# behavior. In the second case, the argument is needed because otherwise T would be deduced from the type of the string literal rather than as System::String. In general, the translator almost always has to specify template arguments explicitly to avoid unexpected behavior.
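Both effects can be observed in a small standalone sketch. It uses std::string in place of System::String, and GenericSketch, BarDeduced, BarExplicit and IsStringParam are hypothetical names introduced only for this illustration.

```cpp
#include <string>
#include <type_traits>

struct GenericSketch
{
    template <typename T>
    std::string Foo(T) { return "generic"; }
    std::string Foo(std::string) { return "string"; }

    // Unqualified call: for T = std::string the non-template Foo wins.
    template <typename T>
    std::string BarDeduced(T value) { return Foo(value); }

    // Explicit template argument: only the template Foo is considered.
    template <typename T>
    std::string BarExplicit(T value) { return Foo<T>(value); }
};

// Reports whether the template parameter ended up being std::string.
template <typename T>
bool IsStringParam(T) { return std::is_same<T, std::string>::value; }
```

With these definitions, BarDeduced(std::string("abc")) returns "string" while BarExplicit(std::string("abc")) returns "generic", matching what C# would do. Likewise, IsStringParam("abc") is false because T is deduced as const char*, whereas IsStringParam<std::string>("abc") is true.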

In many cases, the translator has to generate explicit calls where there are none in C#. This primarily concerns the accessor methods of properties and indexers. Calls to constructors of reference types are wrapped in MakeObject, as shown above.

In .NET, some methods accept a varying number and type of arguments through the params syntax, through parameters of type object, or both at once. For example, such overloads exist for StringBuilder.Append() and Console.WriteLine(). Translating these constructs directly performs poorly because of boxing and the creation of temporary arrays. In such cases, we add an overload that accepts a variable number of arguments of arbitrary types using variadic templates, and we translate the arguments as is, without type casts and without merging them into arrays. As a result, the performance of such calls is improved.
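A variadic overload of this kind might look roughly like the sketch below. ConcatSketch and AppendAll are hypothetical helpers written with the standard library only; they are not the overloads the translator's library actually provides.

```cpp
#include <sstream>
#include <string>

// Base case: nothing left to append.
inline void AppendAll(std::ostringstream&) {}

// Appends the first argument with its static type, then recurses on the rest.
// No boxing and no temporary array of objects is involved.
template <typename First, typename... Rest>
void AppendAll(std::ostringstream& out, const First& first, const Rest&... rest)
{
    out << first;
    AppendAll(out, rest...);
}

// Hypothetical helper: concatenates arguments of arbitrary types.
template <typename... Args>
std::string ConcatSketch(const Args&... args)
{
    std::ostringstream out;
    AppendAll(out, args...);
    return out.str();
}
```

For example, ConcatSketch("x=", 1, ", y=", 2.5) yields "x=1, y=2.5", with each argument formatted according to its own type.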

Delegates are translated into specializations of the MulticastDelegate template, which typically holds a container of std::function instances. Invoking, storing, and assigning them is straightforward. Anonymous methods are turned into lambda functions.
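The idea of a delegate as a container of std::function instances can be sketched as follows. MulticastDelegateSketch is a hypothetical, heavily simplified type that supports only void-returning signatures; it does not reflect the interface of the real MulticastDelegate.

```cpp
#include <functional>
#include <utility>
#include <vector>

template <typename Signature>
class MulticastDelegateSketch;

// Simplified: only handlers that return void are supported.
template <typename... Args>
class MulticastDelegateSketch<void(Args...)>
{
public:
    // Adds a handler to the invocation list, like += on a C# delegate.
    MulticastDelegateSketch& operator+=(std::function<void(Args...)> handler)
    {
        handlers_.push_back(std::move(handler));
        return *this;
    }

    // Invokes all handlers in the order they were added.
    void operator()(Args... args) const
    {
        for (const auto& handler : handlers_)
            handler(args...);
    }

private:
    std::vector<std::function<void(Args...)>> handlers_;
};
```

A delegate of type MulticastDelegateSketch<void(int)> can then collect several lambda functions with += and call them all with a single invocation.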

When creating lambda functions, it is necessary to extend the lifetime of the captured variables and arguments, which complicates the code, so the translator does this only where the lambda function has a chance to outlive the surrounding context. This behavior (extending the lifetime of variables, capture by reference or by value) can also be manually controlled to obtain more optimal code.
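The difference between the two cases can be illustrated with standard C++ only. In this sketch, MakeCounter and SumNow are hypothetical functions, and moving the captured variable into a std::shared_ptr is just one possible way to extend its lifetime; the code the translator generates may differ.

```cpp
#include <functional>
#include <memory>

// The lambda outlives the function, so the captured variable is placed in
// shared storage and captured by value to keep it alive.
std::function<int()> MakeCounter()
{
    auto count = std::make_shared<int>(0);
    return [count] { return ++*count; };
}

// The lambda never leaves the enclosing scope, so cheap capture by reference
// is safe and no lifetime extension is needed.
int SumNow(int a, int b)
{
    auto add = [&] { return a + b; };
    return add();
}
```

Each call to the function returned by MakeCounter() increments the same counter, even though the scope that created it has already ended.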

Exceptions

Emulating C# behavior in terms of exception handling is quite non-trivial. This is because exceptions in C# and C++ behave differently:

  • In C#, exceptions are created on the heap and are deleted by the garbage collector.
  • In C++, exceptions are copied between the stack and a dedicated memory area at different times.

This presents a contradiction. If C# exception types are translated as reference types and handled via raw pointers (throw new ArgumentException), the result is memory leaks or significant problems with determining where the objects should be deleted. If they are translated as reference types but owned via a smart pointer (throw SharedPtr<ArgumentException>(MakeObject<ArgumentException>())), the exception cannot be caught by its base type, because SharedPtr<ArgumentException> does not inherit from SharedPtr<Exception>. Finally, if exception objects are thrown by value, they are correctly caught by the base type, but when saved in a variable of the base type, the information about the final type is truncated (sliced off).
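The last two problems are easy to reproduce with the standard library. The sketch below uses std::shared_ptr in place of SharedPtr and two hypothetical exception classes, BaseError and DerivedError.

```cpp
#include <memory>
#include <string>

struct BaseError
{
    virtual ~BaseError() = default;
    virtual std::string Name() const { return "BaseError"; }
};

struct DerivedError : BaseError
{
    std::string Name() const override { return "DerivedError"; }
};

// A thrown shared_ptr<DerivedError> does not match a shared_ptr<BaseError>
// handler: handler matching ignores user-defined conversions.
std::string CatchSharedPtr()
{
    try
    {
        throw std::make_shared<DerivedError>();
    }
    catch (const std::shared_ptr<BaseError>&)
    {
        return "base handler";
    }
    catch (const std::shared_ptr<DerivedError>&)
    {
        return "derived handler";
    }
    return "unhandled";
}

// Catching by base reference works, but copying into a base-type variable
// slices off the derived part.
std::string StoreInBaseVariable()
{
    try
    {
        throw DerivedError();
    }
    catch (const BaseError& e)
    {
        BaseError copy = e;
        return e.Name() + " -> " + copy.Name();
    }
    return "unhandled";
}
```

CatchSharedPtr() returns "derived handler" because the base-pointer handler is skipped, and StoreInBaseVariable() returns "DerivedError -> BaseError", showing the loss of the final type.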

To solve this problem, we created a special smart pointer type, ExceptionWrapper. Its key feature is that if the class ArgumentException inherits from Exception, then ExceptionWrapper<ArgumentException> also inherits from ExceptionWrapper<Exception>. Instances of ExceptionWrapper manage the lifetime of exception class instances, and truncating the ExceptionWrapper type does not truncate the associated Exception type. Exceptions are thrown by a virtual method, overridden by Exception descendants, which creates an ExceptionWrapper parameterized by the final exception type and throws it. Because the method is virtual, the correct type of exception is thrown even if the ExceptionWrapper type was truncated earlier, and the link between the exception object and its ExceptionWrapper prevents memory leaks.
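A minimal sketch of this idea is shown below. It is not the actual ExceptionWrapper implementation: WrapperSketch, the BaseType alias, the Throw signature and ThrowNew are all assumptions made for illustration, and std::shared_ptr stands in for the library's own ownership mechanism.

```cpp
#include <memory>
#include <string>
#include <utility>

class Exception;

// The wrapper for T derives from the wrapper for T's base class
// (T::BaseType is a hypothetical alias each descendant must provide).
template <typename T>
class WrapperSketch : public WrapperSketch<typename T::BaseType>
{
public:
    explicit WrapperSketch(std::shared_ptr<Exception> p)
        : WrapperSketch<typename T::BaseType>(std::move(p)) {}
    T* operator->() const { return static_cast<T*>(this->Get()); }
};

// Root of the wrapper hierarchy: owns the exception object.
template <>
class WrapperSketch<Exception>
{
public:
    explicit WrapperSketch(std::shared_ptr<Exception> p) : ptr_(std::move(p)) {}
    Exception* Get() const { return ptr_.get(); }
    Exception* operator->() const { return Get(); }
    const std::shared_ptr<Exception>& Shared() const { return ptr_; }
private:
    std::shared_ptr<Exception> ptr_;
};

class Exception
{
public:
    virtual ~Exception() = default;
    virtual std::string Name() const { return "Exception"; }
    // Throws a wrapper parameterized by the final (dynamic) type.
    virtual void Throw(std::shared_ptr<Exception> self) const
    {
        throw WrapperSketch<Exception>(std::move(self));
    }
};

class ArgumentException : public Exception
{
public:
    using BaseType = Exception;
    std::string Name() const override { return "ArgumentException"; }
    void Throw(std::shared_ptr<Exception> self) const override
    {
        throw WrapperSketch<ArgumentException>(std::move(self));
    }
};

template <typename T>
void ThrowNew()
{
    std::shared_ptr<Exception> obj = std::make_shared<T>();
    obj->Throw(obj);
}

// Caught via the base wrapper, the object keeps its final type.
std::string CatchAsBase()
{
    try { ThrowNew<ArgumentException>(); }
    catch (const WrapperSketch<Exception>& e) { return e->Name(); }
    return "none";
}

// The wrapper is truncated by the by-value catch, yet the virtual Throw
// restores the wrapper of the final type on rethrow.
std::string RethrowAfterTruncation()
{
    try
    {
        try { ThrowNew<ArgumentException>(); }
        catch (WrapperSketch<Exception> e) { e->Throw(e.Shared()); }
    }
    catch (const WrapperSketch<ArgumentException>& e) { return "caught " + e->Name(); }
    catch (const WrapperSketch<Exception>&) { return "base only"; }
    return "none";
}
```

In this sketch, CatchAsBase() returns "ArgumentException" and RethrowAfterTruncation() returns "caught ArgumentException", while the shared ownership inside the wrapper takes care of deleting the exception object.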

Tests

One of the strengths of our framework is the ability to translate not only the source code but also the unit tests for it.

C# programmers use the NUnit and xUnit frameworks. The translator converts the corresponding tests to GoogleTest, replacing the assertion syntax and calling the methods marked with the Test or Fact attribute from the respective test functions. Both tests without arguments and tests that take input data through TestCase or TestCaseData are supported. An example of a translated test class is provided below.

[TestFixture]
class MyTestCase
{
    [Test]
    public void Test1()
    {
        Assert.AreEqual(2*2, 4);
    }
    [TestCase("123")]
    [TestCase("abc")]
    public void Test2(string s)
    {
        Assert.NotNull(s);
    }
}

After translation, it looks like this:

class MyTestCase : public System::Object
{
public:
    void Test1();
    void Test2(System::String s);
};

namespace gtest_test
{

class MyTestCase : public ::testing::Test
{
protected:
    static System::SharedPtr<::ClassLibrary1::MyTestCase> s_instance;
    
public:
    static void SetUpTestCase()
    {
        s_instance = System::MakeObject<::ClassLibrary1::MyTestCase>();
    };
    
    static void TearDownTestCase()
    {
        s_instance = nullptr;
    };
};

System::SharedPtr<::ClassLibrary1::MyTestCase> MyTestCase::s_instance;

} // namespace gtest_test

void MyTestCase::Test1()
{
    ASSERT_EQ(2 * 2, 4);
}

namespace gtest_test
{

TEST_F(MyTestCase, Test1)
{
    s_instance->Test1();
}

} // namespace gtest_test

void MyTestCase::Test2(System::String s)
{
    ASSERT_FALSE(System::TestTools::IsNull(s));
}

namespace gtest_test
{

using MyTestCase_Test2_Args = System::MethodArgumentTuple<decltype(
    &ClassLibrary1::MyTestCase::Test2)>::type;

struct MyTestCase_Test2 : public MyTestCase, public ClassLibrary1::MyTestCase,
    public ::testing::WithParamInterface<MyTestCase_Test2_Args>
{
    static std::vector<ParamType> TestCases()
    {
        return
        {
            std::make_tuple(u"123"),
            std::make_tuple(u"abc"),
        };
    }
};

TEST_P(MyTestCase_Test2, Test)
{
    const auto& params = GetParam();
    ASSERT_NO_FATAL_FAILURE(s_instance->Test2(std::get<0>(params)));
}

INSTANTIATE_TEST_SUITE_P(, MyTestCase_Test2, 
    ::testing::ValuesIn(MyTestCase_Test2::TestCases()));

} // namespace gtest_test
