Since the Objective-C runtime exposes a C interface, it’s actually pretty easy to interact with from Rust. Over the past months I’ve worked on a Rust wrapper around the Objective-C runtime and some classes of the Foundation framework, creatively called rust-objc. I had hoped to learn more about Rust’s foreign function interface and the Objective-C runtime, but along the way I also encountered some interesting challenges in API design.
Calling Objective-C methods
If we want to interact with Objective-C from Rust, one of the first things
we’ll need to be able to do is call methods on Objective-C objects.
Let’s consider this example where we have an NSString
pointer, string
:
const char *c_string = [string UTF8String];
Since the Objective-C runtime actually has a C interface, we can invoke methods
in C using the objc_msgSend
function. Our previous Objective-C code is equivalent to this C:
SEL selector = sel_registerName("UTF8String");
const char *c_string = (const char *)objc_msgSend(string, selector);
Now that we see the C code, the translation into Rust is pretty straightforward using Rust’s foreign function interface. Once we’ve set up an interface for the functions of the Objective-C runtime, we can write:
let selector = "UTF8String".with_c_str(|name| unsafe {
sel_registerName(name)
});
let c_string = unsafe {
objc_msgSend(string, selector) as *const c_char
};
Nice! But this is Rust, we can make this better with Rust’s powerful macros. We can even support methods with arguments using a macro like this:
macro_rules! msg_send(
($obj:expr $($name:ident : $arg:expr)+) => ({
let sel_name = concat!($(stringify!($name), ':'),+);
let sel = sel_name.with_c_str(|name| {
sel_registerName(name)
});
objc_msgSend($obj, sel $(,$arg)+)
});
)
By adding a special case to our macro for the no-argument case, we can rewrite our example as:
let c_string = unsafe {
msg_send![string UTF8String] as *const c_char
};
And so we have a convenient way to call Objective-C methods with a syntax that feels comfortable for Objective-C developers.
Representing Objective-C objects
In our previous examples, we’ve been working with a variable named string
,
but what is the type of this variable? Well, in Objective-C it’d be declared
like this:
NSString *string;
The identical declaration in Rust would look like this:
let string: *const NSString;
Okay, so we’ll need some sort of NSString
type in Rust.
Since we don’t actually know or care about the memory layout of the NSString
,
we could declare it simply as a unit struct:
struct NSString;
There’s a problem with this, though: it allows user to construct an NSString
on the stack, and Objective-C objects only live on the heap!
let string_on_stack = NSString;
let c_string = unsafe {
msg_send![&string_on_stack UTF8String] as *const c_char // Oops!
};
We don’t actually want users to be able to construct our NSString
, we’ll just
be giving them pointers and references to one. To avoid this, we could use a
phantom type:
enum NSString { }
By using an enum with no variants, NSString
will be a valid type but there is
no way for users to instantiate an instance of one.
Unfortunately, this still has a problem; if a user has a reference to this
NSString
, they can still dereference it in safe code:
let string: &NSString;
let string_on_stack = *string;
This happens because the Rust compiler sees that our enum has no fields that
can’t be copied, and therefore infers that our enum is copyable as well.
To solve this, we must use the
NoCopy
marker:
struct NSString {
nocopy: NoCopy,
}
This also lets us use a struct again; now that it has a private field, users
cannot construct an NSString
themselves. As long as we don’t construct an
NSString
on the stack in our module, there will be no way in safe code for
users to end up with a stack-allocated NSString
.
Drawbacks of this representation
This isn’t a perfect solution, because even if there’s no way to get a
stack-allocated NSString
, the compiler will still accept definitions like:
let string: NSString;
let vector: Vec<NSString>;
Additionally, the following code will compile and run without doing anything:
let a: &mut NSString;
let b: &mut NSString;
mem::swap(a, b); // Doesn't actually do anything
Ideally, we would opt out of the
Sized
trait so that
the compiler would disallow these types as local variables, but unfortunately
it doesn’t seem possible to have an unsized type without all references to it
becoming “fat” two-word references.
Why not just wrap the pointer?
If an NSString
can never exist on the stack, why don’t we just prevent that
by making a struct that wraps a pointer?
struct NSString {
ptr: *mut c_void,
}
Let’s consider the case of an NSArray
of NSString
s.
If we want to get a string from the array, our array can’t return references
to this NSString
struct:
fn object_at(array: &NSArray, index: uint) -> &NSString {
let string_ptr = unsafe {
msg_send![array objectAtIndex:index]
};
let string = NSString { ptr: string_ptr };
&string // Oops! string doesn't live past this method
}
Instead, we’d have to return this NSString
struct by value, and then it’s
not tied to the lifetime of our array at all.
This would allow us to get multiple copies of an NSString
from our array and
try to mutate them simultaneously, which would cause a race condition.
To fix this we’d need to add a lifetime parameter to indicate that the string
is only valid as long as the array is mutably borrowed.
Not all strings should have this lifetime parameter, though, so we’d actually
end up needing 3 different NSString
representations:
an owned string (NSString
),
one representing an immutable borrow (NSStringRef<'a>
),
and one representing a mutable borrow (NSStringRefMut<'a>
).
This results in an interface that looks odd to both Rust and Objective-C
developers.
I felt that, despite the imperfections of representing Objective-C objects as structs in Rust, it makes for a much more usable API.
A safe Rust interface
Now that we’ve got a struct for representing our NSString
, we can implement
some methods on it. For example, we can wrap the
UTF8String
method using idiomatic Rust types:
impl NSString {
fn as_str(&self) -> &str {
unsafe {
let c_string = msg_send![self UTF8String] as *const c_char;
c_str_to_static_slice(c_string)
}
}
}
Here we can also see one of the challenges of wrapping Objective-C with a safe
interface. Of the C string returned by UTF8String
, the docs say:
This C string is a pointer to a structure inside the string object, which may have a lifetime shorter than the string object and will certainly not have a longer lifetime.
We’ve assumed that as long as the string isn’t mutated, the internal pointer is still valid, but since Foundation is closed source there isn’t really a way for us to verify this.
Inheritance
What happens when we decide to implement a safe interface for NSMutableString
?
Since NSMutableString
inherits from NSString
, it should also have this
method, but Rust structs don’t allow inheritance.
Instead of just duplicating the method, we can implement it in a trait:
trait INSString {
fn as_str(&self) -> &str {
unsafe {
let c_string = msg_send![self UTF8String] as *const c_char;
c_str_to_static_slice(c_string)
}
}
}
impl INSString for NSString { }
Now if we just implement INSString
for NSMutableString
, it’ll get this
functionality, too.
This trait is also useful for generic programming; with it, we can write
functions that take any type that implements the INSString
trait and will
accept either an NSString
or an NSMutableString
.
There is a drawback to this approach, though: users could implement this trait
for any type inappropriately.
Since it doesn’t require any other methods implemented, I could, in safe code,
just implement the INSString
trait for int
and then have undefined
behavior by sending Objective-C messages on an int
.
I don’t know of a way to prevent this without losing the convenience of
only declaring these methods once.
Objective-C memory management
Great, at this point we can call methods from an NSString
reference,
but where does this reference come from? What’s its lifetime?
Our Objective-C objects must be retained while we’re using them and released when we’re done with them, so this is a great fit for a custom smart pointer in Rust:
struct Id<T> {
ptr: *mut T,
}
impl<T> Drop for Id<T> {
fn drop(&mut self) {
unsafe { msg_send![self.ptr release]; }
}
}
impl<T> Deref<T> for Id<T> {
fn deref(&self) -> &T {
unsafe { &*self.ptr }
}
}
Now we can use this to create safe wrappers over an object’s initializers:
impl NSString {
fn new() -> Id<NSString> {
unsafe {
let cls = "NSString".with_c_str(|name| {
objc_getClass(name)
});
let obj = msg_send![class alloc];
let obj = msg_send![obj init];
Id { ptr: obj }
}
}
}
This finally allows us to work with an NSString
without any unsafe blocks!
let string = NSString::new();
println!("{}", string.as_str());
When the Id
goes out of scope, the object will automatically be released.
With just a few lines of Rust code, we’ve implemented our own simplified
version of Objective-C’s automatic reference counting.
Mutability
Sometimes we may want to retain a shared object, but it wouldn’t be safe to do
this if we implement
DerefMut
for any
Id
, because if it is mutably dereferenced in multiple places we’d have
aliasing mut references. Similarly, it’d be safe to implement
Clone
when the object is shared, but an Id
that implements DerefMut
shouldn’t
implement Clone
.
I chose to resolve this was by adding a phantom type parameter to Id
which is
either Owned
or Shared
. Then, we can implement Clone
only for a shared
Id
, and we can implement DerefMut
only for an owned Id
.
enum Owned { }
enum Shared { }
impl<T> Clone for Id<T, Shared> {
fn clone(&self) -> Id<T, Shared> {
unsafe { msg_send![self.ptr retain]; }
Id { ptr: self.ptr }
}
}
impl<T> DerefMut<T> for Id<T, Owned> {
fn deref_mut(&mut self) -> &mut T {
unsafe { &mut *self.ptr }
}
}
We can also allow an owned Id
to be “downgraded” to a shared Id
and then cloned.
impl<T> Id<T, Owned> {
fn share(self) -> Id<T, Shared> {
Id { ptr: self.ptr }
}
}
Thinking about Objective-C in terms of Rust’s memory semantics leads to some
interesting questions, and these phantom types will be used again.
For example, unlike a Vec
in Rust, an NSArray
can be copied without
copying all of its elements.
If we consider the array to own its objects, this isn’t safe because it could create aliasing mut references.
However, it’s totally fine if the array’s objects are shared.
We can resolve this by using an approach similar to Id
:
if our NSArray
has a type parameter for Owned
or Shared
,
we only implement copying for the shared array.