Introduction

The Objective-C runtime is an advanced topic in Cocoa application development, yet every single line of Objective-C we write depends on it.

On Garena's iOS team, we regularly use many of its features, especially method swizzling and object association. Recently we investigated JSPatch, a popular open source project that provides a mechanism for deploying code dynamically and is used by many iOS apps. Its core functionality, such as dynamically adding and changing methods, is implemented by taking advantage of the runtime. This got us interested, so we did some research. We'd like to share our findings with you here.

There are many topics to cover in the Objective-C runtime, so I am going to break them down into two parts. Hopefully, after reading this series of posts, you will be able to work more confidently with Objective-C and the Cocoa framework.

Here's the outline of the first part:

  • Runtime Overview
  • Object and Class
  • Method
  • Message Delivering

1.Runtime Overview

Written in C and assembly, the Objective-C runtime library is what adds all the dynamic features to C, such as method dispatch and method forwarding. It also creates the supporting structures needed to make OOP possible.

1.1 Dynamic vs Static Language

Objective-C is a dynamic language, which means that the decision about what will actually be executed is shifted from compile and link time to runtime.

This is different from some other languages, like C. In C, calling a function means jumping to a specific location in memory where the implementation is stored, and that location is settled at compile and link time. As a result, there is much less flexibility than in a dynamic language like Objective-C.

Let’s first take a look at this snippet of code:

Programmer *codeMonkey = [[Programmer alloc] initWithName:@"monkey"];

[codeMonkey sayHi];
//compiler translates above line to:
objc_msgSend(codeMonkey, @selector(sayHi));

The code is fairly simple. We have a class called Programmer, and we call sayHi on an instance of it. The implementation of that method is not executed immediately. Instead, the compiler translates our method invocation into a C function call. This function sends a message to the Programmer instance. An Objective-C object may or may not be able to handle this message, and it can redirect the message to other objects.

This messaging mechanism is the foundation of the Objective-C runtime.

We will cover more details of message delivering later.
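
To make the translation concrete, here is a minimal sketch that sends the same message twice: once with bracket syntax and once by calling objc_msgSend by hand. The helper name lengthViaMsgSend is made up for this illustration, and it uses NSString's length method rather than our Programmer class so that it is self-contained. Current SDKs expect objc_msgSend to be cast to the exact function type of the method before it is called.

```objc
#import <Foundation/Foundation.h>
#import <objc/message.h>

// Hypothetical helper: sends -length to a string without bracket syntax.
static NSUInteger lengthViaMsgSend(NSString *string) {
    // Cast objc_msgSend to the signature of the method being invoked:
    // (receiver, selector) -> NSUInteger.
    NSUInteger (*send)(id, SEL) = (NSUInteger (*)(id, SEL))objc_msgSend;
    return send(string, @selector(length));
}

int main(void) {
    @autoreleasepool {
        NSString *name = @"monkey";
        // Both lines deliver the same message to the same receiver.
        NSLog(@"brackets: %lu", (unsigned long)[name length]);
        NSLog(@"objc_msgSend: %lu", (unsigned long)lengthViaMsgSend(name));
    }
    return 0;
}
```

Both lines should report the same length, since the bracket form is what the compiler turns into the second form.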

1.2 Interact with Runtime

We regularly take advantage of the runtime without being aware of it. From the very first day of writing iOS applications, we are told to use classes that subclass NSObject.
Why? Because many important and often troublesome pieces of functionality, like memory management, are built into NSObject. As long as we use subclasses of NSObject, we get those things for free.

The second way of interacting is through the runtime library, which consists of a set of C functions. Most of the time we do not need them, but sometimes they come in very handy. You can use them by importing <objc/runtime.h>, and you can inspect the source files that Apple publishes.
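
As a small taste of those C functions, the sketch below looks a class up by name and asks for its super class. The helper superclassNameOf is a name invented for this example; objc_getClass, class_getSuperclass and class_getName are the runtime calls it relies on.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical helper: returns the name of a class's super class, or nil
// when the class is unknown or is a root class.
static NSString *superclassNameOf(const char *className) {
    Class cls = objc_getClass(className);      // look a class up by name
    Class superCls = class_getSuperclass(cls); // Nil for Nil or a root class
    return superCls ? @(class_getName(superCls)) : nil;
}

int main(void) {
    @autoreleasepool {
        NSLog(@"%@", superclassNameOf("NSMutableString")); // NSString
        NSLog(@"%@", superclassNameOf("NSObject"));        // (null)
    }
    return 0;
}
```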

2.Object and Class

In the world of OOP, a class is an extensible code template that acts as an abstraction of logic and data. An object is a specific instance of a class. In this section, we will take a look at how objects and classes are represented in Objective-C.

2.1 Object

This is how an object is defined in Objective-C.

struct objc_object {
    Class isa;
};

typedef struct objc_object *id;

As you can see, every object is a structure with a pointer to its class, and that's it. We also have our generic type id defined here, which is essentially a pointer to an objc_object.

So where does the isa pointer point?

2.2 Class

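This is how Class is defined: a pointer to objc_class. There are quite a few members in this structure, so let's focus on only a few of them first. Note that this is the legacy layout; the modern runtime keeps these members private, but the concepts are the same.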

typedef struct objc_class *Class;

struct objc_class {
    Class isa;

    Class super_class;
    const char *name;
    long version;
    long info;
    long instance_size;
    struct objc_ivar_list *ivars;
    struct objc_method_list **methodLists;
    struct objc_cache *cache;
    struct objc_protocol_list *protocols;
};

isa

The isa pointer is perhaps the most important member of this structure. If you compare it to the object’s definition, they are actually very similar: they both start with an isa pointer. This indicates that a class in Objective-C is essentially an object as well.

What does this imply? It means that you can send messages to a class, just like you can send messages to an object. When you send a message to an object, the runtime consults its class object to find out whether it responds to this message. Take note that the class definition holds a pointer to a pointer to objc_method_list. This is what makes dynamically adding, removing and exchanging methods possible.
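
Exchanging methods at run time can be sketched in a few lines. The Parrot class and the swapParrotGreetings helper below are invented for this illustration; class_getInstanceMethod and method_exchangeImplementations are the runtime functions doing the work.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical class used only for this illustration.
@interface Parrot : NSObject
- (NSString *)hello;
- (NSString *)goodbye;
@end

@implementation Parrot
- (NSString *)hello   { return @"hello"; }
- (NSString *)goodbye { return @"goodbye"; }
@end

// Swaps the implementations behind -hello and -goodbye.
static void swapParrotGreetings(void) {
    Method hello = class_getInstanceMethod([Parrot class], @selector(hello));
    Method goodbye = class_getInstanceMethod([Parrot class], @selector(goodbye));
    method_exchangeImplementations(hello, goodbye);
}

int main(void) {
    @autoreleasepool {
        Parrot *parrot = [[Parrot alloc] init];
        NSLog(@"before: %@", [parrot hello]); // hello
        swapParrotGreetings();
        NSLog(@"after: %@", [parrot hello]);  // goodbye
    }
    return 0;
}
```

The source code of Parrot never changes; only the mapping from selector to implementation inside the class object does, which is the idea behind method swizzling.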

super_class

Then we have a pointer to the super class object. If the class is already a root class, like NSObject or NSProxy, the super class pointer is set to NULL.
In terms of message delivering, a class object follows this super class pointer to ask its super class to handle an unknown message.

objc_cache

When the Objective-C runtime inspects an object by following its isa pointer, it may find a class that implements many methods. However, you may only call a small portion of them, and it makes no sense to search the class's method lists for all the selectors every time it does a lookup.
So the class implements a cache. Whenever the runtime searches through the class hierarchy for an implementation corresponding to a selector, it adds the result to that cache. So when objc_msgSend looks through a class for a selector, it searches the class cache first.

meta class

If you've been reading carefully, you will probably wonder what the isa pointer of a class object points to. The class of a class? In fact, you are correct.

The class of a class is called its meta class. In the example code below, we create an NSArray instance by sending a message to the NSArray class object. As mentioned above, a class is an object, so we can send messages to it. When the NSArray class object receives the message, the runtime consults its meta class object about what to do next.

NSArray *array = [NSArray array];

It’s very important to have a meta class, because it stores the class methods, which are different for each class.

In short:
  • A class object describes the behaviour of object instances
  • A meta class describes the behaviour of class objects
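
A quick way to check this summary is to ask the runtime where a class method is stored. In the sketch below, Greeter and hasInstanceMethod are names made up for the example. The point to watch is that a class method shows up as an "instance method" of the meta class and not of the class itself.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical class with one class method and one instance method.
@interface Greeter : NSObject
+ (NSString *)motto;
- (NSString *)greet;
@end

@implementation Greeter
+ (NSString *)motto { return @"Be polite"; }
- (NSString *)greet { return @"Hi"; }
@end

// YES when cls or one of its super classes has sel as an instance method.
static BOOL hasInstanceMethod(Class cls, SEL sel) {
    return class_getInstanceMethod(cls, sel) != NULL;
}

int main(void) {
    @autoreleasepool {
        Class cls = [Greeter class];
        Class meta = object_getClass(cls); // the class of the class object

        NSLog(@"meta is a meta class: %d", class_isMetaClass(meta));             // 1
        NSLog(@"class has greet: %d", hasInstanceMethod(cls, @selector(greet))); // 1
        NSLog(@"class has motto: %d", hasInstanceMethod(cls, @selector(motto))); // 0
        NSLog(@"meta has motto: %d", hasInstanceMethod(meta, @selector(motto))); // 1
    }
    return 0;
}
```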

meta-meta class?

Now you may again wonder where the isa pointer of a meta class points. To the meta class’s meta class, perhaps?

To stop this infinite recursion, the creators of Objective-C let the isa pointers of all meta classes point to the root meta class. The root meta class’s isa pointer points to itself.

Now we have a complete picture of the class structure. Greg Parker has a very clear diagram posted on his blog. I re-post it here just for your easy reference.

[Figure: Greg Parker's diagram of the class and meta class hierarchy]

Please take note that the root meta class's super class is the root class. The result of this inheritance hierarchy is that all instances, classes and meta classes in the hierarchy inherit from the hierarchy's base class.

This also means that all instance methods of the root class are valid for every instance, class and meta class in the root class's hierarchy. For the classes and meta classes, all class methods of the root class are valid as well.

This may be confusing at this point. Let's take a look at an example.

// Person class
#import "Person.h"

@interface Person()
@property (nonatomic, strong) NSString *name;
@end

@implementation Person
- (instancetype)initWithName:(NSString *)name
{    
    self = [super init];
    if (self) {
        _name = name;
    }
    return self;
}

- (void)sayHi
{
    NSLog(@"My name is %@", self.name);
}
@end

//Programmer class, subclass of Person
#import "Programmer.h"
#import <objc/runtime.h> //for object_getClass and class_getSuperclass
@implementation Programmer
- (void)sayHi
{
    NSLog(@"Hello world! My name is %@", self.name);
}

//testing methods
- (void)testMetaClass
{
    NSLog(@"//testMetaClass");
    NSLog(@"This object is %p.", self);
    NSLog(@"Class is %@, and super is %@.", [self class], [self superclass]);
    
    Class currentClass = [self class];
    for (int i = 0; i < 4; i++)
    {
        NSLog(@"Following the isa pointer %d times gives %p", i+1, currentClass);
        currentClass = object_getClass(currentClass);
    }
    
    //Note: [NSObject class] returns the class itself, not its meta class; object_getClass is needed for that
    NSLog(@"NSObject's meta class is %p", object_getClass([NSObject class]));
}


- (void)testSuperClass
{
    NSLog(@"//testSuperClass");
    NSLog(@"This object is %p.", self);
    NSLog(@"Class is %@, and super is %@.", [self class], [self superclass]);
    
    Class currentClass = [self class];
    Class currentMetaClass = object_getClass(currentClass);
    for (int i = 0; i < 4; i++)
    {
        NSLog(@"Following the super pointer %d times gives %p", i+1, currentClass);
        currentClass = class_getSuperclass(currentClass);
    }
    
    for (int i = 0; i < 5; i++)
    {
        NSLog(@"Following the meta class super pointer %d times gives %p", i+1, currentMetaClass);
        currentMetaClass = class_getSuperclass(currentMetaClass);
    }
    
    //Note: [NSObject class] returns the class itself, not its meta class; object_getClass is needed for that
    NSLog(@"NSObject's meta class is %p", object_getClass([NSObject class]));
}

@end

We have two simple classes, Person and Programmer, where Programmer inherits from Person. We also have some testing methods implemented in the Programmer class. testMetaClass follows the isa pointer, starting from the object's class, and prints the address at each step. testSuperClass, on the other hand, follows the super class pointers, first from the Programmer class and then from its meta class.

We execute the following code:

Programmer *codeMonkey = [[Programmer alloc] initWithName:@"codeMonkey"];
[codeMonkey testMetaClass];
[codeMonkey testSuperClass];

And here's the output:

//testMetaClass
This object is 0x7ffaf2c08050.
Class is Programmer, and super is Person.
Following the isa pointer 1 times gives 0x1096e9788
Following the isa pointer 2 times gives 0x1096e9760
Following the isa pointer 3 times gives 0x10b489198
Following the isa pointer 4 times gives 0x10b489198
NSObject's meta class is 0x10b489198

//testSuperClass
This object is 0x7ffaf2c08050.
Class is Programmer, and super is Person.
Following the super pointer 1 times gives 0x1096e9788
Following the super pointer 2 times gives 0x1096e96c0
Following the super pointer 3 times gives 0x10b489170
Following the super pointer 4 times gives 0x0
Following the meta class super pointer 1 times gives 0x1096e9760
Following the meta class super pointer 2 times gives 0x1096e96e8
Following the meta class super pointer 3 times gives 0x10b489198
Following the meta class super pointer 4 times gives 0x10b489170
Following the meta class super pointer 5 times gives 0x0
NSObject's meta class is 0x10b489198

The exact addresses are not important.
First, let's trace the isa chain together.
The codeMonkey instance is at 0x7ffaf2c08050. Its class object is at 0x1096e9788, and that class's meta class object is at 0x1096e9760. The root meta class is at 0x10b489198. You can see that the 4th trace is 0x10b489198 again, because the class of the root meta class is itself.

Similarly, we can trace the inheritance hierarchy by looking at the second section of the output. Take note that the root meta class's super class is 0x10b489170, which is the NSObject class object. Also, the chain ends with NULL.
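
Because the root meta class inherits from the root class, an instance method of the root class can also be sent to a class object. The sketch below checks this with a category invented for the purpose (WhoAmI and whoAmI are illustrative names, not part of any framework).

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical category that adds an *instance* method to the root class.
@interface NSObject (WhoAmI)
- (NSString *)whoAmI;
@end

@implementation NSObject (WhoAmI)
- (NSString *)whoAmI {
    // object_getClass follows isa: an instance yields its class,
    // a class object yields its meta class.
    Class isa = object_getClass(self);
    return class_isMetaClass(isa) ? @"class object" : @"instance";
}
@end

int main(void) {
    @autoreleasepool {
        id instance = [[NSObject alloc] init];
        id classObject = [NSString class];

        NSLog(@"%@", [instance whoAmI]);    // instance
        // Lookup goes NSString meta class -> NSObject meta class -> NSObject,
        // where the instance method -whoAmI is found.
        NSLog(@"%@", [classObject whoAmI]); // class object

        Class rootMeta = object_getClass([NSObject class]);
        NSLog(@"%d", class_getSuperclass(rootMeta) == [NSObject class]); // 1
    }
    return 0;
}
```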

3.Method

typedef struct objc_class *Class;

struct objc_class {
    //Class isa;

    //Class super_class;
    //const char *name;
    //long version;
    //long info;
    //long instance_size;
    //struct objc_ivar_list *ivars;
    struct objc_method_list **methodLists;
    //struct objc_cache *cache;
    //struct objc_protocol_list *protocols;
};

Earlier, I briefly mentioned the list of method lists in a class. Let’s take a closer look at what a method list is.

struct objc_method_list {
    struct objc_method_list *obsolete;
    int method_count;
    /* variable length structure */
    struct objc_method method_list[1];
};

So basically, a method list is a variable-length list storing objc_method structures. What is more interesting is what objc_method looks like.

struct objc_method {
    SEL method_name;
    char *method_types;
    IMP method_imp;
};

There are 3 members in this struct: the method name, the method types and the method implementation. Let’s take a look at their types.
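
The three members can be read back through the public runtime API. The sketch below is only an illustration: the Counter class and the methodNamesOf helper are invented here, and the accessor functions are used because the struct itself is not directly accessible in the modern runtime.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical class used only for this illustration.
@interface Counter : NSObject
- (void)reset;
- (int)addTo:(int)value;
@end

@implementation Counter
- (void)reset {}
- (int)addTo:(int)value { return value + 1; }
@end

// Collects the selector names found in cls's own method list
// (super class methods are not included) and logs each entry.
static NSSet<NSString *> *methodNamesOf(Class cls) {
    NSMutableSet<NSString *> *names = [NSMutableSet set];
    unsigned int count = 0;
    Method *methods = class_copyMethodList(cls, &count);
    for (unsigned int i = 0; i < count; i++) {
        SEL name = method_getName(methods[i]);                  // method_name
        const char *types = method_getTypeEncoding(methods[i]); // method_types
        IMP imp = method_getImplementation(methods[i]);         // method_imp
        NSLog(@"%s  types=%s  imp=%p", sel_getName(name), types, (void *)imp);
        [names addObject:NSStringFromSelector(name)];
    }
    free(methods); // the caller owns the copied list
    return names;
}

int main(void) {
    @autoreleasepool {
        NSLog(@"%@", methodNamesOf([Counter class]));
    }
    return 0;
}
```

The order of the entries and the exact type strings depend on the platform, so the example prints them rather than assuming particular values.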

SEL

A selector in Objective-C is essentially a C data struct that serves as a means to identify an Objective-C method you want an object to perform. In the runtime it's defined like so:

typedef struct objc_selector  *SEL;     

objc_selector is an opaque type; internally, a selector is represented as a C string. You can treat every selector as if it were the method name. But do take note that the runtime does not use the method name exactly as you typed it: it registers the name and maps it to a single unique selector, so the same name always yields the same SEL, whichever class it is used with. Because a selector carries no type information, two methods with the same name but different parameter types cannot coexist in a class.
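
The uniqueness of selectors is easy to observe. In this sketch (the helper name sayHiSelectorsMatch is made up), the same method name is turned into a selector in three different ways and the results are compared.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Returns YES when three different ways of obtaining the selector
// named "sayHi" all produce the same SEL.
static BOOL sayHiSelectorsMatch(void) {
    SEL fromDirective = @selector(sayHi);              // resolved by the compiler
    SEL fromCString = sel_registerName("sayHi");       // registered at run time
    SEL fromNSString = NSSelectorFromString(@"sayHi"); // Foundation convenience
    return sel_isEqual(fromDirective, fromCString) &&
           sel_isEqual(fromCString, fromNSString);
}

int main(void) {
    @autoreleasepool {
        NSLog(@"same selector: %d", sayHiSelectorsMatch());        // 1
        // The name includes the colons but no parameter or return types.
        NSLog(@"name: %s", sel_getName(@selector(initWithName:))); // initWithName:
    }
    return 0;
}
```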

IMP

typedef id (*IMP)(id self, SEL _cmd, ...);

IMPs are function pointers to the method implementations that the compiler generates for you. If you compare the signature to that of objc_msgSend, they are actually the same: both take an object, a selector and a variable-length list of parameters. By convention, the runtime passes in self as the first argument and the current selector as the second argument.
That’s the reason why we are able to use self and _cmd in our methods. It’s also why, when we add a custom C function to a class, it has to take at least those two arguments.
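
Here is a minimal sketch of adding a C function to a class as a method. Robot, robotDescribe and addDescribeToRobot are all names invented for the illustration; class_addMethod is the runtime call, and the string "@@:" is the type encoding for a method that returns an object and takes only self and _cmd.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical class with no methods of its own.
@interface Robot : NSObject
@end

@implementation Robot
@end

// A plain C function shaped like an IMP: the receiver and the selector
// arrive as the first two arguments, just as self and _cmd do in a method.
static NSString *robotDescribe(id self, SEL _cmd) {
    return [NSString stringWithFormat:@"%@ got %@",
            NSStringFromClass([self class]), NSStringFromSelector(_cmd)];
}

// Attaches robotDescribe to Robot under the selector "describe".
// Returns NO if Robot already has a method with that name.
static BOOL addDescribeToRobot(void) {
    return class_addMethod([Robot class], @selector(describe),
                           (IMP)robotDescribe, "@@:");
}

int main(void) {
    @autoreleasepool {
        addDescribeToRobot();
        Robot *robot = [[Robot alloc] init];
        // performSelector: is used because Robot declares no such method.
        NSLog(@"%@", [robot performSelector:@selector(describe)]); // Robot got describe
    }
    return 0;
}
```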

Method type

Method types store information about a method's return type and argument types. The runtime uses type encodings to pack all that information into one string. The mapping rules are fairly complicated; you can take a look at Apple’s documentation on type encodings.
Alternatively, you can get an encoding by using the @encode compiler directive in your code. Here’s an example:

const char *intTypeCode = @encode(int);
NSLog(@"%s", intTypeCode);

//output: i

The next question you may ask is why the runtime needs this type information. Apple did not say much about it, so we can only imagine how the runtime uses it. For our part, we use it on a few occasions. For example:

NSMethodSignature *signature = [self methodSignatureForSelector:_cmd];
[signature methodReturnType];
[signature getArgumentTypeAtIndex:0];
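
For a fuller picture, the sketch below reads the type information of a concrete method. The Adder class and the adderSignature helper are hypothetical; the NSMethodSignature calls are the same ones used in the snippet above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class used only for this illustration.
@interface Adder : NSObject
- (int)add:(int)a to:(int)b;
@end

@implementation Adder
- (int)add:(int)a to:(int)b { return a + b; }
@end

// Returns the signature of -[Adder add:to:].
static NSMethodSignature *adderSignature(void) {
    return [Adder instanceMethodSignatureForSelector:@selector(add:to:)];
}

int main(void) {
    @autoreleasepool {
        NSMethodSignature *signature = adderSignature();

        NSLog(@"return type: %s", [signature methodReturnType]);   // i
        NSLog(@"argument count: %lu",
              (unsigned long)[signature numberOfArguments]);       // 4
        // Arguments 0 and 1 are the hidden self and _cmd.
        NSLog(@"arg 0: %s", [signature getArgumentTypeAtIndex:0]); // @
        NSLog(@"arg 1: %s", [signature getArgumentTypeAtIndex:1]); // :
        NSLog(@"arg 2: %s", [signature getArgumentTypeAtIndex:2]); // i
    }
    return 0;
}
```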

Lastly, let's revisit the objc_method struct.

struct objc_method {
    SEL method_name;
    char *method_types;
    IMP method_imp;
};

This struct actually acts as a key-value mapping from a method name to its implementation and signature. That's why, in some other articles, the list of objc_method is also referred to as a dispatch table.

4.Message Delivering

Putting all the pieces together, we now have a clear idea of how the runtime delivers messages:

  1. The message is sent to the class object found by following the instance's isa pointer
  2. Message resolution at the class object
    2a. check the cache
    2b. check the method dispatch table
    2c. check the super class’s dispatch table
  3. Dynamic Method Resolution (more on this in part II)
  4. Method Forwarding (more on this in part II)
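
Steps 1, 2b and 2c can be imitated with public runtime functions. The following is a simplified sketch, not how objc_msgSend is really implemented: it ignores the cache, dynamic method resolution and forwarding. The Person and Programmer classes here are cut-down stand-ins for the earlier example, and classDefines and classImplementing are helper names invented for it.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Cut-down stand-ins for the classes used earlier in this post.
@interface Person : NSObject
- (NSString *)name;
- (NSString *)sayHi;
@end
@implementation Person
- (NSString *)name  { return @"person"; }
- (NSString *)sayHi { return @"Hi"; }
@end

@interface Programmer : Person
@end
@implementation Programmer
- (NSString *)sayHi { return @"Hello world!"; }
@end

// YES if cls's own method list (not its super classes') contains sel.
static BOOL classDefines(Class cls, SEL sel) {
    unsigned int count = 0;
    Method *methods = class_copyMethodList(cls, &count);
    BOOL found = NO;
    for (unsigned int i = 0; i < count && !found; i++) {
        found = sel_isEqual(method_getName(methods[i]), sel);
    }
    free(methods);
    return found;
}

// Start at the receiver's class (its isa) and climb the super class
// chain. Returns Nil when no class in the chain defines sel.
static Class classImplementing(id receiver, SEL sel) {
    for (Class cls = object_getClass(receiver); cls != Nil;
         cls = class_getSuperclass(cls)) {
        if (classDefines(cls, sel)) return cls;
    }
    return Nil;
}

int main(void) {
    @autoreleasepool {
        Programmer *codeMonkey = [[Programmer alloc] init];
        NSLog(@"%@", NSStringFromClass(classImplementing(codeMonkey, @selector(sayHi))));       // Programmer
        NSLog(@"%@", NSStringFromClass(classImplementing(codeMonkey, @selector(name))));        // Person
        NSLog(@"%@", NSStringFromClass(classImplementing(codeMonkey, @selector(description)))); // NSObject
    }
    return 0;
}
```

When the walk returns Nil, the real runtime would move on to steps 3 and 4 instead of giving up.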

Food for Thought

To help you test your understanding, we have put 2 simple exercises at the end of the post. You can leave your answers in the comment section or email us your solution, so we can discuss it together.

Q1. Trace Message Delivering

Trace the execution of the following code:

NSMutableArray *array = [NSMutableArray array];
[array count];
[array description];
[NSMutableArray description];

You can use this diagram as a reference:
[Figure: reference diagram for tracing message delivery]

Q2. self vs. super

If self is passed into a method's implementation as a parameter, what about super?

More concretely, suppose we alter the implementation of Programmer from the previous example to the following code:


@implementation Programmer

- (void)question2
{
    NSLog(@"1.%@", NSStringFromClass([self class]));
    NSLog(@"2.%@", NSStringFromClass([super class]));
}
@end

What's the output after executing the following code?

Programmer *codeMonkey = [[Programmer alloc] initWithName:@"codeMonkey"];
[codeMonkey question2];

Hint: you can use clang -rewrite-objc Programmer.m to see the compiled code.

References