Intricacies of Ruby

Ruby is considered a pure object-oriented language. An object is nothing but state (instance variables) plus behaviour (methods). Everything is an object in Ruby.

This write-up is inspired by Dave Thomas. The following diagram is from the source file object.c in the Ruby source code. We will work through it step by step. As it says, everything is an object: Object is a class, and a class is an object. Simple! Isn’t it? 😛

[Screenshot: class hierarchy diagram from object.c]

The following diagram gives a detailed description of the Ruby core object model (made by Artem S.).

[Image: ruby-core-object-model.png]

Let us quickly understand the basic concepts first:

Self

  • a predefined, read-only variable, i.e. we cannot write self = "self" #error
  • the current object. puts self #<object>
  • where instance variables are found.
  • the default receiver for method calls.

Note: It is important to know what self is at any point in time!

Only two things can change self:

  • method call
  • class/module definition

Let us move ahead with the first thing that changes self, i.e. a method call.

animal="cat"
puts animal.upcase #CAT

Here, the variable animal refers to an object. An object has state and behaviour. The state is the string value, i.e. “cat”. The behaviour is a set of methods, and all strings share the same methods. When a method is invoked on an object, Ruby searches for it along the parent chain. In this case “cat” is an object of class String; upcase is defined in String, and it is invoked with self set to “cat”.

Image here

Please note that String and many other ‘classes’ like Array, Hash, Date, etc. are objects of class Class. I know it is getting confusing. Please bear with me!
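To make this concrete, here is a small sketch that asks Ruby where upcase lives and what the receiver is. It only uses standard reflection calls (Object#method, Method#owner, Method#receiver); the variable name upcase_method is just for illustration.

```ruby
animal = "cat"

# Grab the method object without calling it yet.
upcase_method = animal.method(:upcase)

owner    = upcase_method.owner      # the class where upcase is defined
receiver = upcase_method.receiver   # the object that will become self
result   = upcase_method.call       # same as animal.upcase

puts owner            # String
puts result           # CAT
puts String.class     # Class, because String itself is an object
puts [Array, Hash].map(&:class).inspect   # [Class, Class]
```

So the “cat” object holds the state, String holds the behaviour, and String is in turn an object of class Class.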

Note: Every single method call works exactly the same way. There are no special cases.

puts animal.object_id #364305

There is no object_id method in String. So, what happens here?

Ruby first checks the methods of the “cat” object itself and does not find object_id there. So it moves on to the String class (a Class object), with no luck. Then it checks String’s superclass, i.e. Object (also a Class object), where object_id is available (strictly speaking, it comes from the Kernel module that Object includes).
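The walk described above can be imitated in plain Ruby. This is only a rough sketch of the idea, not how the interpreter is implemented: find_owner is a hypothetical helper that walks the ancestors of the object's singleton class and returns the first class or module that defines the method.

```ruby
# Hypothetical helper: walk the lookup chain by hand and return the
# first class/module that defines `name`, or nil if none does.
def find_owner(obj, name)
  obj.singleton_class.ancestors.find do |mod|
    mod.instance_methods(false).include?(name) ||
      mod.private_instance_methods(false).include?(name)
  end
end

animal = "cat"

upcase_owner    = find_owner(animal, :upcase)      # String
object_id_owner = find_owner(animal, :object_id)   # further up the chain
missing_owner   = find_owner(animal, :no_such_method)

puts upcase_owner.inspect
puts object_id_owner.inspect   # Kernel on current MRI, as far as I know
puts missing_owner.inspect     # nil
```

The result for object_id should agree with what animal.method(:object_id).owner reports, which is an easy way to cross-check the sketch.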

Note: It is important to understand that what we keep referring to as classes are actually objects of the class ‘Class’ in Ruby.

If we reach the top of the chain without finding the method, Ruby goes back to the bottom and searches for the method ‘method_missing’, following the same lookup. ‘BasicObject’ implements ‘method_missing’, and that is what reports the error. It is an ordinary method, not something hard-wired into the interpreter.
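Since method_missing is an ordinary method, we can override it. The sketch below uses a made-up Robot class that answers any say_* call and passes everything else up to BasicObject#method_missing through super; the class name and the say_ prefix are assumptions for the example only.

```ruby
class Robot
  # Ruby calls this only after the normal lookup has failed.
  def method_missing(name, *args)
    if name.to_s.start_with?("say_")
      name.to_s.sub("say_", "")
    else
      super   # ends up in BasicObject#method_missing, which raises
    end
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("say_") || super
  end
end

robot = Robot.new
said  = robot.say_hello   # handled by our method_missing

error_class = begin
  robot.dance             # not handled, falls through to the default
  nil
rescue NoMethodError => e
  e.class
end

in_basic_object = BasicObject.private_instance_methods(false).include?(:method_missing)

puts said              # hello
puts error_class       # NoMethodError
puts in_basic_object   # true
```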

We know that we can define methods directly on an object in Ruby. Let us try one here.

def animal.speak
  puts "meow"
end
animal.speak # meow
"cat".speak # NoMethodError

Only one object has the method ‘speak’: the object that the variable ‘animal’ refers to. How does it find that method? Clearly the String class doesn’t have it. What actually happens is that Ruby inserts a class containing the speak method in front of the String class. This class is anonymous and hard to see; such classes were designed to be effectively invisible. If we ask ‘animal’ for its class, it will say String. It lies. Its immediate class is this anonymous class.

Image here

animal.class #String

Note: This anonymous class is also referred to as the singleton class, metaclass or eigenclass.

So, the invisible class that Ruby creates is what makes this work (the speak method on animal!).

  • This class is a normal class, but hidden.
  • There is only one per object.

Ruby creates the singleton class only when a method is first defined on animal; it is not there beforehand. When we add more methods to the animal object, Ruby sees that the singleton class is already present and adds the methods to it.
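The hidden class can be made visible with Object#singleton_class. A minimal sketch follows; note that asking for singleton_class creates the class if it does not exist yet, so we cannot observe the "not there yet" state this way. The .dup is only there so the example also works in setups where string literals are frozen.

```ruby
animal = "cat".dup   # .dup gives a plain mutable String

def animal.speak
  "meow"
end

reported_class = animal.class                   # String (the "lie")
hidden_class   = animal.singleton_class         # the anonymous class
hidden_methods = hidden_class.instance_methods(false)
speak_owner    = animal.method(:speak).owner

puts reported_class                 # String
puts hidden_methods.inspect         # [:speak]
puts speak_owner == hidden_class    # true
puts hidden_class.superclass        # String
puts "cat".respond_to?(:speak)      # false, other strings are unaffected
```

The singleton class sits between the object and String, which is why animal.speak works while "cat".speak does not.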

So, method calling in Ruby can be broken down into four steps:

  • Ruby takes the receiver, i.e. animal, and sets self to it. So, before anything else happens, self becomes that “cat” object.
  • Ruby then looks for the method starting from the class associated with the object, i.e. it scans from self’s class (the singleton class first, if there is one) up the parent chain. If it does not find the method, it looks for method_missing in the same way.
  • It invokes the method it found.
  • Finally, when the method returns, Ruby restores the previous value of self, i.e. it pops it back off the stack.
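The first and last steps are easy to observe. In this sketch, Animal and current_self are made-up names; the method simply returns whatever self is while it runs, and we compare self before and after the call.

```ruby
class Animal
  # Returns whatever self is while this method is running.
  def current_self
    self
  end
end

outer_self = self            # self before the call
cat        = Animal.new
inner_self = cat.current_self

self_was_receiver = inner_self.equal?(cat)    # step 1: self became the receiver
self_restored     = self.equal?(outer_self)   # step 4: self is back afterwards

puts self_was_receiver   # true
puts self_restored       # true
```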

Note: At top level, self has value: main

puts self #main
puts self.class #Object

self at top level is an instance of Object with a few extra properties. When we define methods at top level, they become private instance methods of Object (not singleton methods of main), which is why they can be called from anywhere. Instance variables set at top level live in that main object.

Note: puts is a method of the Kernel module, and Kernel is included in class Object. So a method like puts is effectively a private method available on every Object.
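A quick way to check these claims is to ask Ruby directly. The sketch below uses TOPLEVEL_BINDING.receiver to get hold of the main object from anywhere; everything else is standard reflection.

```ruby
main = TOPLEVEL_BINDING.receiver   # the top-level self

main_name  = main.to_s             # "main"
main_class = main.class            # Object

kernel_in_object = Object.include?(Kernel)
puts_is_private  = Kernel.private_instance_methods.include?(:puts)

# Private methods are hidden from respond_to? unless we ask for them.
public_view  = "cat".respond_to?(:puts)
private_view = "cat".respond_to?(:puts, true)

puts main_name          # main
puts main_class         # Object
puts kernel_in_object   # true
puts puts_is_private    # true
puts public_view        # false
puts private_view       # true
```

Because puts is private, it can only be called without an explicit receiver, which is exactly how we always call it.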

Let us get to the other thing that changes self, i.e. a class/module definition.

Note: In Ruby, class definitions are executable code.

puts 1
class A
  puts 2
end
puts 3

Output will be: 1 2 3

puts 1
class A
 puts 2
end if false
puts 3

Output will be: 1 3

This is because the code that defines a class is executable. It is just code like any other.

We said that self always has a value. So, self should also have a value inside a class definition.
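Since a class body is just code, it even has a value, like any other expression. A small sketch (TRACE and Demo are illustrative names) records the execution order and captures the value of the class body:

```ruby
TRACE = []

TRACE << :before
body_value = class Demo
  TRACE << :inside_class_body   # runs while the class is being defined
  :last_expression              # value of the whole class ... end
end
TRACE << :after

puts TRACE.inspect        # [:before, :inside_class_body, :after]
puts body_value.inspect   # :last_expression
```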

puts self #main
class A
  puts self #A
end
puts self #main

What is the A that is printed on the console inside class A? Let us dig in further and find the class of self in A.

class A
 puts self.class #Class
end

A is an object of class Class.

In languages like Java, a class definition is a declaration, and it is kept in a symbol table of classes somewhere. The same does not hold for Ruby. In Ruby, classes are objects like anything else. Here A is just a constant that refers to a class object. We can write A.class after defining A.

puts A.class #Class

We just called a method on it. We can also assign it to a variable.

x = A
obj = x.new

There is nothing special about it. It is just an object!

So, what happens when we write ‘class A’? Ruby creates a Class object and assigns it to a constant called A. Inside the class definition, self is set to that class object.

Let us define what other languages call class methods. In Ruby, we do it in the following way:
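We can do the same two steps by hand with Class.new, which makes the "a class is just an object held by a constant" idea quite visible. Greeter and anonymous are names invented for this sketch.

```ruby
# Step 1: create a Class object. It has no name yet.
anonymous = Class.new do
  def hello
    "hello"
  end
end
name_before = anonymous.name    # nil

# Step 2: assign it to a constant. Now it has a name.
Greeter = anonymous
name_after = anonymous.name     # "Greeter"

greeting = Greeter.new.hello

puts name_before.inspect   # nil
puts name_after            # Greeter
puts greeting              # hello
```

This is roughly what class Greeter ... end does for us in one go.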

class A
  def self.speak #or def A.speak because self is set to A!
    puts "Hi"
  end
end
A.speak #Hi

Most people would explain this as a class method. Strictly speaking, that is wrong, because Ruby doesn’t have any class methods.

There is absolutely no difference between def A.speak and def animal.speak! They are identical. Ruby doesn’t even know that we are doing that inside a class, because there is nothing special about it 😉

 

Just as we saw with def animal.speak, Ruby inserts a singleton class and puts the method inside it. This is called the meta object protocol. It removes the entire concept of static methods or class methods. Ruby doesn’t need them. All methods are the same!
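The same reflection calls we would use on a string's singleton class work on a class too. In this sketch Dog is an illustrative class; the point is that its "class method" shows up as a singleton method and nowhere else.

```ruby
class Dog
  def self.speak
    "Woof"
  end
end

singleton_methods = Dog.singleton_methods                       # [:speak]
in_singleton      = Dog.singleton_class.instance_methods(false) # [:speak]
instance_level    = Dog.instance_methods(false)                 # []
speak_owner       = Dog.method(:speak).owner

puts singleton_methods.inspect              # [:speak]
puts in_singleton.inspect                   # [:speak]
puts instance_level.inspect                 # []
puts speak_owner == Dog.singleton_class     # true
```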

For metaprogramming in Ruby, just remember these two points:

  • Instance variables are looked up in self.
  • Methods are looked up in self’s class (singleton or normal).

animal="cat"
def animal.speak
  puts "meow"
end
##################
#this is same as:#
##################
animal="cat"
class << animal
  def speak
   puts "meow"
  end
end

class << animal means: open up the singleton class of this object and start putting methods into it.

So, rather than describing it as setting up class methods for animal (which is wrong), we say: open up the singleton class of animal and put methods into it!

Let us take this forward to what most people refer to as class variables. You can guess where we are going now. Ruby does have @@ class variables, but they are a separate mechanism and we don’t need them here. What we use are plain instance variables, and instance variables are set on self.
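The same syntax works when the object happens to be a class. Inside a class body self is the class, so class << self opens the class's own singleton class. A sketch with two made-up classes, one per style, suggests they end up in the same place:

```ruby
class Parrot
  class << self          # self is Parrot here
    def speak
      "Hello"
    end
  end
end

class Crow
  def self.speak         # the def self.x style from earlier
    "Caw"
  end
end

parrot_singletons = Parrot.singleton_methods   # [:speak]
crow_singletons   = Crow.singleton_methods     # [:speak]

puts parrot_singletons.inspect
puts crow_singletons.inspect
puts Parrot.speak   # Hello
puts Crow.speak     # Caw
```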

class A
  @a = 1 #self here is A
  def get_a
    @a #self here is instance of class A
  end
  def self.get_a
    @a #self here is A
  end
end
x = A.new
p x.get_a #nil
p A.get_a #1

Instance variables are only stored relative to self. So, when we wrote @a = 1, it set @a on whatever self was at that time (A, rather than an instance of A).
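We can list where the variable actually lives with instance_variables. The sketch below uses illustrative names (Settings, Special, @level) and also shows one consequence of "stored relative to self": a subclass is a different object, so it does not see the value.

```ruby
class Settings
  @level = 1             # self is Settings, so this lives on the class object

  def self.level
    @level               # self is the class the method was called on
  end
end

class Special < Settings
end

on_class    = Settings.instance_variables       # [:@level]
on_instance = Settings.new.instance_variables   # []
from_class  = Settings.level                    # 1
from_sub    = Special.level                     # nil, Special has its own (empty) state

puts on_class.inspect
puts on_instance.inspect
puts from_class.inspect
puts from_sub.inspect
```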

Include

module M
  def speak
    puts "hi"
  end
end
class K
  include M
end
k = K.new
k.speak #hi
module M
  def speak
    puts "hello"
  end
end
k.speak #hello

Ruby inserts a class as the immediate parent of K, and its methods are the module’s methods. It does not make the module itself that class’s parent, because the same module can be included in multiple classes, so the parent link has to be different for each of them. Instead, Ruby creates an anonymous class that points to the module’s methods. If we then define a new class and include the module in it, another anonymous class gets created, and it too points to the module’s methods. That is why, when we change the methods of the module, they change for every instance of every class that includes it. But again, there is no special case here. Method calling is the same as before.
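Ruby hides the anonymous proxy class, but Module#ancestors shows where the module sits in the chain. A sketch with two made-up classes sharing one module (Greeting, K1 and K2 are illustrative names):

```ruby
module Greeting
  def speak
    "hi"
  end
end

class K1
  include Greeting
end

class K2
  include Greeting
end

chain      = K1.ancestors.first(3)   # [K1, Greeting, Object]
superclass = K1.superclass           # Object, the proxy class stays hidden

k1 = K1.new
k2 = K2.new

# Reopen the module and change the method after the objects exist.
module Greeting
  def speak
    "hello"
  end
end

puts chain.inspect
puts superclass
puts k1.speak   # hello
puts k2.speak   # hello
```

Both objects pick up the new definition because both proxy classes point at the same module methods.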

Singleton => Class => Module => Superclass #Ruby chain for finding method
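To see the whole chain in action, the sketch below defines speak at every level and lets each one call super, so the result records the order in which Ruby visits them. Base, Loud and Cat are names made up for this example.

```ruby
class Base
  def speak
    ["Base"]
  end
end

module Loud
  def speak
    ["Loud"] + super
  end
end

class Cat < Base
  include Loud

  def speak
    ["Cat"] + super
  end
end

cat = Cat.new

def cat.speak
  ["singleton"] + super
end

order = cat.speak
chain = cat.singleton_class.ancestors[1, 3]   # skip the singleton class itself

puts order.inspect   # ["singleton", "Cat", "Loud", "Base"]
puts chain.inspect   # [Cat, Loud, Base]
```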
