From kragen@dnaco.net Thu Jul 30 14:10:02 1998
Date: Thu, 30 Jul 1998 14:10:00 -0400 (EDT)
From: Kragen <kragen@dnaco.net>
To: James Weirich <james.weirich@sdrc.com>
cc: clug-user@clug.org
Subject: Re: OO in C (was Re: KDE crap)
In-Reply-To: <199807301700.NAA21833@sgcpu12>
Message-ID: <Pine.SUN.3.96.980730133002.21649D-100000@picard.dnaco.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

I hope these programming posts don't scare off the newbies.  It
certainly isn't necessary to understand them to use Linux.  :)

On Thu, 30 Jul 1998, James Weirich wrote:
> >>>>> "Kragen" == Kragen  <kragen@pobox.com> writes:
>     Kragen> Actually, it's not too bad.  You don't have to do really
>     Kragen> awful-looking casts; you just have to do things like this:
> 
> Close ... You also need to pass in the object itself to each of your
> methods, otherwise you have no access to the private data of the
> object.

Whoops!  I, um, sort of missed that.  :)

> [... Kragen's example with an extra argument in the method calls ...]
> 
>     struct drawing_interface {
>         int (*draw)(struct drawn_object * obj);
>         int (*moveto)(struct drawn_object * obj, int x, int y);
>         int (*rmoveto)(struct drawn_object * obj, int dx, int dy);
>     };
> 
>     struct drawn_object {
>         struct drawing_interface *ftbl;
>         int *private_data;
>     };
> 
>     int f(struct drawn_object *x) {
>         return x->ftbl->rmoveto(x, 1, 1) &&
>                x->ftbl->draw(x);
>     }

Yes, that looks more sensible.

Actually, you can write functions like these:
int drawn_object_draw(struct drawn_object *obj) {
	return obj->ftbl->draw(obj);
}
int drawn_object_rmoveto(struct drawn_object *obj, int dx, int dy) {
	return obj->ftbl->rmoveto(obj, dx, dy);
}

. . .but I'm not sure that's actually going to make the resulting code
any easier to read.

>     Kragen> You can make private_data a void* if you want, and store
>     Kragen> class-specific information in a class-specific struct,
>     Kragen> which requires somewhat-ugly casts inside your private
>     Kragen> functions, but if you're willing to store your data in an
>     Kragen> int array, you don't even need to do that.
> 
> I like the trick of storing data in a separate data structure,
> and would probably prefer the void pointer rather than an int array.
> (I used to store arbitrary data in int arrays, but that was in FORTRAN
> when I had little choice).

I figured int arrays would probably be all right for graphical shapes
-- you can even do things like:

#define center_x 0
#define center_y 1
#define radius 2
int circle_draw(struct drawn_object *circle) {
	/* XDrawCircle is a stand-in; Xlib's real call is XDrawArc */
	XDrawCircle(xconn, xwnd, circle->private_data[center_x],
		circle->private_data[center_y], circle->private_data[radius]);
	return 1;
}
/* . . . */
#undef center_x
#undef center_y
#undef radius

> This seems to work for this simple example. Most of your icky casting
> is inside your object (if you can't avoid icky casts, at least you can
> confine them).

Yes.  It's a lot nicer to deal with than operator overloading in C++. :)
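
A rough sketch of the void* variant, assuming a class-specific struct circle_data (a name invented here) and with printf standing in for real drawing. Note that in C a void * converts to any object pointer type on plain assignment, so the "cast" inside each method is really just one line:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct drawn_object;

struct drawing_interface {
    int (*draw)(struct drawn_object *obj);
    int (*moveto)(struct drawn_object *obj, int x, int y);
};

struct drawn_object {
    const struct drawing_interface *ftbl;
    void *private_data;                   /* class-specific struct lives here */
};

/* Hypothetical per-class data; only the circle methods know this type. */
struct circle_data { int center_x, center_y, radius; };

static int circle_draw(struct drawn_object *obj) {
    struct circle_data *c = obj->private_data;   /* implicit void* conversion */
    printf("circle r=%d at (%d, %d)\n", c->radius, c->center_x, c->center_y);
    return 1;
}
static int circle_moveto(struct drawn_object *obj, int x, int y) {
    struct circle_data *c = obj->private_data;
    c->center_x = x;
    c->center_y = y;
    return 1;
}
static const struct drawing_interface circle_ftbl = { circle_draw, circle_moveto };

/* Constructor: the only place a circle's two allocations are tied together. */
struct drawn_object *new_circle(int x, int y, int r) {
    struct drawn_object *obj = malloc(sizeof *obj);
    struct circle_data *c = malloc(sizeof *c);
    if (obj == NULL || c == NULL) {
        free(obj);
        free(c);
        return NULL;
    }
    c->center_x = x;
    c->center_y = y;
    c->radius = r;
    obj->ftbl = &circle_ftbl;
    obj->private_data = c;
    assert(obj->private_data == c);
    return obj;
}

void free_circle(struct drawn_object *obj) {
    if (obj != NULL) {
        free(obj->private_data);
        free(obj);
    }
}
```

Code outside the circle methods only ever sees struct drawn_object, so the type knowledge stays confined as James describes.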

>  It also introduces an extra layer of indirection to
> accessing your data, but it's probably worth the cost when you need it.

You could get around that in ways similar to how struct sockaddr does
it, but I don't like that!
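
For anyone who hasn't seen the trick, this is roughly what I mean: put the common part at the very start of each class-specific struct, the way sockaddr_in starts out looking like a sockaddr, and convert pointers between the two. A sketch only; struct circle and circle_init are names made up for the example:

```c
#include <assert.h>
#include <stdio.h>

struct drawn_object;

struct drawing_interface {
    int (*draw)(struct drawn_object *obj);
};

/* The common header every shape starts with. */
struct drawn_object {
    const struct drawing_interface *ftbl;
};

/* Hypothetical class: data sits right after the header, no extra pointer. */
struct circle {
    struct drawn_object base;             /* must be the first member */
    int center_x, center_y, radius;
};

static int circle_draw(struct drawn_object *obj) {
    /* Valid because a pointer to a struct also points at its first member. */
    struct circle *c = (struct circle *)obj;
    printf("circle r=%d at (%d, %d)\n", c->radius, c->center_x, c->center_y);
    return 1;
}
static const struct drawing_interface circle_ftbl = { circle_draw };

void circle_init(struct circle *c, int x, int y, int r) {
    assert(c != NULL);
    c->base.ftbl = &circle_ftbl;
    c->center_x = x;
    c->center_y = y;
    c->radius = r;
}
```

It saves one indirection and one allocation, at the price of a cast in every method and a layout rule ("base comes first") that the compiler won't enforce for you.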

> I think it gets more complicated very quickly in the general case.
> Suppose I had Circle and Rectangle classes that conformed to your
> drawn_object interface given above.  If Rectangle supported (in
> addition to the drawn_object interface) SetWidth() and SetHeight()
> methods, then the function table for Rectangle would be a different
> struct shape than that of drawn_object.

You can do what C++ and COM do, and have a separate function table for
Rectangle.  But I wasn't talking about how to do inheritance in C, just
polymorphism.  Seems like you could do this without all the fugly
casts, though.

>     AsShape(rect)->Draw(rect);
>     AsRectangle(rect)->Draw(rect);  /* I guess this would work too */
>     AsRectangle(rect)->SetWidth (rect, 10);
> 
> But that still is extra baggage to worry about.  And we still have a
> very fragile relationship among all the function tables where any
> change in a base class must be rippled throughout the entire class
> hierarchy below it manually.

Yes, that's unacceptable.  Instead:

struct drawn_object;
struct drawing_interface { /* as before */ };
struct drawn_object { 
	struct drawing_interface *ftbl; 
	void *data;
};
struct rectangle;
struct rectangle_interface {
	void (*set_width)(struct rectangle *obj, int w);
	void (*set_height)(struct rectangle *obj, int h);
};
struct rectangle {
	struct rectangle_interface *ftbl;
	struct drawn_object *base_class;
	int w, h, x, y;
};

struct rectangle * new_rectangle(int w, int h, int x, int y) {
	struct rectangle *nr = malloc(sizeof(*nr));
	/* omitting error-checking */
	nr->ftbl = rectangle_ftbl;
	nr->w = w; nr->h = h;
	nr->x = x; nr->y = y;
	nr->base_class = new_drawn_object(rectangle_drawing_interface, nr);
	assert(nr->base_class->ftbl == rectangle_drawing_interface);
	assert(nr->base_class->data == nr);
	/* new_drawn_object takes an ftbl
	 * argument because it's an abstract class, so it doesn't have
	 * a valid ftbl of its own, while new_rectangle usually will
	 * be invoked to create an object that really is a rectangle,
	 * not some derived class.
	 */
	return nr;
}

This avoids most of the problems you described, although it's less
efficient; it could probably be made more efficient by changing some of
the pointers to in-place objects.  Just call me a Java addict.
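
Here is one way the "in-place" version might look, filled in enough to compile. This is a sketch under my own assumptions: the base object is embedded in the rectangle instead of allocated by new_drawn_object, the drawing interface is cut down to draw alone, and rectangle_draw and the two setters are invented stand-ins:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct drawn_object;
struct drawing_interface {
    int (*draw)(struct drawn_object *obj);
};
struct drawn_object {
    const struct drawing_interface *ftbl;
    void *data;                           /* points back at the derived object */
};

struct rectangle;
struct rectangle_interface {
    void (*set_width)(struct rectangle *obj, int w);
    void (*set_height)(struct rectangle *obj, int h);
};
struct rectangle {
    const struct rectangle_interface *ftbl;
    struct drawn_object base_class;       /* in place, not a pointer */
    int w, h, x, y;
};

static int rectangle_draw(struct drawn_object *obj) {
    struct rectangle *r = obj->data;
    printf("%dx%d rectangle at (%d, %d)\n", r->w, r->h, r->x, r->y);
    return 1;
}
static void rectangle_set_width(struct rectangle *r, int w)  { r->w = w; }
static void rectangle_set_height(struct rectangle *r, int h) { r->h = h; }

static const struct drawing_interface rectangle_drawing_interface = {
    rectangle_draw
};
static const struct rectangle_interface rectangle_ftbl = {
    rectangle_set_width, rectangle_set_height
};

struct rectangle *new_rectangle(int w, int h, int x, int y) {
    struct rectangle *nr = malloc(sizeof *nr);
    if (nr == NULL)
        return NULL;
    nr->ftbl = &rectangle_ftbl;
    nr->w = w; nr->h = h;
    nr->x = x; nr->y = y;
    nr->base_class.ftbl = &rectangle_drawing_interface;
    nr->base_class.data = nr;
    assert(nr->base_class.data == nr);
    return nr;
}
```

Code that wants a shape gets &r->base_class; code that wants a rectangle gets r. Adding a method to drawing_interface touches the rectangle's drawing table but leaves rectangle_interface alone, which is the point.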

> I've written a complete example of this technique in C and provided a
> C++ version for comparison.  It's just a little long for the mailing
> list, but if anyone is interested, you can find it at
> http://w3.one.net/~jweirich/oostuff.

Looks like you need to include an incomplete definition of struct
ShapeFuncTable.

> The I/O subsystem is one area that has often used this technique.  The
> first time I saw this example in code was the standard library
> proposal for Modula 2 for handling device independent I/O.

The sfio and vmalloc libraries also use it, as do a number of other
things from AT&T Bell Labs in recent years.

(I ought to look in my Lions book and see how Thompson handled it in
Unix.  I can't imagine he handled it much differently.)

> For "BIG" things that REALLY NEED polymorphism, I would agree the
> technique is worthwhile in C.  But as general programming style for
> most of your data structures, I think the baggage quickly becomes
> unmanageable.

Well, if you include inheritance, you might be right.  What do you
think of the implementation above, which is free of the disadvantages
you cited?

>     Kragen> Sometimes the advantages of run-time-bound polymorphism
>     Kragen> are too great to ignore.  [...]
> 
> And that's why it should be very easy to do, not cumbersome 

Agreed!

> like the C method.

Well, even with my implementation above, the C method is still more
code.  OTOH, you can *see* what's going on.

> 
>     Jim> [6] I actually did true polymorphism in C once.  [...] I
>     Jim> handled the messy problem of making sure the objects were
>     Jim> properly initialized by writing a Dialog Compiler that would
>     Jim> read a dialog description and generate the C code [...]
> 
>     Kragen> Ick.
> 
> I'm assuming the "Ick" is in response to the Dialog Compiler?  The
> Dialog Compiler did much more than just initialize the function table.  

Oh, OK.

> It
> built a complete C data structure from a simple text description of a
> dialog.  For example, I could say:
> 
>     Name:    @name___________________________________
>     Address: @addr___________________________________
>     City:    @city_________  State: @st  Zip: @zip___
>                                         @OK   @CANCEL
> 
> And it would generate all the C structures and links that would be
> interpreted at run time by the user interface.  Each field (e.g. @name,
> @zip) would be declared to be a string field or a numeric field or
> whatever was needed.

That sounds like a win.  I withdraw my Ick.  On the other hand, you
could do something like

char *addrfield = 
"     Name:    @name___________________________________\n"
"     Address: @addr___________________________________\n"
"     City:    @city_________  State: @st  Zip: @zip___\n"
"                                         @OK   @CANCEL\n";

struct form form;
init_form(&form, &database_retrieval_result);
display_form(&form, addrfield);
validate_zipcode(extract_field(&form, "city"), extract_field(&form, "st"),
		 extract_field(&form, "zip"));

. . . but I think that's really a bit more work than your method.
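
To give an idea of what the run-time side would have to do, here is a sketch of one piece: locating a field in the template and measuring how wide it is. field_width is a hypothetical helper (nothing like it is implied by init_form or display_form above), and it assumes a field is an '@', a name, and a run of underscores:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: find "@name" in tmpl and store the field's on-screen
 * width (the '@', the name, and any trailing underscores) in *width.
 * Returns 1 if the field exists, 0 otherwise. */
int field_width(const char *tmpl, const char *name, size_t *width) {
    size_t namelen = strlen(name);
    const char *p = tmpl;

    assert(width != NULL);
    while ((p = strchr(p, '@')) != NULL) {
        const char *q = p + 1;
        /* Require a whole-name match, so "zi" does not match "@zip". */
        if (strncmp(q, name, namelen) == 0 &&
            !isalnum((unsigned char)q[namelen])) {
            q += namelen;
            while (*q == '_')
                q++;
            *width = (size_t)(q - p);
            return 1;
        }
        p++;                              /* skip this '@' and keep looking */
    }
    return 0;
}
```

That is only the lookup; field types and validation would still have to be declared somewhere, which is what the Dialog Compiler got for free.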

>     Kragen> Why not just declare a "new_objecttype" function for each
>     Kragen> type of object (is it OK to call them classes? :) ), and
>     Kragen> then just not instantiate any objects of that class
>     Kragen> anywhere else than in the new_objecttype function?  [...]
> 
> Actually, this is an excellent idea!  I wish I would have thought of
> it back then (although the Dialog Compiler would still be needed).  

I've had to do it many times.

Kragen (thinking OO, writing C)


