The OS-periment: RPC-Based Daemon Model Goes ‘RC’

Neolander 2011-05-29 OS News 92 Comments

It’s funny how trying to have a consistent system design makes you constantly jump from one area of the designed OS to another. I initially just tried to implement interrupt handling, and now I’m cleaning up the design of an RPC-based daemon model, which will be used to implement interrupt handlers, along with most other system services. Anyway, now that I get to something I’m personally satisfied with, I wanted to ask everyone who’s interested to check that design and tell me if anything in it sounds like a bad idea to them in the short or long run. That’s because this is a core part of this OS’ design, and I’m really not interested in core design mistakes emerging in a few years if I can fix them now. Many thanks in advance.

About The Author

92 Comments

2011-05-29 12:09 pm
Kaj-de-Vos
You asked for criticism, so I’ll be negative: RPC is a bad concept to base an entire OS on. It’s inherently tied to the implementation language and to the implementation details of the services. That makes it difficult to port, hard to keep compatible with itself over time, and thus hard to keep compatible with different versions of the services.
The abstraction level of RPC interfaces is too low. To solve these problems, you need a messaging specification at a higher abstraction level, through declarative data specification.

2011-05-29 12:52 pm
Neolander
You asked for criticism, so I’ll be negative: RPC is a bad concept to base an entire OS on. It’s inherently tied to the implementation language
Why couldn’t wrappers be used? Most Unices have a system API that’s linked to C concepts at the core, that doesn’t prevent C++ or Python wrappers from being used by people who prefer those languages, at the cost of a small performance hit.
and to the implementation details of the services.
Again, why has it to be the case? A good interface can be standard without revealing implementation details. If I say that my memory allocator is called using the malloc(uint parameter) function, how does it prevent me from changing the memory allocator later ?
That makes it difficult to port,
Define port. What do you want to port where?
hard to keep compatible with itself over time,
Unless I miss something, it’s not harder than having a library keep a consistent interface over time. Which is, again, a matter of having the library interface not depend on the implementation details. Why should it be so hard?
and thus hard to keep compatible with different versions of the services.
Not if people don’t break the interface every morning.
The abstraction level of RPC interfaces is too low.
Why? If the interface of C-style dynamic libraries is enough, how can the RPC interface, which is just a nonblocking and cross-process variant of it in the end, be different?
To solve these problems, you need a messaging specification at a higher abstraction level, through declarative data specification.
Well, I wait for answers to the questions above before asking for more details about your suggested answer.
2011-05-30 3:27 am
Brendan
Hi,
The abstraction level of RPC interfaces is too low.
In my opinion, it’s the opposite problem – the RPC interface is too high level.
A “call” can be broken into 4 phases – the callee waiting to be called, the caller sending data to the callee, the callee executing, and the callee sending data back to the caller.
This could be described as 3 operations – “wait for data”, “send data and wait to receive data back” and “send data and don’t wait to receive data back”.
Now, stop calling it “data” and call it “a message” (it’s the same thing anyway, mostly), and you end up with “get_message()”, “send_message_and_wait_for_reply()” and “send__message()”.
For synchronous software (e.g. emulating RPC); the callee does “get_message()” and blocks/waits until a message arrives, the caller does “send_message_and_wait_for_reply()” and blocks/waits until it receives the reply; and then the callee does “send_message()” to return the reply. It’s exactly like RPC.
The interesting thing is that for asynchronous software, you’d use “send_message()” and “get_message()” and don’t need anything else. Basically, by breaking it down into these primitives you get synchronous and asynchronous communication (rather than just synchronous); and people can mix and match without limitations. For example, you could have a fully asynchronous service, where one client process uses synchronous messaging to use the service and another client process uses asynchronous messaging to use the service, and the service itself doesn’t need to care what each client is doing.
However, you would probably want to offer a few extra primitives to make things easier. For example, you might consider adding “send_message_and_wait_for_reply_with_timeout()”, and “check_for_message()” (which would be like “get_message()” but returns a “NO_MESSAGES” error instead of blocking/waiting for a message when there are no messages to get).
-Brendan

2011-05-30 5:43 am
Alfman verbose=1
Brendan,
“A ‘call’ can be broken into 4 phases – the callee waiting to be called, the caller sending data to the callee, the callee executing, and the callee sending data back to the caller.”
I’ve done this before, usually passing XML data structures around and manipulating them with DOM & SAX Parsers. While the approach is flexible, I’d personally be terrified to work on a system where this model is used exclusively to glue hundreds or thousands of components together (as in an operating system).
Can you illustrate why breaking down messaging to such a low level is superior to what .net does with marshalling and web service proxy objects?
If you are not familiar with it, the .net compiler takes a SOAP web service and builds a proxy class which exposes all the functions in the SOAP interface. The proxy class exposes both synchronous functions and asynchronous ones.
MyWebService x = new MyWebService();
result = x.MyFunction(…); // synchronous
AsyncRequest r = x.Begin_MyFunction(…); // Async
… // other code
result = x.End_MyFunction(r); // Async return
Is there a good reason typical devs might want to access the messaging stack at a lower level than this?
Keep in mind, that a programmer could always pass a single hash table to any function, which would technically be as expressive and extensible as any other conceivable messaging protocol (so long as the inner objects are either serializable or marshalled).
Edited 2011-05-30 05:46 UTC
2011-05-30 5:48 am
Neolander
I probably shouldn’t use the “RPC” term, you too got confused into thinking that I was talking about blocking calls, while I am in fact doing nonblocking calls.
Once you have a nonblocking call interface, you can trivially implement a blocking call interface on top of it. I simply choose not to do it because I don’t want to favor this kind of dirty practices if I can avoid to.
As for RPC being too high level, well… I’m tempted to say that pipes are too low level.
Don’t get me wrong, pipes are great for programs of the “streaming” kind, which have an input data stream, process it, and return results in an output data stream. That’s why I have them. But most tasks of a system API do not belong to the data stream processing family, and are more about executing a stream of instructions.
In that case, pipes are too low-level, because they are fundamentally a transportation medium for integer data, not instructions. If you want to send instructions across a pipe, you have to use a communication protocol on top of the pipe layer in order to get an instruction representation, so what you have is user-mode RPC implemented on top of the pipe IPC primitive.
I personally think that if an IPC primitive is to be very frequently used, it’s better to implement it directly in the kernel (or at least parts of it), due to the extra control it gives over the communication process. The kernel executes trusted code, but library code can be compromised.

2011-05-30 10:08 am
Brendan
Hi,
I probably shouldn’t use the “RPC” term, you too got confused into thinking that I was talking about blocking calls, while I am in fact doing nonblocking calls.
A call is something that doesn’t return until it completes. A “non-blocking call” is something that defies logic..
I got the impression that your “non-blocking call” is a pair of normal/blocking calls, where (for e.g.) the address of the second call is passed as an argument to the first call (a callback). I also got the impression you’re intending to optimise the implementation, so that blocking calls that return no data don’t actually block (but that’s an implementation detail rather than something that effects the conceptual model).
As for RPC being too high level, well… I’m tempted to say that pipes are too low level.
I’m not sure where pipes were mentioned by anyone, but I don’t really like them much because they force the receiver to do extra work to determine where each “piece of data” ends.
Pipes can also make scheduling less efficient. For e.g. if a task unblocks when it receives IPC (like it should), then a task can unblock, look at what it received, realise it hasn’t received enough data to do anything useful, and block again; which is mostly a waste of time (and task switches).
For an analogy (to summarise), think of email. Asynchronous messaging is like people writing emails and sending them to each other whenever they want while they do other things. Synchronous messaging and RPC is like people sending emails and then sitting there doing nothing for hours while they wait for a reply. Pipes are like people sending pieces of a conversation – “I just sent this email to say hell“, “o and wish you a happy birth“, “day.\n -Fred\000Dear sir, we are“…
I personally think that if an IPC primitive is to be very frequently used, it’s better to implement it directly in the kernel (or at least parts of it), due to the extra control it gives over the communication process. The kernel executes trusted code, but library code can be compromised.
I assumed IPC primitives would be implemented directly in the kernel because you can’t implement IPC anywhere else. For example, if you have an “IPC service” implemented as a process/daemon, how would processes communicate with the “IPC service”?
The other thing to consider is that usually IPC has a certain amount of control over the scheduler – tasks block when waiting for IPC, and tasks unblock (and then potentially preempt) when they receive IPC, so it makes sense to implement it near the scheduler.
– Brendan

2011-05-30 11:41 am
Neolander
A call is something that doesn’t return until it completes. A “non-blocking call” is something that defies logic..
I got the impression that your “non-blocking call” is a pair of normal/blocking calls, where (for e.g.) the address of the second call is passed as an argument to the first call (a callback). I also got the impression you’re intending to optimise the implementation, so that blocking calls that return no data don’t actually block (but that’s an implementation detail rather than something that effects the conceptual model).
What I want to do is…
1/Process A gives work to do to process B through a “fast” system call, that in turn calls a function of B in a new thread using a stack of parameters given by A.
2/Process A forgets about it and goes doing something else.
3/When process B is done, it sends a callback to process A through the same mechanism using which A has given B work to do (running a function of A). Callbacks may have parameters, the “results” of the operation.
Does it remind you of something ?
I’m not sure where pipes were mentioned by anyone, but I don’t really like them much because they force the receiver to do extra work to determine where each “piece of data” ends.
For me send_message() and get_message() was like pipe operation (you send messages to or receive messages from the pipe). Sorry if I didn’t get it.
For an analogy (to summarise), think of email. Asynchronous messaging is like people writing emails and sending them to each other whenever they want while they do other things. Synchronous messaging and RPC is like people sending emails and then sitting there doing nothing for hours while they wait for a reply. Pipes are like people sending pieces of a conversation – “I just sent this email to say hell“, “o and wish you a happy birth“, “day.\n -Fred\000Dear sir, we are“…
Then what I do is definitely not RPC in the usual sense, as it is an asynchronous mechanism too. If the above description reminds you of some better name, please let me now.
I assumed IPC primitives would be implemented directly in the kernel because you can’t implement IPC anywhere else. For example, if you have an “IPC service” implemented as a process/daemon, how would processes communicate with the “IPC service”?
If you have something like a pipe or message queue, you can implement higher-level IPC protocols on top of it, and use user-space libraries to implement a new IPC mechanism that uses these protocols. That’s what I was talking about. But except when trying to make the kernel unusually tiny, I’m not sure it’s a good idea either.
The other thing to consider is that usually IPC has a certain amount of control over the scheduler – tasks block when waiting for IPC, and tasks unblock (and then potentially preempt) when they receive IPC, so it makes sense to implement it near the scheduler.
Totally agree.
Edited 2011-05-30 11:57 UTC
2011-05-31 2:44 am
Brendan
What I want to do is…
1/Process A gives work to do to process B through a “fast” system call, that in turn calls a function of B in a new thread using a stack of parameters given by A.
2/Process A forgets about it and goes doing something else.
3/When process B is done, it sends a callback to process A through the same mechanism using which A has given B work to do (running a function of A). Callbacks may have parameters, the “results” of the operation.
Does it remind you of something ?
While I can see some similarities between this and asynchronous messaging, there’s also significant differences; including the overhead of creating (and eventually destroying) threads, which (in my experience) is the third most expensive operation microkernels do (after creating and destroying processes).
On top of that, (because you can’t rely on the queues to serialise access to data structures) programmers would have to rely on something else for reentrancy control; like traditional locking, which is error-prone (lots of programmers find it “hard” and/or screw it up) and adds extra overhead (e.g. mutexes with implied task switches when under lock contention).
I also wouldn’t underestimate the effect that IPC overhead will have on the system as a whole (especially for “micro-kernel-like” kernels). For example, if IRQs are delivered to device drivers via. IPC, then on a server under load (with high speed ethernet, for e.g.) you can expect thousands of IRQs per second (and expect to be creating and destroying thousands of threads per second). Once you add normal processes communicating with each other, this could easily go up to “millions per second” under load. If IPC costs twice as much as it does on other OSs, then the resulting system as a whole can be 50% slower than comparable systems (e.g. other micro-kernels) because of the IPC alone.
If you have something like a pipe or message queue, you can implement higher-level IPC protocols on top of it, and use user-space libraries to implement a new IPC mechanism that uses these protocols. That’s what I was talking about. But except when trying to make the kernel unusually tiny, I’m not sure it’s a good idea either.
In general, any form of IPC can be implemented on top of any other form of IPC. In practice it’s not quite that simple because you can’t easily emulate the intended interaction with scheduling (blocking/unblocking, etc) in all cases; and even in cases where you can there’s typically some extra overhead involved.
The alternative would be if the kernel has inbuilt support for multiple different forms of IPC; which can lead to a “Tower of Babel” situation where it’s awkward for different processes (using different types of IPC) to communicate with each other.
Basically, you want the kernel’s inbuilt/native IPC to be adequate for most purposes, with little or no scaffolding in user-space.
– Brendan
2011-05-31 7:26 am
Neolander
While I can see some similarities between this and asynchronous messaging, there’s also significant differences; including the overhead of creating (and eventually destroying) threads, which (in my experience) is the third most expensive operation microkernels do (after creating and destroying processes).
Ah, Brendan, Brendan, how do you always manage to be so kind and helpful with people who play with OSdeving ? Do you teach it in real life or something ?
Anyway, have you pushed your investigation so far that you know which step of the thread creation process is expensive ? Maybe it’s something whose impact can be reduced…
On top of that, (because you can’t rely on the queues to serialise access to data structures) programmers would have to rely on something else for reentrancy control; like traditional locking, which is error-prone (lots of programmers find it “hard” and/or screw it up) and adds extra overhead (e.g. mutexes with implied task switches when under lock contention).
This has been pointed out by Alfman, solved by introducing an asynchronous operating mode where pending threads are queued and run one after the other. Sorry for not mentioning it in the post where I try to describe my model, when I noticed the omission it was already too late to edit.
I also wouldn’t underestimate the effect that IPC overhead will have on the system as a whole (especially for “micro-kernel-like” kernels).
I know, I know, but then we reach one of those chicken and egg problems which are always torturing me : how do I know that my IPC design is “light enough” without doing measurements on a working system for real-world use cases ? And how do I perform these measurements on something which I’m currently designing and is not implemented yet ?
For example, if IRQs are delivered to device drivers via. IPC, then on a server under load (with high speed ethernet, for e.g.) you can expect thousands of IRQs per second (and expect to be creating and destroying thousands of threads per second). Once you add normal processes communicating with each other, this could easily go up to “millions per second” under load. If IPC costs twice as much as it does on other OSs, then the resulting system as a whole can be 50% slower than comparable systems (e.g. other micro-kernels) because of the IPC alone.
First objection which spontaneously comes to my mind is that this OS is not designed to run on server, but rather on desktop and smaller single-user computers.
Maybe desktop use cases also include the need to endure thousands of IRQ per second though, but I was under the impression that desktop computers are ridiculously powerful compared to what one asks from their OSs and that their reactivity issues rather come from things like poor task scheduling (“running the divx encoding process more often than the window manager”) or excessive dependency on disk I/O.
In general, any form of IPC can be implemented on top of any other form of IPC. In practice it’s not quite that simple because you can’t easily emulate the intended interaction with scheduling (blocking/unblocking, etc) in all cases; and even in cases where you can there’s typically some extra overhead involved.
Understood.
The alternative would be if the kernel has inbuilt support for multiple different forms of IPC; which can lead to a “Tower of Babel” situation where it’s awkward for different processes (using different types of IPC) to communicate with each other.
Actually, I tend to lean towards this solution, even though I know of the Babel risk and have regularly thought about it, because each IPC mechanism has its strength and weaknesses. As an example, piping and messaging systems are better when processing a stream of data, while remote calls are better suited when giving a process some tasks to do.
You’re right that I need to keep the number of available IPC primitives very small regardless of the benefits of each, though, so there’s a compromise there and I have to investigate the usefulness of each IPC primitive.
Edited 2011-05-31 07:28 UTC
2011-06-02 10:02 am
Brendan
PART I
Hi,
While I can see some similarities between this and asynchronous messaging, there’s also significant differences; including the overhead of creating (and eventually destroying) threads, which (in my experience) is the third most expensive operation microkernels do (after creating and destroying processes).
Ah, Brendan, Brendan, how do you always manage to be so kind and helpful with people who play with OSdeving ? Do you teach it in real life or something ?
Anyway, have you pushed your investigation so far that you know which step of the thread creation process is expensive ? Maybe it’s something whose impact can be reduced…
Thread creation overhead depends on a lot of things; like where the user-space stack is (and if it’s pre-allocated by the caller), how kernel stacks are managed (one kernel stack per thread?), how CPU affinity and CPU load balancing works, how much state is saved/restored on thread switches and must be initialised to default values during thread creation (general registers, FPU/MMX/SSE, debug registers, performance monitoring registers, etc), how thread local storage is managed, etc.
For an extremely simple OS (single-CPU only, no support for FPU/MMX/SSE, no “per thread” debugging, no “per thread” performance monitoring, no thread local storage, no “this thread has used n cycles” time accountancy) that uses one kernel stack (e.g. an unpreemptable kernel); if the initial state of a thread’s registers is “undefined”, and the thread’s stack is pre-allocated; then it could be very fast. Not sure anyone would want an OS like that though (maybe for embedded systems).
Also, if other operations that a kernel does are extremely slow then thread creation could be “faster than extremely slow” in comparison.
There’s something else here too though. For most OSs, typiaclly only a thread within a process can create a thread for that process; which means that at the start of thread creation the CPU/kernel is using the correct process’ address space, so it’s easier to setup the new thread’s stack and thread local storage. For your IPC this isn’t the case (the sending process’ address space would be in use at the time you begin creating a thread for receiving process), so you might need to switch address spaces during thread creation (and blow away TLB entries, etc) if you don’t do it in a “lazy” way (postpone parts of thread creation until the thread first starts running).
On top of that, (because you can’t rely on the queues to serialise access to data structures) programmers would have to rely on something else for reentrancy control; like traditional locking, which is error-prone (lots of programmers find it “hard” and/or screw it up) and adds extra overhead (e.g. mutexes with implied task switches when under lock contention).
This has been pointed out by Alfman, solved by introducing an asynchronous operating mode where pending threads are queued and run one after the other. Sorry for not mentioning it in the post where I try to describe my model, when I noticed the omission it was already too late to edit.
Hehe. Let’s optimise the implementation of this!
You could speed it up by having a per-process “thread cache”. Rather than actually destroying a thread, you could pretend to destroy it and put it into a “disused thread pool” instead, and then recycle these existing/disused threads when a new thread needs to be created. To maximise the efficiency of your “disused thread pool” (so you don’t have more “disused threads” than necessary), you could create (or pretend to create) the new thread when IPC is being delivered to the receiver and not when IPC is actually sent. To do that you’d need a queue of “pending IPC”. That way, for asynchronous operating mode you’d only have a maximum of one thread (per process), where you pretend to destroy it, then recycle it to create a “new” thread, and get the data needed for the “new” thread from the queue of “pending IPC”.
Now that it’s been optimised, it looks very similar to my “FIFO queue of messages”. Instead of calling “getmessage()” and blocking until a message is received, you’d be calling “terminate_thread()” and being put into a “disused thread pool” until IPC is received. The only main difference (other than terminology) is that you’d still be implicitly creating one initial thread.
[Continued in Part II – silly 8000 character limit…]
2011-06-02 10:03 am
Brendan
PART II
I also wouldn’t underestimate the effect that IPC overhead will have on the system as a whole (especially for “micro-kernel-like” kernels).
I know, I know, but then we reach one of those chicken and egg problems which are always torturing me : how do I know that my IPC design is “light enough” without doing measurements on a working system for real-world use cases ? And how do I perform these measurements on something which I’m currently designing and is not implemented yet ?
You can estimate; but how fast would be “too fast”?
In my opinion, there’s no such thing as too fast. It’s mostly a question of whether any extra overheads are worth any extra benefits.
For example, if IRQs are delivered to device drivers via. IPC, then on a server under load (with high speed ethernet, for e.g.) you can expect thousands of IRQs per second (and expect to be creating and destroying thousands of threads per second). Once you add normal processes communicating with each other, this could easily go up to “millions per second” under load. If IPC costs twice as much as it does on other OSs, then the resulting system as a whole can be 50% slower than comparable systems (e.g. other micro-kernels) because of the IPC alone.
First objection which spontaneously comes to my mind is that this OS is not designed to run on server, but rather on desktop and smaller single-user computers.
Maybe desktop use cases also include the need to endure thousands of IRQ per second though, but I was under the impression that desktop computers are ridiculously powerful compared to what one asks from their OSs and that their reactivity issues rather come from things like poor task scheduling (“running the divx encoding process more often than the window manager”) or excessive dependency on disk I/O.
The difference between a server OS and a desktop OS is mostly artificial product differentiation made in higher levels of the OS (e.g. if it comes with HTTP/FTP servers and no GUI, or if it comes with a GUI and no HTTP/FTP servers; licensing, advertising, availability of support contracts, etc). It makes no difference at the lowest levels; until/unless you start looking at fault tolerance features (redundancy, hot-plug RAM/CPUs, etc).
The alternative would be if the kernel has inbuilt support for multiple different forms of IPC; which can lead to a “Tower of Babel” situation where it’s awkward for different processes (using different types of IPC) to communicate with each other.
Actually, I tend to lean towards this solution, even though I know of the Babel risk and have regularly thought about it, because each IPC mechanism has its strength and weaknesses. As an example, piping and messaging systems are better when processing a stream of data, while remote calls are better suited when giving a process some tasks to do.
Well, not quite (I’m not sure you fully understand the differences between messaging and pipes).
Pipes would work well for streams of bytes, but messaging wouldn’t be ideal (there’d be extra/unnecessary overhead involved with breaking a stream into smaller pieces). Most things aren’t streams of bytes though – they’re streams of “some sort of data structure”. Maybe a stream of “video frames”, a stream of “keypresses”, a stream of “commands/instructions”, a stream of “compressed blocks of audio”, etc. In these cases there’s natural boundaries between the “some sort of data structures” – messages would be ideal (one message per “some sort of data structure”) and pipes wouldn’t be ideal (there’d be extra/unnecessary overhead involved with finding the natural boundaries).
Also, for messaging each message typically has a “message type” associated with it. This means that the same receiver can handle many different things at the same time (e.g. it could be receiving a stream of “video frames”, a stream of “keypresses” and a stream of “compressed blocks of audio” at the same time, and be able to distinguish between them using the message types). Pipes don’t work like that – you’d need a different pipe for each stream. This means that rather than waiting to receive messages of any type, you end up needing something special like “select()” to monitor many different pipes.
– Brendan
2011-06-02 10:47 am
Neolander
You can estimate; but how fast would be “too fast”?
In my opinion, there’s no such thing as too fast. It’s mostly a question of whether any extra overheads are worth any extra benefits.
For me, “too fast” would be when gaining extra speed implies having another desirable characteristic of the OS become exceedingly low. Speed has its trade-offs, and to solve the problem of trade-offs it’s good to have goals. Well, I think we agree there anyway.
The difference between a server OS and a desktop OS is mostly artificial product differentiation made in higher levels of the OS (e.g. if it comes with HTTP/FTP servers and no GUI, or if it comes with a GUI and no HTTP/FTP servers; licensing, advertising, availability of support contracts, etc). It makes no difference at the lowest levels; until/unless you start looking at fault tolerance features (redundancy, hot-plug RAM/CPUs, etc).
That’s the way it’s done today, but it has not been always done like that. Classic Windows and Mac OS, as an example, were designed for desktop use, and would have been terrible as server OSs for a number of reasons.
With TOSP, I design solely for “desktop” (more precisely, personal) computers, because I believe that reducing the amount of target use cases will in turn simplify the design in some areas and reduce the amount of trade-offs, resulting in a design that’s lighter and better suited for the job. It’s the classical “generic vs specialized” debate, really.
Well, not quite (I’m not sure you fully understand the differences between messaging and pipes).
For me, the difference is about what is the atomic transmitted unit that is processed by the recipient.
In a pipe, that atomic unit is a fixed-size heap of binary data, typically a byte in the UNIX world. In a message, the atomic unit is a variable-sized message, which is not actually processed by the recipient until the message’s terminator has been received.
Am I right ?
Pipes would work well for streams of bytes, but messaging wouldn’t be ideal (there’d be extra/unnecessary overhead involved with breaking a stream into smaller pieces). Most things aren’t streams of bytes though – they’re streams of “some sort of data structure”. Maybe a stream of “video frames”, a stream of “keypresses”, a stream of “commands/instructions”, a stream of “compressed blocks of audio”, etc. In these cases there’s natural boundaries between the “some sort of data structures” – messages would be ideal (one message per “some sort of data structure”) and pipes wouldn’t be ideal (there’d be extra/unnecessary overhead involved with finding the natural boundaries).
But what about a kind of pipe which would take something larger than a byte as its basic transmission unit ?
Like, you know, if a keypress is defined by a 32-bit scancodes, a 32-bit scancode pipe ?
You could avoid the overhead of a messaging protocols for fixed-size structures this way.
Also, for messaging each message typically has a “message type” associated with it. This means that the same receiver can handle many different things at the same time (e.g. it could be receiving a stream of “video frames”, a stream of “keypresses” and a stream of “compressed blocks of audio” at the same time, and be able to distinguish between them using the message types). Pipes don’t work like that – you’d need a different pipe for each stream. This means that rather than waiting to receive messages of any type, you end up needing something special like “select()” to monitor many different pipes.
What if a program could monitor several streams at the same time by having a different callback being triggered when a message comes in each of the pipes ?
(Again, if the OSnews comment system has decided to archive this discussion by the time you answer, feel free to continue replying on my blog.)
Edited 2011-06-02 10:50 UTC
2011-06-02 6:59 pm
Alfman verbose=1
Neolander,
“Like, you know, if a keypress is defined by a 32-bit scancodes, a 32-bit scancode pipe ?
You could avoid the overhead of a messaging protocols for fixed-size structures this way.”
Unless there’s a compelling reason, I wouldn’t limit devs to fixed size messages.
“What if a program could monitor several streams at the same time by having a different callback being triggered when a message comes in each of the pipes ?”
Hmm, all this talk of pipes is making me think why aren’t pipes and RPC unified into one fundamental concept?
The typical use cases for pipes is that they are explicitly “read”, where as for RPC a function is explicitly called with parameters “passed in”.
But wouldn’t it be possible for them to share the same paths in the OS and just let the userspace determine which access method is more appropriate?
Would there be a shortcoming in doing so?
…Just a thought.
2011-06-02 6:44 pm
Alfman verbose=1
Brendan,
“You could speed it up by having a per-process “thread cache”. Rather than actually destroying a thread, you could pretend to destroy it and put it into a “disused thread pool” instead, and then recycle these existing/disused threads when a new thread needs to be created.”
Yes, many MT devs use this design, I use it in my async package because dynamic thread creation via pthreads is so costly. But I do wonder if it is inherently slow, or just that pthreads/linux are inefficient at it.
In theory, one could just alloc a stack, register swap area, TLS and accounting structure (possibly in one call to malloc). Then we add this structure to a linked list of threads and fire off the thread function.
It wouldn’t even be necessary to coordinate the above with other CPUs if each CPU had it’s own process thread “list”.
It wouldn’t even be necessary to coordinate with any syscalls (if userspace could be trusted with managing the thread list within it’s own process, as long as the process only endangers itself, this might not be an issue).
The entire thread lifecycle could take place on a single CPU with no inter CPU synchronization at all.
Now obviously, there would need to be a separate mechanism to migrate threads between CPUs. But this migration process might be able to take the synchronization hits once per cycle, instead of once per thread.
During an interrupt (pre-emptive threads), an explicit yield, or a blocking OS call, the OS would swap the cpu state and start another thread in the queue.
That summerizes the lightest thread implementation I can conceive of, and if it worked that way, I would think the performance between a thread pool and ordinary thread creation might be a wash (particularly with an efficient or pooled malloc).
“Most things aren’t streams of bytes though – they’re streams of ‘some sort of data structure’.”
The lack of message boundaries has always been a major design flaw in my opinion.
There’s no reason (other than legacy design) that pipes shouldn’t allow us to send discrete packets. It should be possible for high level application code to specify boundaries (even if very large packets still span multiple read requests).
This deficiency has lead to the same design patterns being reimplemented over and over again inside of network applications needing to reassemble messages from the stream.
“Also, for messaging each message typically has a “message type” associated with it. This means that the same receiver can handle many different things at the same time”
Yes, but if the OS has explicit support for RPC, wouldn’t the need for discrete messages and message types be largely eliminated?
2011-06-02 10:26 am
Neolander
There’s something else here too though. For most OSs, typiaclly only a thread within a process can create a thread for that process; which means that at the start of thread creation the CPU/kernel is using the correct process’ address space, so it’s easier to setup the new thread’s stack and thread local storage. For your IPC this isn’t the case (the sending process’ address space would be in use at the time you begin creating a thread for receiving process), so you might need to switch address spaces during thread creation (and blow away TLB entries, etc) if you don’t do it in a “lazy” way (postpone parts of thread creation until the thread first starts running).
Not necessarily, it depends where the IPC primitives are managed. If RPC is done through system calls, then you can create a thread while you’re in kernel mode and have no extra address space switching overhead.
Hehe. Let’s optimise the implementation of this!
You could speed it up by having a per-process “thread cache”. Rather than actually destroying a thread, you could pretend to destroy it and put it into a “disused thread pool” instead, and then recycle these existing/disused threads when a new thread needs to be created. To maximise the efficiency of your “disused thread pool” (so you don’t have more “disused threads” than necessary), you could create (or pretend to create) the new thread when IPC is being delivered to the receiver and not when IPC is actually sent. To do that you’d need a queue of “pending IPC”. That way, for asynchronous operating mode you’d only have a maximum of one thread (per process), where you pretend to destroy it, then recycle it to create a “new” thread, and get the data needed for the “new” thread from the queue of “pending IPC”.
Now that it’s been optimised, it looks very similar to my “FIFO queue of messages”. Instead of calling “getmessage()” and blocking until a message is received, you’d be calling “terminate_thread()” and being put into a “disused thread pool” until IPC is received. The only main difference (other than terminology) is that you’d still be implicitly creating one initial thread.
Yeah, I had thought about something like this for thread stacks (keeping a cache of orphaned stacks to remove the need to allocate them). But you’re right that it can totally be done for full threads, with even better performance.

2011-05-30 12:29 pm
Kaj-de-Vos
You’re talking about the transport method. That is indeed the other side of the coin. I have been talking about the problem that RPC implies an inflexible semantic data exchange (the payload). You’re right that RPC also implies an inflexible transport method.

2011-05-29 1:20 pm
Kaj-de-Vos
I’m not really interested in spending a lot of time discussing this, sorry. You asked for warnings, and this is mine. We could argue endlessly about the details, but it boils down to this: the abstraction level of declarative messaging is higher than RPC. Leaking of implementation details is detrimental to interfacing with other hardware architectures, binding with other languages, and interfacing with historical versions of interfaces on the same language and hardware architecture. Therefore, a higher abstraction level is desirable.

2011-05-29 7:17 pm
Alfman verbose=1
Kaj-de-Vos,
“Leaking of implementation details is detrimental to interfacing with other hardware architectures”
I understand all your initial criticisms, however I’m curious how an RPC interface leads to leaking of implementation details?
Corba interfaces are completely portable across many languages/platforms, including scripting languages.
Heck, just using corba itself would provide instant RPC compatibility with almost all serious languages out there.
If corba is too heavy weight to use in the OS, one could still provide an OS binding for it – that might even be a novel feature for the OS.

2011-05-29 7:33 pm
Neolander
If you understood his criticism, could you please answer my questions ? Or at least some of them ? I still don’t get what his problem is myself, and it seems that he isn’t interested in answering…
Edited 2011-05-29 19:37 UTC

2011-05-29 7:43 pm
Kaj-de-Vos
I will, but I got the feeling that you are not ready to accept the criticism you asked for. The overarching problem here is that most of the world is in mental models that consist of code instead of data, and thus code calls instead of semantic interchange, and thus implementation details of how to do something, instead of what to do. It turns out to be hard for people to switch models, so I have stopped trying over time.

2011-05-29 7:38 pm
Kaj-de-Vos
In RPC, you assume that the remote end has a procedure you can call. That’s a big assumption. To make it work, you assume that the remote procedure is written in the same programming language. That’s a huge implementation “detail”.
Remote objects are just an object oriented extension of the RPC concept. They were en vogue in the nineties, when everybody switched to remote architectures. This was when CORBA and other such methods were found to be too cumbersome.
Messaging has a long history, really. These lessons were already learned in AmigaOS, BeOS, Syllable and new messaging systems such as 0MQ. You can also ask yourself what the most successful remote protocol does. Is HTTP/HTML RPC based?

2011-05-29 8:02 pm
Neolander
Well, this looks like the beginning of an answer, so if you allow me…
In RPC, you assume that the remote end has a procedure you can call. That’s a big assumption.
At the core, we have this: daemon process wants to inform the kernel that there’s a range of things which it can do for other processes. The procedure/function abstraction sounded like the simplest one around the “things which it can do” concept to me.
To make it work, you assume that the remote procedure is written in the same programming language. That’s a huge implementation “detail”.
Hmmm… Can you mention a modern, serious programming language (joke languages like BF don’t count) that does not have the concepts of a function or a pointer ? Because once the concepts are there, dealing with the switch from one language to another during a call is just a matter of gory implementation magic.
Messaging has a long history, really. These lessons were already learned in AmigaOS, BeOS, Syllable and new messaging systems such as 0MQ. You can also ask yourself what the most successful remote protocol does. Is HTTP/HTML RPC based?
I’d prefer it if we didn’t put the notions of long story and success in there. DOS has a long story, Windows is successful. Does it mean that these are examples which everyone in the OS development community would like to follow ?
Edited 2011-05-29 20:05 UTC

2011-05-29 8:21 pm
Kaj-de-Vos
You keep defending your existing notions, instead of entertaining the notion I introduced that is apparently new to you. Do you agree that declarative data is at a higher abstraction level than a procedure call? Do you agree that not specifying an implementation language is simpler than specifying a specific language?
If you are not willing to look at common implementations, lessons from history become meaningless, either good or bad. Do you have experience with messaging in Amiga, BeOS, Syllable, microkernels, REBOL, enterprise messaging, or anything else?
2011-05-29 8:38 pm
Neolander
You keep defending your existing notions, instead of entertaining the notion I introduced that is apparently new to you.
I work this way. If you want to prove that your way of thinking is better than mine, you have to expose clearly what’s wrong in my way of thinking. Alfman has been successfully doing this when defending the async model vs the threaded model, as such async has now more room in my IPC model.
Do you agree that declarative data is at a higher abstraction level than a procedure call?
Define declarative data, Google and Wikipedia have no idea what this is and I haven’t either
Do you agree that not specifying an implementation language is simpler than specifying a specific language?
Simpler? Oh, certainly not, if you consider the whole development cycle. The higher-level abstractions are, the more complicated working with them tends to be, as soon as you get away from the path drawn for you by the abstraction manufacturer and you have to think about what the abstraction actually is (which is, say, the case when implementing the abstraction)
As an example, when explaining sorting algorithms, it is common to draw some sketches that implicitly represent lists (packs of data with an “insert” operation). Now, try to visually represent sorting in an abstract storage area that may be as well a list or an array. How hard is that ?
As another example, which programming abstraction is easier to define to someone who has no programming knowledge : a function or an object ?
If you are not willing to look at common implementations, lessons from history become meaningless, either good or bad. Do you have experience with messaging in Amiga, BeOS, Syllable, microkernels, REBOL, enterprise messaging, or anything else?
I’m not sure what is it that you’re calling messaging actually. Are you talking about the concept of processes communicating by sending data to each other (pipes), the idea of communicating over such a data link with a messaging protocol (like HTTP), … ?
Edited 2011-05-29 20:48 UTC
2011-05-29 9:01 pm
Kaj-de-Vos
I don’t have to convince you. You asked for criticisms that would be useful to you. If you don’t consider what you requested, it won’t be useful to you. It seems my impression was right that you don’t understand the concept of messaging, and it would take me a lot of time and energy to change your mental model.
2011-05-29 9:36 pm
xiaokj
Define declarative data, Google and Wikipedia have no idea what this is and I haven’t either
Let me help, whatever I can, here. If, and that is a very big “if”, I am correct, he is referring to something really esoteric. It should be a design philosophy coming straight out of things like “Art of Unix Programming”.
Apparently, he is trying to tell you that there is a very much more abstract way to deal with stuff than the RPC. To work with RPC, you will need to define the function name and its accepted parameters, and that would then be set in stone. If you used declarative data, then, what you would do is to have the library export a datasheet of “what can I do” and when you pick a specific function, “what options are here”, complete with version numbers. Preferably XML. Then, the clients can make do with whatever that is provided.
The benefits of this is that major changes can be done a lot easier than before. However, there is a major downside too: it is much harder to code in that form. The benefits tend to pay out over the long run, but still.
The main point of doing things like this, other than the obviously stated one, is that it makes you get used to declarative data structures. They, on the other hand, make much more sense! As the Art of Unix Programming notes, the human mind is a lot better at tackling complex data than complex code flows. Declarative data structures push the complexity into the data side, so that the overall code becomes a lot cleaner, and it is in there that the benefits can be most easily reaped.
Take the pic language for example. It is a lot easier to declare that you want a rectangle of a certain size, and that its top left corner (NW) corner is connected to an arrow that points to a circle of radius so and so. The code then takes care of the rest. These kinds of code tend to stay sane even with extreme longevity whereas if you tried to define things by coordinates, sooner or later your API will be replaced, for such simplistic API are a dime a dozen. Declarative programming is something like that, and it is really time-saving.
I hope I have correctly captured his idea. I don’t know anything, actually, so take some salt with this.
2011-05-29 9:50 pm
Kaj-de-Vos
That’s pretty good, except:
– It’s not esoteric, but widely used. Hence my example of HTML.
– I do not prefer XML. It has become a reflex for people to come up with that immediately, but like RPC, it’s an implementation detail. Actually, I think XML is way too heavy.
– Specification sheets (such as DTDs) are not strictly necessary. This is also an implementation detail. A metadata specification is required if all the world needs to know the meaning of the data, but most interfaces are between parties that know each other and don’t need to be understood by parties that have no preexisting knowledge of the interface.
– Therefore, there are no inherent drawbacks of difficult implementation. It can be as simple as you make it.
2011-05-29 9:58 pm
xiaokj
Personally, I prefer something lighter too: the HTTP protocol itself is a wonder, and it is much lighter than the tag heavy XML, of course.
However, a specification sheet is a good idea since implementations can, and do, change. Better to code with expectation of change rather than go by “interface memory”. If you wanted to have something be as abstract as declarative would allow, then why strap yourself down with black magic? Again, something light would be very nice too. Maybe just a version number is good enough, but still.
Glad that I could actually understand you with just the magic 2 words. It may not be esoteric, but this is proper old school (actually, more like good sense than old).
2011-05-29 10:11 pm
Kaj-de-Vos
The best declarative interface is a self-descriptive one. Which has the effect of the metadata specification being woven into the communication. In that case, there is of course still a standard for what the data can look like, but that standard is fixed, like a type system.
2011-05-30 5:15 am
Alfman verbose=1
It’s difficult to understand you without a specific reference to what you mean.
I’m going to take a stab at it and guess that SOAP (the successor to xml-rpc) might be the most popular instance of the type of interface you are alluding to?
http://weblog.masukomi.org/writings/xml-rpc_vs_soap.htm
In ASP.NET, the soap interface is a derivation of the function prototype, therefore, in this instance SOAP hasn’t really extended the expressiveness of the function prototype; but in theory the potential is there.
I’d be really interested in seeing good examples of SOAP which have been exploited beyond wrapping regular functions. Anyone familiar with any?
JSON is another popular interchange format for web browsers, often prefered over xml due to more compactness and better correlation to abstract data types.
http://www.json.org/
Kaj-de-Vos, unless I’m mistaken, it don’t seem like you have a problem with RPC itself, but with the non-extensible interfaces provided by a C function prototypes.
If this is the case, then I understand. And now I am forced to admit that C function prototypes are not very future compatible.
C++ supports overloaded functions, so you could get away with adding more parameters in the future, but the model breaks down with too many variants, and in any case it would be C++ specific.
How do you feel about languages which permit/require named parameters? The parameters are effectively a hash table. I think it’s a future-friendly model, but I await your comments.
2011-05-30 5:54 am
Neolander
Kaj-de-Vos, unless I’m mistaken, it don’t seem like you have a problem with RPC itself, but with the non-extensible interfaces provided by a C function prototypes.
If this is the case, then I understand. And now I am forced to admit that C function prototypes are not very future compatible.
Interesting. Can you explain why ?
2011-05-30 7:02 am
Alfman verbose=1
“‘If this is the case, then I understand. And now I am forced to admit that C function prototypes are not very future compatible.’
Interesting. Can you explain why ?”
I’m not sure this is the best example…but take a look at the win32 apis.
They’re full of functions which needed updated function prototypes. To retain compatibility, the devs have to come up with new function names. For example:
OpenFile
CreateFile
CreateDirectory
CreateDirectoryEx
CreateDirectoryTransacted
…
“OpenFile Function
Creates, opens, reopens, or deletes a file.
Note This function has limited capabilities and is not recommended. For new application development, use the CreateFile function.”
http://msdn.microsoft.com/en-us/library/aa365430%28VS.85%29…
The windows userspace API has a bunch of these, which stem from the fact that the original function names can no longer be used for the extended functionality (which stems from the fact that C function prototypes are not update friendly).
Ultimately we ended up with an Win32 API having different functions names for variations of the same thing.
A language with dynamic parameters (like Perl) would not have suffered from this, since we could append parameters at any time. Named parameters are even more flexible (plsql,.net).
Another example of having multiple functions doing nearly the exact same thing:
CreateWindow // Original ascii version
CreateWindowW // Wide unicode characters
CreateWindowA // ascii byte characters
In order to add support for unicode, MS needed to change the prototypes for thousands of functions. They renamed the old function to “*A” and created new unicode variants “*W”, and then used ifdefs in the header files to map user code to one or the other.
In theory, a more extensible parameter model could have allowed Win32 to continue using one function for both character sets.
The point isn’t that C cannot handle the changes, it can, but that once a change is made any old code will cease to be compatible. Therefore, using C function prototypes either restrict future modifications, or breaks old code.
The Linux kernel is different than windows in that most syscalls are wrapped in glibc and are therefore shielded from upgraded syscalls to an extent. For example some linux syscalls were duplicated to handle 64bit file sizes. So when glibc was upgraded to use 64bit positions, they didn’t need to rename the functions as they would have needed to in windows.
Edited 2011-05-30 07:12 UTC
2011-05-30 7:43 am
Neolander
I may have some ideas to solve the C extensibility problem, but I need to think about them a bit more. The core idea is that C’s problem are instantly solved by a mere recompile, so if we can do what this recompile would do at run time, the problem is solved.
2011-05-30 12:34 pm
Neolander
Appending function parameters to C functions is actually quite easy to manage in an RPC-based API, if a small modification in prototype handling is added.
Thinking about it, how would one append parameters in a backwards compatible way if all existing code could be recompiled (ex : interpreted languages)? One would only have to add new parameters at the end of the function prototype, with default values so that legacy code that does not specify those parameters will still work once recompiled.
Let’s keep note of that.
Picture ourself in a scenario where process A is making remote calls on process B. When A was compiled, the remote call prototype was…
void dummy(int a);
But by the time when B was compiled, the remote call’s prototype had become…
void dummy(int a, int b = 3);
Now, let’s add the requirement that A, too, has to broadcast the prototypes of the RPC calls which it will be doing to the RPC subsystem. This requirement is very useful if we have backwards compatibility in mind.
If all these conditions are met, then the RPC subsystem can detect incompatibilities AND solve the problem.
Let’s arbitrarily take a C calling convention where the rightmost parameters are pushed on the stack first (the order doesn’t matter, but precise examples are easier to read). What the RPC subsystem will be doing is to create a “default stack frame” for A with the default value of b (here, 3) in it. Whenever A makes an RPC dummy() call, the RPC subsystem creates a copy of the “default stack frame”, pushes A’s parameters on it (here, the value of a), and then makes the call using this new stack. As far as B is concerned, everything goes as if A had been recompiled.
Now, that was the easy part.
2011-05-30 1:10 pm
Neolander
Now, on the other hand, when studying the CreateWindow/CreateWindowW/CreateWindowA problem which you also mention, things get a bit more funky.
You would like to put the same name on slightly different functions with a different set of parameters. So what you want, in short, is function overloading. But the problem is that sadly C doesn’t support it at the language level. Can we help it ?
On the client side, this problem is relatively easy to solve cleanly, by having the client broadcast the prototype of the function which it wants to call.
Let’s assume that at the time when process A was written, there was only one text output function in the system API, depending on ASCII strings…
void print(char* stuff);
…and that at the time when process B was compiled, there’s now two versions of print() available, one for ASCII and one for Unicode.
void print(char* stuff);
void print(wchar_t* stuff);
In that case, if process A has specified to the RPC mechanism that it’s going to call the ASCII version, no problem will arise. The RPC mechanism will silently handle the overloading stuff, even though the language in which A has been written theoretically couldn’t.
Now, what’s more interesting is, what about the case where B is written in a language which doesn’t support overloading, templating, or any similar stuff ?
At the time where A was written, compiling B was simple, because there was only one version of print(). But what should be done now ?
Well, since C can’t handle overloading, B’s code will have to include two versions of print() with different names, the Win32 way. As an example, internally, B might now use…
void print_a(char* stuff);
void print_w(wchar_t* stuff);
…with a bit of search and replace to remove nasty references to print() in B’s code.
But the interesting part is, since we have the extra RPC abstraction layer, if sufficient control on the “prototype broadcasting” process is given to the developer of B, nothing prevents users of overloading-friendly languages from seeing a cleanly overloaded print() interface.
That’s because if such control is provided, nothing prevents the developer of B from associating the RPC call “void print(char*)” to the function “void print_a(char*)” and the RPC call “void print(wchar_t*)” to the function “void print_w(wchar_t*)”. That’s because a C prototype is just that, a prototype. The name which a function bears does not actually matters, only the function pointer matters. And RPC provides the layer of abstraction needed to do this kind of “overloaded RPC” black magic.
I don’t know if I made myself sufficiently clear in this post.
Edited 2011-05-30 13:13 UTC
2011-05-30 1:37 pm
Neolander
What I’ve also been investigating is how further than shared libraries RPC can get before reaching the point where people have to go metadata in order to get more flexibility.
Another issue of shared libraries is what happens when function parameters are changed. As an example, let’s imagine that PIDs used to be described using 32-bit unsigned integers in the old version of the system API and that they are now described using 64-bit integers.
For a compiled shared library, if caller code is not recompiled, backwards compatibility is broken, because functions like…
void dummy(PID somepid, int whatever);
…will have changed between the old and the new version of the library, with in effect…
void dummy(uint32_t somepid, int whatever); //Before
void dummy(uint64_t somepid, int whatever); //After
However, what if the RPC subsystem was able to do the same kind of automated integer conversion as a C compiler is capable of, and what if it was able to call a function with a 64-bit parameter using a stack from the 32-bit version, by zero-padding the 32-bit integer in order to make it a 64-bit one ?
My conclusion is that it’s not worth it. This example sounds simple, because it’s an introductory one. But once structures and classes start to get around, things get really dirty. In particular, if a classes’ constructor does not have constant behavior and has an internal state based on global or static variables, it becomes totally impossible to get around the need to recompile things in a compiled languages, unless struct/class initialization itself is done using RPC, which would be a design constraint that’s at the same time restrictive and gross beyond repair.
Solving a problem in part is worst that not saving it at all, so let’s consider than changing a function’s parameter is still breaking backwards compatibility. Solving all the problems of the Win32 API which you’ve mentioned will probably be enough for today as far as I’m concerned
Edited 2011-05-30 13:39 UTC
2011-05-30 5:11 am
Neolander
Oh, alright, now I see better what is it going.
It would be a bit like using objects for making calls (yeah, yeah, I know, implementation details and stuff).
A malloc implementation could be described like…
//This is written in the PidginObject language
service Malloc [
….option AllocatedSize
]
And for the caller, it’d be like…
mymalloc = setup_service(Malloc)
mymalloc.setproperty(AllocatedSize = <whatever>)
call_service(mymalloc)
…or if we’re a messaging freak…
send_message(daemon, “run_service Malloc with option AllocatedSize = <whaterver>, option SaveWilly = no, etc…”)
Actually, I plan to use something like that for UI description.
It has at least the following benefits :
-> You can indeed use language-agnostic headers (like XML or CSS). More precisely, you’d use headers written in your own language.
-> The order in which you put function parameters don’t matter. That means that you can change one default parameter without redefining all the others “before” it, since there isn’t such a concept
-> You can use a consistent data description language for services and UIs, settings, etc…
There are some tricks worth pointing out, though.
First, a significant header parsing overhead has to be here each time a functionality is called, not only when it is declared. This could be quite problematic for low-level stuff that has to be fast.
If you want to fully implement your own communication protocol, and not use those of an existing programming language, then you have to not only write the function prototypes in your new language, but also describe the data with it. Now this is a tricky one. In C, everything can be described in term of blocks of memory with integers inside and pointers. But there, if you want to do things cleanly using your own specifications, you need to create a syntax for arrays, a syntax for objects, a syntax for strings, a syntax for numbers, etc… one for each data abstraction which you want people to be able to use.
What this means is that you’ll have to code a data translator that exactly as complex as a modern compiler, and have a great data conversion overhead, akin to that of having heterogeneous OSs written in different languages and running on different networks communicating over a network, except that it’ll occur all the time, even when you remain on a local machine, running a single architecture, and doing calls between programs written in the same language. You do not optimize for the common case.
Astonishingly enough, this does not solve the compatibility problem.
The classical compatibility issue is that functions can gain parameters, but not change name, change the name of parameters, change the order of parameters, or lose parameters.
Here, the object replacing our functions cannot change name either (otherwise processes looking for that service using the old name won’t find it). Parameters can’t get a different name or disappear for the same reason (programs coded for an old version of the service wouldn’t work). So basically, all we can do is change the orders in which parameters are written.
My question is, is it worth the performance hit of going back and forth an intermediate representation each time a call is made ? Is it worth the bloat and security risk of having a the translator around each time something as common as a procedure code is made ? Is it worth the extreme coding complexity of that translator, and the lost comfort of being able to use a large number of assumptions about the language being used ? How about rather writing function parameters in the right order the first time ?
Edited 2011-05-30 05:15 UTC
2011-05-30 2:02 pm
Kaj-de-Vos
You’re on the right track here: it’s indeed a matter of how parameters are passed (the message). But you’re framing most of your thought in traditional terms of code, with calls and parameters and many other details. Doing implementations in those terms has led to the idea that it is complex and costly. As I said in another post, this is not so if you do it right. To do that, you have to forget about all those things that are irrelevant.
Let’s make this concrete. How would you implement a service that draws a line? You could draw up a plan including all sorts of functions, parameters, transfer methods, interface description languages and parsers for it, but that is all irrelevant. To draw a line, assuming the pen is set at a starting point, it suffices to specify this:
draw x y
You could call “draw” a function name, but that is irrelevant and assumes too much. It’s just a command. x and y are the parameters. Not because they’re inherently parameters, but because they’re preceded by a command. This is our first self-descriptive feature. But we’ve already assumed it’s a line in a 2D space. At least we haven’t assumed it’s either a screen or a plotter, but we could make it more general by specifying a 3D line like this:
draw x y z
I don’t believe in higher physical dimensions, so we’ll leave it at this. We’ve written it in a human message, so how do we encode this in an efficient machine message that wouldn’t be out of place in the core of an operating system? A naive first attempt would say that we need numbers for each component. Both sides of the interface would need to agree on a command set, like syscalls. draw is our first command, and if we encode it in an integer, all parts can have the same encoding:
1
integer
integer
Now this is really not hard to parse, and the performance loss against a C function call is negligible. On the other hand, we haven’t improved much on its flexibility yet, except that we are completely free to interpret this as a sync or async command. An important goal is to keep changing interfaces compatible, so we could do that by brute force by prefixing an interface version:
1
1
integer
integer
This is trivial here, but not so in low level code languages such as C. You’d have to depend on symbol versioning, for example, making you dependent on certain toolchains. However, even better than a wholesale interface version is to make compatibility more granular by weaving it into the data. Let’s see what happens on changes. Consider the case that you want to move coordinates to floating point to use subpixel precision in your graphics. This actually happened during the development of AtheOS. The abstract specification is still the same:
draw x y
But we would need to bump the interface version because the encoding changes:
2
1
float
float
This makes old clients incompatible with new services when they drop the old interface. We can avoid that by introducing a type system. So far, we have data of three types:
1: command
2: integer
3: float
Here’s a typed version of the interface:
1
1 1
3 float
3 float
The parser in the interface becomes a little more complex, but it’s still trivial, and very flexible. It’s now easy to support the integer interface in the same interface version:
1
1 1
2 integer
2 integer
We’re venturing into terrain that low level languages without proper polymorphism can’t really support. We can still count the numbers we use on the fingers of one hand, and we already have a powerful type system independent of any implementation language. We’re starting to feel very powerful, and confident to look far into the future. We will add types when we need them, but what happens when we introduce new types that old interfaces don’t know about? We can keep some new interfaces usable by old clients if they can at least parse the encoding, and skip data they don’t understand, or pass it along to interfaces that do understand. When AtheOS switched completely to floating point graphics coordinates, old programs just kept working and were then running in a more advanced environment that they knew nothing about. To keep new types parsable by old interfaces, the encoding needs to include their sizes. We can do this only for new types to optimise the encoding. REBOL has almost sixty data types, so it’s fairly safe to reserve a space for hundred standard types. Let’s say a mathematician has a weird virtual coordinate space in which he wants to draw:
1
1 1
101 size coordinate
101 size coordinate
So far we have disregarded the starting coordinate for the line. Let’s introduce a command to set it:
set x y
1
1 2
3 float
3 float
Now we can draw a line starting anywhere:
set x y
draw p q
1
1 2
3 float
3 float
1 1
3 float
3 float
Note that in RPC, this would be two calls, with the associated overhead, so we’re actually becoming more efficient here. But wait, we wanted to support 3D, so we now have to solve the problem of variable length parameter lists. We can write it like this:
set [x y]
draw [p q]
And we will have to encode the number of parameters somehow. To keep the format a generic stream of values, we could associate it with the command encoding:
1
1 2 2
3 float
3 float
1 1 2
3 float
3 float
set [x y z]
draw [p q r]
1
1 2 3
3 float
3 float
3 float
1 1 3
3 float
3 float
3 float
Alternatively, we could introduce a list type and pretend that a command has one parameter, the list:
1
1 2
4 2
3 float
3 float
1 1
4 2
3 float
3 float
Note that this is an alternative encoding for the same specification:
set [x y]
draw [p q]
Does that look prohibitively complicated?
2011-05-30 2:17 pm
Kaj-de-Vos
Sorry, I’ve already made that last example too complex. It’s very easy to fall into that trap. Because we defined a command type, the data stream is self-synchronising: if an interface has consumed all the parameters it understands, it can simply skip forward to the next command. So there is strictly no need to define a parameter number or list in this example. Still, they’re useful constructs to solve other issues.
2011-05-30 2:57 pm
Neolander
Thanks you a lot, this makes it much easier to understand the concepts which you’re invoking.
Some points remain obscure, though…
1/How does the type system help the switch from integer to float in the drawing system ?
2/More generally, is function overloading dealt with by the parser, or by the daemon doing the drawing work ?
3/Biggest issue I have : how is this so different from the kind of RPC which I advocate ? I mean, all in all it still looks a lot like a non-blocking RPC interface implemented on top of a messaging protocol. Even sending batches of RPC requests could be done in a “pure” RPC model, given an extra layer of abstraction that allows to send whole arrays of “RPC call” objects to the RPC subsystem.
Also…
Because we defined a command type, the data stream is self-synchronising: if an interface has consumed all the parameters it understands, it can simply skip forward to the next command.
I fail to see how letting client process send requests with an incorrect set of parameters could be a good idea.
Edited 2011-05-30 15:02 UTC
2011-05-30 3:20 pm
Kaj-de-Vos
A type system is needed if you want to support polymorphism in an interface. How else would you know what type an item is and what size it has in the encoding? With types, it’s trivial for a drawing server to support both integer and floating point coordinates.
Skipping of unknown parameters and commands is useful to enable old interfaces to use some new ones. This is what web browsers do with HTML. If your browser doesn’t support gradients, you’ll get graphics without gradients. If you deem any interface upgrade to be incompatible, you just bump the wholesale interface version.
There are no functions, so there is no function overloading. You really have to let go of such terms. 🙂
Regarding which side does what: each service has to implement a little binary parser to interpret messages sent to it.
This is really quite different from RPC, but from other posts I understand that you are confusing the concept of RPC. You’re also conflating the semantic payload with the transport mechanism. I’m only concerned with the payload here. You’re basically free to choose a complementary transport method.
2011-05-30 3:50 pm
Neolander
A type system is needed if you want to support polymorphism in an interface. How else would you know what type an item is and what size it has in the encoding? With types, it’s trivial for a drawing server to support both integer and floating point coordinates.
I still don’t understand. At first, the way you presented your messaging system made it sound like it had a killer feature that allowed you to fully switch from integer drawing coordinates to floating point drawing coordinates without having to recompile old code. Were you just advocating the ability to do polymorphism ?
Skipping of unknown parameters and commands is useful to enable old interfaces to use some new ones. This is what web browsers do with HTML. If your browser doesn’t support gradients, you’ll get graphics without gradients. If you deem any interface upgrade to be incompatible, you just bump the wholesale interface version.
Hold on. Skipping of unknown commands I can understand. On the other hand, skipping of unknown parameters isn’t so easy to do, at least the way you presented your messaging system above.
Let’s imagine you had a command for drawing colored lines in the spirit of “line color_code x y” where color_code, x, and y are integers.
Then later you decide that putting color support in the line drawing function is a mistake, and prefer to go with a more classical brush system. You hence define your new line-drawing command to be “line x y”.
If you try to run legacy code this way, no warning will ever occur, but the color code will be understood as an x coordinate and the x coordinate will be understood as an y coordinate, so problems will occur unless the version number is bumped.
Only, only if the dropped parameter was at the end of the parameter list, will the command execute without problem.
Is it worth it ?
There are no functions, so there is no function overloading. You really have to let go of such terms. 🙂
Meh Commands are functions without parentheses and commas, the way they were done in early programming languages, why is this distinction so important ? It’s all syntax, the concept is the same…
Regarding which side does what: each service has to implement a little binary parser to interpret messages sent to it.
This is really quite different from RPC, but from other posts I understand that you are confusing the concept of RPC.
Indeed. I call the mechanism which I advocate nonblocking/asynchronous RPC, but the ongoing discussion with Brendan implies that the name may be inappropriate. In case someone knows a better name for what I’m advocating, I’m all for it. Otherwise, I’ll have to try to find a new name for it.
Again, the principle which I’m advocating is as follows :
1/Daemon provides the kernel with a prototype of a function which can be “remotely” run by client processes, corresponding to a service which it can provide.
2/Client provides the kernel with a prototype of a function which it wants to “remotely” run. Kernel checks that the service is available, and optionally does some black magic to prepare communication between different compatible versions of the same service if required, then says that everything is okay.
3/At some point, client wants daemon to perform the action it has publicly advertised to be up to. So it performs something similar to a procedure call, except that it’s one of the daemon’s functions that is run and that the client is not blocked while the code is executed : it has just sent a service request.
4/If an operation completion notice or results are required, they are sent to the client in the form of a callback : a function specified in advance by the client is run on the client, and results are transmitted through this function’s parameters.
Again, if this IPC mechanism has a name, I’m interested.
You’re also conflating the semantic payload with the transport mechanism. I’m only concerned with the payload here. You’re basically free to choose a complementary transport method.
Where am I confusing those ?
Edited 2011-05-30 15:58 UTC
2011-05-30 3:55 pm
Kaj-de-Vos
Maybe reread everything, follow my references and sleep a few nights on it. I can’t make it clearer than I have.
2011-05-30 10:26 pm
Alfman verbose=1
Kaj-de-Vos,
“Maybe reread everything, follow my references and sleep a few nights on it. I can’t make it clearer than I have.”
But you’ve been so vague.
You keep saying that “RPC” is limited, etc, but your examples keep pointing to the fact that C function prototypes are limited. In other words, you seem to be moving the goal posts.
I guess we’re at an impasse. I understand if you don’t want to discuss it any further.
2011-05-30 10:59 pm
Kaj-de-Vos
Sorry, I don’t see how a series of commented implementation examples at byte level is vague.
2011-05-31 2:53 am
Alfman verbose=1
Kaj-de-Vos,
“Sorry, I don’t see how a series of commented implementation examples at byte level is vague.”
It’s vague because it doesn’t address the reasons why RPC is deficient in your view. I’m beginning to think that your deliberately avoiding the topic.
2011-05-31 11:13 am
Kaj-de-Vos
Right, I deliberately respond to the request for comments on an RPC design and write extensive examples in order to avoid the subject.
2011-05-31 5:58 pm
Alfman verbose=1
Kaj-de-Vos,
“Right, I deliberately respond to the request for comments on an RPC design and write extensive examples in order to avoid the subject.”
Your criticisms of RPC made sense – assuming fixed C function prototypes and simple data types. But beyond that, you’ve overgeneralized points as though they applied to all state of the art RPC in general. Every time I requested clarification using counterexamples, you completely ignored them.
Even in your last post, you chose to write this sarcastic statement instead of pointing out why a modern hierarchical RPC cannot handle the example you provided, which is what I’m interested in.
If you are right, I’d really like to know how modern RPC fails in your example. The example by itself doesn’t imply a failure because as far as I can tell the structures you brought up are permissible using modern RPC.
2011-05-31 6:41 pm
Kaj-de-Vos
I think you’re being vague. I wrote examples for you at byte level. Give me a link to the specific RPC implementation you propose and examples that match my examples, so we can try to compare.
2011-05-31 7:56 pm
Alfman verbose=1
Kaj-de-Vos,
“I think you’re being vague. I wrote examples for you at byte level. Give me a link to the specific RPC implementation you propose and examples that match my examples, so we can try to compare.”
Well you still haven’t answered my questions. I’m not saying that you are wrong, I am just trying to get answers.
I hate repeating myself, but at an abstract level what is preventing someone from implementing the structure you proposed in dot net (for example), and then passing that structure to a remote procedure through a soap web service call to another Perl component (for example)?
Edited 2011-05-31 19:57 UTC
2011-05-31 8:14 pm
Kaj-de-Vos
Nothing, that’s exactly the point. My alternative does not depend on any programming language, that’s what I’ve been trying to get through all the time. And it’s not abstract. I gave examples at the byte level, with the purpose that you can implement those exactly as-is in most any programming language.
And as I also said before, it does not depend on any transport method, either. As Brendan said, you can basically implement any IPC on any other IPC if you want to. If you want to transfer a declarative message over RPC you can do that, but it’s silly, because declarative messaging is a superior replacement to RPC.
2011-05-31 9:12 pm
Neolander
Nothing, that’s exactly the point. My alternative does not depend on any programming language, that’s what I’ve been trying to get through all the time. And it’s not abstract. I gave examples at the byte level, with the purpose that you can implement those exactly as-is in most any programming language.
Inexact. Your commands have a syntax, so there’s a language involved. You simply happen to create a new language for the purpose of sending/receiving commands, instead of using the one you use to develop the OS. This mean that code following a best-case scenario (written in the same language as the OS, on the same architecture, etc…) will suffer translation overhead just as well as code written in an obscure language that only one person in the world uses.
Except, of course, if you design your language so that it uses calling conventions that are very close to that of the system language, in order to reduce the overhead. But in that case, you basically use the same language, just a different dialect of it.
All in all, I don’t see what’s so great there.
And as I also said before, it does not depend on any transport method, either. As Brendan said, you can basically implement any IPC on any other IPC if you want to. If you want to transfer a declarative message over RPC you can do that, but it’s silly, because declarative messaging is a superior replacement to RPC.
Yeah, and you twist these words. RPC can be implemented on top of messaging protocols just as well as declarative messaging can be implemented on top of RPC, so none is superior to the other from that point of view. And Brendan also said that having IPC primitives that are not fine-tuned for the target job is not necessarily a good idea either…
2011-05-31 9:20 pm
Kaj-de-Vos
Sigh. As I suspected with your very first response, I am wasting my time here. You are not interested in the criticisms you requested, you are just interested in contradicting them and trying to twist my words so you can still claim to be right. I also suspect you are not interested in writing an operating system, but just in endlessly talking about it. That’s fine, but don’t drag me into your hobby, because I’m actually developing a few of them, so my time is very limited.
2011-05-31 10:23 pm
Alfman verbose=1
Kaj-de-Vos,
“As I suspected with your very first response, I am wasting my time here. You are not interested in the criticisms you requested, you are just interested in contradicting them and trying to twist my words so you can still claim to be right.”
Though this wasn’t addressed to me, I appreciate your time. There will always be people with views contradictory to our own. In a sense it’s better that way since it encorages deviation from a monoculture, and leaves all parties with a more in depth understanding of the subject, never mind that we may still disagree on the subjective importance of competing goals.
2011-05-31 10:34 pm
Kaj-de-Vos
Spoken like a true politician. However, I don’t think you’ve learned anything from this. So as the article predicts, you’ll have to learn it many years from now; that is, if you ever get to the implementation phase.
2011-05-31 11:10 pm
Alfman verbose=1
Kaj-de-Vos,
“Spoken like a true politician. However, I don’t think you’ve learned anything from this.”
I like debating technical subjects. And, like everyone else, I learn on the way. I’ve already implemented my OS years ago, and it lacked RPC. But I think RPC would be an interesting kernel interface, who are you or I to discourage it? Criticism is good, but it is silly for the critic, no matter the credentials, to lay out the law expecting no debate (unless of course you are a professor).
“So as the article predicts, you’ll have to learn it many years from now; that is, if you ever get to the implementation phase.”
You responded to my post, but I’m guessing this wasn’t addressed to me?
2011-06-01 8:47 am
Neolander
Criticism is good, but it is silly for the critic, no matter the credentials, to lay out the law expecting no debate (unless of course you are a professor).
I can’t help but find this sentence a bit strange.
There certainly are areas of knowledge where things are written in stone and can’t be discussed, like (recent) history or languages. But in the vast realm of scientific knowledge, professors should be able to answer criticism and justify why they explain things in a certain way, otherwise what they’re teaching is nothing more than a religion.
As an example, when teaching about gravity, a high-school physics teacher who is said “nonsense, heavy objects fall faster than light ones” could (theoretically speaking) grab two marbles of significantly different mass, have the critic confirm that fact, put them in vacuum, have them fall a great height while filming, and then show the video frame to the critic to have him acknowledge that both objects have fallen at the same speed.
Of course, it’s a bit hard to do that in practice because teachers lack infinite time. But at least they can mention the experiment instead of simply saying “m*dv/dt = m*g IS, you nonbeliever !”.
2011-06-01 10:11 am
Alfman verbose=1
Neolander,
“Criticism is good, but it is silly for the critic, no matter the credentials, to lay out the law expecting no debate (unless of course you are a professor).”
“I can’t help but find this sentence a bit strange.”
Haha, I learned this lesson the hard way. Some professors encourage an intellectual debate, others just want obedient regurgitation. I’ve had my share of both. With the later, it’s best not to stray too far from the gospel. Overt disagreement can yield lower grades despite a perfectly defensible answer.
There was one professor in my major who drove students up a wall. I got an undeserved D on a midterm because he wouldn’t accept any (correct) answers but his own. What was even worse is that he’s one of the lazy professors who insists on keeping his tests secret. After reviewing our exam results, he’d literally demand hard copies back to be shredded – he was paranoid about them getting out. I have several problems with that. It’s a place of learning for god’s sake. I even confronted him about it but he said his 20 years teaching experience trumped any complains I might have.
His arrogance was not only unpleasant in class, but had long lasting repercussions for students graduating with unfairly low GPA. I’m still having trouble getting over it.
Edited 2011-06-01 10:13 UTC
2011-06-01 2:49 pm
xiaokj
Or you can simply link to our beloved Walter Lewin @ MIT. His series of physics lectures includes as much down-to-earth experiments you can witness as it does theory.
2011-05-31 10:34 pm
Neolander
Sigh. As I suspected with your very first response, I am wasting my time here. You are not interested in the criticisms you requested, you are just interested in contradicting them and trying to twist my words so you can still claim to be right.
You are obviously free to take it this way, especially if you feel like you have nothing left to say… Thanks for your time.
I also suspect you are not interested in writing an operating system, but just in endlessly talking about it.
It doesn’t have to be endless talk, but talking and, more precisely, arguing, is definitely a vital part of design in my opinion, while I hope we agree that good design is itself a vital part of developing any operating system that aims at having a long-term future.
An OS is something huge. If you try to design everything by yourself, without asking feedback from anyone, you quickly end up reaching the limitations of the human mind : you care only about a limited number of sides of the designed product, and forget about the rest. But there are often important things hiding in “the rest”. Hence the need to find people who care about sides of the designed product that you don’t currently care about.
That’s fine, but don’t drag me into your hobby, because I’m actually developing a few of them, so my time is very limited.
Oh, I’ve never dragged you in particular. But if you come up and say “oh, what you’re doing is crap, you should do like me it’s simply superior”, you are to expect some replies asking you to put some meat in those claims, and put some arguments and data behind the “superior”.
Edited 2011-05-31 22:37 UTC
2011-05-31 9:14 pm
Alfman verbose=1
Kaj-de-Vos,
“Nothing, that’s exactly the point. My alternative does not depend on any programming language”
Well, one big difference is that using a standardized RPC mechanism enables the programmer work on a level above parsing byte streams and recreating a local structures internally, which most devs would agree is tedious. Without standard RPC, the customized parsing code would need to be ported to each platform to use the service.
“And as I also said before, it does not depend on any transport method, either.”
I agree, but eventually you’re implementation is going to nail it down to *something*. That something can either be a standardized RPC mechanism, with the portability and auto-serialization benefits RPC offers, or it can be some custom protocol framing individual bytes in a byte stream.
Either way works, so why should we oppose RPC?
“If you want to transfer a declarative message over RPC you can do that, but it’s silly, because declarative messaging is a superior replacement to RPC.”
That’s debatable, arguably modern RPC is a replacement for parsing bytes out of a byte stream.
Maybe you prefer working with the byte stream, but there’s no denying that 1) RPC offers a level of abstraction 2) many programmers prefer working at higher levels.
Is this good/bad? I don’t know, one the one hand less qualified people end up working in the field creating more bloat. On the other hand, higher level concepts are empowering and enable us to get more done per unit time.
Edited 2011-05-31 21:18 UTC
2011-05-30 5:27 pm
xiaokj
Let me help here too!
First of all, let’s deal with the earlier question. You asked something along the lines of “why bother with this when we can just design the RPC sensibly in the first place?” Well, the answer is that this *is* the sensible way out. It is inevitable that you will need to incorporate some fundamental changes somewhere down the road, why not do it properly in the first place? Also, you can simply make an optimising parser — given that it would not change so often, the parser can run slow (this can be something like mkinitrd). If the filesystem supports notification, then that can be used to auto-invoke the parser per alteration. This ensures that we can actually not get much of a performance hit.
Now, for the specific questions,
1) There is no type system! Okay, it does look like one, but it actually is just regular data written in a specific way. The great thing is that it can be parsed by an simple program and the outcome can instantaneously migrate the system from integer to floating point calculations.
2) This depends on the choice of the implementer. If infraction is known rare, then it should be sensible to make a compromise — the standard case is done by the drawing primitive, and the edge cases can be done by an external parser generated by the optimising parser of the data spec sheet. This ensures performance with no problems in compatibility.
3) This interface is a lot more flexible! Different OSes can just pass around the spec sheet and everybody can interoperate without difficulty (even understand each other’s binary blobs; bad idea, I know, but still). Changes can be made at whim and most importantly, you are no longer hard-wiring code; you are able to just modify plain old data, which is a lot more malleable than code, surely!
Okay. Now to the last part. Maybe processes will have it less, but programs, in general, should not obnoxiously assume that they are free to mangle whatever they have been given. If there are parts they do not understand, barfing may actually destroy the critical debugging information. A program that keeps silent of the unknown (barfing only upon stuff it knows is bad) is actually desirable: it is capable of being combined with others!
Take the Troff package for example: The datastream actually includes more than just roff data, it includes eqn and tbl for example. When eqn does not understand the tbl input it receives, it just keeps quiet, knowing that something down the chain will understand it. Of course, it does barf when it is given nonsense in its own area of expertise.
Also, the example given above is only one part of the entire philosophy here. The ascii program’s example is one of the more amazing ones I have seen: Instead of generating the entire program’s output from scratch, the original author had realised that the whole table, precomputed, is actually better to work with.
Neolander, please try to read the Art of Unix Programming before we can actually continue with the discussion. There is a lot from there permeating this discussion.
2011-05-30 6:59 pm
Neolander
First of all, let’s deal with the earlier question. You asked something along the lines of “why bother with this when we can just design the RPC sensibly in the first place?” Well, the answer is that this *is* the sensible way out. It is inevitable that you will need to incorporate some fundamental changes somewhere down the road, why not do it properly in the first place?
This sounds a bit religious. Why should there be a single sensible way out ? Isn’t there supposed to be several sensible ways out depending on which design criteria you have, which things matter more or less to you ? I personally believe in a “the right tool for the right job” approach, and try to learn the benefits and drawbacks of each approach before deciding which one is best for my design criteria, while other people with other use criteria will probably make other choices.
Also, you can simply make an optimising parser — given that it would not change so often, the parser can run slow (this can be something like mkinitrd). If the filesystem supports notification, then that can be used to auto-invoke the parser per alteration. This ensures that we can actually not get much of a performance hit.
Wait a minute, why are you talking about file systems already ? This is interprocess communication, were are in the early stages of a microkernel’s boot and the filesystem is still far from being available yet.
Now, for the specific questions,
1) There is no type system! Okay, it does look like one, but it actually is just regular data written in a specific way. The great thing is that it can be parsed by an simple program and the outcome can instantaneously migrate the system from integer to floating point calculations.
Hmmm… You should check Kaj’s post, he totally mentions a type system in there (that’s why the first integer number in his command parameters is here for), that separates integer and floating point data.
3) This interface is a lot more flexible! Different OSes can just pass around the spec sheet and everybody can interoperate without difficulty (even understand each other’s binary blobs; bad idea, I know, but still).
This is desirable for a number of scenarios, but I’d like to point out that I’m currently talking about a system API which does not have “being portable to other OSs without issue” as it goals. Besides, I don’t see why basing my OS on a declarative data paradigm would offer better interoperability with anything but OSs based on declarative data paradigms (like Syllabe). How would such a paradigm make interoperability with, say, an UNIX system where everything is about text files and data pipes, easier ?
Changes can be made at whim
Define “changes”. I don’t think that you can suddenly decide tomorrow that’s malloc’s integer parameter should be a number of megabytes instead of a number of bytes and have all programs based on the old implementation work without recompiling them. Some changes may be easier, sure, but you’d have to precise which and why.
and most importantly, you are no longer hard-wiring code; you are able to just modify plain old data, which is a lot more malleable than code, surely!
That plain old data defines an interface to a chunk of code. If it does not include code, and if the code is not modified, then I can’t see how anything but cosmetic changes can be made without breaking compatibility between the interface and the underlying code (which would sure be a bad idea).
Neolander, please try to read the Art of Unix Programming before we can actually continue with the discussion. There is a lot from there permeating this discussion.
I’m currently in the process of doing it. Didn’t know that it was freely accessible on the web when it was mentioned before.
Edited 2011-05-30 19:00 UTC
2011-05-30 7:45 pm
Neolander
Well, at this stage of the book, I find mentioned as part of the UNIX philosophy something which I agree truly deeply represents it : traditional UNIX software is designed to do a simple filtering operation on a stream of data. Preferably ASCII data.
This is, in my opinion, the core reason why Unices are so good at running web servers, so bad at GUI, and why UNIX-based liveCDs tend to have horrible performance. UNIX is based on text files and streams of text, and it is excellent at doing that. But the further away one gets from fast file storage and text I/O, the worst it becomes. The file system and text I/O are the two key abstractions of the UNIX programming model, you simply can’t get away from them without breaking the basic OS paradigm.
Edited 2011-05-30 19:46 UTC
2011-05-30 7:52 pm
xiaokj
This sounds a bit religious. Why should there be a single sensible way out ?
Yes it does sound so. Which is why I was thinking it would be rather heavy-handed to decide to do it at the systems-API level. But your own point about RPC is also as religious, no? 😉
Wait a minute, why are you talking about file systems already ? This is interprocess communication, were are in the early stages of a microkernel’s boot and the filesystem is still far from being available yet.
Read that again! This is clearly too early to do, but is part of the final design. The initrd created by mkinitrd is fixed and is loaded into the kernel at boot, while the rest of the filesystem is still dead. If you have no time to do this, you are certainly free to just leave it be.
Hmmm… You should check Kaj’s post, he totally mentions a type system in there (that’s why the first integer number in his command parameters is here for), that separates integer and floating point data.
This sounds Zen, but as you focus on the type system, I am focusing on the fact that this “type system” you are talking about lives as plain text arranged in readable and parse-able format. Of course, in the end, you will be implementing a type system, but the difference is huge! You will be able to reconfigure with impunity as it has been shown.
This is desirable for a number of scenarios, but I’d like to point out that I’m currently talking about a system API which does not have “being portable to other OSs without issue” as it goals. Besides, I don’t see why basing my OS on a declarative data paradigm would offer better interoperability with anything but OSs based on declarative data paradigms (like Syllabe). How would such a paradigm make interoperability with, say, an UNIX system where everything is about text files and data pipes, easier ?
a) Portable also means portable in time. Within your own OS, if you had taken the time to write it in this manner, you may find it much easier to keep things sane (not breaking every few moments you want to change).
b) If you want to port against Unix, for example, you can make the parser generate an API translator, I suppose. Sadly, almost every API call will incur one translation overhead (or actually maybe less), but at least the code will work out of the box.
Define “changes”. I don’t think that you can suddenly decide tomorrow that’s malloc’s integer parameter should be a number of megabytes instead of a number of bytes and have all programs based on the old implementation work without recompiling them. Some changes may be easier, sure, but you’d have to precise which and why.
I actually myself advocate some compromise — it is clear you won’t just change malloc so easily, so there is little reason to do it as declarative. However, even there, you can see obvious improvements. If you had done it as declarative, then there is no reason why you cannot immediately change malloc to accepting size_t only, and have the parser convert old calls, using integer, into size_t in a seamless way. In this way, you can see how the information provided can be utilised to minimise your headache. Also, because of versioning, you can do abrupt breaks without trouble as long as you have seamless conversion between transitions. Also, once versioning is done, it also means you can provide function overloading (in versions, not parameters, this time), and then you can look back into your codebase and select functions still using the old version, and slowly eliminate them. This process tends to create multi-versioning disasters, but at least the system as a whole could continue working instead of dying right out. It also means that you can employ temporary fixes as you go along, which is definitely powerful.
That plain old data defines an interface to a chunk of code. If it does not include code, and if the code is not modified, then I can’t see how anything but cosmetic changes can be made without breaking compatibility between the interface and the underlying code (which would sure be a bad idea).
It should not include code, and even if it does modify behaviour, it should not include code “proper”. Data Tables yes, precomputed values of clear reason yes, but code, no. The aim is not to make it so general that you code an OS within an OS (bad idea always). The aim is to make machine parsing and human readability help you.
I kinda think of this as if you actually create the tools to manage your codebase a little like how you would use WordPress with your blog. Clearly, it should not interfere with whatever genius you want to do, but it also should not be such that you find yourself hardcoding the API (html). Of course, if you find yourself spending a lot of time on the engine, that is bad too. It is a lot of balancing, of which I doubt there is anything other than raw experience that you can learn from.
I can only take comfort in the fact that I have successfully started another soul on the AoUP. That is a true jewel. Of course, there are others, like the opposite tale of Unix Hater’s Handbook, and so on. Computing is HARD! So much balancing to do! 🙂
2011-05-31 6:59 am
Neolander
Yes it does sound so. Which is why I was thinking it would be rather heavy-handed to decide to do it at the systems-API level. But your own point about RPC is also as religious, no? 😉
Yes and no.
I’ve noticed over the year that while I can, like everyone else, understand others’ points of view, I’m bad at trying to embrace it. I tend, on the other hand, to be more reasonable after having been beaten in an argumentative fight. So I get around these limitations of my mind by using a “hit the wall” approach : I follow the idea I’ve initially had, and challenge proponents of other ideas to prove it wrong, or more precisely to prove theirs better.
This works well while I can find people who know the subject well and are good at arguing, of course, but at my current level of OSdeving mastery and for common topics like the ones which I study, it’s not exactly hard.
This is not religious, because I do everything I can to stay on an intellectual field (my idea is good, because…). Anyone who catches me using insufficient argumentation, avoiding a vital point of the opponent’s one, or worse using any statement of the “it’s simply better” kind, is invited to whip me with nettles, as Alfman has done in a past when on a tiring day I had written that threads were “simply more scalable” than asynchronous approaches.
By the way, your point about this being maybe too heavy-handed for a system API is something which I could have said myself, and have actually said somewhere in this thread I think.
Since you have led me to read a book about UNIX, I think you can understand a will on my side to have system-wide abstractions of beautiful simplicity that have a large spectrum of applications. So an IPC model that would be suitable for all API layers would be quite attractive to me.
This sounds Zen, but as you focus on the type system, I am focusing on the fact that this “type system” you are talking about lives as plain text arranged in readable and parse-able format. Of course, in the end, you will be implementing a type system, but the difference is huge! You will be able to reconfigure with impunity as it has been shown.
Doesn’t code qualify as plain text arranged in a readable and parse-able format ?
What I’m trying to say is that if external text config files are not used outside of development periods, maybe it’s not worth it to bother keeping them.
a) Portable also means portable in time. Within your own OS, if you had taken the time to write it in this manner, you may find it much easier to keep things sane (not breaking every few moments you want to change).
I have already heard this, but can’t understand why. You seem to have answered below, so I’ll look there.
b) If you want to port against Unix, for example, you can make the parser generate an API translator, I suppose. Sadly, almost every API call will incur one translation overhead (or actually maybe less), but at least the code will work out of the box.
If you have to write extra code and suffer a translation overhead anyway, how is it different than writing wrapper libraries for other OSs and languages ?
I actually myself advocate some compromise — it is clear you won’t just change malloc so easily, so there is little reason to do it as declarative.
If declarative is not universal enough that I could write things like malloc with it and require extra abstractions, that’s one of its weak points
However, even there, you can see obvious improvements. If you had done it as declarative, then there is no reason why you cannot immediately change malloc to accepting size_t only, and have the parser convert old calls, using integer, into size_t in a seamless way. In this way, you can see how the information provided can be utilised to minimise your headache.
Okay, so you’d have the parser itself convert old calls. That looks like what I was investigating for my own model above, but I ran into issues when trying to push it further, so it’d be interesting to see if you also have them in your declarative data model or not.
Let’s picture ourselves a function that takes a class as a parameter. Between version N and version N+1, this class gains new members. One of those members is a unique integer identifying each instance of the class, generated using static variables in the constructor. If the old code was recompiled, there would be no issue at all. But here we’re trying to keep binary compatibility without recompiling, isn’t it ?
Question is : can you, with declarative data, transform an old instance of the class into a new instance of the class without putting inappropriate data in the “identifier” class member ? My conclusion was that it is impossible in my “sort of like RPC” model, but maybe declarative data can do the trick.
Also, because of versioning, you can do abrupt breaks without trouble as long as you have seamless conversion between transitions.
The feature of versioning can definitely be added to an RPC mechanism : at the time where prototypes are broadcasted to the kernel, the client and server processes only have to also broadcast a version number. From this point, it works like function overloading : the kernel investigates whether a compatible version of what the client is looking for is available.
Also, once versioning is done, it also means you can provide function overloading (in versions, not parameters, this time), (…) It also means that you can employ temporary fixes as you go along, which is definitely powerful.
Not sure I understand this concept, can you give more details ?
I kinda think of this as if you actually create the tools to manage your codebase a little like how you would use WordPress with your blog. Clearly, it should not interfere with whatever genius you want to do, but it also should not be such that you find yourself hardcoding the API (html). Of course, if you find yourself spending a lot of time on the engine, that is bad too. It is a lot of balancing, of which I doubt there is anything other than raw experience that you can learn from.
Yeah The good old framework problem : people who don’t spend enough time on the design phase of their product end up creating a gigantic-sized arithmetic operation framework supporting every single mathematic calculation known to man just to add two numbers. The only way to avoid this is to have precise knowledge of your needs, and avoid the temptation of excessive genericity.
I can only take comfort in the fact that I have successfully started another soul on the AoUP. That is a true jewel. Of course, there are others, like the opposite tale of Unix Hater’s Handbook, and so on. Computing is HARD! So much balancing to do! 🙂
Heh Not all computing is like this, but development of software planned to last very long, have a wide range of uses, and be deployed on a relatively large scale, like an OS, definitely is.
2011-05-31 7:19 am
xiaokj
I see most of that as quite pragmatic, so I don’t think I can argue much further than already had been. However:
Question is: can you, with declarative data, transform an old instance of the class into a new instance of the class without putting inappropriate data in the “identifier” class member? My conclusion was that it is impossible in my “sort of like RPC” model, but maybe declarative data can do the trick.
You may not be able to make the old code suddenly be new, but without recompiling, you can make the old code speak in the new slang. The parser can just export glue code (as thin as possible, hopefully).
Also, once versioning is done, it also means you can provide function overloading (in versions, not parameters, this time), (…) It also means that you can employ temporary fixes as you go along, which is definitely powerful.
Not sure I understand this concept, can you give more details ?
Nah, it’s simple stuff. For the moment, think of a design loop: Maybe to implement something important, you found that your own design phase had an infinite loop. To implement A, you had to first implement B, which requires A. Then, what you can do is to implement proto-A and proto-B and get it over with. The mechanism can take over from there, really.
Or, if you found yourself in a temporary crisis: Something important crashed in the middle of your computing. Your debug options are in peril. Then, you may find yourself implementing temporary fixes in your codebase that you intend to remove and reconstruct later. (Something you do to just keep temporary order at the fastest moment, so that you can still get some rest.) Something like the BKL (Big Kernel Lock) Linux had.
The feature of versioning can definitely be added to an RPC mechanism : at the time where prototypes are broadcasted to the kernel, the client and server processes only have to also broadcast a version number. From this point, it works like function overloading : the kernel investigates whether a compatible version of what the client is looking for is available.
If all you have is a version number, then it is really troublesome trying to keep the details intact. Having a complete spec sheet makes interoperability easier. With a version number, then you can guarantee that the functionality is provided in a later version. But you cannot guarantee the functionality works exactly as prescribed before. Also, it means that you absolutely have to provide said functionality in future versions of the library code — you cannot do a turnabout and remove said functionality. With a spec sheet, you can guarantee that the client code can still run, as long as it does not use the removed functionality.
2011-05-31 7:44 am
Neolander
You may not be able to make the old code suddenly be new, but without recompiling, you can make the old code speak in the new slang. The parser can just export glue code (as thin as possible, hopefully).
Okay, so it is at the same point as my approach. More generally, I have been for some time under the impression that in this topic, we are talking about very similar things with a very similar set of advantages and drawbacks even though we don’t know it yet.
Nah, it’s simple stuff. For the moment, think of a design loop: Maybe to implement something important, you found that your own design phase had an infinite loop. To implement A, you had to first implement B, which requires A. Then, what you can do is to implement proto-A and proto-B and get it over with. The mechanism can take over from there, really.
*laughs* Do you want to know the most ugly hack ever usable to do this in my “RPC” system ? Have the server process broadcast a prototype that is associated with the NULL function pointer. Any attempt to run this prototype during the design phase would crash the server, but if you make sure that the functionality gets implemented…
More seriously, common development practices like using placeholder implementations of the “myfunc() { return 0; }” kind can also be envisioned. As usual, the trick is to always remember to implement the functionality in the end.
If all you have is a version number, then it is really troublesome trying to keep the details intact. Having a complete spec sheet makes interoperability easier. With a version number, then you can guarantee that the functionality is provided in a later version. But you cannot guarantee the functionality works exactly as prescribed before. Also, it means that you absolutely have to provide said functionality in future versions of the library code — you cannot do a turnabout and remove said functionality. With a spec sheet, you can guarantee that the client code can still run, as long as it does not use the removed functionality.
I broadcast a version number ALONG WITH a prototype, doesn’t the whole thing qualify as a spec sheet ?
Besides, removing functionality can be done, here’s how :
BEFORE:
-Server process provides functionality A and B
-Client process shows up, asks the kernel for access to the functionality A of server process during initialization
-The kernel check that server process indeed broadcasts functionality A, and says client that everything is okay
-Client can later make calls to A
AFTER:
-Server process only provides functionality A now
-Client process shows up, asks the kernel for access to the functionality A of server process during initialization
-The kernel check that server process indeed broadcasts functionality A, and says client that everything is okay
-Client can later make calls to A
About interoperability between versions, I thought about using semantic version numbers of the Breaking.Compatible form.
Edited 2011-05-31 07:47 UTC
2011-05-31 6:30 pm
Alfman verbose=1
Neolander,
“About interoperability between versions, I thought about using semantic version numbers of the Breaking.Compatible form.”
Is your OS going to behave differently based on the exposed version numbers? If so, I think it’d be wise to use manage versioning internally since programmers are bound to screw it up doing it manually.
I’m queasy about the use of versioned models like this though. It could be irrational, but it reminds me of active-x hell.
In active-X, if two developers tried to update one component, then the component was permanently diverged (at the binary level). If an application was compiled against one of the divergent branches, it would not be compatible the other divergent branch.
Personally, I’m leaning towards a “if the prototypes match, then the link should succeed” approach.
2011-05-31 10:48 pm
Neolander
Well, a versioning system is good in that it helps to make room for the future. If tomorrow I decide that malloc takes an argument in megabytes, the prototype will be the same but the input data will be different, so compatibility is broken anyway.
For a less idiotic example, adding and removing private members to a class may change the binary interface but not the prototype, so it’s interesting to have versioning in that case too.
I’m not sure that fully automatic versioning can be done (having development tools make the difference between breaking and nonbreaking new versions is going to be tricky to code), but it is possible to only have developers of the server process to care about it, by having (hand-crafted) versioning information automatically added to the “stub” library interface which clients will later link to.
Edited 2011-05-31 22:49 UTC
2011-05-31 11:05 pm
xiaokj
More generally, I have been for some time under the impression that in this topic, we are talking about very similar things with a very similar set of advantages and drawbacks even though we don’t know it yet.
*laughs*
Same here. |0|
(in-joke, somebody read that as lol)
I broadcast a version number ALONG WITH a prototype, doesn’t the whole thing qualify as a spec sheet ?
Still too crude, since you can have minor behavioural changes. Mathematica is one of such examples — each update, it will tweak some commands a little bit, and even though the parameters are the exact same, the functionality may be drastically changed such that automated updating is impossible.
Version numbers do little, especially if you run a gap of a few version numbers, depending upon the scale of the problem (determined mainly by the coder, really).
I am really more interested in compatible breakage — for example, a previously provided functionality A is now replaced by B and C whereby most cases go for B, and some go to C under some conditions. If automation can still be of use, I do not see why the original code needs to be recompiled — the slight performance hit should be okay for most. Even after a few more rounds, I see no problem. It really should be the translator’s job, to me (the translator will kill me, hehe).
Come to think of it, this is really abuse of translation. Some things just cannot be handled that way. For example, the old C string format had been changed drastically because of massive security holes. Such that, we realised, that one of the tokens is completely dangerous and it is no longer even allowed (let alone support)! Most new implementations will just “politely segfault” if you tried to use it. (I’m talking about the one that outputs the number of bytes written into a memory address thing). I don’t know how the translator should handle this: Should it barf (as is currently done), or should it silently drop the message? Or something in between? This is a huge thing to judge, because of the myriad implications.
Sigh.
2011-06-01 9:15 am
Neolander
Still too crude, since you can have minor behavioural changes. Mathematica is one of such examples — each update, it will tweak some commands a little bit, and even though the parameters are the exact same, the functionality may be drastically changed such that automated updating is impossible.
In that case, compatibility with the old code is broken, so the “breaking” version number should be incremented.
In fact, I’m not sure that a secondary “compatible” version number would be needed. If the new prototype is compatible with the old one, then one should just leave it be as if it was the old version.
Version numbers do little, especially if you run a gap of a few version numbers, depending upon the scale of the problem (determined mainly by the coder, really).
Why ? They help old code not to break as long as their API version is still supported.
I am really more interested in compatible breakage — for example, a previously provided functionality A is now replaced by B and C whereby most cases go for B, and some go to C under some conditions. If automation can still be of use, I do not see why the original code needs to be recompiled — the slight performance hit should be okay for most. Even after a few more rounds, I see no problem. It really should be the translator’s job, to me (the translator will kill me, hehe).
Wow. Now this starts to make the translator very complicated.
Myself, I think that if the server process breaks compatibility in a way as extreme as changing the role of function parameters, it is responsible for it and it is its job, not the translator’s one, to manage the transition to the new API version. If a recompile would not be sufficient to make old code work with the new API version, then we are outside of the translator’s area of expertise.
As an example, one could imagine handling your scenario this way :
-Before, server process broadcasted A, version 1
-Now, server process broadcasts B and C, version x, and A version 1
-The implementation of A version 1 provided by the server is in fact just a placeholder one, that calls in turn B and C depending on the situation.
Come to think of it, this is really abuse of translation. Some things just cannot be handled that way. For example, the old C string format had been changed drastically because of massive security holes. Such that, we realised, that one of the tokens is completely dangerous and it is no longer even allowed (let alone support)! Most new implementations will just “politely segfault” if you tried to use it. (I’m talking about the one that outputs the number of bytes written into a memory address thing). I don’t know how the translator should handle this: Should it barf (as is currently done), or should it silently drop the message? Or something in between? This is a huge thing to judge, because of the myriad implications.
Sigh.
What you seem to imply is that it is not the implementation of the function that was faulty and compromised system security, but that it was broken by design and that the API function should never have existed in the first place.
We agree that in such a case, keeping a compatible placeholder implementation is impossible, so when receiving such a call there are two options : either ignore it and return a false result or send some form of “illegal instruction” signal that may crash the software if it didn’t handle it.
I think what should be done should be evaluated on a per-case basis.
In the case of a function that computes a string’s length, as an example, a correct result is needed by the caller and it’s impossible to simply ignore the call, so the server should report the function as not supported and let the client face the consequences if it tries to use it anyway.
On the other hand, if we consider, say, a function that changes brush size in a 2D drawing API, it’s possible to keep the previous brush size. Broken visual appearance will result, but at least the software will be able to run.
The question is : is what the function does critical to the operation of software which use it ? And I think this should be decided on a per-function basis. Barfing has the advantage that you don’t have to ask yourself the question and carefully consider every use case of the dropped API function : it’s not supported anymore. That’s it. Use an old release of the OS in a VM if you want to use it anyway.
Edited 2011-06-01 09:24 UTC
2011-06-01 10:33 am
Alfman verbose=1
Neolander,
“In the case of a function that computes a string’s length, as an example”
But strlen is considered a “safe” function call.
Is the example just an oversight?
The functions which are considered unsafe are those like gets(), where data is written to unspecified memory bounds.
gets, fgets // never safe.
sprintf // may be safe in controlled contexts
In any case, would it even be possible to craft such functions in an RPC scenario? We’re talking about separate address spaces after all, shouldn’t the RPC mapping protect the address spaces?
Edited 2011-06-01 10:42 UTC
2011-06-01 10:42 am
Neolander
Neolander,
“In the case of a function that computes a string’s length, as an example”
But strlen is considered a “safe” function call.
Is the example just an oversight?
Both of my examples are actually perfectly safe (seriously, how could changing a 2D brush’s size compromize security ? A specific implementation, yes, but the idea itself is good), I just needed examples of critical and non-critical functions and couldn’t spontaneously think of “unsafe” API calls.
The functions which are considered unsafe are those like gets(), where data is written to unspecified memory bounds.
gets, fgets // never safe.
sprintf // may be safe in controlled contexts
In any case, would it even be possible to craft such functions in an RPC scenario? We’re talking about separate address spaces after all.
Not with unspecified memory bounds, to the best of my knowledge. But if we have an upper limit to the size of the destination memory region, it’s possible to allocate a memory region of that size and share it for the duration of the call.
Like…
-Client allocates shareable buffer
-Client shares buffer with server, receives a pointer to the buffer in the server’s address space
-Client makes fgets-like remote call, using the pointer received above as the destination string
-Server writes in the shared buffer, then calls client back
-Client takes its string and stops sharing the buffer with server
(That’s only one of the possible ways to do it, there certainly are cleaner approaches but I just wanted a proof that this is possible)
Edited 2011-06-01 10:50 UTC
2011-06-01 11:23 am
xiaokj
As an example, one could imagine handling your scenario this way :
-Before, server process broadcasted A, version 1
-Now, server process broadcasts B and C, version x, and A version 1
-The implementation of A version 1 provided by the server is in fact just a placeholder one, that calls in turn B and C depending on the situation.
Hmm. Actually, my whole point is that this method of doing things make the server full of compatibility kludges. This is not a good idea in the long run. Compatibility causes the codebase to be slower and longer and more difficult to maintain. If you implemented it as part of the translator, you have stuff working, but incur a translation overhead. This is a good thing by design! This forms a part of the “inflation” that forces the entire ecosystem to contribute.
My idea is that, instead of piling up the precious codebase with fixes, each release should come with a small translator (it may be the same for each release, if you have time to code a general one, or many small wrappers, one for each release). This would help the code be maintainable.
Still too crude, since you can have minor behavioural changes. Mathematica is one of such examples — each update, it will tweak some commands a little bit, and even though the parameters are the exact same, the functionality may be drastically changed such that automated updating is impossible.
In that case, compatibility with the old code is broken, so the “breaking” version number should be incremented.
In fact, I’m not sure that a secondary “compatible” version number would be needed. If the new prototype is compatible with the old one, then one should just leave it be as if it was the old version.
Version numbers do little, especially if you run a gap of a few version numbers, depending upon the scale of the problem (determined mainly by the coder, really).
Why ? They help old code not to break as long as their API version is still supported.
Now that my intention is made clearer from the one above, you should catch that I think the granularity of version numbers is too coarse. I think that it is healthy for codebases to break compatibility, minor or major, once in a while. But, it also means that the minor breaking version number will increment like crazy. This, it may as well be. However, I doubt that is going to tell you anything more than “Oh, there is a breakage”. This is going to be just like the original Unix codebase — problem areas, potential or not, is just marked with a “?”, without any explanations whatsoever, and the reader (hence maintainer/developer) needs to know exactly what is the problem from there. A much more sensible approach would be to have comments telling you what is wrong, and an even better one, transition plan. If you provided information about the transition in the form of spec sheet, then I think the transition would turn out to be a lot smoother than “needs recompile now”, or “API is broken; get new version immediately”. Not to mention that sometimes, you just want the older version for the removed features.
As per usual, I agree with the rest; case-by-case analysis tend to give better results too. I am just wary of making code be obnoxious to the users. I refer to the RPM4/5/6 ugliness that is just mind blowing. (intentional internal inconsistency is upheld? What nonsensical obnoxious behaviour!)
2011-06-01 12:33 pm
Neolander
Hmm. Actually, my whole point is that this method of doing things make the server full of compatibility kludges. This is not a good idea in the long run. Compatibility causes the codebase to be slower and longer and more difficult to maintain. If you implemented it as part of the translator, you have stuff working, but incur a translation overhead. This is a good thing by design! This forms a part of the “inflation” that forces the entire ecosystem to contribute.
The way I see it, all you do is to move the compatibility kludges from the server to the translator, using e.g. a small interpreted language in the translator, so apart from making the translator more complex I don’t see the benefit. Correct me if I’m wrong, though.
My idea is that, instead of piling up the precious codebase with fixes, each release should come with a small translator (it may be the same for each release, if you have time to code a general one, or many small wrappers, one for each release). This would help the code be maintainable.
But as more and more incompatible release happen, the size of the translation layer grows, unless compatibility with old software is regularly dropped. This is why I think that even in your model, breaking compatibility implies a need to write “compatibility code” and as such remains a bad thing if done too frequently.
Now that my intention is made clearer from the one above, you should catch that I think the granularity of version numbers is too coarse. I think that it is healthy for codebases to break compatibility, minor or major, once in a while. But, it also means that the minor breaking version number will increment like crazy.
It depends. If compatibility is only broken “once in a while”, then the breaking version number will only increase once in a while, too.
Then there’s the compromise of what is versioned : is it the server’s remote call interface, as a whole, that is versioned, or is it each individual function that is versioned. That’s a subtle design choice : the latter choice leads to version numbers that increase less often (because each individual function is updated less often), and offers more fine-grained information on what compatibility has been broken, but puts more burden on the server’s developer (because now each individual function has to be versioned) and is in the end mostly beneficial to “large” server software which offers a very wide range of remote calls, not exactly the kind of thing which I want to favor.
This, it may as well be. However, I doubt that is going to tell you anything more than “Oh, there is a breakage”.
See above. If designed for this purpose, it can tell you which function exactly has seen its compatibility broken. From that point, it’s time to go and see the API reference, in order to see what has changed in this function and how to adapt the client software in order to make it compatible with the new version.
A much more sensible approach would be to have comments telling you what is wrong, and an even better one, transition plan.
So if I get it right, you’d like to broadcast not only a version number, but a full changelog, along with each function prototype, explaining in details why compatibility has been broken and how to switch to the new version, instead of leaving that to the API’s doc ? Well, at least, it could make server developers fear breaking compatibility much more
Just one question : would it be a full changelog in the strict sense of the term (“To migrate from version 2 to 3, do this; to migrate from version 3 to 4, do this; etc…”), or would you want something that directly explains how to migrate from the old version used by the client to the new version provided by the server ?
If you provided information about the transition in the form of spec sheet, then I think the transition would turn out to be a lot smoother than “needs recompile now”, or “API is broken; get new version immediately”. Not to mention that sometimes, you just want the older version for the removed features.
Transition information is certainly neat, but does it have its place in a remote call mechanism, or should it rather be left to the API documentation, avoiding work duplication, since this one will have to be updated anyway ?
Edited 2011-06-01 12:34 UTC
2011-06-01 2:37 pm
xiaokj
Yes, I mean to shift the compatibility kludges to the translator. The idea is that you can make a major version translator and many minor version translators. In this way, the server code can march forward as it likes; once a translator has been created, breakage is next to free of cost. At the end, when major releases are made, all the minor translators can go to nirvana whereas all the old code can just use the major version translators. There will be a V1 -> V2 translator, a V2 -> V3 translator, so on and so forth, and once that is done, there will be no need for any modifications to them. They must, of course, be able to be chained, so that the maintainer does not need to make V1->V3 translators and such nonsense.
In this method, the size of the translator is bounded by how often major releases are made. It actually encourages major releases, not discourage. And given that it is major releases, major breakages are encouraged too.
I intend this to look more like when you upgrade packages on apt — the system would try its best to seamlessly upgrade the machines’ configuration along with the packages. When it gets into trouble, the user can be asked, or some other appropriate measure can be taken on a case-by-case basis. If seamless translation can be done, so be it. If not, the world will not end.
Leaving things to the API doc means manual intervention is required by the original author. I hope to minimise this to a minimum. If you have a machine parse-able transition plan, then the machine can hobble on as it goes, only dying when it really needs to. Sometimes, you don’t even have access to the original codebase / author, so this is actually not so far-fetched.
Of course, the final granularity of the implementation is up to the coder to decide, so I hope I did give you some interesting ideas.
2011-06-01 3:14 pm
Neolander
That’s indeed quite an interesting track to follow. It would make compatibility breakages less of an issue, which could lead to development patterns that are quite different from the ones we know.
Initial development would be faster, because developers would be less afraid to make design mistakes. They’d know that no matter what they do, they can easily fix it through incompatible changes and translation layers later.
It would be less like “do it right the first time” and more like “get it out of the door now, fix it later”. It is kind of like the Linux stable ABI debate : should Linux devs take the time to design a good ABI once and for all, and only break it infrequently, or be left free to break it whenever they want and rewrite third-party drivers so that they still work, which allows faster development ?
I still tend to prefer the first approach myself, so I think I won’t go this far, but I admit that’s totally a very interesting OS design path to follow.
Edited 2011-06-01 15:15 UTC
2011-06-01 3:52 pm
xiaokj
Nice to hear that you find it interesting too.
Actually, despite a lack of battle myself, there are quite a lot of war history on this front. For example, from Raymond Chen’s experience being at the forefront of Windows development, compatibility is a huge thing. It is one of the deciding factors in the popularity game, which then influences the amount of help/resources you get.
For example, in the Win3.11->Win95 split, there is a altogether new API coming in. All Win95 programs need to work with it, or else. But the new API has yet to be written. The way out was a massive effort to translate all the old calls to the new one, so that Win95 came out supporting all the old stuff. (The fact that it had a real DOS in the hood makes it a lot easier, but still. Things like the SimCity incident is one for all to remember.)
On the other hand, it is also clear having too much compatibility kludges in the system proper is bad. Hence why I propose to put it into translator wrappers, so that there is sufficient glue to keep the system running without recompilation, without the excess for the “everything new and works together” normal case. The fact that it does automatically penalise un-recompiled code by the translation overhead is, to me, a good thing, despite being a definite pain for real world maintenance/performance.
Of course, it is nice that you also note the clear preference on evolutionary computing from here. Funnily, from whatever little of LISP experience I have, I actually like the “do it right the first time” approach better myself. It is kind of a mix, really — if you like “do it right”, then it is important to force yourself to take some evolutionary precautions too, whereas if you like the “implement it now”, then it is also imperative to give your designs a bit of thought too. And I digress into philosophy again.
PS: It is interesting to also see Alfman joining our conversation, although he has less to say to me. And how it actually developed from Kaj-de-Vos’ one small vague comment.
2011-06-01 7:00 pm
Neolander
Actually, despite a lack of battle myself, there are quite a lot of war history on this front. For example, from Raymond Chen’s experience being at the forefront of Windows development, compatibility is a huge thing. It is one of the deciding factors in the popularity game, which then influences the amount of help/resources you get.
I fully agree with that. Compatibility is important, because it provides developers the invaluable ability to write some software once and stop worrying about it once it has reached a stable and bug-free state, instead of having to follow API versions and constantly rewrite things.
On the other hand, it is also clear having too much compatibility kludges in the system proper is bad. Hence why I propose to put it into translator wrappers, so that there is sufficient glue to keep the system running without recompilation, without the excess for the “everything new and works together” normal case. The fact that it does automatically penalise un-recompiled code by the translation overhead is, to me, a good thing, despite being a definite pain for real world maintenance/performance.
I’d argue that nothing prevents system API manufacturers from putting all compatibility kludges in a separate code module, separating them from the rest. After all, you put the same code in that module that you’d put in a translator applet, so that code has an independent life on its own and doesn’t need to be mixed with the main server code and impair future developments.
On the other hand, translator applets have this advantage that they *enforce* such an isolation.
PS: It is interesting to also see Alfman joining our conversation, although he has less to say to me. And how it actually developed from Kaj-de-Vos’ one small vague comment.
Actually, we have already chatted on that topic, and I have taken his feedback into account, so it’s normal that he has less things to tell me this time
Alfman has a quality which I’m really fond of, as far as OSdeving discussions are concerned : he likes precision. When he notices a blanket statement, he is going to press its author until he either puts some true arguments on the table or quits. That makes him a precious ally to have when designing things.
Edited 2011-06-01 19:01 UTC
2011-06-01 11:44 pm
xiaokj
I’d argue that nothing prevents system API manufacturers from putting all compatibility kludges in a separate code module, separating them from the rest. After all, you put the same code in that module that you’d put in a translator applet, so that code has an independent life on its own and doesn’t need to be mixed with the main server code and impair future developments.
Theoretically possible, but in the real world, it is hard to even envision, let alone actually do it.
On the other hand, translator applets have this advantage that they *enforce* such an isolation.
Well said. I cannot even put it better myself.
…so it’s normal that he has less things to tell me this time
You caught my idea wrong. I was finding it funny he didn’t talk to me, not you. But of course.
Alfman has a quality which I’m really fond of, as far as OSdeving discussions are concerned : he likes precision. When he notices a blanket statement, he is going to press its author until he either puts some true arguments on the table or quits. That makes him a precious ally to have when designing things.
Certainly. If not for a lack of real exp myself, I would have joined in.
2011-06-02 12:23 am
Alfman verbose=1
xiaokj,
Well, it’s nice to get an honourable mention like this.
“You caught my idea wrong. I was finding it funny he didn’t talk to me, not you. But of course.”
Don’t read into it too much. I didn’t see anything objectionable and I don’t have any points you guys aren’t already covering.
2011-06-02 12:38 am
xiaokj
Alfman,
Nah, I had realised a bit late about your injections — the comment system did not notify me.
I suppose, given the recent increases in the comment count, there must have been more.
I think the reason why you did not see anything objectionable is that I have provided a very vague notion, although I do point out the relevant places I got those ideas from. You simply had nothing solid to poke at! I really should get my hand dirty in some form of computing practice…
2011-06-02 7:04 am
Neolander
Theoretically possible, but in the real world, it is hard to even envision, let alone actually do it.
Let me show you how I think it could be done in a C-like language.
The server process has decided to drop function A and introduce B and C as a replacement.
So the person implementing it creates the “compatibility.h” header and puts two functions in there : “compat_init()” and “compat_A()”.
compat_A() is a function which takes the same parameters as A and emulates the behaviour of A through use of B and C.
compat_init() does the remote call initialization stuff, broadcasts compat_A() as A. It is to be put with the rest of the server initialization stuff.
So in the end, you have added exactly one line of code to your server initialization code, “compat_init();”, and from that point everything happens in a separate code module.
I have already said I agree that having a translator system may enforce the isolation better, but the isolation is certainly not hard at all to do in code.
EDIT : Oh, by the way… I expect this discussion to be soon “obsoleted” by the OSnews comment system, which does not allow writing in threads that are more than 5 days old or so. When it happens, feel free to continue it on the blog post’s comments if you still have something to say. Myself, if nothing new has emerged on Saturday, I’ll examine what has been said here, and publish a new version of the design for further review when it’s ready.
Edited 2011-06-02 07:14 UTC
2011-05-30 10:17 pm
Alfman verbose=1
Kaj-de-Vos,
“I have been talking about the problem that RPC implies an inflexible semantic data exchange (the payload).”
I’ve found it frustrating that your posts are so vague. I’m really out of ideas as to what problems you have with “RPC”. Your claims may be valid against the least common denominator forms of function prototypes, but there are plenty of counter examples which you’ve been ignoring.
“Let’s make this concrete. How would you implement a service that draws a line?”
Well, your example evolves from just drawing a line to doing more stuff. But the implication that RPC cannot handle “more stuff” is not accurate.
You’re assuming a least common denominator approach again, but many modern languages support functions which are extensible. It’s not fair to put them all beside C and label all RPC as inadequate.
“You could draw up a plan including all sorts of functions, parameters, transfer methods, interface description languages and parsers for it, but that is all irrelevant. To draw a line, assuming the pen is set at a starting point, it suffices to specify this:”
You’re essentially coming up with the foundations of a vector graphics format. You could make it arbitrarily complex. You could support windows 3.0 metafiles or VML or SVG (all vector graphics formats).
Javascript can easily accommodate your example by using JSON arrays and hashes. Web services can be used to connect separate components via HTTP/JSON directly to native types on many platforms including Perl/PHP/.Net/Python.
I think you’re assuming that all RPC is limited to transferring only simple types as parameters, but this isn’t the case. Today many languages make it possible to call remote procedures with deep objects hierarchies.
I can understand why you’d dislike simple function prototypes as in C (which may be what neolander has in mind), but I don’t think your claims hold up against “RPC” in general.
2011-05-31 8:48 pm
bouhko
You might want to have a look at Google’s protocol buffers. This is basically a way to define messages that can be serialized/deserialized in multiple languages. It allows you to define services as well (and let you implement the RPC details for your system) :
http://code.google.com/apis/protocolbuffers/docs/reference/cpp-gene…
2011-05-31 10:56 pm
Neolander
Okay, so if I understand it correctly it’s about having a code generator that generates both sides of the RPC call based on a description language, right ? Sounds pretty neat indeed
The regular deprecation warnings at the beginning of the linked paragraph bug me, though.
2011-06-01 5:53 pm
bouhko
Yeah that’s the idea.
Actually, looks like it’s deprecated in favor of the plugin API. It should be able to achieve the same kind of stuff, maybe with a bit more work (but looks like it’s more flexible).
Anyway, what’s really awesome about protocol buffer is the way the are serialized. It’s really efficient and fast (you can look up at some benchmarks on internet).

2011-05-29 8:20 pm
Alfman verbose=1
“In RPC, you assume that the remote end has a procedure you can call.”
Well, that’s a given, but we’re talking semantics here. Whether your talking about dos interrupts, linux syscalls, vector calls, we’re still technically calling a “procedure”.
I guess you are referring to different mechanisms for parameter passing?
It’s true there are different incompatible types (for example __cdelc or __stdcall), and these may even have subtle differences from platform to platform (passing floating point values in ST0 instead of stack). But these are strictly binary differences, all models are compatible at a source level – I just need to recompile.
“That’s a big assumption. To make it work, you assume that the remote procedure is written in the same programming language.”
Why did you ignore my counter example? In any case, this is no different than windows or linux being written around C callers.
“That’s a huge implementation ‘detail’.”
Exactly, it’s an implementation detail which end users rarely if ever need to concern themselves with. People don’t need to know the calling conventions of their platforms to be able to write code.

2011-05-29 8:30 pm
Kaj-de-Vos
All the things you talk about are procedure calls. If you never consider the alternative of declarative messaging, you won’t see the difference.

2011-05-29 9:53 pm
Kasi
Hey Kaj,
I’m hitting a bit of a wall with google in looking for information on working with a declarative data model. Can you point me to a book or other source so I can read on my own?

2011-05-29 10:04 pm
Kaj-de-Vos
Hmm, this is such a general concept that I don’t know of any specific texts just about that topic. The previous poster said it is treated in ESR’s hacker’s bible, so that would be a good example. I learned it over the years in several of the systems I mentioned. Especially the REBOL language is excellent to form your mental model, because it implements this concept very purely, fundamentally and pervasively.
There are also many overlapping concepts, such as data driven programming, table driven programming, template oriented programming and modeling and markup languages, which are often different names for basically the same thing. Such concepts have sections on Wikipedia.