Library OS? — Parallax Forums

Library OS?

https://www.sigarch.org/leave-your-os-at-home-the-rise-of-library-operating-systems/

Suggesting very strongly an OS is not required on modern hardware.

Comments

  • Prior to the use of discs, some computers, such as early IBM machines, didn't have operating systems. Discs and the advent of mass storage, library volumes, and complex files with indices all contributed to the rise of operating systems, as the size and complexity of disk (and, to some extent, tape) support libraries increased. Often a lot of overlays were used.

    For the Propeller, not much is needed ... maybe some common communication areas for I/O drivers, maybe some frequently used I/O drivers (like keyboard/display or serial terminal support), along with a loader that can mostly reside in its own cog. Several such OSes have been written for the Propeller, consisting mostly of utility programs, a few I/O drivers, and an initialization program.
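
    As a rough illustration of the "common communication areas" idea, here is a hypothetical hub-RAM mailbox that an application cog and a driver cog could share. This is only a sketch in C; the struct layout, field names, and command code are invented for the example, not taken from any existing Propeller OS, and the "driver cog" is simulated here by an ordinary function call.

        #include <stdint.h>
        #include <stdio.h>

        /* Hypothetical "common communication area": a mailbox in shared (hub)
           memory. The application fills in a command, the driver polls for it,
           does the work, and clears cmd to signal completion. */
        typedef struct {
            volatile uint32_t cmd;     /* 0 = idle, nonzero = command code     */
            volatile uint32_t param;   /* e.g. a character to transmit         */
            volatile uint32_t status;  /* result / error code from the driver  */
        } mailbox_t;

        /* Application side: post a command (a real caller would then wait for
           cmd to be cleared by the driver). */
        static void post(mailbox_t *m, uint32_t cmd, uint32_t param)
        {
            m->param = param;
            m->cmd   = cmd;            /* written last                         */
        }

        /* Driver side: poll for a command, act on it, signal completion. */
        static void driver_poll(mailbox_t *m)
        {
            if (m->cmd == 1) {         /* command 1: "print one character"     */
                putchar((int)m->param);
                m->status = 0;         /* success                              */
                m->cmd = 0;            /* cleared last: tells the caller it's done */
            }
        }

        int main(void)
        {
            mailbox_t mbox = {0};
            post(&mbox, 1, 'A');
            driver_poll(&mbox);        /* stands in for the other cog          */
            putchar('\n');
            return (int)mbox.status;
        }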
  • Heater. Posts: 21,230
    edited 2017-09-15 17:24
    My eyes glazed over and I stopped reading after the first contradiction in the article. First we have:

    "The efficiency of Operating Systems (OSes) has always been in the spotlight of systems researchers ... But the reason for this obsession is not entirely obvious."

    Then we have:

    "It turns out that in data center workloads about 15-20% of CPU cycles are spent in the OS kernel. Indeed, most of these workloads are I/O intensive, thus they stress multiple OS components, from device drivers through networking and file I/O stack, to the OS scheduler managing thousands of threads."

    Which is to say that the reason for the obsession over OS efficiency is obvious, as spelled out in the second paragraph.

    Of course only 15-20% of data center CPU cycles are spent in the OS kernel. Most data center apps are written in languages like Java, JavaScript, PHP, etc., that are horrendously inefficient. It's a tribute to the OS builders that only 20% of CPU cycles are spent in their code, despite the stress they are under.

    I'll have to try reading past that point again...

    However, whilst we are here, I might ask: why do my programs need an operating system at all? They should be able to run on bare metal, as they do in many embedded, microcontroller systems.

    Even in the cloud (data center) my programs need only a few things: A CPU to run them, some memory to work in, some means of input and output.

    I don't need a time sharing system to share that CPU with others. No scheduler, context switches and all that junk.

    I don't need a network stack. Just give me some simple pipes for data in and data out. This can be done in simple hardware, like the "channels" of the Transputer (see the sketch at the end of this post).

    I don't need a file system. Just let me output data with some identifier and let me get it back later with that identifier (Using those pipes mentioned above.)

    Meanwhile, other processors in that "cloud" setup can take care of networking, file systems, etc. Guess what? They don't need an OS either. In this picture they also run on bare metal.

    What am I saying here?

    What we want is many simple processors with simple software hooked together with communication pipes. Rather than huge complex processors, with their huge complex operating systems and protected memory, virtualization and all that.

    See: "Communicating Sequential Processes", Tony Hoare, 1985.
    http://www.usingcsp.com/cspbook.pdf
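
    A minimal sketch of the "just give me pipes" idea, in C, using nothing more than POSIX pipe() and fork(). Ironically it has to run on top of an OS here, but it shows the shape of the interface: a worker that reads requests from one channel, writes results to another, and knows nothing about file systems or network stacks. The toy "application" (squaring numbers) is mine, purely for illustration.

        #include <stdio.h>
        #include <sys/wait.h>
        #include <unistd.h>

        int main(void)
        {
            int req[2], res[2];                 /* [0] = read end, [1] = write end  */
            if (pipe(req) < 0 || pipe(res) < 0) { perror("pipe"); return 1; }

            if (fork() == 0) {                  /* the "bare metal" worker          */
                close(req[1]); close(res[0]);   /* keep only its two channel ends   */
                int x;
                while (read(req[0], &x, sizeof x) == sizeof x) {
                    int y = x * x;              /* the entire "application"         */
                    write(res[1], &y, sizeof y);
                }
                _exit(0);                       /* request channel closed: all done */
            }

            close(req[0]); close(res[1]);       /* the other ends of the channels   */
            for (int i = 1; i <= 5; i++) {
                int y;
                write(req[1], &i, sizeof i);
                read(res[0], &y, sizeof y);
                printf("%d squared is %d\n", i, y);
            }
            close(req[1]);                      /* EOF tells the worker to stop     */
            wait(NULL);
            return 0;
        }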

  • Sure, you can have all sorts of other processors in the cloud to take care of the stuff going through the pipes, either to the local or global network or through file system implementations to storage devices or whatever. Those can be done without operating systems too. The problem you run into is when the CPU or local memory resources are expensive and therefore need to be shared because, unless they're running well behaved applications, they're going to be idle some of the time. If they're cheap, you can afford to have a few idle. If they're expensive, they have to be shared and that takes an operating system to manage. A resource manager can be run in some other processor if that one can control the "application processors". It's been done. At some point you call that an operating system.
  • Heater. Posts: 21,230
    edited 2017-09-16 02:07
    That's right. An operating system is not just about abstracting away hardware; it's also about resource allocation.

    It's all historical baggage. Got a 10 million dollar mainframe and a hundred potential users? Better devise a way to share that mainframe among the users.

    The microprocessor has been faithfully following, re-inventing, all those old mainframe ideas ever since. The complexity has been escalating, together with high energy requirements, bugs, and security issues.

    It need not be like that. If my application needs network access that is just a simple hardware channel. Storage? It's on the end of a simple hardware channel. Console port? Another channel. Just give me a processor, memory and channels to talk to.

    Given that we can now build small, cheap, low power processors why not dedicate one to every application in the cloud? Everything simple and regular.

    Things are heading that way. Google was using a ton of regular servers to implement its neural nets for language translation and such. Now they have the Tensor Processing Unit. https://cloud.google.com/blog/big-data/2017/05/an-in-depth-look-at-googles-first-tensor-processing-unit-tpu

    Of course if we build millions of small, simple, single application processors to provide cloud server instances the "operating system" moves out of those processors to somewhere else. Something has to manage that network of channel connected processors, allocate them as needed, connect them up correctly, detect failures, etc. A whole new ball game.
  • Mike Green Posts: 23,101
    edited 2017-09-16 02:21
    So where are we going with this? Ideas get reinvented with small variations as the economics of logic functions, CPUs, different sorts of memories and I/O devices change. Ideally engineers and programmers have some training in history; otherwise they are doomed to re-live it and make variations of the same mistakes all over again. I started in this business when patchboards were used to sequence calculations. Some of the mistakes we see today were made ... somewhat differently when computers had discrete transistors on PCBs, but much the same.
  • Heater. Posts: 21,230
    edited 2017-09-16 02:53
    I have no idea where we are going with this...

    Certainly it seems that old, forgotten ideas can become new again as technology and economics change.

    I'm not sure that engineers do get much training in the history of their art. A few examples:

    0) The original microprocessor instruction sets were designed by electronics engineers, not people steeped in computer history. That is why the Intel and Motorola (6800) instruction sets are so awful.

    1) Back in the day Intel decided it needed to jump from 8-bit to 16- or 32-bit processors. They hired a whole team of CS grads to design it. That was the i432. After a year or more it was still going to take "another year", had become too complex to build, and its performance sucked. The 432 was canned, and in 10 weeks an emergency project team produced the x86 design we know and "love" today. Those 432 guys had not learned from the famous collapse of Multics years before.

    2) Later Intel decided the x86 architecture was a dead end. They needed to jump to 64 bits, so that was a good time for an architecture change. After billions spent, the result was the Itanium, which as you know was a disaster, and Intel had to quickly turn around and adopt AMD's x64 architecture. The Itanium designers had never learned that VLIW machines had never been made to perform well. Shunting the optimization required for VLIW to the compiler is a problem that has never been solved.

    It's interesting to hear Dave Patterson (of RISC fame) talk about the last 30 or 40 years of computer architecture history.

    As for our massive network of single-application, super-simple, OS-free processors, that is something we have never done before. We did not have the technology for it, or the economic incentive. Who knows where it will lead. But I can't help thinking the current Intel model has not long to live.
  • The original 8008, then 8080 instruction set was a derivative of Datapoint's 2200 instruction set, invented by a couple of smart guys (programming and hardware design) there based on a design constrained by serial shift register memories and by logic available at the time. Intel was stuck with the design (and existing code base) and used much of the instruction set to come up with the 8086 and its successors.
  • Heater. Posts: 21,230
    edited 2017-09-16 04:57
    I would never say those designers were not smart. There is no sign that the designer of the 8080 instruction set, Victor Poor, was steeped in computer architecture history.

    The Datapoint 2200 was not conceived as a computer. It was a terminal. Given the requirements and the constraints I'm sure it was great. It is not clear to me that the instruction set of the 8080 was the same as Datapoint's original TTL implementation.

    From wikipedia:

    "Poor and fellow amateur radio colleague Harry Pyle produced the underlying architecture of the modern microprocessor on a living room floor. They then asked fellow radio amateur Jonathan Schmidt to write the accompanying communications software. Pitching the idea to both Texas Instruments and Intel, the partnership developed the Intel 8008, the forerunner of the microprocessor chips found in today's personal and computing devices"

    The 8008 instruction set seems to have been an independent development.

    It is amazing that the 8008 instruction set lived on to the 8085. And more amazing that the x86 instruction set is basically the same with some additions. Back in the day we ran 8085 assembler source through Intel's "conv86" translator and it would spit out x86 assembler syntax that would work out of the box. The translation was almost all a direct mapping of a single 8080 instruction to a single x86 instruction. They were so similar.

    And that is why we still have a valuable single-byte instruction, ASCII Adjust After Addition (AAA), in the latest x86 incarnations, even though it is never used.

    Gotta give credit to Intel for trying to wipe the slate clean and kill off x86 with something else. Sadly it was the Itanium.

  • Mike Green Posts: 23,101
    edited 2017-09-16 18:48
    Datapoint actually contracted with both Intel and TI to produce a single-chip version of the 2200's CPU. Intel couldn't do it, while TI's version required too much interface circuitry to be practical. TI and Intel agreed to let Datapoint out of the contract in exchange for letting them use the instruction set for further CPU development. This eventually became the 8008, then (with relatively small changes) the 8080, 8085, 80186, 80286, etc.

    Datapoint was originally Computer Terminal Corporation and most of their business was terminals. The 2200 was marketed as a business computer from the beginning.

    Vic, Jonathan, and Harry did develop the instruction set in Vic's living room. Harry had been going to Case Inst. of Technology in Cleveland, studying computer science.

    I got involved writing an ASCII decimal arithmetic package for one of the 2200 prototypes. They had developed a business programming language called Databus with a compiler and interpreter and needed the arithmetic package to do variable length, arbitrary decimal place signed arithmetic. Eventually I went to work for them.
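
    Not Datapoint's code, of course, but to give a flavour of what variable-length ASCII decimal arithmetic looks like, here is a toy C sketch that adds two unsigned ASCII digit strings of arbitrary length. Sign handling, decimal-point alignment, and error checking are all left out; the function name and example values are mine.

        #include <stdio.h>
        #include <string.h>

        /* Toy variable-length ASCII decimal addition: add two unsigned digit
           strings, right to left, carrying as we go. 'out' must have room for
           the longer input plus one extra digit and the terminating NUL.
           (A leading '0' appears in the result when there is no final carry.) */
        static void ascii_add(const char *a, const char *b, char *out)
        {
            int la = (int)strlen(a), lb = (int)strlen(b);
            int lo = (la > lb ? la : lb) + 1;
            int carry = 0;
            out[lo] = '\0';
            for (int i = 0; i < lo; i++) {
                int da = (i < la) ? a[la - 1 - i] - '0' : 0;
                int db = (i < lb) ? b[lb - 1 - i] - '0' : 0;
                int s  = da + db + carry;
                out[lo - 1 - i] = (char)('0' + s % 10);
                carry = s / 10;
            }
        }

        int main(void)
        {
            char sum[64];
            ascii_add("987654321987654321", "12345678901234567", sum);
            printf("%s\n", sum);   /* prints 1000000000888888888 */
            return 0;
        }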
  • Heater. Posts: 21,230
    Wow, the history from an original source. Thanks Mike.

    I guess that pesky DAA instruction came in very handy for what you were doing.

    Interestingly, by the time CP/M came around it was almost never used. When I was creating the Z80 emulator for the Prop I got CP/M up and running before I implemented DAA. Never did find a CP/M program that used it. It's been there wasting a valuable single-byte opcode slot in our computers ever since!
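
    For anyone curious what an emulator actually has to do for DAA, here is a rough C sketch of the 8080's behaviour after an addition (the Z80 adds a subtraction case via its N flag, and real silicon has a couple of extra corner cases). This is only an illustration I put together, not code lifted from ZiCog.

        #include <stdint.h>
        #include <stdio.h>

        /* Rough 8080-style DAA after an ADD: fix up the accumulator so both
           nibbles are valid BCD digits again. 'ac' is the auxiliary (half)
           carry from the preceding add, 'cy' points at the carry flag. */
        static uint8_t daa(uint8_t a, int *cy, int ac)
        {
            if ((a & 0x0F) > 9 || ac)
                a += 0x06;             /* correct the low BCD digit  */
            if ((a >> 4) > 9 || *cy) {
                a += 0x60;             /* correct the high BCD digit */
                *cy = 1;               /* decimal carry out          */
            }
            return a;
        }

        int main(void)
        {
            /* BCD 19 + BCD 28 gives binary 0x41; DAA turns it into BCD 47. */
            int cy = 0;
            uint8_t a = 0x19 + 0x28;   /* 0x41, half carry set (9 + 8 > 15) */
            a = daa(a, &cy, 1);
            printf("%02X carry=%d\n", a, cy);   /* prints 47 carry=0 */
            return 0;
        }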

  • Cluso99 Posts: 18,069
    Mike,
    Thanks for that info. Right at the forefront of mini-computing.

    I find it very interesting about requiring decimal arithmetic. The mini I worked on had decimal arithmetic built into the hardware. The memory addressing was also decimal, with each instruction using ten 6-bit ASCII characters.
  • Mike Green Posts: 23,101
    edited 2017-09-16 22:24
    The IBM 1620 was decimal based as was the IBM 1401, 1440, and others in that series.

    Datapoint was also actively involved in wireless networking and developed an operating system for distributed networking where all parts of an active program could be farmed out to other instances of the system as long as permissions were granted. You could have a program's console in one place, the program executing elsewhere, a printer and com channel in other places, and disk drives elsewhere.

    They had optical links for network paths (a mile or so) where cabling or microwave links were not available.
  • Cluso99 Posts: 18,069
    The Singer/ICL mini could network up to 10 multidropped devices (video terminals and printers) per shielded twisted-pair cable at 53 kbit/s (later 106 kbit/s), with a max distance of about a mile. The protocol was quite unusual and was done before UART chips were developed.
  • Mike Green wrote: »
    The IBM 1620 was decimal based as was the IBM 1401, 1440, and others in that series.
    The IBM 1620 was the second computer I learned to program. It supported variable length numbers and I remember having trouble later when I moved to a DEC PDP-12 where numbers were fixed at 12 bits. I had gotten used to being able to have any precision I wanted with the 1620.
  • Cluso99 Posts: 18,069
    edited 2017-09-17 06:03
    David Betz wrote: »
    Mike Green wrote: »
    The IBM 1620 was decimal based as was the IBM 1401, 1440, and others in that series.
    The IBM 1620 was the second computer I learned to program. It supported variable length numbers and I remember having trouble later when I moved to a DEC PDP-12 where numbers were fixed at 12 bits. I had gotten used to being able to have any precision I wanted with the 1620.
    Very interesting. Lots of similarities with the early mini and mainframes.

    The mini supported decimal arithmetic up to 10 digits (20 for multiply and divide, to avoid overflow), plus an edit instruction to format the output. It was a RISC machine with 15 instructions. Instructions were memory-to-memory, with indirect and indexing options.

    What could be done in a few KB of memory was amazing. But it was B&W (or green on black) video terminals, uppercase ASCII.

    In 1981 a major upgrade brought 8-bit ASCII (from 6-bit ASCII) and a few additional instructions including some logic ones.

    It was certainly an exciting part of my life.
  • msrobots Posts: 3,704
    edited 2017-09-17 09:52
    Well, I think the need for decimal arithmetic at that time was partly because of COBOL running on all of them. And COBOL loves hardware that supports decimals.

    And before some smart operators at Berkeley(?), bored by just sitting around running batch jobs, invented the idea of 'time-sharing', most computers just ran one program after another. No OS needed; for that we had operators.

    This is like we are cycling backwards: from a single batch job per mainframe, to time-sharing on the mainframe, to the personal computer, to the PC being just a web browser acting like a mainframe terminal, and now to hardware doing just one job, without the 'time-sharing' an OS provides.

    To me it looks like the cycle is closed.

    Mike
  • Mostly, but with a twist: hardware doing one job doesn't mean one system per job. It's more that there is hardware for multiple jobs now, and whether it's one system or not is more of a choice.

  • Heater. Posts: 21,230
    I don't think the cycle is closed at all.

    What we have been discussing here is the idea of a single process running on a single processor with no operating system, or at least a very minimal one. That might sound like the batch-job mainframes of old, but it's very different.

    Those old batch jobs were run, one at a time, from a queue. Perhaps the queue was a stack of punched cards. Each job started, did its thing, then ended. Then the next job was loaded from the queue.

    What we have been talking about is a single process running on a single processor, but it is expected to run forever, servicing requests for whatever it does as they come in.

    Significantly systems today comprise many of those processes running at the same time. Tens, hundreds, thousands of them. Each on their own processor. Some might be dedicated to database work, some might be web servers, some might be processing this or that. See Facebook, Google, etc, etc.

    When you submit a job to such a system, like actually hitting submit on a web page, your job is making work for many of those processors. Not just one.

    Of course the issues of resource management and such, as handled by those later multi-tasking, multi-user operating systems are still there. But now they are in load balancers and such that direct the work to the available processors.
  • The 8008, 8080, 8085, and Z80 are object code compatible. You can build a computer with an 8085 or Z80 that will run compiled object code written for the 8008 or 8080.

    The 8086/8088 and their successors are not object-code compatible with the 8080, although all the successors to the modern day are object-code compatible with the 8086. The goal with the 8086 was source-code compatibility, so you could take an assembly language program written for the 8080 and assemble it directly into new machine code that would run on the '86. The actual instruction set was quite different though, because of the implementation of the segment registers we loved *cough* so much back in the day (there's a small address-calculation sketch at the end of this post). The idea was to provide for quick migration of existing 8080 software while providing a smooth upgrade path (without mode bits) for future enhancements involving more memory and more powerful CPU instructions.

    Of course by the time of the 8086, the Z80 had gained a lot of market share and the x86 series weren't compatible with its extended instruction set. x86 was also very memory hoggy compared to the 8-bitters, with the 8-bit bus 8088 also being slower for a similar clock speed. The whole project was really ahead of its time but not competitive with any of the popular 8-bit competitors. But when IBM came along Tandy was using the Z80, and Apple and Commodore the 6502. They didn't want to be also-rans using the same chip as an existing popular computer, and that pretty much left them with Intel because it was so expensive that nobody was using it for much of anything.

    On the other hand, the reason x86 was so hoggy was that it was very forward-looking, and the reason you can run a program compiled for the original IBM PC on a modern machine without redeveloping it is that forward thinking. A few generations forward and that compatibility was far more important than any other performance metric.
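
    For anyone who never had the pleasure: the powerful/painful part of those segment registers was the real-mode address calculation, a 16-bit segment shifted left four bits plus a 16-bit offset. A tiny sketch of my own (not Intel's wording) to make the segment-register remark above concrete:

        #include <stdint.h>
        #include <stdio.h>

        /* 8086 real-mode addressing: segment * 16 + offset gives a 20-bit
           physical address, hence the 1 MB limit and the many overlapping
           segment:offset aliases for the same byte. */
        static uint32_t phys(uint16_t seg, uint16_t off)
        {
            return ((uint32_t)seg << 4) + off;
        }

        int main(void)
        {
            printf("%05X\n", (unsigned)phys(0xB800, 0x0000)); /* B8000: CGA text RAM  */
            printf("%05X\n", (unsigned)phys(0x1234, 0x0005)); /* 12345                */
            printf("%05X\n", (unsigned)phys(0x1000, 0x2345)); /* 12345 again: an alias */
            return 0;
        }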
  • Heater. Posts: 21,230
    edited 2017-09-19 00:40
    As I said above, we used to run our 8085 source code base through Intel's conv86 translator and it would produce a one-to-one, line-for-line equivalent in x86 assembler. A very simple translation, as the architectures were the same: same registers available, same operations.

    Not only that, the 8088 was, hardware-wise, almost a drop-in replacement for the 8085. I remember building a little carrier board for the 8088 that provided whatever tweaks were needed to the pin-out and a new clock arrangement, and plugged into our embedded systems boards. I was amazed when it worked! The upshot was that respinning all our boards to use the 8088 was trivial.

    The catch was that after that conv86 assembly language translation the resulting binary ran almost exactly half as fast as the 8080 version! It then took a lot of tweaking of the code to use 16 bit arithmetic and such before you got the speed back.

    In what way was the x86 forward looking?

    As I said above, the Intel 432 was forward looking. So forward looking that they could not build it. The x86, on the other hand, was an emergency quick hack created in 10 weeks. Apart from the 1 megabyte addressing capability lashed onto the side, I see nothing forward looking in it.

    The reason you can run old x86 code on modern machines is that they just kept piling more and more stuff onto it and leaving the old stuff in there. Each layer of cruft is buried under a "mode": long mode, protected mode, real mode, system management mode, unreal mode, virtual 8086 mode. Hmm... must be more than that; they had to add AMD mode to move to 64 bits.
  • Tor Posts: 2,010
    edited 2017-09-19 10:04
    localroger wrote: »
    The 8008, 8080, 8085, and Z80 are object code compatible. You can build a computer with an 8085 or Z80 that will run compiled object code written for the 8008 or 8080.
    No, that is not entirely correct. The Z80 object code format is a superset of the 8080 one, but the 8080 is not a superset of the 8008. There is some overlap, but they are not object code compatible.
    The goal with the 8086 was source code compatibility, so you could take an assembly language program written for the 8080 and compile it directly to new machine code that would run on the '86.
    The 8080 and 8086 are not source code compatible. But Intel provided a tool to help with source code translation [edit: The conv86 tool Heater mentions above], around the time when the 8086 was introduced. It worked reasonably well, although not perfectly, IIRC.

  • Heater. Posts: 21,230
    The thing with conv86 was that it did a very good job of converting 8080 assembler into 8086 assembler. Functionally identical.

    The problem was that for a lot of instructions that affect the flag register bits (ADD, SUB, etc.) the flags were not set exactly as an 8080 would set them. So conv86 would fix that by adding a whole bunch of LAHF (Load Status Flags into AH Register) and SAHF (Store AH into Flags) instructions, together with whatever it was that corrected the flag bits.

    The result of all that was the code came out two or three times bigger than it started and ran incredibly slowly.

    Turns out that the reason for all this extra flag twiddling was that the 8080 DAA (Decimal Adjust Accumulator) instruction worked a bit differently from the x86 AAA (ASCII Adjust After Addition) instruction. The flags (specifically the auxiliary carry bit) were tweaked all the time just so that the AAA instruction would work correctly (there's a sketch of what AAA does at the end of this post).

    Of course, nobody ever used that stupid DAA instruction, so Intel provided an option to conv86 that told it not to do all that flag correcting. (That should have been a clue to them that x86 did not need an AAA instruction.)

    Boom, you got a one-to-one instruction mapping. Code size and execution time were much reduced. I was amazed when I pushed our first embedded app through conv86 and ran it on my Frankenstein hack of an 8088 board plugged into an 8085 board, and it ran first time! It had taken all day to do all the conversion and rebuild the app on three Intel MDS development systems running in parallel.

    Oh dear. I can't remember what I had for lunch yesterday, and here I am reminiscing about work I did in the early 1980s. Nurse, more meds...


    Oh yes, the Z80 instruction set contains three times the number of instructions of the 8080 (or is it four?). The Z80 added all kinds of bit-twiddling instructions and such. It's much harder to write an emulator for the Z80, as I found out when my 8080 emulator for the Propeller grew into the Z80 emulator (ZiCog). Luckily, in the CP/M world nobody ever used those extra instructions (except for the block move instructions and, sometimes, the second register bank).
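
    To make that flag business concrete, here is roughly what x86 AAA does after adding two unpacked ASCII digits, and why it needs the auxiliary carry to be set exactly as an 8080 would set it. Again, this is a sketch of my own for illustration, not conv86 output, and it models only AL/AH and the two flags involved.

        #include <stdint.h>
        #include <stdio.h>

        /* Roughly x86 AAA: if the low nibble of AL went past 9 (or the
           auxiliary carry AF says it did), fold the excess into AH and keep
           AL as a single decimal digit. If AF is wrong, AAA gives the wrong
           answer, which is why conv86 peppered code with LAHF/SAHF fix-ups. */
        static void aaa(uint8_t *al, uint8_t *ah, int *af, int *cf)
        {
            if ((*al & 0x0F) > 9 || *af) {
                *al += 6;
                *ah += 1;
                *af = *cf = 1;
            } else {
                *af = *cf = 0;
            }
            *al &= 0x0F;
        }

        int main(void)
        {
            /* '9' + '8': AL = 0x39 + 0x38 = 0x71. The low nibble wrapped, so
               only the auxiliary carry (set by the ADD) reveals the overflow. */
            uint8_t al = 0x39 + 0x38, ah = 0;
            int af = 1, cf = 0;        /* the ADD of 9 + 8 sets the half carry */
            aaa(&al, &ah, &af, &cf);
            printf("AH=%d AL=%d\n", ah, al);   /* AH=1 AL=7, i.e. decimal 17   */
            return 0;
        }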