> I can't cite chapter and verse but I seem to remember this zeroing problem was solved decades ago by just introducing a bit which said this chunk of memory or disk is new (to this process) and not zeroed, but if there's any attempt to actually access it then read it back as if it were filled with zeros, or alternatively zero it.

Isn't that a result of the language? Low-level languages give that power to the author rather than assuming any responsibility. Hacker News had a fairly in-depth discussion regarding the nature of C, with some convincing opinions as to why it's not exactly the right tool to build this sort of system with. The gist: forcing the author of a monster like OpenSSL to manage memory is a problem.

On Tue, Apr 15, 2014 at 11:37 AM, Barry Shein <bzs@world.std.com> wrote:
I can't cite chapter and verse but I seem to remember this zeroing problem was solved decades ago by just introducing a bit which said this chunk of memory or disk is new (to this process) and not zeroed, but if there's any attempt to actually access it then read it back as if it were filled with zeros, or alternatively zero it.
Sort of the complement of copy-on-write: you do it by lazy evaluation.
For a newly allocated disk sector, for example, you don't have to actually zero it on the disk when allocated; you just return a block of zeros if a process tries to read it (i.e., the kernel mediates).
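The same lazy-zeroing shows up on the memory side. As a minimal userland sketch, assuming a Linux/BSD-style mmap with MAP_ANONYMOUS (sizes here are just illustrative): the mapping is nominally a gigabyte, but nothing is committed or zeroed up front, and the first read of an untouched page just sees zeros supplied by the kernel.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>

    int main(void) {
        size_t len = 1UL << 30;  /* ask for 1 GB of anonymous memory */

        /* Anonymous pages are demand-zero: no physical memory is
           committed here and nothing is zeroed up front. */
        unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            return EXIT_FAILURE;
        }

        /* The first touch faults the page in; the kernel supplies a
           zero page, so this prints 0 without anyone calling memset. */
        printf("byte in untouched page: %d\n", p[123456]);

        munmap(p, len);
        return EXIT_SUCCESS;
    }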
Typically you allocate disk space (other than by writing blocks) by seeking forward, maybe writing a block, and then seeking backwards to access the virgin space in between.
But anything in that virgin space can be marked in kernel memory as having to read back as zeros, no need to read anything at all from the actual disk.
In fact, there's no need to actually allocate the block on disk, which is why we have the notion of files with "holes"; see for example the 'tar' man or info page for a discussion of dealing with file holes when archiving. That is, whether to try to remember them as holes per se or to actually write all the zeros.
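A sketch of that seek-forward trick, assuming a Unix-like system and a filesystem that supports sparse files (the file name sparse.dat and the sizes are just for illustration): seek well past end-of-file, write one block, and the skipped-over region becomes a hole that reads back as zeros even though no disk blocks were ever allocated for it.

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("sparse.dat", O_RDWR | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        /* Seek 1 GB past the start without writing anything... */
        off_t gap = 1L << 30;
        lseek(fd, gap, SEEK_SET);

        /* ...then write a single block; everything before it is a hole. */
        char block[512];
        memset(block, 'x', sizeof block);
        if (write(fd, block, sizeof block) != (ssize_t)sizeof block)
            perror("write");

        /* Reading inside the hole returns zeros supplied by the kernel;
           nothing is read from the actual disk. */
        char buf[16];
        lseek(fd, 4096, SEEK_SET);
        if (read(fd, buf, sizeof buf) != (ssize_t)sizeof buf)
            perror("read");
        printf("byte in hole: %d\n", buf[0]);     /* prints 0 */

        /* st_size counts the hole; st_blocks only counts allocated blocks. */
        struct stat st;
        fstat(fd, &st);
        printf("apparent size: %lld bytes, allocated: %lld bytes\n",
               (long long)st.st_size, (long long)st.st_blocks * 512LL);

        close(fd);
        return 0;
    }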
It's important because tar can (it's a command-line option) create an actual TB tar file which has expanded that hole into blocks of zeros, an archive which might really be over a TB, or it can write only a few blocks: header info plus a note that the rest is just to be treated as a TB hole. Or of course the tar archive could itself only appear to be a TB file but really be a big hole.
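(The command-line option in question is presumably GNU tar's -S/--sparse.) As a rough illustration of how an archiver can notice the holes in the first place, rather than reading and writing a TB of zeros, here's a sketch using the SEEK_HOLE/SEEK_DATA lseek extensions available on Linux and Solaris; older or more portable code just compares st_size against st_blocks.

    #define _GNU_SOURCE     /* for SEEK_DATA / SEEK_HOLE on glibc */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Walk a file and report its data/hole layout: roughly what a
       sparse-aware archiver needs so it can record "N bytes of hole
       here" instead of writing N bytes of zeros. */
    int main(int argc, char **argv) {
        if (argc < 2) {
            fprintf(stderr, "usage: %s file\n", argv[0]);
            return 1;
        }
        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        off_t end = lseek(fd, 0, SEEK_END);
        off_t pos = 0;
        while (pos < end) {
            /* Next region that actually has blocks behind it... */
            off_t data = lseek(fd, pos, SEEK_DATA);
            if (data < 0) {           /* nothing but hole up to EOF */
                printf("hole: %lld bytes at %lld\n",
                       (long long)(end - pos), (long long)pos);
                break;
            }
            if (data > pos)           /* ...everything before it is hole */
                printf("hole: %lld bytes at %lld\n",
                       (long long)(data - pos), (long long)pos);

            /* ...and where that data region ends (EOF counts as a hole). */
            off_t hole = lseek(fd, data, SEEK_HOLE);
            printf("data: %lld bytes at %lld\n",
                   (long long)(hole - data), (long long)data);
            pos = hole;
        }
        close(fd);
        return 0;
    }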
This is not at all limited to tar, it's just some place it came up very explicitly for the reasons I described.
Maybe that (lazy evaluation of returning zeros) never got widely implemented, but I thought it showed up around BSD 4.3, ca. 1984.
I think some of the models being presented here are somewhat oversimplified.
--
        -Barry Shein

The World              | bzs@TheWorld.com          | http://www.TheWorld.com
Purveyors to the Trade | Voice: 800-THE-WRLD       | Dial-Up: US, PR, Canada
Software Tool & Die    | Public Access Internet    | SINCE 1989     *oo*