Activation Record Blog
Thursday, January 26, 2012
Tsuna's blog: How long does it take to make a context switch?
Tsuna's blog: How long does it take to make a context switch?: That's a interesting question I'm willing to waste some of my time on. Someone at StumbleUpon emitted the hypothesis that with all the impr...
Thursday, August 4, 2011
Linux msync braindamage, Part 1
About msync() system call
Contemplating a possible implementation of a log-based transactional system and looking at the UNIX API, it seems natural to employ the msync() function for memory-mapped log files.
NAME
msync - synchronize memory with physical storage
SYNOPSIS
int msync(void *addr, size_t len, int flags);
Indeed, the msync specification states the following:
"The msync() function should be used by programs that require a memory object to be in a known state; for example, in building transaction facilities."
The idea is that the log file is mmapped into the address space of the process, the log data is written to memory, and it is synchronized with disk by the power of the OS virtual memory mechanism. This way there is no need to allocate an in-memory buffer for log data and call write() when the buffer is full. Instead, when a transaction is to be committed, exactly the portion of the mmapped log that contains the transaction data is msynced, and that's it. Concurrently, the data for other transactions can be written further down the log and stay cached in memory, avoiding unnecessary I/O. The existence of two modes, MS_ASYNC and MS_SYNC, gives msync() additional appeal:
"When MS_ASYNC is specified, msync() shall return immediately once all the write operations are initiated or queued for servicing; when MS_SYNC is specified, msync() shall not return until all write operations are completed as defined for synchronized I/O data integrity completion."
I can't help but think that msync() was introduced to UNIX specifically to cater to DBMS people. This cannot be a coincidence. This is just what one would want when developing a DBMS engine.
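To make the scheme concrete, here is a minimal sketch of a memory-mapped log with a commit step built on msync(..., MS_SYNC). The struct mlog type and the mlog_* function names are made up for illustration; they are not from any real library. The sketch assumes the whole log file fits into one mapping and rounds the flushed range down to a page boundary, because implementations may require (and Linux does require) a page-aligned address for msync().

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical log handle: a file mapped in full into the address space. */
struct mlog {
    int    fd;
    char  *base;   /* start of the mapping */
    size_t size;   /* mapped length in bytes */
    size_t tail;   /* offset of the first unused byte */
};

/* Create (or truncate) a log file of `size` bytes and map it shared. */
int mlog_open(struct mlog *log, const char *path, size_t size)
{
    log->fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (log->fd < 0)
        return -1;
    if (ftruncate(log->fd, (off_t)size) != 0) {
        close(log->fd);
        return -1;
    }
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, log->fd, 0);
    if (p == MAP_FAILED) {
        close(log->fd);
        return -1;
    }
    log->base = p;
    log->size = size;
    log->tail = 0;
    return 0;
}

/* Append a record with a plain memcpy: no write() call, no private buffer.
 * Returns the offset of the record, or -1 if the log is full. */
long mlog_append(struct mlog *log, const void *rec, size_t len)
{
    if (len > log->size - log->tail)
        return -1;
    size_t off = log->tail;
    memcpy(log->base + off, rec, len);
    log->tail += len;
    return (long)off;
}

/* Commit: synchronously flush only the pages covering [off, off + len). */
int mlog_commit(struct mlog *log, size_t off, size_t len)
{
    assert(len <= log->size && off <= log->size - len);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t start = off & ~(page - 1);   /* round down to a page boundary */
    return msync(log->base + start, (off - start) + len, MS_SYNC);
}

void mlog_close(struct mlog *log)
{
    munmap(log->base, log->size);
    close(log->fd);
}
```

A transaction would call mlog_append() for each of its records and then mlog_commit() once over the range they occupy, while other transactions keep appending further down the log.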
Okay, so far I have referred to POSIX and UNIX. However, the implementation of the POSIX API that currently deserves the most attention is probably Linux. Judging by the Linux msync() man page, everything looks good: it pretty much conforms to the POSIX specification. Or so it says.
Once you start wondering what the situation is on the ground, the picture becomes more complicated. One interesting tidbit can be found in the FreeBSD man page:
"The msync() system call is obsolete since BSD implements a coherent file system buffer cache. However, it may be used to associate dirty VM pages with file system buffers and thus cause them to be flushed to physical media sooner rather than later."
This is confusing. The purpose of msync is to ensure data integrity. I understand that if a process crashes, its modified mmapped data still remains in the system cache and at some point will be synchronized with physical storage. So far so good. But what if the whole system crashes? Without msync() this will result in data loss. Or are they saying that their msync() merely causes the page flush to happen somewhat earlier but not right away? So on FreeBSD there is no big difference whether you msync() or not, as it provides no integrity guarantee anyway? Well, I don't have answers to these questions, as for now I don't want to spend much time on FreeBSD research; I'm more focused on Linux.
About msync() on Linux
So what exactly is the story with msync() on Linux? In a fairly recent Linux release the following comment can be found in the file linux/mm/msync.c:
/*
 * MS_SYNC syncs the entire file - including mappings.
 *
 * MS_ASYNC does not start I/O (it used to, up to 2.5.67).
 * Nor does it marks the relevant pages dirty (it used to up to 2.6.17).
 * Now it doesn't do anything, since dirty pages are properly tracked.
 *
 * The application may now run fsync() to
 * write out the dirty pages and wait on the writeout and check the result.
 * Or the application may run fadvise(FADV_DONTNEED) against the fd to start
 * async writeout immediately.
 * So by _not_ starting I/O in MS_ASYNC we provide complete flexibility to
 * applications.
 */
So let me summarize the current status of msync() on Linux:
- msync(..., MS_ASYNC) is effectively a no-op
- msync(..., MS_SYNC) is effectively equivalent to fsync()
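Going by the kernel comment quoted above, an application on Linux that wants the two behaviours separately could call fsync() and posix_fadvise() on the file descriptor directly. The following is only a sketch of that reading: the function names log_flush_wait and log_flush_hint are hypothetical, and whether POSIX_FADV_DONTNEED really starts writeout is taken from the comment rather than verified here. Note also that this advice asks the kernel to drop the cached pages of the range.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Blocking flush: write out the file's dirty pages, including those dirtied
 * through a shared mapping, and wait for completion.
 * Returns 0 on success, -1 with errno set on failure. */
int log_flush_wait(int fd)
{
    return fsync(fd);
}

/* Non-blocking hint over [off, off + len); len == 0 means "to end of file".
 * Returns 0 on success or an error number (posix_fadvise does not set errno). */
int log_flush_hint(int fd, off_t off, off_t len)
{
    assert(off >= 0 && len >= 0);
    return posix_fadvise(fd, off, len, POSIX_FADV_DONTNEED);
}
```

The two helpers differ in their error conventions on purpose: they simply pass through what fsync() and posix_fadvise() return.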
Okay, should we stop here? Or is there more to learn yet? Sure there is.
[To be continued]
Monday, August 1, 2011
Innodb is full of crap
InnoDB code is full of crap. All across the source base it pretends that it can do multiple log groups. But it always initializes only one. And sometimes, at random places, it acknowledges this fact. There is a case of this schizophrenia within a single function. In the file log0log.c, in the function log_write_up_to(), we see code with this comment:
group = UT_LIST_GET_FIRST(log_sys->log_groups);
group->n_pending_writes++; /*!< We assume here that we have only
one log group! */
Then a few lines below we see iteration over the list of groups.
group = UT_LIST_GET_FIRST(log_sys->log_groups);
/* Do the write to the log files */
while (group) {
<...>
group = UT_LIST_GET_NEXT(log_groups, group);
}
Why iterate over a list that always has only one member? Why have this list at all; why is it not a direct pointer to the log group? Who needs multiple log groups that never exist?
Even more interesting, a large part of the log-related code stays there but is disabled with #ifdefs. This is the code for something called log archives. The related configuration options are documented like this:
- innodb_log_arch_dir
This variable is unused, and is deprecated as of MySQL 5.0.24. It is removed in MySQL 5.1.
- innodb_log_archive
Whether to log InnoDB archive files. This variable is present for historical reasons, but is unused. Recovery from a backup is done by MySQL using its own log files, so there is no need to archive InnoDB log files. The default for this variable is 0.
Almost half (albeit disabled) of the 3500+ lines of the log0log.c file from MySQL 5.1 still deal with these log archives. And this code is still there in MySQL 5.5. Is there anybody who needs it?
The same goes for support of some ancient checksum algorithms. I guess that if InnoDB files with these checksums can still be found, it's only on certain spacious 40MB hard drives collecting dust in some abandoned garage in Finland.
And let's not forget that wonderful uber-portable 64-bit arithmetic with the help of macros. Yes, it makes the code so readable. I would just like to learn which current target platforms lack a compiler with "long long" or "int64_t" arithmetic. I believe that in the year 2011 we can already think of 32-bit arithmetic as something special and 64-bit as the default.
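To illustrate the readability point, here is a small sketch that contrasts a hand-rolled pair of 32-bit words with a native 64-bit type. The struct u64_pair type and the function names are invented for this example and are not the InnoDB definitions; functions are used instead of macros to keep the sketch short.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-word emulation of a 64-bit value. */
struct u64_pair {
    uint32_t high;
    uint32_t low;
};

/* Add a 32-bit value, propagating the carry by hand. */
struct u64_pair pair_add(struct u64_pair a, uint32_t b)
{
    struct u64_pair r = a;
    r.low = a.low + b;      /* wraps modulo 2^32 */
    if (r.low < a.low)      /* a wrap means a carry out of the low word */
        r.high++;
    return r;
}

/* Convert the pair to a native 64-bit integer. */
uint64_t pair_to_u64(struct u64_pair p)
{
    return ((uint64_t)p.high << 32) | p.low;
}

/* The same addition with a native type: one operator, no carry logic. */
uint64_t native_add(uint64_t a, uint32_t b)
{
    return a + b;
}

/* Sanity check: both forms agree across a carry out of the low word. */
void check_carry(void)
{
    struct u64_pair p = { 1u, 0xFFFFFFFFu };
    assert(pair_to_u64(pair_add(p, 1)) == native_add(pair_to_u64(p), 1));
}
```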
All in all, with all the effort to scale MySQL up to high-end servers, why not start cleaning up the mess already?
Friday, July 29, 2011
Tinkering with web design
I made a few changes to my web pages.
- Finally updated the info on the libjit page about getting libjit sources from the git repository
- Removed the wordpress blog that I never really used and that was only a target for spammers, and linked this blog to my static pages
- Modified the style sheet for my static pages to use an analogous color scheme
- Tinkered with the blog template to make it similar to my static pages
My initial style sheet used colors that I chose semi-randomly. I just put in some hex value with digits that looked good to me, then looked at the page in the browser and tried again until I more or less liked it. Time passed and I realized that my color choice was awful. I bothered to read about color schemes and went on to create my new scheme with the free online tool Adobe Kuler. Immediately I liked the look of my site much better. Perhaps even with the new colors I will still be a laughing stock for people more sensitive to design. I never was one, sorry. But right now I'm happy with my colors.
On the other hand, I'm not completely happy with the blogspot page template that I got. There are still some glitches. However, I fixed the thing that irritated me most. I like to maximize my browser window to see as much of a page's content at once as possible. But the content area in the default Blogger.com theme is too narrow; it occupies but a narrow stripe in the middle of the window. I converted the template to the elastic design that I use for my static pages, so the text now utilizes much more window space. For me this is a big win.
Thursday, July 28, 2011
Back to libjit hacking
Right now I have more free time than I had during the last 2 years, so perhaps I will be able to contribute something new to libjit.
Currently I'm trying to improve libjit memory management. There is a proposed patch for a pluggable memory allocator from Patrick van Beem (http://savannah.gnu.org/patch/?7237). I fully recognize the need for some applications to perform custom memory allocation. However, I would like to have a more elaborate solution to this problem than the one found in the existing patch. First of all, libjit's own memory manager (jit/jit-cache.[hc]) is not so good. For instance, the way it allocates function redirectors may result in memory leaks. The patch supposedly resolves this problem, but only if the pluggable memory manager supports an extra feature that is not available in libjit's internal manager. This is clearly not how it should be done. The leak should be fixed in a way that does not depend on which memory manager is used.
So I am trying to find an appropriate solution that would fix this problem for all libjit users, whereas Patrick's patch keeps the libjit logic mostly intact and "solves" the problem by letting a third-party allocator do something that normal libjit users will not be able to do.
Another thing to consider is that libjit allocates code space in relatively small chunks. If at compile time libjit figures out that the code for a function doesn't fit into the allocated chunk, then a bigger chunk is allocated and the compilation is restarted. However, on systems with virtual memory (pretty much any modern system where libjit is ever likely to be used) a program can reserve very large amounts of memory in the first place. The system allocates a physical memory page for a virtual memory page only when it is really accessed, not when it is reserved. Hence there should be no need for code space reallocation and recompilation. Normally, the way to go is to reserve a memory block with a size of, say, 0.5 GB, and all the code generated during the lifetime of a libjit application should go there. Initially we can commit only a few pages of this amount and commit more on demand. If this block ever becomes full, then we report an error and just quit. Of course, the size of the allocated block should be configurable.
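The reserve-then-commit idea can be sketched on a POSIX system roughly as follows. This is not libjit code: struct code_arena and the arena_* functions are hypothetical names, the sketch assumes mmap() with MAP_ANONYMOUS and MAP_NORESERVE is available (as on Linux), and it commits pages as read-write only, whereas a real JIT would also have to make them executable.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical code arena: one large reservation, committed on demand. */
struct code_arena {
    char  *base;       /* start of the reserved address range */
    size_t reserved;   /* total reserved bytes (page multiple) */
    size_t committed;  /* bytes made accessible so far (page multiple) */
    size_t used;       /* bytes handed out to callers */
};

static size_t round_up_to_page(size_t n)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return (n + page - 1) & ~(page - 1);
}

/* Reserve address space only: PROT_NONE pages get no physical memory. */
int arena_reserve(struct code_arena *a, size_t bytes)
{
    size_t len = round_up_to_page(bytes);
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    a->base = p;
    a->reserved = len;
    a->committed = 0;
    a->used = 0;
    return 0;
}

/* Hand out n bytes, committing more pages only when needed.
 * Returns NULL when the reservation is exhausted or the commit fails. */
void *arena_alloc(struct code_arena *a, size_t n)
{
    assert(a->used <= a->committed && a->committed <= a->reserved);
    if (n > a->reserved - a->used)
        return NULL;                    /* the block is full: report an error */
    size_t need = a->used + n;
    if (need > a->committed) {
        size_t target = round_up_to_page(need);
        if (mprotect(a->base + a->committed, target - a->committed,
                     PROT_READ | PROT_WRITE) != 0)
            return NULL;
        a->committed = target;
    }
    void *p = a->base + a->used;
    a->used = need;
    return p;
}

void arena_release(struct code_arena *a)
{
    munmap(a->base, a->reserved);
}
```

With a reservation of (size_t)512 << 20 bytes this corresponds to the 0.5 GB block mentioned above; since the memory never moves, generated code never has to be recompiled because a chunk turned out to be too small.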
It might be that some applications will not be happy with such an allocation scheme. For instance, an application might target an embedded system where the old allocation scheme works better. Or the application has tight control over the lifetime of JITed functions and can tell when a particular function is no longer needed, so that the space occupied by the function's code can be reclaimed and used for something else. And this is exactly what the pluggable memory manager interface is for.
So for me the first goal is to provide a better internal memory manager for libjit, and the second goal is to provide an interface for plugging in custom managers. Along the way I should figure out the most flexible interface, one that will allow an application to do whatever it wants.
Wednesday, November 18, 2009
authenticode with osslsigncode
Let's suppose you would like to sign a Windows file but for some obscure reason you cannot use the MS signtool or signcode utilities. For instance, you would like to do this on Linux. This requires the following software:
- OpenSSL
- osslsigncode
- pvktool
References:
- OpenSSL PKCS#12 FAQ
- Converting a PFX file to SPC and PVK files
- Work around to moving Microsoft® Authenticode® (Multi-Purpose) Certificate to different machines running different Windows platforms
- CAcert Wiki: Authenticode
Thursday, March 19, 2009
A JavaScript Story
Just came across a story about JavaScript. The story tells how JavaScript was initially ignored by Real Programmers™ but after many years they came to appreciate it.
Learning To Love Java Script
It pretty much matches my experience. Except that I have never become a JavaScript programmer.
Wednesday, March 18, 2009
Web Page Layout
As far as I can see, fixed page layout currently dominates web design. But for some reason I don't like it. Perhaps because I look at it from a programmer's standpoint.
For a designer the task is to put stuff on the page so that it fits best. And fixed layout makes this easy. It gives the designer precise control.
For me fixed layout effectively disables layout as such. Web standards were designed with maximum flexibility in mind. Modern browsers come with sophisticated layout engines that support this flexibility. Any window size, any available font, any user preferences: the browser is up to doing all the required work.
But they say: no, thank you, a 960px window is all that we need. We assume that this is going to work fine with all monitor resolutions that are on the market these days.
But such assumptions go against my software developer's instincts. For me, putting arbitrary limits on the page size is akin to using a fixed-size buffer for input. A sufficiently large buffer would work in 99.999% of cases. But the rules of robust software development require taking care of the remaining 0.001% of cases as well.
Therefore I like the idea of fluid and elastic layouts. I created my home page more than a year ago and I used a simple-minded elastic layout for it. I had left page design technology behind in the late 90s, when the dominant layout method was based on tables. So I had to play the catch-up game. I learned about these new layout methods from various blog entries:
http://green-beast.com/blog/?p=199
Now, as I was pondering how to improve my home page design, I came across a number of articles that describe the grid design approach. Somehow I missed it the first time around. For instance, I like this article:
Fluid Grids
I am going to play with the "fluid grid" a bit, and if it goes well then perhaps I will adopt it for activationrecord.net.
Saturday, December 6, 2008
OOP
The days of OOP hype are gone. It is simply taken for granted. Or is it?
- Richard Gabriel claims: Objects Have Failed
- Guy L. Steele Jr. responds: Objects have not failed
- Meanwhile Alan Kay recollects
- While others complain about the Platypus effect
Friday, June 27, 2008