This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.
Date: Mon, 19 Mar 2001 17:18:28 -0800 (PST)
From: Perrin Harkins <perrin@primenet.com>
To: modperl@apache.org
Subject: dbm locking info in the guide

While working on adding info on Berkeley DB to the Guide, I came across
this statement:

  "If you need to access a dbm file in your mod_perl code in the read
  only mode the operation would be much faster if you keep the dbm file
  open (tied) all the time and therefore ready to be used. This will
  work with dynamic (read/write) databases accesses as well, but you
  need to use locking and data flushing to avoid data corruption."

Is anyone aware of a safe way to do multi-process read/write access
through a dbm module other than BerkeleyDB.pm without tie-ing and
untie-ing every time? I thought that was the only safe thing to do
because of buffering issues, but this seems to be implying that careful
use of sync calls or something similar would do the trick. Maybe this
is just left over from before the problem with the old technique
described in the DB_File docs was discovered? Any comments?

===
Date: Tue, 20 Mar 2001 11:08:11 +0800 (SGT)
From: Stas Bekman <stas@stason.org>
To: Perrin Harkins <perrin@primenet.com>
Cc: <modperl@apache.org>
Subject: Re: dbm locking info in the guide

On Mon, 19 Mar 2001, Perrin Harkins wrote:

> While working on adding info on Berkeley DB to the Guide, I came
> across this statement:
>
>   "If you need to access a dbm file in your mod_perl code in the read
>   only mode the operation would be much faster if you keep the dbm
>   file open (tied) all the time and therefore ready to be used. This
>   will work with dynamic (read/write) databases accesses as well, but
>   you need to use locking and data flushing to avoid data
>   corruption."
>
> Is anyone aware of a safe way to do multi-process read/write access
> through a dbm module other than BerkeleyDB.pm without tie-ing and
> untie-ing every time? I thought that was the only safe thing to do
> because of buffering issues, but this seems to be implying that
> careful use of sync calls or something similar would do the trick.
> Maybe this is just left over from before the problem with the old
> technique described in the DB_File docs was discovered? Any comments?

Well, I wrote this based on my experience. I've used the code that does
locking coupled with sync() and it worked fine. I know that the guide
doesn't cover the details; this is one of the things that needs to be
done. I also suppose that this issue should be properly benchmarked,
but I think that it's safe to assume that skipping tie/untie improves
the speed. Certainly the note "would be much faster,..." is correct if
the interaction with a dbm file is very light, which makes the overhead
of tie significant.

===
Date: Tue, 20 Mar 2001 13:46:15 -0800 (PST)
From: Perrin Harkins <perrin@primenet.com>
To: Stas Bekman <stas@stason.org>
Subject: Re: dbm locking info in the guide

On Tue, 20 Mar 2001, Stas Bekman wrote:

> > Is anyone aware of a safe way to do multi-process read/write access
> > through a dbm module other than BerkeleyDB.pm without tie-ing and
> > untie-ing every time? I thought that was the only safe thing to do
> > because of buffering issues, but this seems to be implying that
> > careful use of sync calls or something similar would do the trick.
> > Maybe this is just left over from before the problem with the old
> > technique described in the DB_File docs was discovered? Any
> > comments?
>
> Well, I wrote this based on my experience. I've used the code that
> does locking coupled with sync() and it worked fine.

You mean with DB_File? There's a big warning in the current version
saying not to do that, because there is some initial buffering that
happens when opening a database.

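The benchmarking question raised in this exchange (how much tie/untie per access actually costs for a read-only workload) could be measured with something like the sketch below. It is only an illustration under stated assumptions: DB_File is installed, the data set is a made-up 1000-key hash in a temporary directory, and the sub names read_persistent and read_retie are hypothetical. It measures read-only access only and says nothing about the safety of read/write access.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Fcntl qw(O_CREAT O_RDWR O_RDONLY);
use DB_File;
use File::Temp qw(tempdir);

# Hypothetical test database in a throwaway directory.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/bench.db";

# Populate the database once, then close it.
{
    my %db;
    tie %db, 'DB_File', $file, O_CREAT | O_RDWR, 0644, $DB_HASH
        or die "cannot tie $file: $!";
    $db{"key$_"} = "value$_" for 1 .. 1000;
    untie %db;
}

# Strategy 1: keep the file tied read-only for the life of the process.
my %persistent;
tie %persistent, 'DB_File', $file, O_RDONLY, 0644, $DB_HASH
    or die "cannot tie $file: $!";

sub read_persistent {
    return $persistent{key500};
}

# Strategy 2: tie and untie around every single access.
sub read_retie {
    my %db;
    tie %db, 'DB_File', $file, O_RDONLY, 0644, $DB_HASH
        or die "cannot tie $file: $!";
    my $value = $db{key500};
    untie %db;
    return $value;
}

# Run each strategy for at least one CPU second and print a comparison.
cmpthese(-1, {
    persistent => \&read_persistent,
    retie      => \&read_retie,
});
```

The numbers depend heavily on record size, dbm flavor, and filesystem caching, so results from a toy file like this one should be treated as a rough indication only.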
===
Date: Tue, 20 Mar 2001 15:03:39 -0800
From: Joshua Chamas <joshua@chamas.com>
To: Perrin Harkins <perrin@primenet.com>
Subject: Re: dbm locking info in the guide

Perrin Harkins wrote:

> Is anyone aware of a safe way to do multi-process read/write access
> through a dbm module other than BerkeleyDB.pm without tie-ing and
> untie-ing every time? I thought that was the only safe thing to do
> because of buffering issues, but this seems to be implying that
> careful use of sync calls or something similar would do the trick.
> Maybe this is just left over from before the problem with the old
> technique described in the DB_File docs was discovered? Any comments?

I don't know how to do it safely, which is why I do the tie/untie in
MLDBM::Sync in a locked critical section.

I know the tie/untie MLDBM::Sync strategy with DB_File is slow, but
what size data are you caching? It may be that you can use MLDBM::Sync
with SDBM_File, which would be good with records < 100 bytes, or
MLDBM::Sync with MLDBM::Sync::SDBM_File, which is faster through around
5000-10000 byte records with Compress::Zlib installed. Generally, the
tie/untie with a SDBM_File is pretty fast.

Otherwise, DeWitt Clinton's Cache::Cache might also do it for you, as
the file based cache is probably faster than DB_File with tie/untie per
access for large records.

===
Date: Tue, 20 Mar 2001 16:28:12 -0800 (PST)
From: Perrin Harkins <perrin@primenet.com>
To: Joshua Chamas <joshua@chamas.com>
Subject: Re: dbm locking info in the guide

On Tue, 20 Mar 2001, Joshua Chamas wrote:

> I know the tie/untie MLDBM::Sync strategy with DB_File is slow, but
> what size data are you caching?

I'm not. Well, actually I am, but I use BerkeleyDB which handles its
own locking. I just noticed this in the Guide and figured that either
it was out of date or I missed something interesting.

> It may be that you can use MLDBM::Sync with SDBM_File, which would be
> good with records < 100 bytes, or MLDBM::Sync with
> MLDBM::Sync::SDBM_File, which is faster through around 5000-10000
> byte records with Compress::Zlib installed. Generally, the tie/untie
> with a SDBM_File is pretty fast.

I'll update the Guide to mention your module in the dbm section.

===
Date: Wed, 21 Mar 2001 09:59:22 +0800 (SGT)
From: Stas Bekman <stas@stason.org>
To: Perrin Harkins <perrin@primenet.com>
Cc: <modperl@apache.org>
Subject: Re: dbm locking info in the guide

On Tue, 20 Mar 2001, Perrin Harkins wrote:

> On Tue, 20 Mar 2001, Stas Bekman wrote:
> > > Is anyone aware of a safe way to do multi-process read/write
> > > access through a dbm module other than BerkeleyDB.pm without
> > > tie-ing and untie-ing every time? I thought that was the only
> > > safe thing to do because of buffering issues, but this seems to
> > > be implying that careful use of sync calls or something similar
> > > would do the trick. Maybe this is just left over from before the
> > > problem with the old technique described in the DB_File docs was
> > > discovered? Any comments?
> >
> > Well, I wrote this based on my experience. I've used the code that
> > does locking coupled with sync() and it worked fine.
>
> You mean with DB_File? There's a big warning in the current version
> saying not to do that, because there is some initial buffering that
> happens when opening a database.

The warning says not to lock on dbm fd but an external file! That's
where the problem happens.

http://perl.apache.org/guide/dbm.html#Flawed_Locking_Methods_Which_Mus

If you lock before you tie, and flush before you untie (or change the
lock type), it should be safe.

===
Date: Tue, 20 Mar 2001 18:34:40 -0800 (PST)
From: Perrin Harkins <perrin@primenet.com>
To: Stas Bekman <stas@stason.org>
Subject: Re: dbm locking info in the guide

On Wed, 21 Mar 2001, Stas Bekman wrote:

> > You mean with DB_File? There's a big warning in the current version
> > saying not to do that, because there is some initial buffering that
> > happens when opening a database.
>
> The warning says not to lock on dbm fd but an external file!

I think you'll still have problems with this technique, unless you
tie/untie every time. I'm looking at the perldoc for DB_File version
1.76, at the section titled "Locking: the trouble with fd". At the very
least, you'd have to call sync after acquiring a write lock but before
writing anything.

===
From: "David Harris" <dharris@drh.net>
To: "Perrin Harkins" <perrin@primenet.com>, "Stas Bekman" <stas@stason.org>
Cc: <modperl@apache.org>
Subject: RE: dbm locking info in the guide
Date: Tue, 20 Mar 2001 21:53:23 -0500

Perrin Harkins [mailto:perrin@primenet.com] wrote:

> I think you'll still have problems with this technique, unless you
> tie/untie every time. I'm looking at the perldoc for DB_File version
> 1.76, at the section titled "Locking: the trouble with fd". At the
> very least, you'd have to call sync after acquiring a write lock but
> before writing anything.

Here is more information from the original discovery of the bug. This
contains the test cases that actually show the database corruption,
and also some documentation on the details, such as systraces that show
reading happens before the flock system call.

http://www.davideous.com/misc/dblockflaw-1.2.tar.gz
http://www.davideous.com/misc/dblockflaw-1.2/

Sync may or may not work, depending on how the low level buffering is
implemented. If it re-reads all information from disk I don't think you
have any advantage over simply closing the DB_File and opening it
again.

It's also worthwhile to use an external lock file because that properly
locks for database creation.

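The approach that several posters describe as the known-safe one (take a lock on an external file, tie, do the work, untie, then release the lock) can be sketched roughly as follows. This is an illustrative sketch, not the code of MLDBM::Sync or of the Guide: the helper name with_locked_dbm and the ".lock" file suffix are hypothetical, and DB_File is assumed to be installed.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_CREAT O_RDWR O_RDONLY);
use DB_File;
use File::Temp qw(tempdir);

# Run $code->(\%db) with the dbm file tied only while the external lock
# is held. The helper name and the ".lock" suffix are hypothetical.
sub with_locked_dbm {
    my ($path, $lock_type, $code) = @_;

    # Lock a separate file, never the dbm file's own descriptor.
    open my $lock_fh, '>>', "$path.lock"
        or die "cannot open $path.lock: $!";
    flock($lock_fh, $lock_type)
        or die "cannot lock $path.lock: $!";

    # Writers may create the database; readers open it read-only.
    my $flags = $lock_type == LOCK_EX ? O_CREAT | O_RDWR : O_RDONLY;

    # Tie only after the lock is held, so nothing is cached too early.
    my %db;
    tie %db, 'DB_File', $path, $flags, 0644, $DB_HASH
        or die "cannot tie $path: $!";

    my @result = eval { $code->(\%db) };
    my $err = $@;

    # Untie (which flushes and closes) before the lock is released.
    untie %db;
    flock($lock_fh, LOCK_UN);
    close $lock_fh;

    die $err if $err;
    return wantarray ? @result : $result[0];
}

# Usage: two locked writes followed by a locked read.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/demo.db";

with_locked_dbm($file, LOCK_EX, sub { $_[0]{counter}++ });
with_locked_dbm($file, LOCK_EX, sub { $_[0]{counter}++ });
my $count = with_locked_dbm($file, LOCK_SH, sub { $_[0]{counter} });
print "counter = $count\n";
```

Because the database is never open outside the lock, no process can hold cached pages that another process has since changed. The price is one tie/untie per access.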
===
Date: Wed, 21 Mar 2001 10:58:06 +0800 (SGT)
From: Stas Bekman <stas@stason.org>
To: Perrin Harkins <perrin@primenet.com>
Cc: mod_perl list <modperl@apache.org>
Subject: Re: dbm locking info in the guide

On Tue, 20 Mar 2001, Perrin Harkins wrote:

> On Wed, 21 Mar 2001, Stas Bekman wrote:
> > > You mean with DB_File? There's a big warning in the current
> > > version saying not to do that, because there is some initial
> > > buffering that happens when opening a database.
> >
> > The warning says not to lock on dbm fd but an external file!
>
> I think you'll still have problems with this technique, unless you
> tie/untie every time. I'm looking at the perldoc for DB_File version
> 1.76, at the section titled "Locking: the trouble with fd". At the
> very least, you'd have to call sync after acquiring a write lock but
> before writing anything.

Of course, you always call sync. The sync was always there, even with
the flawed locking scheme. The DB_File doc was updated as a follow-up
to the research done by David Harris.

This should work:

    flock SH
    tie()
    read...
    flock EX   <===== start critical section
    write...
    sync()
    flock SH   <===== end critical section
    read...
    untie()
    flock UN

Notice that the locking is done on the *external* file (or fd).

The only problem in this approach is a possible writing starvation as
explained:
http://perl.apache.org/guide/dbm.html#Locking_dbm_Handlers_and_Write_L

===
Date: Wed, 21 Mar 2001 11:05:46 +0800 (SGT)
From: Stas Bekman <stas@stason.org>
To: David Harris <dharris@drh.net>
Cc: Perrin Harkins <perrin@primenet.com>, <modperl@apache.org>
Subject: RE: dbm locking info in the guide

On Tue, 20 Mar 2001, David Harris wrote:

> Perrin Harkins [mailto:perrin@primenet.com] wrote:
> > I think you'll still have problems with this technique, unless you
> > tie/untie every time. I'm looking at the perldoc for DB_File
> > version 1.76, at the section titled "Locking: the trouble with fd".
> > At the very least, you'd have to call sync after acquiring a write
> > lock but before writing anything.
>
> Here is more information from the original discovery of the bug. This
> contains the test cases that actually show the database corruption,
> and also some documentation on the details, such as systraces that
> show reading happens before the flock system call.
>
> http://www.davideous.com/misc/dblockflaw-1.2.tar.gz
> http://www.davideous.com/misc/dblockflaw-1.2/
>
> Sync may or may not work, depending on how the low level buffering is
> implemented.

So basically what you are saying is that sync() is broken and shouldn't
be used at all. Something fishy is going on. The purpose of sync() is
to flush the modifications to the disk.

> If it re-reads all information from disk I don't think you have any
> advantage over simply closing the DB_File and opening it again.

Why should it re-read the file, when it's supposed to *write* it down
to the disk? The benefit of using dbm is that only some blocks of the
file are read into memory.

Unless you are talking about a process that wants to read after some
other process has changed the database, and there is a hazard that the
former process has the data cached and will not know that the dbm has
been modified.

> It's also worthwhile to use an external lock file because that
> properly locks for database creation.

That's for sure.

===
Date: Tue, 20 Mar 2001 22:58:22 -0800
From: Perrin Harkins <perrin@primenet.com>
To: Stas Bekman <stas@stason.org>
Subject: Re: dbm locking info in the guide

Stas Bekman wrote:

> So basically what you are saying is that sync() is broken and
> shouldn't be used at all. Something fishy is going on. The purpose of
> sync() is to flush the modifications to the disk.

Saving changes to disk isn't the problem. The issue is that some of the
database gets cached in memory when you open the database (even if you
don't actually read anything from it), so changes made in other
processes will not be seen. To get around this, you would have to
somehow reload the cached data from disk just after getting a write
lock but before making any changes.

> Unless you are talking about a process that wants to read after some
> other process has changed the database, and there is a hazard that
> the former process has the data cached and will not know that the dbm
> has been modified.

Exactly. Keeping the database open is fine as long as you have a
read-only app. For read/write, you have to tie/untie every time. Or use
BerkeleyDB.

===
Date: Wed, 21 Mar 2001 22:45:00 +0800 (SGT)
From: Stas Bekman <stas@stason.org>
To: Perrin Harkins <perrin@primenet.com>
Cc: David Harris <dharris@drh.net>, <modperl@apache.org>
Subject: Re: dbm locking info in the guide

On Tue, 20 Mar 2001, Perrin Harkins wrote:

> Stas Bekman wrote:
> > So basically what you are saying is that sync() is broken and
> > shouldn't be used at all. Something fishy is going on. The purpose
> > of sync() is to flush the modifications to the disk.
>
> Saving changes to disk isn't the problem. The issue is that some of
> the database gets cached in memory when you open the database (even
> if you don't actually read anything from it), so changes made in
> other processes will not be seen. To get around this, you would have
> to somehow reload the cached data from disk just after getting a
> write lock but before making any changes.
>
> > Unless you are talking about a process that wants to read after
> > some other process has changed the database, and there is a hazard
> > that the former process has the data cached and will not know that
> > the dbm has been modified.
>
> Exactly. Keeping the database open is fine as long as you have a
> read-only app. For read/write, you have to tie/untie every time. Or
> use BerkeleyDB.

Ok, what about calling sync before accessing the database (read and
write)? Will it force the process to sync its data with the disk, or
will it cause corruption of the file on the disk, as the process might
have stale data?

===
From: "Perrin Harkins" <perrin@primenet.com>
To: "Stas Bekman" <stas@stason.org>
Cc: "David Harris" <dharris@drh.net>, <modperl@apache.org>
Subject: Re: dbm locking info in the guide
Date: Wed, 21 Mar 2001 09:33:03 -0800

> Ok, what about calling sync before accessing the database (read and
> write)? Will it force the process to sync its data with the disk, or
> will it cause corruption of the file on the disk, as the process
> might have stale data?

Well, that's what we don't know. As David Harris pointed out, if it
does do the right thing and re-read from disk, it's probably not much
better than re-opening the database. I suppose it would avoid some Perl
object creation though, so it would be at least a little faster.

===
Date: Fri, 23 Mar 2001 01:30:00 +0800 (SGT)
From: Stas Bekman <stas@stason.org>
To: Perrin Harkins <perrin@primenet.com>
Cc: David Harris <dharris@drh.net>, <modperl@apache.org>
Subject: Re: dbm locking info in the guide

On Wed, 21 Mar 2001, Perrin Harkins wrote:

> > Ok, what about calling sync before accessing the database (read and
> > write)? Will it force the process to sync its data with the disk,
> > or will it cause corruption of the file on the disk, as the process
> > might have stale data?
>
> Well, that's what we don't know. As David Harris pointed out, if it
> does do the right thing and re-read from disk, it's probably not much
> better than re-opening the database. I suppose it would avoid some
> Perl object creation though, so it would be at least a little faster.

As the person who discovered this bad flaw in the DB_File docs and made
sure that the right thing would be done, maybe David will have time to
go further and check up on this issue. I would definitely do it myself,
but there are so many things I have to do, I just cannot do it now :(

Or anybody else who wants to contribute to the community by a little
effort? Just grab the tgz which represents the problem, from the URL
posted a few days ago by David, and see if you can tackle this issue of
the correctness of using sync and the actual benchmarking to check
whether it's faster to do tie/untie or to use sync and locking...

Thanks a bunch!

===
From: "David Harris" <dharris@drh.net>
To: "Stas Bekman" <stas@stason.org>, "Perrin Harkins" <perrin@primenet.com>
Cc: <modperl@apache.org>
Subject: RE: dbm locking info in the guide
Date: Thu, 22 Mar 2001 13:27:36 -0500

Stas Bekman [mailto:stas@stason.org] wrote:

> As the person who discovered this bad flaw in the DB_File docs and
> made sure that the right thing would be done, maybe David will have
> time to go further and check up on this issue. I would definitely do
> it myself, but there are so many things I have to do, I just cannot
> do it now :(
>
> Or anybody else who wants to contribute to the community by a little
> effort? Just grab the tgz which represents the problem, from the URL
> posted a few days ago by David, and see if you can tackle this issue
> of the correctness of using sync and the actual benchmarking to check
> whether it's faster to do tie/untie or to use sync and locking...

I have done some investigation of the sync method in DB_File and this
is what I have determined: the sync method only writes cached
information out to disk. Information already cached in the process is
not invalidated, so sync does not cause a re-read from the disk.

My example program and the annotated strace can be found here for
anyone that wants to see the details:

http://www.davideous.com/misc/dblockflaw-1.2-checksync/synctest.pl
http://www.davideous.com/misc/dblockflaw-1.2-checksync/synctest.strace01

Here is what I think this means for locking:

If you want to downgrade a lock from exclusive to shared, sync the
database and change the lock status. This will allow other readers
access to a fully-written database. No one else will be allowed to
write the database (requiring your process to have invalidated any
cached data) until you have released the shared lock. No problem there.

If you want to upgrade a lock from shared to exclusive, simply request
this upgrade from the locking subsystem and write to the database once
an exclusive lock has been acquired. Since the database has been in a
shared lock since it was opened, no one has written to it. Therefore,
no invalidation of cached data is required since the database on disk
has not changed.

Beware when upgrading shared locks to exclusive locks that: (a) you
don't get a deadlock with two shared locks trying to upgrade at the
same time, and (b) if your locking layer resolves this deadlock by
denying one of the upgrade requests, make sure your program handles
that appropriately.

I imagine one would handle a lock upgrade failure by closing the
database and then requesting an exclusive lock. Perhaps one would want
to roll back changes made to the database or otherwise prepare for this
transition. I'd rather just grab an exclusive lock at the beginning if
I know there's a chance of needing to write the database later on. Or
just close and re-open the database instead of trying the upgrade at
all. Everyone may have their own particular application that needs
something special. However, I'd rather just use an RDBMS if I'm running
into this level of locking details.

Then again, none of that is related to sync as it is not required for a
lock upgrade. :-)

OK, summary:

(1) Seems to me that sync should only be used for downgrading exclusive
    locks to shared, and that sync is well suited for this task.

(2) You can upgrade locks from shared to exclusive without sync, but
    you might want to avoid needing to upgrade locks because of
    deadlock problems.

Hope this helps.....

(Thanks for the break from the Windows2k nightmare. Why does Oracle
Enterprise Manager only run on w2k?! Well, back to work :-)

===
Date: Fri, 23 Mar 2001 08:38:37 +0800 (SGT)
From: Stas Bekman <stas@stason.org>
To: David Harris <dharris@drh.net>
Cc: mod_perl list <modperl@apache.org>
Subject: RE: dbm locking info in the guide

On Thu, 22 Mar 2001, David Harris wrote:

Thanks, David!

> I have done some investigation of the sync method in DB_File and this
> is what I have determined: the sync method only writes cached
> information out to disk. Information already cached in the process is
> not invalidated, so sync does not cause a re-read from the disk.
>
> My example program and the annotated strace can be found here for
> anyone that wants to see the details:
>
> http://www.davideous.com/misc/dblockflaw-1.2-checksync/synctest.pl
> http://www.davideous.com/misc/dblockflaw-1.2-checksync/synctest.strace01
>
> Here is what I think this means for locking:
>
> If you want to downgrade a lock from exclusive to shared, sync the
> database and change the lock status. This will allow other readers
> access to a fully-written database. No one else will be allowed to
> write the database (requiring your process to have invalidated any
> cached data) until you have released the shared lock. No problem
> there.

Are you sure? Doesn't it contradict the fact that other readers have
already cached the first 4k of data? You have modified the database,
and possibly the first 4k, during the write, so if this is the case,
readers now have the wrong 4k in their cache.

Or do you mean that a process that switches from EX to SH doesn't need
to re-tie(), since *its* cache is valid? Other processes do need to
re-tie when acquiring any kind of lock, if they don't have one yet.

The rest seems to be correct.

===
From: "David Harris" <dharris@drh.net>
To: "Stas Bekman" <stas@stason.org>
Cc: "mod_perl list" <modperl@apache.org>
Subject: RE: dbm locking info in the guide
Date: Thu, 22 Mar 2001 20:07:16 -0500

Stas Bekman [mailto:stas@stason.org] wrote:

> On Thu, 22 Mar 2001, David Harris wrote:
> > If you want to downgrade a lock from exclusive to shared, sync the
> > database and change the lock status. This will allow other readers
> > access to a fully-written database. No one else will be allowed to
> > write the database (requiring your process to have invalidated any
> > cached data) until you have released the shared lock. No problem
> > there.
>
> Are you sure? Doesn't it contradict the fact that other readers have
> already cached the first 4k of data? You have modified the database,
> and possibly the first 4k, during the write, so if this is the case,
> readers now have the wrong 4k in their cache.
>
> Or do you mean that a process that switches from EX to SH doesn't
> need to re-tie(), since *its* cache is valid? Other processes do need
> to re-tie when acquiring any kind of lock, if they don't have one
> yet.

Two points about switching from exclusive mode to shared mode:

(1) When downgrading from EX to SH, no other processes need to have
cached data invalidated because no one else can have the database open.
There is no cache in other processes, therefore none to be invalidated.

Explanation: Let's say the method for downgrading a lock from EX to SH
is like this: write data, sync(), flock(FLOCK_SH), read data. Until the
flock(FLOCK_SH) nobody else can have the database open because of the
exclusive lock. Therefore, there will not be any other processes with
the database open and the first 4k cached in memory when the sync()
happens.

(2) When downgrading from EX to SH, our process does not need to
invalidate cached data because its cached data is correct at the sync()
and the data on disk will not be changed until the database is closed.

Explanation: Again we downgrade from EX to SH by doing this: write
data, sync(), flock(FLOCK_SH), read data. Our cache remains valid the
entire time here. With the sync(), data in our cache is written to
disk, so at that point we are good. Then after the flock(FLOCK_SH) we
are still good because the shared lock prevents anyone else from
writing to the database and changing the data on disk. There is no need
to do a re-tie().

===
Date: Fri, 23 Mar 2001 11:21:58 +0800 (SGT)
From: Stas Bekman <stas@stason.org>
To: David Harris <dharris@drh.net>
Cc: mod_perl list <modperl@apache.org>
Subject: RE: dbm locking info in the guide

On Thu, 22 Mar 2001, David Harris wrote:

> Two points about switching from exclusive mode to shared mode:
>
> (1) When downgrading from EX to SH, no other processes need to have
> cached data invalidated because no one else can have the database
> open. There is no cache in other processes, therefore none to be
> invalidated.
>
> Explanation: Let's say the method for downgrading a lock from EX to
> SH is like this: write data, sync(), flock(FLOCK_SH), read data.
> Until the flock(FLOCK_SH) nobody else can have the database open
> because of the exclusive lock. Therefore, there will not be any other
> processes with the database open and the first 4k cached in memory
> when the sync() happens.

David, please consider this scenario:

... At some point in time, processes A and B both read from the dbm via
SH lock.

1. A completes its reading and unlocks the DBM, while still having the
   first 4k cached. (A still has the dbm tie()'d.)

2. B wants to write, so it requests an EX lock and gets it granted.

3. B modifies the data in the first 4k, syncs it and releases the lock.

4. A asks for SH or EX lock, gets it, but its cache is invalid.

=> we have a data corruption (especially in the case A does writing
   into the first 4k)

> (2) When downgrading from EX to SH, our process does not need to
> invalidate cached data because its cached data is correct at the
> sync() and the data on disk will not be changed until the database is
> closed.
>
> Explanation: Again we downgrade from EX to SH by doing this: write
> data, sync(), flock(FLOCK_SH), read data. Our cache remains valid the
> entire time here. With the sync(), data in our cache is written to
> disk, so at that point we are good. Then after the flock(FLOCK_SH) we
> are still good because the shared lock prevents anyone else from
> writing to the database and changing the data on disk. There is no
> need to do a re-tie().

That's correct.

===
From: "David Harris" <dharris@drh.net>
To: "Stas Bekman" <stas@stason.org>
Cc: "mod_perl list" <modperl@apache.org>
Subject: RE: dbm locking info in the guide
Date: Fri, 23 Mar 2001 07:56:34 -0500

Stas,

Sounds like you agree with me that downgrading locks from exclusive to
shared is not a problem with the method I described in the last e-mail.

Now you have a concern with upgrading locks from shared to exclusive:

> David, please consider this scenario:
>
> ... At some point in time, processes A and B both read from the dbm
> via SH lock.
>
> 1. A completes its reading and unlocks the DBM, while still having
>    the first 4k cached. (A still has the dbm tie()'d.)
>
> 2. B wants to write, so it requests an EX lock and gets it granted.

This will not happen. When B requests the EX lock it will block until
all of the other shared locks have been released. Process A has to
release the SH lock somehow for B to get the EX lock. Either A simply
finishes and releases the lock, or A requests an upgrade, is denied,
and handles this by releasing the lock.

When the EX lock is granted (whether from an upgrade or not), by
definition no other processes can have a SH lock and be reading the
database. No other processes can have a first 4k cached because no
other processes can have the file open.

From the flock manpage: "A single file may not simultaneously have both
shared and exclusive locks."

> 3. B modifies the data in the first 4k, syncs it and releases the
>    lock.
>
> 4. A asks for SH or EX lock, gets it, but its cache is invalid.
>
> => we have a data corruption (especially in the case A does writing
>    into the first 4k)

===
Date: Fri, 23 Mar 2001 22:20:58 +0800 (SGT)
From: Stas Bekman <stas@stason.org>
To: David Harris <dharris@drh.net>
Cc: mod_perl list <modperl@apache.org>
Subject: RE: dbm locking info in the guide

On Fri, 23 Mar 2001, David Harris wrote:

> Stas,
>
> Sounds like you agree with me that downgrading locks from exclusive
> to shared is not a problem with the method I described in the last
> e-mail.

That's correct.

> Now you have a concern with upgrading locks from shared to exclusive:
>
> > David, please consider this scenario:
> >
> > ... At some point in time, processes A and B both read from the dbm
> > via SH lock.
> >
> > 1. A completes its reading and unlocks the DBM, while still having
> >    the first 4k cached. (A still has the dbm tie()'d.)
> >
> > 2. B wants to write, so it requests an EX lock and gets it granted.
>
> This will not happen. When B requests the EX lock it will block until
> all of the other shared locks have been released. Process A has to
> release the SH lock somehow for B to get the EX lock. Either A simply
> finishes and releases the lock, or A requests an upgrade, is denied,
> and handles this by releasing the lock.

That's if you code it that way. Nothing prevents you from unlocking A,
and then asking for some lock later. You always want to make the
critical section as short as possible. So if you need to access the dbm
file twice during the request, you may go through this scenario:

    A: flock SH
    B: flock SH
    A: flock UN
    B: flock EX
    B: flock SH
    A: flock SH

'A' still has the data cached and possibly invalid.

Your proposed system is clean only in this case:

You can never explicitly unlock the dbm and then relock it without
calling untie(). You can safely upgrade the lock from SH to EX and
downgrade from EX to SH though, without using UN (sort of
semi-atomically).

> When the EX lock is granted (whether from an upgrade or not), by
> definition no other processes can have a SH lock and be reading the
> database. No other processes can have a first 4k cached because no
> other processes can have the file open.

They can, if they weren't untie()d per my above explanation.

===
From: "David Harris" <dharris@drh.net>
To: "Stas Bekman" <stas@stason.org>
Cc: "mod_perl list" <modperl@apache.org>
Subject: RE: dbm locking info in the guide
Date: Fri, 23 Mar 2001 14:55:58 -0500

Stas Bekman [mailto:stas@stason.org] wrote:

> > Now you have a concern with upgrading locks from shared to
> > exclusive:
> >
> > > David, please consider this scenario:
> > >
> > > ... At some point in time, processes A and B both read from the
> > > dbm via SH lock.
> > >
> > > 1. A completes its reading and unlocks the DBM, while still
> > >    having the first 4k cached. (A still has the dbm tie()'d.)
> > >
> > > 2. B wants to write, so it requests an EX lock and gets it
> > >    granted.
> >
> > This will not happen. When B requests the EX lock it will block
> > until all of the other shared locks have been released. Process A
> > has to release the SH lock somehow for B to get the EX lock. Either
> > A simply finishes and releases the lock, or A requests an upgrade,
> > is denied, and handles this by releasing the lock.
>
> That's if you code it that way. Nothing prevents you from unlocking
> A, and then asking for some lock later. You always want to make the
> critical section as short as possible. So if you need to access the
> dbm file twice during the request, you may go through this scenario:
>
>     A: flock SH
>     B: flock SH
>     A: flock UN
>     B: flock EX
>     B: flock SH
>     A: flock SH
>
> 'A' still has the data cached and possibly invalid.
>
> Your proposed system is clean only in this case:
>
> You can never explicitly unlock the dbm and then relock it without
> calling untie(). You can safely upgrade the lock from SH to EX and
> downgrade from EX to SH though, without using UN (sort of
> semi-atomically).

Perhaps we have a misunderstanding here. I would NEVER flock(UN)
without having just previously untie()d the database. And I would
ALWAYS acquire a lock immediately before tie()ing the database.

I would never drop the lock while keeping the database open (but not
writing to it) and then reacquire a lock at a later date and start
reading or writing the database. Perhaps this _could_ be done in some
cases (even though it seems to invalidate the whole idea of a lock on a
resource IMHO); however, since sync() does not re-read data from disk
it can't be done in this case.

When I said:

}} Process A has to release the SH lock somehow for B to get the EX
}} lock. Either A simply finishes and releases the lock, or A requests
}} an upgrade, is denied, and handles this by releasing the lock.

I assumed that if A releases the SH lock, it has closed the database.
Releasing the lock is a guarantee that A no longer has the database
open. If A at some future time reacquires a lock, it then just reopens
the database.

> > When the EX lock is granted (whether from an upgrade or not), by
> > definition no other processes can have a SH lock and be reading the
> > database. No other processes can have a first 4k cached because no
> > other processes can have the file open.
>
> They can, if they weren't untie()d per my above explanation.

Just don't unlock without untie()ing first and you are fine. Like I
said, this was my assumption, just never stated. Sorry for not being
clear.

===
Date: Sat, 24 Mar 2001 12:59:29 +0800 (SGT)
From: Stas Bekman <stas@stason.org>
To: David Harris <dharris@drh.net>
Cc: mod_perl list <modperl@apache.org>
Subject: RE: dbm locking info in the guide

> Perhaps we have a misunderstanding here. I would NEVER flock(UN)
> without having just previously untie()d the database. And I would
> ALWAYS acquire a lock immediately before tie()ing the database.

That's the point. We have to write down all the assumptions or people
will do the wrong thing. I'll try to summarize our discussion later.

===
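The assumptions the thread converges on (lock the external file before tie(), sync() before downgrading EX to SH, and untie() before releasing the lock) might look like this for a single session. This is a sketch under stated assumptions rather than the Guide's final text: DB_File is assumed to be installed, the file names are made up, and the exclusive lock is taken up front to sidestep the upgrade deadlock David describes. One caveat from outside the thread: the Linux flock(2) man page notes that converting a lock is not guaranteed to be atomic, so the downgrade step should be treated as platform-dependent.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_CREAT O_RDWR);
use DB_File;
use File::Temp qw(tempdir);

# Hypothetical file locations for the demonstration.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/session.db";

# 1. Lock the external lock file first. Taking LOCK_EX from the start
#    avoids a later SH -> EX upgrade and its deadlock risk.
open my $lock_fh, '>>', "$file.lock" or die "cannot open lock file: $!";
flock($lock_fh, LOCK_EX) or die "cannot get exclusive lock: $!";

# 2. Tie only while the lock is held, so the pages cached at open time
#    cannot be changed behind our back.
my %db;
my $dbh = tie %db, 'DB_File', $file, O_CREAT | O_RDWR, 0644, $DB_HASH
    or die "cannot tie $file: $!";

# 3. Write, then flush our buffers to disk. sync() only writes; it does
#    not re-read anything from disk.
$db{greeting} = 'hello';
$dbh->sync;

# 4. Downgrade to a shared lock. Our own cache matches the disk, and
#    readers that tie from now on see the fully written file.
flock($lock_fh, LOCK_SH) or die "cannot downgrade lock: $!";
my $seen = $db{greeting};

# 5. Drop the object reference, untie, and only then release the lock.
undef $dbh;
untie %db;
flock($lock_fh, LOCK_UN) or die "cannot unlock: $!";
close $lock_fh;

print "read back: $seen\n";
```

A process that needs the database again after step 5 would repeat the whole sequence from step 1, re-tying under a fresh lock instead of reusing a handle that was open while unlocked.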