Provided by: inn2-dev_2.6.4-2build4_amd64

NAME

       dbzinit,  dbzfresh,  dbzagain,  dbzclose, dbzexists, dbzfetch, dbzstore, dbzsync, dbzsize, dbzgetoptions,
       dbzsetoptions, dbzdebug - database routines

SYNOPSIS

       #include <inn/dbz.h>

       bool dbzinit(const char *base)

       bool dbzclose(void)

       bool dbzfresh(const char *base, long size)

       bool dbzagain(const char *base, const char *oldbase)

       bool dbzexists(const HASH key)

       off_t dbzfetch(const HASH key)
       bool dbzfetch(const HASH key, void *ivalue)

       DBZSTORE_RESULT dbzstore(const HASH key, off_t offset)
       DBZSTORE_RESULT dbzstore(const HASH key, void *ivalue)

       bool dbzsync(void)

       long dbzsize(long nentries)

       void dbzgetoptions(dbzoptions *opt)

       void dbzsetoptions(const dbzoptions opt)

DESCRIPTION

       These functions provide an indexing system for rapid random access to a text file (the base file).

       Dbz stores offsets into the base text file for rapid retrieval.  All retrievals are keyed on a hash value
       that is generated by the HashMessageID() function.

       Dbzinit opens a database, an index into the base file base, consisting of files base.dir, base.index,
       and base.hash, which must already exist.  (If the database is new, they should be zero-length files.)
       Subsequent accesses go to that database until dbzclose is called to close the database.

       Dbzfetch searches the database for the specified key.  If INN was configured with --enable-tagged-hash,
       it returns the corresponding value, if any; otherwise it returns true and sets the content of ivalue.
       Dbzstore stores the key-value pair in the database if INN was configured with --enable-tagged-hash;
       otherwise it stores the content of ivalue.  Dbzstore will fail unless the database files are writable.
       Dbzexists checks whether the given hash exists in the database.  Dbz is optimized for this operation,
       and it may be significantly faster than dbzfetch.

       Dbzfresh is a variant of dbzinit for creating a new database with more control over details.

       Dbzfresh's  size  parameter  specifies the size of the first hash table within the database, in key-value
       pairs.  Performance will be best if the number of key-value pairs stored in the database does not  exceed
       about  2/3  of size.  (The dbzsize function, given the expected number of key-value pairs, will suggest a
       database size that meets these criteria.)  Assuming that an fseek offset is 4 bytes, the .index file will
       be 4 * size bytes.  The .hash file will be DBZ_INTERNAL_HASH_SIZE * size bytes (the .dir file is tiny and
       roughly constant in size) until the number of key-value pairs exceeds about 80% of size.  (Nothing  awful
       will  happen  if  the database grows beyond 100% of size, but accesses will slow down quite a bit and the
       .index and .hash files will grow somewhat.)
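
       As a rough illustration of the arithmetic above, the sketch below estimates a table size and the
       resulting file sizes.  It is not the real dbzsize, which may adjust its result in ways not described
       here.  The names suggest_size, index_file_bytes, hash_file_bytes, OFFSET_BYTES, and HASH_BYTES are
       hypothetical, and the value 6 used for DBZ_INTERNAL_HASH_SIZE is an assumption that should be checked
       against the installed inn/dbz.h.

```c
#include <assert.h>

/* Assumptions for illustration only: a 4-byte offset, as in the text above,
   and 6 bytes of hash per entry (check DBZ_INTERNAL_HASH_SIZE in dbz.h). */
#define OFFSET_BYTES 4L
#define HASH_BYTES   6L

/* Hypothetical stand-in for dbzsize(): the smallest size such that
   nentries is at most 2/3 of size. */
long suggest_size(long nentries)
{
    assert(nentries >= 0);
    return (nentries * 3 + 1) / 2;   /* ceil(nentries * 3 / 2) */
}

/* Expected size of the .index file for a table of the given size. */
long index_file_bytes(long size)
{
    return OFFSET_BYTES * size;
}

/* Expected size of the .hash file for a table of the given size. */
long hash_file_bytes(long size)
{
    return HASH_BYTES * size;
}
```

       Under these assumptions, one million expected keys give a size of 1,500,000, an .index file of
       6,000,000 bytes, and a .hash file of 9,000,000 bytes.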

       Dbz stores up to DBZ_INTERNAL_HASH_SIZE bytes of the message-id's hash in the .hash  file  to  confirm  a
       hit.   This  eliminates  the  need to read the base file to handle collisions.  This replaces the tagmask
       feature in previous dbz releases.
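
       The confirmation step can be pictured with the sketch below.  It is not the actual dbz code: hash_t,
       FULL_BYTES, STORED_BYTES, and confirm_hit are hypothetical names, the 16-byte hash size assumes an
       MD5-style digest, and the value 6 is an assumed stand-in for DBZ_INTERNAL_HASH_SIZE.  It shows only
       the idea of comparing a stored prefix of the hash with the key being looked up.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define FULL_BYTES   16   /* assumed size of an MD5-style hash */
#define STORED_BYTES 6    /* assumed stand-in for DBZ_INTERNAL_HASH_SIZE */

typedef struct { unsigned char hash[FULL_BYTES]; } hash_t;   /* hypothetical */

/* Return true if the prefix kept in a table slot matches the key's hash.
   A match on the stored prefix is treated as a hit without consulting
   the base file. */
bool confirm_hit(const unsigned char stored[STORED_BYTES], const hash_t *key)
{
    assert(stored != NULL && key != NULL);
    return memcmp(stored, key->hash, STORED_BYTES) == 0;
}
```

       Because only a prefix is compared, two keys whose hashes differ solely in the bytes that are not
       stored would both be reported as hits in this sketch.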

       A size of ``0'' given to dbzfresh is synonymous with the local default; the normal default is suitable
       for tables of 5,000,000 key-value pairs.  Calling dbzinit(name) when the database files are empty
       (zero-length) is equivalent to calling dbzfresh(name, 0).

       When databases are regenerated periodically, as in news, it is simplest to pick the parameters for a  new
       database  based on the old one.  This also permits some memory of past sizes of the old database, so that
       a new database size can be chosen to cover expected fluctuations.  Dbzagain is a variant of  dbzinit  for
       creating  a  new  database  as  a new generation of an old database.  The database files for oldbase must
       exist.  Dbzagain is equivalent to calling dbzfresh with a size equal to the result of applying dbzsize to
       the largest number of entries in the oldbase database and its previous 10 generations.
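
       The size selection described for dbzagain can be sketched as follows.  This is an illustration under
       stated assumptions, not the library's code: largest_usage and GENERATIONS are hypothetical names, and
       the entry counts are assumed to be available as a plain array with the old database first.

```c
#include <assert.h>
#include <stddef.h>

#define GENERATIONS 11   /* the old database plus its previous 10 generations */

/* Hypothetical helper: largest entry count across the recorded generations. */
long largest_usage(const long used[], size_t n)
{
    long max = 0;

    assert(used != NULL && n > 0 && n <= GENERATIONS);
    for (size_t i = 0; i < n; i++)
        if (used[i] > max)
            max = used[i];
    return max;
}
```

       The value returned would then be passed to dbzsize to obtain the size given to dbzfresh.  Taking the
       maximum over several generations means that a single unusually small generation does not shrink the
       next database.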

       When many accesses are being done by the same program, dbz is massively faster if its first hash table is
       in memory.  If the ``pag_incore'' flag is set to INCORE_MEM, an attempt is made to read the table in when
       the database is opened, and dbzclose writes it out to disk again (if it was read successfully and has
       been modified).  Dbzsetoptions can be used to set the pag_incore and exists_incore flags to a new value,
       which should be ``INCORE_NO'', ``INCORE_MEM'', or ``INCORE_MMAP'', for the .hash and .index files
       separately; this does not affect the status of a database that has already been opened.  The default is
       ``INCORE_NO'' for the .index file and ``INCORE_MMAP'' for the .hash file.  The attempt to read the table
       in may fail due to memory shortage; in this case dbz fails with an error.  Stores to an in-memory
       database are not (in general) written out to the file until dbzclose or dbzsync, so if robustness in the
       presence of crashes or concurrent accesses is crucial, in-memory databases should probably be avoided or
       the writethrough option should be set to ``true''.

       If the nonblock option is ``true'', then writes to the .hash and .index files will  be  done  using  non-
       blocking I/O.  This can be significantly faster if your platform supports non-blocking I/O with files.

       Dbzsync  causes  all  buffers  etc. to be flushed out to the files.  It is typically used as a precaution
       against crashes or concurrent accesses when a dbz-using process will be running for a long time.  It is a
       somewhat expensive operation, especially for an in-memory database.

       Concurrent reading of databases is fairly safe, but there is no (inter)locking, so concurrent updating is
       not.

       An open database occupies three stdio streams and two file descriptors.  Apart from stdio buffers,
       memory consumption is negligible except for in-memory databases.

SEE ALSO

       dbm(3), history(5), libinn(3)

DIAGNOSTICS

       Functions returning bool values return ``true'' for success, ``false'' for failure.  Functions returning
       off_t values return -1 for failure.  Dbzinit attempts to have errno set plausibly on return, but
       otherwise this is not guaranteed.  An errno of EDOM from dbzinit indicates that the database did not
       appear to be in dbz format.

       If DBZTEST is defined at compile-time then a main() function will be included.  This will do performance
       and integrity tests.

HISTORY

       The original dbz was written by Jon Zeeff (zeeff@b-tech.ann-arbor.mi.us).  Later contributions by David
       Butler and Mark Moraes.  Extensive reworking, including this documentation, by Henry Spencer
       (henry@zoo.toronto.edu) as part of the C News project.  MD5 code borrowed from RSA.  Extensive reworking
       to remove backwards compatibility and to add hashes into dbz files by Clayton O'Neill
       (coneill@oneill.net).

BUGS

       Unlike dbm, dbz will refuse to dbzstore with a key already in the database.  The user is responsible  for
       avoiding this.

       The  RFC5322  case  mapper  implements  only  a first approximation to the hideously-complex RFC5322 case
       rules.

       Dbz no longer tries to be call-compatible with dbm in any way.

                                                   6 Sep 1997                                             DBZ(3)