Provided by: gscan2pdf_2.13.4-4_all

NAME

       gscan2pdf - A GUI to produce PDFs or DjVus from scanned documents

USAGE

       1. Scan one or several pages in with File/Scan
       2. Create PDF of selected pages with File/Save

REQUIRED ARGUMENTS

       None

OPTIONS

       gscan2pdf has the following command-line options:

       --device=device
           Specifies the device to use, instead of getting the list of devices via the SANE API.  This can
           be useful if the scanner is on a remote computer which is not broadcasting its existence.

       --help
           Displays this help page and exits.

       --log=log-file
           Specifies a file to store logging messages.

       --debug, --info, --warn, --error, --fatal
           Defines the log level.  If a log file is specified, this defaults to --debug, otherwise --error.

       --import=PDF|DjVu|images
           Imports the specified file(s). If the document has more than one  page,  a  window  is  displayed  to
           select the required pages.

       --import-all=PDF|DjVu|images
           Imports all pages of the specified file(s).

       --version
           Displays the program version and exits.

       Scanning  is  handled  with  SANE via scanimage.  PDF conversion is done by PDF::Builder.  TIFF export is
       handled by libtiff (faster and smaller memory footprint for multipage files).

DIAGNOSTICS

       To diagnose a possible error, start gscan2pdf from the command line with logging enabled:

       "gscan2pdf --log=file.log"

       and check file.log.

EXIT STATUS

       None

CONFIGURATION

       gscan2pdf creates a text resource file in ~/.config/gscan2pdfrc. The directory can be changed by  setting
       the $XDG_CONFIG_HOME variable. Generally, however, preferences should be changed via the Edit/Preferences
       menu, or are captured automatically during normal usage of the program.

INCOMPATIBILITIES

       None known.

BUGS AND LIMITATIONS

       Whilst it is possible to import PDFs, this feature is intended primarily for round-tripping files
       created by gscan2pdf.

Download

       gscan2pdf is available on Sourceforge (<https://sourceforge.net/projects/gscan2pdf/files/gscan2pdf/>).

   Debian-based
       If you are using Debian, you should find that sid <https://www.debian.org/releases/sid/> has  the  latest
       version already packaged.

       If you are using an Ubuntu-based system, you can automatically keep up to date with the latest version via
       the ppa:

       "sudo apt-add-repository ppa:jeffreyratcliffe/ppa"

       If you are using Synaptic, then use menu Edit/Reload Package Information, search for gscan2pdf in
       the package list, and lo and behold, you can install the nice shiny new version.

       From the command line:

       "sudo apt update"

       "sudo apt install gscan2pdf"

   From source
       The   source   is   hosted   in   the   files   section   of   the   gscan2pdf   project  on  Sourceforge
       (<https://sourceforge.net/projects/gscan2pdf/files/>).

   From the repository
       gscan2pdf   uses   Git   for   its   Revision   Control   System.   You   can   browse   the   tree    at
       <https://sourceforge.net/p/gscan2pdf/code/>.

       Git users can clone the complete tree with "git clone git://git.code.sf.net/p/gscan2pdf/code"

Building gscan2pdf from source

       Having downloaded the source either from a Sourceforge file release, or from the Git repository, unpack
       it if necessary with:

       "tar xvfz gscan2pdf-x.x.x.tar.gz"

       "cd gscan2pdf-x.x.x"

       "perl Makefile.PL" will create the Makefile.

       "make test" should run several hundred tests to confirm that things will work properly on your system.

       You can install directly from the source with "make install", but building the  appropriate  package  for
       your  distribution  should  be  as straightforward as "make debdist" or "make rpmdist". However, you will
       additionally need the rpm, devscripts, fakeroot, debhelper and gettext packages.

Dependencies

       The list below looks daunting, but all packages are available from any reasonably up-to-date
       distribution.  If  you  are  using  Synaptic,  having  installed gscan2pdf, locate the gscan2pdf entry in
       Synaptic, right-click it and you can install them under Recommends. Note  also  that  the  library  names
       given  below  are  the Debian/Ubuntu ones. Those distributions using RPM typically use perl(module) where
       Debian has libmodule-perl.

       Required
           libgtk3-perl >= 0.028
               There is a bug in versions of libgtk3-perl before 0.028 that causes gscan2pdf to crash when
               saving.  Whilst  I  could  prevent  gscan2pdf from crashing, it would still be impossible to save
               anything, rendering gscan2pdf rather useless.

           libgtk3-simplelist-perl
               A simple interface to Gtk3's complex MVC list widget

           liblocale-gettext-perl (>= 1.05)
               Using libc functions for internationalisation in Perl

           libpdf-builder-perl
               provides the functions for creating PDF documents in Perl

           libsane
               API library for scanners

           libimage-sane-perl
               Perl bindings for libsane.

           libset-intspan-perl
               manages sets of integers

           libtiff-tools
               TIFF manipulation and conversion tools

           imagemagick
               Image manipulation programs

           perlmagick
               A perl interface to the libMagick graphics routines

           sane-utils
               API library for scanners -- utilities.

       Optional
           sane
               scanner graphical frontends. Only required for the scanadf frontend.

           unpaper
               post-processing tool for scanned pages. See <https://www.flameeyes.eu/projects/unpaper>.

           xdg-utils
               Desktop  integration  utilities  from  freedesktop.org.  Required  for   Email   as   PDF.    See
               <https://www.freedesktop.org/wiki/Software/xdg-utils/>

           djvulibre-bin
               Utilities for the DjVu image format. See <http://djvu.sourceforge.net/>

           gocr
               A command line OCR. See <http://jocr.sourceforge.net/>.

           tesseract
               A command line OCR. See <https://github.com/tesseract-ocr/tesseract>

           cuneiform
               A command line OCR. See <http://launchpad.net/cuneiform-linux>

Support

       There are two mailing lists for gscan2pdf:

       gscan2pdf-announce
           A   low-traffic   list   for   announcements,   mostly   of   new  releases.  You  can  subscribe  at
           <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-announce>

       gscan2pdf-help
           General support, questions, etc. You can subscribe at
           <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-help>

Reporting bugs

       Before reporting bugs, please read the "FAQs" section.

       Please  report  any  bugs  found,  preferably  against the Debian package[1][2].  You do not need to be a
       Debian user, or set up an account to do this.  The Debian tool "reportbug" provides a convenient GUI  for
       doing so.

       1. https://packages.debian.org/sid/gscan2pdf
       2. https://www.debian.org/Bugs/

       Alternatively,    there    is    a    bug    tracker   for   the   gscan2pdf   project   on   Sourceforge
       (<https://sourceforge.net/p/gscan2pdf/_list/tickets?source=navbar>).

       Please include the log file created by "gscan2pdf --log=log" with any new bug report.

Translations

       gscan2pdf has already been partly translated into several languages.  If you would like to contribute  to
       an existing or new translation, please check out Rosetta: <https://translations.launchpad.net/gscan2pdf>

       Note that the translations for the scanner options are taken directly from sane-backends. If you would
       like to contribute to these, contact the sane-devel mailing list
       (sane-devel@lists.alioth.debian.org) and have a look at the po/ directory in the source code
       <http://www.sane-project.org/cvs.html>.

       Alternatively, Ubuntu has its own translation  project.  For  the  9.04  release,  the  translations  are
       available at <https://translations.launchpad.net/ubuntu/jaunty/+source/sane-backends/+pots/sane-backends>

       If you have updated a ".po" file in the "po" directory of the gscan2pdf source tree and would like to
       test it, pick a test directory for the compiled locales, e.g.  "./locale", and  create  the  ".mo"  files
       with:

       "perl Makefile.PL LOCALEDIR=./locale"

       If the updated locale is your standard one, then the following will find the updated file:

       "perl -I lib bin/gscan2pdf --log=log --locale=locale"

       If it is not your standard locale, you will need something like (for Russian):

       "LC_ALL=ru_RU.utf8 LC_MESSAGES=ru_RU.utf8 LC_CTYPE=ru_RU.utf8 LANG=ru_RU.utf8 LANGUAGE=ru_RU.utf8 perl -I
       lib bin/gscan2pdf --log=log --locale=locale"

       or German:

       "LC_ALL=de_DE  LC_MESSAGES=de_DE  LC_CTYPE=de_DE  LANG=de_DE  LANGUAGE=de_DE  perl  -I  lib bin/gscan2pdf
       --log=log --locale=locale"

       If the above doesn't work, make sure the locale is in the list produced by "locale -a", including any
       ".utf8" suffix. If necessary, generate new locales with "sudo dpkg-reconfigure locales".

DESCRIPTION

   File
       New

       Clears the page list.

       Open

       Opens  any  format that imagemagick supports. PDFs will have their embedded images extracted and imported
       one per page.

       Note that files can also be imported by dragging them  into  the  thumbnail  list  from  a  program  like
       nautilus or konqueror.

       Scan

       Sets options before scanning via SANE.

       Device

       Chooses between available scanners.

       # Pages

       Selects the number of pages to scan, or all pages.

       Source document

       Selects between single-sided and double-sided pages.

       This affects the page numbering.  Single-sided scans are numbered consecutively.  Double-sided scans are
       incremented (or decremented, see below) by 2, i.e. 1, 3, 5, etc.

       Side to scan

       If double  sided  is  selected  above,  assuming  a  non-duplex  scanner,  i.e.  a  scanner  that  cannot
       automatically  scan  both  sides  of  a  page,  this determines whether the page number is incremented or
       decremented by 2.

       To scan both sides of three pages, i.e. 6 sides:

       1. Select:
           # Pages = 3 (or "all" if your scanner can detect when it is out of paper)

           Double sided

           Facing side

       2. Scans sides 1, 3 & 5.
       3. Put pile back with scanner ready to scan back of last page.
       4. Select:
           # Pages = 3 (or "all" if your scanner can detect when it is out of paper)

           Double sided

           Reverse side

       5. Scans sides 6, 4 & 2.
       6. gscan2pdf automatically sorts the pages so that they appear in the correct order.
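
       The numbering in the steps above can be sketched in a few lines of Python. This is only an
       illustration of the arithmetic, not gscan2pdf's own code; the function name page_numbers and its
       arguments are hypothetical.

```python
def page_numbers(n_pages, side="facing", double_sided=True):
    """Return the page numbers assigned to n_pages scanned sheets.

    Illustrative only: the function and argument names are hypothetical
    and do not come from gscan2pdf.
    """
    if not double_sided:
        # Single-sided scans are numbered consecutively.
        return list(range(1, n_pages + 1))
    if side == "facing":
        # Facing sides count up in steps of 2: 1, 3, 5, ...
        return list(range(1, 2 * n_pages, 2))
    if side == "reverse":
        # Reverse sides count down in steps of 2 from the last side: ..., 4, 2
        return list(range(2 * n_pages, 0, -2))
    raise ValueError(f"unknown side: {side!r}")


facing = page_numbers(3, "facing")    # first pass through the feeder
reverse = page_numbers(3, "reverse")  # second pass, pile turned over
# Sorting the combined numbers gives the final reading order.
print(sorted(facing + reverse))
```

       Because the reverse pass counts down from the last side, the pile does not need to be reordered
       by hand between the two passes.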

       Device-dependent options

       These, naturally, depend on your scanner.  They can include:

       Page size
       Mode (colour/black & white/greyscale)
       Resolution (in PPI)
       Batch-scan
           Guarantees that a "no documents" condition will be returned after the last scanned page,  to  prevent
           endless flatbed scans after a batch scan.

       Wait-for-button/Button-wait
           After  sending  the  scan  command,  wait  until the button on the scanner is pressed before actually
           starting the scan process.

       Source
           Selects the document source.  Possible options can include Flatbed or ADF.  On some scanners, this is
           the only way of generating an out-of-documents signal.

       Save

       Saves the selected or all pages as a PDF, DjVu, TIFF, PNG, JPEG, PNM or GIF.

       Metadata

       Metadata are information that is not visible when viewing the PDF/DjVu, but is embedded in the file and
       so is searchable and can be examined, typically with the "Properties" option of the document viewer.

       The metadata are completely optional, but can also be used to generate the filename; see Preferences for
       details.

       The  date  can  be  selected  with  use  of the calendar widget. The displayed date can be incremented or
       decremented with use of the '+' and '-' keys.

       DjVu

       DjVu compresses both black and white and colour images better than PDF.  See
       <http://www.djvuzone.org/> for more details.

       Email as PDF

       Attaches  the  selected or all pages as a PDF to a blank email.  This requires xdg-email, which is in the
       xdg-utils package.  If this is not present, the option is ghosted out.

       Print

       Prints the selected or all pages.

       Compress temporary files

       If your temporary ($TMPDIR) directory is getting full, this function can be useful: it compresses all
       images as LZW-compressed TIFFs. These require much less space than the PNM files that are typically
       produced by SANE or by importing a PDF.

   Edit
       Delete

       Deletes the selected page.

       Renumber

       Renumbers the pages from 1..n.

       Note that the page order can also be changed by drag and drop in the thumbnail view.

       Select

       The select menus can be used to select all, even, odd, blank, dark or modified pages. Selecting blank or
       dark pages runs imagemagick to make the decision.  Selecting modified pages selects those which have
       been modified by threshold, unsharp, etc., since the last OCR run was made.

       Properties

       When an image is scanned, gscan2pdf attempts to extract the resolution from the scan options. This nearly
       always works without problem.

       Importing  an  image  can be trickier, however. Some image formats such as PNM do not encode metadata for
       resolution. In other cases, the data is incorrect.  Edit/Properties allows the user to  manually  correct
       the metadata for a particular page, thus correcting the size of the final PDF or DjVu. The image itself is
       otherwise not changed - it is not down- or upscaled.
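
       To see why the resolution metadata matters, the following Python sketch computes the physical page
       size that a given pixel size implies. It assumes the standard PDF unit of 72 points per inch; the
       function name page_size_points is hypothetical and not part of gscan2pdf.

```python
POINTS_PER_INCH = 72  # default PDF user-space units per inch


def page_size_points(width_px, height_px, ppi):
    """Physical page size in PDF points for an image at a given resolution.

    Hypothetical helper for illustration; not taken from gscan2pdf.
    """
    if ppi <= 0:
        raise ValueError("resolution must be positive")
    return (width_px / ppi * POINTS_PER_INCH,
            height_px / ppi * POINTS_PER_INCH)


# The same 2480 x 3508 pixel image, interpreted at two resolutions.
w300, h300 = page_size_points(2480, 3508, 300)
w72, h72 = page_size_points(2480, 3508, 72)
print(f"300 PPI: {w300:.1f} x {h300:.1f} pt")
print(f" 72 PPI: {w72:.1f} x {h72:.1f} pt")
```

       The pixel data is identical in both cases; only the declared resolution, and therefore the page
       size, differs.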

       Preferences

       The preferences menu item allows the control of the default behaviour of various functions. Most of these
       are self-explanatory.

       Frontends

       gscan2pdf initially supported two frontends, scanimage and scanadf.  scanadf support was  added  when  it
       was  realised  that  scanadf  works  better  than  scanimage with some scanners. On Debian-based systems,
       scanadf is in the sane package, not, like scanimage, in sane-utils. If scanadf is not present, the option
       is obviously ghosted out.

       In 0.9.27, Perl bindings for SANE were introduced. These are called libsane-perl.

       Before 1.2.0, options available through CLI frontends like scanimage were made visible as users asked for
       them. In 1.2.0, all options can be shown or hidden  via  Edit/Preferences,  along  with  the  ability  to
       specify which options trigger a reload.

       In 1.8.3, new Perl bindings for SANE were introduced. These are called libimage-sane-perl and are the
       preferred frontend.

       In 1.8.5, support for libsane-perl was removed.

       Device blacklist

       Ignore listed devices.

       Note that this is a device name regular expression, e.g. /dev/video, and not the name as  listed  in  the
       scan window, e.g. Noname Integrated_Webcam_HD.

       Default filename for PDF or DjVu files

       All  strftime  codes  (e.g.  %Y  for  the  current  year)  are available as variables, with the following
       additions:

       %Da author

       %De filename extension

       %Dt title

       All document date codes use strftime codes with a leading D, e.g.:

       %DY document year

       %Dm document month

       %Dd document day
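
       As a rough illustration of how such a template might expand, here is a Python sketch. It is not
       gscan2pdf's implementation: the function expand_filename, its arguments and the sample values are
       all hypothetical, and it assumes that plain strftime codes refer to the current time while
       %D-prefixed codes refer to the document date.

```python
import re
from datetime import datetime


def expand_filename(template, author, title, extension, doc_date, now):
    """Expand a filename template (hypothetical helper, not gscan2pdf code)."""
    metadata = {"a": author, "e": extension, "t": title}

    def replace(match):
        doc_code, plain_code = match.groups()
        if doc_code is not None:
            # %Da, %De and %Dt come from the metadata; any other
            # %D<code> is a strftime code applied to the document date.
            if doc_code in metadata:
                return metadata[doc_code]
            return doc_date.strftime("%" + doc_code)
        if plain_code == "%":
            return "%"  # a literal percent sign
        # Plain strftime codes are applied to the current time.
        return now.strftime("%" + plain_code)

    # One pass over the template, so substituted text is never re-expanded.
    return re.sub(r"%D([A-Za-z])|%([A-Za-z%])", replace, template)


name = expand_filename(
    "%Da-%Dt-%DY-%Dm-%Dd.%De",
    author="jbloggs", title="invoice", extension="pdf",
    doc_date=datetime(2024, 8, 15), now=datetime(2025, 1, 2),
)
print(name)  # jbloggs-invoice-2024-08-15.pdf
```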

   View
       Zoom 100%

       Zooms to 1:1. How this appears depends on the desktop resolution.

       Zoom to fit

       Scales the view so that the whole page is visible.

       Zoom in

       Zoom out

       Rotate 90° clockwise

       The rotate options require the package imagemagick and, if this is not present, are ghosted out.

       Rotate 180°

       Rotate 90° anticlockwise

   Tools
       Threshold

       Changes all pixels darker than the given value to black; all others become white.
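
       The operation itself is simple, as this Python sketch shows. It assumes 8-bit greyscale values
       (0 is black, 255 is white) held in nested lists; the function name and the value scale are
       assumptions made for illustration, and gscan2pdf itself works on image files.

```python
def threshold(pixels, value):
    """Map pixels darker than value to black (0) and all others to white (255).

    Hypothetical helper: assumes 8-bit greyscale rows of integers.
    """
    return [[0 if p < value else 255 for p in row] for row in pixels]


image = [
    [10, 128, 200],
    [127, 129, 0],
]
for row in threshold(image, 128):
    print(row)
```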

       Unsharp mask

       The unsharp option sharpens an image. The image is convolved with a Gaussian operator of the given radius
       and standard deviation (sigma). For reasonable results, radius should be larger than sigma. Use a  radius
       of 0 to have the method select a suitable radius.
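
       The idea behind unsharp masking can be sketched on a single row of samples: blur the signal with
       a Gaussian, then add back the difference between the original and the blurred copy. The Python
       below is a simplified one-dimensional illustration, not what gscan2pdf or imagemagick actually
       runs. The amount parameter and the edge handling are assumptions, and the automatic choice of
       radius for a radius of 0 is not reproduced.

```python
import math


def gaussian_kernel(radius, sigma):
    """Normalised 1-D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def unsharp_1d(samples, radius, sigma, amount=1.0):
    """Sharpen a row of samples (illustrative sketch; 'amount' is an assumption)."""
    kernel = gaussian_kernel(radius, sigma)
    n = len(samples)
    result = []
    for i in range(n):
        # Blur: weighted average of the neighbours, clamping at the edges.
        blurred = sum(
            weight * samples[min(max(i + offset, 0), n - 1)]
            for weight, offset in zip(kernel, range(-radius, radius + 1))
        )
        # Sharpen: push the sample away from its blurred value.
        result.append(samples[i] + amount * (samples[i] - blurred))
    return result


edge = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]
print([round(v, 2) for v in unsharp_1d(edge, radius=2, sigma=1.0)])
```

       Flat regions are left unchanged, while values either side of an edge overshoot, which is what
       makes the edge look crisper. A real implementation would also clip the result to the valid
       pixel range.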

       Crop

       unpaper

       unpaper (see <https://www.flameeyes.eu/projects/unpaper>) is a utility for cleaning up a scan.

       OCR (Optical Character Recognition)

       The gocr, tesseract or cuneiform utilities are used to produce text from an image.

       There is an OCR output buffer for each page, which is embedded as plain text behind the scanned image in
       the PDF produced. This way, Beagle can index (i.e. search) the plain text.

       In  DjVu  files,  the  OCR  output  buffer  is embedded in the hidden text layer.  Thus these can also be
       indexed by Beagle.

       There is an interesting review of OCR software at
       <https://web.archive.org/web/20080529012847/http://groundstate.ca/ocr>.  An important conclusion was that
       400ppi is necessary for decent results.

       Up  to  v2.04,  the  only  way  to  tell  which languages were available to tesseract was to look for the
       language files. Therefore, gscan2pdf checks the path returned by:

       "tesseract '' '' -l ''"

       If there are no language files in the above location, then  gscan2pdf  assumes  that  tesseract  v1.0  is
       installed, which had no language files.

       Variables for user-defined tools

       The following variables are available:

       %i  input filename

       %o  output filename

       %r  resolution

       An image can be modified in-place by just specifying %i.
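
       The substitution can be pictured with the following Python sketch. It is an illustration only:
       build_command is a hypothetical helper, the file names are made up, and the way gscan2pdf itself
       parses and runs the command may differ.

```python
import re
import shlex


def build_command(template, input_path, output_path, resolution):
    """Substitute %i, %o and %r into a command template and return an argv list.

    Hypothetical helper for illustration; not taken from gscan2pdf.
    """
    values = {"%i": input_path, "%o": output_path, "%r": str(resolution)}
    # Split first, then substitute, so that a path containing spaces
    # still ends up as a single argument.
    return [
        re.sub(r"%[ior]", lambda m: values[m.group(0)], token)
        for token in shlex.split(template)
    ]


# Separate input and output files.
print(build_command("convert %i -density %r -negate %o",
                    "page 1.pnm", "out.pnm", 300))
# In-place modification: only %i is given.
print(build_command("mogrify -negate %i", "page 1.pnm", "out.pnm", 300))
```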

FAQs

   Why isn't option xyz available in the scan window?
       Possibly because SANE or your scanner doesn't support it.

       If  an option listed in the output of "scanimage --help" that you would like to use isn't available, send
       me the output and I will look at implementing it.

   I've only got an old flatbed scanner with no automatic sheetfeeder. How do I scan a multipage document?
       In Edit/Preferences, tick the box "Allow batch scanning from flatbed".

       Some Brother scanners report "out of documents", despite scanning  from  flatbed.   This  can  be  worked
       around by ticking the box "Force new scan job between pages".

       If you are lucky, you have an option like Wait-for-button or Button-wait, where the scanner will wait for
       you to press the scan button on the device before it starts the scan, allowing you to scan multiple pages
       without touching the computer.

       If  you  are  quick,  you  might  be  able  to change the document on the flatbed whilst the scan head is
       returning.

       Otherwise, you have to set the number of pages to scan to 1 and hit the scan button on  the  scan  window
       for each page.

   Why is option xyz ghosted out?
       Probably because the package required for that option is not installed.  Email as PDF requires xdg-email
       (xdg-utils); unpaper and the rotate options require imagemagick.

   Why can I not scan from the flatbed of my HP scanner?
       Generally for HP scanners with an ADF, to scan from the flatbed, you should set "#  Pages"  to  "1",  and
       possibly "Batch scan" to "No".

   When I update gscan2pdf using the Update Manager in Ubuntu, why is the list of changes never displayed?
       As  far  as I can tell, this is pulled from changelogs.ubuntu.com, and therefore only the changelogs from
       official Ubuntu builds are displayed.

   Why can gscan2pdf not find my scanner?
       If your scanner is not connected directly to the machine on which you are running gscan2pdf and you  have
       not  installed  the  SANE  daemon,  saned,  gscan2pdf cannot automatically find it. In this case, you can
       specify the scanner device on the command line:

       "gscan2pdf --device <device>"

   How can I search for text in the OCR layer of the finished PDF or DjVu file?
       pdftotext or djvutxt can extract the text layer from PDF or DjVu files. See the respective man pages for
       details.

       Having opened a PDF or DjVu file in evince or Acrobat Reader, the search function will typically find the
       page with the requested text and highlight it.

       There are various tools for searching or indexing files, including PDF and DjVu:

       •   (meta) Tracker (<https://projects.gnome.org/tracker/>)

       •   plone (<http://plone.org/>)

       •   pdfgrep (<http://pdfgrep.sourceforge.net/>)

       •   swish-e (<http://www.swish-e.org/>)

       •   recoll (<http://www.lesbonscomptes.com/recoll/>)

       •   terrier (<http://www.lesbonscomptes.com/recoll/>)

   How can I change the colour of the selection box in the image viewer?
       Create a file called "~/.config/gtk-3.0/gtk.css" with the following content:

        .rubberband,
        rubberband,
        flowbox rubberband,
        treeview.view rubberband,
        .content-view rubberband,
        .content-view .rubberband {
          border: 1px solid #2a76c6;
          background-color: rgba(42, 118, 198, 0.2); }

   How can I change the colour of the OCR output?
       Create a file called "~/.config/gtk-3.0/gtk.css" with the following content:

        #gscan2pdf-ocr-output {
          color: black;
        }

See Also

       XSane (<http://xsane.org/>)

       Scan Tailor (<http://scantailor.org/>)

Author

       Jeffrey Ratcliffe (jffry at posteo dot net)

Thanks to

       •   all the people who have sent patches, translations, bugs and feedback.

       •   the gtk+ project for a most excellent graphics toolkit.

       •   the Gtk3-Perl project for their superb Perl bindings for GTK3.

       •   the SANE project for scanner access.

       •   Björn Lindqvist for the gtkimageview widget.

       •   Sourceforge for hosting the project.

LICENSE AND COPYRIGHT

       Copyright (C) 2006--2024 Jeffrey Ratcliffe <jffry@posteo.net>

       This program is free software: you can redistribute it and/or modify it under the terms of version 3 of
       the GNU General Public License as published by the Free Software Foundation.

       This program is distributed in the hope that it will be useful, but WITHOUT ANY  WARRANTY;  without  even
       the  implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
       License for more details.

       You should have received a copy of the GNU General Public License along with this program.  If  not,  see
       <https://www.gnu.org/licenses/>.

perl v5.38.2                                       2024-08-15                                      GSCAN2PDF(1p)