A ScanTailor version that merges the features of ScanTailor Featured and ScanTailor Enhanced,
and adds new features and fixes.
Contents:
• Description
• Features
• ScanTailor Enhanced
• Auto margins [improved]
• Page detect [reworked]
• Deviation [reworked]
• Picture shape [reworked]
• Multi column thumbnails view [reworked]
• ScanTailor Featured
• ScanTailor Featured fixes & improvements
• Line vertical dragging on dewarp
• Square picture zones [reworked]
• Auto save project [optimized]
• Quadro Zoner [reworked]
• Marginal dewarping
• ScanTailor Universal
• ScanTailor Universal fixes & improvements
• ScanTailor Advanced
• ScanTailor Advanced fixes & improvements
• Light and Dark color schemes
• Multi-threading support for batch processing
• Full control over settings on output
• Filling outside areas
• Tiff compression
• Adaptive binarization
• Splitting output
• Original background
• Color segmenter and posterization
• Rectangular picture shape
• New zone interaction modes
• Saving zoom and focus on switching output tabs
• Measurement units system
• Status bar panel
• Default parameters
• Collapsible filter options
• Auto adjusting content area
• Black on white detection
• Guides
• Building
Description
ScanTailor is an interactive post-processing tool for scanned pages. It performs operations such as:
• page splitting,
• deskewing,
• adding/removing borders,
• selecting content,
• ... and others.
You give it raw scans, and you get pages ready to be printed or assembled into a PDF or DjVu file.
Scanning, optical character recognition, and assembling multi-page documents are outside the scope
of this project.
Features
ScanTailor Enhanced
1. Deviation [reworked]
The Deviation feature highlights pages that differ from the rest. Pages are highlighted in red in
the Deskew filter when their skew is too high, in the Select Content filter when their content size
differs from the others, and in the Margins filter when they do not match the other pages.
This feature has been reworked. See ScanTailor Advanced fixes & improvements for more information.
1. Picture shape [reworked]
The Picture shape feature adds an option for mixed pages to choose between free-shape and
rectangular-shape images. This patch does not improve the original algorithm; it creates
rectangular shapes from the detected "blobs" and joins intersecting rectangles into one.
This feature has been reworked. See rectangular picture shape feature description.
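The joining of intersecting rectangles described above can be sketched roughly as follows. This is
a minimal illustration, not ScanTailor's code: the `(left, top, right, bottom)` representation and
the function names `intersects`, `union` and `merge_intersecting` are assumptions made for the
example.

```python
def intersects(a, b):
    """True if rectangles a and b (left, top, right, bottom) overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def union(a, b):
    """Smallest rectangle that contains both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_intersecting(rects):
    """Join intersecting rectangles until no two of them overlap."""
    merged = list(rects)
    changed = True
    while changed:
        changed = False
        result = []
        for rect in merged:
            for i, other in enumerate(result):
                if intersects(rect, other):
                    # Replace the stored rectangle with the joined one.
                    result[i] = union(rect, other)
                    changed = True
                    break
            else:
                result.append(rect)
        merged = result
    return merged


# Bounding boxes of three hypothetical "blobs"; the first two overlap.
blobs = [(0, 0, 10, 10), (5, 5, 15, 15), (20, 20, 30, 30)]
shapes = merge_intersecting(blobs)
```

The outer loop repeats because a joined rectangle can grow enough to touch a rectangle it did not
intersect before.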
ScanTailor Featured
• Marginal dewarping
An automatic dewarping mode. Works ONLY with raw scans that have curved top and
bottom page borders (on a black background). It automatically places the red points
of the blue mesh along these borders (to create a distortion model) and then dewarps the
scan according to them. Works best on slightly curved scans.
Note: Other features of this version, such as Export, Dont_Equalize_Illumination_Pic_Zones and
Original_Foreground_Mixed, haven't been ported because of their messy implementation. Their
functionality is fully covered by the full control over settings on output and splitting output
features.
ScanTailor Universal
ScanTailor Advanced
1. Reworked the apply cut feature. When a cut is applied to pages whose dimensions differ
from those of the page the cut was set on, ScanTailor now tries to adapt the cutters instead
of fully rejecting the cut setting and switching to auto mode for those pages, as it did
before. The latter was annoying, as the pages could be similar and differ by only a
few pixels.
2. Added a check to reject invalid cut settings in manual mode.
3. UI: Added interaction between the cutters. They can no longer intersect each
other, which previously created an invalid page layout configuration.
• Reworked the multi column thumbnails view feature from ScanTailor Enhanced. Thumbnails
are now laid out evenly.
• Added an option to control whether the thumbnails of pages with high deviation are highlighted
with red asterisks. The option refreshes the thumbnails instantly.
• Deviation feature reworked.
1. Added the ability to drag both the content and page areas by using the Shift+LMB
combination.
2. The page box implementation has been reworked. It is now interactive and can be adjusted in
the same way as the content box.
3. The page rectangle no longer requires refreshing the page and is not reset when the content
area changes.
4. Implemented applying the page/content boxes to other pages, automatically
correcting the position of the boxes.
5. Added width and height parameters to regulate the page box size in manual mode.
6. The auto margins option has been moved out of the alignment settings and no longer
forces the use of the original layout only.
7. The auto margins feature now takes into account page box changes made at the select content
stage.
8. Other bug fixes and improvements.
• Auto and original alignment modes reworked:
1. Fixed the original and auto alignment modes, which did not work correctly because of an
error in the code.
2. Fixed both modes not working correctly after the select content stage or after reopening the
project file; they always required a second batch processing of every page at the margins
stage to work correctly.
3. Reworked the calculation method for the original alignment. It is now more precise.
4. Original alignment mode now takes into account the page box from the 4th stage.
5. Fixed the behaviour of horizontal alignment when the original mode is enabled and auto
margins are enabled or disabled. Also, when auto margins / original alignment is applied
to a set of pages, it is now set correctly for each page.
6. Added the ability to control vertical and horizontal automatic alignment separately when
the auto or original alignment mode is enabled.
• Changed the way the despeckle strength is adjusted.
It is now set with a slider, which allows the despeckle strength to be adjusted more smoothly
and precisely. Value 1.0 matches the old cautious mode, 2.0 the normal mode and 3.0 the
aggressive mode.
• Improvements on the thumbnails view and navigation:
1. Tiff compression
The TIFF compression options allow disabling or changing the compression method used in TIFF files.
There are two options in the settings dialog: B&W and color compression.
1. The B&W one has None, LZW, Deflate and CCITT G4 (Default) options.
2. The color one has None, LZW (Default), Deflate and JPEG options.
1. Adaptive binarization
Sauvola and Wolf binarization algorithms have been added. They can be applied when normalizing
illumination does not help.
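For orientation, the per-pixel thresholds usually associated with these two algorithms can be
sketched as below. These are the textbook formulas, not ScanTailor's implementation: the function
names, the window handling and the defaults (`k`, `r`) are assumptions chosen for the example and
are not the values used by the application.

```python
from statistics import fmean, pstdev


def local_stats(window):
    """Mean and standard deviation of the gray values in a local window."""
    return fmean(window), pstdev(window)


def sauvola_threshold(mean, std, k=0.2, r=128.0):
    """Sauvola: T = m * (1 + k * (s / R - 1)), with R the dynamic range of s."""
    return mean * (1.0 + k * (std / r - 1.0))


def wolf_threshold(mean, std, min_gray, max_std, k=0.5):
    """Wolf: T = (1 - k) * m + k * M + k * (s / R) * (m - M).

    M is the minimum gray value of the image and R the largest local
    standard deviation found in it.
    """
    ratio = std / max_std if max_std else 0.0
    return (1.0 - k) * mean + k * min_gray + k * ratio * (mean - min_gray)


# A 3x3 window of gray values (0 = black, 255 = white) with one dark pixel.
window = [200, 210, 190, 40, 205, 198, 202, 207, 195]
mean, std = local_stats(window)
threshold = sauvola_threshold(mean, std)

# A pixel is treated as ink when it is darker than the local threshold.
dark_pixel_is_ink = 40 < threshold
light_pixel_is_ink = 200 < threshold
```

Because the threshold is computed from local statistics, it follows gradual changes in the
background brightness across the page.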
1. Splitting output
The feature splits mixed output scans into pairs of a foreground (letters) layer and a
background (images) layer.
You can choose between a B&W or a color (original) foreground.
It can be useful:
• for further DjVu encoding,
• to apply different filters to letters and images, which give worse results when applied to
the whole image,
• to apply binarization to the letters in a third-party app without affecting the images.
Note: This does not rename the files to 0001, 0002... That can be done with a third-party app,
for example Bulk Rename Utility.
1. Original background
This feature is a part of the splitting output feature.
It preserves the original image background in a format ready for further processing
when a BW foreground is used. It can be used to encode pages with a complex background
into DjVu using the semi-automatic "split layers" method, which gives much higher quality results
than the DjVu auto segmenter. This feature can also be used to extract high-contrast elements of
gradient images into the foreground layer by processing the layer with pictures ("background")
a second time.
Properties of the original background:
• Original background images are saved into the "original_background" folder in the "out" directory.
• Pure black (#000000) and pure white (#ffffff) are reserved: where they occur in the original
background image, they are replaced with #010101 and #fefefe, respectively.
• Picture zones are marked with black, while the BW content is marked with white. This
makes it possible to use the "select by color" feature of an image editor to select the areas
needed for further processing, for example to apply blur to the white holes and their nearest
areas to get an effective compression level of the background layer in DjVu.
• The filling zones feature also removes trash and speckles from the original background when
applied to the foreground layer.
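The point of reserving pure black and white can be shown with a small sketch. It is only an
illustration of the idea, assuming pixels are `(r, g, b)` tuples; the names `reserve_extremes` and
`select_by_color` are hypothetical and do not come from ScanTailor.

```python
BLACK = (0, 0, 0)
WHITE = (255, 255, 255)


def reserve_extremes(pixel):
    """Move pure black/white image pixels off the reserved marker colors."""
    if pixel == BLACK:
        return (1, 1, 1)        # #010101
    if pixel == WHITE:
        return (254, 254, 254)  # #fefefe
    return pixel


def select_by_color(pixels, color):
    """Indices of the pixels that have exactly the given color."""
    return [i for i, p in enumerate(pixels) if p == color]


# Three pixels of a scanned background, two of which are pure black/white.
scan = [(0, 0, 0), (255, 255, 255), (120, 80, 40)]
background = [reserve_extremes(p) for p in scan]

# A marker pixel written afterwards in pure white.
background.append(WHITE)
white_marks = select_by_color(background, WHITE)
```

After the remapping, an exact match on pure black or pure white can only hit marker pixels, which
is what makes a "select by color" step in an image editor reliable.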
1. Default parameters
Default parameters system supporting custom profiles has been implemented.
The system lets you manage the default filter settings for every stage. Those filter parameters
will be set as defaults for any new project created.
For example, you can set your own default margins instead of the built-in 5, 10, 5, 10 mm, and
likewise for the other parameters.
Peculiarities:
1. There are two default profiles: "Default" and "Source". The "Default" profile represents default
ST filter settings, the "Source" one represents the settings giving the source as output without
any changes.
2. Users can create their own profiles. User profiles are stored in the config/profiles folder or
in the system-specific folder for application data.
3. The system takes into account the units setting from the measurement units system. Units are
stored in the profile and ST automatically converts the values if needed.
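As a rough illustration of the kind of conversion involved, the sketch below converts lengths
between units at a given resolution. It is not taken from ScanTailor: the unit names (`"mm"`,
`"cm"`, `"in"`, `"px"`), the function names and the DPI values are assumptions made for the example.

```python
MM_PER_UNIT = {"mm": 1.0, "cm": 10.0, "in": 25.4}


def to_mm(value, unit, dpi):
    """Convert a length in the given unit to millimetres."""
    if unit == "px":
        return value / dpi * 25.4
    return value * MM_PER_UNIT[unit]


def from_mm(value_mm, unit, dpi):
    """Convert a length in millimetres to the given unit."""
    if unit == "px":
        return value_mm / 25.4 * dpi
    return value_mm / MM_PER_UNIT[unit]


def convert(value, src, dst, dpi=300):
    """Convert a length from unit src to unit dst, going through millimetres."""
    return from_mm(to_mm(value, src, dpi), dst, dpi)


# The 5, 10, 5, 10 mm margins expressed in pixels for a 600 DPI page.
default_margins_mm = (5.0, 10.0, 5.0, 10.0)
margins_px = tuple(round(convert(m, "mm", "px", dpi=600)) for m in default_margins_mm)
```

Only the pixel unit depends on the resolution, so values stored in physical units keep their
meaning when pages have different DPI.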
It is now much faster to correct the content area if, for example, the page number has been missed
by the auto algorithm. It is no longer necessary to manually and laboriously move the corners and
edges of the content box.
Building
Go to this repository and follow the instructions given there.