Bug 2330 - tiff2pdf changes tiff image order
Summary: tiff2pdf changes tiff image order
Status: RESOLVED LATER
Product: libtiff
Component: default
Version: 3.9.3
Platform: PC Linux
Importance: P2 normal
Target Milestone: ---
Assigned To:
Keywords: migrated_to_gitlab

Reported: 2011-06-17 14:25 by
Modified: 2019-10-01 14:19 (History)


Attachments
Tif created with tiffcp (correct order) (713.60 KB, image/tiff)
2011-06-17 14:32, GD
PDF created with tiff2pdf from the previously attached tif (resulting wrong order) (715.95 KB, application/pdf)
2011-06-17 14:34, GD
tiffinfo report on the .tif file. (2.52 KB, text/plain)
2011-07-16 17:41, Frank Warmerdam


Description From 2011-06-17 14:25:12
Under some circumstances, which are repeatable, the tiff2pdf conversion will
re-order the pages causing the output pdf to have a different page order than
the input multi-image tiff file.

The attached tiff was generated with tiffcp. The attached pdf was generated
with:
tiff2pdf -o bla.pdf bla.tif

You will notice that pages 1 through 5 are in matching order; from page 6
onward the order differs.

The size of the file does not seem to affect the result; it is only triggered
by certain images. The original multi-page tiff files contain about 600 pages,
so it took a while to narrow the bug down to a case small enough to upload, so
that someone more knowledgeable could try to reproduce the problem.

I am using 3.9.4 on ubuntu amd64 (3.9.4-5ubuntu6)

Please let me know if any further information is needed!
------- Comment #1 From 2011-06-17 14:32:57 -------
Created an attachment (id=460) [details]
Tif created with tiffcp (correct order)
------- Comment #2 From 2011-06-17 14:34:03 -------
Created an attachment (id=461) [details]
PDF created with tiff2pdf from the previously attached tif (resulting wrong
order)
------- Comment #3 From 2011-06-17 20:00:04 -------
This is the exact command sequence used to generate the above files:
$ ls -1 *.tif
00404.tif
00405.tif
00406.tif
00407.tif
00408.tif
$ tiffcp *.tif bla.tif
TIFFReadDirectory: Warning, 00404.tif: unknown field with tag 292 (0x124) encountered.
TIFFReadDirectory: Warning, 00404.tif: unknown field with tag 41728 (0xa300) encountered.
TIFFReadDirectory: Warning, 00404.tif: unknown field with tag 42016 (0xa420) encountered.
TIFFReadDirectory: Warning, 00407.tif: unknown field with tag 292 (0x124) encountered.
TIFFReadDirectory: Warning, 00407.tif: unknown field with tag 41728 (0xa300) encountered.
TIFFReadDirectory: Warning, 00407.tif: unknown field with tag 42016 (0xa420) encountered.
TIFFReadDirectory: Warning, 00408.tif: unknown field with tag 292 (0x124) encountered.
TIFFReadDirectory: Warning, 00408.tif: unknown field with tag 41728 (0xa300) encountered.
TIFFReadDirectory: Warning, 00408.tif: unknown field with tag 42016 (0xa420) encountered.
$ tiff2pdf -o bla.pdf bla.tif

If you would like the individual tif files separately, please let me know.
------- Comment #4 From 2011-07-16 17:36:00 -------
GD,

Are you aware that some of the TIFF directories in the composite file have page
numbers set on them, while others do not?  tiff2pdf seems to have some logic
for using the page numbers and I suspect that is why you are getting odd
behavior. 
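
To check which directories carry a page number without reading through the
full tiffinfo report, a small script can walk the directory chain directly.
The following is a minimal sketch, assuming a classic (non-BigTIFF) file; the
function name page_numbers is made up for illustration and is not part of
libtiff. It looks only for tag 0x0129 (PageNumber), which holds two SHORT
values: the page and the total page count.

```python
import struct

PAGE_NUMBER_TAG = 0x0129  # TIFF PageNumber (297): two SHORTs, page and total


def page_numbers(data: bytes):
    """Return one item per TIFF directory: (page, total), or None if untagged.

    Hypothetical helper for illustration; handles classic TIFF only.
    """
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF file")
    magic, offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")

    result = []
    seen = set()  # guards against a looping directory chain
    while offset and offset not in seen:
        seen.add(offset)
        (count,) = struct.unpack_from(order + "H", data, offset)
        found = None
        for i in range(count):
            entry = offset + 2 + 12 * i  # each directory entry is 12 bytes
            tag, typ, n = struct.unpack_from(order + "HHI", data, entry)
            # Two SHORTs (type 3) fit inline in the 4-byte value field.
            if tag == PAGE_NUMBER_TAG and typ == 3 and n == 2:
                found = struct.unpack_from(order + "HH", data, entry + 8)
        result.append(found)
        # The 4 bytes after the last entry point to the next directory.
        (offset,) = struct.unpack_from(order + "I", data, offset + 2 + 12 * count)
    return result
```

Passing the bytes of bla.tif to page_numbers would give a list in directory
order, with None for every directory that has no page number set. A mix of
None and numbered entries is the situation described above.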

I don't see any obvious options on tiff2pdf to ignore built-in page numbers,
and I don't, offhand, know a way of clearing just page numbers from tiff
files.

But it should be easy to prepare a customized version of tiff2pdf that does not
use the page numbers.
------- Comment #5 From 2011-07-16 17:41:17 -------
Created an attachment (id=465) [details]
tiffinfo report on the .tif file.
------- Comment #6 From 2011-07-16 19:04:52 -------
Mr. Warmerdam,

First of all, thank you for your reply. I must confess that not only was I
unaware that some of the TIFFs had internal page numbers set, I was unaware
that they even could. Even if I had looked at the tiffinfo output, I might not
have noticed the correlation. I will do some research into clearing some or all
of the metadata to see whether preprocessing the TIFFs will give the workflow a
consistently ordered output.

Thanks for everything! I will do some testing and report back on whether
clearing the metadata, in particular the page numbering, resolves the problem.
Based on what you said, I expect that it will.
------- Comment #7 From 2011-07-16 19:43:31 -------
Well, testing that was fast. Simply running:
$ exiv2 rm *.tif

at the beginning of the process resolves the problem. So it was indeed what you
suspected. Thanks to your tiffinfo tip, I found some other things I don't
understand, such as why files with a single visible page report multiple
pages. I will see if Wikipedia and other resources can help me on that one, or
I will take it to the mailing list.

To return to the topic of this bug report, I have a question about expected
behavior with regard to page ordering. As someone not terribly familiar with
the intricacies of TIFFs, I would expect that if I am explicit in the order of
files sent to tiffcp, the result will preserve my chosen explicit order. And
tiffcp in fact does. Likewise, if I have a multipage tiff which is processed
through tiff2pdf, I would expect it to retain the order of the original file.
Perhaps these expectations are unfair for TIFFs, even though they may be
entirely valid for other types of files.
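
The gap between file order and output order can be shown with a small
sketch. It is only an illustration under stated assumptions, not tiff2pdf's
actual code: it assumes a converter that sorts directories by their PageNumber
value, treats a missing tag as page 0, and keeps file order for ties. The
function name and the tag values are hypothetical.

```python
def order_by_page_number(page_numbers):
    """Return directory indices in the order a PageNumber-sorting tool would emit.

    Assumption (hypothetical, not taken from tiff2pdf's source): a directory
    without a PageNumber tag counts as page 0, and ties keep file order.
    """
    keyed = [(number if number is not None else 0, index)
             for index, number in enumerate(page_numbers)]
    return [index for _, index in sorted(keyed)]


# Hypothetical input: directories 0-4 carry no tag, 5-7 carry leftover numbers
# from the files they were copied from.
tags = [None, None, None, None, None, 2, 0, 1]
print(order_by_page_number(tags))  # prints [0, 1, 2, 3, 4, 6, 7, 5]
```

Under these assumptions the first five directories keep their positions and
the later ones are reshuffled, while a tool that ignores the tag would simply
emit 0 through 7.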

The only analogy that comes to mind, and it may be a shoddy apples-to-oranges
comparison, is audio files, the only other file type I can think of that has
numbering in its metadata. If I concatenate 4 audio files whose metadata place
them on different albums, and thus at different track positions, I would expect
the concatenation to follow my order and not that of their tags, since tags
from potentially different albums are irrelevant to the combined file.

I apologize for this turning into such a verbose reply. What I am trying to
express is very clear in my mind, but it does not seem as clear when I read
back my words. Please let me know if I have muddied the issue with too many
words and I will try to summarize what I am thinking.

In any case, I have a temporary workaround, which is what I was really after to
get this project underway. The rest more likely has to do with philosophical
expectations and behavior and is thus not of high priority. I will change this
bug report to reflect that.

Summary workaround:
$ exiv2 rm *.tif
$ tiffcp *.tif bla.tif
$ tiff2pdf -o bla.pdf bla.tif

Thanks!
------- Comment #8 From 2011-07-17 01:15:59 -------
I wanted to confirm that it was in fact the Page Number metadata piece that was
throwing the wrench. After you brought it up, everything around it pointed to
that being the problem. After some searching I rediscovered exiftool. This
allowed me to clear just the Page Number field instead of all of the metadata,
bringing the following solution:

$ exiftool -PageNumber= *.tif
$ tiffcp *.tif bla.tif
$ tiff2pdf -o bla.pdf bla.tif

And the resulting order is as desired. This confirms that what you said was the
problem was indeed the problem (not that I had doubts, I just like to be 100%
sure whenever possible).

This is slightly cleaner than the exiv2 option as it retains other header
information. Exiv2 unfortunately doesn't support the 0x0129 PageNumber field
and the only alternative I have found to edit just the desired field easily was
exiftool which works brilliantly.

I do look forward to your thoughts on "expected behaviour" but I am in no hurry
with my present workarounds.

Thank you again!
------- Comment #9 From 2019-10-01 14:19:36 -------
Bugzilla is no longer used for tracking libtiff issues. Remaining open tickets,
such as this one, have been migrated to the libtiff GitLab instance at
https://gitlab.com/libtiff/libtiff/issues .

The migrated tickets have their summary prefixed with [BZ#XXXX] where XXXX is
the initial Bugzilla issue number.