This bug had been in the previous Bugzilla; I'm resubmitting it here. This is Debian bug 425778. For the original bug report, which is complete and well written, please see http://bugs.debian.org/425778.

In that report, the user has a specific JPEG-compressed TIFF which, when converted to a PDF, results in a PDF with a bad image. I was trying to create a smaller TIFF file, with interesting results.

The first one was just an 8x8 gray square. Running this through tiff2pdf resulted in a core dump:

    tiff2pdf -o gray-8x8.pdf gray-8x8.tiff

The second one was a 200x200 gray square. This one displays properly in Acrobat Reader with warnings, but the PDF file is invalid.

I'm attaching my two TIFF files and the broken PDF. For the original file, please see the Debian bug report or grab the TIFF file from http://eppesuigoccas.homedns.org/~giuseppe/libtiff-tools.tiff2pdf.bug.tar.bz2
Created an attachment (id=232): 8x8 grayscale JPEG-compressed TIFF
Created an attachment (id=233): 200x200 grayscale JPEG-compressed TIFF
Created an attachment (id=234): PDF generated from gray-200x200.tiff
Testing this with 4.0.0 beta 2, I still see the crash on the 8x8 image, but the 200x200 image generates a valid PDF file. The color is not correct, but I will be reporting that problem in a separate bug report.
Never mind about the separate bug report. The original user's tiff file still doesn't work with 4.0.0 beta 2. I will attach the user's file here.
Created an attachment (id=235): Debian user's original TIFF file from the Debian bug report
I'm replacing gray-200x200.pdf with one generated by 4.0.0 beta 2. The file no longer has a corrupted xref table, but it shows as all white instead of gray.
Created an attachment (id=236): PDF generated from gray-200x200.tiff, updated to 4.0.0 beta 2
See bug 2135. That may help.
This problem, or something much like it, also exists in 3.9.4, according to https://bugzilla.redhat.com/show_bug.cgi?id=628261 . I thought at first that tiff2pdf might be choking on YCbCr input, but it fails in the same way with grayscale JPEG input.
I looked into the tiff2pdf.c source code, and find that it's not surprising that it fails to convert most JPEG-compressed TIFFs; rather, the astonishing thing is that there are any cases at all where it appears to work even a little bit :-(

The basic problem is that it's got an unworkable scheme for sewing together multiple JPEG-compressed strips into a single output JPEG stream. t2p_process_jpeg_strip tries to do that by editorializing on the strip marker contents, but it does so quite incompetently. It will end up emitting SOI and EOI markers for each strip, not just one pair, which I think is the core thing that's making most readers fall over. Beyond that, emitting DHT and DQT markers in the midst of compressed data isn't legal per spec either, it fails entirely on markers longer than 255 bytes, it won't work at all if the incoming data uses restart markers, and there are probably some other bugs in that comment-free excuse for code as well.

Now these points (other than the restart issue) could probably be fixed up with a little bit of hacking, but it is still fundamentally Not Gonna Work unless all the strips use identical DQT/DHT definitions, an assumption explicitly outlawed in TIFF Tech Note #2. I'm not real sure if it's worth the marginal hacking to make it work when that assumption does hold, which it probably does for the vast majority of real-world JPEG TIFFs.

A proper fix would involve restructuring so that each strip is emitted as a separate PDF image object with only minimal modification of the JPEG datastreams, the way tiles are handled. I find this code sufficiently unreadable that I haven't tried hard to see what that would take. BTW, the tile case is hardly problem-free either; see bug #1960.
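To make the SOI/EOI point concrete, here is a minimal diagnostic sketch in Python. It is not code from tiff2pdf; the names `list_markers` and `soi_eoi_counts` are made up for illustration, and it assumes a reasonably well-formed stream. It walks the marker segments of a JPEG byte string, reading each segment length as a 16-bit big-endian value (so segments longer than 255 bytes are handled), and counts how many SOI and EOI markers appear. A valid single-image stream should have exactly one of each; naively concatenated strips will show one pair per strip.

```python
import struct

SOI, EOI = 0xD8, 0xD9  # start-of-image / end-of-image marker codes


def list_markers(data: bytes):
    """Return a list of (marker_code, payload_length) found in a JPEG byte string.

    Entropy-coded data is skipped by ignoring non-0xFF bytes and the
    stuffed 0xFF 0x00 sequence. RSTn markers (0xD0..0xD7) are reported
    with a payload length of 0, like SOI and EOI.
    """
    markers = []
    i, n = 0, len(data)
    while i + 1 < n:
        if data[i] != 0xFF:
            i += 1
            continue
        code = data[i + 1]
        if code == 0xFF:            # fill byte; the marker starts later
            i += 1
            continue
        if code == 0x00:            # byte stuffing inside entropy-coded data
            i += 2
            continue
        if code in (SOI, EOI) or 0xD0 <= code <= 0xD7 or code == 0x01:
            markers.append((code, 0))   # standalone marker, no length field
            i += 2
            continue
        if i + 4 > n:               # truncated segment header; stop scanning
            break
        # Length is two bytes, big-endian, and counts itself but not the marker.
        (seglen,) = struct.unpack(">H", data[i + 2:i + 4])
        markers.append((code, seglen - 2))
        i += 2 + seglen
    return markers


def soi_eoi_counts(data: bytes):
    """Count SOI and EOI markers in the stream."""
    codes = [code for code, _ in list_markers(data)]
    return codes.count(SOI), codes.count(EOI)


# Synthetic one-strip stream: SOI, a 300-byte DQT segment, SOS, a little
# entropy-coded data containing a stuffed 0xFF, then EOI.
strip = (
    b"\xff\xd8"
    + b"\xff\xdb" + struct.pack(">H", 300) + bytes(298)
    + b"\xff\xda" + struct.pack(">H", 4) + b"\x00\x00"
    + b"\x12\xff\x00\x34"
    + b"\xff\xd9"
)
print(soi_eoi_counts(strip))          # one strip on its own
print(soi_eoi_counts(strip + strip))  # two strips glued together as-is
```

Running a check like this over the image stream extracted from a generated PDF would show whether the per-strip SOI/EOI pairs survived into the output.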
I was using single-strip TIFFs, which is probably why it's working for me with 3.9.4.
Until this multi-strip/tile problem is fixed, using the -n option can be a workaround (although recompressing with JPEG is a problem; see bug 2150).
(In reply to comment #11)
> The basic problem is that it's got an unworkable scheme for sewing together
> multiple JPEG-compressed strips into a single output JPEG stream.

Hmm, isn't the right thing to do to embed each tile/strip as a separate image object in the PDF and position them next to each other on the page? This is probably the only way to go without recompression, but it will have interpolation artifacts on the tile boundaries. Opinions?
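As a rough illustration of the per-strip layout idea, the Python sketch below computes where each strip would sit on the page and emits the corresponding PDF content-stream operators. It is only a sketch under stated assumptions, not how tiff2pdf does or would do it: the function names, the `Im0`, `Im1`, ... resource names, and the `points_per_pixel` scale are hypothetical, and each strip is assumed to already exist as its own image XObject. The only PDF facts relied on are that an image is painted into the unit square, that `w 0 0 h x y cm` scales and positions it, and that the page origin is at the bottom left, so strip 0 (the top of the TIFF) gets the largest y offset.

```python
def strip_placements(width, height, rows_per_strip, points_per_pixel=1.0):
    """Return (index, x, y, w, h) in page units for each strip of an image.

    width, height and rows_per_strip are in pixels. The last strip may be
    shorter than rows_per_strip.
    """
    placements = []
    n_strips = (height + rows_per_strip - 1) // rows_per_strip  # ceiling division
    for k in range(n_strips):
        top_row = k * rows_per_strip
        rows = min(rows_per_strip, height - top_row)
        # PDF y grows upward, so measure from the bottom of the image.
        y = (height - top_row - rows) * points_per_pixel
        placements.append(
            (k, 0.0, y, width * points_per_pixel, rows * points_per_pixel)
        )
    return placements


def content_stream(placements, name_prefix="Im"):
    """Build content-stream text that paints each strip's image XObject."""
    lines = []
    for k, x, y, w, h in placements:
        lines.append(f"q {w:g} 0 0 {h:g} {x:g} {y:g} cm /{name_prefix}{k} Do Q")
    return "\n".join(lines)


# Example: a 200x200 image stored as strips of 64 rows.
layout = strip_placements(200, 200, 64)
print(content_stream(layout))
```

Because adjacent strips share an exact edge in page coordinates, any seams a viewer shows at the boundaries would come from its image interpolation, which is the artifact concern raised above.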
Bugzilla is no longer used for tracking libtiff issues. Remaining open tickets, such as this one, have been migrated to the libtiff GitLab instance at https://gitlab.com/libtiff/libtiff/issues . The migrated tickets have their summary prefixed with [BZ#XXXX] where XXXX is the initial Bugzilla issue number.