The code is invisible to some PDF analysis tools

Mar 3, 2015 09:36 GMT

Malicious code can be hidden inside greyscale (black and white) JPEG pictures embedded in PDF files and decoded on a target computer without losing information, a researcher has found.

Although antivirus products include detection for malformed PDFs, they do not verify data compressed with lossy compressors such as DCTDecode, a JPEG-compatible filter.

Security experts underestimated lossy compressors

Such streams are skipped for performance reasons, and because it was assumed that information is lost when the data is compressed and decompressed, so a malicious payload would not survive the process; briefly put, the resulting file would only be an approximation of the original one.

When DCTDecode is used, a JPEG image in RGB color space is downsampled, basically dropping some information (high-frequency components) that would not affect the overall result in a PDF file.
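The kind of loss described above can be illustrated with a minimal sketch. It assumes 2:1 horizontal subsampling by simple averaging and upsampling by duplication; real JPEG codecs may filter differently, and the function names are hypothetical.

```python
def subsample_pairs(samples):
    """Average each adjacent pair of samples (2:1 horizontal subsampling)."""
    return [(samples[i] + samples[i + 1]) // 2
            for i in range(0, len(samples) - 1, 2)]


def upsample_pairs(samples):
    """Restore the original length by repeating every sample twice."""
    return [s for s in samples for _ in range(2)]


# Illustrative colour-difference samples (made-up values).
chroma = [10, 20, 30, 50]
restored = upsample_pairs(subsample_pairs(chroma))

print(restored)            # [15, 15, 40, 40]
print(restored == chroma)  # False: the original values cannot be recovered
```

A greyscale image has a single intensity channel and no colour planes, so this particular step does not apply to it.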

However, the same does not hold for greyscale pictures, as Dénes Óvári from Danish security company CSIS demonstrated in a proof-of-concept he created.

In greyscale images, each pixel carries only intensity information, with black being the weakest and white being the strongest.

The researcher encoded a short script as a greyscale JPEG image and used the highest quality settings. The malicious code was padded with 0x00 bytes up to the start of the following pixel block (the minimum coded unit, or MCU), and the process was repeated multiple times.
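The padding step can be sketched roughly as follows. This is only an illustration under stated assumptions: a block is taken to be 64 bytes (one 8x8 greyscale block at one byte per pixel), and the chunk length, the function name and the harmless placeholder script are hypothetical. The layout used in the actual proof-of-concept may differ.

```python
BLOCK_SIZE = 64  # assumption: one 8x8 greyscale block, one byte per pixel


def pad_chunks(payload: bytes, chunk_len: int,
               block_size: int = BLOCK_SIZE) -> bytes:
    """Split payload into chunks and pad each with 0x00 up to a block boundary."""
    if not 0 < chunk_len <= block_size:
        raise ValueError("chunk_len must be between 1 and block_size")
    out = bytearray()
    for i in range(0, len(payload), chunk_len):
        chunk = payload[i:i + chunk_len]
        # Fill the rest of the block with zero bytes, so the next chunk
        # starts at the beginning of the following block.
        out += chunk + b"\x00" * (block_size - len(chunk))
    return bytes(out)


script = b"app.alert('PoC');"  # harmless placeholder, 17 bytes
padded = pad_chunks(script, chunk_len=8)
print(len(padded))  # 192: three blocks of 64 bytes each
```

Each resulting byte would then be treated as the intensity of one pixel when the data is stored as a greyscale image.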

In the paper called Script in a Lossy Stream, Óvári says that the code was integrated in an image filtered with DCTDecode, referenced by a JavaScript action entry.

Some tweaking is required for Adobe Reader XI

During the experiment, he observed that the code was decompressed in Adobe Reader 9 without losing any of the information. On version XI of the PDF reader, however, some of the bits suffered modifications in the decompressed data and the original file became corrupted.

However, he managed to make the file usable again by changing some characters in each pixel block of the stream.

Hiding the malicious code this way would make it invisible to some standard tools employed in the analysis of malformed PDF files.

The researcher says that although his PoC demonstrates a new way to store data in PDF files, an exploit would still be needed in order to carry out nefarious activity. His findings are only meant to show that antivirus products can be fooled as a result of ruling out DCTDecode as a potential avenue of attack.
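As a rough idea of what not ruling out DCTDecode could look like, the sketch below lists objects in a raw PDF that declare this filter and notes whether they also declare a greyscale colour space, so that they can be passed on for decoding and inspection. It is a simplification, not how any particular product works: it assumes uncompressed objects and plainly written names, so it would miss objects stored inside object streams or names using `#xx` escapes. The function and pattern names are hypothetical.

```python
import re

# Simplified patterns for plainly written PDF syntax (assumption).
OBJ = re.compile(rb"(\d+)\s+\d+\s+obj\b(.*?)\bendobj", re.DOTALL)
DCT = re.compile(rb"/Filter\s*(?:\[[^\]]*)?/DCTDecode")
GRAY = re.compile(rb"/ColorSpace\s*/DeviceGray")


def find_dct_objects(pdf: bytes):
    """Return (object number, is_greyscale) for each object using DCTDecode."""
    hits = []
    for match in OBJ.finditer(pdf):
        body = match.group(2)
        if DCT.search(body):
            hits.append((int(match.group(1)), bool(GRAY.search(body))))
    return hits


# Tiny synthetic sample, not a valid PDF.
sample = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Subtype /Image /ColorSpace /DeviceGray "
    b"/Filter /DCTDecode /Length 4 >>\nstream\nABCD\nendstream\nendobj\n"
    b"2 0 obj\n<< /Filter /FlateDecode /Length 4 >>\n"
    b"stream\nABCD\nendstream\nendobj\n"
    b"3 0 obj\n<< /ColorSpace /DeviceRGB "
    b"/Filter [/ASCIIHexDecode /DCTDecode] >>\nstream\nendstream\nendobj\n"
)
print(find_dct_objects(sample))  # [(1, True), (3, False)]
```

Greyscale hits would be the first candidates for a closer look, since that is the case the researcher showed can carry data intact.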

"We should also perhaps re-examine the handling of other file formats in which data in JPEG format is assumed always to be lossily compressed, while a greyscale mode is still available," the researcher concludes.


Script works when opened in Adobe Reader 9
There is no trace of the script in PdfStreamDumper