Compress PDF document using C#, VB.NET

Blog category: PDF.NET

July 12, 2024

VintaSoft PDF .NET Plug-in can compress and optimize PDF documents, and it can also remove metadata from them. A smaller PDF file reduces network traffic during transfer and takes up less storage space. This is especially useful for archiving, emailing, and serving PDF documents in web applications.

To optimize a PDF document, VintaSoft PDF .NET Plug-in can perform the following actions:

Pack PDF document

A PDF document can contain unused resources. VintaSoft PDF .NET Plug-in can detect and remove unused resources from a PDF document.
A PDF document can also contain its revision history (incremental updates). VintaSoft PDF .NET Plug-in can remove the revision history from a PDF document.
Resources of a PDF document may also be compressed with a suboptimal compression algorithm. VintaSoft PDF .NET Plug-in can recompress such resources using more efficient compression algorithms.
Finally, a PDF file that uses PDF format 1.4 or earlier stores its cross-reference table uncompressed. VintaSoft PDF .NET Plug-in can save the PDF document in PDF format 1.5 or higher and use a compressed cross-reference table.

Here is C# code that demonstrates how to load an existing PDF document, remove unused resources, recompress the remaining resources with Flate compression at the maximum level, and save the PDF document in a format that supports a compressed cross-reference table:
/// <summary>
/// Packs the PDF document.
/// </summary>
/// <param name="inPdfFilename">The input PDF filename.</param>
/// <param name="outPdfFilename">The output PDF filename.</param>
public static void PackDocument(string inPdfFilename, string outPdfFilename)
{
    // create compressor with empty compression settings
    Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand compressor =
        Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand.CreateEmptyCompressor();

    // specify that compressor must use maximum Flate compression level (best compression)
    compressor.FlateCompressionLevel = 9;
    // specify that compressor must recompress all resources that use None, LZW or Flate compression using Flate compression
    compressor.RecompressFlateCompression = true;
    compressor.UseFlateInsteadLzwCompression = true;
    compressor.UseFlateInsteadNoneCompression = true;

    // specify that compressor must remove incremental update info and unused objects
    compressor.PackDocument = true;

    // if version of PDF document is lower than 1.7
    if (GetPdfDocumentVersion(inPdfFilename) < 17)
    {
        // set output format to PDF 1.7
        compressor.DocumentPackFormat = Vintasoft.Imaging.Pdf.PdfFormat.Pdf_17;
    }

    // compress PDF document
    compressor.Compress(inPdfFilename, outPdfFilename);
}

/// <summary>
/// Returns the PDF document version.
/// </summary>
/// <param name="pdfFilename">The PDF filename.</param>
/// <returns>The version number in dual-digit format (10,11,12,13,14,15,16,17,20,...).</returns>
private static int GetPdfDocumentVersion(string pdfFilename)
{
    using (Vintasoft.Imaging.Pdf.PdfDocument document = new Vintasoft.Imaging.Pdf.PdfDocument(pdfFilename))
        return document.Format.VersionNumber;
}
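
The version check above loads the document through the plug-in. For a quick look at a file without loading it, the version can also be read from the "%PDF-x.y" header at the start of the file. The following is only a minimal sketch that uses plain .NET and none of the plug-in's API. The class name PdfHeaderVersion and the 1024-byte search window are assumptions made for this illustration. Note also that since PDF 1.4 the document catalog may declare a later version than the header, so the header can understate the effective version.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of VintaSoft PDF .NET Plug-in.
public static class PdfHeaderVersion
{
    /// <summary>
    /// Reads the "%PDF-x.y" header and returns the version in two-digit form
    /// (14 for 1.4, 17 for 1.7, 20 for 2.0), or -1 if no header is found
    /// in the first 1024 bytes of the stream.
    /// </summary>
    public static int Read(Stream stream)
    {
        // read up to 1024 bytes from the start of the stream
        byte[] buffer = new byte[1024];
        int count = 0;
        int read;
        while (count < buffer.Length &&
               (read = stream.Read(buffer, count, buffer.Length - count)) > 0)
        {
            count += read;
        }

        // bytes above 127 are decoded as '?', which cannot match the header
        string head = Encoding.ASCII.GetString(buffer, 0, count);
        int index = head.IndexOf("%PDF-", StringComparison.Ordinal);

        // the header needs 8 characters: "%PDF-", a digit, '.', a digit
        if (index < 0 || index + 8 > head.Length)
            return -1;

        char major = head[index + 5];
        char dot = head[index + 6];
        char minor = head[index + 7];
        if (!IsAsciiDigit(major) || dot != '.' || !IsAsciiDigit(minor))
            return -1;

        return (major - '0') * 10 + (minor - '0');
    }

    /// <summary>
    /// Opens the file for reading and returns the header version.
    /// </summary>
    public static int ReadFromFile(string pdfFilename)
    {
        using (FileStream stream = File.OpenRead(pdfFilename))
            return Read(stream);
    }

    private static bool IsAsciiDigit(char c)
    {
        return c >= '0' && c <= '9';
    }
}
```

The return value uses the same two-digit convention as GetPdfDocumentVersion, so a result below 15 indicates a header from before compressed cross-reference tables were introduced.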


Optimize fonts in PDF document

A font embedded in a PDF document often contains glyphs that are never used to render text in that document. VintaSoft PDF .NET Plug-in can remove unused glyphs from the fonts in a PDF document (font subsetting).

Here is C# code that demonstrates how to optimize fonts in a PDF document using the Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand class:
/// <summary>
/// Subsets fonts in PDF document.
/// </summary>
/// <param name="inPdfFilename">The input PDF filename.</param>
/// <param name="outPdfFilename">The output PDF filename.</param>
public static void SubsetFonts(string inPdfFilename, string outPdfFilename)
{
    // create compressor with empty compression settings
    Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand compressor =
       Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand.CreateEmptyCompressor();

    // specify that compressor must subset fonts in PDF document
    compressor.SubsetFonts = true;

    // compress PDF document
    compressor.Compress(inPdfFilename, outPdfFilename);
}


Compress images in PDF document

Many PDF documents contain images. VintaSoft PDF .NET Plug-in can decrease the resolution and color bit depth of images in a PDF document to reduce the size of the PDF file.

Here is C# code that demonstrates how to decrease the resolution and color bit depth of image resources in a PDF document using the Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand class:
/// <summary>
/// Detects the real color depth of PDF image resources and compresses the PDF document for viewing at 150 DPI.
/// </summary>
/// <param name="inPdfFilename">The input PDF filename.</param>
/// <param name="outPdfFilename">The output PDF filename.</param>
public static void CompressToViewIn150DPI(string inPdfFilename, string outPdfFilename)
{
    // create compressor that will compress PDF document using lossy compression algorithms
    Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand compressor =
       Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand.CreateLossyCompressor(150, false, false, false);

    // specify that compressor must use JPEG compression for color images
    compressor.ColorImagesCompression = Vintasoft.Imaging.Pdf.PdfCompression.Jpeg;
    // specify that compressor must set JPEG quality to 70
    compressor.ColorImagesCompressionSettings.JpegQuality = 70;

    // specify that compressor must detect if image is bitonal image and use optimal compression for bitonal image
    compressor.DetectBitonalImageResources = true;
    // specify that compressor must detect if image is black-white image and use optimal compression for black-white image
    compressor.DetectBlackWhiteImageResources = true;
    // specify that compressor must detect if image is grayscale image and use optimal compression for grayscale image
    compressor.DetectGrayscaleImageResources = true;

    // compress PDF document
    compressor.Compress(inPdfFilename, outPdfFilename);
}
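
To get a feel for why lower resolution and lower bit depth matter, it helps to look at the raw (uncompressed) size of a page image. The following is a rough arithmetic sketch in plain .NET, not a part of the plug-in. The class name ImageSizeEstimate is hypothetical, and the numbers describe pixel data before any compression, so the sizes actually stored in a PDF file will be smaller and will depend on the compression used.

```csharp
using System;

// Hypothetical helper for back-of-the-envelope estimates only.
public static class ImageSizeEstimate
{
    /// <summary>
    /// Returns the uncompressed size, in bytes, of an image that covers
    /// the given area at the given resolution and bit depth.
    /// Each pixel row is padded to a whole number of bytes.
    /// </summary>
    public static long RawBytes(double widthInches, double heightInches, int dpi, int bitsPerPixel)
    {
        long widthPx = (long)Math.Round(widthInches * dpi);
        long heightPx = (long)Math.Round(heightInches * dpi);
        // round the row length up to the next whole byte
        long bytesPerRow = (widthPx * bitsPerPixel + 7) / 8;
        return bytesPerRow * heightPx;
    }
}
```

For a Letter-size page (8.5 x 11 inches), RawBytes returns 25,245,000 bytes at 300 DPI in 24-bit color and 6,311,250 bytes at 150 DPI in 24-bit color, a fourfold reduction from halving the resolution. The same page at 150 DPI takes 2,103,750 bytes as 8-bit grayscale and 264,000 bytes as a 1-bit bitonal image, which is why detecting the real color depth of an image pays off.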


Clear content in PDF document

A PDF document can contain unused objects, for example resources, pages, fonts, images, names, and content operators. A PDF document can also contain duplicate resources, for example copies of the same image or font. VintaSoft PDF .NET Plug-in can detect and remove unused objects and duplicate resources in a PDF document.

Here is C# code that demonstrates how to remove unused objects and duplicate resources from a PDF document using the Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand class:
/// <summary>
/// Removes unused and duplicated resources in the PDF document.
/// </summary>
/// <param name="inPdfFilename">The input PDF filename.</param>
/// <param name="outPdfFilename">The output PDF filename.</param>
public static void RemoveUnusedResources(string inPdfFilename, string outPdfFilename)
{
    // create compressor with empty compression settings
    Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand compressor =
       Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand.CreateEmptyCompressor();

    // specify that compressor must remove duplicate resources from PDF document
    compressor.RemoveDuplicateResources = true;
    // specify that compressor must remove unused named resources from PDF document
    compressor.RemoveUnusedNamedResources = true;
    // specify that compressor must remove unused names from PDF document
    compressor.RemoveUnusedNames = true;
    // specify that compressor must remove unused pages from PDF document
    compressor.RemoveUnusedPages = true;
    // specify that compressor must remove invalid bookmarks from PDF document
    compressor.RemoveInvalidBookmarks = true;
    // specify that compressor must remove invalid links from PDF document
    compressor.RemoveInvalidLinks = true;

    // compress PDF document
    compressor.Compress(inPdfFilename, outPdfFilename);
}


Remove metadata and other elements from PDF document

A PDF document can contain objects that do not affect how a PDF page is displayed, for example metadata, bookmarks, embedded files, an interactive form, page thumbnails, a structure tree, and document information. VintaSoft PDF .NET Plug-in can remove such objects from a PDF document when they are not needed.

Here is C# code that demonstrates how to remove these objects from a PDF document using the Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand class:
/// <summary>
/// Removes metadata, bookmarks, document information, embedded files, embedded thumbnails, interactive form and structure tree of the PDF document.
/// </summary>
/// <param name="inPdfFilename">The input PDF filename.</param>
/// <param name="outPdfFilename">The output PDF filename.</param>
public static void RemoveObjects(string inPdfFilename, string outPdfFilename)
{
    // create compressor with empty compression settings
    Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand compressor =
       Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand.CreateEmptyCompressor();

    // specify that compressor must remove metadata from PDF document
    compressor.RemoveMetadata = true;
    // specify that compressor must remove bookmarks from PDF document
    compressor.RemoveBookmarks = true;
    // specify that compressor must remove document information from PDF document
    compressor.RemoveDocumentInformation = true;
    // specify that compressor must remove embedded files from PDF document
    compressor.RemoveEmbeddedFiles = true;
    // specify that compressor must remove embedded thumbnails from PDF document
    compressor.RemoveEmbeddedThumbnails = true;
    // specify that compressor must remove interactive form from PDF document
    compressor.RemoveInteractiveForm = true;
    // specify that compressor must remove structure tree from PDF document
    compressor.RemoveStructureTree = true;

    // compress PDF document
    compressor.Compress(inPdfFilename, outPdfFilename);
}


Remove annotations from PDF document

VintaSoft PDF .NET Plug-in can remove annotations from a PDF page when they are not needed. It can also convert annotations into graphics (flatten annotations) when the annotations must still be displayed on the PDF page but the user should not be able to interact with them.

Here is C# code that demonstrates how to convert annotations into graphics (flatten annotations) and remove the interactive form from a PDF document using the Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand class:
/// <summary>
/// Flattens annotations and removes the interactive form of the PDF document.
/// </summary>
/// <param name="inPdfFilename">The input PDF filename.</param>
/// <param name="outPdfFilename">The output PDF filename.</param>
public static void FlattenAnnotations(string inPdfFilename, string outPdfFilename)
{
    // create compressor with empty compression settings
    Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand compressor =
       Vintasoft.Imaging.Pdf.Processing.PdfDocumentCompressorCommand.CreateEmptyCompressor();

    // specify that compressor must remove interactive form from PDF document
    compressor.RemoveInteractiveForm = true;
    // specify that compressor must flatten annotations in PDF document
    compressor.FlattenAnnotations = true;

    // compress PDF document
    compressor.Compress(inPdfFilename, outPdfFilename);
}
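

Measure the result

All of the methods above share the same signature (input filename, output filename), so it is easy to check how much each of them actually saves on a particular file. The following is a minimal sketch in plain .NET. The class CompressionReport and its method are hypothetical names introduced for this illustration and are not a part of the plug-in.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of VintaSoft PDF .NET Plug-in.
public static class CompressionReport
{
    /// <summary>
    /// Runs the given compression step and returns the saving as a percentage
    /// of the input file size (a positive value means the output is smaller).
    /// </summary>
    /// <param name="inPdfFilename">The input PDF filename.</param>
    /// <param name="outPdfFilename">The output PDF filename.</param>
    /// <param name="compress">A method that reads the input file and writes the output file.</param>
    public static double MeasureSavingPercent(
        string inPdfFilename, string outPdfFilename, Action<string, string> compress)
    {
        // get the size of the source file
        long inputSize = new FileInfo(inPdfFilename).Length;
        if (inputSize == 0)
            throw new ArgumentException("Input file is empty.", nameof(inPdfFilename));

        // run the compression step
        compress(inPdfFilename, outPdfFilename);

        // get the size of the resulting file and compute the saving
        long outputSize = new FileInfo(outPdfFilename).Length;
        return 100.0 * (inputSize - outputSize) / inputSize;
    }
}
```

For example, assuming the PackDocument method from the first section is accessible from the calling code, CompressionReport.MeasureSavingPercent("in.pdf", "out.pdf", PackDocument) packs the document and returns the percentage saved. The filenames here are placeholders. A negative result means that the step made the file larger, which is a useful signal that the setting does not suit that document.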