Adding existing page form fields to new document.

Questions, comments and suggestions concerning VintaSoft PDF .NET Plug-in.

Moderator: Alex

IntegraHarlan
Posts: 84
Joined: Fri Jan 24, 2020 3:37 am

Adding existing page form fields to new document.

Post by IntegraHarlan »

Hi,
I am trying to create a new document using a page from an existing document and add the form fields from the page to the Interactive form.

When I add the page, I would expect that the form fields would be added to the Interactive form of the new document. This is not happening.
I tried to manually add the form fields, but that does not work either.

I must be doing this wrong, but I am not able to figure out the correct way to add the field. Any help would be appreciated.

Code:

	using (PdfDocument pageDocument = new PdfDocument())
	{
		// create interactive form in PDF document
		pageDocument.InteractiveForm = new PdfDocumentInteractiveForm(pageDocument);	
		
		// Get the first page from the source document.
		PdfPage sourcePage = sourceDocument.Pages[0];
				
		// Get the list of interactive form fields.
		PdfInteractiveFormField[] fieldList = sourceDocument.InteractiveForm.GetFieldsLocatedOnPage(sourcePage);
					
		// Add the first page of the newly created document. I would expect that the form fields would be added.
		pageDocument.Pages.Add(sourcePage);
		
		// Try to add the form fields to the Interactive form
		foreach (PdfInteractiveFormField formField in fieldList)
		{
			// Remove field from the source document.
			sourceDocument.InteractiveForm.RemoveField(formField);

			// Error when adding the field.
			pageDocument.InteractiveForm.AddField(formField, pageDocument.Pages.Last());
		}
		
		// Remove the first page from the source document.
		sourceDocument.Pages.Remove(sourcePage);
					
		...
		...
	}
Alex
Site Admin
Posts: 2303
Joined: Thu Jul 10, 2008 2:21 pm

Re: Adding existing page form fields to new document.

Post by Alex »

Hi Harlan,

To copy a page with interactive fields from one PDF document to another, use the Vintasoft.Imaging.Pdf.Processing.PdfDocumentCopyCommand class. Please see the example here: https://www.vintasoft.com/docs/vsimagin ... mmand.html

Best regards, Alexander

Post by IntegraHarlan »

Hi Alex, thank you for your reply.

If I understand your response correctly: to copy a page from one document to another and bring its form fields along, I need to take the page from the source document, create a new document, and add the page to it?
Then use the document copy class to copy the newly created document into the document I want to add the page to?

Thanks

Post by Alex »

Hi Harlan,

No, you do not need to create a temporary document.

You need to do the following steps:
  • Open the source PDF document.
  • For each unnecessary page in the source PDF document:
    • Delete the interactive fields that are located on the unnecessary page.
    • Delete the unnecessary page.
  • Copy the source PDF document (now with 1 page) to the destination PDF document using the Vintasoft.Imaging.Pdf.Processing.PdfDocumentCopyCommand class.
  • Close the source PDF document without saving.
Best regards, Alexander

Post by IntegraHarlan »

Hi Alex,
Your suggestion works. However, we are having a performance issue with larger documents.
We take a document, split every page apart, and make each page its own document.
We want to make sure that each page retains its form fields.
We are doing this with the code below.
We call this method in a loop for each page in the document, passing in the document and a one-item list containing the page number.
We sometimes use this method to remove a range of pages as well. For my issue we are splitting out every page, so for a 20-page document the method is called 20 times, each time with the document and the page number that is being split out.

In the method, we:
  • Copy the document.
  • Create a list of pages that are kept in the source document. For my task, that is every page except the passed-in page number.
  • Remove from the source document the form fields that belong to the page we are splitting out, then remove that page.
  • Remove from the copied document the form fields and pages we are not splitting out. The result is a one-page document, so we remove every field and page except those of the page we are keeping.

This works, but with larger documents it can take close to four minutes: for example, a 100-page document with around 100 form fields per page.

The biggest delay is removing the unwanted fields and pages from the copied document that we are splitting out.
Each removal pass takes only a few seconds, but with a one-hundred-page document it adds up.
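
To put a rough number on that, here is a count (not a measurement) of how many pages the loop ends up deleting from the copies. It assumes the method is called once per page and that the source document loses one page per call, as described above. The class and method names are made up for illustration, and no VintaSoft calls are involved.

```csharp
using System;

static class SplitCostEstimate
{
    // Counts the pages deleted from the copies when every page of a document
    // is split out one call at a time and the source shrinks by one page per call.
    public static long TotalPagesRemoved(int pageCount)
    {
        long total = 0;
        for (int call = 1; call <= pageCount; call++)
        {
            // Pages still in the source at the moment it is copied.
            int pagesInCopy = pageCount - call + 1;
            // Every page except the extracted one is deleted from the copy.
            total += pagesInCopy - 1;
        }
        return total;
    }

    public static void Main()
    {
        const int pageCount = 100;
        const int fieldsPerPage = 100; // rough figure from the description above

        long pages = TotalPagesRemoved(pageCount);
        Console.WriteLine($"Pages removed from copies: {pages}");
        Console.WriteLine($"Fields removed from copies: {pages * fieldsPerPage}");
    }
}
```

The count grows with the square of the page count (n(n-1)/2 page removals), which would be consistent with the slowdown showing up mainly on large documents.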

Is there a faster, more effective way to do this?
Here is the method that we are using to split the document into separate pages.
We get the page list and call this method in a loop for each page.
Any information on how to improve the performance would be appreciated.

Code:

        private static FileStream CopyPageToDocument(PdfDocument sourceDocument, List<int> extractedPages)
        {
            FileStream temporaryParentStream = null;

            using (new StopwatchLog($"Copying {extractedPages.Count} page(s)."))
            {
                // create document and copy from source.
                using (PdfDocument pageDocument = new PdfDocument())
                {
                    // Create copy command from the newly created page document.
                    PdfDocumentCopyCommand documentCopyCommand = new PdfDocumentCopyCommand(pageDocument)
                    {
                        CopyBookmarks = true,
                        CopyDocumentLevelJavaScripts = true,
                        CopyInteractiveForm = true
                    };

                    // Execute copy from the source document to the target document. 
                    documentCopyCommand.Execute(sourceDocument);

                    // Create a list of pages that will be kept in the source document.
                    // The extractedPages list is 1 based. Keep both lists 1 based.
                    List<int> pagesToKeep = new List<int>();
                    for (int pageIndex = 1; pageIndex <= sourceDocument.Pages.Count(); pageIndex++)
                    {
                        if (extractedPages.Exists(item => item == pageIndex) == false)
                        {
                            pagesToKeep.Add(pageIndex);
                        }
                    }

                    // Remove the extracted pages from the source document.
                    extractedPages.Reverse();
                    foreach (int pageIndex in extractedPages)
                    {
                        // Check to make sure the InteractiveForm exist.  There are scenarios where it is null
                        if (sourceDocument.InteractiveForm != null)
                        { 
                            // Get a list of fields that belong to the pages that are not part of the new document and remove them.
                            PdfInteractiveFormField[] pageFields = sourceDocument.InteractiveForm.GetFieldsLocatedOnPage(sourceDocument.Pages[pageIndex - 1]);
                            foreach (PdfInteractiveFormField field in pageFields)
                            {
                                field.Remove();
                            }
                        }

                        // Remove unwanted pages.
                        sourceDocument.Pages.RemoveAt(pageIndex - 1);
                    }

                    // Remove the pages that are kept in the source document from the copied document.
                    pagesToKeep.Reverse();
                    foreach (int pageIndex in pagesToKeep)
                    {
                        // Check to make sure the InteractiveForm exist.  There are scenarios where it is null
                        if (pageDocument.InteractiveForm != null)
                        {
                            // Get a list of fields that belong to the pages that are kept with the source document and remove them from the copied document.
                            PdfInteractiveFormField[] pageFields = pageDocument.InteractiveForm.GetFieldsLocatedOnPage(pageDocument.Pages[pageIndex - 1]);
                            foreach (PdfInteractiveFormField field in pageFields)
                            {
                                field.Remove();
                            }
                        }

                        // Remove unwanted pages.
                        pageDocument.Pages.RemoveAt(pageIndex - 1);
                    }

                    // Return the copied document as a file stream.
                    temporaryParentStream = new FileStream(FileUtilities.GetTemporaryFile(FileUtilities.TemporarySubFolders.FileProcessing), FileMode.CreateNew, FileAccess.ReadWrite, FileShare.ReadWrite, 4096, FileOptions.DeleteOnClose);
                    pageDocument.SaveChanges(temporaryParentStream);
                }
            }

            return temporaryParentStream;
        }
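
A side note on the two Reverse() calls in the method above: removing by index from the end only works if the list is already in ascending order, and Reverse() also changes the list that the caller passed in. A minimal sketch of the same idea that sorts a copy instead is below. It uses a plain List<string> as a stand-in for the page collection; the helper name is hypothetical, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ReverseRemovalDemo
{
    // Removes the given 1-based positions from a list without
    // modifying the caller's index collection.
    public static void RemoveAtOneBased<T>(List<T> items, IEnumerable<int> oneBasedIndexes)
    {
        // Highest index first, so each removal leaves the lower,
        // still-pending positions where they were.
        foreach (int index in oneBasedIndexes.Distinct().OrderByDescending(i => i))
        {
            items.RemoveAt(index - 1);
        }
    }

    public static void Main()
    {
        List<string> pages = new List<string> { "p1", "p2", "p3", "p4", "p5" };
        RemoveAtOneBased(pages, new[] { 2, 4 });
        Console.WriteLine(string.Join(", ", pages)); // p1, p3, p5
    }
}
```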
        

Post by Alex »

Hi Harlan,

Thank you for the information.

Please send us (to support@vintasoft.com) a small console project that reproduces the problem. We will analyze your algorithm and PDF document and will try to suggest a solution with better performance.

Best regards, Alexander

Post by Alex »

Hi Harlan,

Thank you for the project; we reproduced the problem.

In version 10.1.14.1 we have added the ability to copy several PDF pages with the PdfDocumentCopyCommand class. This functionality improves the performance of your code and makes it much simpler. Please use version 10.1.14.1 with the updated code and let me know if you have any questions or problems.

Starting from version 10.1.14.1 you need to use the following code:

Code:

using System;
using System.Collections.Generic;
using System.IO;
using Vintasoft.Imaging.Pdf;
using Vintasoft.Imaging.Pdf.Processing;
using Vintasoft.Imaging.Pdf.Tree;
using Vintasoft.Imaging.Pdf.Tree.Annotations;
using Vintasoft.Imaging.Pdf.Tree.InteractiveForms;

namespace SplitPagesConsole
{
    class Program
    {
        static void Main(string[] args)
        {
            List<string> splitPageDocuments = SplitAllPages("test.pdf");
        }

        /// <summary>
        /// Splits all pages of the pdf file into separate documents.
        /// </summary>
        /// <param name="pdfFile">The source PDF file.</param>
        /// <returns>A list of file names for the newly created files.</returns>
        private static List<string> SplitAllPages(string pdfFile)
        {
            List<string> pages = new List<string>();

            using (PdfDocument document = new PdfDocument(pdfFile))
            {
                for (int i = 0; i < document.Pages.Count; i++)
                {
                    Console.Write("Processing page {0}...", i + 1);
                    
                    string outputFile = string.Format("{0}_p{1}.pdf", Path.GetFileNameWithoutExtension(pdfFile), i + 1);
                    CopyPageToDocument(document, i, outputFile);
                    pages.Add(outputFile);

                    Console.WriteLine("done.");
                }
            }

            return pages;
        }

        /// <summary>
        /// Creates a document from one page of the source document.
        /// </summary>
        /// <param name="sourceDocument">The source document that the copy is made from.</param>
        /// <param name="extractingPageIndex">The zero-based index of the page to be separated from the source document.</param>
        /// <param name="outputfile">The filename to save the extracted page to.</param>
        private static void CopyPageToDocument(PdfDocument sourceDocument, int extractingPageIndex, string outputfile)
        {
            using (PdfDocument outputDocument = new PdfDocument())
            {
                PdfDocumentCopyCommand copyCommand = new PdfDocumentCopyCommand(outputDocument);
                copyCommand.CopyInteractiveForm = true;
                copyCommand.CopyDocumentLevelJavaScripts = true;
                copyCommand.CopyBookmarks = true;
                copyCommand.PageIndexes = new int[] { extractingPageIndex };
                copyCommand.Execute(sourceDocument);
                outputDocument.Save(outputfile);
            }
        }
    }

}
Best regards, Alexander

Post by IntegraHarlan »

Hi Alex,
Thank you for the update.
The PdfDocumentCopyCommand class has improved the copy performance significantly.
I do have a question about the copied form fields.
I notice that some InteractiveForm fields on the copied page are null.
I assume those are fields that belong to other pages. Is it intentional to have null fields in the interactive form?

Thanks

Post by Alex »

Hi Harlan,

Yes, the field list can contain null values. A null value means that the field was not copied from the source PDF document to the destination PDF document.

We can improve the algorithm and remove null values from the field list if this is necessary. You can also remove null values from the field list yourself.
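
Filtering the nulls out yourself could look like the sketch below. The thread does not say which property or method returns the field list, so the sketch works on a plain array, and FieldStub is a hypothetical stand-in, not a VintaSoft class.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for a form field type; not a VintaSoft class.
class FieldStub
{
    public string Name;
    public FieldStub(string name) { Name = name; }
}

static class NullFieldFilter
{
    // Returns a new array that contains only the non-null entries, in the original order.
    public static T[] WithoutNulls<T>(T[] fields) where T : class
    {
        return fields.Where(field => field != null).ToArray();
    }

    public static void Main()
    {
        FieldStub[] copied = { new FieldStub("Name"), null, new FieldStub("Date"), null };
        FieldStub[] usable = WithoutNulls(copied);

        Console.WriteLine(usable.Length); // 2
        Console.WriteLine(string.Join(", ", usable.Select(f => f.Name))); // Name, Date
    }
}
```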

Best regards, Alexander

Post by Alex »

Hi Harlan,

In version 10.1.16.1 we have optimized the algorithm of the PdfDocumentCopyCommand class for the case when the PdfDocumentCopyCommand.PageIndexes property is set. This improvement decreases the size of the created PDF documents.

To use the new functionality, you need to use this code:

Code:

using System;
using System.Collections.Generic;
using System.IO;
using Vintasoft.Imaging.Pdf;
using Vintasoft.Imaging.Pdf.Processing;
using Vintasoft.Imaging.Pdf.Tree;
using Vintasoft.Imaging.Pdf.Tree.Annotations;
using Vintasoft.Imaging.Pdf.Tree.InteractiveForms;

namespace SplitPagesConsole
{
    class Program
    {
        static void Main(string[] args)
        {
            string documentFile = "test.pdf";

            // Build the full path of the document to be split (the file is expected next to the executable).
            documentFile = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, documentFile);

            Console.WriteLine();

            // Set Timer.
            System.Diagnostics.Stopwatch timer = new System.Diagnostics.Stopwatch();
            timer.Start();

            List<string> splitPageDocuments = SplitAllPages(documentFile);

            timer.Stop();

            Console.WriteLine();
            Console.WriteLine($"Execution time to split multi-page PDF document into {splitPageDocuments.Count} separate PDF documents: {timer.Elapsed.TotalSeconds} seconds.");
            Console.ReadKey();
        }

        /// <summary>
        /// Splits all pages of the pdf file into separate documents.
        /// </summary>
        /// <param name="pdfFile">The source PDF file.</param>
        /// <returns>A list of file names for the newly created files.</returns>
        private static List<string> SplitAllPages(string pdfFile)
        {
            List<string> pages = new List<string>();

            using (PdfDocument document = new PdfDocument(pdfFile))
            {
                for (int i = 0; i < document.Pages.Count; i++)
                {
                    Console.Write("Processing page {0}...", i + 1);
                    
                    string outputFile = string.Format("{0}_p{1}.pdf", Path.GetFileNameWithoutExtension(pdfFile), i + 1);
                    CopyPageToDocument(document, i, outputFile);
                    pages.Add(outputFile);

                    Console.WriteLine("done.");
                }
            }

            return pages;
        }

        /// <summary>
        /// Creates a document from one page of the source document.
        /// </summary>
        /// <param name="sourceDocument">The source document that the copy is made from.</param>
        /// <param name="extractingPageIndex">The zero-based index of the page to be separated from the source document.</param>
        /// <param name="outputfile">The filename to save the extracted page to.</param>
        private static void CopyPageToDocument(PdfDocument sourceDocument, int extractingPageIndex, string outputfile)
        {
            using (PdfDocument outputDocument = new PdfDocument())
            {
                PdfDocumentCopyCommand copyCommand = new PdfDocumentCopyCommand(outputDocument);
                copyCommand.CopyInteractiveForm = true;
                copyCommand.CopyDocumentLevelJavaScripts = true;
                copyCommand.CopyBookmarks = true;
                copyCommand.PageIndexes = new int[] { extractingPageIndex };
                copyCommand.Execute(sourceDocument);
                outputDocument.Pack(outputfile, PdfFormat.Pdf_17);
            }
        }
    }

}
The new code creates a PDF document of 135 KB. The previous code created a PDF document of 5,200 KB.

Best regards, Alexander