Commit 3340173

Merge pull request #521 from telerik/new-kb-split-big-pdf-documents-750b1b4135724865a2445dfd8bbbf135
Added new kb article split-big-pdf-documents
2 parents 68f43cf + eb8b34c commit 3340173

3 files changed: +103 -0 lines changed

Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
---
title: Splitting Big PDF Documents into Smaller Files Efficiently
description: Learn how to efficiently export subsets of pages from a RadFixedDocument to handle large PDFs by splitting them into smaller chunks using PdfStreamWriter.
type: how-to
page_title: How to Split Large PDFs into Smaller Chunks Using RadPdfProcessing
slug: split-big-pdf-documents
tags: pdf, large, big, slow, memory, split, document, processing
res_type: kb
ticketid: 1679749
---

## Environment

| Version | Product | Author |
| ---- | ---- | ---- |
| 2025.1.205 | RadPdfProcessing | [Desislava Yordanova](https://www.telerik.com/blogs/author/desislava-yordanova) |

## Description

When working with large PDF documents, it might be necessary to split them into smaller "chunks" of a specified number of pages. This article shows a sample approach for efficiently exporting subsets of pages without re-importing the entire document for each chunk.

## Solution

To efficiently export subsets of pages from a large PDF document, use the [PdfStreamWriter]({%slug radpdfprocessing-formats-and-conversion-pdf-pdfstreamwriter-overview%}) functionality of [RadPdfProcessing]({%slug radpdfprocessing-overview%}). The source file is opened once and the pages of each chunk are written directly to a new output file, which avoids re-importing the whole document for every chunk and keeps processing time and memory usage down when handling large volumes of PDFs. The steps below show how to split a PDF document into smaller "chunks" using `PdfStreamWriter`.

### Step 1: Include Necessary Namespaces

Ensure you include the necessary namespaces in your project. The `System` and `System.IO` namespaces are needed for the `Math`, `File`, `Path`, and `Directory` calls used in Step 2:

```csharp
using System;
using System.IO;
using Telerik.Windows.Documents.Fixed.FormatProviders.Pdf.Streaming;
using Telerik.Windows.Documents.Fixed.Model;
```

### Step 2: Implement the Splitting Logic

Use the following logic to split a PDF document into smaller chunks. This example splits a document into chunks of 10 pages each, but you can adjust the `pagesCount` variable to change the chunk size as needed.

```csharp
internal class Program
{
    public const string ResultDirName = @"..\..\..\Demo results";
    public const string InputFileMultipageDocument = "YourLargePDF.pdf";

    static void Main(string[] args)
    {
        EnsureEmptyResultDirectory();
        SplitDocumentPages("YourLargePDF_Chunk[{0}-{1}].pdf");
    }

    // Splits the input document into consecutive chunks of up to pagesCount pages.
    // resultFileFormat receives the 1-based first and last page number of each chunk.
    private static void SplitDocumentPages(string resultFileFormat)
    {
        string documentToSplit = InputFileMultipageDocument;
        int pagesCount = 10; // Number of pages per chunk

        // The source file is opened once and reused for every chunk.
        using (PdfFileSource fileToSplit = new PdfFileSource(File.OpenRead(documentToSplit)))
        {
            int totalPages = fileToSplit.Pages.Length;
            int currentTotalPageIndex = 0;

            // Round up so that a trailing partial chunk gets its own file.
            int numberOfSplitDocs = totalPages / pagesCount + (totalPages % pagesCount == 0 ? 0 : 1);

            for (int docIndex = 0; docIndex < numberOfSplitDocs; docIndex++)
            {
                int currentPagesEnd = Math.Min(currentTotalPageIndex + pagesCount, totalPages);
                string splitDocumentName = Path.Combine(ResultDirName, string.Format(resultFileFormat, docIndex * pagesCount + 1, currentPagesEnd));

                // File.Create truncates an existing file, whereas File.OpenWrite
                // would keep any leftover bytes beyond the newly written content.
                using (PdfStreamWriter fileWriter = new PdfStreamWriter(File.Create(splitDocumentName)))
                {
                    for (int pageIndex = currentTotalPageIndex; pageIndex < currentPagesEnd; pageIndex++)
                    {
                        PdfPageSource page = fileToSplit.Pages[pageIndex];
                        fileWriter.WritePage(page);
                        currentTotalPageIndex++;
                    }
                }
            }
        }
    }

    // Creates the result directory, or deletes the files left in it by a previous run.
    private static void EnsureEmptyResultDirectory()
    {
        if (Directory.Exists(ResultDirName))
        {
            foreach (string fileName in Directory.EnumerateFiles(ResultDirName))
            {
                File.Delete(fileName);
            }
        }
        else
        {
            Directory.CreateDirectory(ResultDirName);
        }
    }
}
```

This solution efficiently processes the splitting operation by writing each chunk directly to a new file without the need to re-import the original document multiple times:

![Split Large PDF](images/split-big-pdf.png)

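To make the chunk arithmetic easier to check in isolation, the page ranges can be computed without touching any PDF file. The following is a minimal standalone sketch, not part of RadPdfProcessing: the `ChunkRangeDemo` class and the `GetChunkRanges` helper are hypothetical names introduced only for illustration, and the code assumes C# 7 or later for tuple support. It mirrors the range calculation from Step 2 for a 25-page document with 10 pages per chunk.

```csharp
using System;
using System.Collections.Generic;

internal static class ChunkRangeDemo
{
    // Hypothetical helper (not a RadPdfProcessing API): yields the 1-based,
    // inclusive first and last page number of each chunk.
    public static IEnumerable<(int FirstPage, int LastPage)> GetChunkRanges(int totalPages, int pagesPerChunk)
    {
        if (pagesPerChunk <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(pagesPerChunk));
        }

        for (int start = 0; start < totalPages; start += pagesPerChunk)
        {
            // The last chunk may hold fewer pages than pagesPerChunk.
            int end = Math.Min(start + pagesPerChunk, totalPages);
            yield return (start + 1, end);
        }
    }

    private static void Main()
    {
        // Prints the file names that the format string from Step 2 would produce.
        foreach (var (firstPage, lastPage) in GetChunkRanges(25, 10))
        {
            Console.WriteLine(string.Format("YourLargePDF_Chunk[{0}-{1}].pdf", firstPage, lastPage));
        }
    }
}
```

For 25 pages and a chunk size of 10, the ranges are 1-10, 11-20, and 21-25, so the final file holds only five pages. Run the sketch as its own console project, since it declares a separate `Main` entry point.
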
## See Also

- [PdfStreamWriter]({%slug radpdfprocessing-formats-and-conversion-pdf-pdfstreamwriter-overview%})
- [RadPdfProcessing]({%slug radpdfprocessing-overview%})

libraries/radpdfprocessing/formats-and-conversion/pdf/pdfstreamwriter/overview.md

Lines changed: 1 addition & 0 deletions
@@ -69,3 +69,4 @@ If the scenario requires you to read or modify the existing page content, you sh
 * [Features]({%slug radpdfprocessing-formats-and-conversion-pdf-pdfstreamwriter-features%})
 * [PdfFormatProvider]({%slug radpdfprocessing-formats-and-conversion-pdf-pdfformatprovider%})
 * [PdfStreamWriter Class]({%slug radpdfprocessing-formats-and-conversion-pdf-pdfstreamwriter-pdfstreamwriter%})
+* [Splitting Big PDF Documents into Smaller Files Efficiently]({%slug split-big-pdf-documents%})
