The following COM control can be integrated into MS Office programs to import PDF documents as text.
You reference the COM control in your VBA macro code, read the text content of a PDF file with a single call, and then transfer it into the Excel or Word document.
Application:
Excel, Word, PowerPoint, Access
Subject:
VBA macro code, PDF reader, PDF import
Code page
Example in Excel
Also works in MS Word, Outlook, PowerPoint and other Office programs
The following lines call the PDF reader and return the text of the PDF document as a string
'< get PDF Text >
Dim pdf_Reader As New Pdf_Text_Reader.pdf_Reader
sText = pdf_Reader.get_Text(sFilename)
'</ get PDF Text >
Example code in VBA
Option Explicit
Public Sub Read_PDF_Text()
    '------------< Read_PDF_Text() >------------
    Dim ws As Worksheet
    Set ws = ActiveSheet

    Dim sFilename As String
    sFilename = "C:\_Daten\Desktop\VS_Projects\ActiveX\Pdf_Text_Reader\_Test\PDF_Import_Excel.pdf"

    Dim sText As String

    '< get PDF Text >
    Dim pdf_Reader As New Pdf_Text_Reader.pdf_Reader
    sText = pdf_Reader.get_Text(sFilename)
    '</ get PDF Text >

    '----< Read as Lines >----
    Dim arrLines
    arrLines = Split(sText, vbLf)

    ' Long instead of Integer, so that long documents do not overflow the row counter
    Dim iLine As Long
    iLine = 1
    Dim vLine
    For Each vLine In arrLines
        iLine = iLine + 1
        ' one PDF text line per row in column B, starting at row 22
        ws.Cells(iLine + 20, 2).Value = vLine
    Next
    '----</ Read as Lines >----
    '------------</ Read_PDF_Text() >------------
End Sub
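The example above assumes that the PDF file exists and can be read. The following is a minimal sketch of a more defensive call. The function name Try_Get_PDF_Text is hypothetical and not part of the download, and it is an assumption that get_Text reports problems as a COM error; only Pdf_Text_Reader.pdf_Reader and get_Text are taken from the example.

```vba
Option Explicit

' Hypothetical helper (not part of the download): returns the PDF text,
' or an empty string if the file is missing or the reader raises an error.
Public Function Try_Get_PDF_Text(ByVal sFilename As String) As String
    Try_Get_PDF_Text = ""
    On Error GoTo Read_Failed

    ' An empty path is rejected first, because Dir("") does not test a specific file
    If Len(sFilename) = 0 Then Exit Function

    ' Dir returns an empty string when no file matches the path
    If Len(Dir(sFilename)) = 0 Then
        Debug.Print "File not found: " & sFilename
        Exit Function
    End If

    ' Same call as in the example; requires the Pdf_Text_Reader reference
    Dim pdf_Reader As New Pdf_Text_Reader.pdf_Reader
    Try_Get_PDF_Text = pdf_Reader.get_Text(sFilename)
    Exit Function

Read_Failed:
    ' Assumption: get_Text signals failures as a COM error that VBA can trap
    Debug.Print "PDF could not be read: " & Err.Description
End Function
```

In Read_PDF_Text() the call would then be sText = Try_Get_PDF_Text(sFilename), followed by a check for an empty string before the lines are written to the sheet.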
Control integration
To use the code, you have to reference the following COM control
(included in the download)
Open the macro code / VBA code page with Alt+F11
VBA code page -> Menu -> Tools -> References
Then click Browse and select the file Pdf_Text_Reader.tlb to add it as a reference
Installation
Pdf_Text_Reader.dll
Pdf_Text_Reader is a COM control that is delivered as a .dll file. It uses iTextSharp to read the text.
Register.bat and Unregister.bat are used to install and uninstall the control on the computer.
The following files from the download are required.
Register.bat
Register.bat contains the command that installs the control on the PC.
On the target computer you have to adjust Register.bat by editing the path to the COM .dll file.
Just replace the xxxxx with the path where the file Pdf_Text_Reader.dll is located
C:\Windows\Microsoft.NET\Framework\v4.0.30319\regasm.exe "C:\\xxxxxxxxxxx\Pdf_Text_Reader.dll" /tlb /codebase
pause
Then run the file as administrator
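After the registration you may want to check from VBA whether the control can be created. The following is only a sketch: the procedure name is hypothetical, and the ProgID "Pdf_Text_Reader.pdf_Reader" is an assumption derived from the type name used in the example code. The actual ProgID depends on how the .dll was built.

```vba
Option Explicit

' Hypothetical check (not part of the download): tries to create the reader
' by late binding, so it also runs without the reference to Pdf_Text_Reader.tlb.
Public Sub Check_PDF_Reader_Registration()
    Dim oReader As Object

    On Error Resume Next
    ' Assumption: the ProgID matches the type name Pdf_Text_Reader.pdf_Reader
    Set oReader = CreateObject("Pdf_Text_Reader.pdf_Reader")
    On Error GoTo 0

    If oReader Is Nothing Then
        MsgBox "Pdf_Text_Reader could not be created. Check the path in Register.bat and run it as administrator."
    Else
        MsgBox "Pdf_Text_Reader is registered."
    End If
End Sub
```

If the message reports a failure although Register.bat ran without errors, the assumed ProgID may simply be wrong, so the reference in Tools -> References remains the more reliable check.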