TET PDF IFilter extracts text and metadata from PDF documents and makes them available to search and retrieval software on Windows. This allows PDF documents to be searched on the local desktop, a corporate server, or the Web. TET PDF IFilter is based on the patented PDFlib Text Extraction Toolkit (TET), which is a developer product for reliably extracting text from PDF documents.
TET PDF IFilter is a robust implementation of Microsoft's IFilter indexing interface. It works with all search and retrieval products that support the IFilter interface, e.g. SharePoint and SQL Server. Such products use format-specific filter programs, called IFilters, for particular file formats, e.g. HTML. TET PDF IFilter is such a program, aimed at PDF documents. The user interface for searching the documents may be Windows Explorer, a Web or database frontend, a query script, or a custom application. As an alternative to interactive searches, queries can also be submitted programmatically without any user interface.
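To illustrate how a search product locates the format-specific filter for a file type, the following Python sketch follows the registry chain commonly described in Microsoft's filter handler documentation: file extension, then persistent handler, then the class registered for the IFilter interface. It is only an approximation under stated assumptions: the helper names (`read_hkcr_default`, `find_ifilter_clsid`) are hypothetical, only the direct registration under the extension key is handled, and nothing here is taken from TET PDF IFilter's own installer or documentation.

```python
# Interface ID of IFilter, as used under PersistentAddinsRegistered.
IID_IFILTER = "{89BCB740-6119-101A-BCB7-00DD010655AF}"


def read_hkcr_default(subkey):
    """Return the default value of a key under HKEY_CLASSES_ROOT, or None."""
    import winreg  # standard library, available on Windows only

    try:
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, subkey) as key:
            value, _value_type = winreg.QueryValueEx(key, "")
            return value
    except OSError:
        # Key or default value is missing.
        return None


def find_ifilter_clsid(extension, read_default=read_hkcr_default):
    """Follow extension -> persistent handler -> registered IFilter class ID.

    `read_default` is injectable so the lookup logic can be exercised
    without a Windows registry (hypothetical design choice for this sketch).
    """
    handler = read_default(extension + "\\PersistentHandler")
    if not handler:
        return None
    return read_default(
        "CLSID\\" + handler + "\\PersistentAddinsRegistered\\" + IID_IFILTER
    )
```

On a Windows machine, `find_ifilter_clsid(".pdf")` returns the class ID of whichever PDF filter is currently registered, or `None` if there is none. Whether that class belongs to TET PDF IFilter depends on the installation.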
Based on patented TET technology
PDFlib TET, the basis of TET PDF IFilter, was first released in 2002 and has been used by customers worldwide in server and desktop environments. As an alternative to extracting PDF page contents and metadata as raw text, TET can supply the document contents in XML format. TET is also available as a free plugin for Adobe Acrobat; this plugin allows interactive testing and evaluation of TET's superior text extraction.
Unique advantages
TET PDF IFilter offers the following advantages:
- Indexes not only page content, but also metadata, bookmarks, PDF attachments, and PDF packages/portfolios
- Extracts text even from PDFs where Acrobat fails
- Indexes XMP image metadata
- Performance: thread-safe, fast and robust, 32- and 64-bit
- Lean stand-alone product without side effects
- Automatic language/script detection
- Actively supported by a dedicated team
Enterprise PDF search
TET PDF IFilter is available in fully thread-safe native 32- and 64-bit versions. You can implement enterprise PDF search solutions with TET PDF IFilter and the following products:
- Microsoft Office SharePoint Server (MOSS)
- Microsoft Search Server 2008 and the free Search Server 2008 Express
- Microsoft SQL Server
- Microsoft Exchange Server
TET PDF IFilter can be used with all other Microsoft and third-party products which support the IFilter interface.
Desktop PDF search
TET PDF IFilter can also be used to implement desktop PDF search, e.g. with the following products:
- Windows Desktop Search (WDS): integrated in Windows Vista; also available as free add-on for Windows XP
- Windows Indexing Service
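As an illustration of a query script against the desktop index, the following Python sketch builds a Windows Search SQL statement restricted to PDF files and runs it through the `Search.CollatorDSO` OLE DB provider via ADO. It rests on several assumptions: the function names are hypothetical, the third-party `pywin32` package is installed, the machine runs a Windows Search version that exposes `SystemIndex`, and a PDF filter such as TET PDF IFilter is registered so that PDF contents are actually in the index.

```python
def build_pdf_query(phrase, max_rows=20):
    """Build a Windows Search SQL statement for PDFs containing a phrase."""
    # Double single quotes for the SQL literal; drop double quotes because
    # they delimit the phrase inside CONTAINS.
    safe = phrase.replace("'", "''").replace('"', "")
    contains_arg = '"' + safe + '"'
    return (
        f"SELECT TOP {int(max_rows)} System.ItemPathDisplay FROM SystemIndex "
        f"WHERE CONTAINS('{contains_arg}') AND System.FileExtension = '.pdf'"
    )


def search_pdfs(phrase, max_rows=20):
    """Run the query on Windows and return the matching file paths."""
    import win32com.client  # from pywin32 (assumed to be installed)

    conn = win32com.client.Dispatch("ADODB.Connection")
    conn.Open(
        "Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"
    )
    try:
        rs = win32com.client.Dispatch("ADODB.Recordset")
        rs.Open(build_pdf_query(phrase, max_rows), conn)
        paths = []
        while not rs.EOF:
            paths.append(rs.Fields.Item("System.ItemPathDisplay").Value)
            rs.MoveNext()
        rs.Close()
        return paths
    finally:
        conn.Close()
```

Keeping the query construction separate from the ADO call makes the SQL text easy to inspect or log before it is sent to the index, for example with `print(build_pdf_query("annual report", 5))`.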
TET PDF IFilter is available free of charge for non-commercial desktop use, which provides a convenient basis for testing and evaluation.