Stellent: Supported File Formats

Supported Document Formats

This appendix contains a list of the document formats supported by the Inso filtering technology. The following topics are covered in this appendix:

About Document Filtering Technology
Supported Document Formats
Unsupported Formats

About Document Filtering Technology

Oracle Text uses document filtering technology licensed from Stellent Chicago, Inc. This technology enables you to index most document formats and, with the CTX_DOC package, to convert documents to HTML for presentation.

See Also:
For a list of supported formats, see “Supported Document Formats” in this appendix.

To use Inso filtering for indexing and DML processing, you must specify the INSO_FILTER object in your filter preference.
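
As a rough illustration of specifying INSO_FILTER in a filter preference, the sketch below builds the two statements typically involved: a call to ctx_ddl.create_preference and a CREATE INDEX with a filter parameter. The Python helper and the preference, index, table, and column names are hypothetical and only serve to show the shape of the statements. Check them against the Oracle Text reference for your release before running them.

```python
import re

# Accept only simple, unquoted identifiers so nothing unexpected ends up in the SQL text.
_IDENT = re.compile(r"[A-Za-z][A-Za-z0-9_$#]*")


def inso_filter_statements(pref_name, index_name, table, column):
    """Return [create-preference block, create-index statement] as strings.

    All four names are placeholders supplied by the caller (hypothetical values).
    """
    for name in (pref_name, index_name, table, column):
        if not _IDENT.fullmatch(name):
            raise ValueError(f"not a simple identifier: {name!r}")

    # Step 1: create a filter preference that uses the INSO_FILTER object.
    create_pref = (
        f"BEGIN ctx_ddl.create_preference('{pref_name}', 'INSO_FILTER'); END;"
    )
    # Step 2: reference that preference when creating the CONTEXT index.
    create_index = (
        f"CREATE INDEX {index_name} ON {table}({column}) "
        f"INDEXTYPE IS ctxsys.context "
        f"PARAMETERS ('filter {pref_name}')"
    )
    return [create_pref, create_index]


if __name__ == "__main__":
    for stmt in inso_filter_statements("my_inso_filter", "docs_idx", "docs", "body"):
        print(stmt)
```

The statements are returned as text rather than executed, so they can be reviewed and then run from whichever SQL client you normally use.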

To use Inso filtering technology for converting documents to HTML with the CTX_DOC package, you need not use the INSO_FILTER indexing preference, but you must still set up your environment to use this filtering technology as described in this appendix.

To convert documents to HTML format, Inso filtering technology relies on shared libraries and data files licensed from Stellent Chicago, Inc.

The following sections discuss the supported platforms and how to enable Inso filtering on the different platforms.

Supported Platforms

Inso filter technology is supported on the following platforms:

Sun Solaris on SPARC 32-bit and 64-bit (2.5.1 – 2.6, 7 – 8)
IBM AIX 32-bit and 64-bit (4.2 – 4.3)
HP-UX 32-bit and 64-bit (10.2 – 11.0)
DEC UNIX for Alpha/Tru64 UNIX (4.0)
SGI IRIX 32-bit and 64-bit (6.3)
Microsoft Windows
Intel x86 WinNT (4.0 and above)
Intel x86 Win95, Win98 SE, Win2000, and Windows ME
Red Hat Linux for Intel x86 (5.2 – 7.0)

Environment Variables

All environment variables related to Inso filtering must be made visible to Oracle Text.

Requirements for UNIX Platforms

The following requirements apply to Solaris, IBM AIX, HP-UX, Digital UNIX, SGI, and Linux platforms:

Ensure that the operating system user running the Oracle database and the ctxsrv server has execute permission on the *.flt files.
Set the $PATH variable to include the location of the *.flt files (in particular, the location of the file isunx2.flt) and $ORACLE_HOME/ctx/lib, which is the location of the shared libraries for Inso filtering.
Set the $HOME environment variable so that Inso technology can write files to the .oit subdirectory of the $HOME directory.
Access to a running X-Windows server is required to perform vector graphics image conversion.
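
The first three requirements above can be checked with a short script. The following is a minimal sketch, not a supported tool: the function name and the flt_dir argument are hypothetical, and the permission checks apply to the user running the script, so it would need to run as the operating system user that runs the Oracle database. It does not check access to an X server.

```python
import os
from pathlib import Path


def check_unix_filter_env(env, flt_dir):
    """Return a list of problems found; an empty list means every check passed.

    env     -- mapping of environment variables (for example, os.environ)
    flt_dir -- directory assumed to hold the *.flt files (hypothetical argument)
    """
    problems = []
    flt_dir = os.path.normpath(str(flt_dir))

    # Requirement 1: the *.flt files must be executable by the current user.
    flt_files = sorted(Path(flt_dir).glob("*.flt"))
    if not flt_files:
        problems.append(f"no *.flt files found in {flt_dir}")
    for flt in flt_files:
        if not os.access(flt, os.X_OK):
            problems.append(f"{flt.name} is not executable")

    # Requirement 2: PATH must include the *.flt directory and $ORACLE_HOME/ctx/lib.
    path_dirs = {
        os.path.normpath(p) for p in env.get("PATH", "").split(os.pathsep) if p
    }
    if flt_dir not in path_dirs:
        problems.append("PATH does not include the *.flt directory")
    oracle_home = env.get("ORACLE_HOME")
    if not oracle_home:
        problems.append("ORACLE_HOME is not set")
    elif os.path.normpath(os.path.join(oracle_home, "ctx", "lib")) not in path_dirs:
        problems.append("PATH does not include $ORACLE_HOME/ctx/lib")

    # Requirement 3: HOME must be set and writable so that .oit can be created in it.
    home = env.get("HOME")
    if not home or not os.access(home, os.W_OK):
        problems.append("HOME is unset or not writable")

    return problems


if __name__ == "__main__":
    # Assumes, for illustration only, that the *.flt files live in $ORACLE_HOME/ctx/lib.
    assumed_dir = os.path.join(os.environ.get("ORACLE_HOME", ""), "ctx", "lib")
    for problem in check_unix_filter_env(os.environ, assumed_dir):
        print("WARNING:", problem)
```

Passing the environment in as a mapping, rather than reading os.environ inside the function, makes it possible to check the environment of a different process, such as one captured from the database server's startup script.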
Filtering Vector Graphic Formats

Follow these steps to filter vector graphic formats on UNIX platforms:

Start an X server to filter vector graphic formats. If no X server exists (that is, the system detects no X libraries, such as Xm, Xt, and X11), vector graphic filtering is not performed. Vector graphic formats include CAD drawings and presentation formats such as PowerPoint 97. Raster formats, by contrast, include GIF, JPEG, TIF, and other bitmap formats.
Because the system depends on X libraries to perform vector graphic conversion, ensure that the system-specific library path environment variable for the X libraries is set correctly.
Set the $DISPLAY environment variable. For example, setting DISPLAY=:0.0 tells the system to use the X server on the console.
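
To make the DISPLAY value in the last step concrete, here is a small sketch that splits a value such as :0.0 into host, display number, and screen number. The function names are hypothetical and the pattern is deliberately simplified (it does not handle every form an X server accepts). It only checks that the variable is well formed, not that an X server is actually reachable.

```python
import re

# Simplified pattern for values of the form [host]:display[.screen]
_DISPLAY_RE = re.compile(r"(?P<host>[^:]*):(?P<display>\d+)(?:\.(?P<screen>\d+))?")


def parse_display(value):
    """Return (host, display, screen), or None if the value is missing or malformed.

    An empty host string means the local machine; a missing screen defaults to 0.
    """
    match = _DISPLAY_RE.fullmatch(value or "")
    if match is None:
        return None
    screen = match.group("screen")
    return (
        match.group("host"),
        int(match.group("display")),
        int(screen) if screen else 0,
    )


def display_is_configured(env):
    """True if DISPLAY is present in env and looks well formed."""
    return parse_display(env.get("DISPLAY")) is not None


if __name__ == "__main__":
    # ":0.0" has an empty host (local machine), display 0, screen 0.
    print(parse_display(":0.0"))
```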
OLE2 Object Support

There are platform-dependent limits on what Inso filter technology can do with OLE2 objects. On all platforms, when a metafile snapshot is available, Inso technology uses it to convert the object.

When a metafile snapshot is not available on UNIX platforms, Inso technology cannot convert the OLE2 object.

However, when a metafile snapshot is not available on the NT platform, the original application is used (if available) to convert the OLE2 object.
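
The OLE2 rules above amount to a small decision table, sketched below. The function name, the platform strings, and the return values are hypothetical. The case where no snapshot exists on NT and the original application is also unavailable is not spelled out in the text, so treating it as "cannot convert" is an assumption.

```python
def ole2_conversion_method(platform, has_metafile_snapshot, original_app_available=False):
    """Return how an OLE2 object would be converted, or None if it cannot be.

    platform -- "nt" or "unix" (hypothetical labels for this sketch)
    """
    # On all platforms, a metafile snapshot is used when one is available.
    if has_metafile_snapshot:
        return "metafile snapshot"
    # Without a snapshot, only NT can fall back to the original application.
    if platform == "nt" and original_app_available:
        return "original application"
    # UNIX without a snapshot cannot convert; NT without the application
    # is assumed here to behave the same way.
    return None


if __name__ == "__main__":
    for args in [("unix", True), ("unix", False), ("nt", False, True), ("nt", False, False)]:
        print(args, "->", ole2_conversion_method(*args))
```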

Supported Document Formats

The following table lists all of the document formats that Oracle Text supports for filtering. Document filtering is used for indexing, DML, and for converting documents to HTML with the CTX_DOC package. This filtering technology is based on Outside In HTML Export and Outside In Content Access technology licensed from Stellent Chicago, Inc.

Note:
This list does not represent the complete list of formats that Oracle is able to process. The external filter framework enables Oracle to process any document format, provided an external filter exists that can convert that format to plain text.

Word Processing – Generic

Format | Version
ASCII Text (7 & 8 bit versions) | All versions
ANSI Text (7 & 8 bit) | All versions
Unicode Text | All versions
HTML | Versions through 3.0 (some limitations)
IBM Revisable Form Text | All versions
IBM FFT | All versions
Microsoft Rich Text Format (RTF) | All versions

Word Processing – DOS

Format | Version
DEC WPS Plus (WPL) | Versions through 4.1
DEC WPS Plus (DX) | Versions through 4.0
DisplayWrite 2 & 3 (TXT) | All versions
DisplayWrite 4 & 5 | Versions through Release 2.0
Enable | Versions 3.0, 4.0 and 4.5
First Choice | Versions through 3.0
Framework | Version 3.0
IBM Writing Assistant | Version 1.01
Lotus Manuscript | Versions through 2.0
MASS11 | Versions through 8.0
Microsoft Word | Versions through 6.0
Microsoft Works | Versions through 2.0
MultiMate | Versions through 4.0
Navy DIF | All versions
Nota Bene | Version 3.0
Office Writer | Versions 4.0 to 6.0
PC-File Letter | Versions through 5.0
PC-File+ Letter | Versions through 3.0
PFS:Write | Versions A, B, and C
Professional Write | Versions through 2.1
Q&A | Version 2.0
Samna Word | Versions through Samna Word IV+
SmartWare II | Version 1.02
Sprint | Versions through 1.0
Total Word | Version 1.2
Volkswriter 3 & 4 | Versions through 1.0
Wang PC (IWP) | Versions through 2.6
WordMARC | Versions through Composer Plus
WordPerfect | Versions through 6.1
WordStar | Versions through 7.0
WordStar 2000 | Versions through 3.0
XyWrite | Versions through III Plus

Word Processing – International

Format | Version
JustSystems Ichitaro | Versions 5.0, 6.0, 8.0, 9.0, and 10.0

Word Processing – Windows

Format | Version
AMI/AMI Professional | Versions through 3.1
Corel WordPerfect for Windows | Versions through 2002
JustWrite | Versions through 3.0
Legacy | Versions through 1.1
Lotus WordPro (NT on Intel only) | SmartSuite 96, 97, Millennium and Millennium 9.6
Lotus WordPro (all supported platforms except NT on Intel; Text only) | SmartSuite 97, Millennium, and Millennium 9.6
Microsoft Windows Works | Versions through 4.0
Microsoft Windows Write | Versions through 3.0
Microsoft Word 97 | Word 97
Microsoft Word 2000 | Word 2000
Microsoft Word 2002 (Office XP) | Word 2002
Microsoft Word for Windows | Versions through 7.0
Microsoft WordPad | All versions
Novell Perfect Works | Version 2.0
Novell WordPerfect for Windows | Versions through 7.0
Professional Write Plus | Version 1.0
Q&A Write for Windows | Version 3.0
Star Office Writer for Windows (Text only) | Version 5.2
WordStar for Windows | Version 1.0

Word Processing – Macintosh

Format | Version
Microsoft Word | Versions 4.0 through 6.0
Microsoft Word 98 | Word 98
WordPerfect | Versions 1.02 through 3.0
Microsoft Works | Versions through 2.0
MacWrite II | Version 1.1

Word Processing – Unix

Format | Version
Star Office Writer for Windows | Version 5.2

Desktop Publishing

Format | Version
Adobe FrameMaker | Version 6.0

Spreadsheet Formats

Format | Version
Enable | Versions 3.0, 4.0 and 4.5
First Choice | Versions through 3.0
Framework | Version 3.0
Lotus 1-2-3 (DOS & Windows) | Versions through 5.0
Lotus 1-2-3 for SmartSuite | SmartSuite 97, Millennium, and Millennium 9.6
Lotus 1-2-3 Charts (DOS & Windows) | Versions through Millennium 9.6
Lotus 1-2-3 (OS/2) | Versions through 2.0
Lotus 1-2-3 Charts (OS/2) | Versions through 2.0
Lotus Symphony | Versions 1.0, 1.1 and 2.0
Microsoft Excel 97 | Excel 97
Microsoft Excel 2000 | Excel 2000
Microsoft Excel 2002 (Office XP) | Excel 2002
Microsoft Excel Windows | Versions 2.2 through 7.0
Microsoft Excel Macintosh | Versions 3.0 – 4.0 and 98
Microsoft Excel Charts | Versions 2.x – 7.0
Microsoft Multiplan | Version 4.0
Microsoft Windows Works | Versions through 4.0
Microsoft Works (DOS) | Versions through 2.0
Microsoft Works (Mac) | Versions through 2.0
Mosaic Twin | Version 2.5
Novell Perfect Works | Version 2.0
QuattroPro for DOS | Versions through 5.0
QuattroPro for Windows | Versions through 2002
PFS:Professional Plan | Version 1.0
SuperCalc 5 | Version 4.0
SmartWare II | Version 1.02
VP Planner 3D | Version 1.0

Database Formats

Format | Version
Access | Versions through 2.0
dBASE | Versions through 5.0
DataEase | Version 4.x
dBXL | Version 1.3
Enable | Versions 3.0, 4.0 and 4.5
First Choice | Versions through 3.0
FoxBase | Version 2.1
Framework | Version 3.0
Microsoft Windows Works | Versions through 4.0
Microsoft Works (DOS) | Versions through 2.0
Microsoft Works (Mac) | Versions through 2.0
Paradox (DOS) | Versions through 4.0
Paradox (Windows) | Versions through 1.0
Personal R:BASE | Version 1.0
R:BASE 5000 | Versions through 3.1
R:BASE System V | Version 1.0
Reflex | Version 2.0
Q & A | Versions through 2.0
SmartWare II | Version 1.02

Display Formats

Format | Version
PDF – Portable Document Format | Acrobat Versions 2.1, 3.0, 4.0, and 5.0 including Japanese PDF.

Presentation Formats

Format | Version
Corel Presentations | Versions 8.0, 9.0 and 2002
Novell Presentations | Versions 3.0 and 7.0
Harvard Graphics for DOS | Versions 2.x & 3.x
Harvard Graphics | Windows versions
Freelance 96 | Freelance 96
Freelance for Windows | SmartSuite 97, Millennium, and Millennium 9.6
Freelance for Windows | Versions 1.0 and 2.0
Freelance for OS/2 | Versions through 2.0
Microsoft PowerPoint for Windows | Versions through 7.0
Microsoft PowerPoint 97 | PowerPoint 97
Microsoft PowerPoint 2000 | PowerPoint 2000
Microsoft PowerPoint 2002 (Office XP) | PowerPoint 2002
Microsoft PowerPoint for Macintosh | Versions 4.0 and 98

Standard Graphic Formats

The following table lists the graphic formats that the INSO filter recognizes. This means that indexing a text column that contains any of these formats produces no error. As such, it is safe for the column to contain any of these formats.

Note:
The INSO filter cannot extract textual information from graphics.

Format | Version
Binary Group 3 Fax | All versions
BMP (including RLE, ICO, CUR & OS/2 DIB) | Windows
CALS Raster | Types I and II
CDR (if TIFF image is embedded in it) | Corel Draw versions 2.0 – 9.0
CGM – Computer Graphics Metafile | ANSI, CALS, NIST, Version 3.0
DCX (multi-page PCX) | Microsoft Fax
DRW – Micrografx Designer | Version 3.1
DRW – Micrografx Draw | Version 4.0
DXF (Binary and ASCII) AutoCAD Drawing Interchange Format | Versions through 14
EMF | Windows Enhanced Metafile
EPS – Encapsulated PostScript | If TIFF image is embedded in it
FPX – Kodak Flash Pix | No specific version
GIF – Graphics Interchange Format | CompuServe
GP4 – Group 4 CALS format | Types I and II
HPGL – Hewlett Packard Graphics Language | Version 2.0
IMG – GEM Paint | No specific version
JFIF (JPEG not in TIFF) | All versions
JPEG | All versions
Novell Perfect Works (Draw) | Novell version 2.0
PBM – Portable Bitmap | No specific version
PCD – Kodak Photo CD | Version 1.0
PCX Bitmap | PC Paintbrush
PGM – Portable Graymap | No specific version
PIC | Lotus 1-2-3 Picture File Format – no specific version
PICT1 & PICT2 (Raster) | Macintosh Standard
PNG – Portable Network Graphics Internet Format | Version 1.0
PNTG | MacPaint
PPM – Portable Pixmap | No specific version
Progressive JPEG | No specific version
PSP – Paintshop Pro (NT on Intel only) | Versions 5.0 and 5.0.1
SDW | Ami Draw
Snapshot (Lotus) | All versions
SRS – Sun Raster File Format | No specific version
Targa | Truevision
TIFF | Versions through 6
TIFF CCITT Group 3 & 4 | Fax Systems
VISIO | Visio 4 (Page Preview only), 5, 2000, 2002
WBMP | No specific version
WMF | Windows Metafile
WordPerfect Graphics [WPG and WPG2] | Versions through 2.0
XBM – X-Windows Bitmap | X10 compatible
XPM – X-Windows Pixmap | X10 compatible
XWD – X-Windows Dump | X10 compatible

Other

Format | Version
Executable (EXE, DLL) | No specific version
Executable for Windows NT | No specific version
Microsoft Project (Text only) | Project 98
MSG (Text only) | Microsoft Outlook mail format
vCard Electronic Business Card | Versit version 2.1
WML | Compatible with version 5.2

Unsupported Formats

Password-protected documents and documents with password-protected content are not supported by the Inso filter.
