Technical Specifications

World Lexer Features

Like MULTI_LEXER, the WORLD_LEXER lexer enables you to index documents that contain different languages; however, it automatically detects the languages of a document and so does not require you to create a language column in the base table.

WORLD_LEXER processes most languages whose characters are defined as part of Unicode 4.0. For WORLD_LEXER to be effective, documents with multiple languages must use the AL32UTF8 or UTF8 Oracle character set encoding (including supplementary, or "surrogate-pair," characters).

Table D-2 and Table D-3 show the languages supported by WORLD_LEXER. Note: this list may change as the Unicode standard changes, and in any case should not be considered exhaustive. (Languages are grouped by Unicode writing system, not by natural language groupings.)

Table D-2 Languages Supported by the World Lexer (Space-separated)

 Language Group

 Languages Include

 Arabic

 Arabic, Farsi, Kurdish, Pashto, Sindhi, Urdu

 Armenian

 Armenian

 Bengali

 Assamese, Bengali

 Bopomofo

 Hakka Chinese, Minnan Chinese

 Cyrillic

 Over 50 languages, including Belorussian, Bulgarian, Macedonian, Moldavian, Russian, Serbian, Serbo-Croatian, Ukrainian

 Devanagari

 Bhojpuri, Bihari, Hindi, Kashmiri, Marathi, Nepali, Pali, Sanskrit

 Ethiopic

 Amharic, Ge'ez, Tigrinya, Tigre

 Georgian

 Georgian

 Greek

 Greek

 Gujarati

 Gujarati, Kacchi

 Gurmukhi

 Punjabi

 Hebrew

 Hebrew, Ladino, Yiddish

 Kaganga

 Redjang

 Kannada

 Kanarese, Kannada

 Korean

 Korean, Hanja Hangul

 Latin

 Afrikaans, Albanian, Basque, Breton, Catalan, Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Faeroese, Fijian, Finnish, Flemish, French,  Frisian, German, Hawaiian, Hungarian, Icelandic, Indonesian, Irish, Italian, Lappish, Classic Latin, Latvian, Lithuanian, Malay, Maltese, Pinyin Mandarin,  Maori, Norwegian, Polish, Portuguese, Provencal, Romanian, Rumanian, Samoan, Scottish Gaelic, Slovak, Slovene, Slovenian, Sorbian, Spanish,  Swahili, Swedish, Tagalog, Turkish, Vietnamese, Welsh

 Malayalam

 Malayalam

 Mongolian

 Mongolian

 Oriya

 Oriya

 Sinhalese,  Sinhala

 Pali, Sinhalese

 Syriac

 Aramaic, Syriac

 Tamil

 Tamil

 Telugu

 Telugu

 Thaana

 Dhiveli, Divehi, Maldivian
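The grouping by writing system used in the table above can be illustrated with a small sketch. This is not how WORLD_LEXER is implemented; it is a rough heuristic that takes the first word of each character's Unicode name as its script. The function name `script_groups` is hypothetical.

```python
import unicodedata
from collections import Counter


def script_groups(text):
    """Count alphabetic characters per writing system.

    Heuristic only: the first word of the Unicode character name
    (for example LATIN, CYRILLIC, GREEK) stands in for the script.
    """
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, spaces, punctuation, combining marks
        name = unicodedata.name(ch, "")
        if name:
            counts[name.split()[0]] += 1
    return counts


if __name__ == "__main__":
    # A document mixing three writing systems.
    print(script_groups("Hello Привет Γεια"))
```

A mixed-language document yields one counter entry per writing system, which mirrors why the table is organized by script and not by natural language.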

 

 

Operators

ABOUT Operator

Use the ABOUT operator to query on concepts. The system looks up concept information in the theme component of the index.

This feature is supported for English and French with CONTEXT indexes only.

Fuzzy Operator

This operator enables you to search for words that have similar spelling to a specified word. Text supports fuzzy matching for English, German, Italian, Dutch, Spanish, Japanese, Optical Character Recognition (OCR), and automatic language detection.

Stem Operator

This operator enables you to search for words that have the same root as the specified term. For example, a stem of $sing expands into a query on the words sang, sung, sing. The Text stemmer supports the following languages: English, French, Spanish, Italian, German, Japanese, and Dutch.

Supplied Stop Lists

A stoplist is a list of words that do not get indexed. These are usually common words in a language, such as this, that, and can in English.

Text provides a default stoplist for English, Chinese (traditional and simplified), Danish, Dutch, Finnish, French, German, Italian, Portuguese, Spanish, and Swedish.  lists the stoplists for various languages.
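The effect of a stoplist can be sketched in a few lines. This is a toy illustration, not the product's indexing code: the three-word `STOPLIST` comes from the example words above and is not a supplied stoplist, and `indexable_tokens` is a hypothetical helper.

```python
import re

# Toy stoplist built from the example words in the text (an assumption,
# not one of the supplied default stoplists).
STOPLIST = {"this", "that", "can"}


def indexable_tokens(text, stoplist=STOPLIST):
    """Return lowercase word tokens, dropping any word in the stoplist."""
    words = re.findall(r"\w+", text.lower())
    return [w for w in words if w not in stoplist]


if __name__ == "__main__":
    print(indexable_tokens("This can index that document"))
```

Only the words that survive the filter would be candidates for indexing; a query on a stopword alone would find nothing in such an index.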

Knowledge Base

A Text knowledge base is a hierarchical tree of concepts used for theme indexing, ABOUT queries, and deriving themes for document services.

Text supplies knowledge bases in English and French only.

Knowledge Base Extension

You can extend theme functionality to languages other than English or French by loading your own knowledge base for any single-byte, whitespace-delimited language, including Spanish.

Multi-Lingual Features Matrix

The following table summarizes the multilingual features for the supported languages.

Multilingual Features for Supported Languages

 LANGUAGE             ALTERNATE SPELLING  FUZZY MATCHING  LANGUAGE SPECIFIC LEXER  DEFAULT STOP LIST  STEMMING

 ENGLISH              N/A                 Yes             Yes                      Yes                Yes
 GERMAN               Yes                 Yes             Yes                      Yes                Yes
 JAPANESE             N/A                 Yes             Yes                      No                 Yes
 FRENCH               N/A                 Yes             Yes                      Yes                Yes
 SPANISH              N/A                 Yes             Yes                      Yes                Yes
 ITALIAN              N/A                 Yes             Yes                      Yes                Yes
 DUTCH                N/A                 Yes             Yes                      Yes                Yes
 PORTUGUESE           N/A                 Yes             Yes                      Yes                No
 KOREAN               N/A                 No              Yes                      No                 No
 SIMPLIFIED CHINESE   N/A                 No              Yes                      Yes                No
 TRADITIONAL CHINESE  N/A                 No              Yes                      Yes                No
 DANISH               Yes                 No              Yes                      No                 No
 SWEDISH              Yes                 No              Yes                      Yes                No
 FINNISH              N/A                 No              Yes                      No                 No

 

 

 

Supported File Types

 

Format

Version

Adobe FrameMaker (MIF)

Versions 3.0, 4.0, 5.0, and 6.0 and Japanese 3.0, 4.0, 5.0, and 6.0 (text only)

ANSI Text

7 and 8 bit

ASCII Text

7 and 8 bit

DEC WPS Plus (DX)

Versions through 3.1

DEC WPS Plus (WPL)

Versions through 4.1

DisplayWrite 2 and 3 (TXT)

All versions

EBCDIC

All versions

Enable

Versions 3.0, 4.0, and 4.5

First Choice

Versions through 3.0

Framework

Version 3.0

Hangul

Versions 97, 2002, and 2005

IBM FFT

All versions

IBM Revisable Form Text

All versions

IBM Writing Assistant

Version 1.01

Just System Ichitaro

Versions 4.x through 6.x, 8.x through 13.x and 2004

JustWrite

Versions through 3.0

Legacy

Version 1.1

Lotus AMI/AMI Professional

Version 3.1

Lotus Manuscript

Version 2.0

Lotus Word Pro (non-Windows)

Versions SmartSuite 97, Millennium, and Millennium 9.6 (text only)

Lotus Word Pro (Windows)

Versions SmartSuite 96, 97, and Millennium and Millennium 9.6

MacWrite II

Version 1.1

MASS11

Versions through 8.0

Microsoft Rich Text Format (RTF)

All versions

Microsoft Word (DOS)

Versions through 6.0

Microsoft Word (Mac)

Versions 4.0 - 2004

Microsoft Word (Windows)

Versions through 2007

Microsoft WordPad

All versions

Microsoft Works (DOS)

Versions through 2.0

Microsoft Works (Mac)

Versions through 2.0

Microsoft Works (Windows)

Versions through 4.0

Microsoft Windows Write

Versions through 3.0

MultiMate

Versions through 4.0

Navy DIF

All versions

Nota Bene

Version 3.0

Novell Perfect Works

Version 2.0

Novell/Corel WordPerfect (DOS)

Versions through 6.1

Novell/Corel WordPerfect (Mac)

Versions 1.02 through 3.0

Novell/Corel WordPerfect (Windows)

Versions through 12.0

Office Writer

Versions 4.0 - 6.0

OpenOffice Writer (Windows and UNIX)

OpenOffice versions 1.1 and 2.0

PC-File Letter

Versions through 5.0

PC-File+ Letter

Versions through 3.0

PFS:Write

Versions A, B, and C

Professional Write Plus (Windows)

Version 1.0

Q&A (DOS)

Version 2.0

Q&A Write (Windows)

Version 3.0

Samna Word

Versions through Samna Word IV+

Signature

Version 1.0

SmartWare II

Version 1.02

Sprint

Versions through 1.0

StarOffice Writer

Version 5.2 (text only) and 6.x through 8.x

Total Word

Version 1.2

Unicode Text

All versions

UTF-8

All versions

Volkswriter 3 and 4

Versions through 1.0

Wang PC (IWP)

Versions through 2.6

WordMARC

Versions through Composer Plus

WordStar (Windows)

Version 1.0

WordStar 2000 (DOS)

Versions through 3.0

XyWrite

Versions through III Plus

 

Spreadsheet Formats

Format

Version

Enable

Versions 3.0, 4.0, and 4.5

First Choice

Versions through 3.0

Framework

Version 3.0

Lotus 1-2-3 (DOS & Windows)

Versions through 5.0

Lotus 1-2-3 (OS/2)

Versions through 2.0

Lotus 1-2-3 Charts (DOS & Windows)

Versions through 5.0

Lotus 1-2-3 for SmartSuite

Versions 97 - Millennium 9.6

Lotus Symphony

Versions 1.0, 1.1, and 2.0

Mac Works

Version 2.0

Microsoft Excel Charts

Versions 2.x - 7.0

Microsoft Excel (Mac)

Versions 3.0 - 4.0, 98, 2001, 2002, 2004, and v.X

Microsoft Excel (Windows)

Versions 2.2 through 2007

Microsoft Multiplan

Version 4.0

Microsoft Works (Windows)

Versions through 4.0

Microsoft Works (DOS)

Versions through 2.0

Microsoft Works (Mac)

Versions through 2.0

Mosaic Twin

Version 2.5

Novell Perfect Works

Version 2.0

PFS:Professional Plan

Version 1.0

Quattro Pro (DOS)

Versions through 5.0 (text only)

Quattro Pro (Windows)

Versions through 12.0 (text only)

SmartWare II

Version 1.02

StarOffice/OpenOffice Calc (Windows and UNIX)

StarOffice versions 5.2 (text only) through 8.x and OpenOffice versions 1.1 and 2.0

SuperCalc 5

Version 4.0

VP Planner 3D

Version 1.0

 

Presentation Formats

Format

Version

Corel/Novell Presentations

Versions through 12.0

Harvard Graphics (DOS)

Versions 2.x and 3.x

Harvard Graphics (Windows)

Windows versions

Freelance (Windows)

Versions through Millennium 9.6

Freelance (OS/2)

Versions through 2.0

Microsoft PowerPoint (Windows)

Versions 3.0 through 2007

Microsoft PowerPoint (Mac)

Versions 4.0 through v.X

StarOffice/OpenOffice Impress (Windows and UNIX)

StarOffice versions 5.2 (text only) and 6.x through 8.x (full support) and OpenOffice versions 1.1 and 2.0 (text only)

 


 

Database Formats

Format

Version

Access

Versions through 2.0

dBASE

Versions through 5.0

DataEase

Version 4.x

dBXL

Version 1.3

Enable

Versions 3.0, 4.0, and 4.5

First Choice

Versions through 3.0

FoxBase

Version 2.1

Framework

Version 3.0

Microsoft Works (Windows)

Versions through 4.0

Microsoft Works (DOS)

Versions through 2.0

Microsoft Works (Mac)

Versions through 2.0

Paradox (DOS)

Versions through 4.0

Paradox (Windows)

Versions through 1.0

Personal R:BASE

Version 1.0

R:BASE 5000

Versions through 3.1

R:BASE System V

Version 1.0

Reflex

Version 2.0

Q & A

Versions through 2.0

SmartWare II

Version 1.02

 

Archive File Format

When filtering an archive file, all the contents of the files inside the archive will be exported to a single output file. This will also include the contents of all subfolders and files inside the archive file.
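The flattening behavior described above (every file in the archive, including files in subfolders, ends up in one output) can be sketched with Python's standard `zipfile` module. This is an illustration of the idea only, not the filter's implementation; `flatten_zip` and `demo_archive` are hypothetical names, and the sketch assumes the members are UTF-8 text.

```python
import io
import zipfile


def flatten_zip(data: bytes) -> str:
    """Concatenate the text of every file in a ZIP archive into one string."""
    parts = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries carry no content
            parts.append(zf.read(info).decode("utf-8", errors="replace"))
    return "\n".join(parts)


def demo_archive() -> bytes:
    """Build a small in-memory archive with a top-level file and a subfolder."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.txt", "top-level file")
        zf.writestr("sub/b.txt", "file in a subfolder")
    return buf.getvalue()


if __name__ == "__main__":
    print(flatten_zip(demo_archive()))
```

Because the output is a single stream, a match on text from any member is attributed to the archive as a whole, not to the individual file inside it.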

Supported Archive File Formats

Format

Version

GZIP

 

Microsoft Binder

Versions 7.0 - 97 (conversion of files contained in the Binder File is supported only on Windows)

UUEncode

 

UNIX Compress

 

UNIX Tar

 

ZIP

PKWARE versions through 2.04g

LZA Self-Extracting Compress

 

LZH Compress

 

 


 

Email Formats

Format

Version

Microsoft Outlook Folder (PST)

Microsoft Outlook Folder and Microsoft Outlook Offline Folder files versions 97, 98, 2000, 2002, 2003, and 2007

Microsoft Outlook Message (MSG)

Microsoft Outlook Message and Microsoft Outlook Form Template versions 97, 98, 2000, 2002, 2003, and 2007

MIME

MIME-encoded mail messages.

 

MIME Support Notes

The following formats are supported:

  • MIME formats
    • EML
    • MHT (Web Archive)
    • NWS (Newsgroup single-part and multi-part)
    • Simple Text Mail (defined in RFC 2822)
  • TNEF format
  • MIME encodings, including
    • base64 (defined in RFC 1521)
    • binary (defined in RFC 1521)
    • binhex (defined in RFC 1741)
    • btoa
    • quoted-printable (defined in RFC 1521)
    • utf-7 (defined in RFC 2152)
    • uue
    • xxe
    • yenc

In addition, the body of a message can be encoded in several ways. The following encodings are supported:

  • HTML
  • RTF
  • TNEF
  • Text/enriched (defined in RFC 1523)
  • Text/richtext (defined in RFC 1341)
  • Embedded mail message (defined in RFC 822) - this is handled as a link to a new message

The attachments of a MIME message can be stored in many formats.
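To make the transfer encodings listed above concrete, the following sketch decodes a base64-encoded text body using Python's standard `email` package. It only illustrates what decoding a MIME body involves; it is not the filter's code. The sample message, the `example.com` address, and the helper name `body_text` are made up for the example.

```python
import base64
from email import message_from_string

# Build a tiny single-part message whose body is base64-encoded.
ENCODED = base64.b64encode("Hello, world!".encode("utf-8")).decode("ascii")
RAW = (
    "From: sender@example.com\n"
    "Subject: demo\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + ENCODED + "\n"
)


def body_text(raw: str) -> str:
    """Return the decoded text body of a single-part MIME message."""
    msg = message_from_string(raw)
    # decode=True undoes the Content-Transfer-Encoding (base64 here).
    payload = msg.get_payload(decode=True)
    charset = msg.get_content_charset() or "utf-8"
    return payload.decode(charset, errors="replace")


if __name__ == "__main__":
    print(body_text(RAW))
```

The same call handles quoted-printable bodies, since `get_payload(decode=True)` selects the decoder from the Content-Transfer-Encoding header.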

Other Formats

Format

Version

Executable (EXE, DLL)

 

HTML

Versions through 3.0, with some limitations

Macromedia Flash

Macromedia Flash 6.x, Macromedia Flash 7.x, and Macromedia Flash Lite (text only)

Microsoft Project

Versions 98 - 2003 (text only)

MP3

ID3 information

vCard, vCalendar

Version 2.1

Windows Executable

 

WML

Version 5.2

XML

Text only

Yahoo Instant

 

 

 

 


 

Graphic Format

 

The following table lists the graphic formats that the AUTO_FILTER filter recognizes. This means that indexing a text column that contains any of these formats produces no error. As such, it is safe for the column to contain any of these formats.

Formats are categorized as either embedded graphics or standalone graphics. Embedded graphics are inserted or referenced within a document.

Note:

The AUTO_FILTER filter cannot extract textual information from graphics.

 

Supported Graphic Formats

Format

Version

Adobe Photoshop (PSD)

Version 4.0

Adobe Illustrator

Versions 7.0 and 9.0

Adobe FrameMaker graphics (FMV)

Vector/raster through 5.0

Adobe Acrobat (PDF)

Versions 1.0, 2.1, 3.0, 4.0, 5.0, 6.0, and 7.0 (including Japanese PDF)

Ami Draw (SDW)

Ami Draw

AutoCAD Interchange and Native Drawing formats (DXF and DWG)

AutoCAD Drawing Versions 2.5 - 2.6, 9.0-14.0, 2000i and 2002

AutoShade Rendering (RND)

Version 2.0

Binary Group 3 Fax

All versions

Bitmap (BMP, RLE, ICO, CUR, OS/2 DIB, and WARP)

All versions

CALS Raster (GP4)

Type I and Type II

Corel Clipart format (CMX)

Versions 5 through 6

Corel Draw (CDR)

Versions 3.x - 8.x

Corel Draw (CDR with TIFF header)

Versions 2.x - 9.x

Computer Graphics Metafile (CGM)

ANSI, CALS NIST version 3.0

Encapsulated PostScript (EPS)

TIFF header only

GEM Paint (IMG)

All versions

Graphics Environment Mgr (GEM)

Bitmap and vector

Graphics Interchange Format (GIF)

All versions

Hewlett Packard Graphics Language (HPGL)

Version 2.0

IBM Graphics Data Format (GDF)

Version 1.0

IBM Picture Interchange Format (PIF)

Version 1.0

Initial Graphics Exchange Spec (IGES)

Version 5.1

JBIG2

JBIG2 graphic embeddings in PDF files

JFIF (JPEG not in TIFF format)

All versions

JPEG (including EXIF)

All versions

Kodak Flash Pix (FPX)

All versions

Kodak Photo CD (PCD)

Version 1.0

Lotus PIC

All versions

Lotus Snapshot

All versions

Macintosh PICT1 and PICT2

Bitmap only

MacPaint (PNTG)

All versions

Micrografx Draw (DRW)

Versions through 4.0

Micrografx Designer (DRW)

Versions through 3.1

Micrografx Designer (DFS)

Windows 95, version 6.0

Novell PerfectWorks (Draw)

Version 2.0

OS/2 PM Metafile (MET)

Version 3.0

Paint Shop Pro 6 (PSP)

Windows only, versions 5.0 - 6.0

PC Paintbrush (PCX and DCX)

All versions

Portable Bitmap (PBM)

All versions

Portable Graymap (PGM)

No specific version

Portable Network Graphics (PNG)

Version 1.0

Portable Pixmap (PPM)

No specific version

PostScript (PS)

Levels 1-2

Progressive JPEG

No specific version

Sun Raster (SRS)

No specific version

StarOffice/OpenOffice Draw for Windows and UNIX

StarOffice versions 5.2 (text only) through 8.x and OpenOffice versions 1.1 and 2.0

TIFF

Versions through 6

TIFF CCITT Group 3 and 4

Versions through 6

Truevision TGA (TARGA)

Version 2

Visio (preview)

Version 4

Visio

Versions 5, 2000, 2002, and 2003

WBMP

No specific version

Windows Enhanced Metafile (EMF)

No specific version

Windows Metafile (WMF)

No specific version

WordPerfect Graphics (WPG and WPG2)

Versions through 2.0

X-Windows Bitmap (XBM)

x10 compatible

X-Windows Dump (XWD)

x10 compatible

X-Windows Pixmap (XPM)

x10 compatible

 

Graphics Formats Limitations

AutoCAD drawing files are not supported on IBM AIX.
