Information technologies have long settled in the corporate sector. It is rare for a company not to have a well-organized local network and various specialized software that provides proper control of information flow, document storage and information structuring, with convenient reports on the work process.

Information Diversity

Any company's information can be roughly divided into three types, depending on its virtual or physical location and its use in the work process. The first is the files on the user's disk, plus e-mail and the logs of instant messaging programs such as ICQ or MSN Messenger. The second is corporate information: documents of different file types and e-mail (MS Exchange, for instance), or a file archive on the company server. The third is the data in various information systems: DMS, PDM, CRM, etc. This may include everything from system objects kept in a file archive or in a database such as MS SQL to external electronic messages and documents used in the work of the system.

Search?

Given such a variety of information, the problem of information search has become a high priority. Common problems with information search are the sheer volume of data, the lack of proper organization of data and the wide variety of file types containing the needed data. As a result, the demand for capable search and information processing tools keeps growing. However, besides search management (whether for a file archive, corporate e-mail or a document management system), there are a number of other requirements that corporate software has to comply with. These obviously include working with local networks, which implies a client-server software architecture; compliance with information security policies and user access management; and working not instead of an already installed system but alongside it, without disrupting established business processes. Let us look more carefully at these requirements.

Critical requirements for corporate software

The ability to work with the local network implies a client-server software architecture, flexible network policy settings, support for different operating systems, etc. One of the latest trends is a web interface for the client part of corporate software: it removes the problem of setting up additional workstations when the information structure is extended. This version may be more expensive because with a web interface the number of workstations is unlimited. Yet the choice between a web interface and an independent client program depends solely on the needs and problems the purchased software is meant to solve.

The next critical factor for search software working within a company is compliance with information security policies and access management. Any information system should be a structure with clearly defined channels of information exchange, both between users and with the outside world. Thus any corporate software must meet strict information security requirements: user access differentiation, multi-level access to different sorts of information, an authorization system and a flexible way of adjusting security policies depending on the client's request.

Another factor is the ability of corporate software to work with the company's previously installed software products of various types. As already mentioned, information in any organization can be stored in files on disk, in a DBMS or in various information systems (whether a PDM, a CRM or an accounting program). That is exactly why the third feature of any information system is the ability to function not instead of software that already exists in the company but alongside it. This is even more crucial for a corporate search engine, because organizing search across all of the company's information resources is the main purpose of search software.

Search Functions

Besides the listed requirements, which put search systems on the same level as other corporate software, there are also requirements for the functional capacities of this software, that is, for the main functions of the program, which are responsible for the fast and efficient search that is in ever greater demand.

Firstly, the old generation of straight search (simple blind search) and search strictly by document attributes is being replaced by full-text search with prior indexing. It is more convenient because it is faster, even when the search is a dozen times more complicated.

Secondly, there is support for different file formats (both widely used and specialized ones) as well as reliable work with various types of DBMS, information systems, etc. This list should not neglect indispensable tools such as e-mail (The Bat! or MS Exchange, for instance) and instant messaging programs like ICQ or MSN Messenger. Another must-have attribute of a quality program is a set of search features: various types of search (by phrase or by separate words), search that takes stemming and synonyms into account, and so on. And, of course, specifically for the corporate sector with its gigantic volumes of information, high speed (both in data indexing and in the search itself) is not just a want but a need.

Progress: looking for compromise

Now that we are clear on the requirements imposed on corporate search software, the only thing left is to actually find a program that meets them. Obviously, it is impossible to satisfy all needs without exception; there will always be gaps, missing functions and bugs, which will either have to be lived with or covered by add-on programs. Thus we can forget about the ideal. Nothing stands still: what seemed perfect yesterday may very well be discarded before tonight is over.

In general, developments in the field of full-text search are in full bloom these days, with the Internet leading (Google being the evidence) while the corporate sector catches up. These developments are mainly conducted either by companies that have recently grown into popular online search engines or by search companies that started working on this technology 15-20 years ago. Verity, ISYS and dtSearch, which develop corporate search systems, are good examples.

Solutions

Modern search technologies are based on two root processes: indexing of the available information, and query processing followed by display of the results. As for the former, each program creates its own search area. That is, it processes documents and builds an index of them (an organized structure that contains information on the processed data). Later the program uses this index to quickly get the list of documents relating to a query.

Recent tests of software from dtSearch, ISYS, Verity, SearchInform and others have shown impressive capacities. Indexing speed was high (in some search tools it reaches 30 gigabytes an hour), while the index stays small enough not to take up your whole drive (SearchInform's index, for instance, is 15-30% of the volume of the plain text).

Yet the requirements do not stop there. As we have already established, one of the critical requirements is precise and smooth work with the local network. Here the corporate versions of tools such as SearchInform, dtSearch, ISYS and Google can offer a client-server architecture; indexing of files from all accessible folders (or, with the administrator's permission, from all folders) on all the computers in the local network; indexing and subsequent search on all connected network disks; and a user access management system based on NTFS authentication. That way a user can only search the network resources that he has permission to access. Of course, when functioning in a big enterprise certain lapses may occur from time to time, but from the technical point of view there are no complaints.

The third main requirement concerns working not only with information on disks but also with other data sources. The standard packages of SearchInform and Verity, for instance, include the ability to index and search MS Access databases. Connecting this data source is just as simple as, say, working with e-mail messages or mail clients: you only need to select the data source (in this case an MS Access database) and tell the program which fields should be indexed, or simply leave that to the program, and it will index all the fields automatically. The Access example is only a single case; these capabilities are more than enough for an enterprise to organize search across all its information under the management of one program.

Now let us look at search capacities and functions. First of all there is the number of supported file formats: most search engines index standard formats like txt, doc, rtf, html, CHM, OpenOffice, etc.; a few also support multimedia files (audio and video), various specialized programmers' formats, a dozen archive types and the logs of instant messaging programs (MSN Messenger, ICQ, Trillian).

Standard phrase search usually includes search that takes stemming and synonyms into account, fuzzy search (tolerating mistakes), phrase search or search by the separate words that the phrase contains, search by attributes, etc. In practice the main features that should be used are, of course, stemming search and search within results.

However, each search tool has peculiarities that deserve attention. Copernic, for instance, offers an interesting search system where the user can select the type of file (graphics, audio, video, etc.), enter a search query and pick features specific to that file type. For audio files these might be mp3 tag fields (singer, album, date, etc.); for graphics you can choose their size (by extension). Afterwards quite an extensive list of information appears in the result window, and if files of types other than the one you specified also happen to fit the query, you can open them as well by clicking a link.

ISYS Desktop offers templates for creating an index by folder: My Documents, Mail, Specific Folder, Folder with the choice of file types, etc. If you checked Folder with the choice of file types when creating your index, you can choose the file types to index manually (by extension). The program also lets you sort documents by certain criteria (by default they are sorted by relevancy) and look through the files already found by selecting separate folders (especially convenient when the result contains a large number of documents).

A unique feature of dtSearch is sound search, which is quite untypical even for professional search engines. The point is that the program looks for words that sound similar to the query, which is exceptionally useful when searching a database of recorded calls.

SearchInform is known for its search for documents similar in content to the query text, so-called similar search. This type of search is a lot more intelligent than simple phrase search. In practice it helps solve quite a few problems, such as long search sessions (continuously having to pick new keywords, looking over and comparing all the documents already in the company's database to see whether there are duplicates, etc.). Practice shows that combining simple phrase search with similar document search lets you apply full-text search software with greater benefit in information systems from DMS to ERP and PDM.

All in all, tools like dtSearch and ISYS mostly target medium-sized businesses, while SearchInform and Verity find their market in the corporate sector. Copernic does not quite suit the corporate sector and is best put to use on the home PC; its telling name, Desktop Search, places it in the field of desktop search engines. Google is also a player on the market, although its developers do not prioritize the corporate sector and their key area of development is still Internet search.

Thus there are quite a few solutions to choose from when solving today's essential problem of corporate search. Most of the mentioned tools are able to satisfy at least the nominal demands of the corporate user; the choice depends on what it is you are looking for.
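
To make the two root processes named under Solutions more concrete (indexing first, then query processing against the index), here is a minimal sketch in Python. It is an illustration only, not the method of any product mentioned above: the function names, the sample file names and the simple "all words must match" rule are assumptions made for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Indexing step: map every word to the set of documents containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Query step: return the documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data, invented for this illustration.
documents = {
    "report.txt": "Quarterly sales report for the corporate network",
    "memo.doc": "Memo about network access policy",
    "notes.rtf": "Meeting notes on sales targets",
}

index = build_index(documents)
print(sorted(search(index, "network")))
print(sorted(search(index, "sales report")))
```

The documents are read once, when the index is built; every later query only looks up words in the index, which is why full-text search with prior indexing stays fast as the archive grows.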
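
The idea behind similar document search, described above in connection with SearchInform, can also be sketched in a few lines. The example below uses cosine similarity over word counts, a common textbook technique chosen here only for illustration; the article does not say how SearchInform actually computes similarity, and all names and sample texts are assumptions.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count how often each lowercase word occurs in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Return a score from 0.0 (no shared words) to 1.0 (same word profile)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_text, documents):
    """Rank documents by similarity to the query text, best match first."""
    query = word_counts(query_text)
    scored = [
        (cosine_similarity(query, word_counts(text)), name)
        for name, text in documents.items()
    ]
    return sorted(scored, reverse=True)


# Hypothetical sample data, invented for this illustration.
documents = {
    "contract_v1.doc": "supply contract for office equipment and delivery terms",
    "contract_v2.doc": "supply contract for office equipment with updated delivery terms",
    "menu.txt": "lunch menu for the staff canteen",
}

ranked = most_similar(
    "supply contract for office equipment and delivery terms", documents
)
for score, name in ranked:
    print(f"{score:.2f}  {name}")
```

Because the whole query text is compared with each document, a near-duplicate such as a revised contract ranks close to the top without the user having to pick keywords one by one.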