
August 28, 2007

How does Enterprise Search really work?

Here’s an illustration from CMS Watch that we use in the new AIIM Information Organization & Access (IOA) Certificate program that shows the different search subsystems and how they work together.

First is content indexing. The index is built by crawling directories and websites and by extracting content from databases and other repositories. This has to be done on a regular basis, so when one of those repositories is updated, the search engine needs some procedure for going back in to find and index the updated content.
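As a rough illustration of that "go back and index what changed" procedure, here is a minimal Python sketch. It assumes change detection by content hash, which is only one possible approach; real crawlers may rely on timestamps or repository notifications instead. The `repository` dictionary and the function names are hypothetical stand-ins, not part of any particular search product.

```python
import hashlib


def fingerprint(text):
    """Return a stable hash of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_changed(repository, seen):
    """Return ids of documents that are new or changed since the last crawl.

    `repository` maps document id -> current text (a hypothetical stand-in
    for a directory, website, or database). `seen` maps document id -> the
    fingerprint recorded at the previous crawl and is updated in place.
    """
    changed = []
    for doc_id, text in repository.items():
        digest = fingerprint(text)
        if seen.get(doc_id) != digest:
            changed.append(doc_id)
            seen[doc_id] = digest
    return changed


repository = {"policy.txt": "Travel policy v1", "faq.txt": "Search FAQ"}
seen = {}

first_pass = find_changed(repository, seen)    # first crawl: everything is new
repository["policy.txt"] = "Travel policy v2"  # someone updates a document
second_pass = find_changed(repository, seen)   # next crawl: only the update
```

Only the documents returned by `find_changed` would need to be re-indexed on each scheduled crawl.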

Once the engine has gathered all that content, it creates a searchable index of it. Often there is other value-added processing as well, such as metadata extraction and auto-summarization. What exactly does that mean?

Well, many search tools will take the collection and group documents together into categories. Those categories can in turn be searched, so a user can get results based on how the particular search engine has categorized the content. Once this index exists, the engine can accept queries: a searcher types in some sort of query describing what they’re looking for. A query is not necessarily in question form; it’s just a term, or whatever you’re looking for, typed into the search box.
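To make the idea of a searchable index a little more concrete, here is a small Python sketch of an inverted index, a common structure that maps each term to the documents containing it. The sample documents, the simple tokenizer, and the "match every term" rule are assumptions for illustration; commercial engines do far more (stemming, metadata fields, categorization).

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Build an inverted index: term -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample collection.
documents = {
    "doc1": "Records retention policy for email",
    "doc2": "Email archiving guide",
    "doc3": "Travel expense policy",
}
index = build_index(documents)
hits = search(index, "email policy")  # the query is just terms, not a question
```

Here the query "email policy" matches only `doc1`, the one document that contains both terms.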

Then there’s an engine that processes this query. The query passes over the index and finds the documents that match that particular term or subject. The matching documents then go through a results processor, which sorts them by relevance, clusters them based on the categorization, or applies some other logic, for example promoting best bets or recommended content. How the results are processed once the query returns them is really up to you.
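The results-processing step might look something like the following Python sketch. It assumes a deliberately naive relevance score (how often the query terms occur in the document) and a hand-maintained best-bets table keyed by query; both are illustrative choices, and the names `score`, `process_results`, and `best_bets` are hypothetical.

```python
def score(text, query_terms):
    """Naive relevance: count occurrences of the query terms in the text."""
    words = text.lower().split()
    return sum(words.count(term) for term in query_terms)


def process_results(matches, query, best_bets):
    """Order matching documents: editor-chosen best bets first, then by score.

    `matches` maps document id -> text for documents the query matched.
    `best_bets` maps a query string -> list of recommended document ids.
    """
    terms = query.lower().split()
    ranked = sorted(
        matches,
        key=lambda doc_id: score(matches[doc_id], terms),
        reverse=True,
    )
    promoted = list(best_bets.get(query.lower(), []))
    return promoted + [doc_id for doc_id in ranked if doc_id not in promoted]


# Hypothetical matches for the query "vacation".
matches = {
    "doc1": "vacation policy and vacation request form",
    "doc2": "vacation calendar",
    "doc3": "hr handbook covering vacation vacation vacation rules",
}
best_bets = {"vacation": ["doc2"]}

ordered = process_results(matches, "vacation", best_bets)
```

In this example `doc2` is listed first because it is a best bet, even though it scores lowest on term frequency; the remaining documents follow in relevance order.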

Lastly, of course, we have the formatting. That’s the results page you’re used to seeing. It formats the results, usually in some sort of template, and there too you have a lot of flexibility as to how you’d like them presented. Every one of these subsystems can be tweaked to accommodate your particular information organization and access needs. The part at the top, around content indexing, is where you’re going to be particularly occupied with your information organization: how your content is organized is going to affect how well the search tool can go through the collection and create that index.
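As a small illustration of the formatting step, here is a Python sketch that renders results through a template using the standard library's `string.Template`. The template layout, the field names, and the example.com URLs are all assumptions made up for this example; an actual search product would have its own templating mechanism.

```python
from string import Template

# Hypothetical layout for one entry on the results page.
RESULT_TEMPLATE = Template("$rank. $title\n   $url\n   $snippet")


def format_results(results):
    """Render a list of result dictionaries as a plain-text results page."""
    entries = []
    for rank, result in enumerate(results, start=1):
        entries.append(
            RESULT_TEMPLATE.substitute(
                rank=rank,
                title=result["title"],
                url=result["url"],
                snippet=result["snippet"],
            )
        )
    return "\n\n".join(entries)


# Hypothetical results, already ordered by the processing step.
results = [
    {
        "title": "Travel policy",
        "url": "http://intranet.example.com/travel",
        "snippet": "Rules for booking business travel.",
    },
    {
        "title": "Expense FAQ",
        "url": "http://intranet.example.com/expenses",
        "snippet": "How to file an expense report.",
    },
]
page = format_results(results)
```

Changing the presentation then means editing `RESULT_TEMPLATE` rather than the search logic, which is the kind of flexibility the formatting subsystem is meant to give you.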

The second part is customizing your access experience, if you wish to do so. The search tool will allow you to specify what kinds of queries you want to accept, what kinds of documents you want to return based on those queries, and then you have lots of options as to how you want them processed and how you want them presented.

The new AIIM IOA training program covers how to optimize Enterprise Search, and I recommend that you sign up for it if you have invested, or plan to invest, in Enterprise Search technologies. For more information, visit www.aiim.org/training

By Atle Skjekkeland.
