Creating Search Pages with Index Server and .NET
INTRODUCTION
When Index Server was released, it was easy to see that it had potential. A developer could index not only document content but also document properties and meta-data. To create search pages and result pages, you could use fairly simple IDQ and HTX files. With the inception of Active Server Pages (ASP) in Internet Information Server (IIS) 3.0, Index Server 2.0 added server-side helper objects. Finally, an OLE DB provider was added in Index Server 3.0, which shipped with Windows 2000, so that developers could query Index Server in the same manner that they would query a database. It is this technology that makes powerful custom search pages in ASP.NET easy.
Using the Index Server OLE DB provider with ADO.NET allows us to data-bind results to common ASP.NET WebControls like the DataGrid, Repeater, and more. The simple paging features of these WebControls are also an advantage over previous models, in which we had to manage paging the results ourselves with server-side conditional statements and multiple forms on a page to handle either POST or GET requests.
This article will show you the techniques involved to create a query and result page in ASP.NET, to add data-bound WebControls to the page, and to allow for advanced query statements while protecting sensitive content.
Before you continue with this article, you should already be familiar with the basic architecture of Index Server, which can be found in the Platform SDK at MSDN. These features will not be discussed in any detail throughout the article.
CREATE THE SEARCH PAGE
Before designing your search page, you'll first want to consider what document properties and meta-data the user can search and how those results are to be displayed. Typically, the user is allowed to enter a query that searches for all words or any words, or that uses boolean expressions, exact phrases, or even natural language. The user should be allowed to select what type of expression their query is, and "All Words" is usually the default. It is also good to give your user a way to limit their search to a particular scope, such as directories containing individual products or departments. This also gives you the ability to use your search page as a target for a UserControl that could appear at the top of each page and set the scope to match its parent directory. You should also let the user specify how the results are to be sorted and how many appear on a page.
You'll also need to decide how to display the results to the user. Below is a common format and the format that will be used later in this article:
<a href="[VPath]">[DocTitle]</a>
[Characterization]...
<i><a href="[VPath]">[SERVER_NAME][VPath]</a> - Last Modified: [Write]</i>
Knowing what your results will look like helps determine what control to use. You could use a DataList or Repeater, but then you would have to manage paging yourself. Instead, you might choose to use a DataGrid with a TemplateColumn. Using this approach, you can achieve the same result and get the additional paging functionality virtually for free.
INDEX SERVER AND ADO.NET
When Microsoft added an OLE DB provider to Index Server 3.0 in Windows 2000, they provided developers the means to query Index Server using the same ADO techniques they use to query databases like SQL Server, Oracle, Access, and many more. These techniques are similar in ADO.NET, except ADO.NET presents developers with more features and capabilities like DataSets, or disconnected recordsets. While features of DataSets like primary keys, relationships, and identity columns aren't used in this article, they are worth mentioning.
Supporting ADO and ADO.NET also means that we can query Index Server using SQL statements like any other database. If you take a look at the Indexing Service Reference on MSDN, you'll see that Index Server supports views, batched statements, and all the basic SQL commands you probably know. An example of a SQL statement to select the fields from the above layout would look like:
SELECT Rank, VPath, DocTitle, Filename, Characterization, Write
FROM SCOPE('DEEP TRAVERSAL OF "/"')
WHERE NOT CONTAINS(VPath, '"_vti_" OR ".config"')
AND CONTAINS(Contents, '"keyword1" AND "keyword2"')
AND CONTAINS(DocTitle, '"keyword1" AND "keyword2"')
The SCOPE and CONTAINS keywords may be new to you. The SCOPE function allows us to limit our query to a particular directory or directories and to specify whether or not subdirectories are included. Where's the database reference? That's implied by the catalog named in the OLE DB connection string:
Provider=MSIDXS.1;Data Source=Web
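As a rough illustration of how the statement and the connection string fit together in ADO.NET, the sketch below cleans user input, builds a simplified version of the query, and fills a DataSet through the standard System.Data.OleDb classes. The class and method names (IndexSearch, ToTerms, BuildCondition, BuildQuery, RunQuery) are hypothetical and are not taken from the sample source. The sketch assumes a catalog named Web, searches only the Contents column, and removes quote characters in addition to the semicolons that the warning later in this section talks about.

```csharp
using System;
using System.Data;
using System.Data.OleDb;

// Hypothetical helper; none of these names come from the article's sample source.
public static class IndexSearch
{
    // Connection string from the article; "Web" is assumed to be the catalog name.
    private const string ConnectionString = "Provider=MSIDXS.1;Data Source=Web";

    // Splits raw user input into terms after replacing semicolons and quote
    // characters with spaces, so the input cannot end the string literal or
    // start a second, batched statement.
    public static string[] ToTerms(string input)
    {
        if (input == null) return new string[0];
        string cleaned = input.Replace(";", " ").Replace("'", " ").Replace("\"", " ");
        return cleaned.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Joins terms into a search condition such as "keyword1" AND "keyword2".
    public static string BuildCondition(string[] terms, bool allWords)
    {
        string[] quoted = new string[terms.Length];
        for (int i = 0; i < terms.Length; i++)
        {
            quoted[i] = "\"" + terms[i] + "\"";
        }
        return string.Join(allWords ? " AND " : " OR ", quoted);
    }

    // Builds a simplified form of the statement above (Contents only).
    // scopePath is assumed to come from trusted application code, not the user.
    public static string BuildQuery(string scopePath, string condition)
    {
        return "SELECT Rank, VPath, DocTitle, Filename, Characterization, Write "
             + "FROM SCOPE('DEEP TRAVERSAL OF \"" + scopePath + "\"') "
             + "WHERE NOT CONTAINS(VPath, '\"_vti_\" OR \".config\"') "
             + "AND CONTAINS(Contents, '" + condition + "')";
    }

    // Runs the statement and returns the rows in a disconnected DataSet.
    // Fill() opens and closes the connection on its own.
    public static DataSet RunQuery(string sql)
    {
        DataSet results = new DataSet();
        using (OleDbConnection connection = new OleDbConnection(ConnectionString))
        using (OleDbDataAdapter adapter = new OleDbDataAdapter(sql, connection))
        {
            adapter.Fill(results, "Results");
        }
        return results;
    }
}
```

RunQuery only works on a machine where the Indexing Service OLE DB provider and the catalog are available. A caller would typically skip the search when ToTerms returns no terms, since an empty condition would not form a valid statement, and otherwise assign the returned DataSet to a data-bound control's DataSource.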
The CONTAINS and FREETEXT predicates are available in both Index Server 3.0 and SQL Server 2000 (since both use the same full-text providers) and allow a user to query a particular column (or even a table) for a keyword or combination of keywords using a boolean expression. Use the CONTAINS predicate to search for the existence of such keywords or the FREETEXT predicate for natural language searches.
The SQL statement above is the basis for the other queries used in the example code, but you can easily extend this sample to search for documents modified or created after certain dates, articles by certain authors (without extra work, this currently applies only to Office documents), and much more. You can even search for properties of media files or create your own filters (see the IFilter documentation). Index Server will also index your custom META elements. For information about using custom filters, see Using Custom Filters with Indexing Service on MSDN.
Warning: When designing open-ended SQL statements, especially those used on the Internet by anonymous users, always make sure that malicious statements cannot destroy your databases and catalogs. One of the biggest mistakes webmasters and designers make is using statement templates like:
SELECT * FROM Table1 WHERE Field1 =
A condition would then be appended to the statement. So what's wrong with this approach? All it takes is someone to enter a "term" like the following and your day is ruined:
"asdf"; DELETE FROM Table1;
Think this is unlikely? Think again. Most modern databases support meta-data queries, giving users the ability to find out just about anything about the database and its structure. A user with malicious intent could use a couple of queries and potentially delete all data from all your tables or even drop them. So, always design your SQL statement templates with care. Since Index Server supports batched queries separated by semicolons (";"), I stripped out any semicolons so that the statement "DELETE FROM Table1" would become no more than three keywords by which to search. This is a very basic defense, and there are better ones, such as more advanced parsing functions, but this is only an example and those are left as an exercise for you, although this approach should stop most such attempts.
You should also pay attention to the first condition in the WHERE clause of the Index Server query shown earlier. Adding this condition keeps the query from returning results for files or directories whose paths contain the terms listed. Use the suggested terms above and add your own to keep a user from seeing sensitive files or directories like your application's "Web.config" file or FrontPage directories like "_vti_cnf". While anonymous users may not be able to view these files or browse these directories directly, they could otherwise view sensitive information from the characterizations (summaries) in their search results.
BINDING SEARCH RESULTS TO A DATAGRID
Data-bound controls in ASP.NET are very powerful and can be used in many applications. For this example, we'll bind a DataSet from the query discussed above to a DataGrid and use its paging functionality to let users navigate through pages of the search results. From this point on, almost all the functionality of the example is given to us at minimal development cost.
Remember the search result template at the beginning of the article? This is obviously not a columnar template, so how do we achieve that in a DataGrid? A TemplateColumn allows us to bind fields and expressions using HTML as we normally would. To tell the binding container to use a particular expression for binding, we use the <%# %> data-binding expression syntax. Such an expression would look like the following:
<asp:hyperlink runat="server"
NavigateUrl='<%# DataBinder.Eval(Container.DataItem, "VPath")%>'
><%# GetTitle(Container.DataItem)%></asp:hyperlink>
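The GetTitle() helper called in the markup above is not listed in this article, so the following is only a guess at its shape: a minimal sketch, assuming the grid is bound to a DataSet so that each Container.DataItem arrives as a DataRowView. It is written here as a public static method on a hypothetical SearchResultsHelper class so that it stands alone, whereas the article's version is a protected method in the code-behind class.

```csharp
using System;
using System.Data;

// Hypothetical stand-in for the code-behind class; only GetTitle's
// behavior (DocTitle with a Filename fallback) comes from the article.
public class SearchResultsHelper
{
    // Returns DocTitle when it is present and non-blank, otherwise Filename.
    public static string GetTitle(object dataItem)
    {
        // Assumption: the data source is a DataTable/DataSet, so items are DataRowViews.
        DataRowView row = dataItem as DataRowView;
        if (row == null) return string.Empty;

        // "as string" yields null when the column holds DBNull.
        string title = row["DocTitle"] as string;
        if (title != null && title.Trim().Length > 0) return title;

        return (row["Filename"] as string) ?? string.Empty;
    }
}
```

Falling back to the filename matters because many indexed files, plain text files for example, carry no title property, and an empty hyperlink would otherwise be rendered.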
This would display a hyperlink with the document path as the target and the appropriate document title or filename as its text. GetTitle() is a protected method in our code-behind class that determines if DocTitle is available and, if not, returns the Filename instead. As you can see, a binding expression isn't limited to DataBinder.Eval() calls; it can call your own methods as well. See the sample source code for a complete example of the result display format given at the beginning of this article. Feel free to try different layouts or add additional columns, from BoundColumns for a single field to TemplateColumns for custom layouts like the example above. Make sure to set AutoGenerateColumns to false, however; otherwise all selected fields will be output as automatically generated columns in addition to your user-defined columns.
PAGING THE RESULTS
As mentioned before, using the DataGrid data-bound control to display our search results gives us a paging technique virtually for free. Previous solutions in ASP or even with IDQ/HTX files forced us either to use large GET queries, which can be affected by URL length restrictions imposed by browsers, or to have multiple forms contained in our page. ASP.NET only allows a page to have one server-side form, although these previous techniques could be accomplished with client-side forms and page-output statements. This can be tedious, and ASP.NET presents us with better, object-oriented options.
To handle the paging of data, set AllowPaging to true, add a new event handler for the DataGrid.PageIndexChanged event, and enter the code below, replacing dgResultsGrid with whatever control name you use:
private void dgResultsGrid_PageIndexChanged(object source,
    DataGridPageChangedEventArgs e)
{
    this.dgResultsGrid.CurrentPageIndex = e.NewPageIndex;
    this.Search(); // A private method that actually starts the search.
}
When the grid is bound again, it displays rows from the bound source starting at the position given by DataGrid.CurrentPageIndex multiplied by DataGrid.PageSize.
You can change the style of the paging used from a page-numbering system like the example uses to form or text buttons, or even custom navigation controls. See the DataGrid Web Server Control documentation on MSDN for more information.
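To make the page arithmetic concrete, here is a small sketch of the calculation described above, along with a "Results 11-20 of 57" style caption that a results page might show. The PagingMath class, its method names, and the caption format are illustrative assumptions and are not part of the DataGrid API or the sample source.

```csharp
using System;

// Hypothetical helper illustrating the CurrentPageIndex * PageSize arithmetic.
public static class PagingMath
{
    // Zero-based index of the first row shown on the given zero-based page.
    public static int FirstRowIndex(int currentPageIndex, int pageSize)
    {
        return currentPageIndex * pageSize;
    }

    // Builds a one-based, human-readable caption for the rows on a page.
    public static string DescribeRange(int currentPageIndex, int pageSize, int totalRows)
    {
        if (totalRows == 0) return "No results";
        int first = FirstRowIndex(currentPageIndex, pageSize);
        // The last page may be only partly filled.
        int last = Math.Min(first + pageSize, totalRows);
        return "Results " + (first + 1) + "-" + last + " of " + totalRows;
    }
}
```

With a page size of 10 and 57 rows, the second page (index 1) starts at row index 10, and the last page (index 5) holds only rows 51 through 57.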
SUMMARY
Using the ADO.NET techniques you already know and use to retrieve results from a database, you can build fast, powerful search pages without the amount of work that was required before. The OLE DB provider for Index Server offers you great flexibility, along with additional functions and predicates that add more features to your queries. Using easy-to-use data-bound controls with which you're already familiar, like the DataGrid, allows you to display those results quickly and without a lot of code. Combining all three technologies with the simple techniques discussed above and in the sample source code can give your site powerful search capabilities in a short amount of time.