Monday, October 29, 2007

tutorial: advanced lucene.NET usage example

this post applies to Sitecore 5.3.1

After the previous post I wrote about the lucene.net search implementation I've had tons of questions about searching, indexes and almost everything in between, so I thought I'd make another post on the subject. This post, however, uses more of the functionality that's already available and takes another, more advanced, approach to searching and displaying the results.

(a lot of this code is based on/taken from the simplesearch implemented in sitecore)

what you'll end up with is a sublayout somewhat similar to the following screenshot that you can use on your website(s):





lucene.net advanced search sublayout


this tutorial will cover the following steps:

  • create a new custom index that indexes data from the web database based on a certain template and indexes selected fields

  • create a sublayout that uses the index to search and render output

Step 1: Create the index

Add the following to the web.config file within the <indexes> section:

<!-- Custom Web Index (created as an example) -->
<index id="webindex" singleInstance="true" type="Sitecore.Data.Indexing.Index, Sitecore.Kernel">
  <param desc="name">$(id)</param>
  <templates hint="list:AddTemplate">
    <template>Sample Item</template>
  </templates>
  <fields hint="raw:AddField">
    <field>title</field>
    <field storage="unstored">text</field>
  </fields>
</index>

Next, locate the definition for the Web database (within the <databases> section) and add the following to that definition after the proxydataprovider one:


<indexes hint="list:AddIndex">
  <index path="indexes/index[@id='webindex']" />
</indexes>
<Engines.HistoryEngine.Storage>
  <obj type="Sitecore.Data.$(database).$(database)HistoryStorage, Sitecore.$(database)">
    <param desc="connection" ref="connections/$(id)">
    </param>
    <EntryLifeTime>30.00:00:00</EntryLifeTime>
  </obj>
</Engines.HistoryEngine.Storage>

Step 2: Create the sublayout

Create a new sublayout/usercontrol and add the following elements:

  • SearchTextBox - Textbox
  • SearchButton - Button
  • SearchResultsPanel - Panel
  • lblStatus - Label

Now, hook up the click event of the button to do the following (note that "webindex" and "web" define the index name and the database to search in):

AdvancedSearch(SearchTextBox.Text, "webindex", "web");
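
For reference, here's a minimal sketch of what that click handler could look like in the code-behind of the sublayout. It's a fragment that goes inside the code-behind class, next to the AdvancedSearch() method. The handler name SearchButton_Click is an assumption (it's the name used in the previous post) and the Trim() call is an extra that isn't strictly needed:

```csharp
// assumed to be wired to the Click event of SearchButton
protected void SearchButton_Click(object sender, EventArgs e)
{
    // pass the trimmed search text on to the AdvancedSearch() method;
    // "webindex" and "web" are the index name and database set up in step 1
    AdvancedSearch(SearchTextBox.Text.Trim(), "webindex", "web");
}
```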

And here's the code for the AdvancedSearch() method:

/// <summary>
/// Searches for a specified string using the built-in lucene.net engine,
/// with advanced functionality similar to what you see in sitecore when
/// performing a search..
/// </summary>
/// <param name="searchstring">the string to search for</param>
/// <param name="indexname">the name of the index</param>
/// <param name="database">the database to perform the search within</param>
private void AdvancedSearch(string searchstring, string indexname, string database)
{
    // declared outside the try block so it can be closed in the finally block
    IndexSearcher searcher = null;
    try
    {
        // clear output holders..
        this.SearchResultsPanel.Controls.Clear();
        this.lblStatus.Text = "";

        // make sure we don't do unwanted empty searches..
        if (string.IsNullOrEmpty(searchstring))
        {
            this.lblStatus.Text = "please specify your search..";
            return;
        }

        // find the proper culture to use when formatting dates later..
        System.Globalization.CultureInfo culture = Sitecore.Context.Culture;
        if (culture.IsNeutralCulture)
        {
            culture = System.Globalization.CultureInfo.CreateSpecificCulture(culture.Name);
        }

        // timer to use when calculating time taken
        HighResTimer timer = new HighResTimer(true);

        // get the specified index
        Index searchIndex = Sitecore.Configuration.Factory.GetIndex(indexname);
        // get the database to perform the search in..
        Database db = Sitecore.Configuration.Factory.GetDatabase(database);
        // get a designated indexsearcher that exposes more functionality..
        searcher = searchIndex.GetSearcher(db);
        // get a new standard analyzer so we can create a query..
        Analyzer analyzer = new Lucene.Net.Analysis.Standard.StandardAnalyzer();
        Query query = Lucene.Net.QueryParsers.QueryParser.Parse(searchstring, "_content", analyzer);
        // perform the search and get the results back as a Hits list..
        Hits hits = searcher.Search(query);
        // final timer for calculating time taken
        double timeElapsed = timer.Elapsed();

        // output friendly message about how many hits, time taken etc..
        // (the search string is html encoded since the label renders its text as-is)
        this.lblStatus.Text = string.Format(
            Sitecore.Globalization.Translate.Text("Found {0} {1} that matched query '{2}' ({3}{4})"),
            new object[] {
                hits.Length(),
                (hits.Length() == 1) ? Sitecore.Globalization.Translate.Text("document") : Sitecore.Globalization.Translate.Text("documents"),
                System.Web.HttpUtility.HtmlEncode(searchstring),
                timeElapsed.ToString("0.00"),
                Sitecore.Globalization.Translate.Text(" ms") });

        // new stringbuilder that we'll be adding the content to prior to final output
        StringBuilder sb = new StringBuilder();
        // a new highlighter that gives us some abstract text of the item with the hits highlighted
        Highlighter highlighter = new Highlighter(new QueryScorer(query));

        // step through each result and format it before returning it to the client
        for (int i = 0; i < hits.Length(); i++)
        {
            // get the actual item
            Item itm = Index.GetItem(hits.Doc(i), db);
            if (itm != null)
            {
                string retStr = string.Empty;
                // get all the fields of the item..
                Sitecore.Collections.FieldCollection fields = itm.Fields;
                // .. and step through them so we'll be able to show where the hit was found
                for (int j = 0; j < fields.Count; j++)
                {
                    Sitecore.Data.Fields.Field field = fields[j];
                    if (field != null)
                    {
                        string fieldname = field.DisplayName;
                        if (string.IsNullOrEmpty(fieldname))
                        {
                            fieldname = Sitecore.Globalization.Translate.Text("[Unknown field]");
                        }
                        // strip any html tags from the field value before analyzing it
                        string s = StringUtil.RemoveTags(field.Value);
                        TokenStream tokenStream = analyzer.TokenStream(new System.IO.StringReader(s));
                        // use the highlighter to try and get highlighted hit in the text
                        string highlightedText = highlighter.GetBestFragments(tokenStream, s, 3, "...");
                        if (highlightedText.Length > 0)
                        {
                            retStr += "<div><span class=\"scField\">" + fieldname + ":</span> \"" + highlightedText + "\"</div>";
                        }
                        else if (s.IndexOf(searchstring, StringComparison.CurrentCultureIgnoreCase) >= 0)
                        {
                            // no highlighted fragment, but the raw search string occurs in the field:
                            // show the first 64 characters of the field instead
                            retStr += "<div><span class=\"scField\">" + fieldname + ":</span> \"" + StringUtil.Clip(s, 64, true) + "\"</div>";
                        }
                    }
                }
                string updated = itm.Statistics.Updated.ToString("d", culture);
                string nameandversion = itm.Language.CultureInfo.DisplayName + ", " + itm.Version;
                sb.Append("<div style=\"padding:8px 0px 8px 0px\"><a href=\"" + itm.Paths.GetFriendlyUrl(true) + "\" class=\"scResult\">" + Sitecore.Resources.Images.GetImage(itm.Appearance.Icon, 16, 16, "absmiddle", "0px 4px 0px 0px") + itm.DisplayName + "</a><br/>" + retStr + "<div class=\"scResultInfo\">" + itm.Paths.Path + "[" + nameandversion + "] - " + updated + "</div></div>");
            }
            else
            {
                sb.Append("<div class=\"scNotFound\" style=\"padding:8px 0px 8px 0px\">" + Sitecore.Resources.Images.GetImage("Applications/16x16/error.png", 16, 16, "absmiddle", "0px 4px 0px 0px") + Sitecore.Globalization.Translate.Text("Item not found") + "</div>");
            }
        }
        this.SearchResultsPanel.Controls.Add(new LiteralControl(sb.ToString()));
    }
    catch (Exception exception)
    {
        this.SearchResultsPanel.Controls.Add(new LiteralControl(exception.Message));
    }
    finally
    {
        // always release the searcher, even if the search failed half-way
        if (searcher != null)
        {
            searcher.Close();
        }
    }
}
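
The method above needs a handful of using declarations at the top of the code-behind file. The list below is my best guess for Sitecore 5.3.1 and the Lucene.NET version that ships with it. The namespaces for Highlighter/QueryScorer and HighResTimer in particular are assumptions, so adjust them if the compiler complains:

```csharp
using System;                   // Exception, StringComparison
using System.Text;              // StringBuilder
using System.Web.UI;            // LiteralControl
using Lucene.Net.Analysis;      // Analyzer, TokenStream
using Lucene.Net.Highlight;     // Highlighter, QueryScorer (assumed namespace)
using Lucene.Net.Search;        // IndexSearcher, Query, Hits
using Sitecore;                 // StringUtil
using Sitecore.Data;            // Database
using Sitecore.Data.Indexing;   // Index
using Sitecore.Data.Items;      // Item
using Sitecore.Diagnostics;     // HighResTimer (assumed namespace)
```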

If things go wrong: make sure you have set up the index correctly and that the index has been created in the /indexes folder of your installation. To manually trigger a reindex, use the databases option in the sitecore control panel (sitecore menu > control panel).

The full source code for this post and the previous post will be made available here later on; for now you can email or comment if you want the code sent to you.

Feel free to comment or email if you have any ideas or questions.

Regards,

P.

Tuesday, October 23, 2007

Tutorial: implementing lucene.NET search

The following code implements a lucene.NET search routine, ready to run:

C# Code

/// <summary>
/// Searches for a specified string using the built-in lucene.net engine
/// </summary>
/// <param name="searchString">the string to search for</param>
/// <param name="indexName">the name of the index</param>
/// <param name="databaseName">the database to perform the search within</param>
/// <returns>a list of the items found, or null if the search failed</returns>
public List<Item> Search(string searchString, string indexName, string databaseName)
{
    // initially set up the returning results list
    List<Item> results = new List<Item>();
    // make sure string is not empty prior to starting the search
    if (searchString != string.Empty)
    {
        // get the specified index
        Index searchIndex = Sitecore.Configuration.Factory.GetIndex(indexName);
        // allocate a collection of hits..
        Hits hits = null;
        // get the database to perform the search in..
        Database db = Sitecore.Configuration.Factory.GetDatabase(databaseName);
        try
        {
            // run the search..
            hits = searchIndex.Search(searchString, db);
        }
        catch (Exception ex)
        {
            // log error message to the sitecore log file..
            Sitecore.Diagnostics.Log.Error("Custom Search failed with the following message: " + ex.Message, this);
            // .. and return null..
            return null;
        }
        // iterate thru the hits we got from the search
        for (int i = 0; i < hits.Length(); i++)
        {
            // get a document referrer..
            Document document = hits.Doc(i);
            // .. so we can get the id..
            string itemID = document.Get("_docID");
            // .. so we can get a pointer to the item in itself..
            ItemPointer pointer = ItemPointer.Parse(itemID);
            // .. so we can get the actual item..
            Item itm = db.Items[pointer.ItemID, pointer.Language, pointer.Version];
            // .. so we finally can add the item to the returning list
            if (itm != null)
            {
                results.Add(itm);
            }
        }
    }
    // return whatever we found (an empty list for an empty search string)
    return results;
}

Usage example searching for "sample" in the system index in the master database:
System.Collections.Generic.List<Sitecore.Data.Items.Item> searchresults = Search("sample", "system", "master");

Extended usage example

To use it on a site you could, for example, create a sublayout with a simple textbox and button, then hook it up to the routine and you'd be set to go. Something like this (the example assumes the existence of the SearchButton and SearchTextBox controls):

protected void SearchButton_Click(object sender, EventArgs e)
{
    // get search, usually directly from a textbox
    string search = SearchTextBox.Text.Trim();
    // try to get search results from a specified index
    List<Item> searchresults = Search(search, "system", "master");
    // validate existing result prior to moving on
    if (searchresults != null && searchresults.Count > 0)
    {
        // step thru the items we've found in the search
        foreach (Item itm in searchresults)
        {
            // output the search results..
            Response.Write(itm.Name + ": " + itm.Paths.GetFriendlyUrl(true));
        }
    }
}

Other information

Add a reference to:

the Lucene.NET dll

add using declarations:

using Lucene.Net;
using Lucene.Net.Search;
using Lucene.Net.Documents;

Regards,

P.

Thursday, October 04, 2007

Microsoft releases the sourcecode for the .NET Framework Libraries

According to Scott Guthrie, they're releasing the source code of the .NET Framework Libraries when they release .NET 3.5 & Visual Studio 2008.. way cool if you ask me. Further reading: http://weblogs.asp.net/scottgu/archive/2007/10/03/releasing-the-source-code-for-the-net-framework-libraries.aspx

Wednesday, October 03, 2007

Summary of a really great start

It's been quite the hectic couple of days that just passed, but man am I looking forward to more of them! I guess it's official now: as of the 1st of October I'm employed as a Solution Architect @ Sitecore. (insert big -yay! and some clapping)..! I've had a really good time in Copenhagen at Sitecore's HQ, and the things we've talked about and the things I've seen all have one major thing in common: it's gonna be a fantastic future :) My mind's racing a bit right now, but that's the way it should be when you take on challenges like these (or it could just be the way-too-many cups of coffee), and once I've had some time to sleep and go over everything in more detail it looks like it's off to a flying start (seriously) next week, with a very interesting couple of days in Budapest.

I'm really excited about this job and what it brings and proud to join the great people at Sitecore!

Oh, and a big, big 'thanks!' to all the amazing people working at Sitecore that really made me feel welcome when I arrived yesterday!

Right now I'm on the train back to Stockholm, so in some 30 hours or so I've managed to cover quite some distance.. stockholm > copenhagen > malmö > helsingborg > malmö > stockholm..

Expect some more Sitecore-related posts appearing in the near future :)

Regards,

P.

Tuesday, October 02, 2007

The battle for the capital of Scandinavia

some say Copenhagen, some say Stockholm.. two sides but only one may prevail, and I'm on my way to find out what defines the Capital of Scandinavia (well, maybe not, but I like the idea).. it's -way- too early in the morning right now, and considering I've already been up for about 2 hours there must be something seriously wrong with me. I'm on the train from Stockholm to Copenhagen, eager and restless like a little kid, on my way to draw out the details of my new job. yes, I'm starting a new job! tomorrow afternoon I'm gonna make another (better) post about it, right now I'm gonna kick back on the train and maybe even get some (much needed) sleep.. take care, P.