Search inside Lucene in Action

Query parsed to: index fileindex

8.9 : Storing an index in Berkeley DB

starts on page 307 in chapter 8 (Tools and extensions)

... mechanism. An interesting side-effect of having a Lucene index in a database is the transactional...

6.0 : Extending search

starts on page 194

... such as wildcard and fuzzy queries. Custom filters allow information from outside the index to factor...

6.5.4 : Morals of performance testing

starts on page 220 under section 6.5 (Performance testing) in chapter 6 (Extending search)

...In addition to testing whether Lucene can perform acceptably with your environment and data, unit performance testing assists (as does basic JUnit testing) in the design of your code. In this case, you've seen how our original method of indexing dates was less than desirable even though our first unit test succeeded with the right number of results. Only when we tested with more data or with time and load constraints did an issue present itself. We could have swept the data failure under...

7.5 : Indexing a Microsoft Word document

starts on page 248 in chapter 7 (Parsing common document formats)

8.9.2 : Installing DbDirectory

starts on page 309 under section 8.9 (Storing an index in Berkeley DB) in chapter 8 (Tools and extensions)

...Erik had a hard time getting DbDirectory working, primarily because of issues with building and installing Berkeley DB 4.2.52 on Mac OS X. After many emails back and forth with Andi, the problems were resolved, and the index (and unshown searching) example worked. Follow the instructions for obtaining and installing Berkeley DB. Be sure to configure the Berkeley DB build with Java support enabled (./configure --enable-java). You need Berkeley DB's db.jar as well as the DbDirectory (and friends...

5.7.2 : What category?

starts on page 189 under section 5.7 (Leveraging term vectors) in chapter 5 (Advanced search techniques)

...Each book in our index is given a single primary category: For example, this book is categorized as "/technology/computers/programming". The best category placement for a new book may be relatively obvious, or (more likely) several possible categories may seem reasonable. You can use term vectors ... builds category vectors by walking every document in the index and aggregating book subject vectors ... and inverse cosine calculations and may be prohibitive in high-volume indexes....
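
The angle comparison this excerpt alludes to can be sketched in plain Java. This is only an illustration under stated assumptions, not the book's listing: the class `CategoryAngle`, its method names, and the use of plain maps in place of Lucene term vectors are all hypothetical.

```java
import java.util.Map;

// Hypothetical helper: not part of Lucene and not the book's code.
// Term vectors are modeled as plain term -> frequency maps.
final class CategoryAngle {

    // Angle in radians between two sparse term-frequency vectors.
    // A smaller angle means the two vectors are more alike.
    static double angle(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * (double) other;
            }
        }
        double norms = norm(a) * norm(b);
        if (norms == 0.0) {
            return Math.PI / 2; // assumption: an empty vector matches nothing
        }
        // Clamp to [-1, 1] so rounding error cannot push acos out of range.
        double cos = Math.max(-1.0, Math.min(1.0, dot / norms));
        return Math.acos(cos);
    }

    // Euclidean length of a term-frequency vector.
    static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int freq : v.values()) {
            sum += (double) freq * freq;
        }
        return Math.sqrt(sum);
    }

    // Returns the category whose vector forms the smallest angle with the
    // document vector, or null if there are no categories.
    static String bestCategory(Map<String, Map<String, Integer>> categories,
                               Map<String, Integer> doc) {
        String best = null;
        double bestAngle = Double.MAX_VALUE;
        for (Map.Entry<String, Map<String, Integer>> e : categories.entrySet()) {
            double current = angle(e.getValue(), doc);
            if (current < bestAngle) {
                bestAngle = current;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

Every candidate category costs one pass over a vector plus an inverse cosine, which is consistent with the excerpt's warning about high-volume indexes.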

6.4 : Using a custom filter

starts on page 209 in chapter 6 (Extending search)

...If all the information needed to perform filtering is in the index, there is no need to write your own filter because the QueryFilter can handle it. However, there are good reasons to factor external information into a custom filter. Using our book example data and pretending we're running ... is to store the specials flag in an index field. However, the specials change frequently. Rather ... isbn is indexed as a Keyword field and is unique, so we use IndexReader to jump directly...
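
The idea in this excerpt, marking the documents whose ISBNs appear in an external specials list, can be sketched without Lucene. This is a rough illustration under stated assumptions: `SpecialsBits`, its parameters, and the `docIdByIsbn` map (a stand-in for looking up the unique isbn term through an index reader) are hypothetical and are not the book's code.

```java
import java.util.BitSet;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: builds the set of document ids that a filter
// would let through, using data held outside the index.
final class SpecialsBits {

    // maxDoc: one more than the highest document id in the index.
    // docIdByIsbn: stand-in for an index lookup on the unique isbn field.
    // specialIsbns: the externally maintained list of current specials.
    static BitSet bits(int maxDoc,
                       Map<String, Integer> docIdByIsbn,
                       List<String> specialIsbns) {
        BitSet bits = new BitSet(maxDoc);
        for (String isbn : specialIsbns) {
            Integer docId = docIdByIsbn.get(isbn);
            if (docId != null) {
                bits.set(docId); // this document passes the filter
            }
        }
        return bits;
    }
}
```

Because the specials list lives outside the index, it can change without reindexing any documents; only the bit set needs rebuilding.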

8.1 : Playing in Lucene's Sandbox

starts on page 268 in chapter 8 (Tools and extensions)

... 8.3 ant An Ant <index> task Section 8.4 db Berkeley DB Directory implementation Section ... browsers Section 8.5 lucli Command-line interface to interact with an index Section ... index from WordNet database Section 8.6 There are a few more Sandbox components than those we cover...

9.2 : CLucene

starts on page 314 in chapter 9 (Lucene ports)

... between non-Unicode indexes and Unicode indexes. Linux-based CLucene will read Unicode indexes but may...

More Like This

David Spencer created a generic "more like this" facility and checked it into the Lucene Sandbox. See it in action at his SearchMorph - MoreLikeThis entry.

Lucene at SourceForge: Behind the Scenes

Chris Conrad, a SourceForge engineer, describes how SourceForge uses Lucene in Behind the Scenes of the SourceForge.net Search System.

9.4.1 : API compatibility

starts on page 319 under section 9.4 (Plucene) in chapter 9 (Lucene ports)

... = Plucene::Index::Writer->new("/tmp/index", Plucene::Plugin::Analyzer::PorterAnalyzer->new ... = Plucene::Search::IndexSearcher->new("/tmp/index"); my $hc = Plucene::Search::HitCollector->new...

4.2.4 : Filtering order can be important

starts on page 116 under section 4.2 (Analyzing the analyzer) in chapter 4 (Analysis)

..., tokens.length); for (int i = 0; i < tokens.length; i++) { Assert.assertEquals("index...

6.3.1 : Customizing QueryParser's behavior

starts on page 203 under section 6.3 (Extending QueryParser) in chapter 6 (Extending search)

... ranges by padding to match how numbers were indexed getWildcardQuery(String field, Wildcard queries can...
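
Padding matters for range queries because indexed terms compare as strings, so "7" sorts after "10" unless both are padded to the same width. A minimal sketch of that idea follows; the class `PaddedNumbers`, the width of 10, and the restriction to non-negative values are assumptions made for illustration, not the book's implementation.

```java
import java.util.Locale;

// Hypothetical helper: zero-pads numbers so that string order
// matches numeric order, both at index time and at query time.
final class PaddedNumbers {

    // Assumed fixed width; 10 digits is enough for any non-negative int.
    static final int WIDTH = 10;

    static String pad(int n) {
        if (n < 0) {
            // Negative values would need a different encoding.
            throw new IllegalArgumentException("negative values not supported");
        }
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", n);
    }

    // Pads both ends of a range the same way the values were indexed.
    static String[] padRange(int lower, int upper) {
        return new String[] { pad(lower), pad(upper) };
    }
}
```

The same padding has to be applied when indexing and when rewriting the range in the query, otherwise the bounds will not line up with the stored terms.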

7.3 : Indexing a PDF document

starts on page 235 in chapter 7 (Parsing common document formats)

7.9 : Other text-extraction tools

starts on page 264 in chapter 7 (Parsing common document formats)

...In this chapter, we've presented text extraction from, and indexing of, the most common document formats. We chose tools that are the most popular among developers, tools that are still being developed (or at least maintained), and tools that are easy to use. All libraries that we've presented are freely available. There are, of course, a number of other free and commercial tools that you could use; several that we know of are listed in table 7.3. Table 7.3 Tools for parsing different document...

8.0 : Tools and extensions

starts on page 267

...This chapter covers: using Lucene's Sandbox components; working with third-party Lucene tools. You've built an index, but can you browse or query it without writing code? Absolutely! In this chapter, we'll discuss three tools to do this. Do you need analysis beyond what the built-in analyzers provide? Several specialized analyzers for many languages are available in Lucene's Sandbox. How about providing Google-like term highlighting in search results? We've got that, too! This chapter...

8.11 : Summary

starts on page 311 in chapter 8 (Tools and extensions)

...Don't reinvent the wheel. Someone has probably encountered the same situation you're struggling with--you need language-specific analysis, or you want to build an index during an Ant build process, or you want query terms highlighted in search results. The Sandbox and the other resources listed on the Lucene web site should be your first stops. If you end up rolling up your sleeves and creating something new and generally useful, please consider donating it to the Sandbox or making it available...

9.5 : Lupy

starts on page 320 in chapter 9 (Lucene ports)

...Lupy is a pure Python port of Lucene 1.2. The main developers of Lupy are Amir Bakhtiar and Allen Short. Some core Lucene functionality is missing from Lupy, such as QueryParser, some of the analyzers, index merging, locking, and a few other small items. Although Lupy is a port of a rather old Lucene version, its developers are busy adding features that should bring it closer to Lucene 1.4. The current version of Lupy is 0.2.1; you can find it at http://www.divmod.org/Home/Projects/Lupy/....

Cool Hand Luke

Lucene expert Andrzej Bialecki has released a new version of the famed Lucene index toolbox, Luke. Version 0.6 of Luke adds JavaScript extensibility support as well as a host of other nice additions. If you use Lucene, you ought to have Luke handy!