Search inside Lucene in Action

Query parsed to: index fileindex

141 - 160 of 230 results (Page 8 of 12)

7.2 : Indexing XML

starts on page 226 in chapter 7 (Parsing common document formats)

...In this section, we'll convert a snippet of an XML document into a Lucene Document. First we'll use the SAX API, and then we'll use the Jakarta Commons Digester. Then we'll index the snippet with Lucene. Listing 7.2 is an XML snippet that represents a single entry from an imaginary address book. Our ultimate goal is to make this address book searchable so we can find matching entries in it using ... 7.2, in order to keep things simple. Our XML DocumentHandler implementations index each subelement...
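As a rough illustration of the SAX approach mentioned in this excerpt, here is a minimal JDK-only sketch that collects each subelement's text and each attribute into name/value pairs. It is not the book's listing and does not touch Lucene: the class name SaxFieldCollector, the parse helper, and the sample address-book XML are all hypothetical, and a plain Map stands in for a Lucene Document.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: a Map stands in for a Lucene Document.
final class SaxFieldCollector extends DefaultHandler {
    private final Map<String, String> fields = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting text for this element
        for (int i = 0; i < atts.getLength(); i++) {
            fields.put(atts.getQName(i), atts.getValue(i)); // attributes become fields too
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            fields.put(qName, value); // element name -> element text
        }
        text.setLength(0);
    }

    // Parses the XML string and returns the collected name/value pairs.
    static Map<String, String> parse(String xml) {
        try {
            SaxFieldCollector handler = new SaxFieldCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.fields;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample entry, not the book's listing 7.2.
        String xml = "<contact type=\"individual\"><name>Jane Doe</name>"
                + "<city>Springfield</city></contact>";
        System.out.println(parse(xml));
    }
}
```

In a real indexer, each collected pair would be turned into a field on the document instead of a map entry.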

10.2.5 : Queries

starts on page 335 under section 10.2 (Using Lucene at jGuru) in chapter 10 (Case studies)

...jGuru works hard to provide good, consistent search results. From carefully prepared indexes, jGuru grooms search words and translates them to Lucene-specific queries. This section summarizes how results are displayed, outlines how queries are generated, provides an English plural stripper ... text to index. Then, because Lucene assumes an OR-like default logic and users expect AND logic ... ," this entry should get a good score. If you only searched the indexed content, the FAQ entry would...
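The excerpt doesn't show how jGuru bridges the gap between OR-like defaults and the AND logic users expect. One simple possibility, sketched here purely as an assumption, is to mark every search word as required with the `+` prefix from Lucene's query syntax before the string is parsed. The class and method names are hypothetical, and the sketch ignores phrases and any operators already in the input.

```java
// Hypothetical sketch: not jGuru's code, just one way to force AND-like behavior.
final class RequiredTermsSketch {

    // Prefixes each whitespace-separated word with '+', which Lucene's
    // query syntax treats as "this term must be present".
    static String requireAll(String userInput) {
        StringBuilder query = new StringBuilder();
        for (String word : userInput.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only when the input is blank
            }
            if (query.length() > 0) {
                query.append(' ');
            }
            query.append('+').append(word);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(requireAll("java threads"));
    }
}
```

The resulting string would then be handed to a query parser as usual.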

3.2.1 : Working with Hits

starts on page 76 under section 3.2 (Using IndexSearcher) in chapter 3 (Adding search to your application)

... documents to be loaded from the index when they aren't already cached. This leads us to recommend only... [Full sample chapter]

3.4.2 : Searching within a range: RangeQuery

starts on page 83 under section 3.4 (Creating queries programmatically) in chapter 3 (Adding search to your application)

...Terms are ordered lexicographically within the index, allowing for efficient searching of terms within a range. Lucene's RangeQuery facilitates searches from a starting term through an ending term. The beginning and ending terms may either be included or excluded. The following code illustrates range queries inclusive of the begin and end terms: public class RangeQueryTest extends LiaTestCase { private Term begin, end; protected void setUp() throws Exception { begin = new Term("pubmonth... [Full sample chapter]
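To illustrate why lexicographic term ordering makes range lookups cheap, here is a small JDK-only sketch. It does not use Lucene's RangeQuery: a sorted TreeSet stands in for the index's term dictionary, and the class name, method name, and sample term values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical sketch: a TreeSet of strings plays the role of sorted index terms.
final class TermRangeSketch {

    // Returns every term between begin and end in lexicographic order.
    // 'inclusive' controls whether both endpoints themselves may match.
    static List<String> termsInRange(NavigableSet<String> terms,
                                     String begin, String end, boolean inclusive) {
        return new ArrayList<>(terms.subSet(begin, inclusive, end, inclusive));
    }

    public static void main(String[] args) {
        // Made-up term values; any consistently formatted text sorts the same way.
        NavigableSet<String> terms =
                new TreeSet<>(Arrays.asList("200401", "200403", "200405", "200407"));
        System.out.println(termsInRange(terms, "200403", "200405", true));
        System.out.println(termsInRange(terms, "200403", "200405", false));
    }
}
```

Because the terms are kept sorted, the lookup only has to walk the slice between the two endpoints; it never scans the whole set.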

3.6 : Summary

starts on page 100 in chapter 3 (Adding search to your application)

...Lucene rapidly provides highly relevant search results to queries. Most applications need only a few Lucene classes and methods to enable searching. The most fundamental things for you to take from this chapter are an understanding of the basic query types (of which TermQuery, RangeQuery, and BooleanQuery are the primary ones) and how to access search results. Although it can be a bit daunting, Lucene's scoring formula (coupled with the index format discussed in appendix B and the efficient... [Full sample chapter]

4.7.1 : Leaving holes

starts on page 136 under section 4.7 (Stemming analysis) in chapter 4 (Analysis)

.../index.html. public PositionalStopFilter(TokenStream in, Set stopWords) { super(in); this.stopWords...

4.8 : Language analysis issues

starts on page 140 in chapter 4 (Analysis)

...Dealing with languages in Lucene is an interesting and multifaceted issue. How can text in various languages be indexed and subsequently retrieved? As a developer building I18N-friendly applications around Lucene, what issues do you need to consider? You must contend with several issues when analyzing text in various languages. The first hurdle is ensuring that character-set encoding is done properly such that external data, such as files, are read into Java properly. During the analysis...

5.5.1 : Using DateFilter

starts on page 171 under section 5.5 (Filtering a search) in chapter 5 (Advanced search techniques)

...The date field type is covered in section 2.4 along with its caveats. Having a date field, you filter as shown in testDateFilter() in listing 5.6. Our book data indexes the last modified date of each book data file as a modified field, indexed as a Field.Keyword(String, Date). We test the date ... in our index, allowing for comparisons when we use an all inclusive date filter. The first parameter to both of the DateFilter constructors is the name of a date field in the index. In our sample data...

5.8 : Summary

starts on page 193 in chapter 5 (Advanced search techniques)

...This chapter has covered some diverse ground, highlighting Lucene's additional built-in search features. Sorting is a dramatic new enhancement that gives you control over the ordering of search results. The new SpanQuery family leverages term-position information for greater searching precision. Filters constrain document search space, regardless of the query. Lucene includes support for multiple (including parallel) and remote index searching, giving developers a head start on distributed...

10.0 : Case studies

starts on page 325

... off the cleverness factor, Michaels.com uses Lucene to index and search for colors. And finally...

10.2.1 : Topic lexicons and document categorization

starts on page 330 under section 10.2 (Using Lucene at jGuru) in chapter 10 (Case studies)

... indexed by Lucene, but we use our topic vocabularies to compute the most likely topic(s). Users can...

3.5.5 : Range searches

starts on page 96 under section 3.5 (Parsing query expressions: QueryParser) in chapter 3 (Adding search to your application)

..., or parsing fails. In our example index, the field pubmonth isn't a date field; it's text of the format ... are represented in a lexicographically ordered text format. As long as our modified field was indexed... [Full sample chapter]

4.6 : Synonyms, aliases, and words that

starts on page 128 in chapter 4 (Analysis)

... indexing, you make searches find documents that may not contain the original search terms but match ... of some odd cases that arise in searching, though. Since synonyms are indexed just like other terms ... .addDocument(doc); /* index single document */ writer.close(); searcher = new IndexSearcher(directory ... for the phrase "fox hops" also matches. The phrase "...fox jumps..." was indexed, and our SynonymAnalyzer ... , even using the same analyzer used for indexing. But, if we use the StandardAnalyzer (recall...

10.6.2 : Searching content

starts on page 367 under section 10.6 (Artful searching at Michaels.com) in chapter 10 (Case studies)

... for querying a given Lucene index and returning a list of documents that match that query. Listing ... more than acceptable, but much larger indexes and arbitrary queries change the landscape dramatically...

8.4.2 : Creating a custom document handler

starts on page 286 under section 8.4 (Java Development with Ant and Lucene) in chapter 8 (Tools and extensions)

...A swappable document-handler facility is built into the <index> task, allowing custom ... . We used the Ant <index> task, as shown in listing 8.3, to build the index used in the majority of the code for this book. Listing 8.3 Use of the <index> task to build the sample index for this book build-index" depends="compile"> Use custom document handler <index...

Groovy, baby!

Jeremy Rayner has ported Lucene in Action's Indexer.java and Searcher.java to Groovy and written a nice little article showing the indexing of some books from Project Gutenberg. Thanks Jeremy!

8.2.2 : Luke: the Lucene Index Toolbox

starts on page 271 under section 8.2 (Interacting with an index) in chapter 8 (Tools and extensions)

...Andrzej Bialecki created Luke (found at http://www.getopt.org/luke/), an elegant Lucene index browser. This gem provides an intimate view inside a file system-based index from an attractive desktop ... in an index. Luke has become a regular part of our Lucene development toolkit. Its interconnected user ... force an index to be unlocked when opening, optimize an index, and also delete and undelete documents ... , the first thing Luke needs is a path to the index file, as shown in the file-selection dialog...

7.2.1 : Parsing and indexing using SAX

starts on page 227 under section 7.2 (Indexing XML) in chapter 7 (Parsing common document formats)

... the Apache XML project and can be found at http://xml.apache.org/xerces2-j/index.html ... the information about the XML element that was just processed. We aren't interested in indexing ... , we aren't interested in indexing the element. However, we are interested in indexing ... , and we blindly index them as keyword Field.Keyword. Attribute values as well as element data are indexed ... we outlined. As a result, we get a ready-to-index Lucene Document populated with Fields whose names...

5.7.1 : Books like this

starts on page 186 under section 5.7 (Leveraging term vectors) in chapter 5 (Advanced search techniques)

...[] args) throws IOException { String indexDir = System.getProperty("index.dir"); FSDirectory ... book document in the index and find books like each one. c Here we look up books that are like ... . In d, we used a different way to get the value of the author field. It was indexed as multiple fields ... , the subject field could have been reanalyzed or indexed such that individual subject terms were added ... the sample data was indexed). Our next example also uses the frequency component to a term vector...

7.7 : Indexing a plain-text document

starts on page 253 in chapter 7 (Parsing common document formats)

... ends up containing the full content of the original document. This text is then indexed as a Field ... a small framework for parsing and indexing documents of various formats. All the DocumentHandler...