Lucene: a getting-started development example

liuzm


Blog category: lucene

Tags: lucene, search engine, C, C++, C#

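I learned Lucene from material on the web, tested it myself, and am posting the result on my blog.
     Before you start, make sure the Lucene jar is on your classpath. The code below uses the Lucene 2.x API (for example the Hits class, which later versions removed).
      First, a look at the core classes: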

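      IndexWriter: one of the most important classes in Lucene. It adds documents to the index and controls the parameters of the indexing process.

      Analyzer: the analyzer breaks the text that the search engine encounters into tokens. Commonly used ones include StandardAnalyzer, StopAnalyzer, and WhitespaceAnalyzer.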
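
To make the analyzer's job more concrete, here is a minimal sketch in plain Java. It does not use Lucene: ToyAnalyzer, its analyze method, and the stop-word list are all made up for this illustration, and the real StandardAnalyzer and StopAnalyzer follow more elaborate rules. It only shows the general idea of turning raw text into lower-cased tokens and optionally dropping stop words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy analyzer, NOT Lucene's API: lower-cases text, splits it, drops stop words. */
class ToyAnalyzer {
    // Hypothetical stop-word list chosen for this sketch.
    private static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "the", "is", "of", "to"));

    /** Splits on anything that is not a letter or digit; optionally removes stop words. */
    static List<String> analyze(String text, boolean removeStopWords) {
        List<String> tokens = new ArrayList<String>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) continue;                       // leading delimiter
            if (removeStopWords && STOP_WORDS.contains(raw)) continue;
            tokens.add(raw);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Index of Lucene is fast", true));
        System.out.println(analyze("The Index of Lucene is fast", false));
    }
}
```

With stop-word removal switched on, the sample sentence is reduced to index, lucene, fast; with it switched off, all six words are kept. Which tokens end up in the index decides which query terms can match later, which is why the same analyzer is normally used for indexing and for searching.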

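      Directory: where the index is stored. Lucene offers two locations, disk and memory, through the FSDirectory and RAMDirectory classes. Normally the index is kept on disk.

      Document: a Document is one unit to be indexed. Any file you want to index must first be converted into a Document object.

      Field: a named field of a Document (the example below uses two fields, path and body).

      IndexSearcher: the basic search tool in Lucene. Every search goes through an IndexSearcher.

      Query: a query. Lucene supports fuzzy queries, phrase queries, combined (boolean) queries and more, through classes such as TermQuery, BooleanQuery, RangeQuery, and WildcardQuery.

      QueryParser: a tool that parses user input. It scans the string the user typed and produces a Query object.

      Hits: once a search has finished, the results must be returned and shown to the user, otherwise the search has not served its purpose. In Lucene the set of search results is represented by an instance of the Hits class.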
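
To show how these pieces fit together, here is a minimal sketch in plain Java of the data structure behind them, an inverted index. It does not use Lucene: ToyIndex, addDocument, and search are names invented for this illustration, tokenizing is reduced to splitting on whitespace, and there is no scoring. The path is kept only for display (like a stored, unindexed field), while the body is searchable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Toy inverted index, NOT Lucene's API: maps each term to the ids of the documents containing it. */
class ToyIndex {
    // Stored for display only; paths are never added to the postings, so they are not searchable.
    private final List<String> storedPaths = new ArrayList<String>();
    private final Map<String, TreeSet<Integer>> postings = new HashMap<String, TreeSet<Integer>>();

    /** Plays the role of adding a document: tokenizes the body and records the postings. */
    int addDocument(String path, String body) {
        int docId = storedPaths.size();
        storedPaths.add(path);
        for (String term : body.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (term.isEmpty()) continue;
            TreeSet<Integer> ids = postings.get(term);
            if (ids == null) {
                ids = new TreeSet<Integer>();
                postings.put(term, ids);
            }
            ids.add(docId);
        }
        return docId;
    }

    /** Plays the role of a single-term search: returns the stored paths of matching documents. */
    List<String> search(String term) {
        TreeSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        if (ids == null) return Collections.emptyList();
        List<String> hits = new ArrayList<String>();
        for (int id : ids) hits.add(storedPaths.get(id));
        return hits;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument("1.txt", "lucene search engine");
        index.addDocument("2.txt", "lucene index writer");
        System.out.println(index.search("lucene"));   // both documents
        System.out.println(index.search("writer"));   // only 2.txt
        System.out.println(index.search("1.txt"));    // empty: the path is stored, not indexed
    }
}
```

The point of the structure is that a search looks up the term directly instead of scanning every file, which is what makes an index worth building in the first place.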

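      The steps are as follows:
        1. On Windows, create a folder named liuzm on drive C: (the folder name is up to your program; the code below uses liuzm). In that folder create two txt files with any names you like, say "1.txt" and "2.txt".

      Put the text "刘志猛博客 www.liuzm.com" into the txt files, and save them in GBK encoding, because the code below reads them as GBK.

      With the preparation done, build the index: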


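     import java.io.BufferedReader;
     import java.io.File;
     import java.io.FileInputStream;
     import java.io.IOException;
     import java.io.InputStreamReader;

     import org.apache.lucene.analysis.Analyzer;
     import org.apache.lucene.analysis.standard.StandardAnalyzer;
     import org.apache.lucene.document.Document;
     import org.apache.lucene.document.Field;
     import org.apache.lucene.index.IndexWriter;

     // The class name TxtFileIndexer is chosen for this listing.
     public class TxtFileIndexer {

         public static void main(String[] args) throws Exception {
             /* Folder to index: c:\liuzm. To keep testing simple the files are
              * written by hand; create 1.txt and 2.txt in this folder. */
             File fileDir = new File("c:\\liuzm");

             /* Folder where the index files are stored. */
             File indexDir = new File("c:\\index");

             File[] textFiles = fileDir.listFiles();
             if (textFiles == null) {
                 // listFiles() returns null when the folder is missing or unreadable.
                 throw new IOException("Not a readable directory: " + fileDir);
             }

             Analyzer luceneAnalyzer = new StandardAnalyzer();
             // Argument 1: where the index is stored.
             // Argument 2: the analyzer, a subclass of org.apache.lucene.analysis.Analyzer.
             // Argument 3: if true, IndexWriter wipes any index already in the folder
             //   and builds a new one; if false, it adds to the existing index
             //   incrementally, so pass false when updating an index.
             IndexWriter indexWriter = new IndexWriter(indexDir, luceneAnalyzer, true);
             long startTime = System.currentTimeMillis();
             try {
                 // Add one Document per .txt file to the index.
                 for (int i = 0; i < textFiles.length; i++) {
                     if (textFiles[i].isFile()
                             && textFiles[i].getName().endsWith(".txt")) {
                         System.out.println("File " + textFiles[i].getCanonicalPath()
                                 + " is being indexed.");
                         String temp = FileReaderAll(textFiles[i].getCanonicalPath(),
                                 "GBK");
                         System.out.println(temp);
                         Document document = new Document();
                         // path: stored so it can be displayed, but not searchable.
                         Field fieldPath = new Field("path", textFiles[i].getPath(),
                                 Field.Store.YES, Field.Index.NO);
                         // body: stored and tokenized by the analyzer, with term vectors.
                         Field fieldBody = new Field("body", temp, Field.Store.YES,
                                 Field.Index.TOKENIZED,
                                 Field.TermVector.WITH_POSITIONS_OFFSETS);
                         document.add(fieldPath);
                         document.add(fieldBody);
                         indexWriter.addDocument(document);
                     }
                 }
                 // optimize() merges the index segments to speed up searching.
                 indexWriter.optimize();
             } finally {
                 // Close the writer even if indexing fails, so the index lock is released.
                 indexWriter.close();
             }

             // Report how long indexing took.
             long endTime = System.currentTimeMillis();
             System.out.println("It took " + (endTime - startTime)
                     + " ms to add the documents in " + fileDir.getPath()
                     + " to the index.");
         }

         /** Reads a whole text file into a String using the given charset. */
         public static String FileReaderAll(String FileName, String charset)
                 throws IOException {
             BufferedReader reader = new BufferedReader(new InputStreamReader(
                     new FileInputStream(FileName), charset));
             StringBuilder temp = new StringBuilder();
             try {
                 String line;
                 while ((line = reader.readLine()) != null) {
                     // Keep a line break so words on adjacent lines are not glued together.
                     temp.append(line).append('\n');
                 }
             } finally {
                 reader.close();
             }
             return temp.toString();
         }
     }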
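
As a side note on the file-reading helper in the listing above: on Java 7 or later the same job can be done with java.nio.file. The sketch below is one possible alternative, not part of the original program; ReadWholeFile and readAll are names made up here. The demo uses UTF-8 and ASCII text so that it runs anywhere; for the files in this article you would pass "GBK" as the charset name.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: reads a whole file into a String with an explicit charset. */
class ReadWholeFile {
    /** Reads every byte of the file and decodes it; line breaks are preserved as they are. */
    static String readAll(String fileName, String charsetName) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(fileName));
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) throws IOException {
        // Write a small temporary file, read it back, then clean up.
        Path tmp = Files.createTempFile("lucene-demo", ".txt");
        try {
            Files.write(tmp, "hello lucene\nsecond line".getBytes(Charset.forName("UTF-8")));
            System.out.println(readAll(tmp.toString(), "UTF-8"));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Reading the whole file into memory is fine for small test files like the ones used here; for large documents a streaming approach would be the safer choice.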

Then run the query:

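import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

// The class name TxtFileSearcher is chosen for this listing.
public class TxtFileSearcher {

    public static void main(String[] args) throws IOException {
        String queryString = "刘志猛";
        // Open the index that the indexing program wrote to c:\index.
        IndexSearcher searcher = new IndexSearcher("c:\\index");
        try {
            // Use the same analyzer as at indexing time.
            Analyzer analyzer = new StandardAnalyzer();
            QueryParser qp = new QueryParser("body", analyzer);
            Query query;
            try {
                query = qp.parse(queryString);
            } catch (ParseException e) {
                // Without a valid query there is nothing to search for.
                e.printStackTrace();
                return;
            }

            Hits hits = searcher.search(query);

            // The commented-out code below prints the scoring details. Once you
            // understand this example you can delete it. To use it, also import
            // org.apache.lucene.search.Explanation.
//            for (int i = 0; i < hits.length(); i++) {
//                Explanation explanation = searcher.explain(query, hits.id(i));
//                System.out.println("Score: " + hits.score(i));
//                System.out.println("Details: " + explanation);
//                System.out.println("Length: " + hits.length());
//            }

            if (hits.length() > 0) {
                System.out.println("Found: " + hits.length() + " result(s)!");
            } else {
                System.out.println("Nothing found!");
            }
        } finally {
            searcher.close();
        }
    }
}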

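      The program reports 2 results. This example is only an introduction, meant to give people who are new to Lucene a first look at how searching works.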
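
Why does the three-character query 刘志猛 match text that contains 刘志猛博客? As far as I know, the StandardAnalyzer in Lucene 2.x emits each Chinese character as a separate token, and QueryParser turns a run of such tokens into a phrase query, so the characters must appear next to each other in the same order. The sketch below imitates that behaviour in plain Java. It is an assumption-laden illustration, not Lucene code: UnigramPhraseDemo and its methods are invented names, and the handling of non-Chinese text is simplified (Lucene treats host names such as www.liuzm.com differently). The strings are written as Unicode escapes so the file compiles under any source encoding.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy imitation of per-character CJK tokenizing plus phrase matching; NOT Lucene's API. */
class UnigramPhraseDemo {
    /** Emits each Han character as its own token and each run of other letters/digits as one token. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            boolean han = Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN;
            if (!han && Character.isLetterOrDigit(cp)) {
                word.appendCodePoint(Character.toLowerCase(cp));
                continue;
            }
            if (word.length() > 0) {            // a non-Han word has just ended
                tokens.add(word.toString());
                word.setLength(0);
            }
            if (han) tokens.add(new String(Character.toChars(cp)));
        }
        if (word.length() > 0) tokens.add(word.toString());
        return tokens;
    }

    /** True if the query tokens occur in the document at consecutive positions, in order. */
    static boolean phraseMatches(List<String> doc, List<String> query) {
        if (query.isEmpty()) return false;
        return Collections.indexOfSubList(doc, query) >= 0;
    }

    public static void main(String[] args) {
        // The escapes spell the article's sample text and its query string.
        List<String> doc = tokenize("\u5218\u5FD7\u731B\u535A\u5BA2 www.liuzm.com");
        System.out.println(phraseMatches(doc, tokenize("\u5218\u5FD7\u731B"))); // adjacent characters
        System.out.println(phraseMatches(doc, tokenize("\u5218\u731B")));       // characters not adjacent
    }
}
```

The first check succeeds because the three query characters sit side by side in the text; the second fails because the two characters are separated by another one. Per-character tokenizing works for a demo, but a word-based Chinese analyzer is usually a better fit for real data.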

Official link: http://www.liuzm.com/article/java/9114.htm
Official blog: http://www.liuzm.com


2010-05-28 13:40