Monday, March 31, 2014

Implementing your own LuceneQParserPlugin for Solr

Whenever you need to implement a query parser in Solr, you start by sub-classing the LuceneQParserPlugin:


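public class MyGroundShakingQueryParser 
                           extends LuceneQParserPlugin {
    @Override
    public QParser createParser(String qstr, 
                                SolrParams localParams,
                                SolrParams params,
                                SolrQueryRequest req) {
        // Placeholder: delegate to the stock parser until custom logic is added.
        return super.createParser(qstr, localParams, params, req);
    }
}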

This way you reuse the underlying functionality and parser of the LuceneQParserPlugin. The grammar of the parser is defined in the QueryParser.jj file inside the Lucene/Solr source code tree.

QueryParser.jj describes the grammar in a BNF-like notation. The JavaCC tool reads such a grammar and generates the Java code for you. The generated code is effectively a parser with built-in validation etc.
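
To give a feel for what "a parser with built-in validation" means, here is a minimal hand-written sketch in the style of the recursive-descent code JavaCC generates: one method per grammar production. The toy grammar, the class name ToyQueryParser and the prefix-notation output are assumptions made up for illustration; they are not taken from QueryParser.jj, which is far richer.

```java
import java.util.ArrayList;
import java.util.List;

// Toy grammar (hypothetical, far simpler than QueryParser.jj):
//   Query  ::= Clause ( ("AND" | "OR") Clause )*
//   Clause ::= TERM | "(" Query ")"
final class ToyQueryParser {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private ToyQueryParser(String input) {
        // Crude tokenizer: pad parentheses with spaces, then split on whitespace.
        String padded = input.replace("(", " ( ").replace(")", " ) ").trim();
        for (String t : padded.split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
    }

    /** Parses the whole input and returns a prefix-notation string such as "(AND a b)". */
    static String parse(String input) {
        ToyQueryParser p = new ToyQueryParser(input);
        String result = p.query();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("Unexpected token: " + p.tokens.get(p.pos));
        }
        return result;
    }

    // Production: Query ::= Clause ( ("AND" | "OR") Clause )*
    private String query() {
        String left = clause();
        while (pos < tokens.size()
                && (tokens.get(pos).equals("AND") || tokens.get(pos).equals("OR"))) {
            String op = tokens.get(pos++);
            String right = clause();
            left = "(" + op + " " + left + " " + right + ")";
        }
        return left;
    }

    // Production: Clause ::= TERM | "(" Query ")"
    private String clause() {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("Unexpected end of input");
        }
        String t = tokens.get(pos++);
        if (t.equals("(")) {
            String inner = query();
            if (pos >= tokens.size() || !tokens.get(pos++).equals(")")) {
                throw new IllegalArgumentException("Missing closing parenthesis");
            }
            return inner;
        }
        if (t.equals(")") || t.equals("AND") || t.equals("OR")) {
            throw new IllegalArgumentException("Unexpected token: " + t);
        }
        return t;
    }
}
```

Calling ToyQueryParser.parse("a AND (b OR c)") walks the productions top-down, while malformed input such as "a AND" is rejected with an exception. Changing the grammar means changing these methods, which is exactly the work JavaCC automates when you edit the .jj file.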

Note that LuceneQParserPlugin is itself a subclass of Solr's abstract QParserPlugin, the base class of all query parser plugins, so a custom plugin can also extend QParserPlugin directly.

There are use cases that call for customizing the Lucene parsing grammar (stored in QueryParser.jj). Once the customization is done (let's rename the jj file to GroundShakingQueryParser.jj), we invoke the javacc tool and it produces GroundShakingQueryParser.java and supplementary classes. To wire the result into Solr we need to do a few things. The final interplay of the classes is shown in the class diagram:

[Figure: class diagram of inter-relations between classes]

Going bottom up:
1. You implement your custom logic in GroundShakingQueryParser.jj, from which JavaCC generates GroundShakingQueryParser.java. Make sure the generated class extends SolrQueryParserBase.
2. To wire this into Solr, we extend the GroundShakingQueryParser class in a GroundShakingSolrQueryParser class.

/**
 * A schema-driven query parser modeled on Solr's default query parser.
 * It extends the parser class generated from the modified grammar stored in GroundShakingQueryParser.jj.
 */
public class GroundShakingSolrQueryParser extends GroundShakingQueryParser {

  public GroundShakingSolrQueryParser(QParser parser, String defaultField) {
    super(parser.getReq().getCore().getSolrConfig().luceneMatchVersion, defaultField, parser);
  }

}

3. An instance of GroundShakingSolrQueryParser is created in the GroundShakingLuceneQParser class.

class GroundShakingLuceneQParser extends QParser {
    GroundShakingSolrQueryParser lparser;

    public GroundShakingLuceneQParser(String qstr, SolrParams localParams, SolrParams params, SolrQueryRequest req) {
        super(qstr, localParams, params, req);
    }


    @Override
    public Query parse() throws SyntaxError {
        String qstr = getString();
        if (qstr == null || qstr.length()==0) return null;

        String defaultField = getParam(CommonParams.DF);
        if (defaultField==null) {
            defaultField = getReq().getSchema().getDefaultSearchFieldName();
        }
        lparser = new GroundShakingSolrQueryParser(this, defaultField);

        lparser.setDefaultOperator
                (GroundShakingQueryParsing.getQueryParserDefaultOperator(getReq().getSchema(),
                        getParam(QueryParsing.OP)));

        return lparser.parse(qstr);
    }


    @Override
    public String[] getDefaultHighlightFields() {
        return lparser == null ? new String[]{} : new String[]{lparser.getDefaultField()};
    }

}


4. GroundShakingLuceneQParser is wired into GroundShakingQParserPlugin, which extends the aforementioned QParserPlugin.


public class GroundShakingQParserPlugin extends QParserPlugin {
  public static final String NAME = "lucene";

  @Override
  public void init(NamedList args) {
  }

  @Override
  public QParser createParser(String qstr, SolrParams localParams, SolrParams params, SolrQueryRequest req) {
    return new GroundShakingLuceneQParser(qstr, localParams, params, req);
  }
}


5. Now we have our custom GroundShakingQParserPlugin, which can be directly extended in our MyGroundShakingQueryParser!

public class MyGroundShakingQueryParser 
                           extends GroundShakingQParserPlugin {
    @Override
    public QParser createParser(String qstr, 
                                SolrParams localParams,
                                SolrParams params,
                                SolrQueryRequest req) {
        // Placeholder: delegate to the plugin from step 4 until custom logic is added.
        return super.createParser(qstr, localParams, params, req);
    }
}

To register MyGroundShakingQueryParser in Solr, add the following line to solrconfig.xml:

<queryParser name="groundshakingqparser" class="com.groundshaking.MyGroundShakingQueryParser"/>


To use it, pass the registered name in the defType query parameter to Solr: defType=groundshakingqparser.
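
As a rough illustration, the request URL can be assembled like this in plain Java. The host, port and core name in the usage note are assumptions for a local test setup, and GroundShakingQueryUrl is a hypothetical helper, not part of Solr; only the defType and q parameters come from the setup described above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds a /select URL that routes the query
// through the parser registered under the given name.
final class GroundShakingQueryUrl {

    static String build(String coreUrl, String parserName, String query) {
        try {
            // URL-encode both values so characters such as ':' and spaces survive transport.
            return coreUrl + "/select"
                    + "?defType=" + URLEncoder.encode(parserName, "UTF-8")
                    + "&q=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }
}
```

For example, GroundShakingQueryUrl.build("http://localhost:8983/solr/collection1", "groundshakingqparser", "title:solr AND body:lucene") produces a URL whose defType parameter selects the custom parser for that single request, leaving the default parser untouched for everything else.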


By the way, one convenience of this implementation is that we can deploy the above classes in a jar under the Solr core's lib directory. That is, we do not need to overhaul the Solr source code and deal with deploying some "custom" Solr shards.