Converting complex expressions built from "()", "&", and "|" into ES queries

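Use case

The symbols "()", "&", and "|" are familiar to all of us. Their logic is unambiguous, and a handful of characters can express deeply nested relationships, so they come up often in complex queries. I recently ran into exactly this need. At first I assumed a tool for it would be easy to find, but after a long search nothing suitable turned up, so I decided to write one myself.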

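Overview

The tricky part of this tool is that we cannot know in advance what expression the people operating the system will enter. The format is not fixed, so they may write fairly complex logic, or they may stop after a single level of nesting. The code therefore has to be general.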

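Here is a brief description of how it works. It relies mainly on the stack data structure in Java: the tool parses the input logical query string, uses one stack for operators and one for operands to build the corresponding query tree, and then converts that tree into an Elasticsearch search query over multiple fields (such as title, abstract, and body), so that complex logical conditions are parsed automatically and are ready to execute. Note that the parser has no notion of operator precedence: operators at the same parenthesis level are combined from right to left, so an expression that mixes & and | should be grouped explicitly with parentheses.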
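To make the two-stack idea easier to see in isolation, here is a minimal sketch that follows the same steps but builds a plain string in prefix notation instead of an Elasticsearch query. The class name TwoStackSketch, the reduce helper, and the AND(...)/OR(...) output format are my own illustration and are not part of the tool; unlike the real parser, this sketch ignores spaces instead of keeping them inside a term.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TwoStackSketch {

    /** Parses the expression and returns a prefix-notation string such as AND(OR(a, b), c). */
    public static String parse(String expr) {
        Deque<Character> operators = new ArrayDeque<>(); // '(' , '&' and '|'
        Deque<String> operands = new ArrayDeque<>();     // terms and already-combined subtrees

        for (int i = 0; i < expr.length(); i++) {
            char ch = expr.charAt(i);
            if (ch == '(' || ch == '&' || ch == '|') {
                operators.push(ch);
            } else if (ch == ')') {
                // Combine everything back to the matching '('
                while (!operators.isEmpty() && operators.peek() != '(') {
                    reduce(operators, operands);
                }
                operators.pop(); // discard the '(' itself
            } else if (Character.isLetterOrDigit(ch)) {
                int start = i;
                while (i < expr.length() && Character.isLetterOrDigit(expr.charAt(i))) {
                    i++;
                }
                operands.push(expr.substring(start, i));
                i--; // the for loop advances i again
            }
            // any other character (for example a space) is ignored in this sketch
        }
        // Combine whatever is left outside of parentheses
        while (!operators.isEmpty()) {
            reduce(operators, operands);
        }
        return operands.pop();
    }

    /** Pops one operator and two operands and pushes the combined node back. */
    private static void reduce(Deque<Character> operators, Deque<String> operands) {
        char op = operators.pop();
        String right = operands.pop();
        String left = operands.pop();
        operands.push((op == '&' ? "AND" : "OR") + "(" + left + ", " + right + ")");
    }

    public static void main(String[] args) {
        System.out.println(parse("(a|b)&c")); // AND(OR(a, b), c)
        System.out.println(parse("a&b|c"));   // AND(a, OR(b, c))
    }
}
```

The second call shows the right-to-left grouping: without parentheses, a&b|c is read as a&(b|c), because the pending operators are popped in reverse order once the input ends.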

All of the code below is commented, so it should not be hard to follow.

Code

package com.sinosoft.springbootplus.lft.business.dispatch.publicopinion.util;

import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

import java.util.Stack;

/**
 * Builds complex ES query conditions from expressions containing parentheses, logical operators, and operands
 *
 * @author zzt
 * @date 2024-05-28
 */
public class ESQueryParserUtil {

    /**
     * Parses the input string and converts it into an Elasticsearch SearchSourceBuilder
     *
     * @param query the input query string
     * @return a SearchSourceBuilder wrapping the generated query
     */
    public static SearchSourceBuilder parseQuery(String query) {
        // Stack of operators
        Stack<Character> operators = new Stack<>();
        // Stack of operands
        Stack<QueryBuilder> operands = new Stack<>();

        for (int i = 0; i < query.length(); i++) {
            char ch = query.charAt(i);

            if (ch == '(' || ch == '&' || ch == '|') {
                // On a left parenthesis or an operator, push it onto the operator stack
                operators.push(ch);
            } else if (ch == ')') {
                // On a right parenthesis, pop and apply operators until the matching left parenthesis
                while (!operators.isEmpty() && operators.peek() != '(') {
                    char operator = operators.pop();
                    QueryBuilder right = operands.pop();
                    QueryBuilder left = operands.pop();
                    operands.push(applyOperator(left, right, operator));
                }
                operators.pop(); // Pop the left parenthesis
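            } else if (Character.isLetterOrDigit(ch) || ch == ' ') {
                // On a letter, digit, or space, accumulate a search term.
                // Character.isLetterOrDigit also accepts CJK characters such as "北京".
                StringBuilder sb = new StringBuilder();
                while (i < query.length() && (Character.isLetterOrDigit(query.charAt(i)) || query.charAt(i) == ' ')) {
                    sb.append(query.charAt(i));
                    i++;
                }
                i--; // Step back one character because the outer for loop advances i again
                String term = sb.toString().trim();
                if (!term.isEmpty()) {
                    // Skip whitespace-only runs (e.g. the spaces in "a & (b)") so they do not become empty operands.
                    // These are the three fields searched in my ES index; change them to match your own mapping.
                    operands.push(QueryBuilders.multiMatchQuery(term, "title", "sysAbstract", "content"));
                }
            }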
        }

        // Apply the remaining operators (right to left, with no operator precedence)
        while (!operators.isEmpty()) {
            char operator = operators.pop();
            QueryBuilder right = operands.pop();
            QueryBuilder left = operands.pop();
            operands.push(applyOperator(left, right, operator));
        }

        return new SearchSourceBuilder().query(operands.pop());
    }

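    /**
     * Combines two operands into a single QueryBuilder according to the operator
     *
     * @param left     the left operand
     * @param right    the right operand
     * @param operator the operator ('&' maps to must, '|' maps to should)
     * @return the combined QueryBuilder
     */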
    private static QueryBuilder applyOperator(QueryBuilder left, QueryBuilder right, char operator) {
        BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
        if (operator == '&') {
            boolQuery.must(left).must(right);
        } else if (operator == '|') {
            boolQuery.should(left).should(right);
        }
        return boolQuery;
    }

    public static void main(String[] args) {
        String query = "((北京|天津|(河北&石家庄))&(打架|辱骂|违法))&(中国)";
        SearchSourceBuilder searchSourceBuilder = parseQuery(query);
        System.out.println(searchSourceBuilder);
    }
}
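
The parser above assumes a well-formed expression. With an unmatched parenthesis, one of the pop() calls on an empty stack throws EmptyStackException, which is not a helpful message for whoever typed the expression. Since we cannot predict what will be entered, a small check before calling parseQuery may be worthwhile. The following is only a sketch: the class name ExpressionCheck and the method hasBalancedParentheses are names I made up for illustration, and it checks parentheses only, not misplaced operators.

```java
public class ExpressionCheck {

    /** Returns true if every '(' has a matching ')' and no ')' appears before its '('. */
    public static boolean hasBalancedParentheses(String expr) {
        int depth = 0; // number of currently open parentheses
        for (int i = 0; i < expr.length(); i++) {
            char ch = expr.charAt(i);
            if (ch == '(') {
                depth++;
            } else if (ch == ')') {
                depth--;
                if (depth < 0) {
                    return false; // a ')' with no '(' before it
                }
            }
        }
        return depth == 0; // anything left open is also an error
    }

    public static void main(String[] args) {
        System.out.println(hasBalancedParentheses("((北京|天津)&中国)")); // true
        System.out.println(hasBalancedParentheses("(北京|天津"));        // false
    }
}
```

A caller could reject the input with a clear error message when this returns false and only then hand the string to parseQuery.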

Generated query

Because the example I wrote uses somewhat complex nesting, the generated query is fairly long. Try it yourself if you are interested.

{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "must": [
              {
                "bool": {
                  "should": [
                    {
                      "multi_match": {
                        "query": "北京",
                        "fields": [
                          "content^1.0",
                          "sysAbstract^1.0",
                          "title^1.0"
                        ],
                        "type": "best_fields",
                        "operator": "OR",
                        "slop": 0,
                        "prefix_length": 0,
                        "max_expansions": 50,
                        "zero_terms_query": "NONE",
                        "auto_generate_synonyms_phrase_query": true,
                        "fuzzy_transpositions": true,
                        "boost": 1.0
                      }
                    },
                    {
                      "bool": {
                        "should": [
                          {
                            "multi_match": {
                              "query": "天津",
                              "fields": [
                                "content^1.0",
                                "sysAbstract^1.0",
                                "title^1.0"
                              ],
                              "type": "best_fields",
                              "operator": "OR",
                              "slop": 0,
                              "prefix_length": 0,
                              "max_expansions": 50,
                              "zero_terms_query": "NONE",
                              "auto_generate_synonyms_phrase_query": true,
                              "fuzzy_transpositions": true,
                              "boost": 1.0
                            }
                          },
                          {
                            "bool": {
                              "must": [
                                {
                                  "multi_match": {
                                    "query": "河北",
                                    "fields": [
                                      "content^1.0",
                                      "sysAbstract^1.0",
                                      "title^1.0"
                                    ],
                                    "type": "best_fields",
                                    "operator": "OR",
                                    "slop": 0,
                                    "prefix_length": 0,
                                    "max_expansions": 50,
                                    "zero_terms_query": "NONE",
                                    "auto_generate_synonyms_phrase_query": true,
                                    "fuzzy_transpositions": true,
                                    "boost": 1.0
                                  }
                                },
                                {
                                  "multi_match": {
                                    "query": "石家庄",
                                    "fields": [
                                      "content^1.0",
                                      "sysAbstract^1.0",
                                      "title^1.0"
                                    ],
                                    "type": "best_fields",
                                    "operator": "OR",
                                    "slop": 0,
                                    "prefix_length": 0,
                                    "max_expansions": 50,
                                    "zero_terms_query": "NONE",
                                    "auto_generate_synonyms_phrase_query": true,
                                    "fuzzy_transpositions": true,
                                    "boost": 1.0
                                  }
                                }
                              ],
                              "adjust_pure_negative": true,
                              "boost": 1.0
                            }
                          }
                        ],
                        "adjust_pure_negative": true,
                        "boost": 1.0
                      }
                    }
                  ],
                  "adjust_pure_negative": true,
                  "boost": 1.0
                }
              },
              {
                "bool": {
                  "should": [
                    {
                      "multi_match": {
                        "query": "打架",
                        "fields": [
                          "content^1.0",
                          "sysAbstract^1.0",
                          "title^1.0"
                        ],
                        "type": "best_fields",
                        "operator": "OR",
                        "slop": 0,
                        "prefix_length": 0,
                        "max_expansions": 50,
                        "zero_terms_query": "NONE",
                        "auto_generate_synonyms_phrase_query": true,
                        "fuzzy_transpositions": true,
                        "boost": 1.0
                      }
                    },
                    {
                      "bool": {
                        "should": [
                          {
                            "multi_match": {
                              "query": "辱骂",
                              "fields": [
                                "content^1.0",
                                "sysAbstract^1.0",
                                "title^1.0"
                              ],
                              "type": "best_fields",
                              "operator": "OR",
                              "slop": 0,
                              "prefix_length": 0,
                              "max_expansions": 50,
                              "zero_terms_query": "NONE",
                              "auto_generate_synonyms_phrase_query": true,
                              "fuzzy_transpositions": true,
                              "boost": 1.0
                            }
                          },
                          {
                            "multi_match": {
                              "query": "违法",
                              "fields": [
                                "content^1.0",
                                "sysAbstract^1.0",
                                "title^1.0"
                              ],
                              "type": "best_fields",
                              "operator": "OR",
                              "slop": 0,
                              "prefix_length": 0,
                              "max_expansions": 50,
                              "zero_terms_query": "NONE",
                              "auto_generate_synonyms_phrase_query": true,
                              "fuzzy_transpositions": true,
                              "boost": 1.0
                            }
                          }
                        ],
                        "adjust_pure_negative": true,
                        "boost": 1.0
                      }
                    }
                  ],
                  "adjust_pure_negative": true,
                  "boost": 1.0
                }
              }
            ],
            "adjust_pure_negative": true,
            "boost": 1.0
          }
        },
        {
          "multi_match": {
            "query": "中国",
            "fields": [
              "content^1.0",
              "sysAbstract^1.0",
              "title^1.0"
            ],
            "type": "best_fields",
            "operator": "OR",
            "slop": 0,
            "prefix_length": 0,
            "max_expansions": 50,
            "zero_terms_query": "NONE",
            "auto_generate_synonyms_phrase_query": true,
            "fuzzy_transpositions": true,
            "boost": 1.0
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  }
}

 
