小程序制作平台网站设计平台 互联网品牌制作专家
关于A5客户收购GoogleAdsense带有搜索账户的相关事宜更多

软文发布平台资讯中心

postgresql中的like查询优化方案

postgresql中的like查询优化方案

  项目招商找A5 快速获取精准代理名单

当时数量量比较庞大的时候,做模糊查询效率很慢,为了优化查询效率,尝试如下方法做效率对比。

一、对比情况说明:

1、数据量100w条数据

2、执行sql

二、对比结果

explain analyze SELECTc_patent,c_applyissno,d_applyissdate,d_applydate,c_patenttype_dimn,c_newlawstatus,c_abstract FROMpublic.t_knowl_patent_zlxx_temp WHEREc_applicant LIKE '%本溪满族自治县连山关镇安平安养殖场%';

1、未建索时执行计划:

"Gather (cost=1000.00..83803.53 rows=92 width=1278) (actual time=217.264..217.264 rows=0 loops=1)Workers Planned: 2Workers Launched: 2-> Parallel Seq Scan on t_knowl_patent_zlxx (cost=0.00..82794.33 rows=38 width=1278) (actual time=212.355..212.355 rows=0 loops=3) Filter: ((c_applicant)::text ~~ '%本溪满族自治县连山关镇安平安养殖场%'::text) Rows Removed by Filter: 333333Planning time: 0.272 msExecution time: 228.116 ms"

2、btree索引

建索引语句

1CREATE INDEX idx_public_t_knowl_patent_zlxx_applicant ON public.t_knowl_patent_zlxx(c_applicant varchar_pattern_ops);

执行计划

"Gather (cost=1000.00..83803.53 rows=92 width=1278) (actual time=208.253..208.253 rows=0 loops=1)Workers Planned: 2Workers Launched: 2-> Parallel Seq Scan on t_knowl_patent_zlxx (cost=0.00..82794.33 rows=38 width=1278) (actual time=203.573..203.573 rows=0 loops=3) Filter: ((c_applicant)::text ~~ '%本溪满族自治县连山关镇安平安养殖场%'::text) Rows Removed by Filter: 333333Planning time: 0.116 msExecution time: 218.189 ms"

但是如果将查询sql稍微改动一下,把like查询中的前置%去掉是这样的

Index Scan using idx_public_t_knowl_patent_zlxx_applicant on t_knowl_patent_zlxx_temp (cost=0.55..8.57 rows=92 width=1278) (actual time=0.292..0.292 rows=0 loops=1)Index Cond: (((c_applicant)::text ~>=~ '本溪满族自治县连山关镇安平安养殖场'::text) AND ((c_applicant)::text ~<~ '本溪满族自治县连山关镇安平安养殖圻'::text))Filter: ((c_applicant)::text ~~ '本溪满族自治县连山关镇安平安养殖场%'::text)Planning time: 0.710 msExecution time: 0.378 ms

3、gin索引

创建索引语句(postgresql要求在9.6版本及以上)

create extension pg_trgm;CREATE INDEX idx_public_t_knowl_patent_zlxx_applicant ON public.t_knowl_patent_zlxx USING gin (c_applicant gin_trgm_ops);

执行计划

Bitmap Heap Scan on t_knowl_patent_zlxx (cost=244.71..600.42 rows=91 width=1268) (actual time=0.649..0.649 rows=0 loops=1)Recheck Cond: ((c_applicant)::text ~~ '%本溪满族自治县连山关镇安平安养殖场%'::text)-> Bitmap Index Scan on idx_public_t_knowl_patent_zlxx_applicant (cost=0.00..244.69 rows=91 width=0) (actual time=0.647..0.647 rows=0 loops=1) Index Cond: ((c_applicant)::text ~~ '%本溪满族自治县连山关镇安平安养殖场%'::text)Planning time: 0.673 msExecution time: 0.740 ms

三、结论

btree索引可以让后置% "abc%"的模糊匹配走索引,gin + gp_trgm可以让前后置% "%abc%" 走索引。但是gin 索引也有弊端,以下情况可能导致无法命中:

搜索字段少于3个字符时,不会命中索引,这是gin自身机制导致。

当搜索字段过长时,比如email检索,可能也不会命中索引,造成原因暂时未知。

补充:PostgreSQL LIKE 查询效率提升实验

一、未做索引的查询效率

作为对比,先对未索引的查询做测试

EXPLAIN ANALYZE select * from gallery_map where author = '曹志耘'; QUERY PLAN -----------------------------------------------------------------------------------------------------------------Seq Scan on gallery_map (cost=0.00..7002.32 rows=1025 width=621) (actual time=0.011..39.753 rows=1031 loops=1)Filter: ((author)::text = '曹志耘'::text)Rows Removed by Filter: 71315Planning time: 0.194 msExecution time: 39.879 ms(5 rows)Time: 40.599 msEXPLAIN ANALYZE select * from gallery_map where author like '曹志耘'; QUERY PLAN -----------------------------------------------------------------------------------------------------------------Seq Scan on gallery_map (cost=0.00..7002.32 rows=1025 width=621) (actual time=0.017..41.513 rows=1031 loops=1)Filter: ((author)::text ~~ '曹志耘'::text)Rows Removed by Filter: 71315Planning time: 0.188 msExecution time: 41.669 ms(5 rows)Time: 42.457 msEXPLAIN ANALYZE select * from gallery_map where author like '曹志耘%'; QUERY PLAN -----------------------------------------------------------------------------------------------------------------Seq Scan on gallery_map (cost=0.00..7002.32 rows=1028 width=621) (actual time=0.017..41.492 rows=1031 loops=1)Filter: ((author)::text ~~ '曹志耘%'::text)Rows Removed by Filter: 71315Planning time: 0.307 msExecution time: 41.633 ms(5 rows)Time: 42.676 ms

很显然都会做全表扫描

二、创建btree索引

PostgreSQL默认索引是btree

CREATE INDEX ix_gallery_map_author ON gallery_map (author);EXPLAIN ANALYZE select * from gallery_map where author = '曹志耘'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------Bitmap Heap Scan on gallery_map (cost=36.36..2715.37 rows=1025 width=621) (actual time=0.457..1.312 rows=1031 loops=1)Recheck Cond: ((author)::text = '曹志耘'::text)Heap Blocks: exact=438-> Bitmap Index Scan on ix_gallery_map_author (cost=0.00..36.10 rows=1025 width=0) (actual time=0.358..0.358 rows=1031 loops=1) Index Cond: ((author)::text = '曹志耘'::text)Planning time: 0.416 msExecution time: 1.422 ms(7 rows)Time: 2.462 msEXPLAIN ANALYZE select * from gallery_map where author like '曹志耘'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------Bitmap Heap Scan on gallery_map (cost=36.36..2715.37 rows=1025 width=621) (actual time=0.752..2.119 rows=1031 loops=1)Filter: ((author)::text ~~ '曹志耘'::text)Heap Blocks: exact=438-> Bitmap Index Scan on ix_gallery_map_author (cost=0.00..36.10 rows=1025 width=0) (actual time=0.560..0.560 rows=1031 loops=1) Index Cond: ((author)::text = '曹志耘'::text)Planning time: 0.270 msExecution time: 2.295 ms(7 rows)Time: 3.444 msEXPLAIN ANALYZE select * from gallery_map where author like '曹志耘%'; QUERY PLAN -----------------------------------------------------------------------------------------------------------------Seq Scan on gallery_map (cost=0.00..7002.32 rows=1028 width=621) (actual time=0.015..41.389 rows=1031 loops=1)Filter: ((author)::text ~~ '曹志耘%'::text)Rows Removed by Filter: 71315Planning time: 0.260 msExecution time: 41.518 ms(5 rows)Time: 42.430 msEXPLAIN ANALYZE select * from gallery_map where author like '%研究室'; QUERY PLAN -----------------------------------------------------------------------------------------------------------------Seq Scan on gallery_map (cost=0.00..7002.32 rows=2282 width=621) (actual time=0.064..52.824 rows=2152 loops=1)Filter: ((author)::text ~~ '%研究室'::text)Rows Removed by Filter: 70194Planning time: 0.254 msExecution time: 53.064 ms(5 rows)Time: 53.954 ms

可以看到,等于、like的全匹配是用到索引的,like的模糊查询还是全表扫描

三、创建gin索引

CREATE EXTENSION pg_trgm;CREATE INDEX ix_gallery_map_author ON gallery_map USING gin (author gin_trgm_ops);EXPLAIN ANALYZE select * from gallery_map where author like '曹%'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------Bitmap Heap Scan on gallery_map (cost=19.96..2705.69 rows=1028 width=621) (actual time=0.419..1.771 rows=1031 loops=1)Recheck Cond: ((author)::text ~~ '曹%'::text)Heap Blocks: exact=438-> Bitmap Index Scan on ix_gallery_map_author (cost=0.00..19.71 rows=1028 width=0) (actual time=0.312..0.312 rows=1031 loops=1) Index Cond: ((author)::text ~~ '曹%'::text)Planning time: 0.358 msExecution time: 1.916 ms(7 rows)Time: 2.843 msEXPLAIN ANALYZE select * from gallery_map where author like '%耘%'; QUERY PLAN -----------------------------------------------------------------------------------------------------------------Seq Scan on gallery_map (cost=0.00..7002.32 rows=1028 width=621) (actual time=0.015..51.641 rows=1031 loops=1)Filter: ((author)::text ~~ '%耘%'::text)Rows Removed by Filter: 71315Planning time: 0.268 msExecution time: 51.957 ms(5 rows)Time: 52.899 msEXPLAIN ANALYZE select * from gallery_map where author like '%研究室%'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------Bitmap Heap Scan on gallery_map (cost=31.83..4788.42 rows=2559 width=621) (actual time=0.914..4.195 rows=2402 loops=1)Recheck Cond: ((author)::text ~~ '%研究室%'::text)Heap Blocks: exact=868-> Bitmap Index Scan on ix_gallery_map_author (cost=0.00..31.19 rows=2559 width=0) (actual time=0.694..0.694 rows=2402 loops=1) Index Cond: ((author)::text ~~ '%研究室%'::text)Planning time: 0.306 msExecution time: 4.403 ms(7 rows)Time: 5.227 ms

gin_trgm索引的效果好多了

由于pg_trgm的索引是把字符串切成多个3元组,然后使用这些3元组做匹配,所以gin_trgm索引对于少于3个字符(包括汉字)的查询,只有前缀匹配会走索引

另外,还测试了btree_gin,效果和btree一样

注意:

gin_trgm要求数据库必须使用UTF-8编码

demo_v1 # \l demo_v1 List of databasesName | Owner | Encoding | Collate | Ctype | Access privileges---------+-----------+----------+-------------+-------------+-------------------demo_v1 | wmpp_user | UTF8 | en_US.UTF-8 | en_US.UTF-8 |

文章来源:脚本之家

来源地址:https://www.jb51.net/article/204875.htm

尊敬的看官您对postgresql中的like查询优化方案有什么看法呢?互联网品牌制作专家愿与您共同探讨!版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容, 请加微信号oem365 举报,一经查实,本站将立刻删除。

上一篇: 解决PostgreSQLArray使用中的一些小问题   返 回   下一篇:文章收录率=关键词库?

相关资讯

小程序应用场景 | 小程序解决方案 | 小程序案例 | 小程序应功能 | 软文发布资源 | 网站设计

酒店预约解决方案
酒店预约小程序走红网络,也是传统转型的重要变化
详情
投票解决方案
强大功能,傻瓜式管理,有效防止作弊刷票
详情
新零售解决方案
小程序是现在新零售的新方式,深度结合线上线下
详情
预约服务解决方案
预约到店小程序通过用户在线预约到店等
详情
企业官网解决方案
企业展示小程序主要展示企业信息、产品案例等
详情
教育培训解决方案
主要展示教育机构的课程、可在线预约与购买课程
详情
推广解决方案
可以让企业通过推广模式全面展示自己的产品
详情
到店解决方案
到店服务解决方案,主要是面向实体餐饮门店
详情

我们的服务是否能满足您的需求?

如果不能,请联系我们或给我们留言,我们收到后会第一时间联系您!感谢您对我们的关注!

粤公网安备 44200002005005号