sql - Hardcoding function parameter yields 5x speed up -


i have following stored procedure generate dynamic query.

given list of conditions/filters, finds visitors belong given app. app_id passed in argument.

if call function app id, , use argument in dynamic query, runs in around 200ms.

however, if hardcode app_id, runs in < 20ms.

here example of how invoke procedure

select id find_matching_visitors('my_app_id', '{}', '{( field = ''app_name'' , string_value ilike ''my awesome app''  )}') 

any ideas ideas why?

    create or replace function find_matching_visitors(app_id text, default_filters text[], custom_filters text[])     returns table (       id varchar     )     $body$     declare         default_filterstring text;         custom_filterstring text;         default_filter_length integer;         custom_filter_length integer;         sql varchar;     begin         default_filter_length := coalesce(array_length(default_filters, 1), 0);         custom_filter_length := coalesce(array_length(custom_filters, 1), 0);          default_filterstring := array_to_string(default_filters, ' , ');         custom_filterstring := array_to_string(custom_filters, ' or ');          if custom_filterstring = '' or custom_filterstring null             custom_filterstring := '1=1';         end if;          if default_filterstring = '' or default_filterstring null             default_filterstring := '1=1';         end if;          sql := format('                     select v.id visitors v                     left join trackings t on v.id = t.visitor_id                     v.app_id = ''hardcoded_app_id'' , (%s) , (%s)                     group v.id                  ', custom_filterstring, default_filterstring, custom_filter_length, custom_filter_length);         return query execute sql;      end;     $body$     language 'plpgsql'; 

analyze without hardcoding app_id

limit  (cost=481.86..481.99 rows=50 width=531) (actual time=163.579..163.581 rows=9 loops=1) 2     ->  sort  (cost=481.86..484.26 rows=960 width=531) (actual time=163.578..163.579 rows=9 loops=1) 3           sort key: v0.last_seen desc 4           sort method: quicksort  memory: 30kb 5           ->  windowagg  (cost=414.62..449.97 rows=960 width=531) (actual time=163.553..163.560 rows=9 loops=1) 6                 ->  hash join  (cost=414.62..437.97 rows=960 width=523) (actual time=163.525..163.537 rows=9 loops=1) 7                       hash cond: ((find_matching_visitors.id)::text = (v0.id)::text) 8                       ->  function scan on find_matching_visitors  (cost=0.25..10.25 rows=1000 width=32) (actual time=153.918..153.918 rows=9 loops=1) 9                       ->  hash  (cost=354.19..354.19 rows=4814 width=523) (actual time=9.578..9.578 rows=4887 loops=1) 10                            buckets: 8192  batches: 1  memory usage: 2145kb 11                            ->  seq scan on visitors v0  (cost=0.00..354.19 rows=4814 width=523) (actual time=0.032..4.993 rows=4887 loops=1) 12                                  filter: ((not merged) , (((type)::text = 'user'::text) or ((type)::text = 'lead'::text))) 13                                  rows removed filter: 138 14  planning time: 1.134 ms 15  execution time: 163.705 ms 

analyze when hardcoding app_id

limit  (cost=481.86..481.99 rows=50 width=531) (actual time=25.890..25.893 rows=9 loops=1) 2     ->  sort  (cost=481.86..484.26 rows=960 width=531) (actual time=25.888..25.890 rows=9 loops=1) 3           sort key: v0.last_seen desc 4           sort method: quicksort  memory: 30kb 5           ->  windowagg  (cost=414.62..449.97 rows=960 width=531) (actual time=25.862..25.870 rows=9 loops=1) 6                 ->  hash join  (cost=414.62..437.97 rows=960 width=523) (actual time=25.830..25.841 rows=9 loops=1) 7                       hash cond: ((find_matching_visitors.id)::text = (v0.id)::text) 8                       ->  function scan on find_matching_visitors  (cost=0.25..10.25 rows=1000 width=32) (actual time=15.875..15.876 rows=9 loops=1) 9                       ->  hash  (cost=354.19..354.19 rows=4814 width=523) (actual time=9.936..9.936 rows=4887 loops=1) 10                            buckets: 8192  batches: 1  memory usage: 2145kb 11                            ->  seq scan on visitors v0  (cost=0.00..354.19 rows=4814 width=523) (actual time=0.013..5.232 rows=4887 loops=1) 12                                  filter: ((not merged) , (((type)::text = 'user'::text) or ((type)::text = 'lead'::text))) 13                                  rows removed filter: 138 14  planning time: 0.772 ms 15  execution time: 26.006 ms 

update 1: added explain both cases. note: they're exact same plans, time spent varies

update 2: turned out needed pass app_id argument format function, instead of embedding directly. took query time down around 20/30ms

the hard coded values important determine optimal query plan. example:

select * some_table id_person=231 select * some_table id_person=10 

when 90% of some_table has id_person=231 pg uses full table scan, because that's fastest. when 1% of records has id_person=10 , uses index scan. plan used depends on value of parameter.

when using not hard coded value, e.g.

select * some_table id_person=? 

it isn't able determine optimal query plan , query can slower.


Comments

Popular posts from this blog

aws api gateway - SerializationException in posting new Records via Dynamodb Proxy Service in API -

asp.net - Problems sending emails from forum -