There is always a need for profiling tools in databases, whether for admins or developers. While it is easy to see where an SQL statement spends most of its time using EXPLAIN or EXPLAIN ANALYZE in PostgreSQL, the same approach does not work for functions. Recently, Jobin published a blog post demonstrating how plprofiler can be used to profile functions. plprofiler builds call graphs and creates flame graphs, which make the report very easy to understand. There is another interesting project, plpgsql_check, which can be used for the same purpose as plprofiler, and which can also statically analyze function code and point out compilation errors. Let us see all of that in action in this blog post.
Installing plpgsql_check
You can use yum on RedHat/CentOS to install this extension from the PGDG repository. Steps to perform a source installation on Ubuntu/Debian are also shown in the following logs.
On RedHat/CentOS
$ sudo yum install plpgsql_check_11
On Ubuntu/Debian
$ sudo apt-get install postgresql-server-dev-11 libicu-dev gcc make
$ git clone https://github.com/okbob/plpgsql_check.git
$ cd plpgsql_check/
$ make && make install
Creating and enabling this extension
There are three advantages to using plpgsql_check:
- Checking for compilation errors in a function code
- Finding dependencies in functions
- Profiling functions
When using plpgsql_check for the first two tasks, you do not need to add any entry to shared_preload_libraries. However, if you need to use it for profiling functions (3), you should add the appropriate entries to shared_preload_libraries so that it loads both plpgsql and plpgsql_check. Due to dependencies, plpgsql must appear before plpgsql_check in the shared_preload_libraries setting, as in the following example:
shared_preload_libraries = 'plpgsql, plpgsql_check'
Any change to shared_preload_libraries requires a restart. You will see the following error when plpgsql is not listed before plpgsql_check in shared_preload_libraries.
$ grep "shared_preload_libraries" $PGDATA/postgresql.auto.conf
shared_preload_libraries = 'pg_qualstats, pg_stat_statements, plpgsql_check'

$ pg_ctl -D /var/lib/pgsql/11/data restart -mf
waiting for server to shut down.... done
server stopped
waiting for server to start....2019-07-07 02:07:10.104 EDT [6423] FATAL:  could not load library "/usr/pgsql-11/lib/plpgsql_check.so": /usr/pgsql-11/lib/plpgsql_check.so: undefined symbol: plpgsql_parser_setup
2019-07-07 02:07:10.104 EDT [6423] LOG:  database system is shut down
 stopped waiting
pg_ctl: could not start server
Examine the log output.

$ grep "shared_preload_libraries" $PGDATA/postgresql.auto.conf
shared_preload_libraries = 'pg_qualstats, pg_stat_statements, plpgsql, plpgsql_check'

$ pg_ctl -D /var/lib/pgsql/11/data start
.....
server started
Once the PostgreSQL instance has started, create the extension in each database where you need to perform any of the three tasks discussed above.
$ psql -d percona -c "CREATE EXTENSION plpgsql_check"
CREATE EXTENSION
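As a quick sanity check, you can confirm that the extension is now registered in the current database by querying the catalog (the version shown will depend on your build):

```sql
-- Verify that plpgsql_check is installed in this database
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'plpgsql_check';
```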
Finding Compilation Errors
As discussed earlier, this extension can help developers and admins detect compilation errors. But why is that needed? Consider the following example, where the function is created without any errors. By the way, I have taken this example from my previous blog post about automatic index recommendations using hypopg and pg_qualstats. You might want to read that post to understand the logic of the following function.
percona=# CREATE OR REPLACE FUNCTION find_usable_indexes()
percona-# RETURNS VOID AS
percona-# $$
percona$# DECLARE
percona$# l_queries record;
percona$# l_querytext text;
percona$# l_idx_def text;
percona$# l_bef_exp text;
percona$# l_after_exp text;
percona$# hypo_idx record;
percona$# l_attr record;
percona$# /* l_err int; */
percona$# BEGIN
percona$# CREATE TABLE IF NOT EXISTS public.idx_recommendations (queryid bigint,
percona$# query text, current_plan jsonb, recmnded_index text, hypo_plan jsonb);
percona$# FOR l_queries IN
percona$# SELECT t.relid, t.relname, t.queryid, t.attnames, t.attnums,
percona$# pg_qualstats_example_query(t.queryid) as query
percona$# FROM
percona$# (
percona$# SELECT qs.relid::regclass AS relname, qs.relid AS relid, qs.queryid,
percona$# string_agg(DISTINCT attnames.attnames,',') AS attnames, qs.attnums
percona$# FROM pg_qualstats_all qs
percona$# JOIN pg_qualstats q ON q.queryid = qs.queryid
percona$# JOIN pg_stat_statements ps ON q.queryid = ps.queryid
percona$# JOIN pg_amop amop ON amop.amopopr = qs.opno
percona$# JOIN pg_am ON amop.amopmethod = pg_am.oid,
percona$# LATERAL
percona$# (
percona$# SELECT pg_attribute.attname AS attnames
percona$# FROM pg_attribute
percona$# JOIN unnest(qs.attnums) a(a) ON a.a = pg_attribute.attnum
percona$# AND pg_attribute.attrelid = qs.relid
percona$# ORDER BY pg_attribute.attnum) attnames,
percona$# LATERAL unnest(qs.attnums) attnum(attnum)
percona$# WHERE NOT
percona$# (
percona$# EXISTS
percona$# (
percona$# SELECT 1
percona$# FROM pg_index i
percona$# WHERE i.indrelid = qs.relid AND
percona$# (arraycontains((i.indkey::integer[])[0:array_length(qs.attnums, 1) - 1],
percona$# qs.attnums::integer[]) OR arraycontains(qs.attnums::integer[],
percona$# (i.indkey::integer[])[0:array_length(i.indkey, 1) + 1]) AND i.indisunique)))
percona$# GROUP BY qs.relid, qs.queryid, qs.qualnodeid, qs.attnums) t
percona$# GROUP BY t.relid, t.relname, t.queryid, t.attnames, t.attnums
percona$# LOOP
percona$# /* RAISE NOTICE '% : is queryid',l_queries.queryid; */
percona$# execute 'explain (FORMAT JSON) '||l_queries.query INTO l_bef_exp;
percona$# execute 'select hypopg_reset()';
percona$# execute 'SELECT indexrelid,indexname FROM hypopg_create_index(''CREATE INDEX on '||l_queries.relname||'('||l_queries.attnames||')'')' INTO hypo_idx;
percona$# execute 'explain (FORMAT JSON) '||l_queries.query INTO l_after_exp;
percona$# execute 'select hypopg_get_indexdef('||hypo_idx.indexrelid||')' INTO l_idx_def;
percona$# INSERT INTO public.idx_recommendations (queryid,query,current_plan,recmnded_index,hypo_plan)
percona$# VALUES (l_queries.queryid,l_querytext,l_bef_exp::jsonb,l_idx_def,l_after_exp::jsonb);
percona$# END LOOP;
percona$# execute 'select hypopg_reset()';
percona$# END;
percona$# $$ LANGUAGE plpgsql;
CREATE FUNCTION
As the above log shows, the function was created with no errors. Unless we actually call the function, we do not know whether it contains compilation errors. With this extension, however, we can use the plpgsql_check_function_tb() function to learn whether there are any errors, without calling the function at all.
percona=# SELECT functionid, lineno, statement, sqlstate, message, detail, hint, level, position,
percona-#        left(query,60), context
percona-# FROM plpgsql_check_function_tb('find_usable_indexes()');
-[ RECORD 1 ]------------------------------------------------------------
functionid | find_usable_indexes
lineno     | 14
statement  | FOR over SELECT rows
sqlstate   | 42P01
message    | relation "pg_qualstats_all" does not exist
detail     |
hint       |
level      | error
position   | 306
left       | SELECT t.relid, t.relname, t.queryid, t.attnames, t.attnums,
context    |
From the above log, it is clear that there is an error: a relation used in the function does not exist. However, if we use dynamic SQL that is assembled at runtime, false positives are possible, as you can see in the following example. So you may want to avoid functions that use dynamic SQL, or comment out the code containing those statements, before checking for other compilation errors.
percona=# select * from plpgsql_check_function_tb('find_usable_indexes()');
-[ RECORD 1 ]------------------------------------------------------------------------------
functionid | find_usable_indexes
lineno     | 50
statement  | EXECUTE
sqlstate   | 00000
message    | cannot determinate a result of dynamic SQL
detail     | There is a risk of related false alarms.
hint       | Don't use dynamic SQL and record type together, when you would check function.
level      | warning
position   |
query      |
context    |
-[ RECORD 2 ]------------------------------------------------------------------------------
functionid | find_usable_indexes
lineno     | 52
statement  | EXECUTE
sqlstate   | 55000
message    | record "hypo_idx" is not assigned yet
detail     | The tuple structure of a not-yet-assigned record is indeterminate.
hint       |
level      | error
position   |
query      |
context    | SQL statement "SELECT 'select hypopg_get_indexdef('||hypo_idx.indexrelid||')'"
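Checking functions one at a time does not scale in a schema with many of them. The plpgsql_check documentation suggests a pattern along the following lines to check every plpgsql function in the database in a single query; this is a sketch of that approach (2279 is the OID of the trigger pseudo-type, so the filter skips trigger functions, which need a relation to be checked against):

```sql
-- Run plpgsql_check over all non-trigger plpgsql functions at once
SELECT p.oid, p.proname, plpgsql_check_function(p.oid)
FROM pg_catalog.pg_proc p
JOIN pg_catalog.pg_language l ON p.prolang = l.oid
WHERE l.lanname = 'plpgsql'
  AND p.prorettype <> 2279;
```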
Finding Dependencies
This extension can also be used to find the objects a function depends on. This makes it easy for administrators to understand which objects a function uses without spending a large amount of time reading the code. The trick is simply to pass your function signature as a parameter to plpgsql_show_dependency_tb(), as in the following example.
percona=# select * from plpgsql_show_dependency_tb('find_usable_indexes()');
   type   |  oid  |   schema   |            name            |  params
----------+-------+------------+----------------------------+-----------
 FUNCTION | 16547 | public     | pg_qualstats               | ()
 FUNCTION | 16545 | public     | pg_qualstats_example_query | (bigint)
 FUNCTION | 16588 | public     | pg_stat_statements         | (boolean)
 RELATION |  2601 | pg_catalog | pg_am                      |
 RELATION |  2602 | pg_catalog | pg_amop                    |
 RELATION |  1249 | pg_catalog | pg_attribute               |
 RELATION |  1262 | pg_catalog | pg_database                |
 RELATION |  2610 | pg_catalog | pg_index                   |
 RELATION | 16480 | public     | idx_recommendations        |
 RELATION | 16549 | public     | pg_qualstats               |
 RELATION | 16559 | public     | pg_qualstats_all           |
 RELATION | 16589 | public     | pg_stat_statements         |
(12 rows)
Profiling Functions
This is one of the most interesting features. Once you have added the appropriate entries to shared_preload_libraries as discussed earlier, you can easily enable or disable profiling through a GUC: plpgsql_check.profiler. This parameter can be set either globally or only for your session. Here is an example to understand it better.
percona=# ALTER SYSTEM SET plpgsql_check.profiler TO 'ON';
ALTER SYSTEM
percona=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)
When you set it globally, all functions running in the database are automatically profiled and the results stored. This may be fine in a development or testing environment, but you can also enable profiling for only the functions called in a single session through a session-level setting, as you see in the following example.
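For reference, the salary_update() function used in the following demonstration can be reconstructed from the profiler output roughly as follows. This is a sketch: the employee table, its emp_id, dept_id, and salary columns, and the parameter name sal_raise are assumptions based on the profiled source lines shown below.

```sql
-- Hypothetical reconstruction of the function profiled below
CREATE OR REPLACE FUNCTION salary_update(sal_raise int)
RETURNS void AS
$$
DECLARE
  l_abc record;
  new_sal INT;
BEGIN
  FOR l_abc IN EXECUTE 'SELECT emp_id, salary FROM employee where emp_id between 1 and 10000 and dept_id = 2'
  LOOP
    new_sal := l_abc.salary + sal_raise;
    UPDATE employee SET salary = new_sal WHERE emp_id = l_abc.emp_id;
  END LOOP;
END;
$$ LANGUAGE plpgsql;
```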
percona=# BEGIN;
BEGIN
percona=# SET plpgsql_check.profiler TO 'ON';
SET
percona=# select salary_update(1000);
 salary_update
---------------

(1 row)

percona=# select lineno, avg_time, source from plpgsql_profiler_function_tb('salary_update(int)');
 lineno | avg_time | source
--------+----------+---------------------------------------------------------------------------------------------------------------
      1 |          |
      2 |          | DECLARE
      3 |          | l_abc record;
      4 |          | new_sal INT;
      5 |    0.005 | BEGIN
      6 |   144.96 | FOR l_abc IN EXECUTE 'SELECT emp_id, salary FROM employee where emp_id between 1 and 10000 and dept_id = 2'
      7 |          | LOOP
      8 |      NaN | new_sal := l_abc.salary + sal_raise;
      9 |      NaN | UPDATE employee SET salary = new_sal WHERE emp_id = l_abc.emp_id;
     10 |          | END LOOP;
     11 |          | END;
(11 rows)

--- Create an index and check if it improves the execution time of the FOR loop.

percona=# CREATE INDEX idx_1 ON employee (emp_id, dept_id);
CREATE INDEX
percona=# select salary_update(1000);
 salary_update
---------------

(1 row)

percona=# select lineno, avg_time, source from plpgsql_profiler_function_tb('salary_update(int)');
 lineno | avg_time | source
--------+----------+---------------------------------------------------------------------------------------------------------------
      1 |          |
      2 |          | DECLARE
      3 |          | l_abc record;
      4 |          | new_sal INT;
      5 |    0.007 | BEGIN
      6 |   73.074 | FOR l_abc IN EXECUTE 'SELECT emp_id, salary FROM employee where emp_id between 1 and 10000 and dept_id = 2'
      7 |          | LOOP
      8 |      NaN | new_sal := l_abc.salary + sal_raise;
      9 |      NaN | UPDATE employee SET salary = new_sal WHERE emp_id = l_abc.emp_id;
     10 |          | END LOOP;
     11 |          | END;
(11 rows)

percona=# END;
COMMIT
percona=# show plpgsql_check.profiler;
 plpgsql_check.profiler
------------------------
 on
(1 row)
In the above log, I opened a new transaction block and enabled the parameter plpgsql_check.profiler only for that block. Any function executed in that transaction was profiled, and the results can be retrieved using plpgsql_profiler_function_tb(). Once we have identified the area where the execution time is high, the immediate action is to tune that piece of code. After creating the index, I called the function again; it now runs faster than before.
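When you are done, you can turn the profiler off again at either scope. The sketch below uses standard ALTER SYSTEM / SET semantics; it also calls plpgsql_profiler_reset_all(), which plpgsql_check provides to discard previously collected profile data (check your installed version's documentation for availability):

```sql
-- Disable profiling for the current session only
SET plpgsql_check.profiler TO 'off';

-- Or remove the global setting and reload the configuration
ALTER SYSTEM RESET plpgsql_check.profiler;
SELECT pg_reload_conf();

-- Optionally discard the profile data collected so far
SELECT plpgsql_profiler_reset_all();
```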
Conclusion
Special thanks to Pavel Stehule, who authored this extension, and to the contributors who have brought it to a usable stage today. This is one of the simplest extensions available for checking compilation errors and dependencies. It can also serve as a handy profiling tool, and a developer may find either plprofiler or plpgsql_check helpful for profiling.
A note from the plpgsql_check author: a) thank you for your article; b) the profiler needs shared memory only when the profile should be persistent (across sessions). If you don't need that (i.e., you use the profiler in just one session), then preloading is not necessary.