Table of Contents

Postgres Stack Traces

See also Systems Monitoring | Database | Postgresql | Index

Reference Matt Smiley's email Dated: 23 Jan 2013

Attached are annotated examples of:

How to Capture

# Capture stack traces for all non-idle postgres processes.
# Must run as user "postgres".  May run from a couple seconds to a couple minutes depending on system load.

[postgres@db04.prd ~]$ ~/bin/capture_stack_traces_from_nonidle_postgres_pids.sh > /db/scratch/stack_traces.$(date '+%Y%m%d_%H%M%S').out

How to Analyze

[postgres@db04.prd ~]$ ls -l /db/scratch/stack_traces.20130116_165754.out
-rw-r--r--. 1 postgres postgres 123111 Jan 16 16:58 /db/scratch/stack_traces.20130116_165754.out

[postgres@db04.prd ~]$ wc -l /db/scratch/stack_traces.20130116_165754.out
2365 /db/scratch/stack_traces.20130116_165754.out

[postgres@db04.prd ~]$ less /db/scratch/stack_traces.20130116_165754.out
----------------------------------------------------------------------------------
Start time: 2013-01-16 16:57:54
 16:57:54 up 32 days, 14:38,  6 users,  load average: 19.64, 21.55, 25.68

Stack trace for PID: 8239
[Thread debugging using libthread_db enabled]
0x00007fac158d3360 in __read_nocancel () from /lib64/libc.so.6

Thread 1 (Thread 0x7fac171f97c0 (LWP 8239)):
#0  0x00007fac158d3360 in __read_nocancel () from /lib64/libc.so.6
No symbol table info available.
#1  0x00000000006157dd in read (file=<value optimized out>, buffer=0x7fab76fae740 "\313\022", amount=8192) at /usr/include/bits/unistd.h:45
No locals.
#2  FileRead (file=<value optimized out>, buffer=0x7fab76fae740 "\313\022", amount=8192) at fd.c:1191
        returnCode = 0
#3  0x000000000062d622 in mdread (reln=<value optimized out>, forknum=<value optimized out>, blocknum=11213365, buffer=0x7fab76fae740 "\313\022") at md.c:640
        seekpos = <value optimized out>
        nbytes = <value optimized out>
        v = 0x1568f00
        __func__ = "mdread"
#4  0x0000000000612591 in ReadBuffer_common (smgr=<value optimized out>, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=11213365, mode=RBM_NORMAL, strategy=0x1665290, hit=0x7fff72cb703f "") at bufmgr.c:438
...
[postgres@db04.prd ~]$ ls -l ~/bin/stack_trace_filter.*
-rwxr-xr-x. 1 postgres postgres 347 Jan 10 18:24 /var/lib/pgsql/bin/stack_trace_filter.regexp_match_on_whole_stack.pl
-rwxr-xr-x. 1 postgres postgres 117 Jan 10 22:27 /var/lib/pgsql/bin/stack_trace_filter.simplify_frames.pl
[postgres@db04.prd ~]$ cat /db/scratch/stack_traces.20130116_165754.out | ~/bin/stack_trace_filter.simplify_frames.pl | ~/bin/stack_trace_filter.regexp_match_on_whole_stack.pl '' | less
------------------------------------------------------------------------------
Frame 0 __read_nocancel from /lib64/libc.so.6
Frame 1 read at /usr/include/bits/unistd.h:45
Frame 2 FileRead at fd.c:1191
Frame 3 mdread at md.c:640
Frame 4 ReadBuffer_common at bufmgr.c:438
Frame 5 ReadBufferExtended at bufmgr.c:246
Frame 6 heapgetpage at heapam.c:230
Frame 7 heapgettup_pagemode at heapam.c:824
Frame 8 heap_getnext at heapam.c:1349
Frame 9 SeqNext at nodeSeqscan.c:66
Frame 10 ExecScanFetch at execScan.c:82
Frame 11 ExecScan at execScan.c:167
Frame 12 ExecProcNode at execProcnode.c:394
Frame 13 ExecHashJoinOuterGetTuple at nodeHashjoin.c:656
Frame 14 ExecHashJoin at nodeHashjoin.c:209
Frame 15 ExecProcNode at execProcnode.c:447
Frame 16 ExecHashJoinOuterGetTuple at nodeHashjoin.c:656
Frame 17 ExecHashJoin at nodeHashjoin.c:209
Frame 18 ExecProcNode at execProcnode.c:447
Frame 19 ExecHashJoinOuterGetTuple at nodeHashjoin.c:656
Frame 20 ExecHashJoin at nodeHashjoin.c:209
Frame 21 ExecProcNode at execProcnode.c:447
Frame 22 ExecSort at nodeSort.c:103
Frame 23 ExecProcNode at execProcnode.c:458
Frame 24 ExecLimit at nodeLimit.c:91
Frame 25 ExecProcNode at execProcnode.c:490
Frame 26 ExecutePlan at execMain.c:1440
Frame 27 standard_ExecutorRun at execMain.c:314
Frame 28 explain_ExecutorRun at auto_explain.c:214
Frame 29 pgss_ExecutorRun at pg_stat_statements.c:564
Frame 30 PortalRunSelect at pquery.c:943
Frame 31 PortalRun at pquery.c:787
Frame 32 exec_execute_message at postgres.c:1965
Frame 33 PostgresMain at postgres.c:4025
Frame 34 BackendRun at postmaster.c:3611
Frame 35 BackendStartup at postmaster.c:3296
Frame 36 ServerLoop at postmaster.c:1460
Frame 37 PostmasterMain at postmaster.c:1121
Frame 38 main at main.c:199

Frame 0 slot_deform_tuple at heaptuple.c:1085
Frame 1 slot_getattr at heaptuple.c:1213
Frame 2 ExecMakeFunctionResultNoSets at execQual.c:1864
Frame 3 ExecQual at execQual.c:5006
Frame 4 ExecScan at execScan.c:195
Frame 5 ExecProcNode at execProcnode.c:394
Frame 6 ExecHashJoinOuterGetTuple at nodeHashjoin.c:656
Frame 7 ExecHashJoin at nodeHashjoin.c:209
Frame 8 ExecProcNode at execProcnode.c:447
Frame 9 ExecHashJoinOuterGetTuple at nodeHashjoin.c:656
Frame 10 ExecHashJoin at nodeHashjoin.c:209
Frame 11 ExecProcNode at execProcnode.c:447
Frame 12 ExecHashJoinOuterGetTuple at nodeHashjoin.c:656
Frame 13 ExecHashJoin at nodeHashjoin.c:209
Frame 14 ExecProcNode at execProcnode.c:447
Frame 15 ExecSort at nodeSort.c:103
Frame 16 ExecProcNode at execProcnode.c:458
Frame 17 ExecLimit at nodeLimit.c:91
Frame 18 ExecProcNode at execProcnode.c:490
Frame 19 ExecutePlan at execMain.c:1440
Frame 20 standard_ExecutorRun at execMain.c:314
Frame 21 explain_ExecutorRun at auto_explain.c:214
Frame 22 pgss_ExecutorRun at pg_stat_statements.c:564
Frame 23 PortalRunSelect at pquery.c:943
Frame 24 PortalRun at pquery.c:787
Frame 25 exec_execute_message at postgres.c:1965
Frame 26 PostgresMain at postgres.c:4025
Frame 27 BackendRun at postmaster.c:3611
Frame 28 BackendStartup at postmaster.c:3296
Frame 29 ServerLoop at postmaster.c:1460
Frame 30 PostmasterMain at postmaster.c:1121
Frame 31 main at main.c:199
...

Find all stacks that contain a particular function call anywhere in the stack. If you need to see the context in which a function was called, you need more than just “grep”. This gives you the whole stack, regardless of how many frames are above/below the function call you're profiling.

[postgres@db04.prd ~]$ cat /db/scratch/stack_traces.20130116_165754.out | ~/bin/stack_trace_filter.simplify_frames.pl | ~/bin/stack_trace_filter.regexp_match_on_whole_stack.pl 'XLogWrite'

------------------------------------------------------------------------------

Frame 0 __fdatasync_nocancel from /lib64/libc.so.6
Frame 1 issue_xlog_fsync at xlog.c:8831
Frame 2 XLogWrite at xlog.c:1908
Frame 3 XLogFlush at xlog.c:2116
Frame 4 RecordTransactionCommit at xact.c:1086
Frame 5 CommitTransaction at xact.c:1844
Frame 6 CommitTransactionCommand at xact.c:2562
Frame 7 finish_xact_command at postgres.c:2452
Frame 8 exec_execute_message at postgres.c:1982
Frame 9 PostgresMain at postgres.c:4025
Frame 10 BackendRun at postmaster.c:3611
Frame 11 BackendStartup at postmaster.c:3296
Frame 12 ServerLoop at postmaster.c:1460
Frame 13 PostmasterMain at postmaster.c:1121
Frame 14 main at main.c:199
.
[postgres@db04.prd ~]$ cat /db/scratch/stack_traces.20130116_165754.out | ~/bin/stack_trace_filter.simplify_frames.pl | ~/bin/stack_trace_filter.regexp_match_on_whole_stack.pl '' | sort -k2 -n | uniq -c | less
------------------------------------------------------------------------------
     22 
      1 Frame 0 calc_bucket at dynahash.c:755
      1 Frame 0 CleanUpLock at lock.c:1092
      1 Frame 0 __fdatasync_nocancel from /lib64/libc.so.6
      4 Frame 0 __read_nocancel from /lib64/libc.so.6
     13 Frame 0 recv from /lib64/libc.so.6
      1 Frame 0 slot_deform_tuple at heaptuple.c:1065
      1 Frame 0 slot_deform_tuple at heaptuple.c:1085
      1 Frame 1 expand_table at dynahash.c:1252
      1 Frame 1 issue_xlog_fsync at xlog.c:8831
      1 Frame 1 LockReleaseAll at lock.c:1650
      4 Frame 1 read at /usr/include/bits/unistd.h:45
     13 Frame 1 recv at /usr/include/bits/socket2.h:45
      2 Frame 1 slot_getattr at heaptuple.c:1213
      2 Frame 2 ExecMakeFunctionResultNoSets at execQual.c:1864
      4 Frame 2 FileRead at fd.c:1191
      1 Frame 2 hash_search_with_hash_value at dynahash.c:959
      1 Frame 2 ProcReleaseLocks at proc.c:665
     13 Frame 2 secure_read at be-secure.c:304
      1 Frame 2 XLogWrite at xlog.c:1908
      2 Frame 3 ExecQual at execQual.c:5006
      1 Frame 3 LookupTupleHashEntry at execGrouping.c:376
      4 Frame 3 mdread at md.c:640
     13 Frame 3 pq_recvbuf at pqcomm.c:816
      1 Frame 3 ResourceOwnerReleaseInternal at resowner.c:266
      1 Frame 3 XLogFlush at xlog.c:2116
      1 Frame 4 buildSubPlanHash at nodeSubplan.c:539
      2 Frame 4 ExecScan at execScan.c:195
     13 Frame 4 pq_getbyte at pqcomm.c:857
      4 Frame 4 ReadBuffer_common at bufmgr.c:438
      1 Frame 4 RecordTransactionCommit at xact.c:1086
      1 Frame 4 ResourceOwnerRelease at resowner.c:186
...
[postgres@db04.prd ~]$ cat /db/scratch/stack_traces.20130116_165754.out | ~/bin/stack_trace_filter.simplify_frames.pl | ~/bin/stack_trace_filter.regexp_match_on_whole_stack.pl '' | sort -k2 -n | uniq -c | awk '$1 >= 3'
------------------------------------------------------------------------------
     22 
      4 Frame 0 __read_nocancel from /lib64/libc.so.6
     13 Frame 0 recv from /lib64/libc.so.6
      4 Frame 1 read at /usr/include/bits/unistd.h:45
     13 Frame 1 recv at /usr/include/bits/socket2.h:45
      4 Frame 2 FileRead at fd.c:1191
     13 Frame 2 secure_read at be-secure.c:304
      4 Frame 3 mdread at md.c:640
     13 Frame 3 pq_recvbuf at pqcomm.c:816
     13 Frame 4 pq_getbyte at pqcomm.c:857
      4 Frame 4 ReadBuffer_common at bufmgr.c:438
      4 Frame 5 ReadBufferExtended at bufmgr.c:246
     13 Frame 5 SocketBackend at postgres.c:345
      3 Frame 6 index_getnext at indexam.c:514
     13 Frame 6 ReadCommand at postgres.c:467
      3 Frame 7 IndexNext at nodeIndexscan.c:78
     13 Frame 7 PostgresMain at postgres.c:3932
     13 Frame 8 BackendRun at postmaster.c:3611
      3 Frame 8 ExecScanFetch at execScan.c:82
     13 Frame 9 BackendStartup at postmaster.c:3296
      3 Frame 9 ExecScan at execScan.c:167
      4 Frame 10 ExecProcNode at execProcnode.c:398
     13 Frame 10 ServerLoop at postmaster.c:1460
      4 Frame 11 ExecNestLoop at nodeNestloop.c:174
     13 Frame 11 PostmasterMain at postmaster.c:1121
      4 Frame 12 ExecProcNode at execProcnode.c:439
     13 Frame 12 main at main.c:199
      4 Frame 14 ExecProcNode at execProcnode.c:439