Our incident postmortem template
We share our incident postmortem template, with pointers on the review process, what to include in each section, and best-practice examples.
"It's dead, Jim": How we write an incident postmortem
How to write an incident postmortem: what it is, why it’s important, who should write it, and what to consider before putting pen to paper.
Deadlines, lies and videotape: The tale of a gRPC bug
If you use gRPC in your services, set a reasonable deadline on your RPC calls and upgrade to gRPC 1.16 as soon as possible. You should also enable client-side keepalive and adjust the kernel setting tcp_syn_retries (at least until the fix for this issue is released).
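To make the tcp_syn_retries advice concrete, here is a minimal Python sketch of how long a connection attempt can hang before the kernel gives up. It assumes Linux's usual behavior of a 1-second initial SYN timeout that doubles on each retry; the function name, the keepalive values, and the deadline constant are illustrative choices, not taken from the post. The keepalive entries use gRPC's standard channel-argument keys, and the commented lines show where they and the deadline would be passed in the Python gRPC client (the stub and method names there are hypothetical).

```python
def syn_connect_timeout_seconds(syn_retries, initial_rto=1.0, max_rto=120.0):
    """Worst-case time spent waiting on unanswered SYNs before connect() fails.

    Assumption: the first SYN waits `initial_rto` seconds and each
    retransmission doubles the wait, capped at `max_rto`.
    """
    total = 0.0
    rto = initial_rto
    # One wait for the initial SYN plus one per retry.
    for _ in range(syn_retries + 1):
        total += rto
        rto = min(rto * 2, max_rto)
    return total


# Illustrative client-side keepalive settings (values are assumptions).
KEEPALIVE_OPTIONS = [
    ("grpc.keepalive_time_ms", 10_000),          # ping after 10 s of inactivity
    ("grpc.keepalive_timeout_ms", 5_000),        # wait 5 s for the ping ack
    ("grpc.keepalive_permit_without_calls", 1),  # ping even with no active RPC
]

# Illustrative per-call deadline in seconds (an assumption, tune per service).
RPC_DEADLINE_SECONDS = 2.0

# With grpcio installed, these would be used roughly like this:
#   channel = grpc.insecure_channel("my-service:50051", options=KEEPALIVE_OPTIONS)
#   stub = my_service_pb2_grpc.MyServiceStub(channel)       # hypothetical stub
#   reply = stub.MyMethod(request, timeout=RPC_DEADLINE_SECONDS)

if __name__ == "__main__":
    for retries in (6, 3, 2):
        print(retries, syn_connect_timeout_seconds(retries))
```

Under these assumptions, the common Linux default of 6 retries works out to about 127 seconds of waiting, while 2 retries brings that down to 7 seconds. On Linux the setting is exposed as the sysctl net.ipv4.tcp_syn_retries.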