Task #4106: Configure PostgreSQL memory on the CNs to accommodate large getLogRecords() calls - Infrastructure - DataONE Tasks

Task #4106

Configure PostgreSQL memory on the CNs to accommodate large getLogRecords() calls

Added by Chris Jones over 11 years ago. Updated over 11 years ago.

Status:

Closed

Priority:

Normal

Assignee:

Chris Brumgard

Category:

d1_cn_buildout

Target version:

2013.44-Block.6.1

Start date:

2013-10-18

Due date:

% Done:

100%

Milestone:

None

Product Version:

Story Points:

Sprint:

Description

For Metacat-based MN and CN installations with millions of log records, calls to getLogRecords() take over 5 minutes to execute, and the limiting factor seems to be the SQL query. Ben has patched Metacat with two new indices in the database, which enables faster queries against the access_log table. However, this needs to be coupled with changes to the PostgreSQL memory configuration.

For a 32GB VM, we set the effective_cache_size to 16GB so the query planner would prioritize using indices and taking advantage of available RAM. We also set work_mem = 100MB, with max_connections at 100. This would result in a potential use of 10GB of RAM if all 100 connections were executing large or complex sort operations.

For the CNs, change these settings in dataone-cn-os-core postinst script to appropriate levels. For instance, if max_connections is 200, perhaps use work_mem = 50MB on a 32GB VM. The recommended settings for effective_cache_size is 1/2 to 3/4 of total RAM on the VM.

These changes should coincide with an upgrade to Metacat 2.2.2 on the CNs.

See http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

Related issues

History

#1 Updated by Chris Brumgard over 11 years ago

Chris,

This should be easy enough to do.

1) When is the upgrade of Metacat to 2.2.2 planned?
2) Which set of CN will this affect? If this is the prod boxes, I only have sudo privileges on the orc node. I'm not suppose to have sudo privileges on any of the prod boxes after our security discussion a few weeks ago.

#2 Updated by Chris Brumgard over 11 years ago

For configuring the work memory, should work memory be always set to 10GB / max_connections or 1/3 RAM / max_connections.

#3 Updated by Chris Jones over 11 years ago

Chris - the point is to allow plenty of work_mem for sorting operations, etc., and so in lieu of the VMs total RAM, 25% to 33%, divided by max_connections, would be a decent work_mem setting. However, for VMs with little memory (say 4GB), I'd be more conservative and use 15-20%.

#4 Updated by Chris Brumgard over 11 years ago

Status changed from New to Closed
% Done changed from 0 to 100
translation missing: en.field_remaining_hours set to 0.0

Also available in: Atom PDF

Project

General

Profile

Infrastructure

Issues

Custom queries