Q: How can I split large LISTSERV message archives to improve performance?
Many lists have been running for a long time and have extensive message archives of past postings. This longevity is a tribute to the robustness of LISTSERV, but sometimes these very large archives can lead to performance problems. This tech tip examines a few ways that you can split large archives to improve performance.
The Problem of Large Archives
You can see the size of the archives by looking at LISTSERV log files:
*** FIO file cache now totals 121957k. A list of cached files follows. ***
File size  Usage  Flags  File name
---------  -----  -----  ---------
  121577k      1  U      /data/listserv/archive/sizing/sizing.log1604
       5k      2         /home/listserv/home/sizing.list
     176k      1         /home/listserv/tmp/listserv.cmsut1
     191k     10  K      /home/listserv/home/default.mailtpl
       5k     10  K      /home/listserv/home/site.mailtpl
       5k      2         /home/listserv/home/info_jp.list
       2k      1         /home/listserv/home/digests.file
In this example, the message archive file for the Sizing list for April 2016 is already about 121MB, and that's just for the first week. While one large archive file may not be a problem, if you have ten or more such lists with large archives all being accessed at the same time, LISTSERV can quickly exhaust the RAM available for processing such files. The system must then page memory to disk, which is always a slow process.
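To spot oversized archives without reading the report by eye, the cache report can be scanned with a short script. The following is a minimal Python sketch, not an L-Soft tool: the function name large_cached_files and the default threshold of 100,000k are illustrative assumptions, and the row layout (size in k, usage count, optional flags, file name) is inferred only from the sample report above.

```python
import re

# One data row of the cache report: size in k, usage count,
# an optional flags column, then the file name.
# Layout inferred from the sample report only (assumption).
ROW = re.compile(r"^\s*(\d+)k\s+(\d+)\s+(?:([A-Z]+)\s+)?(\S+)\s*$")


def large_cached_files(report_text, min_kb=100_000):
    """Return (size_kb, path) pairs for cached files of at least min_kb,
    largest first. Header and separator lines are skipped because they
    do not match the row pattern."""
    hits = []
    for line in report_text.splitlines():
        m = ROW.match(line)
        if m and int(m.group(1)) >= min_kb:
            hits.append((int(m.group(1)), m.group(4)))
    return sorted(hits, reverse=True)
```

Passing the text of the report shown above would flag only the sizing.log1604 file; lowering min_kb widens the net.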
So what to do about this? Adding more RAM to the machine may help, but sometimes this is not possible due to hardware limits. Fortunately there are some actions that you can take in LISTSERV to minimize the impact of large archives.
Handling Current Archives
The first thing to do is to change from monthly to weekly archives. This list seems to average about 125MB of messages per week. That's 500MB per month. While handling a 125MB file may not be too hard, near the end of the month the file will be nearly 500MB, and that will be harder for LISTSERV to handle. Changing to weekly archives will give you four files of about 125MB each instead of one huge file of 500MB. Remember, this file must be processed and rewritten each time a new message is posted to the list. For a busy list, this can be many times per day. Writing to smaller files will always be faster. In the list configuration, simply change the first Notebook line below to the second:
Notebook= Yes, /data/listserv/archive/sizing, Monthly, Private
Notebook= Yes, /data/listserv/archive/sizing, Weekly, Private
This change will take effect immediately.
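As a rough illustration of why the smaller files help, here is a back-of-the-envelope calculation in Python. It assumes steady traffic, so that the active archive file grows linearly from empty to its final size; real lists are burstier, and the function name average_file_mb is purely illustrative.

```python
def average_file_mb(weekly_mb, weeks_per_file):
    """Average size of the active archive file over its lifetime,
    assuming it grows linearly from 0 to weekly_mb * weeks_per_file."""
    return weekly_mb * weeks_per_file / 2


# Figures from the example list: about 125MB of messages per week.
monthly_avg = average_file_mb(125, 4)  # one file per (four-week) month
weekly_avg = average_file_mb(125, 1)   # one file per week
```

Under these assumptions the file being rewritten on each posting averages 250MB with monthly archives and 62.5MB with weekly archives, a fourfold reduction.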
Deleting Historical Archives
Archives older than the current month may need a different approach. Since these are no longer active, there is no benefit in changing them to weekly. But what you can do is analyze how long you should keep them. For some lists, the historical information is important and must be retained. For other lists, you probably don't need to keep archives older than six months. So evaluate the purpose and needs of each of your large archive lists and determine an appropriate time limit for aging out and removing older archives that are no longer needed.
You can't do this with LISTSERV commands; you need direct access to the server where LISTSERV is running. Using your favorite navigation tool, change (cd) into the archives directory for the chosen list and list its files:
01/23/2008 12:02 AM 4,873 NACSE-D.LOG0801
02/29/2008 01:47 PM 2,916 NACSE-D.LOG0802
03/11/2008 07:17 PM 41,828 NACSE-D.LOG0803
06/15/2008 12:26 AM 5,386 NACSE-D.LOG0806
07/16/2008 11:35 AM 40,587 NACSE-D.LOG0807
08/28/2008 10:56 PM 2,691 NACSE-D.LOG0808
11/19/2008 05:13 PM 6,948 NACSE-D.LOG0811
12/23/2008 02:58 PM 11,026 NACSE-D.LOG0812
06/20/1998 11:51 PM 13,048 NACSE-D.LOG9806
07/22/1998 01:07 AM 15,387 NACSE-D.LOG9807
08/07/1998 05:24 PM 9,943 NACSE-D.LOG9808
09/22/1998 03:50 PM 2,395 NACSE-D.LOG9809
10/16/1998 05:11 PM 1,871 NACSE-D.LOG9810
11/02/1998 10:57 AM 2,316 NACSE-D.LOG9811
12/24/1998 03:16 PM 1,803 NACSE-D.LOG9812
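You can go by the system file date, but that may change (for example, when files are copied or restored), so it's better to go by the LISTSERV-created filename. The format is always: LISTNAME.LOGYYMMx. Note that the year is only two digits, as is the month. The "x" is a letter A-E indicating the week of the month; it is present for weekly message archives only, so monthly archives end with the month digits. As sizing.log1604 in the earlier cache report shows, the name may appear in lowercase on some systems.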
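The naming scheme can be decoded mechanically. The sketch below is one possible Python reading of it; the names ArchiveName and parse_archive_name are hypothetical, and the two-digit-year pivot (70 and above means 19xx, below means 20xx) is an assumption made here so that files like LOG9806 and LOG0801 sort into the right century.

```python
import re
from typing import NamedTuple, Optional


class ArchiveName(NamedTuple):
    listname: str
    year: int
    month: int
    week: Optional[str]  # "A".."E" for weekly archives, None for monthly


# LISTNAME.LOGYYMMx, matched case-insensitively (e.g. sizing.log1604).
NAME = re.compile(
    r"^(?P<list>.+)\.LOG(?P<yy>\d{2})(?P<mm>\d{2})(?P<week>[A-E])?$",
    re.IGNORECASE,
)


def parse_archive_name(filename, pivot=70):
    """Split an archive filename into its parts, or return None if the
    name does not follow the LISTNAME.LOGYYMMx pattern.
    The pivot for expanding two-digit years is an assumption."""
    m = NAME.match(filename)
    if not m:
        return None
    month = int(m.group("mm"))
    if not 1 <= month <= 12:
        return None
    yy = int(m.group("yy"))
    year = 1900 + yy if yy >= pivot else 2000 + yy
    week = m.group("week")
    return ArchiveName(m.group("list"), year, month, week.upper() if week else None)
```

For example, parse_archive_name("NACSE-D.LOG9806") yields the list name NACSE-D, year 1998, month 6, and no week letter.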
Select the range of archive files you want to delete and then delete them using appropriate OS commands. Then go to the LISTSERV command line in the web interface (or send the command by email):
This command will reindex the remaining archives for easy access and searching.
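Selecting the range of files by name rather than by file date can be sketched in Python as follows. This is an illustrative helper, not part of LISTSERV: archive_period and files_older_than are hypothetical names, the two-digit-year pivot of 70 is an assumption, and the function only lists candidates so that they can be reviewed before anything is deleted.

```python
import re
from pathlib import Path

# LISTNAME.LOGYYMMx, matched case-insensitively.
NAME = re.compile(r"^.+\.LOG(\d{2})(\d{2})[A-E]?$", re.IGNORECASE)


def archive_period(filename, pivot=70):
    """Return the (year, month) encoded in the filename, or None.
    Two-digit years >= pivot are read as 19xx (assumption)."""
    m = NAME.match(filename)
    if not m:
        return None
    yy, mm = int(m.group(1)), int(m.group(2))
    if not 1 <= mm <= 12:
        return None
    return (1900 + yy if yy >= pivot else 2000 + yy, mm)


def files_older_than(archive_dir, cutoff_year, cutoff_month):
    """List archive files dated strictly before the cutoff month.
    Nothing is deleted; files that do not match the naming pattern
    are ignored."""
    old = []
    for path in sorted(Path(archive_dir).iterdir()):
        period = archive_period(path.name)
        if path.is_file() and period is not None and period < (cutoff_year, cutoff_month):
            old.append(path)
    return old
```

Calling files_older_than on the list's archive directory with a cutoff of 2004 and month 1 would return every file from 2003 and earlier. Once the list has been checked, each returned path can be removed with its unlink() method or with the usual OS commands.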
Splitting Historical Archives
In some cases when the archives are very extensive and very large, it may be necessary to split the archives into two or more lists. What this means is that you create one or more separate lists, then move part of the archives to the separate lists.
Let's use our NACSE-D list as a detailed example. We want to split the archives for 2003 and earlier off to another list called NACSE-D-2003. First we create this list as a "clone" of the NACSE-D list:
Then configure the list for no subscribers, no postings and archive access only by subscribers of the original NACSE-D list. That's what the NACSE-D parameter means on the Notebook keyword:
Then, using appropriate OS commands, move all of the 2003 and earlier message archive files to the new NACSE-D-2003 archives directory. It is also necessary to rename these files so they match the new list name.
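The move-and-rename step is easy to get wrong by hand, so it can help to build the list of renames first and inspect it. The Python sketch below does that under several assumptions: plan_split and apply_plan are hypothetical helper names, the destination directory already exists, the two-digit-year pivot is 70, and the new list name is passed in the same letter case your existing archive files use.

```python
import re
import shutil
from pathlib import Path

# Group 2 is the ".LOGYYMMx" suffix, which is kept unchanged.
NAME = re.compile(r"^(.+)(\.LOG(\d{2})(\d{2})[A-E]?)$", re.IGNORECASE)


def plan_split(src_dir, dst_dir, new_list, last_year, pivot=70):
    """Return (source, destination) path pairs for every archive file
    dated last_year or earlier, renamed to the new list name.
    Nothing is moved here; files not matching the pattern are skipped."""
    plan = []
    for path in sorted(Path(src_dir).iterdir()):
        m = NAME.match(path.name)
        if not (m and path.is_file()):
            continue
        yy = int(m.group(3))
        year = 1900 + yy if yy >= pivot else 2000 + yy
        if year <= last_year:
            plan.append((path, Path(dst_dir) / (new_list + m.group(2))))
    return plan


def apply_plan(plan):
    """Carry out the moves, refusing to overwrite an existing file."""
    for src, dst in plan:
        if dst.exists():
            raise FileExistsError(dst)
        shutil.move(str(src), str(dst))
```

For the example in this tip, plan_split would be called with the NACSE-D archive directory, the NACSE-D-2003 archive directory, the name "NACSE-D-2003", and 2003 as the last year. Printing the resulting pairs before calling apply_plan gives a chance to confirm that, say, NACSE-D.LOG9806 will become NACSE-D-2003.LOG9806.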
Then reindex both lists:
Then verify that both list home pages show the proper range of archives:
As a final touch you can edit the OBJECT-A0-ARCHIVES web template for the NACSE-D list to add a link at the bottom to point to the NACSE-D-2003 list:
Of course, you can split the original list archives into more than two lists. You might even want to do it by one-year segments, such as NACSE-D-2006, NACSE-D-2005, NACSE-D-2004, and so on. You can choose whatever makes sense for your particular situation and list. This is an involved process but will be worth the effort, thanks to greatly improved search performance and responsiveness of the web interface while preserving the important history of your list.