Storage Made Easy

Jim Liddle

Related Topics: Cloud Computing, Cloudonomics Journal, Open Source Journal, Open Source and Cloud Computing, Cloud Backup and Recovery Journal

Elastic Cloud Monitoring and Tuning Tips for Linux

When deploying on EC2 you still need to tune your instance's operating system and monitor your application

When deploying on EC2, even though Amazon provides the hardware infrastructure, you still need to tune your instance's operating system and monitor your application. You should also review your hardware and software requirements, your application design, and your deployment strategy.

The Operating System

Change ulimit

‘ulimit -n’ specifies the maximum number of open files (file descriptors) that a process may hold. If the value is set too low, you may see file open errors, memory allocation failures, or connection establishment errors. The default is typically 1024; you should normally increase it to at least 8096.

Issue the following command to set the value.

ulimit -n 8096

Use the ulimit -a command to display the current values of all limits on system resources.
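The ulimit command only affects the current shell and the processes started from it, and a non-root user cannot raise the soft limit above the hard limit. The following is a minimal sketch of a guarded version of the command above; the function name raise_nofile is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Raise the soft open-file limit for this shell, capped at the hard limit.
# raise_nofile is an illustrative helper name, not a standard command.
raise_nofile() {
    target=$1
    soft=$(ulimit -Sn)
    hard=$(ulimit -Hn)
    # Nothing to do if the soft limit is already unlimited or high enough.
    if [ "$soft" = "unlimited" ] || [ "$soft" -ge "$target" ]; then
        return 0
    fi
    # The soft limit cannot exceed the hard limit, so cap the request.
    if [ "$hard" != "unlimited" ] && [ "$hard" -lt "$target" ]; then
        target=$hard
    fi
    ulimit -Sn "$target"
}

raise_nofile 8096
echo "soft limit: $(ulimit -Sn), hard limit: $(ulimit -Hn)"
```

Run this in the same shell that launches your application (for example, from its start script), since a limit set in one shell does not carry over to shells started elsewhere.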

Tune the Network

A good, detailed reference for Linux IP tuning is here. Some of the important parameters to change for distributed applications are below:

TCP_FIN_TIMEOUT

The tcp_fin_timeout variable tells the kernel how long to keep sockets in the FIN-WAIT-2 state if you were the one closing the socket. It takes an integer value, which defaults to 60 seconds. To set the value to 30, issue the command:

echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
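A value written to /proc this way is lost on reboot, which matters when instances are restarted or relaunched. The sketch below prints the equivalent sysctl.conf-style line so that it can be persisted; the function name tcp_tuning_fragment is hypothetical, and the script itself changes nothing on the running system.

```shell
#!/bin/sh
# Emit a sysctl.conf-style line for the setting discussed above.
# tcp_tuning_fragment is an illustrative name; nothing here touches the kernel.
tcp_tuning_fragment() {
    printf '%s = %s\n' net.ipv4.tcp_fin_timeout 30
}

tcp_tuning_fragment

# To persist and apply the setting as root (not run by this script):
#   tcp_tuning_fragment >> /etc/sysctl.conf
#   sysctl -p
```

The same key = value form works for other kernel parameters under net.ipv4, and sysctl -w net.ipv4.tcp_fin_timeout=30 is an alternative to the echo command for a one-off change.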

TCP_KEEPALIVE_INTERVAL

The tcp_keepalive_intvl variable tells the kernel how long to wait for a reply to each keepalive probe. This makes it an important value when you calculate how long it will take for a connection to die a keepalive death. The variable takes an integer value and defaults to 75 seconds. To set the value to 15, issue the following command:

echo 15 > /proc/sys/net/ipv4/tcp_keepalive_intvl

TCP_KEEPALIVE_PROBES

The tcp_keepalive_probes variable tells the kernel how many TCP keepalive probes to send before it decides that a connection is broken. The variable takes an integer value; by default the kernel sends 9 probes before telling the application that the connection is broken. To change the value to 5, use the following command:

echo 5 > /proc/sys/net/ipv4/tcp_keepalive_probes
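To see how the interval and probe count combine into a "keepalive death", here is a rough sketch of the arithmetic. It also uses tcp_keepalive_time, the idle time before the first probe, which is not covered above; its usual default of 7200 seconds is assumed as a fallback. The helper names are hypothetical.

```shell
#!/bin/sh
# Worst-case seconds before an idle connection to a dead peer is dropped:
# idle time before the first probe + (probe interval * number of probes).
keepalive_death_secs() {
    idle=$1
    intvl=$2
    probes=$3
    echo $(( $idle + $intvl * $probes ))
}

# Read a value from /proc, falling back to a default when it is unreadable.
read_or_default() {
    if [ -r "$1" ]; then cat "$1"; else echo "$2"; fi
}

p=/proc/sys/net/ipv4
keepalive_death_secs \
    "$(read_or_default "$p/tcp_keepalive_time" 7200)" \
    "$(read_or_default "$p/tcp_keepalive_intvl" 75)" \
    "$(read_or_default "$p/tcp_keepalive_probes" 9)"
```

With the defaults (7200, 75, 9) the result is 7875 seconds; with the interval set to 15 and the probes set to 5, it drops to 7275 seconds, so most of the remaining delay comes from the idle time rather than from the probes.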

Monitoring

You can monitor system resources from the command line, but a monitoring system makes life easier. A couple of free open source monitoring tools that you can use are:

  • Ganglia: a free monitoring system
  • Hyperic: available in both commercial and free editions
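For comparison, a command-line check can be as small as the sketch below, which could be run from cron. It is not how Ganglia or Hyperic work; the function names and the 90% threshold are assumptions chosen for illustration, and it relies on the MemAvailable field, which older kernels do not report.

```shell
#!/bin/sh
# Percentage of memory in use, computed from a meminfo-format file.
# Requires the MemAvailable field; without it the result reads as 100.
mem_used_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }' \
        "${1:-/proc/meminfo}"
}

# Print a warning when usage reaches a threshold (default 90, an arbitrary example).
check_memory() {
    pct=$(mem_used_pct "$1")
    if [ "${pct:-0}" -ge "${2:-90}" ]; then
        echo "WARNING: memory ${pct}% used"
    else
        echo "OK: memory ${pct}% used"
    fi
}

if [ -r /proc/meminfo ]; then
    check_memory /proc/meminfo
fi
```

A monitoring system adds what a script like this lacks: history, graphs, alert routing, and a view across many instances.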

Logging

You will be amazed how few projects care about logging until they hit a problem. Have a consistent logging procedure in place that collects the logs from your different machines, so that you can troubleshoot when a problem occurs.
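One simple way to collect logs from several machines is to pull them over SSH into one directory per host. The sketch below assumes rsync and SSH access are available; the host names, the /var/log/myapp/ path, and the collect_logs function are all hypothetical placeholders.

```shell
#!/bin/sh
# Pull application logs from each host into <dest>/<host>/.
# Set DRY_RUN=1 to print the rsync commands instead of running them.
# Note: the per-host destination directories are created even in a dry run.
collect_logs() {
    dest=$1
    shift
    for host in "$@"; do
        mkdir -p "$dest/$host"
        # /var/log/myapp/ is a placeholder for your application's log path.
        ${DRY_RUN:+echo} rsync -az "$host:/var/log/myapp/" "$dest/$host/"
    done
}

# Example with made-up host names; the dry run only prints the commands.
DRY_RUN=1 collect_logs ./collected-logs web1.example.com web2.example.com
```

Keeping one directory per host makes it easy to line up log entries from different machines by timestamp when you are chasing a problem across instances.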

Linux Commands

Some Linux commands that we use regularly, and that you might find useful, are listed below. More details can be found in my prior blog post here, and also in posts here and here.

  • top: Display Linux tasks
  • vmstat: Report virtual memory statistics
  • free: Display the amount of free and used memory in the system
  • netstat: Print network connections, routing tables, interface statistics, masquerade connections, and multicast memberships
  • ps: Report a snapshot of the current processes
  • iostat: Report Central Processing Unit (CPU) statistics and input/output statistics for devices and partitions
  • sar: Collect, report, or save system activity information
  • tcpdump: Dump traffic on a network
  • strace: Trace system calls and signals

More Stories By Jim Liddle

Jim is CEO of Storage Made Easy. He has been a regular blogger at SYS-CON.com since 2004, covering mobile, Grid, and Cloud Computing topics.

Most Recent Comments
itsderek23 09/29/09 03:32:00 PM EDT

Good post Jim! As the co-founder of Scout, a hosted server monitoring solution, we see some special needs for cloud monitoring.

Typical monitoring solutions that are saved with backup images or configuration scripts are difficult to update and test. When new instances are being created often (a typical cloud use case), it's important to make the deployment as simple as possible. Updating your monitoring scripts and having to test them out by deploying another server is a pain.

Monitoring is often an afterthought and requires tweaking.

Our approach with Scout is to load the monitoring profile from a single line in a crontab file - no other configuration is required. The monitoring profile for servers can be changed at any time in our scoutapp.com web interface and doesn’t require changes to the deployment process. It decouples monitoring from deployment.

Thought you might find the approach interesting (more details, along with a video, are here).