Tools that analyze logs after the fact are certainly invaluable to anybody who has to manage a cache. However, these tools mostly present averages; they do not present a dynamic picture. In other words, if half your users are eating steak and the other half are eating cabbage, a log analysis tool might make you think they are all eating sarma.
MRTG samples data every five minutes to present a dynamic view. This view can be used not only for end of the day analysis, but for active monitoring that can sometimes avert disaster.
An additional benefit of MRTG is that it can present the data it has gathered not only in a daily graph (one that shows activity over the last 36 hours) but also in weekly, monthly and yearly graphs, making it much easier to see long-term trends.
The hit rate is calculated for the last 5 minutes (green) and the last hour (blue). Hit rate is calculated by volume.
| Max 5 minute | 74.0 % | Average 5 minute | 32.0 % | Current 5 minute | 45.0 % |
| Max 60 minute | 59.0 % | Average 60 minute | 34.0 % | Current 60 minute | 37.0 % |

| Max 5 minute | 54.0 % | Average 5 minute | 27.0 % | Current 5 minute | 35.0 % |
| Max 60 minute | 60.0 % | Average 60 minute | 29.0 % | Current 60 minute | 36.0 % |

| Max 5 minute | 49.0 % | Average 5 minute | 26.0 % | Current 5 minute | 35.0 % |
| Max 60 minute | 58.0 % | Average 60 minute | 28.0 % | Current 60 minute | 35.0 % |

| Max 5 minute | 46.0 % | Average 5 minute | 28.0 % | Current 5 minute | 26.0 % |
| Max 60 minute | 47.0 % | Average 60 minute | 28.0 % | Current 60 minute | 27.0 % |
MRTG 2.5.2alpha-1998/02/06, Tobias Oetiker <oetiker@ee.ethz.ch> and Dave Rand <dlr@bungi.com>
The yearly graph shows the declining hit rate, brought on, perhaps, by the growth of the internet, and also the improvement brought by Squid 2.0.
MRTG was primarily meant to monitor traffic through routers, which is why it best supports receiving data through SNMP.
MRTG (version 2) is not the ideal tool: its storage model can only use integers, and when monitoring a large number of variables (more than a few tens) it puts a significant load on the machine. There are several ways of reducing MRTG's load: the packages Orca and Cricket, and the UseRRDTool patch. All three use RRDTool (where RRD stands for Round Robin Database), but only the UseRRDTool patch uses standard MRTG configuration files. However, even with UseRRDTool you need to install a special CGI program which will generate your graphs on the fly.
SNMP is normally transported through UDP packets. That makes it a light protocol - there is no connection setup/teardown expense. The disadvantage is that on a busy network, packets may be lost and higher level software has to handle timeouts and retransmissions.
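To make the timeout and retransmission burden concrete, here is a minimal sketch in Python of what the "higher level software" has to do. The function name and its parameters are hypothetical, and the payload is assumed to be an already encoded SNMP request; building one is outside the scope of this sketch.

```python
import socket


def request_with_retries(sock, payload, address, timeout=2.0, retries=3):
    """Send a datagram and wait for a reply, resending when nothing comes back.

    `sock` is a UDP socket (or anything with the same methods) and `payload`
    is assumed to be an already encoded request.
    """
    sock.settimeout(timeout)
    for attempt in range(1, retries + 1):
        # UDP gives no delivery guarantee, so every attempt resends the request.
        sock.sendto(payload, address)
        try:
            reply, _peer = sock.recvfrom(65535)
            return reply, attempt
        except socket.timeout:
            # Either the request or the reply was lost; try again.
            continue
    raise TimeoutError(f"no reply from {address} after {retries} attempts")
```

With a real device you would pass `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` as `sock`. Tools such as MRTG do this for you.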
When data is prepared for presentation over SNMP, it is organised into a MIB (management information base) tree. The nodes of the tree are the available items of information. Each node can be labeled with a word or with a number, and the full name of an SNMP variable is the sequence of nodes that are traversed between the root of the MIB and the node.
For instance, the Squid MIB can be addressed by its root, which is 1.3.6.1.4.1.3495.1. In symbolic names, that is iso.org.dod.internet.private.enterprises.nlanr.squid. Symbolic names are used only when interacting with people; the packet contains only the numeric MIB label.
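The correspondence between the two notations can be sketched in a few lines of Python. The path below is just the Squid root quoted above, written as (number, name) pairs; the variable and function names are made up for illustration.

```python
# Each node on the way from the MIB root to Squid's subtree: (number, name).
SQUID_ROOT = [
    (1, "iso"),
    (3, "org"),
    (6, "dod"),
    (1, "internet"),
    (4, "private"),
    (1, "enterprises"),
    (3495, "nlanr"),
    (1, "squid"),
]


def numeric_oid(path):
    """The form that actually travels in the packet."""
    return ".".join(str(number) for number, _name in path)


def symbolic_oid(path):
    """The form meant for people."""
    return ".".join(name for _number, name in path)


if __name__ == "__main__":
    print(numeric_oid(SQUID_ROOT))
    print(symbolic_oid(SQUID_ROOT))
```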
SNMP has some drawbacks. There is very little concern for security in basic SNMP. Instead of the cryptographic handshake we have come to expect from more modern protocols, SNMP uses a simple password scheme called a "community string". Only request packets with the appropriate community string are accepted by the monitored device.
Needless to say, such security is broken the moment someone connects a "sniffer" to your network. SNMPv2 and SNMPv3 provide better security, but they are still not widely used. Therefore, most router jockeys prefer to rely on IP controls to prevent unauthorized monitoring of their routers. Squid also makes IP restrictions for SNMP access possible. Also, Squid from version 2.2 on will log failed SNMP attempts to cache.log.
The only reference to available SNMP variables is mib.txt, which the install places in squid's /etc directory. This is not the most readable document imaginable. The first few entries have nice descriptions. Later, however, descriptions aren't provided. Here is an example:
cacheProtoAggregateStats OBJECT IDENTIFIER ::= { cacheProtoStats 1 }
cacheClientHttpRequests OBJECT-TYPE
SYNTAX Counter32
MAX-ACCESS read-only
STATUS current
::= { cacheProtoAggregateStats 1 }
cacheHttpHits OBJECT-TYPE
SYNTAX Counter32
MAX-ACCESS read-only
STATUS current
::= { cacheProtoAggregateStats 2 }
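An excerpt like the one above is regular enough to be read mechanically. The following Python sketch pulls out each object's name, parent node, index and SYNTAX with two regular expressions. It is only an illustration written against this excerpt, not a general MIB parser, and the function and variable names are my own.

```python
import re

MIB_EXCERPT = """\
cacheProtoAggregateStats OBJECT IDENTIFIER ::= { cacheProtoStats 1 }
cacheClientHttpRequests OBJECT-TYPE
    SYNTAX Counter32
    MAX-ACCESS read-only
    STATUS current
    ::= { cacheProtoAggregateStats 1 }
cacheHttpHits OBJECT-TYPE
    SYNTAX Counter32
    MAX-ACCESS read-only
    STATUS current
    ::= { cacheProtoAggregateStats 2 }
"""

# name, optional body, then "::= { parent index }"
OBJECT_RE = re.compile(
    r"(\w+)\s+OBJECT(?:-TYPE| IDENTIFIER)\b(.*?)::=\s*\{\s*(\w+)\s+(\d+)\s*\}",
    re.DOTALL,
)
SYNTAX_RE = re.compile(r"SYNTAX\s+(\w+)")


def parse_mib(text):
    """Map each object name to its parent node, index and SYNTAX (if any)."""
    objects = {}
    for name, body, parent, index in OBJECT_RE.findall(text):
        syntax = SYNTAX_RE.search(body)
        objects[name] = {
            "parent": parent,
            "index": int(index),
            "syntax": syntax.group(1) if syntax else None,
        }
    return objects


if __name__ == "__main__":
    for name, info in parse_mib(MIB_EXCERPT).items():
        print(name, info)
```

Note that this recovers the structure and the data type, but not the units.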
Parsing the MIB file by hand is dull work and doesn't tell us what units are used. Sometimes the only way to find out is to start graphing the data and then compare it with the same data from another source.
MRTG can now parse Squid's mib.txt (located in src/mib.txt) for you, so you only need to add the LoadMIBs: /path/to/mib.txt directive to your MRTG configuration file; from then on you can use symbolic names for the variables you want to display with MRTG. Here is a short list of some of the available variables. (The list was taken from Andreas Pabst's contribution to MRTG, but the comments are mine.)
Such things as number of ftpget processes are understandably missing, since ftpget is no longer external in squid 2. The object and volume hit rate can be calculated with some effort.
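As an illustration of the "some effort" involved, here is a rough Python sketch of an object hit rate computed from two successive samples of the cacheHttpHits and cacheClientHttpRequests counters shown in the MIB excerpt. The function names are hypothetical. Since both are Counter32 values, the sketch assumes at most one wrap-around between samples.

```python
COUNTER32_MODULUS = 2 ** 32


def counter_delta(previous, current):
    """Increase of a Counter32 between two samples, allowing for one wrap."""
    return (current - previous) % COUNTER32_MODULUS


def object_hit_rate(prev_hits, cur_hits, prev_requests, cur_requests):
    """Percentage of requests in the sampling interval that were hits."""
    requests = counter_delta(prev_requests, cur_requests)
    if requests == 0:
        # No traffic in the interval, so there is nothing to divide by.
        return 0.0
    hits = counter_delta(prev_hits, cur_hits)
    return 100.0 * hits / requests


if __name__ == "__main__":
    # 64 hits out of 100 requests since the previous sample.
    print(object_hit_rate(100, 164, 1000, 1100))
```

A volume hit rate would follow the same pattern, using byte counters in place of request counters.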
But there is still plenty of data available through other means that is not available through the SNMP interface. This is probably intentional: if everything that is available through the cachemgr pages were forced into the cache MIB, nobody would want to implement it.
The human readable pages are very informative:
Squid Object Cache: Version 2.1.PATCH2
Start Time: Wed, 17 Feb 1999 21:09:26 GMT
Current Time: Fri, 19 Feb 1999 01:32:06 GMT
Connection information for squid:
Number of clients accessing cache: 777
Number of HTTP requests received: 312425
Number of ICP messages received: 243757
Number of ICP messages sent: 244696
Number of queued ICP replies: 0
Request failure ratio: 0.00%
HTTP requests per minute: 183.5
ICP messages per minute: 286.9
Select loop called: 12194937 times, 8.377 ms avg
Cache information for squid:
Request Hit Ratios: 5min: 64.2%, 60min: 46.2%
Byte Hit Ratios: 5min: 35.9%, 60min: 27.9%
Storage Swap size: 15917088 KB
Storage Mem size: 44212 KB
Storage LRU Expiration Age: 13.46 days
Mean Object Size: 15.95 KB
Requests given to unlinkd: 0
Median Service Times (seconds) 5 min 60 min:
HTTP Requests (All): 0.05331 0.28853
Cache Misses: 0.68577 0.58309
Cache Hits: 0.01955 0.02899
Near Hits: 0.37825 0.32154
Not-Modified Replies: 0.01235 0.01847
DNS Lookups: 0.06083 0.18639
ICP Queries: 0.01494 0.01586
Resource usage for squid:
UP Time: 102160.743 seconds
CPU Time: 14389.014 seconds
CPU Usage: 14.08%
CPU Usage, 5 minute avg: 4.40%
CPU Usage, 60 minute avg: 6.06%
Maximum Resident Size: 0 KB
Page faults with physical i/o: 148003
Memory accounted for:
Total accounted: 124632 KB
File descriptor usage for squid:
Maximum number of file descriptors: 4096
Largest file desc currently in use: 120
Number of file desc currently in use: 50
Available number of file descriptors: 4046
Files queued for open: 0
Reserved number of file descriptors: 100
Disk files open: 4
Internal Data Structures:
998195 StoreEntries
9155 StoreEntries with MemObjects
9152 Hot Object Cache Items
997915 Filemap bits set
997911 on-disk objects
It turns out that cachemgr only does very minor reformatting. The page can be requested by any client, and even in its raw state it has sufficient syntactic fluff. Thus a small program equipped with appropriate regular expressions can reliably extract data.
I wrote a small tool which requests a page and then caches it so that various regular expressions can extract data from it. Once all the data has been collected, it is sent as a UDP packet to an MRTG machine.
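The idea can be sketched in Python as follows. This is not the tool described above, only an illustration of the same approach: the patterns are written against the sample page shown earlier, and the packet layout (space-separated key=value pairs) and all names are assumptions of mine.

```python
import re
import socket

# One regular expression per value of interest, written against the
# labels that appear on the page shown earlier.
PATTERNS = {
    "http_requests": re.compile(r"Number of HTTP requests received:\s+(\d+)"),
    "request_hit_5min": re.compile(r"Request Hit Ratios:\s+5min:\s+([\d.]+)%"),
    "byte_hit_5min": re.compile(r"Byte Hit Ratios:\s+5min:\s+([\d.]+)%"),
    "fds_in_use": re.compile(r"Number of file desc currently in use:\s+(\d+)"),
}


def extract(page, patterns=PATTERNS):
    """Run every pattern over the already fetched page text."""
    values = {}
    for key, pattern in patterns.items():
        match = pattern.search(page)
        if match:
            # Keep the text exactly as printed; the receiver decides the type.
            values[key] = match.group(1)
    return values


def encode_packet(values):
    """Hypothetical wire format: 'key=value' pairs separated by spaces."""
    return " ".join(f"{key}={values[key]}" for key in sorted(values)).encode("ascii")


def send_packet(values, host, port):
    """Ship the collected values to the monitoring machine in one datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_packet(values), (host, port))
```

Fetching the page once and running all the patterns over the cached text means the cache is queried only once per sampling interval, however many values are graphed.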
Since writing regular expressions soon became tedious, I wrote another small tool, which requests a page and then turns it into a bunch of regular expressions. Simply pick out the right one, and paste it into the first tool.
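A generator of this kind could look roughly like the Python sketch below. Again, this is not the original tool, and the names are made up. It turns each line that contains a standalone number into a regular expression with one capture group per number, and it leaves alone digits that are glued to letters or dots, such as the "5min" label or a version string.

```python
import re

# A number that is not glued to letters, digits or dots on either side.
NUMBER = re.compile(r"(?<![\w.])\d+(?:\.\d+)?(?![\w.])")
CAPTURE = r"(\d+(?:\.\d+)?)"


def escape_literal(text):
    """Escape the fixed text, letting any run of whitespace match flexibly."""
    return r"\s+".join(re.escape(piece) for piece in re.split(r"\s+", text))


def line_to_regex(line):
    """Return a regex with one capture group per number, or None."""
    line = line.strip()
    parts = []
    position = 0
    for match in NUMBER.finditer(line):
        parts.append(escape_literal(line[position:match.start()]))
        parts.append(CAPTURE)
        position = match.end()
    if not parts:
        return None  # nothing worth graphing on this line
    parts.append(escape_literal(line[position:]))
    return "".join(parts)


if __name__ == "__main__":
    print(line_to_regex("Request Hit Ratios: 5min: 64.2%, 60min: 46.2%"))
```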
Also be sure to check out another Jens from Germany for some very nice web and cache analysis tools.
Please send questions, ideas, accolades :-) to Matija Grabnar
Last updated: Fri Feb 19 03:52:09 CET 1999