IT journal: 2009

How to delete files from CSS11501

This post shows how to delete archived files on a Cisco CSS11501. The procedure for deleting other files (core dumps, logs, etc.) is similar; you just need to know what you want to delete. To manipulate the files we first need to enter debug mode:

CSS11501# llama

In debug mode, the ap_file command provides "File commands using Complete pathnames". To delete archived_rc_file from the Archive directory, issue:

CSS11501(debug)# ap_file delete c:/Archive/archived_rc_file

To see which files are present in the Archive directory, press Tab after typing ap_file delete c:/Archive/. I am using WebNS sg0820001, and the dir command, which should list the contents of the Archive directory, did not yield any results when run:

CSS11501(debug)# dir c:/Archive/
CSS11501(debug)#

mdadm tips on Linux software RAID

mdadm is a tool for managing, creating and reporting on Linux software RAID arrays.

Below are some tips which I have found useful so far.

Improve RAID1 re-sync time with write-intent bitmap
The RAID driver periodically writes out bitmap information recording which areas of the RAID component have been modified since the RAID array was last in sync.

If, for example, one of the two members of a RAID1 array fails and is removed from the array, md (the multiple devices software RAID driver) will set bits in the bitmap for the regions the active member changes after the two members were last in sync. If the same failed/removed drive is re-added to the RAID1 array, md will notice and will recover only the portions indicated by the bitmap. In this way a lengthy re-sync is avoided (a full re-sync is normally needed if the drives are not in sync when the array starts up).
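As a minimal sketch of how this might be applied, the commands below add an internal write-intent bitmap to an existing array and later re-add a removed member. The device names /dev/md0 and /dev/sdb1 are assumptions for illustration, and DRY_RUN defaults to 1 so the script only prints what it would run.

```shell
#!/bin/sh
# Sketch: enable a write-intent bitmap and re-add a removed RAID1 member.
# ARRAY and MEMBER are hypothetical device names; adjust them to your system.
ARRAY="${ARRAY:-/dev/md0}"
MEMBER="${MEMBER:-/dev/sdb1}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

bitmap_workflow() {
    # Add an internal bitmap to the running array.
    run mdadm --grow "$ARRAY" --bitmap=internal
    # Show array details; with a bitmap present the output mentions it.
    run mdadm --detail "$ARRAY"
    # Once the failed/removed member is available again, re-add it so that
    # only the regions flagged in the bitmap are resynchronised.
    run mdadm "$ARRAY" --re-add "$MEMBER"
}

bitmap_workflow
```

Setting DRY_RUN=0 (as root) would execute the commands instead of printing them; the bitmap can be removed again with --bitmap=none.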

Cisco CSS11501 source groups and ACL

A few days ago I had to reconfigure a running CSS 11501 loadbalancer. In the existing configuration, traffic initiated by some services running in one VLAN was source NAT-ed towards any destination; the goal was a selective source NAT based on the destination IP address.
What I wanted to do is almost exactly what is described in this official Cisco document, with the difference that I had more destination IP addresses for my ACL-based source NAT, I had only 2 VLANs, and the IP addresses were different from those in the example given.
Below is an excerpt similar to my original configuration:

service SERV11
ip address 192.168.0.3
protocol tcp
keepalive type tcp
redundant-index 111
keepalive port 11501
active

service SERV12
ip address 192.168.0.4
protocol tcp
keepalive type tcp
redundant-index 112
keepalive port 11501
active

....

owner OWNER1

content SERV_BAL
vip address 10.0.0.1
add service SERV11
add service SERV12
redundant-index 11
balance leastconn
flow-reset-reject
flow-timeout-multiplier 20
active

....

group GROUP1
add service SERV11
add service SERV12
vip address 10.0.0.1
redundant-index 21
active

Two of the destination addresses which I wanted to bypass the source NAT when the connection was initiated by the configured services (SERV11 and SERV12) were 10.0.0.11 and 10.0.0.12. So I wrote an ACL as the documentation recommends and applied it to the circuit VLAN of the configured services (VLAN 1), while on the other VLAN there was an ACL which allowed all traffic.

acl enable

....

acl 1
clause 5 bypass any 192.168.0.3 255.255.255.255 destination 10.0.0.11 255.255.255.255 
clause 10 bypass any 192.168.0.3 255.255.255.255 destination 10.0.0.12 255.255.255.255
clause 15 bypass any 192.168.0.4 255.255.255.255 destination 10.0.0.11 255.255.255.255 
clause 20 bypass any 192.168.0.4 255.255.255.255 destination 10.0.0.12 255.255.255.255
clause 101 permit any 192.168.0.3 255.255.255.255 destination any sourcegroup GROUP1
clause 102 permit any 192.168.0.4 255.255.255.255 destination any sourcegroup GROUP1
clause 254 permit any any destination any
apply circuit-(VLAN1)

However, when checking incoming sessions from SERV11 and SERV12 on 10.0.0.11 and 10.0.0.12, I could see that the source IP address was still NAT-ed (packets were arriving with 10.0.0.1 as the source IP address). There is a catch which was not obvious to me from the documentation (my non-native English probably contributed to that :) ). For the packets to arrive on 10.0.0.11 and 10.0.0.12 with their real source IP address, and for traffic to the rest of the destinations to be NAT-ed, the services had to be removed from the source group. In the end the desired configuration looked like this:

acl enable

....

service SERV11
ip address 192.168.0.3
protocol tcp
keepalive type tcp
redundant-index 111
keepalive port 11501
active

service SERV12
ip address 192.168.0.4
protocol tcp
keepalive type tcp
redundant-index 112
keepalive port 11501
active

....

owner OWNER1

content SERV_BAL
vip address 10.0.0.1
add service SERV11
add service SERV12
redundant-index 11
balance leastconn
flow-reset-reject
flow-timeout-multiplier 20
active

....

group GROUP1
vip address 10.0.0.1
redundant-index 21
active

....

acl 1
clause 5 bypass any 192.168.0.3 255.255.255.255 destination 10.0.0.11 255.255.255.255 
clause 10 bypass any 192.168.0.3 255.255.255.255 destination 10.0.0.12 255.255.255.255
clause 15 bypass any 192.168.0.4 255.255.255.255 destination 10.0.0.11 255.255.255.255 
clause 20 bypass any 192.168.0.4 255.255.255.255 destination 10.0.0.12 255.255.255.255
clause 101 permit any 192.168.0.3 255.255.255.255 destination any sourcegroup GROUP1
clause 102 permit any 192.168.0.4 255.255.255.255 destination any sourcegroup GROUP1
clause 254 permit any any destination any
apply circuit-(VLAN1)

acl 2
clause 254 permit any any destination any
apply circuit-(VLAN2)

Here VLAN2 is the network towards 10.0.0 and the rest of the clients.

Reference: CSS Content Load-Balancing Configuration Guide (Software Version 8.10)

VRRP master/master issue on CSS 11501 with 3550

In the pictured setup, two Cisco CSS 11501 loadbalancers provided redundancy with fate sharing, and the route from the "Servers" networks towards the client network went through an IP address configured on a redundant interface shared by the 2 loadbalancers.

The VRRP announcements for the virtual routers holding the redundant interfaces on vlans A and B travelled between the two loadbalancers through the 2 Cisco Catalyst 3550 switches, which were running IOS (C3550-I9Q3L2-M), Version 12.1(19)EA1c.

In more detail: each of the 2 loadbalancers had one physical link to its corresponding L3 3550, carrying vlans A and B (on the server side). One ISC link connected the two CSS for adaptive session redundancy (ASR). The link between the two Cisco 3550 was set up as an 802.1q trunk transporting, among others, vlans A and B, over which the VRRP communication had to take place.
Although the setup and configuration were double and triple checked, each of the loadbalancers claimed to be master for the virtual router instance running on its corresponding vlan (A or B).
For brevity I will illustrate only the case of the virtual router on vlan A, although the problem seemed to be strongly related to the fact that the CSS were connected to the 3550 through a trunk link.

CSS11501_right# show redundant-interfaces

Redundant-Interfaces:

Interface Address: 192.168.0.2 VRID: 1
Redundant Address: 192.168.0.1 Range: 1
State: Master Master IP: 192.168.0.2

CSS11501_left# show redundant-interfaces

Redundant-Interfaces:

Interface Address: 192.168.0.3 VRID: 1
Redundant Address: 192.168.0.1 Range: 1
State: Master Master IP: 192.168.0.3

While searching for this specific problem (both CSS being master), I found that most cases were related to misconfiguration: either an access list was blocking traffic between the 2 devices, or the VRID was incorrect, etc. However, there was nothing wrong with the configuration on the CSS or on the 3550s.
Whenever I checked the counter of VRRP announcements received by the presumed slave loadbalancer, the number was always 0.

CSS11501_left# llama
CSS11501_left(debug)# ip scp statistics

totalIpFrames received: 211300
invalidIPFrame: 0 malformedIPFrame: 0
noIngressIPFrame: 0 srcDestSameIPFrame: 0
badIPVersion: 0 badIpHeaderLength: 0
badIpChecksum: 0 badSrcIPFrame: 0
loopbackIPFrame: 0 badIPAddress: 0
badIpDestAddress: 0 zeroTTLIPFrame: 0
badIpProtocol: 0 badIpOptions: 0

Packets received with supported protocol types:
IPPROTO_IP: 0 IPPROTO_ICMP: 12285
IPPROTO_IGMP: 0 IPPROTO_GGP: 0
IPPROTO_TCP: 3129 IPPROTO_EGP: 0
IPPROTO_PUP: 0 IPPROTO_UDP: 47625
IPPROTO_IDP: 0 IPPROTO_TP: 0
IPPROTO_EON: 0 IPPROTO_OSPF: 0
IPPROTO_ENCAP: 0 IPPROTO_VRRP: 0
IPPROTO_OSPF: 0

IP PACKET TO VXWORKS STATISTICS:
packetLeakToVxWorks: 170436
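To compare this counter over time, the output above can be saved to a text file and the value extracted with a small helper. This is only a sketch: the function name vrrp_count and the capture file name scp_stats.txt are hypothetical, and it assumes the output format shown above.

```shell
#!/bin/sh
# Sketch: pull the IPPROTO_VRRP counter out of saved "ip scp statistics" output.
# The function reads the saved output on stdin.
vrrp_count() {
    # Print the number following "IPPROTO_VRRP:" (first match only).
    sed -n 's/.*IPPROTO_VRRP:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example usage with a hypothetical capture file:
#   vrrp_count < scp_stats.txt
```

Taking two captures some seconds apart and comparing the values shows whether any announcements arrive at all; a counter that stays at 0 on the backup matches the symptom described here.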

As mentioned earlier, the 3550 was running IOS Version 12.1(19)EA1c, while the CSS was running WebNS sg0730203 (07.30.2.03).
I didn't solve the issue myself. I was notified that there was a problem with the IOS running on the 3550 and that an upgrade to at least an EMI image 12.1.20 was needed. There is also a bug logged with Cisco, although the setup and configuration described here are not exactly the same as in the logged bug.
Here is the bug logged with Cisco.
After upgrading to IOS 12.1.20, the VRRP announcements were received by the slave loadbalancer and the initial VRRP negotiation took place correctly.
Reference: CSS Redundancy Configuration Guide

HP-UX ephemeral port range for TCP/UDP connections

On HP-UX you can tune the UDP local ephemeral port range separately from the TCP local ephemeral port range.

A previous post showed how to view and alter the ephemeral port range on a Linux system.

To follow up on the example from my Linux post: to set the local port range on HP-UX from 15000 to 61000, we can use the ndd utility to make the change on the fly.

For TCP we use:

# /usr/bin/ndd -set /dev/tcp tcp_smallest_anon_port 15000
# /usr/bin/ndd -set /dev/tcp tcp_largest_anon_port 61000

For UDP connections we use:

# /usr/bin/ndd -set /dev/udp udp_smallest_anon_port 15000
# /usr/bin/ndd -set /dev/udp udp_largest_anon_port 61000

To make the change persistent across reboots, we can append the following entries to /etc/rc.config.d/nddconf:
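A minimal sketch of such entries, assuming the usual nddconf layout of TRANSPORT_NAME/NDD_NAME/NDD_VALUE triplets. The index numbers 0 to 3 are assumptions and should continue from whatever entries the file already contains:

```shell
# /etc/rc.config.d/nddconf (sketch; index numbers are assumptions)
# Each tunable takes one index; if the file already has entries,
# continue from the highest existing index instead of reusing 0..3.
TRANSPORT_NAME[0]=tcp
NDD_NAME[0]=tcp_smallest_anon_port
NDD_VALUE[0]=15000

TRANSPORT_NAME[1]=tcp
NDD_NAME[1]=tcp_largest_anon_port
NDD_VALUE[1]=61000

TRANSPORT_NAME[2]=udp
NDD_NAME[2]=udp_smallest_anon_port
NDD_VALUE[2]=15000

TRANSPORT_NAME[3]=udp
NDD_NAME[3]=udp_largest_anon_port
NDD_VALUE[3]=61000
```

On HP-UX, running ndd -c should apply the settings from this file without waiting for a reboot.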

Linux ephemeral port range for TCP/UDP connections over IPv4

The range of ephemeral ports a client program can use (unless otherwise specified by the program) on modern Linux distributions is by default 32768 to 61000 for systems with more than 128 MB of RAM, and 1024 to 4999 (or even less) for systems with less than 128 MB of RAM. This range is defined in the kernel parameter /proc/sys/net/ipv4/ip_local_port_range and it affects both TCP and UDP client connections.
Should there be a need to extend this range (for example by setting the lowest port number to 15000), we can use:

echo "15000 61000" > /proc/sys/net/ipv4/ip_local_port_range
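To check the result, the range can be read back and the number of available ports computed. This is a sketch only; the function name port_range_size is hypothetical, and the file path can be passed as an argument (it defaults to the kernel parameter named above).

```shell
#!/bin/sh
# Sketch: report the ephemeral port range and how many ports it contains.
# The file path defaults to the kernel parameter and can be overridden
# by passing another file as the first argument.
port_range_size() {
    range_file="${1:-/proc/sys/net/ipv4/ip_local_port_range}"
    # The file holds two whitespace-separated numbers: lowest and highest port.
    read -r low high < "$range_file" || return 1
    # Print: lowest port, highest port, number of ports in the range.
    echo "$low $high $((high - low + 1))"
}
```

With the range set to 15000 61000, the last field is 46001 ports.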

To make this change persistent after reboots, we can use sysctl.
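One possible way to do this, sketched under the assumption that /etc/sysctl.conf is the persistent configuration file on the system. The helper name sysctl_port_range_line is hypothetical; it only builds the line to be added, after a basic sanity check of the values.

```shell
#!/bin/sh
# Sketch: build the sysctl.conf entry for a given ephemeral port range.
sysctl_port_range_line() {
    low="$1"
    high="$2"
    # Reject values outside 1..65535 or an inverted range.
    if [ "$low" -lt 1 ] || [ "$high" -gt 65535 ] || [ "$low" -gt "$high" ]; then
        echo "invalid port range: $low $high" >&2
        return 1
    fi
    echo "net.ipv4.ip_local_port_range = $low $high"
}

# To persist the setting (as root), something like:
#   sysctl_port_range_line 15000 61000 >> /etc/sysctl.conf
#   sysctl -p        # reload /etc/sysctl.conf
```

For a one-off change without editing the file, sysctl -w net.ipv4.ip_local_port_range="15000 61000" has the same effect as the echo shown earlier.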

Cisco CSS 11500 series internal services

By default the Cisco CSS 11500 series loadbalancers create an implicit internal service to check the availability of the gateways of their statically defined routes.
This internal service monitors the availability of the route through a specific gateway by sending ICMP keepalives (pings) to that gateway. While the internal service is in the "Alive" state, the route it monitors is kept in the routing table; if the service goes to the "Down" state, the route is withdrawn from the routing table (if there are no valid ARP entries cached for that gateway).

These services are visible from the debug mode of the CSS. To see how they are defined and at which interval the keepalives are sent, you can use:

CSS11501# llama
which gets you to debug mode

Resizing extended partitions with GNU parted

This post shows how to resize an extended partition using GNU parted. There are many partitioning tools available, but I wanted to use a tool which was installed by default on my test system (which runs CentOS Linux).

In summary, "The GNU Parted program allows you to create, destroy, resize, move, and copy hard disk partitions. Parted can be used for creating space for new operating systems, reorganizing disk usage, and copying data to new hard disks."

On my test CentOS system I had three primary partitions and one extended partition, as shown below:

Model: ATA ST3500320AS (scsi)
Disk /dev/sda: 500GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type      File system  Flags
 1      32.3kB  535MB   535MB   primary   ext3         boot
 2      535MB   11.0GB  10.5GB  primary   ext3
 3      11.0GB  12.1GB  1078MB  primary   linux-swap
 4      12.1GB  37.1GB  25.0GB  extended
 5      12.1GB  37.1GB  25.0GB  logical                lvm

As can be seen, I had plenty of space on my hard drive (500GB), but I could use only approximately 7% of it: with 3 primary partitions and one extended partition already defined, and the extended partition completely filled by its logical partition, there was no way to create another partition.
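The 7% figure follows from the table above: the last partition ends at 37.1GB on a 500GB disk. A small sketch of the arithmetic (the function name used_percent is hypothetical):

```shell
#!/bin/sh
# Sketch: percentage of the disk covered by partitions, from the table above.
# Arguments: end of the last partition and the disk size, both in GB.
used_percent() {
    awk -v used="$1" -v total="$2" 'BEGIN { printf "%.1f\n", used / total * 100 }'
}

used_percent 37.1 500
```

This prints 7.4, so roughly 93% of the disk lay outside any partition.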
