Sunday, November 14, 2010

Handling a VPN Fault That Caused a Business Outage

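    Customer's fault description: the VPN access system has failed, and a large number of fault reports have come in. Specifically, a ping still triggers the VPN dial-up and the login dialog pops up automatically, with no username or password error reported, but the Netscreen-Remote client never shows the small yellow key that indicates a successful VPN connection, so business is interrupted.
Troubleshooting: I asked the customer for a VPN test account and confirmed that the fault was exactly as reported. The log in the Netscreen-Remote Log Viewer reads as follows: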
09:59:03.375 My Connections\ - Initiating IKE Phase 1 (IP ADDR=1.1.1.1)
09:59:03.406 My Connections\ - SENDING>>>> ISAKMP OAK AG (SA, KE, NON, ID, VID, VID, VID, VID)
09:59:03.546 My Connections\ - RECEIVED<<< ISAKMP OAK AG (SA, VID, VID, VID, VID, KE, NON, ID, HASH, VID, NAT-D, NAT-D)
09:59:03.546 My Connections\ - Peer is NAT-T capable
09:59:03.546 My Connections\ - NAT is detected for Client
09:59:03.562 My Connections\ - SENDING>>>> ISAKMP OAK AG *(HASH, NAT-D, NAT-D, NOTIFY:STATUS_INITIAL_CONTACT)
09:59:03.562 My Connections\ - Established IKE SA
09:59:03.562    MY COOKIE de ce 78 0 81 58 21 bc
09:59:03.562    HIS COOKIE 7a ff b1 c1 c7 37 c1 31
09:59:03.656 My Connections\ - RECEIVED<<< ISAKMP OAK TRANS *(HASH, ATTR)
09:59:09.343 My Connections\ - RECEIVED<<< ISAKMP OAK TRANS *(Retransmission)
09:59:15.343 My Connections\ - RECEIVED<<< ISAKMP OAK TRANS *(Retransmission)
09:59:18.015 My Connections\ - SENDING>>>> ISAKMP OAK TRANS *(HASH, ATTR)
09:59:18.109 My Connections\ - RECEIVED<<< ISAKMP OAK TRANS *(HASH, ATTR)
09:59:18.109 My Connections\ - Received Private IP Address = IP ADDR=11.2.1.234
09:59:18.109 My Connections\ - SENDING>>>> ISAKMP OAK TRANS *(HASH, ATTR)
09:59:18.203 My Connections\ - RECEIVED<<< ISAKMP OAK TRANS *(HASH, ATTR)
09:59:18.203 My Connections\ - SENDING>>>> ISAKMP OAK TRANS *(HASH, ATTR)
09:59:18.203 My Connections\ - Initiating IKE Phase 2 with Client IDs (message id: BF484846)
09:59:18.203   Initiator = IP ADDR=11.2.1.234, prot = 0 port = 0
09:59:18.203   Responder = IP SUBNET/MASK=192.168.1.0/255.255.255.0, prot = 0 port = 0
09:59:18.203 My Connections\ - SENDING>>>> ISAKMP OAK QM *(HASH, SA, NON, ID, ID)
09:59:24.890 
09:59:33.875 My Connections\ - QM re-keying timed out (message id: BF484846). Retry count: 1
09:59:33.875 My Connections\ - SENDING>>>> ISAKMP OAK QM *(Retransmission)
09:59:46.890 
09:59:48.875 My Connections\ - QM re-keying timed out (message id: BF484846). Retry count: 2
09:59:48.875 My Connections\ - SENDING>>>> ISAKMP OAK QM *(Retransmission)
10:00:03.875 My Connections\ - QM re-keying timed out (message id: BF484846). Retry count: 3
10:00:03.875 My Connections\ - SENDING>>>> ISAKMP OAK QM *(Retransmission)
10:00:08.890 
10:00:18.875 My Connections\ - Exceeded 3 re-keying attempts (message id: BF484846)
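The log shows that IKE Phase 1 completed successfully (highlighted in red in the original post; see the "Established IKE SA" entry), while IKE Phase 2 did not complete (highlighted in blue; the Quick Mode exchange times out three times and ends with "Exceeded 3 re-keying attempts").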
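The reading of the client log above can be sketched in a few lines of Python. This is only an illustration, not a tool from the original post: the function name `summarize_ike` and the dictionary keys are hypothetical, and the matched phrases are simply the ones that appear in the log excerpt.

```python
import re

# Lines copied from the Netscreen-Remote log above (abridged).
SAMPLE_LOG = r"""
09:59:03.375 My Connections\ - Initiating IKE Phase 1 (IP ADDR=1.1.1.1)
09:59:03.562 My Connections\ - Established IKE SA
09:59:18.203 My Connections\ - Initiating IKE Phase 2 with Client IDs (message id: BF484846)
09:59:33.875 My Connections\ - QM re-keying timed out (message id: BF484846). Retry count: 1
09:59:48.875 My Connections\ - QM re-keying timed out (message id: BF484846). Retry count: 2
10:00:03.875 My Connections\ - QM re-keying timed out (message id: BF484846). Retry count: 3
10:00:18.875 My Connections\ - Exceeded 3 re-keying attempts (message id: BF484846)
""".strip()

RETRY = re.compile(r"QM re-keying timed out .*Retry count: (\d+)")


def summarize_ike(log_text):
    """Report how far the IKE negotiation got (hypothetical helper)."""
    summary = {
        "phase1_established": False,
        "phase2_started": False,
        "phase2_retries": 0,
        "phase2_gave_up": False,
    }
    for line in log_text.splitlines():
        if "Established IKE SA" in line:
            summary["phase1_established"] = True
        elif "Initiating IKE Phase 2" in line:
            summary["phase2_started"] = True
        elif "Exceeded" in line and "re-keying attempts" in line:
            summary["phase2_gave_up"] = True
        else:
            match = RETRY.search(line)
            if match:
                # Keep the highest retry count seen so far.
                summary["phase2_retries"] = max(
                    summary["phase2_retries"], int(match.group(1))
                )
    return summary


print(summarize_ike(SAMPLE_LOG))
```

A result with `phase1_established` true and `phase2_gave_up` true corresponds to the symptom described here: authentication succeeds, but the tunnel never comes up.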
Next, look at the related events on the VPN server (newest entries first):
2008-02-20 09:56:11 system info  00536 IKE<121.201.217.55> Phase 2 msg ID
                                       <a29ebdc9>: Negotiations have failed.
2008-02-20 09:56:11 system info  00536 Rejected an IKE packet on ethernet1/1
                                       from 121.201.217.55:500 to
                                       1.1.1.1:500 with cookies
                                       462e118477c5ab9c and 99453c6c6b16eb87
                                       because the VPN does not have an
                                       application SA configured.
2008-02-20 09:56:11 system info  00536 IKE<121.201.217.55> Phase 2: No policy
                                       exists for the proxy ID received:
                                       local ID (<192.168.1.0>/<255.255.255.0>,
                                       <0>, <0>) remote ID (<11.2.1.234>/
                                       <255.255.255.255>, <0>, <0>).
2008-02-20 09:56:11 system info  00536 IKE<121.201.217.55> Phase 2 msg ID
                                       <a29ebdc9>: Responded to the peer's
                                       first message.
2008-02-20 09:56:11 system info  00536 IKE<121.201.217.55>: XAuth login was
                                       passed for gateway <VPN_Gateway>,
                                       username <testuser>, retry: 0, Client
                                       IP Addr<11.2.1.234>, IPPool name:
                                       <VPN_POOL>, Session-Timeout:<0s>,
                                       Idle-Timeout:<0s>.
2008-02-20 09:56:00 system info  00536 IKE<121.201.217.55>: Received initial
                                       contact notification and removed Phase
                                       1 SAs.
2008-02-20 09:56:00 system info  00536 IKE<121.201.217.55> Phase 1: Completed
                                       Aggressive mode negotiations with a
                                       <28800>-second lifetime.
2008-02-20 09:56:00 system info  00536 IKE<121.201.217.55> Phase 1: Completed
                                       for user <testuser>.
2008-02-20 09:56:00 system info  00536 IKE<121.201.217.55>: Received initial
                                       contact notification and removed Phase
                                       2 SAs.
2008-02-20 09:56:00 system info  00536 IKE<121.201.217.55>: Received a
                                       notification message for DOI <1>
                                       <24578> <INITIAL-CONTACT>.
2008-02-20 09:56:00 system info  00536 IKE<121.201.217.55> Phase 1: IKE
                                       responder has detected NAT in front of
                                       the remote device.
2008-02-20 09:56:00 system info  00536 IKE<121.201.217.55> Phase 1: Responder
                                       starts AGGRESSIVE mode negotiations.
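The VPN server log gives a more detailed picture: IKE Phase 1 completed successfully (highlighted in red in the original post), IKE Phase 2 did not complete (highlighted in blue), and the log states the reason for the Phase 2 failure: no policy exists for the proxy ID received.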
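To compare the rejected proxy ID with the policy, it helps to pull its fields out of the event text. The following is a minimal sketch, assuming the event has been joined onto one line; `parse_proxy_ids` is a hypothetical helper, and treating the last two fields as protocol and port is an assumption based on the "prot = 0 port = 0" wording in the client log.

```python
import re

# Event text copied from the VPN server log above, wrapped lines joined.
EVENT = (
    "IKE<121.201.217.55> Phase 2: No policy exists for the proxy ID received: "
    "local ID (<192.168.1.0>/<255.255.255.0>, <0>, <0>) "
    "remote ID (<11.2.1.234>/<255.255.255.255>, <0>, <0>)."
)

# Matches "local ID (<addr>/<mask>, <n>, <n>)" and the "remote ID" equivalent.
PROXY_ID = re.compile(
    r"(local|remote) ID \(<([\d.]+)>/\s*<([\d.]+)>, <(\d+)>, <(\d+)>\)"
)


def parse_proxy_ids(event):
    """Return the local and remote proxy IDs found in the event text."""
    ids = {}
    for side, address, mask, protocol, port in PROXY_ID.findall(event):
        ids[side] = {
            "address": address,
            "mask": mask,
            # Assumed meaning of the two trailing fields: protocol, port.
            "protocol": int(protocol),
            "port": int(port),
        }
    return ids


for side, proxy_id in parse_proxy_ids(EVENT).items():
    print(side, proxy_id)
```

The local ID is the protected network the client asked for, and the remote ID is the address the client received from the IP pool, so these are the values to hold against the policy's Dst-address and Src-address.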
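Because the server reports that no policy matches the proxy ID, check the policies configured on the VPN server:
VPN-SERVER-> get policy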
Total regular policies 1, Default deny.
    ID From     To       Src-address   Dst-address  Service      Action State   ASTLCB
     1 dialup   Trust    Dial-Up VPN   appserver    appservice   Tunnel enabled ---X-X
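The output shows that the system has only one policy and that it is enabled. It also shows that this is a dial-up VPN. Next, check whether the policy's Dst-address on the VPN server matches the Remote Party Identity and Addressing setting in the Netscreen-Remote client: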
VPN-SERVER-> get address trust name appserver
Name                 Address/Prefix-length           Flag  Comments
appserver            192.168.1.0/24                   0200
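
The server stores the address in prefix-length form, while the client log reports it as subnet/mask ("IP SUBNET/MASK=192.168.1.0/255.255.255.0"). A small sketch with Python's standard `ipaddress` module shows one way to confirm that the two notations describe the same network; `same_network` is a hypothetical helper name.

```python
import ipaddress


def same_network(server_entry, client_address, client_mask):
    """Compare a prefix-length entry with an address/mask pair."""
    server_net = ipaddress.ip_network(server_entry)
    # ip_network also accepts the dotted netmask notation.
    client_net = ipaddress.ip_network(f"{client_address}/{client_mask}")
    return server_net == client_net


# Values taken from the server output and the client log above.
print(same_network("192.168.1.0/24", "192.168.1.0", "255.255.255.0"))  # True
```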

The address entry shows that the VPN server's Dst-address exactly matches the Remote Party Identity and Addressing setting in Netscreen-Remote. Next, look at the Service used by the VPN server's policy:
VPN-SERVER-> get service appservice
Name:       appservice
Category:   other          ID:  0   Flag:  User-defined
Transport    Src port     Dst port   ICMPtype,code  Timeout(min) Application
tcp           0/65535        80/80                        30               
tcp           0/65535    1222/1223                        30       
tcp           0/65535    2233/2234                        30       
tcp           0/65535    2344/2345                        30       
tcp           0/65535        23/23                        30   
The permitted services do not include ping (ICMP type 8, code 0). Add ping to the service:
VPN-SERVER-> set service "appservice" + icmp type 8 code 0
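The observation that led to this change, namely that the service definition contains only TCP entries, can be checked with a short sketch. It works on the rows of the `get service` output shown earlier. The helper names are hypothetical, and it assumes that an ICMP entry would carry "icmp" in the Transport column, which the original output does not show.

```python
# Rows copied from the "get service appservice" output before the change.
SERVICE_ROWS = """\
tcp           0/65535        80/80                        30
tcp           0/65535    1222/1223                        30
tcp           0/65535    2233/2234                        30
tcp           0/65535    2344/2345                        30
tcp           0/65535        23/23                        30
"""


def transports(rows):
    """Return the set of values found in the Transport column."""
    return {line.split()[0].lower() for line in rows.splitlines() if line.strip()}


def tcp_port_ranges(rows):
    """Return the (low, high) destination port ranges of the TCP entries."""
    ranges = []
    for line in rows.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0].lower() == "tcp":
            low, high = fields[2].split("/")
            ranges.append((int(low), int(high)))
    return ranges


# Assumption: an ICMP entry would appear with "icmp" as its transport.
print("icmp permitted:", "icmp" in transports(SERVICE_ROWS))
print("tcp port ranges:", tcp_port_ranges(SERVICE_ROWS))
```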
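The fault was resolved and business returned to normal. Does an XAuth VPN on a Juniper Netscreen really require the ping service to be permitted before the VPN connection can be established? Advice from anyone who knows would be appreciated.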
Finally, the platform versions:
Client: Netscreen Remote 8.0.0 (Build 14)
VPN server: Netscreen ISG-1000 (Software Version: 5.3.0r10.0, Type: Firewall+VPN)
