|
1 | 1 | # Troubleshooting SDN
|
2 | 2 |
|
3 |
| -Deploying the Microsoft Windows SDN Stack may require some troubleshooting of problems that arise during fabric and tenant deployment. The instructions provided below is for collecting a set of data which will aid in the troubleshooting and triage process. Please look at the [SDN Troubleshooting](https://technet.microsoft.com/en-us/library/mt715794.aspx) TechNet article for more information on individual commands and triage. |
| 3 | +Deploying the Microsoft Windows SDN Stack may require some troubleshooting of problems that arise during fabric and tenant deployment. Please reference the [SDN Troubleshooting Topic](https://technet.microsoft.com/en-us/library/mt715794.aspx) for more details. |
4 | 4 |
|
5 |
| -Make sure you have the most recent diagnostic KBs (download location forthcoming) installed on all of your NC nodes and Hyper-V Hosts. Also, make sure the tools have been installed on the Hyper-V Hosts: |
6 |
| -```none |
7 |
| -PS> Add-WindowsFeature RSAT-NetworkController –IncludeManagementTools |
8 |
| -PS> Import-Module NetworkControllerDiagnostics |
9 |
| - ``` |
10 |
| -### Triage and Data Collection |
11 |
| -1. Validate that Network Controller is up and running correctly (Executed from one of the NC Nodes): |
12 |
| -```none |
13 |
| -PS> Debug-WinFabNodeStatus |
14 |
| -``` |
15 |
| -Check that ReplicaStatus is Ready and HealthState is Ok (if any nodes are not in Ready/Ok state, note which one is unhealthy in the bug) |
16 |
| - |
17 |
| -```none |
18 |
| -PS> Get-NetworkControllerReplica |
19 |
| -``` |
20 |
| - |
21 |
| -Check that the Replica Status is Ready for each service (if any service is not in Ready state, note which service is unhealthy and on which node it is running in the bug) |
22 |
| - |
23 |
| -2. Validate the NC Host Agents have made connections to the Network Controller (Execute on each Hyper-V host) |
24 |
| -```none |
25 |
| -C:\> netstat -anp tcp |findstr 6640 |
26 |
| -``` |
27 |
| - |
28 |
| -There should be three ESTABLISHED connections and one LISTENING socket |
29 |
| -- Listening on Hyper-V hosts IP on port 6640 |
30 |
| -- Two established connections to Hyper-V host IP on port 6640 from NC node(s) on ephemeral ports (> 32000) Connection established bet |
31 |
| -- One established connection from Hyper-V host IP to REST IP on port 6640 |
32 |
| - |
33 |
| -3. Check the Network Controller’s configuration state (Executed from any Hyper-V host) |
34 |
| -```none |
35 |
| -PS> Debug-NetworkControllerConfigurationState -NcIpAddress <Enter FQDN or IP – based on cert subject name configured> |
36 |
| -``` |
37 |
| - |
38 |
| -Look for any resources which have status Warning or Failure |
39 |
| -_Caveat: If you deployed using VMM, please use the VMM variant of the script available on GitHub [Debug-NetworkControllerConfigurationStateVmm](https://github.com/Microsoft/SDN/blob/master/Diagnostics/Debug-NetworkControllerConfigurationVMM.ps1)_ |
40 |
| - |
41 |
| -4. Check the SLB Configuration State (Executed from an NC node) |
42 |
| -```none |
43 |
| -PS > Debug-SlbConfigState |
44 |
| -``` |
45 |
| -Output location should be indicated – default is C:\SDNDiagnostics\NetworkControllerState\SlbConfigState.txt |
46 |
| -_Caveat: This script does not work for VMM-based deployments_ |
47 |
| - |
48 |
| -5. Check policies in Host Agent |
49 |
| -```none |
50 |
| -C:\> ovsdb-client.exe dump tcp:127.0.0.1:6641 ms_vtep |
51 |
| -``` |
52 |
| -The key table in this output is the ucast_macs_remote table which lists the tenant VM NIC IP and MAC address. Check to see if policy is missing for any given tenant VM IP address. |
53 |
| - |
54 |
| -6. Look for HNV Provider Addresses (PA IPs) on the host |
55 |
| -```none |
56 |
| -PS > Get-ProviderAddress |
57 |
| - ``` |
58 |
| - |
59 |
| -Attach the full output of all of these commands to the bug. |
60 |
| - |
61 | 5 | ### Collecting Logs and Traces
|
62 |
| -Next step will probably be log collection. In order to proceed in an investigation, we need both the Host ID and the Port Profile IDs of any VM NICs for which there is no policy available in the Host Agent’s OVSDB ms_vtep database. |
| 6 | +If you aren't able to troubleshoot the issue on your own, the next step will be to collect logs. In order to proceed in an investigation, we need both the Host ID and the Port Profile IDs of any VM NICs for which there is no policy available in the Host Agent’s OVSDB ms_vtep database. |
| 7 | + |
63 | 8 |
|
64 |
| -1. Collect most recent ETL log files under C:\SDNDiagnostics\Logs directory on all NC nodes and Hyper-V host in question (Zip) |
65 |
| -2. Execute this script to get the Host ID |
| 9 | +1. Execute this script to get the Host ID |
66 | 10 | ```none
|
67 | 11 | PS > Get-ItemProperty "hklm:\system\currentcontrolset\services\nchostagent\parameters" -Name HostId |fl HostId
|
68 | 12 | ```
|
69 |
| -3. Execute this script (download from GitHub – [Get-AllPortProfiles](https://github.com/Microsoft/SDN/blob/master/Diagnostics/Get-AllPortProfiles.ps1) ) to get the Port Profile IDs for each VM (indicate which VM NIC does not have policies) |
70 |
| - |
71 |
| -Attach this information to the bug as well. |
| 13 | +2. Execute this script (download from GitHub – [Get-AllPortProfiles](https://github.com/Microsoft/SDN/blob/master/Diagnostics/Get-AllPortProfiles.ps1)) to get the Port Profile IDs for each VM (indicate which VM NIC does not have policies) |
72 | 14 |
|
73 |
| -Lastly, make a note of what was happening before the degradation of service or error occurred. |
74 | 15 |
|
75 | 16 | ### Gateways Troubleshooting
|
76 | 17 |
|
|