Error 432 4.3.2 STOREDRV.Deliver; recipient thread limit exceeded in Exchange 2016

One of our customers complained that internal emails were not being delivered. When we checked the Queue Viewer, we saw that the emails were stuck in the queue with the error “432 4.3.2 STOREDRV.Deliver; recipient thread limit exceeded”.

This can be caused by a large number of emails being sent internally (to one mailbox or to several mailboxes). In our case, the customer’s processing server had been down for some days, and when it was restored it started sending all the backlogged emails.

To overcome this issue, we had to temporarily disable throttling by adding the line below to the appSettings section of the EdgeTransport.exe.config file (located in the Exchange Bin folder).

<add key="MailboxDeliveryThrottlingEnabled" value="False" />

(Some articles recommend adding the value to the MSExchangeDelivery.exe.config file instead.)

After that, restart the Microsoft Exchange Transport and Microsoft Exchange Mailbox Transport Delivery services.

If you don’t want to disable throttling completely, you can add the keys below instead.

<add key="RecipientThreadLimit" value="2" />
<add key="MaxMailboxDeliveryPerMdbConnections" value="3" />
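As a rough illustration of where such keys live: .NET configuration files like EdgeTransport.exe.config keep their add key/value entries inside an appSettings element. The Python sketch below is not part of the original fix. The helper name set_app_setting and the stripped-down sample document are made up for illustration, and the real file contains far more than this. It adds a key, or updates it if it already exists, so that the same key is never written twice.

```python
import xml.etree.ElementTree as ET


def set_app_setting(config_xml: str, key: str, value: str) -> str:
    """Return config_xml with <add key=... value=.../> set under <appSettings>."""
    root = ET.fromstring(config_xml)
    app_settings = root.find("appSettings")
    if app_settings is None:
        # Assumption: create the section if the sample document lacks one.
        app_settings = ET.SubElement(root, "appSettings")
    for entry in app_settings.findall("add"):
        if entry.get("key") == key:
            entry.set("value", value)  # key already present: update in place
            break
    else:
        ET.SubElement(app_settings, "add", {"key": key, "value": value})
    return ET.tostring(root, encoding="unicode")


# Hypothetical, minimal stand-in for the real config file.
sample = "<configuration><appSettings /></configuration>"
updated = set_app_setting(sample, "RecipientThreadLimit", "2")
updated = set_app_setting(updated, "MaxMailboxDeliveryPerMdbConnections", "3")
print(updated)
```

ElementTree discards XML comments when it parses a document, so this is better read as a picture of the file layout than as a tool for rewriting the production file. Back the file up and edit it by hand.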

Good Luck.

September 17, 2020 at 3:33 pm

How to modify the iSCSI initiator ID in Linux

When you deploy Linux VMs from a template (in ESXi), you may find that the iSCSI initiator IDs on these VMs are identical. To resolve this, modify the iSCSI initiator ID on each VM.

If you have already logged in to an iSCSI session, log out first.

#iscsiadm -m node -T iqn.xxxxxxxxxxxxxx -p iscsiserver-ip -u

Thereafter:

Back up the file initiatorname.iscsi:
#cp /etc/iscsi/initiatorname.iscsi /etc/iscsi/initiatorname.iscsi.bak

Then generate a new initiator name and write it to the file:
#echo "InitiatorName=`/sbin/iscsi-iname`" > /etc/iscsi/initiatorname.iscsi

You can now log in to the iSCSI session again:
#iscsiadm -m node -T iqn.xxxxxxxxxxxxxx -p iscsiserver-ip -l


Source: https://www.thegeekdiary.com/



September 15, 2020 at 8:47 am

How to view the Network Configuration in AHV

Use the following commands to view the configuration of the network elements.

Before you begin

Log on to the Acropolis host with SSH.

Procedure
  • To show interface properties such as link speed and status, log on to the Controller VM, and then list the physical interfaces.

    nutanix@cvm$ manage_ovs show_interfaces

    Output similar to the following is displayed:
name  mode   link  speed
eth0  1000   True  1000
eth1  1000   True  1000
eth2  10000  True  10000
eth3  10000  True  10000
  • To show the ports and interfaces that are configured as uplinks, log on to the Controller VM, and then list the uplink configuration.

    nutanix@cvm$ manage_ovs --bridge_name bridge show_uplinks

    Replace bridge with the name of the bridge for which you want to view uplink information. Omit the --bridge_name parameter if you want to view uplink information for the default OVS bridge br0. Output similar to the following is displayed:
Bridge: br0
  Bond: br0-up
    bond_mode: active-backup
    interfaces: eth3 eth2 eth1 eth0
    lacp: off
    lacp-fallback: false
    lacp_speed: slow
  • To show the bridges on the host, log on to any Controller VM with SSH and list the bridges:

    nutanix@cvm$ manage_ovs show_bridges

    Output similar to the following is displayed:
Bridges:
br0
  • To show the configuration of an OVS bond, log on to the Acropolis host with SSH, and then list the configuration of the bond.

    root@ahv# ovs-appctl bond/show bond_name

    For example, show the configuration of bond0.

    root@ahv# ovs-appctl bond/show bond0

    Output similar to the following is displayed:
---- bond0 ----
bond_mode: active-backup
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
updelay: 0 ms
downdelay: 0 ms
lacp_status: off
active slave mac: 0c:c4:7a:48:b2:68(eth0)

slave eth0: enabled
        active slave
        may_enable: true

slave eth1: disabled
        may_enable: false
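When checking many hosts, it can help to reduce the bond/show text to a simple per-interface state. The Python sketch below is an illustration only: the function name parse_bond_slaves is made up, and it assumes the output format shown above (lines of the form “slave ethN: enabled” or “slave ethN: disabled”). Other OVS versions may word the output differently.

```python
import re


def parse_bond_slaves(output: str) -> dict:
    """Map each slave interface to 'enabled' or 'disabled' from bond/show text."""
    slaves = {}
    for line in output.splitlines():
        match = re.match(r"^slave (\S+): (enabled|disabled)$", line.strip())
        if match:
            slaves[match.group(1)] = match.group(2)
    return slaves


# Sample text shortened from the output shown above.
sample = """\
---- bond0 ----
bond_mode: active-backup
lacp_status: off
active slave mac: 0c:c4:7a:48:b2:68(eth0)

slave eth0: enabled
        active slave
        may_enable: true

slave eth1: disabled
        may_enable: false
"""
states = parse_bond_slaves(sample)
print(states)
```

A disabled interface in an active-backup bond, such as eth1 here, is worth checking for link or cabling problems before it is needed for failover.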

September 7, 2020 at 12:17 pm

How to fix the disk usage warning when /home partition or /home/nutanix directory is full

Source: https://portal.nutanix.com/page/documents/kbs/details?targetId=kA0600000008dpDCAQ

Summary:

This article describes ways to safely free up space if /home or /home/nutanix becomes full or does not contain enough space to facilitate an AOS upgrade or PCVM upgrade.

Versions affected:

ALL Prism Central Versions, ALL AOS Versions (categories: Troubleshooting, Upgrade)

Description:

WARNING: DO NOT treat the Nutanix CVM (Controller VM) or PCVM as a normal Linux machine. DO NOT perform “rm -rf /home” on any of the CVMs or PCVM. It could lead to data loss scenarios. Contact Nutanix Support in case you have any doubts.

This condition can be reported in two scenarios:

  • The NCC health check disk_usage_check reports that the /home partition usage is above a certain threshold (by default 75%)
  • The pre-upgrade check test_nutanix_partition_space checks if all nodes have a minimum of 5.6 GB space on the /home/nutanix directory before performing an upgrade

The following error messages will be generated in Prism by the test_nutanix_partition_space pre-upgrade check:

Not enough space on /home/nutanix directory on Controller VM [ip]. Available = x GB : Expected = x GB
Failed to calculate minimum space required
Failed to get disk usage for cvm [ip], most likely because of failure to ssh into cvm
Unexpected output from df on Controller VM [ip]. Please refer to preupgrade.out for further information

Nutanix reserves space on the SSD-tier of each CVM for its infrastructure. These files and directories are located in the /home folder that you see when you log in to a CVM. The size of the /home folder is capped at 40 GB so that the majority of the space on SSD is available for user data.

Due to the limited size of the /home partition, it is possible for it to run low on free space and trigger Prism Alerts, NCC Health Check failures or warnings, or Pre-Upgrade Check failures. These guardrails exist to prevent /home from becoming completely full, as this causes data processing services like Stargate to become unresponsive. Clusters with multiple CVMs having 100% full /home partition will often result in downtime for user VMs.

The Scavenger service running on each CVM is responsible for the automated clean-up of old logs in /home and improvements to its scope were made in AOS 5.5.9, 5.10.1, and later releases. For customers running earlier AOS releases, or in special circumstances, it may be necessary to manually clean up files out of certain directories in order to bring space usage in /home down to a level that will allow future AOS upgrades.

When cleaning up unused binaries and old logs on a CVM, it is important to note that all the user data partitions on each drive associated with a given node are also mounted within /home. This is why we strongly advise against using undocumented commands like “rm -rf /home”, since this will also wipe the user data directories mounted within this path. The purpose of this article is to guide you through identifying the files that are causing the CVM to run low on free space and removing only those which can be safely deleted.

Solution:

WARNING: DO NOT treat the Nutanix CVM (Controller VM) as a normal Linux machine. DO NOT perform “rm -rf /home” on any of the CVMs. It could lead to data loss scenarios. Contact Nutanix Support in case you have any doubts.

Step 1: Parsing the space usage for “/home”.

Log in to the CVM, download KB-1540_clean_v7.sh to the /home/nutanix/tmp directory, make it executable and run it.

KB-1540_clean_v7.sh performs some checks (MD5, compatibility, etc.) and deploys the nutanix_home_clean.sh script accordingly.

nutanix@cvm:~$ cd ~/tmp
nutanix@cvm:~/tmp$ wget http://download.nutanix.com/kbattachments/1540/KB-1540_clean_v7.sh
nutanix@cvm:~/tmp$ mv KB-1540_clean_v7.sh KB-1540_clean.sh
nutanix@cvm:~/tmp$ chmod +x KB-1540_clean.sh
nutanix@cvm:~/tmp$ ./KB-1540_clean.sh

You can select to deploy the script to the local CVM or all CVMs.

========
Select package to deploy
     1 : Deploy the tool only to the local CVM
     2 : Deploy the tool to all of the CVMs in the cluster
    Selection (Cancel="c"):

Run the script to get a clear distribution of partition space usage in /home.

nutanix@cvm:~/tmp$ ./nutanix_home_clean.sh

Step 2: Check for files that can be deleted from within the list of approved directories.

PLEASE READ: The following are the ONLY directories within which it is safe to remove files. Take note of the specific guidance for removing files from each directory. Do not use any other commands or scripts to remove files. Do not use “rm -rf” under any circumstances.

  1. Removing Old Logs and Core Files
     Before removing old logs, check to see if you have any open cases with pending RCAs (Root Cause Analysis). The existing logs might be necessary for resolving those cases, and you should check with the owner from Nutanix Support before cleaning up /home.
     Only delete the files inside these directories. Do not delete the directories themselves.
    • /home/nutanix/data/cores/
    • /home/nutanix/data/binary_logs/
    • /home/nutanix/data/ncc/installer/
    • /home/nutanix/data/log_collector/
    Use this syntax for deleting files within each of these directories:
    nutanix@cvm:~$ rm /home/nutanix/data/cores/*
  2. Removing Old ISOs and Software Binaries
     Begin by confirming the version of AOS that is currently installed on your cluster by running the command below. You will find it under the “Cluster Version” field in the output. Make sure never to remove any files that are associated with your current AOS version.
     nutanix@cvm:~$ ncli cluster info
     Example output:
     Cluster Name : Axxxxa
     Cluster Version : 5.10.2
     Only delete the files inside these directories. Do not delete the directories themselves.
    • /home/nutanix/software_uncompressed/ – Delete any old versions other than the version you are currently upgrading to. The software_uncompressed folder is only in use while the pre-upgrade is running and should be removed after a successful upgrade. If the cluster is running and is not currently upgrading, it is safe to remove everything underneath software_uncompressed.
    • /home/nutanix/foundation/isos/ – Old ISOs of hypervisors or Phoenix.
    • /home/nutanix/foundation/tmp/ – Temporary files that can be deleted.
    Use this syntax for deleting files within each of these directories:
    nutanix@cvm:~$ rm /home/nutanix/foundation/isos/*
    If you see large files in the software_downloads directory that are not needed for any planned upgrades, do not remove them from the command line. Instead, use the Prism Upgrade Software UI. Where it lists multiple versions of AOS, which consume around 5 GB each, simply click the ‘X’ next to a version to delete its files. Then check each of the other tabs, including File Server, Hypervisor, NCC, and Foundation, to locate further downloads you may not require.
    It is possible that Enable Automatic Download is checked (on the AOS tab). Left unmonitored, the cluster will download multiple versions, consuming more space in the home directory.

Step 3: Check space usage in /home to see that it is now below 70%.

You can use the “df -h” command to check on the amount of free space in /home. To accommodate a potential AOS upgrade, usage should ideally be below 70%.

nutanix@cvm:~$ allssh "df -h /home"

Example output:

================== x.x.x.x =================
/dev/md2         40G  8.4G   31G  22% /home
================== x.x.x.x =================
/dev/md2         40G  8.5G   31G  22% /home
================== x.x.x.x =================
/dev/md2         40G   19G   21G  49% /home
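With many CVMs, scanning this output by eye gets tedious. The Python sketch below shows one way to pick out the CVMs whose /home usage is above the 70% target. It is an illustration, not part of the Nutanix KB: the function name cvms_over_threshold and the sample addresses and percentages are made up, and it assumes the allssh output looks like the example above (a banner line of equals signs around the CVM address, followed by a df line ending in /home).

```python
def cvms_over_threshold(allssh_output: str, threshold: int = 70) -> dict:
    """Return {cvm_address: used_percent} for CVMs whose /home usage exceeds threshold."""
    over = {}
    current_host = None
    for raw_line in allssh_output.splitlines():
        line = raw_line.strip()
        if line.startswith("=") and line.endswith("="):
            # Banner line such as "===== 10.0.0.1 =====": remember the address.
            current_host = line.strip("= ")
        elif line.endswith("/home") and current_host is not None:
            # df columns: filesystem, size, used, avail, use%, mount point.
            percent = int(line.split()[-2].rstrip("%"))
            if percent > threshold:
                over[current_host] = percent
    return over


# Hypothetical sample; addresses and numbers are invented for the example.
sample = """\
================== 10.0.0.1 =================
/dev/md2         40G  8.4G   31G  22% /home
================== 10.0.0.2 =================
/dev/md2         40G   19G   21G  49% /home
================== 10.0.0.3 =================
/dev/md2         40G   30G   10G  75% /home
"""
over = cvms_over_threshold(sample)
print(over)
```

Only the CVMs returned by the function need the cleanup steps described above.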

Cleaned up files from the approved directories but still see high usage in /home?

Contact Nutanix Support and submit the script log bundle (/tmp/home_kb1540_<cvm_name>_<timestamp>.tar.gz). One of our Systems Reliability Engineers (SREs) will promptly assist you with identifying the source of and solution to the problem at hand. Under no circumstances should you remove files from any other directories aside from those found here as these may be critical to the CVM infrastructure or may contain user data.

If the home partition exceeds its limit on the PCVM, refer to KB-8950 to troubleshoot.

September 7, 2020 at 12:11 pm

How do I flush or delete incorrect records from my recursive server cache?

Sometimes a recursive server may have incorrect records in its cache.  These may be as a result of an error made by a zone administrator, or as a result of a deliberately engineered cache poisoning attack.

To identify the faulty records, dump the cache and inspect it:

rndc dumpdb -all
grep problem.domain /var/named/data/cache_dump.db

(The location of cache_dump.db may vary depending on the BIND configuration.)

Or you may be able to identify which records are incorrect by querying your server directly.

dig +norec @<ip address of nameserver> <name> <type>

How to solve the problem?

Flush a specific name from the cache:

rndc flushname name
  • Use the name of a domain if there are problems with the NS or MX records associated with it.
  • Use the server name, if there are problems with the addresses associated with that server name (for example a nameserver, a webserver or a mailserver).

Flush the cache for a specific name as well as all records below that name:

rndc flushtree name
  • This will clear the cache, but it will not clear any names out of ADB, so may not be sufficient for some needs.

If you are not sure where the problem lies, or there are too many records to delete them individually, then you might prefer to flush the entire named cache:

rndc flush && rndc reload


August 18, 2020 at 11:48 am

How I passed the CASP+

This year I decided to complete a few certifications specializing in the field of security. With this goal in mind, I started my certification sprint with the CompTIA Advanced Security Practitioner (CASP+) exam. I chose this exam because it is a performance-based certification for practitioners, not only for managers.

CASP+ is compliant with ISO 17024 standards and approved by the US DoD to meet directive 8140/8570.01-M requirements.
Regulators and government rely on ANSI accreditation, because it provides confidence and trust in the outputs of an accredited program.

I bought the book titled “CompTIA® Advanced Security Practitioner (CASP) CAS-003 Cert Guide” from the Pearson Store. The book is authored by Robin Abernathy and Troy McMillan. I spent around 4-5 months reading it, making sure that I understood the technologies and the terms it covers. My goal was twofold: to pass the exam, and to ensure that the knowledge I gained would help me handle real-world situations in a professional manner.

The exam contains approximately 90 questions (multiple-choice and performance-based). The duration is 165 minutes. Note that this exam does not give you a scaled score; it is pass/fail only.

I am providing the link below to the book I used. Also I am willing to share the PDF version of the book with anyone who wants to attempt this exam.

August 8, 2020 at 11:12 am

DC & Exchange loses connection during VEEAM Backup

Problem:

Outlook users get disconnected periodically (at the same time every day).

When we analyzed the situation, we found that the issue coincided with the backup window. Further investigation revealed that it happened exactly at the VMware snapshot removal stage, which is expected, since the VM experiences a longer stun at that point (this can be confirmed by looking in vmware.log). The stun caused the VM (the domain controller) to freeze, and Exchange then logged a Netlogon error with event ID 5719 because it lost its connection to the domain controller. Outlook users (desktop and smartphone) were forced to reopen the email client or re-enter their credentials.

Solution:

To avoid this, we converted the backup job from VM-based to agent-based. The agent-based backup uses VSS instead of VM snapshots triggered through the VMware API.

After this change, Netlogon event ID 5719 no longer appeared and the users had no further complaints.

Good Luck

June 25, 2020 at 1:01 pm

How to troubleshoot DNS Issues with Wireshark

Hi Folks

Until recently I was a big fan of Microsoft Message Analyzer. Unfortunately, Microsoft deprecated the product, so I decided to switch to Wireshark. I will not go through the basic operations of Wireshark, as there are plenty of good video tutorials on the Internet.

In this article , I will focus on how to capture DNS packets on a BIND server and filter the packets for known queries and the response codes.

Step1: Start the capture on the BIND server

Step2: After running sample queries, press Ctrl+C to end the capture and transfer the .pcap file to the machine running Wireshark.

Once you open the .pcap file in Wireshark, you can use the filters below to display the required data.

** To filter based on the queried domain name **
dns.qry.name == "hotmail.com"

** To filter MX queries **
dns.qry.type == 15

** To filter SERVFAIL response **
dns.flags.rcode == 2

You can use ! to negate a filter. For example, to exclude MX queries (dns.qry.type == 15):
!(dns.qry.type == 15)
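To make the dns.flags.rcode filter less of a magic number: the response code is the low four bits of the 16-bit flags field that follows the message ID in the DNS header. The Python sketch below decodes it from raw bytes. It is an illustration only. The function name dns_rcode and the hand-built sample header are made up, it covers only the basic response codes, and it ignores the extended codes carried by EDNS.

```python
import struct

# Basic DNS response codes from the IANA DNS parameters registry.
RCODE_NAMES = {
    0: "NOERROR",
    1: "FORMERR",
    2: "SERVFAIL",
    3: "NXDOMAIN",
    4: "NOTIMP",
    5: "REFUSED",
}


def dns_rcode(message: bytes) -> str:
    """Return the response code name from the header of a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    _message_id, flags = struct.unpack("!HH", message[:4])
    rcode = flags & 0x000F  # low four bits of the flags field
    return RCODE_NAMES.get(rcode, f"RCODE{rcode}")


# Hand-built header: ID 0x1234, flags 0x8182 (QR=1, RD=1, RA=1, RCODE=2), one question.
header = struct.pack("!HHHHHH", 0x1234, 0x8182, 1, 0, 0, 0)
result = dns_rcode(header)
print(result)
```

A packet that decodes to SERVFAIL here is one that the dns.flags.rcode == 2 filter above would display.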

For a detailed list of DNS response codes and other DNS parameters, refer to the URLs below.

https://support.umbrella.com/hc/en-us/articles/232254248-Common-DNS-return-codes-for-any-DNS-service-and-Umbrella-

https://www.iana.org/assignments/dns-parameters/dns-parameters.xhtml

Good Luck.

June 17, 2020 at 2:23 pm

Advanced Troubleshooting of ESXi Server 6.x for vSphere Gurus

Hi Folks

You can refer to the attached document for hints that will help you troubleshoot ESXi environments. The document mainly covers three areas.

  • Which log files to review and when.
  • ESXi commands to isolate and troubleshoot issues.
  • Configuration Files.

Thanks.

Source: VMworld.

June 8, 2020 at 9:51 am

sudo: effective uid is not 0, is sudo installed setuid root

When messing with ACLs, you may come across a situation where sudo stops functioning. In particular, when you run sudo you may see the error “sudo: effective uid is not 0, is sudo installed setuid root”.

To diagnose the issue

Step1:
Check the /etc/sudoers file to confirm that you have added the group or the user name, e.g. for user abc:

abc        ALL=(ALL)       NOPASSWD: ALL

Step2: If the entry from Step 1 is correct, check the permissions on sudo as below (output from a working sudo):

# ls -l /usr/bin/sudo
---s--x--x 2 root root 190904 Mar 4 18:21 /usr/bin/sudo

# stat /usr/bin/sudo

Access: (4111/---s--x--x) Uid: ( 0/ root) Gid: ( 0/ root)

If your output does not match the output in Step 2, you can reset the permissions to their defaults:

# rpm --setperms sudo
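The check in Step 2 comes down to two facts about /usr/bin/sudo: it must be owned by root (uid 0) and it must carry the setuid bit (the leading 4 in mode 4111). The Python sketch below tests those two facts from a mode and an owner uid. It is an illustration only, and the function name sudo_mode_problems is made up for this example.

```python
import stat


def sudo_mode_problems(mode: int, uid: int) -> list:
    """Return a list of reasons why a sudo binary with this mode and owner would fail."""
    problems = []
    if uid != 0:
        problems.append("not owned by root")
    if not mode & stat.S_ISUID:
        problems.append("setuid bit missing")
    return problems


# Regular file with mode 4111 owned by root, as in the working output above.
working = sudo_mode_problems(0o104111, 0)
# Regular file with mode 0755 owned by root: the setuid bit has been lost.
broken = sudo_mode_problems(0o100755, 0)
print(working, broken)
```

On a live system, the two arguments can be taken from os.stat("/usr/bin/sudo"), using its st_mode and st_uid fields.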

 

 

May 11, 2020 at 12:35 pm
