Hello
I simulated a disk failure by right-clicking disk number 6 in module 3 of my Spectrum Accelerate system and selecting "Phase out -> Ready". I did this to test that the Full Redundancy state is maintained. I got an alert message saying "DATA_REDIST_COMPLETED". I was hoping that I could right-click the disk again, click Test, and then phase it in to bring it back into the Spectrum Accelerate system. However, the state changes to "Initializing" and then goes back to "Failed". There is a red exclamation alert next to the Full Redundancy status at the bottom right of the XIV GUI. Why is this happening? I have a Spectrum Accelerate setup with 3 modules and six 1TB disks in each.
Could it be because a minimum of six disks is needed, and phasing out one disk left only five, which is below the minimum requirement?
Answer by T_greg (26) | Jul 30, 2015 at 12:43 PM
No, this has nothing to do with the disk count. Spectrum Accelerate, in all supported configurations, allows for one module's worth of disks plus three additional disks of hardware spare capacity. In your case the system can tolerate up to nine disk failures when fully utilized, and possibly more if you haven't filled the system with data.
Have you tried logging in with a user assigned to the opsadmin role and running component_service_force_ok component=1:disk:3:6, then the component test? If this doesn't work, you will probably have to delete the RDM and recreate it. See this section of the Spectrum Accelerate Knowledge Center.
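The arithmetic behind the "nine disks" figure can be sketched as follows. This is only an illustration of the rule stated above (one module's worth of disks plus three); the function name and parameters are made up for this example and are not part of any Spectrum Accelerate tool.

```python
def tolerable_disk_failures(disks_per_module: int, extra_spare_disks: int = 3) -> int:
    """Spare capacity expressed in disks: one module's worth plus three more.

    Hypothetical helper; it only restates the rule from the answer above.
    """
    return disks_per_module + extra_spare_disks


# The setup in the question: 3 modules with six disks each.
modules = 3
disks_per_module = 6

total_disks = modules * disks_per_module
spare = tolerable_disk_failures(disks_per_module)

print(f"total disks: {total_disks}")
print(f"disk failures tolerated when fully utilized: {spare}")
```

With six disks per module this gives 6 + 3 = 9, so phasing out a single disk is well within the spare capacity.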
Thanks for confirming that the minimum disk count is not an issue.
Yes, I have already run component_service_force_ok component=1:disk:3:6 and then ran the Test again. The same thing happens. I also ran vmkfstools -z /vmfs/devices/disks/naa.Identifier /vmfs/volumes/datastoreXXX/data_rdm_disk_paths/DISK_6_RDM.vmdk and added the disk again, as described in one of the Redbooks and in your link. Even after this, the test goes to "Initializing" and then "Failed". That's why I am stuck.
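Because a stray space or typo in the RDM path is easy to introduce, the vmkfstools command above can be assembled programmatically. This is a minimal sketch: the helper name is hypothetical, and naa.Identifier and datastoreXXX are the same placeholders used in the post, not real values. It only builds and prints the command; it does not run it.

```python
import shlex
from typing import List


def build_rdm_command(device_id: str, datastore: str, disk_number: int) -> List[str]:
    """Build the 'vmkfstools -z' (physical-compatibility RDM) command line.

    Hypothetical helper; the directory layout mirrors the paths in the post.
    """
    device = f"/vmfs/devices/disks/{device_id}"
    vmdk = (
        f"/vmfs/volumes/{datastore}/data_rdm_disk_paths/"
        f"DISK_{disk_number}_RDM.vmdk"
    )
    return ["vmkfstools", "-z", device, vmdk]


# Placeholders copied from the post; substitute the real identifiers.
cmd = build_rdm_command("naa.Identifier", "datastoreXXX", 6)
print(shlex.join(cmd))
```

Keeping the command as a list of arguments makes it obvious that the device path and the .vmdk path are exactly two separate arguments.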
I did the whole thing again, this time with "DISK_7_RDM.vmdk": "vmkfstools -z /vmfs/devices/disks/naa.600605b00a0be6601d1c4f695f716a6a /vmfs/volumes/datastore1c/data_rdm_disk_paths/DISK_7_RDM.vmdk". Please see the error, which happens again.
What is this error?
Answer by AmitMargalit (1) | Aug 02, 2015 at 12:11 AM
The "INVALID COMMAND OPERATION CODE" itself can be safely ignored.
It is basically our code asking the controller to perform a SCSI pass-through command to read a specific log page or other status of the disk, and the request is refused. This can happen for several reasons:
The controller is a RAID controller and the disk is not in JBOD mode, so the controller refuses to let us do a real pass-through, since the device it presents to the VM is not a real disk drive.
The disk itself doesn't support the specific log page or operation.
Other reasons.
In any case, the code path that emits this event does not affect the operation of the system. In this case it seems that the problem is that the disk is being requested to run its own self-test, and fails.
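For readers who want to recognize this condition in raw SCSI sense data, here is a minimal sketch. In the SCSI standard, "INVALID COMMAND OPERATION CODE" is the additional sense code 0x20 with qualifier 0x00, normally reported with sense key ILLEGAL REQUEST (0x05). The function names and the sample bytes below are invented for illustration; this is not the Spectrum Accelerate code path and the bytes are not taken from the poster's system.

```python
from typing import Tuple

ILLEGAL_REQUEST = 0x05          # SCSI sense key
ASC_INVALID_OPCODE = 0x20       # additional sense code
ASCQ_INVALID_OPCODE = 0x00      # additional sense code qualifier


def parse_fixed_sense(sense: bytes) -> Tuple[int, int, int]:
    """Return (sense_key, asc, ascq) from fixed-format SCSI sense data."""
    if len(sense) < 14:
        raise ValueError("fixed-format sense data needs at least 14 bytes")
    response_code = sense[0] & 0x7F
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    sense_key = sense[2] & 0x0F   # low nibble of byte 2
    return sense_key, sense[12], sense[13]


def is_invalid_opcode(sense: bytes) -> bool:
    """True if the device or controller rejected the command opcode itself."""
    return parse_fixed_sense(sense) == (
        ILLEGAL_REQUEST, ASC_INVALID_OPCODE, ASCQ_INVALID_OPCODE
    )


# Made-up 18-byte sample: response code 0x70, sense key 5, ASC/ASCQ 20h/00h.
sample = bytes([0x70, 0, 0x05, 0, 0, 0, 0, 0x0A, 0, 0, 0, 0, 0x20, 0x00, 0, 0, 0, 0])
print(is_invalid_opcode(sample))
```

A result like this only says that the pass-through command was refused; as noted above, it does not by itself explain why the disk's self-test fails.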
I'm looking into this and will update here soon.
Amit