Thursday, January 17, 2008

Where all standard deletion fails...

It has been a long time since I updated the blog, eh? Here is a good, slightly challenging one that had me stuck until I remembered the debugfs command.

Issue

Location: One of my clients' VPS.
Concern: One of his VPS clients is not restarting properly. It gets stuck at initializing the vzquota. The error message when running 'vzctl start 738' is below.


Starting VE ...
Initializing quota ...
vzquota : (error) quota check : lstat `photos.friendster.comphotos6968532886961_494183449s.jpg': No such file or directory
vzquota init failed [1]


Diagnosis

I couldn't find any file named photos.friendster.comphotos6968532886961_494183449s.jpg inside the VPS, or when looking directly under /vz/private/738. Then I ran

find . -name "*494183449s.jpg"

inside /vz/private/738. It returned the location as ./.trash/lsm/photos.friendster.comphotos6968532886961_494183449s.jpg

/me so happy. Ran
rm -f ./.trash/lsm/photos.friendster.comphotos6968532886961_494183449s.jpg
It ran without complaint, but the problem persisted. The reason: although rm gave no errors (the -f flag makes it ignore nonexistent files), it didn't actually delete the file. Running ls -l inside the lsm directory showed question marks in every column except the name. AFAIK, the name of a file is stored in the directory entry in the FS, while the other columns come from the inode, which is why only the name showed up.

[root@vpsit lsm]# ls -l
total 0
?--------- ? ? ? ? ? photos.friendster.comphotos6968532886961_494183449s.jpg
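
A quick way to spot such entries without reading ls -l output by eye is sketched below. It is only a rough helper, not something I ran at the time: the function name list_broken_entries is made up for illustration, and it assumes file names contain no newlines. It prints every name the directory lists that the shell can neither stat nor see as a symlink.

```shell
# list_broken_entries DIR  (hypothetical helper name)
# Print entries that DIR lists but that cannot be stat'ed,
# i.e. the ones ls -l shows with question marks.
list_broken_entries() {
    dir=${1:-.}
    # ls -A takes the names from the directory itself.
    # Assumption: no newlines inside file names.
    ls -A "$dir" 2>/dev/null | while IFS= read -r name; do
        # -e fails for missing targets, -L still succeeds for dangling
        # symlinks; when both fail the entry could not be stat'ed at all.
        if [ ! -e "$dir/$name" ] && [ ! -L "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
}
```

On a healthy directory it prints nothing. Something like list_broken_entries /vz/private/738/.trash/lsm would be expected to print the stuck jpg in a case like this one.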


So how could I fix this and get the VPS to start? "rm -f filename" was not working, and neither was a bigger command, "find . -exec rm -rf {} \;", nor even rm -rf /.trash or rm -rf /.trash/lsm. Deleting the directory failed because it was not empty. I tried unlink as well.

[root@vpsit lsm]# find . -exec rm -f {} \;
rm: cannot remove `.' or `..'
find: ./photos.friendster.comphotos6968532886961_494183449s.jpg: No such file or directory


Then I tried to turn off quota for the VPS, so I executed the vzquota off command and got the output below.

vzquota off 738
vzquota : (error) Can't open quota file for id 738, maybe you need to reinitialize quota: No such file or directory


So I didn't want to initialize the quota by hand, since it would surely fail, just as it does when the VPS starts.

Temporary fix: turned off the DISK_QUOTA parameter in /etc/sysconfig/vz and started the VPS. Turned the parameter back on later.
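
For anyone who wants to script that toggle, here is a minimal sketch. It assumes the setting appears in the file as a line starting with DISK_QUOTA= (check your own config first). The function name set_disk_quota and the .bak backup file are my own choices for illustration, not part of any OpenVZ tool.

```shell
# set_disk_quota yes|no [CONF]  (hypothetical helper name)
# Rewrites the DISK_QUOTA= line in CONF, keeping a backup in CONF.bak.
set_disk_quota() {
    value=$1
    conf=${2:-/etc/sysconfig/vz}
    case "$value" in
        yes|no) ;;
        *) echo "usage: set_disk_quota yes|no [conf]" >&2; return 2 ;;
    esac
    # Keep the original so the change is easy to undo.
    cp -p "$conf" "$conf.bak" || return 1
    # Assumption: the parameter sits on its own line as DISK_QUOTA=...
    sed "s/^DISK_QUOTA=.*/DISK_QUOTA=$value/" "$conf.bak" > "$conf"
}

# Intended use, left as comments so nothing runs by accident:
#   set_disk_quota no  && vzctl start 738
#   set_disk_quota yes
```

Writing through a backup copy instead of editing in place keeps the sketch portable and leaves the old file around if something goes wrong.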

Permanent fix: you can't use fsck to fix this on a live server, not when the fsck would have to run on / itself. So fsck was not an option here, and the next choice was debugfs.

Here is what I did with debugfs:

debugfs -w /dev/hda1
cd /vz/private/738/.trash/
unlink lsm
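
The same thing can be done without the interactive prompt by passing the request with -R. The sketch below only prints the commands rather than running them, since debugfs -w on a mounted filesystem is not something to fire off blindly. The function name debugfs_unlink_plan is made up, and it assumes the path has no spaces and is given relative to the root of the filesystem on that device.

```shell
# debugfs_unlink_plan DEVICE PATH  (hypothetical helper name)
# Print, but do not run, the debugfs commands to inspect and unlink PATH.
debugfs_unlink_plan() {
    dev=$1     # block device holding the filesystem, e.g. /dev/hda1
    path=$2    # path relative to that filesystem's root, no spaces
    parent=${path%/*}
    [ -n "$parent" ] || parent=/
    # Read-only look at the parent directory first (no -w).
    printf 'debugfs -R "ls -l %s" %s\n' "$parent" "$dev"
    # -w opens the filesystem read-write; unlink removes the directory entry.
    printf 'debugfs -w -R "unlink %s" %s\n' "$path" "$dev"
}

debugfs_unlink_plan /dev/hda1 /vz/private/738/.trash/lsm
```

Note that debugfs unlink only removes the directory entry and does not adjust inode reference counts, so an fsck at the next chance when the filesystem can be taken offline is probably still a good idea.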


Gone, and the issue is fixed. I couldn't find a proper solution anywhere on the web for these "?????" question-mark cases of file deletion, hence this post.

I am tired of freelancing. Need to find a salaried job. Till next time, cya.
