It’s a Panda’s Life

This semester, as part of TAing 167/9, we are using Mercurial for SCM. Itay liked Hg so much that he started using it for nearly everything on his computer, and came up with a good reason for doing so. I am not organized enough to put all of my files into source control (though heaven knows I probably should), but I do like the idea of using Mercurial to keep my two copies of research code synchronized.

Since I can't compile and run said code on my home machine for lack of expensive libraries, and I'd rather not code in the department for a general lack of productivity, using Hg seemed like a better idea than just scping code around. This presents a problem on Mac OS X, though. For reasons I haven't yet discovered (I have not looked at the hg code), Mercurial, when cloning remotely from Linux, seems to ignore the PATH variable and looks for hg explicitly in /usr/bin, which is not where Mercurial resides for a whole lot of OS X users. That is bad design and probably a fairly simple fix, but hardly the end of the world, seeing as a simple ln -s solves the problem for now. Either way, once I was done with this I was ready to commit all my past work to a repository, which is when I hit a fairly large roadblock that has little to do with Mercurial, and it got me thinking about a few other backup strategies I had been playing with in my mind...
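The ln -s trick works, but it can also be sidestepped from the client side: hg clone accepts a --remotecmd option naming the hg executable to run on the remote machine. Here is a minimal sketch in Python that only builds such a command line. The host name, repository path and /usr/local/bin/hg location are made up for illustration, and I have not confirmed that this addresses whatever actually causes the lookup in /usr/bin.

```python
def hg_clone_command(source, dest, remote_hg=None):
    """Build the argument list for `hg clone`, optionally naming the remote hg binary."""
    cmd = ["hg", "clone"]
    if remote_hg is not None:
        # --remotecmd tells Mercurial which hg executable to run on the remote side
        cmd += ["--remotecmd", remote_hg]
    cmd += [source, dest]
    return cmd


# Hypothetical host, repository and install path, for illustration only.
example = hg_clone_command(
    "ssh://user@mac.example.org/research-code",
    "research-code",
    remote_hg="/usr/local/bin/hg",
)
print(" ".join(example))
```

The resulting list can be handed to subprocess.run as-is on a machine that has hg installed.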

At some point I had gone through the effort of creating an enormous number of small test cases for the program I am writing. They are all pretty small (around 1K each), but there are something like 180,000 of them. For reasons that range from a slow hard disk to perhaps the general layout of the HFS+ filesystem, doing much of anything with these files, even lsing them, takes a while. Creating hashes and metadata for them probably did not make Hg very happy, and somewhere in the middle of trying to commit all these changes I realized that just listing every one of these 180,000 files in a log file would take Hg a while.

Of course this realization hit me after the commit had already been running for a while, presumably churning away at creating and writing all those hashes and metadata. Either way, I ended up trying to cancel the commit by hitting ctrl-c. For future reference: don't hit ctrl-c too many times when canceling a commit. Yes, it takes a while to respond to interrupts when processing a commit this big, but it really is trying to roll back the log and leave your system in a known state. Lots of ctrl-cs might save you time canceling, but you will spend as much time, if not more, biting your nails as you run through hg recover. Anyhow, this worked out, but it got me thinking...

Now obviously the sane way to deal with so many files is to tar them, so that you usually deal with no more than one file. In my case I need to use them individually, though even that is a poor excuse. Most people don't tend to keep 180,000 files in a single directory; bad structure and the difficulty of finding anything usually prevent it.
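For what it's worth, tarring the files up does not completely rule out using them one at a time. A rough sketch with Python's standard tarfile module: pack a directory of small test cases into a single archive, then read one case back without unpacking the rest. The directory layout and the case_NNNNNN.txt naming are invented for the demo and are not how my actual test cases are named.

```python
import os
import tarfile
import tempfile


def pack_directory(src_dir, archive_path):
    """Pack every regular file in src_dir into one uncompressed tar archive."""
    count = 0
    with tarfile.open(archive_path, "w") as tar:
        for entry in sorted(os.scandir(src_dir), key=lambda e: e.name):
            if entry.is_file():
                # Store only the file name, not the full path, inside the archive.
                tar.add(entry.path, arcname=entry.name)
                count += 1
    return count


def read_member(archive_path, name):
    """Read a single file back out of the archive without extracting the others."""
    with tarfile.open(archive_path, "r") as tar:
        return tar.extractfile(name).read()


# Small demo with 100 made-up test cases instead of 180,000.
with tempfile.TemporaryDirectory() as tmp:
    cases = os.path.join(tmp, "cases")
    os.mkdir(cases)
    for i in range(100):
        with open(os.path.join(cases, f"case_{i:06d}.txt"), "wb") as f:
            f.write(f"test case {i}\n".encode())
    archive = os.path.join(tmp, "cases.tar")
    packed = pack_directory(cases, archive)
    sample = read_member(archive, "case_000042.txt")

print(packed, sample)
```

The catch is that a tar archive has no index, so finding one member means scanning through the headers before it. Source control sees one file instead of 180,000, but random access gets slower.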

However, and this is the fun part, there is at least one situation where the specs of a specific system make it hard to stay under such a limit. Amazon's S3 is described as a scalable, reliable, low-latency storage solution built on the same storage infrastructure Amazon itself uses. More importantly, S3 is fairly well priced, and I am running out of space (though there are all sorts of debates there: is 15 cents a month per GB really less expensive than spending even 300 dollars on a terabyte of storage?). While working on something for my father I did actually consider using S3 as a place to back stuff up, or even as backing storage for everything I have on my disk. In essence you could get your disk to act as a really big cache for S3. Since networks aren't that fast, I think it is safe to assume that real disks have much lower latency than S3; besides, I don't like diskless nodes.
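The pricing question is easy enough to put numbers on. A back-of-the-envelope sketch in Python, using only the two figures above (15 cents per GB per month, 300 dollars for a terabyte disk). It assumes a terabyte is 1000 GB and ignores transfer fees, request fees, electricity and the fact that a single disk is not a backup, so treat it as a rough comparison only.

```python
S3_DOLLARS_PER_GB_MONTH = 0.15  # storage price quoted above
DISK_DOLLARS = 300.0            # one-time price of the disk quoted above
DISK_GB = 1000                  # assumption: 1 TB counted as 1000 GB


def s3_monthly_cost(stored_gb):
    """Monthly S3 storage bill for stored_gb gigabytes, storage fees only."""
    return stored_gb * S3_DOLLARS_PER_GB_MONTH


def break_even_months(stored_gb):
    """Months of S3 storage fees that add up to the one-time disk price."""
    return DISK_DOLLARS / s3_monthly_cost(stored_gb)


for gb in (50, 200, DISK_GB):
    print(f"{gb:5d} GB: ${s3_monthly_cost(gb):7.2f}/month, "
          f"matches the disk price after {break_even_months(gb):5.1f} months")
```

Under these assumptions, filling a whole terabyte on S3 costs about 150 dollars a month and passes the price of the disk in roughly two months, while 50 GB takes around 40 months to get there. So it depends a lot on how much you actually store.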

Now security is an obvious concern here, but there is already a single company controlling vast amounts of my information, and of the information of a vast majority of the people I know, and swapping one company for another is not the worst thing. More importantly, people are already working on the security problem. However, and this is where the entire Mercurial discussion ties in, S3 is a flat store. You could perhaps try to get more than one bucket, but buckets are fairly hard to come by, and a bucket merely contains keys, each linking to a single object which can hold up to 5 GB of information.

So let's see: one could create a real file system on top of these buckets and use these 0 KB to 5 GB objects as actual file system blocks. However, for any reasonably recent file system which does versioning, you'd rather rapidly reach a point where you're using enough blocks that attempting to list everything in your bucket takes a long time just in terms of transferring data over the network. Fortunately, if you designed this intelligently, listing blocks should be a fairly rare process, perhaps something that comes into play only when you lose your cache (disks are unreliable, things happen).

However, I don't actually know how S3 stores this index or how access to objects maps out, and I can't really find out. I am just not sure how Amazon handles the case of an overly full bucket. It doesn't seem like a hard place to get to (hourly backups of entire 5 GB files might get there rather fast), and I am sort of curious about how this works out. Of course, seeing as MySQL now has an S3 backend, other people have already made various backup tools for S3, and networks are still rare to come across when traveling, I am not really expecting to see much of this entire S3 backing disk idea come about anytime soon. If someone's working on something similar, it'd be nice to know.
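To make the blocks-as-objects idea a bit more concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the 4 MiB block size, the "file id/block index" key scheme, and the assumption that a listing comes back in pages of 1000 keys. It talks to no real service; it only shows how a byte range would map onto keys in a flat namespace and how many listing round trips a full rescan would need.

```python
BLOCK_SIZE = 4 * 1024 * 1024  # hypothetical block size: 4 MiB per object
KEYS_PER_LIST_PAGE = 1000     # assumption: keys returned per listing request


def block_keys(file_id, offset, length, block_size=BLOCK_SIZE):
    """Return the object keys covering bytes [offset, offset + length) of a file."""
    if length <= 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size
    # Flat namespace: the "directory" is just a prefix baked into the key.
    return [f"{file_id}/{index:08d}" for index in range(first, last + 1)]


def list_requests_needed(total_bytes, block_size=BLOCK_SIZE,
                         page=KEYS_PER_LIST_PAGE):
    """Listing round trips needed to enumerate every block after losing the cache."""
    blocks = -(-total_bytes // block_size)  # ceiling division on integers
    return -(-blocks // page)


# A read that straddles the boundary between block 0 and block 1.
print(block_keys("research-code.img", BLOCK_SIZE - 1, 2))
# Blocks and listing requests for a 1 TiB disk image.
print(list_requests_needed(1024 ** 4))
```

With these made-up numbers a 1 TiB image is a couple of hundred thousand blocks, which is the same 180,000-files-in-one-directory problem all over again, just over the network.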

Panda

§290 · September 11, 2007 · CS, systems

135 Comments to “Mercurial, filesystems”

  33. micrtoner says:

    nice work, panda
    I just got started with aws s3, any info helps
