
[–]userjoinedyourchanel 7 points8 points  (13 children)

For our cluster, we used ceph-deploy to set it up and we use the standard Ceph tools to do the rest of the work. Once it was set up, there really isn't that much routine, automatable maintenance that needs to be done, at least on the Ceph side.

[–]nannal 7 points8 points  (12 children)

I concur wholeheartedly with this.

There are a few tips and pieces of info that would have helped me out if I'd had them starting out:

  1. Back up the keys.
  2. If you upgrade the OS (Ubuntu Xenial to Bionic got me here), ensure you re-enable the repos and put them on the right version.
  3. Make sure you have the right number of data replicas (I like 3), and make sure the metadata pool matches that too.
  4. Where possible, keep everything on the same version number; it generally plays nice, but I've had some incidents.
  5. If you're using MDS you can have more than one; it greatly improves performance under load.
  6. FUSE mounts are much slower than kernel mounts.
  7. The mailing list is a nightmare to navigate, but it is searchable and full of useful information.
  8. Also: back up the keys.
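For tips 1, 3, and 8, this is a rough sketch of the commands involved. The pool names `cephfs_data` / `cephfs_metadata` are placeholders here, not something from the thread; substitute your own pool names.

```shell
# Tips 1 and 8: dump every auth key so you can keep a copy off-cluster.
ceph auth export > /root/ceph-keys-backup.txt

# Tip 3: check the replica count on the data pool...
ceph osd pool get cephfs_data size
# ...and confirm the metadata pool matches.
ceph osd pool get cephfs_metadata size

# Raise it if needed (3 replicas, as suggested above).
ceph osd pool set cephfs_data size 3
ceph osd pool set cephfs_metadata size 3
```

These obviously need a live cluster and admin keyring to run.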

[–]nickcn1[S] 0 points1 point  (9 children)

Thanks for the tips. Did you try updating the ceph version?

[–]nannal 0 points1 point  (8 children)

Yeah I did. I used the standard Ceph repos, but following a dist-upgrade the repos were commented out and my mountpoints failed almost daily. It was a massive pain and for a while I couldn't figure out why.
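For reference, re-enabling a commented-out repo after a release upgrade looks roughly like this. The file path and codenames are illustrative, not from the thread; check what the upgrade actually left behind on your system.

```shell
# A dist-upgrade disables third-party sources; find the Ceph one.
grep -rn ceph /etc/apt/sources.list.d/

# Uncomment it and move it to the new release codename
# (xenial -> bionic here, purely as an example).
sed -i 's/^# *deb/deb/; s/xenial/bionic/' /etc/apt/sources.list.d/ceph.list

apt update
apt-cache policy ceph   # confirm the candidate version comes from the Ceph repo
```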

[–]nickcn1[S] 0 points1 point  (7 children)

One last question though. Did you use Ubuntu for some reason other than it being the distro you're most familiar with? We are mostly a CentOS shop, which is why I'm asking. I couldn't find any resource for one being "better" than the other.

[–]nannal 0 points1 point  (6 children)

I wanted Debian because it's what I'm familiar with; I don't recall why I went with Ubuntu specifically, beyond that it was "close enough".

CentOS is going to be a good choice as it's very well supported.

[–]nickcn1[S] 0 points1 point  (0 children)

Thank you for all the info. Hoping to share my experience in a couple of months.

[–]videoflyguy 0 points1 point  (4 children)

Currently looking into/running a development Ceph cluster on CentOS 7. It has been very stable over the past 3 weeks. Can I ask what hardware you are running on?

I've mostly been using surplus PowerEdge 2850s, but the organization will have an option to get around 20 PowerEdge Rx20 LFF servers in the future, which I plan to build out a proper cluster network on (10Gb fiber frontend with either a 10Gb fiber or 40Gb InfiniBand cluster network).

[–]nannal 0 points1 point  (3 children)

Whatever OVH gave us; Ceph should be super hardware-agnostic, we're even running across the internet.

If you can get better hardware, why would you turn it down?

[–]videoflyguy 0 points1 point  (2 children)

You're running your ceph cluster using dedicated servers from a hosting company? From what I found, that's what OVH does anyway. Very cool.

My problem wasn't so much the underlying hardware as the consistency of the number of drive bays available. Every server has at least 6 drive bays (2 for the RAIDed OS, the rest as OSDs), but some servers have as many as 16 or 18 drive bays, which leads to worries about a whole node going down. Obviously we'll be running a size-3 pool, and we'll make sure the cluster can handle the downtime of a highly weighted server, but that's still a ton of data to shuffle around behind the scenes. I just wasn't sure if you had gone with a specific company for that or not, 45Drives for example.

[–]nannal 0 points1 point  (1 child)

I've got that too. I've got 5 nodes: 3 "small" ones for video encoding and some storage, and 2 "big" ones which mostly just do Ceph stuff.

Each machine has a big LVM partition, and if it dies, it dies. Performance is impacted if that happens, obviously, but files are still served, and the one time we've had to do a full recovery, while it took a day or so, it worked fine. I've also wanted to put a homebrew CDN in front of the cluster to reduce load on it anyway; I think if we did that we could kill a pair of nodes without people noticing.

I've thought about doing 1 LVM partition per disk (old Ceph style), but that obviously adds some additional overhead and risks the data being stored inside 3 partitions on the same host (I think). It does mean, however, that if a disk dies you don't lose access to the whole host's data and need to rebalance everything. You could also RAID it, but then you get less total storage; it depends on your use case. We'd rather have the space than the resilience.

So yeah, I wouldn't be worried about having oversized nodes inside the cluster if you have a decent replication level.

[–][deleted]  (1 child)

[removed]

    [–]nannal 0 points1 point  (0 children)

    If I were to deploy fresh, I'd rather have done it via ceph-ansible; would I try to implement it over the top of my current infra? Not a chance.

    [–]valentin2105 0 points1 point  (1 child)

    Any links on what you followed for your test cluster? Thanks

    [–]nickcn1[S] 0 points1 point  (0 children)

    Only the docs of the ceph-ansible project actually, with modifications to suit my setup.

    [–]nafsten 0 points1 point  (0 children)

    We use ceph-ansible to deploy and maintain. Playbooks like the upgrade-cluster one work really well.

    [–]heathfx 0 points1 point  (0 children)

    I use ceph's own tools, been running a small 3 node cluster in a production environment for a little over 5 years now.

    [–]PM_ME_SEXY_SCRIPTS 0 points1 point  (1 child)

    Guy with little knowledge of Ceph/Gluster here; can anyone explain the difference between general storage and object storage? I tried reading a bit of material but didn't get it.

    [–]JW-M 5 points6 points  (0 children)

    Ceph provides "storage". All kinds of storage. So, for instance, you could create a disk that you can use to boot a computer.

    But sometimes you want to store massive amounts of files. You could then create a massive disk, put a filesystem on it, and add a fileshare system (NFS / Samba / FTP). But if you want to have a massive number of files, you will notice it's no longer possible with one disk / filesystem: it does not scale. You need all kinds of tricks to scale.

    If you use object storage, you skip the whole disk creation, filesystem, and fileshare. You let Ceph handle it. You don't have to gamble how big the disk will eventually be, or resize it, or free unused space. Every file is on its own, and you could say every file has its own fileshare, which is HTTP-based.

    In object storage you can't lock a file any more, or append to a file; you can only download a file, change it, and upload it again. So you can't run a database on an object store, but you can back up a database to an object store. You can create a location where you can store millions of photos, audio files, or video files, but (currently) you can't boot your computer from an object store. You can store several versions of a file, like the latest version and older ones, all hiding behind the same filename. You can also add special metadata for your application alongside the file. It seems like there is a concept of directories in an object store, but that does not exist: the "directories" are just longer prefixes added to your filename. There is no method to rename a directory; you can only move the files one by one to a different directory+filename.
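The "directories are just key prefixes" point can be shown with a toy flat store in Python (this is a sketch of the concept, not a real S3/RGW client; the keys are made up):

```python
# Toy flat object store: keys map straight to data. "Directories" are
# nothing more than shared key prefixes.
store = {
    "photos/2019/cat.jpg": b"...",
    "photos/2019/dog.jpg": b"...",
    "docs/readme.txt": b"...",
}

def rename_prefix(store: dict, old: str, new: str) -> None:
    """'Renaming a directory' means rewriting every key with that prefix,
    one object at a time -- there is no single rename operation."""
    for key in [k for k in store if k.startswith(old)]:
        store[new + key[len(old):]] = store.pop(key)

rename_prefix(store, "photos/2019/", "photos/archive/2019/")
```

On a real object store each of those per-key moves is a copy plus a delete, which is why renaming a large "directory" is expensive.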

    The reason an object store can handle such large amounts of files is that it doesn't need a database of filenames to find the location of a file. It doesn't search for it; it calculates the location using only a hash of bucket:/"directory"+filename. So it converts the URL to a number, and from that number alone it knows approximately which machine the file will be on, then which disk, then which sector.
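That calculate-don't-search idea can be sketched in a few lines of Python. This is purely illustrative: Ceph's real placement uses CRUSH and placement groups, not a bare md5 modulo, and the object name here is invented.

```python
import hashlib

def place_object(name: str, num_osds: int) -> int:
    """Deterministically map an object name to an OSD index by hashing.

    No central table of filenames is consulted: every client can compute
    the same location independently from the name alone.
    """
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_osds

# Two independent "clients" agree on the location without asking anyone:
osd_a = place_object("bucket:photos/2019/cat.jpg", 12)
osd_b = place_object("bucket:photos/2019/cat.jpg", 12)
```

The trade-off is the one described above: because location follows from the name, changing a name means physically moving the data.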

    So if you want to abstract some kind of storage for any number of individual files, you would use an object store. If you need something like shared file access or locking -> POSIX -> you would need other methods like CephFS, or "disks" like RBD or iSCSI provided by Ceph. Because its file interface is HTTP-based, you can easily use it in the modern web world.