A couple of months ago, I finally made time to sit down and play with one of the most powerful features of Samba: clustering and high availability through CTDB.
CTDB stands for Clustered Trivial DataBase. TDB is a lightweight database used by Samba to store different types of persistent and volatile data. If you’ve ever installed, configured or managed Samba before, then you are probably familiar with the dozens of .tdb files in /var/lib/samba. Contained in those files are session information, share configuration, printer drivers and other data that smbd, winbind and nmbd need to keep running. CTDB is an implementation of TDB that works across multiple nodes in a cluster-like fashion. Effectively, it synchronises a node’s local database with its peers, while at the same time maintaining consistency and providing locking and recovery. The end result is that multiple nodes can present a single, logical, service-driven view. Provided your storage and network can keep up, you can scale out services like NFS, FTP and of course SMB/CIFS almost linearly! And when you build your configuration on top of a working Linux cluster stack, not only do you get performance, you also get fault tolerance.
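If you want to see just how many of those .tdb files a node is carrying around, a few lines of Python will do. This is only a rough sketch: the `list_tdb_files` helper is a name I made up for illustration, and the /var/lib/samba default is simply the path mentioned above, which may well differ on your distribution or build. It lists file names and sizes only and does not read the database contents.

```python
from pathlib import Path


def list_tdb_files(directory="/var/lib/samba"):
    """Return sorted (name, size_in_bytes) pairs for .tdb files directly inside directory.

    Hypothetical helper for illustration; the default path is an assumption
    and depends on how Samba was packaged or built.
    """
    root = Path(directory)
    if not root.is_dir():
        # Missing or unreadable directory: report nothing rather than fail.
        return []
    return sorted(
        (path.name, path.stat().st_size)
        for path in root.glob("*.tdb")
        if path.is_file()
    )


if __name__ == "__main__":
    for name, size in list_tdb_files():
        print(f"{size:>10}  {name}")
```

You will probably need to run it as root, since the Samba state directory is usually not world-readable.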
Now I can already hear some saying “Why not just use DFS on Windows? Doesn’t it already do the job?”, and to some degree this is true. DFS uses replication and the concept of “namespaces” to consolidate multiple file servers. However, it wasn’t designed to be the solution to scalability and performance. In my opinion, only the SMB-related features of the upcoming Windows Server 2012 even come close, but many of those performance-related features and enhancements will require Windows 8 clients.