SqlCVS User's manual

From LinuxMCE
Revision as of 17:42, 28 October 2009 by Marie.o (Talk | contribs) (This page should not be deleted, as it contains a very nice overview over sqlCVS)

(diff) ←Older revision | view current revision (diff) | Newer revision→ (diff)
Jump to: navigation, search

Here is a general overview of how to use sqlCVS, designed to explain quickly what you can do with it. This assumes you have a master database, we will call 'masterdb' somewhere.

First, put the masterdb on a server somewhere where you will run sqlCVS. Once you do, you should not use the database directly again. If you have a web server, for example, that uses the database, it will now run against a separate copy of the database--not the masterdb that sqlCVS will be using. The masterdb that sqlCVS uses should only be used by sqlCVS, and everyone else will use local copies they checkout from sqlCVS.

Next, on the server, run sqlCVS passing it the login information for the masterdb. You will now use the create command to create repository(ies) within the masterdb. A repository is really just a collection of tables that you will want to logically group together. For example, you could create an 'Accounts Payable' repository which contained all the tables related to A/P. When local users want to do a check-out, they will check-out a whole repository. You can create just 1 repository and put all the tables in it, in which case all users will check-out and check-in the whole database at once. There may be some tables that you do not put in any repository, but then nobody will be able to access them using sqlCVS.

When creating the repository, we recommend you turn on 'history tracking'. This causes sqlCVS to create 2 other copies of every table, called [table]_pschist and [table]_pscmask. These copies has the same structure as the original table, but with all the constraints removed. Every time someone changes a row, the full contents of the row will be saved into the pschist table, and a record with bit flags added to _pscmask indicating which fields were modified. This is how sqlCVS can go back in time and undo changes, or restore the database to a prior state. Your original table will still have only the current values, just like it always did.

Next you run sqlCVS with the dump command. This creates a [repository].sqlcvs file that contains a working copy of the repository (or database) for you to give to the clients. The clients use the same sqlCVS program (binary) as the server--the whole program is in 1 file shared by both server and client. The client uses the import command to import the .sqlcvs into their local MySQL database. This becomes their working copy, and has the same structure as the original database you started with. It doesn't matter if the .sqlcvs file they import is very old. It's only a starting point--as soon as they do a sqlCVS update or sync the local copy will be brought current.

On the server you run sqlCVS with the listen command. That causes sqlCVS to open a tcp/ip port for an incoming connection from a client, and allows clients to connect to do checkins and updates.

On the client the user will continue to use all the same software without changing anything--he just points the database connection to his local copy rather than the server's. Your web site, your accounting program, etc. all will use the working copies and will no longer connect directly to the masterdb. Since the working copy is essentially identical to the original masterdb all the software that uses the database should be unaffected and should work fine with this client copy. The only difference in the database is the addition of some special psc_ fields at the end of each table. However, the fields all have default values and you do not ever need to touch them. The only time this may cause a problem for your software is if it does an INSERT statement without specifying the field names, assuming a certain number of fields. In that case, the insert statement may fail because the table now contains more fields. The other potential problem is if your table has a 'timestamp' field. One of the special psc_ fields is a 'timestamp' so sqlCVS can keep track of what records were modified, but MySQL does not allow a table to have 2 timestamps and still work as expected.

***IMPORTANT NOTE*** Some editors, like SQLyog, do not know how to properly handle timestamp fields. They reset the timestamp field every time the row changes. This is actually a bug in those editors since it defeats the whole purpose of a timestamp. PhpMyAdmin does not have this problem, and your software will not either since your software will be unaware of the timestamp field.

sqlCVS also added a psc_id field to each table. This is sqlCVS's internal ID for that row. It is permanent, and will never change. This is how sqlCVS is able to accurately know what row was modified. You can change all your fields, including the primary key, and sqlCVS will still update the correct row in the master database. This field defaults to NULL and is assigned a value only when you check-in the new row. That is how sqlCVS knows whether a row is new.

There is also a psc_user field. This contains the user id of the person who owns the record. The security settings can be changed, but by default only the owner of a record or an administrator can go back and change that record. The field defaults to NULL, and when the new rows are checked in, sqlCVS will set them to be the user id of whoever is doing the checkin--he becomes the "owner" of those records. This means that if you have 30 people using the database and adding records, only the person who does the checkin will later be the 'owner' of those rows. You may modify your application to set the psc_user manually, and sqlCVS will respect your values. However, once the row has been checked in, the sqlCVS on the server will not allow any local users to change this value. To change the owner of a record after checkin, you will need to update the record in the sqlCVS master database. Or you can turn off the security, allowing anyone to modify any records, or have only table-level security, and then it does not matter anymore. If you do keep the default row-level security and a user who is not an administrator modifies a row that does not belong to him or tries to modify a row or table that is marked as "frozen" or he does not have permission to modify, then when he does a sqlCVS checkin, the server will keep his modifications in a special place, and give him a batch #, together with the name of the user who owns those records. Then that user, or any administrator, can run sqlCVS 'approve batch' to have those changes put back into the master database.

If you have history tracking enabled in the sqlCVS master databsae, you can also checkout local copies of the database as of a given date, or with some modifications ignored. For example, you can see what the database looked like as of last month, or get a copy that omits all the modifications made by a certain user.

When you do a check-in, all changes, including new, deleted and modified rows, are committed in a "batch", and given a batch id. If the database engine you are using supports transactions (commits and rollbacks), then your checkins will be atomic. In other words, if you modified 100 rows, when you do a check-in, all 100 rows will be sent to the server in a batch. If there is a failure--maybe you entered some data in your local copy that violates a constraint--then none of the rows will be checked in. It's all or nothing. If the database engine does not support transactions, it's possible to get some rows checked-in, but not all. This will not break anything, and sqlCVS will checkin the rest of the changes in another batch, but usually atomic commits are preferred.

You can also see what modifications where made to the database by date, user, or batch. If you're an administrator, you can roll back the master database to it's state at a given time, or selectively remove some batches after the fact. There's also a command-line switch that forces strict verification of database integrity and will not allow rows that reference foreign keys that do not exist.