12 Oct 2004

Database refactoring

I recently studied a database refactoring:

Before refactoring we had a user table like this:
User
UserId(PK) | FisrtName | LastName | LastConnexion | Login | Password
The evolution: 1 user can now have multiple login/password (let's say one login for the official site and one for the administration site)
So the logical database scheme evolution would be:

User
UserId(PK) | FisrtName | LastName
Login
UserId(FK) | Admin(boolean) | LastConnexion | Login | Password
with Login table PK is (UserId + Admin)

So now we have:
User (1 to many) Login

If we focus on the 4 basic sql queries: INSERT, UPDATE, SELECT, DELETE
Before evolution:

INSERT INTO User(UserId, FisrtName, LastName, LastConnexion, Login, Password)
VALUES (value1, value2,....);
UPDATE User
SET LastConnexion = new_value
WHERE UserId = some_value;
SELECT UserId, FisrtName, LastName
FROM User
WHERE Login = some_value;
DELETE FROM User
WHERE UserId = some_value;
If we do the "classical" refactoring we need to modify the existing sql queries (modification are in bold/italic):
INSERT INTO User(UserId, FisrtName, LastName)
VALUES (value1, value2,....);
INSERT INTO User(UserId, Admin, LastConnexion, Login, Password)
VALUES (value1, 0, value2,....);
UPDATE Login
SET LastConnexion = new_value
WHERE UserId = some_value;
SELECT UserId, FisrtName, LastName
FROM User, Login
WHERE Login = some_value
AND User.UserId = Login.UserId
AND Admin = 0;
DELETE FROM Login
WHERE UserId = some_value
AND Admin = 0;
DELETE FROM Login
WHERE UserId = some_value
AND Admin = 1;
DELETE FROM User
WHERE UserId = some_value;
We can see that this modification create many changes in code.

Alternative solution:
Replace User Table by a User View with (almost the same data)
DELETE User Table (save data before)
CREATE TABLE NewUser
NewUser
UserId(PK) | Admin(PK) | FisrtName | LastName | LastConnexion | Login | Password
CREATE VIEW User
AS
SELECT (UserId, Admin, FisrtName, LastName, LastConnexion, Login, Password) FROM NewUser
WHERE Admin = 0;
INSERT INTO User(UserId, Admin, FisrtName, LastName, LastConnexion, Login, Password)
VALUES (value1, 0, value2,....);
UPDATE User
SET LastConnexion = new_value
WHERE UserId = some_value;
SELECT UserId, FisrtName, LastName
FROM User
WHERE Login = some_value;
DELETE FROM User
WHERE UserId = some_value
AND Admin = 0; (not sure you need this)
You can see that modifications are reduce in code.
BUT the drawbacks of this technic is that FisrtName and LastName are repeated unnecessarily which can lead to invalid data.

You need to balance time to change code vs (database space lost + risk of invalid data)

Additionnal informations: one good place to start is here: http://www.agiledata.org/

No comments: