MySQL – data migration example

[adsense id=”0514458240″ width=”468″ height=”60″]

Data migrations from one database to another can give rise to unusual processing requests. In this example, a temporary table has been created and has been populated with data. The table consists of three columns: f1 int, f2 int and f3 which is a string. The 3rd column contains information in the form “a,b,c” or “a-b-c” or “a;b;c”. The information in the 3rd column must be expanded out to insert information into the live table with records { f1, f2, “a”}, {f1, f2, “b”} and {f1, f2, “c”}.

1 9 a,b,c
2 9 a;b;c
3 9 a-b-c

Should return

1 9 a
1 9 b
1 9 c
2 9 a
2 9 b
2 9 c
3 9 a
3 9 b
3 9 c

There are two ways to perform this action, the first is pre-loading the information into the database expand the data out using command line tools such as awk or create a store procedure which takes the data and merges it into the live table.

The sample code provided here deals with the second scenario.

DROP TABLE IF EXISTS tmp_info;
CREATE TABLE tmp_info(f1 int, f2 int, f3 varchar(20));
INSERT INTO tmp_info VALUES (1, 9, 'a,b,c');
INSERT INTO tmp_info VALUES (2, 9, 'a;b;c');
INSERT INTO tmp_info VALUES (3, 9, 'a-b-c');
DROP TABLE IF EXISTS t_info;
CREATE TABLE t_info(f1 int, f2 int, f3 varchar(20));
DROP PROCEDURE IF EXISTS appenddata;
DELIMITER $$
CREATE PROCEDURE appenddata()
BEGIN
	DECLARE done INT DEFAULT 0;
	DECLARE a,b INT;
	DECLARE lstart, lend INT;
	DECLARE c VARCHAR(20);
	DECLARE cur CURSOR FOR SELECT f1,f2,f3 FROM tmp_info;
	DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
	OPEN cur;
	read_loop: LOOP
		FETCH cur INTO a, b, c;
		IF done THEN
			LEAVE read_loop;
		END IF;
		-- Keep all using the same format
		SET c = REPLACE(c, ";", ",");
		SET c = REPLACE(c, "-", ",");
		SET lstart = 1;
		SET lend = LOCATE(",", c, lstart);
		WHILE lend > 0 DO
			INSERT INTO t_info(f1, f2, f3) VALUES (a, b, SUBSTRING(c, lstart, lend-lstart));
			SET lstart = lend+1;
			SET lend = LOCATE(",", c, lstart);
		END WHILE;
		IF c <> "" THEN
			INSERT INTO t_info(f1, f2, f3) VALUES (a, b, SUBSTRING(c, lstart));
		END IF;
	END LOOP;
	CLOSE cur;
END;
$$
DELIMITER ;
call appenddata();
SELECT * FROM t_info;

The output from this process is as follows:

f1	f2	f3
1	9	a
1	9	b
1	9	c
2	9	a
2	9	b
2	9	c
3	9	a
3	9	b
3	9	c