MySQL – How to limit fields loaded from CSV

[adsense id=”0514458240″ width=”468″ height=”60″]

I have seen this many times. Let’s imagine we receive a large CSV file with many pieces of information but we are only interested in one or two field values to be loaded into a MySQL table. What can we do in order to load only those fields.

There are several options:

  • We can pre-process the file to extract only those fields that we need into a separate CSV file
  • We can load all the data into a staging table and then write more code to load only the data and fields into our live table.
  • We could use LOAD DATA INFILE using session variables to load data directly into the live table

Below, we are going to demonstrate the loading of the data directly into the live table using session variables.

Let us imagine we have received a file with an employee id and an employee name. We only need to load the name into the emp table.

mysql> desc emp;
+--------+--------------+------+-----+---------+----------------+
| Field  | Type         | Null | Key | Default | Extra          |
+--------+--------------+------+-----+---------+----------------+
| empid  | int(11)      | NO   | PRI | NULL    | auto_increment | 
| name   | varchar(100) | YES  |     | NULL    |                | 
| deptid | int(11)      | YES  |     | NULL    |                | 
+--------+--------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

The file contents are as follows:

id,name
5,"Tom"
6,"Dick"
99,"Harry"

Now we need to write the LOAD DATA INFILE code to load the name into the employee table. Leaving the empid field NULL will result in auto_increment assigning a value and deptid will remain NULL.

LOAD DATA INFILE 'names.txt' INTO TABLE emp
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
IGNORE 1 LINES
(@id, name);

By storing the id into a session variable @id which is not used means that this will be ignored. It should be noted that we should have a session variable or field name defined for each field in the CSV file in order to avoid warnings appearing:

mysql> show warnings;
+---------+------+---------------------------------------------------------------------------+
| Level   | Code | Message                                                                   |
+---------+------+---------------------------------------------------------------------------+
| Warning | 1262 | Row 1 was truncated; it contained more data than there were input columns | 
| Warning | 1262 | Row 2 was truncated; it contained more data than there were input columns | 
| Warning | 1262 | Row 3 was truncated; it contained more data than there were input columns | 
+---------+------+---------------------------------------------------------------------------+
3 rows in set (0.00 sec)

Now let’s see how this all works:

mysql> truncate table emp;
Query OK, 0 rows affected (0.01 sec)

mysql> LOAD DATA INFILE '/home/rcashell/mysql/load/names.txt' INTO TABLE emp
    -> FIELDS TERMINATED BY ','
    -> OPTIONALLY ENCLOSED BY '"'
    -> IGNORE 1 LINES
    -> (@id, name);
Query OK, 3 rows affected (0.01 sec)
Records: 3  Deleted: 0  Skipped: 0  Warnings: 0

mysql> select * from emp;
+-------+-------+--------+
| empid | name  | deptid |
+-------+-------+--------+
|     1 | Tom   |   NULL | 
|     2 | Dick  |   NULL | 
|     3 | Harry |   NULL | 
+-------+-------+--------+
3 rows in set (0.00 sec)

One response to “MySQL – How to limit fields loaded from CSV”