Page 1
Where are the usability studies for us : developers?
Page 2
Index
See this article in english 
Page 3
La "Martenitsa" búlgara

MySQL and Data Truncation : A descent into IEEE hell

Alexander Hristov

 

20 - Oct - 2006

It all started very simply...

After upgrading a MySQL installation to MySQL 5.0.23a and to Connector/J 5.0.3 I started getting wierd data truncation exceptions everywhere. After some digging, I found that Connector/J had turned on a "strict JDBC compliance" mode (the new jdbcCompliantTruncation property) that causes previously silenced warnings about data truncations to be converted into full-fledged exceptions.

Debugging my own code

"Oh well", I said, time to fix some sloppy code.... I digged some more, and found some columns that had a size too small for the data I was trying to insert into them. They had gone mostly unnoticed because the data that was cropped was not significant (the typical "nanosecond" part of a timestamp converted to string). But the strange thing was that I kept getting those exceptions, even if everything was fitting. After some debugging, I managed to isolate the offending code into something very simple (thus I thought then): a double value (say 8517.44) was being inserted into a decimal(5,2) column. But even though the data fit, I kept getting the exception.

"How strange- This couldn't possibly be happenning". It must be some of those stupid mistakes of mine, It's impossible such an obvious bug to have passed the MySQL team. As always, one should always assume that the error is his instead of blaming it to others. So I read and re-read the decimal(p,q) specification just in case I was omitting something trivial. I was not.

So I turned on the mysql console and created a small test case:

  mysql> create table foo ( cost decimal(6,2) not null ) engine=InnoDB;
  Query OK, 0 rows affected (0.17 sec)

  mysql> insert into foo values (8517.44);
  Query OK, 1 row affected (0.03 sec)

  mysql> show warnings;
  Empty set (0.03 sec)

  mysql>

Well, as I said, I never really expected the bug to be so obvious. So the thing must be on the java-side, right? I put together a small test:

Test.java
 
import java.sql.*;
import java.util.Properties;

public class Test {
  public static void main(String[] args) throws Exception {
    Class.forName("com.mysql.jdbc.Driver");
    Properties p = new Properties();
    p.setProperty("user","user");
    p.setProperty("password","password");
    p.setProperty("jdbcCompliantTruncation","true");
    p.setProperty("autoReconnect","true");
    p.setProperty("characterEncoding","UTF-8");
    Connection con = DriverManager.getConnection("jdbc:mysql://localhost:3306/test",p);
    
    Statement sql = con.createStatement();
    sql.executeUpdate("drop table if exists foo ");
    sql.executeUpdate("Create table foo (cost decimal(6,2) NOT NULL) ENGINE=InnoDB");
    
    PreparedStatement ps = con.prepareStatement("insert into foo values (?)");
    double value= 8517.44;
    ps.setDouble(1,value);
    ps.executeUpdate();
  }
}

 

After running this test, I immediately got my exception:

Exception in thread "main" com.mysql.jdbc.MysqlDataTruncation: 
Data truncation: Data truncated for column 'cost' at row 1 at com.mysql.jdbc.SQLError.convertShowWarningsToSQLWarnings(SQLError.java:717) at com.mysql.jdbc.MysqlIO.scanForAndThrowDataTruncation(MysqlIO.java:3031) at com.mysql.jdbc.ServerPreparedStatement.serverExecute(ServerPreparedStatement.java:1250) at com.mysql.jdbc.ServerPreparedStatement.executeInternal(ServerPreparedStatement.java:693) ...

Definitely wierd... Now I tried to reduce the error even more, by using directly a Statement instead of a PreparedStatement:.

 
...
    Connection con = DriverManager.getConnection("jdbc:mysql://localhost:3306/test",p);
    Statement sql = con.createStatement();
    sql.executeUpdate("drop table if exists foo ");
    sql.executeUpdate("Create table foo (cost decimal(6,2) NOT NULL) ENGINE=InnoDB");
    sql.executeUpdate("insert into foo values(8517.44)");
...

 

But this one worked!. Hmmm... Seems it affects only prepared statements. I investigated further and discovered that PreparedStatements in JDBC can be handled either client-side (in the driver) or server-side. When handled client-side, the statements are filled with values and sent to the server as ordinary statements, while in the other case, it's the server that does the precompilation and parameter binding. Obviously the second case is more efficient, since the server can associate execution plans, dependancies, etc. to a prepared statement and doesn't need to re-analyze each execution. Whether a Java PreparedStatement is handled client-side or server-side is controlled by the useServerPrepStmts driver property. I tried the above test case setting useServerPrepStmts to false, and it worked without throwing an exception. So the error had just been further delimited by that fact.

Debugging the JDBC driver

I tried the error on a couple of other MySQL servers, running on Fedora Core 5 systems, just to make sure that it wasn't something with my particular version, and it was still there. So I filed a bug with mysql. Of course, I couldn't just sit still while waiting for a reply - I just had (stupid me) to investigate further.

In my years of programming, I've had my share of misgrievings with floating point values, so I thought that this might be a problem related to how doubles are sent to the server over the wire. Maybe 8517.44 was sent as 8517.43999999 after all and maybe that the server was thinking that I was trying to insert a number with many decimals in an decimal(x,2) column, and this was the reason for the data truncation error. So a quick visit to this online IEEE 754 decimal to binary conversion allowed me to get the hex representation of 8517.44 and to check if the internal binary representation matched the value exactly or not. For a 32 bit float, the value is represented as an approximation, but for a 64 bit double, the value has an exact representation of 40C0A2B851EB851F:

IEEE Converter

So the problem wasn't there either

Next stop - the MySQL JDBC Driver (Connector/J). Maybe the driver wasn't sending the correct IEEE representation, which I thought was quite unlikely, but it had to be tested anyway. How could I see what the driver was doing? Well : proxies to the rescue. I wrote a small program that listened on a port (say, 3308) and simply relayed data sent to that port to a different port (3306 - the port of the mysql server) and vice-versa : a sort of poor man's network monitor. Using that program I could see what the driver was sending to the server, and what the server was replying.

FROM CLIENT TO SERVER: 31 bytes
1b 01 01 01 16 69 6e 73 65 72 74 20 69 6e 74 6f 		.....insert into
20 66 6f 6f 20 76 61 6c 75 65 73 20 28 3f 29    		 foo values (?) 
FROM SERVER TO CLIENT: 52 bytes
01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 		................
17 01 01 01 01 64 65 66 01 01 01 01 3f 01 01 3f 		.....def....?..?
01 01 01 01 01 fd 80 01 01 01 01 01 01 01 01 fe 		.....?........?
01 01 01 01                                     		....            
FROM CLIENT TO SERVER: 26 bytes
16 01 01 01 17 01 01 01 01 01 01 01 01 01 01 01 		................
01 01 1f 85 eb 51 b8 a2 c0 40                   		...?.Q...@      
FROM SERVER TO CLIENT: 11 bytes
01 01 01 01 01 01 01 01 01 01 01                		...........     
FROM CLIENT TO SERVER: 18 bytes
01 01 01 01 01 53 48 4f 57 20 57 41 52 4e 49 4e 		.....SHOW WARNIN
47 53                                           		GS              
FROM SERVER TO CLIENT: 173 bytes
01 01 01 01 01 1b 01 01 01 01 64 65 66 01 01 01 		..........def...
01 4c 65 76 65 6c 01 01 3f 01 01 01 01 01 fd 81 		.Level..?.....?
01 1f 01 01 1a 01 01 01 01 64 65 66 01 01 01 01 		.........def....
43 6f 64 65 01 01 3f 01 01 01 01 01 01 a1 01 01 		Code..?......?.
01 01 1d 01 01 01 01 64 65 66 01 01 01 01 4d 65 		.......def....Me
73 73 61 67 65 01 01 3f 01 01 01 01 01 fd 81 01 		ssage..?.....?.
1f 01 01 01 01 01 01 fe 01 01 01 01 34 01 01 01 		.......?...4...
01 4e 6f 74 65 01 31 32 36 35 29 44 61 74 61 20 		.Note.1265)Data 
74 72 75 6e 63 61 74 65 64 20 66 6f 72 20 63 6f 		truncated for co
6c 75 6d 6e 20 27 63 6f 73 74 27 20 61 74 20 72 		lumn 'cost' at r
6f 77 20 31 01 01 01 01 fe 01 01 01 01          		ow 1....?...   

Now of course I have no idea how the data protocol between the mysql server and a driver works, but even in this case, seeing the dump some pretty clear conclusions can be extracted, specifically :

  1. that the binary value of the number was being sent correctly. AND
  2. that this error probably is not related to the JDBC driver

To test the second hypothesis, I turned to Connector/NET - the .NET driver for MySQL, and created a quick Visual Basic.NET test:

Test.vb
 
Imports MySql.Data.MySqlClient


Public Class Form1
  Private Sub Button1_Click(
    ByVal sender As System.Object, 
    ByVal e As System.EventArgs) Handles Button1.Click

    Dim s As String =         
  "Database=Test;Data Source=localhost;User Id=user;Password=password;Port=3308"
    Dim conexion As New MySqlConnection(s)
    conexion.Open()
    Dim cmd As New MySqlCommand("INSERT INTO foo VALUES (?val)", conexion)
    cmd.Parameters.Add("?val", 8517.45)
    cmd.Prepare()
    cmd.ExecuteNonQuery()

    End Sub
End Class

 

I was still using the 3308 port because I wanted to compare what the .NET driver was sending with the data sent by the JDBC driver. Well, the data was the same, and the result was the same (a data truncation warning on the part of the server), but the .NET driver simply chose not to raise an exception. So it was definitely a server problem.

Meanwhile, on the first bug, I already got the reply that it could not be reproduced (hmmm), so I filed a new bug with more precise information and waited....

 

Debugging MySQL

After some time, I got tired of waiting and it gleaned on me that I could try to track down the bug myself. Somebody should have hit me hard at that point: C and C++ are not my favourite languages (and my experience with them is slim), and mysql is not precisely your "afternoon homework" kind of software. But I have always been a fan of the "everything is possible except getting back your taxes" school of thought, so I said that - given enough time and printfs() - I could at least find the bug, and in the best case fix it. Back in the golden era of BBSs, I had fixed some pesky bugs in WWIV, so no matter how big it was, mysql was just a software program, after all.

Next stop - the mysql source distribution.Never in my life I thought I would be doing this. Anyway there I went... I downloaded the mysql-5.0.27-nightly-20061009 distribution (which corresponds more or less to the 5.0GA version I was using) and tried to compile it under VC++ 2005. Avalanche of warnings (ok, whatever....) and some errors. "algorithm.hpp" not found. What the hell? Wasn't this supposed to compile without problems? Nevermind, I track down the files that the thing is complaining about and move them to some directory that it can find them. Wow! I have compiled mysql!. It even works!.  I run my Java test against this version, just to make sure that the error is still here, and I get once again the data truncation exception.  Ok, we are ready to go.

Now for the hard part - I switch the Active Solution Configuration to "debug" and rebuild the whole thing again. I had to find the place where the parameters sent over the wire are converted to double values. MySQL has a reasonably well organized source code. I went to the mysqld project and inside it I found the sql_prepare.cpp file with lots of set_param_xxxx functions. One of them was set_param_double, which I assumed that was the one to be called when a prepared statement has double parameters:

sql_prepare.cpp
 
static void set_param_double(Item_param *param, uchar **pos, ulong len)
{
  double data;
#ifndef EMBEDDED_LIBRARY
  if (len < 8)
    return;
  float8get(data,*pos);
#else
  doubleget(data, *pos);
#endif
  param->set_double((double) data);
  *pos+= 8;
}

 

Seems pretty straightforward stuff. the float8get macro was something like:

config-win.h
 
#define float8get(V,M) doubleget((V),(M))
#define doubleget(V,M)  do { *((long *) &V) = *((long*) M); \
          *(((long *) &V)+1) = *(((long*) M)+1); } while(0)


 

Now this is precisely the reason I've always hated C style languages. These endless chains of pointers to pointers to YET MORE POINTERS. And don't you just love how two statements are enclosed in a ficticious while loop just to make them one statement? Anyway, doubleget doesn't seem a big deal : just moves around two 4-byte longs from a buffer to somewhere else (in our case, to the "data" variable). Ok, I set a breakpoint in the param->set_double statement, I also set watches to view the contents of the **pos buffer and run the server, and then run the test:

MySQL debugging

The faithful watches tell me that the binary representation has arrived correctly: we see the 1f85eb51b8a2c040 sequence in the buffer. But then the value of data is 8517.4400000000005 (!) .WTF?!?

Now things are getting hairy.  Of course, there are multiple possible double values with the same binary representation. Usually this won't matter unless doing some very accurate calculations or unless doing many floating point operations in which the approximation errors will pile up.

I decide to put this issue aside while I investigate how mysql decides that a truncation has occurred. Again, I have no idea where or how this happens, but again "differential debugging" comes to the rescue. I start stepping through the server looking for hints from method names, until I arrive at a call to mysql_reset_errors() , which is defined in sql_error.cpp:

sql_error.cpp
 
void mysql_reset_errors(THD *thd, bool force)
{
  DBUG_ENTER("mysql_reset_errors");
  if (thd->query_id != thd->warn_id || force)
  {
    thd->warn_id= thd->query_id;
    free_root(&thd->warn_root,MYF(0));
    bzero((char*) thd->warn_count, sizeof(thd->warn_count));
    if (force)
      thd->total_warn_count= 0;
    thd->warn_list.empty();
    thd->row_count= 1; // by default point to row 1
  }
  DBUG_VOID_RETURN;
}

 

Great! so the warning count is stored in thd->total_warn_count. Now getting to the point where the warning is produced is easy. I simply set a watch and executed step by step until I saw that the count was set to 1, and this happens in field.cpp in the store() mehtod of Field, and the specific virtual method called is Field_new_decimal::store(). Here we see a piece of code like this one (the TODO annotation promises nothing good...)

field.cpp
 
  err= double2my_decimal(E_DEC_FATAL_ERROR &amp; ~E_DEC_OVERFLOW, nr,
                         &amp;decimal_value);
  /*
    TODO: fix following when double2my_decimal when double2decimal
    will return E_DEC_TRUNCATED always correctly
  */
  if (!err)
  {
    double nr2;
    my_decimal2double(E_DEC_FATAL_ERROR, &amp;decimal_value, &amp;nr2);
    if (nr2 != nr)
      err= E_DEC_TRUNCATED;
  }

 

Things are becoming clear - MySQL is converting my little 8517.44 to a decimal, then back to a double, and if any difference is found (nr2 != nr1), a data truncation error is raised. GREAT shit! shit! shit! shit!  Now, this is exactly the way IEEE 754 doubles are supposed NOT to be compared. Where did the machine epsilon go?  Down the drain, I guess... No wonder that when we come back from this roundtrip conversion, we have a different result:

Bad float comparison

But alas, this is the least of my problems because the biggest surpirse comes now:

decimal.c
 

int double2decimal(double from, decimal_t *to)
{
  /* TODO: fix it, when we'll have dtoa */
  char s[400], *end;
  sprintf(s, "%.16G", from);
  end= strend(s);
  return string2decimal(s, to, &end);
}

 

OMG! OMG! OMG! OMG! It's converted to a string using sptinrf ...using a fixed value of 16 significant digits and then the resulting string is analyzed in string2decimal. And of course, sprintf produces "8517.4400000000005" and thus a data truncation.

 

The IEEE to decimal nightmare

Well, after all we'll have to delve into the nasty IEEE details. I start from scratch and write a program to compute the exact decimal value represented by the 40C0A2B851EB851F  sequence, which turns out to be

1.0397265625000000621724893790087662637233734130859375 x 213 =
8517.440000000000509317032992839813232421875

So after all, the IEEE converter was wrong, too. The hex string is not an exact representation of my number. Had I known this in advance, I'd have saved quite a lot of time and head-scratching. So much for trusting *.edu domains.

In Java, we are accustomed to recovering what we set:

Java
double x = 8517.44; System.out.printf("%.17G\n",x); // prints 8517.4400000000000
System.out.printf("%.27G\n",x); // prints 8517.44000000000000000000000

 

But this is not a given, as we find in [1] : "(In Java) The binary to decimal conversion is done inexactly for practical reasons. The roundtrip binary -> decimal -> binary can be exact since the inexactness is correlated"

The Double.toString() javadoc states it clearly:

"How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0."

If we try the same thing in C or C++, we get quite a different result:

C using gcc
double x = 8517.44; printf("%.17G\n",x); // prints 8517.4400000000005
printf("%.27G\n",x); // prints 8517.44000000000050931703299

 

And the thing even depends on the compiler:

C using Visual C++ 2005
double x = 8517.44; printf("%.17G\n",x); // prints 8517.4400000000005
printf("%.27G\n",x); // prints 8517.44000000000050000000000

 

Distributing the blame and conclusion

Who is right? Well, (almost) everybody is to some degree right and everybody is wrong to some degree.

I was wrong because I should have thought about what was really happening when I did double x = 8517.44 before doing anything else, and should not have used doubles in the first place. However, I'm also right because I expect a driver to do its job and isolate me from the specifics of how the server and the language it's written on deals with binary -> decimal conversions.

The IEEE converter is wrong because it pretends to be precise, but then falls short of it:

Converter bug

As I said before, this is the first place I went to. Had I known that 8717.44 was not exactly represented by the 40C0A2B851EB851F , I'd have saved quite a lot of time.

Java is right because the priority Java "Wants binary -> decimal -> binary" conversion to reproduce the original value, in order to allow "text to be used for reliable data interchange" [1]. But of course Java is wrong also, because the number stored in x is not the one printed and used in calculations. As Darcy says "A floating point number is not a stand-in for nearby values".

gcc is right because the value actually stored is 8517.440000000000509317032992839813232421875, regardless of where it came from. The priorities of gcc are different, and it just represents the data it has. Because of this, gcc is wrong in the sense that conversions between float->string->float->string end up drifting if not enough precision digits are printed.

Well, after an exchange on the MSDN forums , I believe what VC++ does is unjustified. So VC++ is wrong without excuse: its result neither offers an accurate string representation of the value stored nor serves any other end. It's just a hard wired arbitrary precision cap.

MySQL is wrong because it doesn't know where the float comes from, so it doesn't know if the value came from a calculation, or if it comes from a decimal -> binary conversion. So it adopts the VC++ approach of hard-wiring a precision. This is bad for two reasons:

First, the number may come from a calculation and the result may have more precision than MySQL thinks. A float can be exact to more than 17 decimal digits.

It seems that many people misread the "Binary to Decimal" paragraph of this paper and think that a floating point number has at most 17 decimal digits of precision. WRONG. This is true only when speaking about roundtrip conversions. For a counterexample, consider:

double x = 1/1099511627776d;
double y = 1/549755813888d;
double z = x+y;

Well, x and y are exact to 28 decimal digits, and the sum z is exact to 29 decimal digits. The rounding error in this operation is 0. Don't try to print the values of x,y and z in Java, though, because they will be truncated by the above rule.

As Darcy says, its a floating point myth to expect all results to be inexact.

Second, if the float really comes from a decimal, hard-wiring the precision it produces for some range more precision than is needed, and thus fails to recover the original decimal. A more reasonable approach would be the approach taken by Java, or as a poor man's alternative, reasoning as follows:  Representing 8517 requires 14 binary bits (213 = 8192 < 8517 < 16384 = 214). An IEEE double has a precision of 53 bits, so 39 bits remain. Using 39 binary bits we can represent 239 different numbers. How many decimal digits are necessary for representing those numbers? 12 decimal digits are enough. If we add the decimal digits required for 8517 (i.e., 4), we get a grand total of 16 total decimal digits for this specific range (although in general 17 digits are needed)

 

The unfortunate conclusion of all this is that (even if the float-comparison stuff was fixed in MySQL)

There is no way to proceed that is correct in all cases

And because of this, the approach taken by MySQL is probably as good as any other approach.

Trying to store a floating point binary number in a decimal field will result either in false data truncation notices in some cases, or in unreported real truncations in others. There's simply not enough information to decide, not for MySQL, not for Oracle, not for any other SQL Server.

So if you want a fool-proof conclusion:

DO NOT USE FLOATS OR DOUBLES WITH DECIMAL FIELDS

Use either BigDecimal or BigInteger

 

Resources

  1. "What Everybody Using the Java Programming Language Should Know About Floating Point Arithmetic", Joseph Darcy.

  2. "What Every Computer Scientist Should Know About Floating-Point Arithmetic", David Goldberg.

  3. "C++ FAQ on Floating Points". CodeGuru

  4. "Comparing floating point numbers", Bruce Dawson.

 

Comments

Apr 01, 2014 at 14:07 Sent by best prices
DVQpG7 Major thanks for the blog.Really thank you! Awesome.
Mar 22, 2014 at 11:34 Sent by seo for cheap
TIQubX Really appreciate you sharing this article post.Really thank you! Cool.
Mar 22, 2014 at 11:13 Sent by nice seo guys
9fk7sI Fantastic article post.Much thanks again.
Jan 18, 2014 at 09:53 Sent by eduzwsqp
la79wwuo hqb2b643 insurance izjp6od1 ao02sae8
Dec 13, 2010 at 17:45 Sent by Rick Regan
Alexander, Regarding your statement "This is true only when speaking about roundtrip conversions." It's not even necessarily true in that case. Take your example and enter x and y as decimal literals instead of doing the computations: double x = 0.0000000000009094947017729282379150390625; //2^-40 double y = 0.000000000001818989403545856475830078125; //2^-39 If you print them with sufficient precision (like with gcc), they will come back exact; that is, they will round-trip.
Feb 18, 2009 at 15:25 Sent by Deepak D Patil
Hats off to your efforts. Though not fruitful for you personally, but that must have given you (and lots of like me) great insight and satisfaction.
Feb 18, 2009 at 13:58 Sent by Jana
Alex, You have provided phenomenal amount of information in one single blog. I would have to reread to pick bagload information here. hope other people to read this good blog. thanks for the blog
Dec 21, 2008 at 01:52 Sent by Vince
Nice+post.+I%5C%27ll+return.e
May 29, 2008 at 15:38 Sent by anonymous
WOW - impressive work! I actually get the same error - with text though?? Could you do a study on that as well please? :-)
Mar 27, 2008 at 23:34 Sent by Mauro
Great explanation, but did you try to use ROUND(x,y) ? It´s work for me.
Nov 21, 2007 at 04:54 Sent by Chris Malan
An Herculanean effort. This must have taken some time to do, not to mention know-how. To put it mildly, I'm impressed.
Jul 28, 2007 at 12:38 Sent by anonymous
Very Good explaination. I got this exception when trying to use 'datetime' type in one of the col.
Mar 25, 2007 at 19:22 Sent by marcel
very good article! never again doubles!
Jan 29, 2007 at 12:33 Sent by Fizzl
Thanks for the very interesting article. I was actually just searching why I get exceptions for simple inserts to float fields from within Java -> mysql. I'm none the wiser how to proceed, but the article was very detailed and interesting. Perhaps I'll try to use some other field types for storing my floats. :)
Nov 15, 2006 at 20:55 Sent by ABR
Thanks, this is a great, helpful discussion.
Nov 15, 2006 at 14:00 Sent by abbas
problem same as with you. but i am using int .
Oct 31, 2006 at 17:35 Sent by anonymous
¡Muchas gracias por su página! Estoy estudiando chino y le agradezco mucho su aportación. ¡Muchas Gracias! Isabel.
Oct 30, 2006 at 19:21 Sent by Ramesh Belli
What an investigation. Great Job
Oct 25, 2006 at 08:43 Sent by Rory Winston
Excellent post, very interesting!

 

Add a Comment

Name (optional)
EMail (optional, will not be displayed)

Text