The SHA1 algorithm is a cryptographic hashing algorithm. It can get used for creating hash values of arbitrary binary data. MySQL already allows to generate a SHA1 value from a single value. This library allows to create a SHA1 value from a list of fields. This could also get achieved manually by a construct like "SHA1(CONCAT(field1, field2, field3))". Such constructs could yield the same SHA1 value if field1 and field2 get preped accordingly.
Example: SHA1(CONCAT('abc', 'defghi', '123') == SHA1(CONCAT('abcdef', 'ghi', '123)
To avoid such issues this UDF plugin was created. It doesn't simply concat the values but uses an (internally escaped) separation character "|" to separate values. See "Internal escaping" for more information on this issue.
###General notes of usage###
This plugin supplies only one function "fields_sha1()". Data gets handled as it is passed from mysql. As mentioned already the values get not simply concatenated but rather fed into the SHA1 algorithm by a custom fashion using separators and escape characters.
Calling fields_sha1(fieldname) on a single field will yield the same result as when using the mysql internal functino sha1(fieldname). The difference starts as soon as multiple fields get passed to fields_sha1(). Also a call to fields_sha1(integerField) will yield a different value as a call to sha1(integerField) as this UDF will hash the binary value of the integer field while sha1() casts the integer to a string.
Be aware that unlinke sha1() this function will return a 20-byte binary value. So if you want to display it directly in the mysql shell use "hex(fields_sha1(field))" instead.
###fields_sha1([arg1,..,argN])###
Every field passed to fields_sha1() will get integrated into the hash.
#####[arg1,..,argN]#####
A (possibly empty) list of values of any type.
- The argumentlist maybe empty in which case the same result as for "sha1('')" will get returned (da39a3ee5e6b4b0d3255bfef95601890afd80709).
#####returns#####
A 20-byte binary sha1 hash generated by cumulating all passed fields into a sha1 hash. Use MySql hex() method to get a non-binary hex string.
####Installation####
Build the plugin:
autoreconf
./configure
make
sudo make install
sudo make installdb
You need to have the following packages installed (Debian/Ubuntu): libtool, autoconf, automake (and of course "make" and every other packages required for compiling)
Place the shared library binary in an appropriate location. Log in to mysql as root or as another user with sufficient privileges, and select any database. Then, create the function using the following DDL statement (above "make installdb" will do the same):
CREATE FUNCTION fields_sha1 RETURNS STRING SONAME 'lib_mysqludf_fields_sha1.so';
The function will be globally available in all databases.
To deinstall the function, run the following statement in your mysql shell or execute below make targets:
DROP FUNCTION fields_sha1;
To remove the library from your system type in the directory where you built the plugin:
make uninstalldb
make uninstall
####Examples####
Create a hash for some fields:
SELECT hex(fields_sha1(customer_id, first_name, last_name, last_update)) AS hash FROM customer WHERE customer_id=1;
+------------------------------------------+
| hash |
+------------------------------------------+
| 40bd001563085fc35165329ea1ff5c5ecbdbbeef |
+------------------------------------------+
Create a hash from static values:
SELECT hex(fields_sha1('This is a test')) AS hash;
+------------------------------------------+
| hash |
+------------------------------------------+
| A54D88E06612D820BC3BE72877C74F257B561B19 |
+------------------------------------------+
SELECT SHA1('This is a test') as internal_hash;
+------------------------------------------+
| internal_hash |
+------------------------------------------+
| a54d88e06612d820bc3be72877c74f257b561b19 |
+------------------------------------------+
Create a hash from multiple static values. Note the difference between "fields_sha1()" in the first two examples and that theres no difference between the two last calls using "sha1()"
SELECT hex(fields_sha1('This is a |test', 'abc')) AS multi_hash;
+------------------------------------------+
| multi_hash |
+------------------------------------------+
| BB0318666AD1138192C575124F7E842820B485AA |
+------------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT hex(fields_sha1('This is a ', 'test|abc')) AS multi_hash;
+------------------------------------------+
| multi_hash |
+------------------------------------------+
| 6ACC4FA9E180BB6BE73EDE8768C22D51FBEC5306 |
+------------------------------------------+
1 row in set (0.01 sec)
SELECT sha1(CONCAT_WS('|', 'This is a |test', 'abc')) AS internal_hash;
+------------------------------------------+
| internal_hash |
+------------------------------------------+
| 8b92198d1b00d577fca71ff1d6a5cc6891f84580 |
+------------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT sha1(CONCAT_WS('|', 'This is a ', 'test|abc')) AS internal_hash;
+------------------------------------------+
| internal_hash |
+------------------------------------------+
| 8b92198d1b00d577fca71ff1d6a5cc6891f84580 |
+------------------------------------------+
1 row in set (0.00 sec)
###Internal escaping###
As already mentioned the plugin uses an internally escaped separation character "|" to separate values. As this separation character gets escaped it is not possible to craft field values which would yield the same result.
Assume the following calls SHA1(CONCAT_WS('|','abc|def', 'ghi', '123') SHA1(CONCAT_WS('|', 'abc', 'def|ghi', '123')
If the "|" in passed values wasn't escaped internally the hashed value would be "abc|def|ghi|123" in both cases. But as internal escape happens the hashed values will be:
'abc\|def|ghi|123' 'abc|def\|ghi|123'
As both strings are different the hash will also be. The "" character itself gets also escaped. Every other character (byte) stays as is.