Planet MySQL - https://planet.mysql.com

  • Use Cases for MariaDB Data Versioning
    rasmusjohansson | Thu, 07/05/2018 - 19:54

    Working in software development, versioning of code is something we often take for granted. Task definitions and bug descriptions are preferably also managed by a system that versions every change. On top of this, we use a lot of documents for designing, documenting and managing our development cycles. For example, some of the tools we use to accomplish these things are Jira, Google Docs and our own Knowledge Base, all of which provide versioning support. In MariaDB Server 10.3, we've introduced an elegant and easy way of versioning data, called System-Versioned Tables. Let's look at what it can be used for.

    A Look Back at Data for GDPR and PCI DSS Compliance

    The General Data Protection Regulation (GDPR) is now enforced by the European Union (EU). All companies collecting user data in the EU have to comply with the GDPR rules. In addition to the daily emails asking you to agree to new terms because of GDPR, companies also have to store personal and private data in a way that fulfills the criteria of GDPR. Card payments have their own rules as well: there are standards like the Payment Card Industry Data Security Standard (PCI DSS), which are followed by banks and other online businesses. For compliance, three questions typically have to be answered about the data: 1) what happened and when, 2) who made the change, and 3) what did the data look like before and after? In MariaDB Server, the MariaDB Audit plugin is there for dealing with 1) and 2). It can also be used for 3) by looking at the changes recorded in the audit logs, but it doesn't give you the full data on how a row looked before and after. With the newly released System-Versioned Tables this is possible. Let's say that payment card information is stored in a database table. By turning on versioning for that table, every change creates a new version of the affected row(s). The rows are also time stamped, which means that you can query the table to see what a row looked like before (a minimal sketch of enabling versioning on an existing table is shown below, just before the step-by-step example).

    Handling Personal Data in Registries

    When you think about versioning, one thing that comes to mind is registries of all sorts, which are the domain of GDPR when it comes to handling personal data. There are many types of personal data, and one that is important to all of us is healthcare data, for example the patient registries of hospitals. In these registries, versioning is of great importance for keeping track of patients' health history and related information such as medication. Other personal data registries are civil registries, tax registries, school and student registries and employee registries. The list is endless.

    Rapidly Changing Data Sets

    All of the above examples of data versioning can be seen as slowly changing data. What I mean is that, although the systems can be huge and the total number of transactions enormous, each individual piece of data doesn't change that often. For example, my information in the civil registry doesn't change every second. But what if we have rapidly changing data, such as share prices at a stock exchange or vehicle tracking data for a shipping company? In these cases too, we can make use of MariaDB's data versioning. Creating applications or software for the above purposes on top of a database that provides data versioning out of the box leads to easier design, less customization and a more secure solution.
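    For the payment-card scenario above, here is a minimal sketch of switching versioning on for an existing table. The table name Payment is hypothetical; ADD SYSTEM VERSIONING is the documented MariaDB 10.3 syntax for versioning an already-existing table.

        -- Hypothetical existing table holding card payments.
        ALTER TABLE Payment ADD SYSTEM VERSIONING;

        -- From here on, every UPDATE and DELETE preserves the old row
        -- version, queryable later with FOR SYSTEM_TIME (see below).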
    Step-by-step Example

    I'll end with a GDPR example. I have a newsletter with subscribers and want to make sure that I always know when and what has happened to the subscriber data. I create a table in the database for the purpose and turn on versioning for the table.

        CREATE TABLE Subscriber (
          SubscriberId int(11) NOT NULL AUTO_INCREMENT,
          FirstName varchar(50) NOT NULL,
          LastName varchar(50) NOT NULL,
          Newsletter bit NOT NULL,
          PRIMARY KEY (SubscriberId)
        ) ENGINE=InnoDB WITH SYSTEM VERSIONING;

    I insert myself as a subscriber.

        INSERT INTO Subscriber (FirstName, LastName, Newsletter)
        VALUES ('Rasmus', 'Johansson', 1);

    I then try to add a column to the table.

        ALTER TABLE Subscriber ADD COLUMN Gender char(1) NULL;
        ERROR 4119 (HY000): Not allowed for system-versioned `Company`.`Subscriber`.
        Change @@system_versioning_alter_history to proceed with ALTER.

    It results in the above error, because changing a versioned table is not permitted by default. I turn on the possibility to change the table, and then the ALTER succeeds.

        SET @@system_versioning_alter_history = 1;
        ALTER TABLE Subscriber ADD COLUMN Gender char(1) NULL;
        Query OK, 1 row affected (0.17 sec)

    I also want a constraint on the new column.

        ALTER TABLE Subscriber ADD CONSTRAINT con_gender CHECK (Gender IN ('f','m'));

    Then I do a couple of updates in the table.

        UPDATE Subscriber SET Newsletter = 0 WHERE SubscriberId = 1;
        Query OK, 1 row affected (0.01 sec)
        Rows matched: 1  Changed: 1  Inserted: 1  Warnings: 0

        UPDATE Subscriber SET Gender = 'm' WHERE SubscriberId = 1;
        Query OK, 1 row affected (0.01 sec)
        Rows matched: 1  Changed: 1  Inserted: 1  Warnings: 0

    Finally, I delete the row in the table:

        DELETE FROM Subscriber WHERE SubscriberId = 1;

    If we ask for the rows in the table, including the old versions, we get the following.

        SELECT *, ROW_START, ROW_END FROM Subscriber FOR SYSTEM_TIME ALL;
        +--------------+-----------+-----------+------------+--------+----------------------------+----------------------------+
        | SubscriberId | FirstName | LastName  | Newsletter | Gender | ROW_START                  | ROW_END                    |
        +--------------+-----------+-----------+------------+--------+----------------------------+----------------------------+
        |            1 | Rasmus    | Johansson | #          | NULL   | 2018-06-08 10:57:36.982721 | 2018-06-08 11:14:07.654996 |
        |            1 | Rasmus    | Johansson |            | NULL   | 2018-06-08 11:14:07.654996 | 2018-06-08 11:15:05.971761 |
        |            1 | Rasmus    | Johansson |            | NULL   | 2018-06-08 11:15:05.971761 | 2018-06-08 11:15:28.459109 |
        |            1 | Rasmus    | Johansson |            | m      | 2018-06-08 11:15:28.459109 | 2038-01-19 03:14:07.999999 |
        +--------------+-----------+-----------+------------+--------+----------------------------+----------------------------+

    Even though I deleted the row, I get four old versions of it. All this was handled by the database. The only thing I had to do was turn on versioning for the table. What will you do with MariaDB's System-Versioned Tables? We'd love to hear from you!
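    Since every row version is timestamped, the table can also be queried as of a given moment, not only with FOR SYSTEM_TIME ALL. A minimal sketch using timestamps that fall within the output above (AS OF and BETWEEN are the documented MariaDB 10.3 forms):

        -- What did the row look like just after the first insert?
        SELECT * FROM Subscriber
        FOR SYSTEM_TIME AS OF TIMESTAMP '2018-06-08 11:00:00';

        -- Or every version that was valid within a time range:
        SELECT * FROM Subscriber
        FOR SYSTEM_TIME BETWEEN TIMESTAMP '2018-06-08 10:00:00'
                            AND TIMESTAMP '2018-06-08 12:00:00';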

  • MySQL Performance : 8.0 GA on IO-bound TPCC
    This post is mainly inspired by findings from the previous testing of MySQL 8.0 on TPCC workload(s) and observations from IO-bound Sysbench OLTP on Optane -vs- SSD, but also by several "urban myths" I often hear when discussing IO-bound OLTP performance problems with users:

    Myth #1 : "if I double the number of my storage drives -- I'll get x2 better TPS !"
    - this was mostly true during the "HDD era", and even then not always.. (ex.: a single-thread app doing single random IO reads from a single HDD will not go faster by doing the same from 2x HDD -- similar to how a single-thread workload will not run faster on 8 CPU cores -vs- 2 CPU cores, etc.) -- everything depends on your workload and how many parallel IO operations you're driving..
    - indeed, it is much easier to saturate an HDD, but much harder to do so with modern SSD/NVMe
    - NOTE : we're speaking about OLTP (e.g. if you've started to observe full table scans in your workload -- it means you're already doing something wrong ;-))
    - simple rule : if you're not saturating your storage on any of its limits => you'll not see any gain by adding more drives; you'll probably just have more storage space, that's all.

    Myth #2 : "I'll go faster with a flash drive whose specs claim x2 more IOps than my current drive !"
    - if you're expecting to run an OLTP workload, pay attention to IO latency first !
    - sometimes it is not mentioned that these x2 more IOps were obtained with x4 more IO threads ;-))
    - e.g. a drive capable of x2 more IOps but with x4 higher latency will still be x4 slower than your current drive on a 1-user load, and 2, and 4, and .. maybe up to 64 ? (it depends on where your current drive reaches its limit) -- and if you never have more than 64 concurrent users ? -- then you'll never see your x2 more IOps, but rather x4 worse TPS ;-)) (at concurrency 1, TPS is bound by latency: an IO of 0.1ms allows ~10K serialized IO/sec, while an IO of 0.4ms allows only ~2.5K, whatever the spec sheet promises in aggregate)

    Test yourself -- this is the only advice I can give you !
    - because "only a real test will give you a real answer" (and I'll keep repeating it again and again.. ;-))
    - testing your own workload will give you the best answer !
    - otherwise you may still use some generic benchmark workloads which are representative of your load
    - for ex. your new flash drive may look better on all points and pass all generic pure-IO tests very well, but show x3 worse results once used by MySQL -- and this is just because, for ex., every involved fsync() takes x3 more time, etc. (based on a real story, no kidding ;-))

    So, for the following story I'll use :
    - Sysbench-TPCC workload, 10x100W (x10 of 100 warehouses, ~100GB data, here is why)
    - the same Skylake server as before, same config, etc.
    - the same Optane & SSD as in the previous testing, except that I'll also add to the game x2 Optane drives used as a single RAID-0 (Linux MD) volume !
    - EXT4 is used on top of each drive or volume

    Test scenario :
    - concurrent users : 1, 2, 4, .. 1024
    - trx_commit : 1 (flush REDO on every COMMIT)
    - Buffer Pool (BP) : 128GB (in-memory), 32GB (IO-bound)

    Sorry, no "user friendly" graphs this time, but I hope it'll still be easy to understand the results ;-)) On the first graph you'll see :
    - 3 test results : with x2 Optane first, then with a single Optane, and then a single SSD
    - in each test the load goes from 1, 2, 4 .. up to 1024 concurrent users
    - the Commits/sec curve shows the obtained TPS level

    And the first result is with a 128GB BP : InnoDB Buffer Pool 128GB

    Comments :
    - first of all, as you can see, TPS is not better with two -vs- one Optane: since we're not hitting any limit of a single Optane drive, there is no gain from using x2 ;-))
    - also, although the data volume used is ~100GB, we're not observing as big a difference between Optane -vs- SSD as we saw with Sysbench OLTP on a similar data size while still "in-memory".. this is because the TPCC workload is much less aggressive on IO writes, so REDO flushing is less impacted..

    Now, how will TPS be impacted if the BP size is smaller, just 32GB ? Read more... (9 min remaining to read)
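    As a side note on reproducing the two Buffer Pool configurations: in MySQL 8.0 (and 5.7+) innodb_buffer_pool_size is a dynamic variable, so the in-memory -vs- IO-bound setups can be switched without a server restart. A minimal sketch; the sizes match the 128GB/32GB scenarios above, and innodb_flush_log_at_trx_commit=1 corresponds to "trx_commit : 1" in the test scenario:

        -- IO-bound run: shrink the Buffer Pool to 32GB (resize happens online).
        SET GLOBAL innodb_buffer_pool_size = 32 * 1024 * 1024 * 1024;

        -- Flush the REDO log on every COMMIT, as in the test scenario.
        SET GLOBAL innodb_flush_log_at_trx_commit = 1;

        -- Monitor the resize progress:
        SHOW STATUS LIKE 'Innodb_buffer_pool_resize_status';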

  • How to split MySQL/MariaDB datadir to multiple mount points
    If you are going to be using InnoDB tables and you plan to have innodb_file_per_table enabled, then your best option is probably the CREATE TABLE statement's DATA DIRECTORY option, which lets you place a table outside the data directory. From the MySQL documentation:

        DATA DIRECTORY, INDEX DIRECTORY

        For InnoDB, the DATA DIRECTORY='directory' option allows you to create InnoDB file-per-table tablespaces outside the MySQL data directory. Within the directory that you specify, MySQL creates a subdirectory corresponding to the database name, and within that a .ibd file for the table. The innodb_file_per_table configuration option must be enabled to use the DATA DIRECTORY option with InnoDB. The full directory path must be specified. See Section 14.7.5, "Creating File-Per-Table Tablespaces Outside the Data Directory" for more information.

        https://dev.mysql.com/doc/refman/5.7/en/create-table.html

    There's additional information about that here: https://dev.mysql.com/doc/refman/5.7/en/tablespace-placing.html

    If you expect any single table to exceed 1 TB, then you may need to partition that table, and you may need to use different DATA DIRECTORY clauses for different partitions (a sketch of this follows at the end of this post). https://dev.mysql.com/doc/refman/5.7/en/partitioning.html

    How would backup and restoration be performed on such databases when tables are created with the DATA DIRECTORY option? mysqldump backups include the DATA DIRECTORY option in the dumped DDL; however, a mysqldump backup would probably be impractical for a database that is several TB large. Percona XtraBackup does something a little more interesting: it places the tablespace file in the same directory as the rest of the backup during the actual backup step, but during the restore step it places the tablespace file back in the correct directory as specified by the DATA DIRECTORY option. See the output below for an example:

        [ec2-user@ip-172-30-0-249 ~]$ mysql -u root
        Welcome to the MariaDB monitor.  Commands end with ; or \g.
        Your MariaDB connection id is 11
        Server version: 10.1.31-MariaDB MariaDB Server

        Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

        Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

        MariaDB [(none)]> use db1;
        Reading table information for completion of table and column names
        You can turn off this feature to get a quicker startup with -A

        Database changed
        MariaDB [db1]> CREATE TABLE tab (
            -> str varchar(50)
            -> ) DATA DIRECTORY='/mariadb_data/';
        Query OK, 0 rows affected (0.01 sec)

        MariaDB [db1]> \q
        Bye
        [ec2-user@ip ~]$ sudo ls -l /mariadb_data/
        total 0
        drwxrwx--- 2 mysql mysql 20 Mar 23 15:04 db1
        [ec2-user@ip ~]$ sudo ls -l /mariadb_data/db1/
        total 192
        -rw-rw---- 1 mysql mysql 98304 Mar 23 15:04 tab.ibd
        [ec2-user@ip ~]$ cd backups/
        [ec2-user@ip backups]$ sudo innobackupex /home/ec2-user/backups/
        180323 15:06:05 innobackupex: Starting the backup operation

        IMPORTANT: Please check that the backup run completes successfully.
                   At the end of a successful backup run innobackupex
                   prints "completed OK!".
        ...
        180323 15:06:14 completed OK!
        [ec2-user@ip backups]$ sudo ls -l
        total 4
        drwx------ 8 root root 4096 Mar 23 15:06 2018-03-23_15-06-05
        [ec2-user@ip backups]$ sudo ls -l 2018-03-23_15-06-05/db1/ | grep "tab\..*"
        -rw-r----- 1 root root   481 Mar 23 15:06 tab.frm
        -rw-r----- 1 root root 98304 Mar 23 15:06 tab.ibd
        -rw-r----- 1 root root    25 Mar 23 15:06 tab.isl
        [ec2-user@ip backups]$ sudo systemctl stop mariadb
        [ec2-user@ip backups]$ sudo innobackupex --apply-log 2018-03-23_15-06-05
        180323 15:14:17 innobackupex: Starting the apply-log operation

        IMPORTANT: Please check that the apply-log run completes successfully.
                   At the end of a successful apply-log run innobackupex
                   prints "completed OK!".
        ...
        180323 15:14:21 completed OK!
        [ec2-user@ip backups]$ sudo rm /mariadb_data/db1/tab.ibd
        [ec2-user@ip backups]$ sudo rm -fr /var/lib/mysql/*
        [ec2-user@ip backups]$ sudo innobackupex --copy-back /home/ec2-user/backups/2018-03-23_15-06-05/
        180323 15:16:17 innobackupex: Starting the copy-back operation

        IMPORTANT: Please check that the copy-back run completes successfully.
                   At the end of a successful copy-back run innobackupex
                   prints "completed OK!".
        ...
        180323 15:16:25 completed OK!
        [ec2-user@ip backups]$ sudo ls -l /mariadb_data/db1/
        total 96
        -rw-r----- 1 root root 98304 Mar 23 15:16 tab.ibd

    MariaDB Backup is based on Percona XtraBackup, so you will see almost the same behavior when using it.
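    As referenced above, here is a minimal sketch of spreading one large table across multiple mount points by giving each partition its own DATA DIRECTORY. The table definition and the paths /data1 and /data2 are hypothetical; per-partition DATA DIRECTORY is supported for InnoDB when innodb_file_per_table is enabled.

        -- Hypothetical large table split across two mount points.
        CREATE TABLE big_log (
          id BIGINT NOT NULL,
          created DATE NOT NULL,
          payload VARCHAR(255)
        )
        PARTITION BY RANGE (YEAR(created)) (
          PARTITION p2017 VALUES LESS THAN (2018) DATA DIRECTORY = '/data1',
          PARTITION p2018 VALUES LESS THAN (2019) DATA DIRECTORY = '/data2'
        );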

  • Log Buffer #548: A Carnival of the Vanities for DBAs
    This Log Buffer Edition covers blog posts from Cloud, Oracle, and MySQL.

    Cloud:
    - Google Stackdriver lets you track your cloud-powered applications with monitoring, logging and diagnostics. Using Stackdriver to monitor Google Cloud Platform (GCP) or Amazon Web Services (AWS) projects has many advantages -- you can get detailed performance data and can set up tailored alerts.
    - This post is courtesy of Sam Dengler, AWS Solutions Architect. Message brokers can be used to solve a number of needs in enterprise architectures, including managing workload queues and broadcasting messages to a number of subscribers.
    - New Cloud Filestore service brings GCP users high-performance file storage.
    - One of the biggest trends in application development today is the use of APIs to power the backend technologies supporting a product.
    - It's no secret that data is an essential part of running a business, no matter how large or small a business may be. Many companies host their business data using relational databases.

    Oracle:
    - How can I print to PDF? How can I get a document/report with my data? How can I export my data from APEX to Excel?
    - Almost a year ago, Oracle released Oracle GoldenGate 12c (12.3.0.1.x). At that time, two architectures were released: Microservices and Classic. Both architectures provided the same enterprise-level replication. The only difference was that one offered a RESTful API interface with an HTML5 page while the other was still command-line driven.
    - Ubuntu 16.04: Installation of Chrome browser fails with libnss3 (>= 2:3.22) [2]
    - UTL_FILE_DIR and 18c
    - DevOps in OAC: Scripting Oracle Cloud Instance Management with PSM Cli

    MySQL:
    - One problem that's a lot less common these days is swapping. Most of the issues that cause swapping with MySQL have been nailed down to several key configuration points, either in the OS or in MySQL, or issues like the swap insanity issue.
    - MariaDB 10.3 is now generally available (10.3.7 was released GA on 2018-05-25). The article What's New in MariaDB Server 10.3 by the MariaDB Corporation lists three key improvements in 10.3: temporal data processing, Oracle compatibility features, and purpose-built storage engines.
    - MySQL 8.0 InnoDB Cluster on ARM64 with Oracle Linux and the Raspberry Pi 3B.
    - Detecting Incompatible Use of Spatial Functions before Upgrading to MySQL 8.0
    - JFG posted on the Percona Community Blog -- A Nice Feature in MariaDB 10.3: no InnoDB Buffer Pool in Core Dumps

  • Audit Log’s JSON format logging
    Blood, sweat, tears -- and JSON format logging is finally supported by the Audit Log plugin. This comes paired with a feature that allows reading log events from within the server, which can be useful for rapid analysis of the audit log trail without needing to access the files directly.…
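    As a minimal sketch of what the reading feature enables (assuming this refers to MySQL's audit_log plugin, where JSON output is selected with audit_log_format=JSON at server startup and events are read back through the audit_log_read() / audit_log_read_bookmark() functions; check your version's documentation for the exact setup steps):

        -- Assumes the audit_log plugin is installed and running with
        -- audit_log_format=JSON (set in the server configuration).

        -- Fetch a bookmark pointing at the most recently written event ...
        SELECT audit_log_read_bookmark();

        -- ... and read events from that position via SQL, without
        -- parsing the log file on disk.
        SELECT audit_log_read(audit_log_read_bookmark());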