There have recently been a number of posts about MyISAM; for example, Arjen wrote a pretty nice article about MyISAM features, so I thought I would share my own view on using MyISAM in production.

For me it is not only about table locks. Table locks are only one of the MyISAM limitations you need to consider before using it in production.
Especially if you're coming from "traditional" databases, you're likely to be shocked by MyISAM's behavior (and MySQL's default behavior as a result): tables can be corrupted by an improper shutdown, statements can fail partway through execution when certain errors are encountered, etc.

You should not think of these as bugs, however. Like many MySQL features, MyISAM is designed for particular workloads where it shines, and it may not be a good choice for others.

In 1999, for my production application (storing billions of rows in tens of thousands of tables), Innodb was the better choice, mainly because of those other behaviors. Table locks were a problem at a very early stage, but that was solved by using this significant number of tables.

I still have the same view on storage engines: Innodb, in my opinion, is the better choice for a general-purpose storage engine. It better matches what you would expect from a database server and saves you from a lot of gotchas, which may be more important than performance for a small application. As load increases, you might convert certain tables to MyISAM or other storage engines for performance reasons, of course keeping all the limitations in mind.

So here is my list of items you need to take into account while using MyISAM tables.

Recovery. MySQL was running stably for us, giving us a false sense of security, but when it crashed (or was it a power failure?) it took many hours to check and recover our tables. If this happens to you, make sure you have a decent myisam_sort_buffer_size and a large myisam_max_sort_file_size; otherwise recovery may be done through the key cache rather than by sort, which can take even longer.
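As a rough sketch of the settings involved (the values are purely illustrative, so tune them to your hardware, and mydb.mytable is a placeholder):

```sql
-- Allow repair-by-sort for large tables; repair via the key cache is much slower.
SET GLOBAL myisam_sort_buffer_size = 256 * 1024 * 1024;          -- illustrative: 256MB
SET GLOBAL myisam_max_sort_file_size = 50 * 1024 * 1024 * 1024;  -- illustrative: 50GB

-- Then repair a crashed table manually:
REPAIR TABLE mydb.mytable;
```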

Be careful with myisam_recover. This is a great option for automating recovery, but it can give you a couple of nasty surprises. First, if many tables are being repaired at once, each may allocate myisam_sort_buffer_size, and MySQL could crash or start swapping. Second, a table is locked while its repair is running. If it is a frequently used table, all your connections may end up busy waiting for it to become available, effectively bringing the server down. A much better solution for us was to move all databases out of the MySQL data directory to another location, check them with myisamchk (prioritizing the more important databases first), and then move them back into the MySQL data directory. This way, accesses to unchecked tables fail with a "table does not exist" error rather than waiting forever.
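A rough sketch of that procedure (paths, database names, and init scripts are hypothetical; this assumes mysqld is stopped, or the tables flushed and locked, before any files are moved):

```shell
# Stop MySQL so no table files are open
/etc/init.d/mysql stop

# Move the databases out of the data directory; once the server is back up,
# queries against these tables fail fast with "table does not exist"
mkdir -p /data/check
mv /var/lib/mysql/important_db /var/lib/mysql/other_db /data/check/

# Restart MySQL so the rest of the server is usable again
/etc/init.d/mysql start

# Check (repairing on error, thanks to --force) the important database first ...
myisamchk --check --force --sort_buffer_size=256M /data/check/important_db/*.MYI

# ... then move it back so it becomes visible to the server again
mv /data/check/important_db /var/lib/mysql/
```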

Hidden corruptions. Whether caused by bad memory, the OS, or MySQL bugs, corruption can happen and go unnoticed for a long time with the MyISAM storage engine. This hidden corruption may later cause crashes, wrong query results, and further data corruption.

Partial updates. That MyISAM does not have transactions is well understood, but it also does not have atomic statement execution: if you run an UPDATE that touches 1,000,000 rows, after a crash only 500,000 of those rows may end up updated, and you might not know which ones.

Concurrency. MyISAM uses table locks and supports concurrent inserts, which can run alongside selects. This is sometimes presented as great concurrency for inserts, but in reality it means only one INSERT is allowed to happen at a time. It can run concurrently with SELECT statements, but it has to be one insert at a time. Happily, inserts in MyISAM are rather fast, so this is rarely the problem. The other misconception about table locks is that if you have 95% reads, table locks are not going to be a problem. That may be true if you only have short reads and writes, but you're in great trouble if you need to run bulk operations. For example, running a full-table-scan query to compute some stats is often impossible in production, as it would block all updates to the table. It is even worse with bulk update queries. You do not need a lot of such queries to get into trouble; just one is enough.

Lock priorities. By default, MySQL treats updates as higher-priority operations. You can use SELECT HIGH_PRIORITY or UPDATE LOW_PRIORITY to adjust that, or you can simply set the low_priority_updates option. Either way, the default behavior means any UPDATE statement blocked by a long-running select will also block further selects on that table: they have to wait for the UPDATE, which is in turn waiting for the SELECT to complete. This is often not accounted for, and people think, "OK, I'll write my script so it does short updates, so it will not block anything." It can still cause a total block if long selects are running.
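As a quick sketch (the stats table and its columns are hypothetical; the option and modifiers are standard MySQL features):

```sql
-- Make all updates lower priority than reads, server-wide:
SET GLOBAL low_priority_updates = 1;

-- Or adjust priority per statement:
UPDATE LOW_PRIORITY stats SET hits = hits + 1 WHERE page_id = 42;
SELECT HIGH_PRIORITY hits FROM stats WHERE page_id = 42;
```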

Fragmentation. This is a common problem for pretty much all storage engines, though it takes different forms. MyISAM has in-row fragmentation, which means a single row may be stored in many pieces. In extreme cases I've seen over 10 pieces per row on average; this happens when rows contain data that constantly grows in size. It does not happen for all applications, and there are ways to fight it, but still, watch out.
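One way to spot and fix it, as a sketch (mytable is a placeholder; note that OPTIMIZE TABLE locks the table while it rewrites it):

```sql
-- Online, SHOW TABLE STATUS gives a rough picture: a large Data_free and
-- growing dynamic-row overhead hint at fragmentation. (Offline, myisamchk -e
-- reports detailed per-table fragmentation statistics.)
SHOW TABLE STATUS LIKE 'mytable';

-- Rewrite the table to defragment rows and sort the index:
OPTIMIZE TABLE mytable;
```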

Lack of a row cache. MyISAM tables only have their indexes cached, in the key_buffer; data is cached by the OS. This is a performance issue in itself, since a system call is needed to get data from the operating system even when it is in cache, but that is only part of the problem. The other problem is that it is hard to manage resources, especially if other processes run on the same server. A backup process may go ahead and wipe out the OS cache you were counting on, causing unexpected slowdowns.

delay_key_write. Be careful with this option, whichever way you enable it: globally, per table, or for a set of statements (by using LOCK TABLES/UNLOCK TABLES). It can improve performance dramatically in certain cases by avoiding the flushing of dirty index blocks from the key_buffer to disk, but it also brings great danger if the MySQL server crashes or power goes down. When a crash happens without this option enabled, table corruption will in most cases be mild, especially on lightly loaded servers; many users do not even know you need to check MyISAM tables after a crash and have been running for years, surviving many crashes. If you enable delay_key_write, that changes drastically. Now, in case of a crash, your index and data will likely be badly out of sync, and if you do not repair the table you will very likely observe serious corruption: queries failing to run, wrong result sets, or crashes.
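For example, to enable it per table and to check afterwards (the hits table is hypothetical; DELAY_KEY_WRITE is a standard MyISAM table option):

```sql
-- Delay key writes for one table: dirty index blocks stay in the key_buffer
-- until the table is closed, instead of being flushed on every write.
CREATE TABLE hits (
  id  INT NOT NULL PRIMARY KEY,
  cnt INT NOT NULL
) ENGINE=MyISAM DELAY_KEY_WRITE=1;

-- After any unclean shutdown you MUST check (and usually repair) such tables:
CHECK TABLE hits;
REPAIR TABLE hits;
```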

Here is my list of what I think MyISAM tables are good for. The list is not exhaustive, and every system has its own risk tolerance and performance requirements, not to mention different load:

Log tables. Nowadays the Archive storage engine can be even better.

Read-only data, especially packed with myisampack. It can't be updated, hence can't be corrupted by writes, and as you've seen, that is where our problems reside.
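A sketch of packing such a table (paths are illustrative; run this with the server stopped, or the table flushed and locked, since myisampack works on the files directly):

```shell
# Compress the read-only table; rows become read-only and much smaller on disk
myisampack /var/lib/mysql/archive_db/old_logs.MYI

# Rebuild the indexes after packing (quick recover) so the table is usable again
myisamchk -rq /var/lib/mysql/archive_db/old_logs.MYI
```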

Cache/session tables (you can throw them away if the server crashes). You can use multiple cache tables to avoid concurrency issues.

Temporary tables used in batch processing and similar tasks.

Exported data. If you export data from an OLTP system, for example to perform data analysis, MyISAM is a great fit. Even if you hit a crash during export, you can normally start over, and most of the time the data is read-only anyway.

Tables containing data that is quick to regenerate.

Generally tables which you can throw away without any trouble.

Tables whose data does not need to be 100% correct and always available, for example statistical data.

Data on slave servers. If a slave crashes you can simply re-clone it. Make sure, however, to have at least one proper master to fall back to, and be wary of replicating between different storage engines.

In short: use MyISAM when the performance improvements are worth the risk.

38 Comments
P. Kuiper

Hi Peter,

“In 1999 for my production application (storing billions of rows in tens of thousands tables) Innodb was better choice mainly because of thouse other behaviors … well table locks was the problem at very early stage but it was solved by using this significant number of tables.”

Were all those tens of thousands tables stored on the same server? Did this have any impact on performance? Just wondering since there isn’t much info about these very very large data sets (which I happen to be interested in)

Regards,

Peter

Roland Bouman

Hi Peter,

thanks for the insights. I was wondering about the row fragmentation. You mention you’ve seen an average of 10 fragments per row at one time.

How can I find out how much fragmentation there is on the average?

Thanks in advance,

Roland.

Kevin Burton

Some thoughts:

Just use delay_key_write on slave servers, and only ever do INSERT/UPDATE on your master. If you lose a slave, just take it out of production, as it probably won't work anyway.

You can increase concurrency by keeping your INSERT/UPDATE statements small so they execute quickly and don't hold table locks open.

The row cache issue is a bad one. I’d like to see some way to compute filesystem cache efficiency as a stat I can log. I haven’t found a way yet.

ananth

Hi peter,

I came across your performance blog while searching through the performance forum on the MySQL site.
It is a great blog with a lot of good information.

We are having a weird performance problem in production where we are only using MyISAM tables.
Here is some background on our environment:

H/W: Dell 2.4GH 2CPU servers with 3 internal disks using h/w RAID 5
S/W: RedHat linux 2.4
Application: Java jetty server with Hibernate
MySQL version = 4.1.19

We have been running application in production for almost a year with no problems
Our biggest transaction tables are only 90MM rows with 10-10GB in disk space
We have around 30 tables which are transactions related but some TEMP tables which are transient in nature for 3 weeks

Here are symptoms:

We found that MySQL server freezes for 30-60 seconds and during this time
application DB thread pool grows in size doing nothing resulting in large number of db connections
Once MySQL unfreezes, all of threads complete and everything is fine again

During these freeze times, even simple ‘SELECT 1’ from our DB monitoring takes 30-40 seconds

we captured some system stats during this time and noticed that there is one mysql thread taking up all CPU 25% (as we are running hyper-threading)
Otherwise, system is idle, low load and no IO
We captured SHOW PROCESSLIST and don't see anything happening, as most of the processes are in SLEEP mode

We suspect, based on a thread on the MyISAM engine forum, that it may be some sort of "garbage collector" problem while flushing the KEY-CACHE

we have a 1GB key cache on the server
we reduced it to 750MB to see if it makes any difference, but it is still happening

We are not able to recreate the problem in our integration environment, where we don't have the transaction load.
We do have full data in integration, though

Can you shed some light on this and comment on what might be happening?

I’d love to provide more information if you need.
I also posted this on forum and here is the link
http://forums.mysql.com/read.php?21,85701,96186#msg-96186

Thanks in advance

ananth

James Day

ananth,

Please do as Ingo suggests. Ingo is the programmer who handles most MyISAM storage engine problems.

P. Kuiper

[offtopic]
It's nice to see a lot of the people from MySQL support here! I recommend that everybody using MySQL in big (or even smaller) production setups get a support contract. It's very, very good! It's faster and more satisfying than posting your questions in forums, hoping to get an answer. I do recommend the Platinum support; we have it too, and it's nice to have experienced people to fall back on in case you run into weird problems like ananth described above.
[/offtopic]

Keep up the good work guys!

Peter

ananth

We don't have a support contract for MySQL, and we are looking at buying one to suit our environment.
Meanwhile, is it possible to buy consulting from MySQL gurus without having a support contract, to look at these performance problems we have been having with MySQL in production?

ananth

ananth

Thanks Peter for this information
We are running on 4.1.19 and latest 4.1 is 4.1.20
We’ll try to get 4.1.20 and see if problem goes away.

Regarding using the oprofile tool to profile the thread, do we need a debug version for that or can we run the regular version?

Thanks again

ananth

ananth

Hi Peter,

We were able to recreate the problem in our integration environment by stressing the application and mysql.
Here is the stack trace

0x808d903 handle_segfault + 423
0x82e8d98 pthread_sighandler + 184
0x81234db insert_into_free_memory_sorted_list__11Query_cacheP17Query_cache_blockPP17Query_cache_block + 95
0x81233b1 insert_into_free_memory_list__11Query_cacheP17Query_cache_block + 53
0x81231fd free_memory_block__11Query_cacheP17Query_cache_block + 93
0x812279f free_query__11Query_cacheP17Query_cache_block + 219
0x81226a4 free_old_query__11Query_cache + 84
0x8123025 allocate_block__11Query_cacheUlcUlc + 137
0x81227f1 write_block_data__11Query_cacheUlPcUlQ217Query_cache_block10block_typeUic + 69
0x8121632 store_query__11Query_cacheP3THDP13st_table_list + 454
0x809da46 mysql_execute_command__FP3THD + 1258
0x80a223f mysql_parse__FP3THDPcUi + 211
0x809c6ef dispatch_command__F19enum_server_commandP3THDPcUi + 1547
0x809c0d8 do_command__FP3THD + 188
0x809b7a7 handle_one_connection + 615
0x82e654c pthread_start_thread + 220
0x830fe0a thread_start + 4

We looked at the source code, and the problem is related to the query cache.
We have allocated 500MB of query cache, and it looks like a lot of thrashing of the query cache is going on

Here are our query cache variables and status
+------------------------------+-----------+
| Variable_name                | Value     |
+------------------------------+-----------+
| have_query_cache             | YES       |
| query_cache_limit            | 1048576   |
| query_cache_min_res_unit     | 4096      |
| query_cache_size             | 524288000 |
| query_cache_type             | ON        |
| query_cache_wlock_invalidate | OFF       |
+------------------------------+-----------+

+-------------------------+-----------+
| Variable_name           | Value     |
+-------------------------+-----------+
| Qcache_free_blocks      | 5303      |
| Qcache_free_memory      | 486708416 |
| Qcache_hits             | 2845741   |
| Qcache_inserts          | 2623234   |
| Qcache_lowmem_prunes    | 0         |
| Qcache_not_cached       | 2638244   |
| Qcache_queries_in_cache | 26168     |
| Qcache_total_blocks     | 57725     |
+-------------------------+-----------+

As a workaround, we experimented with setting query_cache_size = 0 on one server, and the problem went away, but we saw server load go up as there was a lot of IO (as expected)
I was going through the documentation about tuning the query cache, and they suggest lowering
query_cache_min_res_unit (currently 4096)
if result sets are smaller in size.
We have not experimented with that yet

I am not sure whether to report this as a BUG, since we have not tuned any query cache settings other than allocating a lot of query cache, under the impression that more is better; we have learned that is not the case under heavy load.

Any suggestions?

Thanks
ananth

Alex

We ran into exactly the same query cache problem recently, i.e. all queries hang (even simple ones) for no reason. According to MySQL, it’s caused by a design problem with the query cache when invalidating records (Peter; you can have a look at MySQL network ticket #9498 for details).

As you did, we fixed it by disabling the query cache entirely. You could try re-enabling it with a smaller size (128MB or so) to see if the problem comes back.

James Day

The query cache uses a sequential search through a list of free blocks when a single query is being removed from the cache. Regularly consolidating the free space with FLUSH QUERY CACHE may or may not help; I haven't benchmarked it.

See bugs:

http://bugs.mysql.com/bug.php?id=21074
http://bugs.mysql.com/bug.php?id=21051

Because of the time it takes to remove queries, it can be faster to have a small cache, and anyone going above 100M should be alert to the possibility that it could become a performance penalty rather than an improvement. It's actually worth being alert to that at any size, since the cache always adds a little overhead; the overhead just grows with the size.

James Day
Support Engineer, MySQL AB

James McKenzie

Peter,

I was wondering if you have any experience running MyISAM in a ridiculously HUGE environment. Right now I am working with a team of DBA’s and developers to launch a MYSQL implementation using a Sun E25K (72 X 1.2GHZ dual core procs with 200GB RAM). We are stuck with very poor insert rates and I was hoping you would have some suggestions. The most we get out of a single mysqld process running on the server is around 700-900 rec/sec and we are trying to find a way to increase the rec/sec and decrease CPU utilization.

Any ideas?

Thanks in advance,
James
System Admin way over his head with all this database stuff

inanc

hello,

in our production environment several nodes insert into two tables, which are then transferred to the slave. we are currently using innodb and mostly getting lock wait timeout and deadlock problems. by switching to myisam no locks occur on the tables, and the selects which summarize these write-intensive tables were freed up.

i think myisam is the way to go for write-intensive, summarizing-read-on-slave apps.

inanc

peter,

I agree with you on some points. But with innodb we use repeatable-read concurrency with "update, and if no rows were updated then insert" kinds of transactions. At the same time, a cluster of nodes is actively writing to the same tables, which are read by other cluster members. Whatever technique we tried, nothing was as successful as MyISAM, even though it does its job not in parallel (but inserts are fast and/or seem parallel).

I am going to watch mysql webinars about scale-out write intensive applications maybe that can point a path.

howa

Hi,

Some questions:

1. Does InnoDB still suffer from poor performance when running on an SMP server?

2. With the built-in default settings, MyISAM seems to perform better in most situations.

3. If InnoDB corrupts, is it 0% recoverable?

howa

Hi Peter,

Thanks for the reply.

2. Yes, but for example, MyISAM's default insert speed is great, while InnoDB's default insert speed is poor. (I have seen your InnoDB optimization presentation and done the experiments.)

3. This is the main point: we are just afraid we will have a total loss of data, since we can't even repair the table.

Santiago

Hi Peter!

I’m from Argentina, sorry for my english.

Can you give me any hints on converting all my MyISAM tables to InnoDB? The database is currently used in production; we have roughly 100 tables and 100,000 rows. The idea is to avoid site downtime.

Thanks a lot for your book and your site, it’s a great help!

Santiago

Peter, thanks very much. I did this:
(Original Table = transferido. This is the MyISAM table).

-- START SCRIPT
USE mydatabase;
CREATE TABLE transferido_inno (

-- SOME FIELDS

PRIMARY KEY (id)
) ENGINE=INNODB DEFAULT CHARSET=latin1;

SET UNIQUE_CHECKS=0;
START TRANSACTION;

-- SOME MULTIPLE INSERTS KIND OF
INSERT INTO transferido_inno(
SELECT * FROM transferido LIMIT 0,100000
);
INSERT INTO transferido_inno(
SELECT * FROM transferido LIMIT 100000,100000
);
-- ETC.

COMMIT;
SET UNIQUE_CHECKS=1;

ALTER TABLE transferido RENAME TO transferido_myisam;

ALTER TABLE transferido_inno RENAME TO transferido;
OPTIMIZE TABLE transferido;

-- END SCRIPT

It worked really good.

I started optimizing this database a month ago (my first real database challenge). This blog and your book (along with Pro MySQL) helped me very much. The site is now running reaaaaally good. The server load decreased from 90% to 20% with some tuning, new indexes, and this new altered table (we needed MVCC desperately).

Thanks a lot!

Adam

Hi Peter,

I’m a web app developer and have been using MySql for a few years now, but I’m not a “database person”, so do not fully understand all the different types of tables available and when to use one in preference over the other. Once upon a time I decided that innodb would be best, instead of the usually-default MyISAM. That has worked well in general, but sometimes I do wonder if I could be using a better engine in some cases, especially because I commonly hear that “MyISAM is quicker”.

Therefore, I realise it’s an old article now, but I wanted to say thanks for the original article that gives some insight into when MyISAM *could* be a better option.

All the best.
Adam

Adrian

Hi Pieter,

Thank you guys for a great book which is worth every cent and a great site.
We've been puzzling over concurrent inserts for a while, using v5.0.15, and we've discovered that concurrent inserts don't seem to be allowed with a spatial key defined on our spatial column ("waiting for write 'concurrent insert lock'" is always shown in mysqladmin debug while a select is running). Removing just the spatial key frees up concurrent inserts to work as they should by default (concurrent_insert=1)

Although I may have missed something I have looked everywhere for a reference to this issue in the docs and web. Could this be a bug or is there perhaps a good reason why MySQL does this?

Thanks
Adrian

Hans-Henrik Stærfeldt

Good insights. However, I do prefer my statistical data to be correct 🙂

Michael

“3) It is recoverable though there is no REPAIR TABLE to make it trivial.”

I’ve been burned once by people saying such things ..

a hard drive filled up and crashed the database and took down the WHOLE mysql instance (not just a couple of corrupt tables), and I could not find a way to quickly get the site up and running .. it was down for a FEW DAYS
I had backups: with myisam I could have just dropped the tables and rebuilt them, and it could have been up and running within a few hours (and stuff not using the corrupt tables would have still worked anyway)

I searched and searched and could not find a way to fix it .. the only way I could get mysql to even start was to disable innodb
so ever since I have just stuck to myisam …
where avoiding the risk of extended downtime is important, innodb is dangerous!

but

that still leaves me with the old problem of tables being locked much of the time on a busy website ,,,

Yusuf

Hi Peter,

I'm trying to bulk insert 1000 rows per thread (with 17 threads running) into MySQL (MyISAM engine). The insert process is slowing down my workflow; it has become a bottleneck.
Can you help me increase the insertion speed?

Thanks and Regards,
Yusuf

GREG HILL

Do you have any experience with the Vicidial web-based auto-dialer system?
This system is very heavy on MySQL, and the software is written to use the MyISAM engine. Changing that is not an option; others have tried and were not successful.
Given that it has to be MyISAM, that the reads and writes are "extreme", and that queries are not benefiting from multiple cores, we are encountering issues.
I am about to rewrite a particular script. However, I am not 100% qualified to know whether the end result will be successful.
My plan is to split a particular couple of queries into multiple smaller queries.
Example:
a SELECT COUNT(*) with many IN () parts and a few others in the WHERE clause that are locking up the database during the crunch.

My hope is that if I make 20 unique connections, these queries can be spread across other CPU cores.
What do you think?