Google and MySQL

转载 2007年10月04日 22:58:00
原贴:http://networld.tianyablog.com/blogger/post_show.asp?BlogID=40003&PostID=9918028&idWriter=0&Key=0

下一篇>>
Google and MySQL
作者:弥勒菩萨 提交日期:2007-6-9 15:52:00 正常 | 分类: | 访问量:122

  http://mysqldatabaseadministration.blogspot.com/2007/04/google-and-mysql.html
  
  Chip Turner and Mark Callaghan are presenting the session "MySQL: The Real Grid Database"
  
   Data is sharded vertically and they have a lots of replicas. Resharding is a bigger pain than sharding. Make really smart software and manage with least human resources as possible. They are going to talk about problems that matter to them.
  
  The Grid database approach: deploy a large number of small servers.
  Use highly redundant commodity components.
  Added capacity has a low incremental cost.
  Not much capacity lost when a server fails.
  Which allows them to support many servers with a few DBAs.
  
   I asked Chip earlier and he told me that they don't use any memcache at all. Their spindles are flying more crazily than their network bandwidth.
  
  What is manageability?
  -Make it easy to do the tasks that must be done
  Reduce the number of tasks that must be done
  Make all tasks scriptable
  
  Why it matters?
  You want to support hundreds of database servers with one DBA. This gives them time to solve more interesting problems.
  
  They used to compile db on db server by hand, then they learned.
  
  Underutilize your servers!
  3 servers at 50% are better than 2 servers at 75%.
  require less maintenance
  requires less tuning
  tolerate load spikes better and tolerate bad query plans better.
  
  In perfect world:
  Short running queries and transactions on master databases
  Users kill mistake and runaway queries.
  Accounts never use too many connections.
  Query plans are good
  New apps increase the db workload by a small amount
  only things you need to store are stored.
  
  In real world:
  Long running transactions
  servers with round robin DNS aliases make queries hard to find.
  Apps create more connections when the db is slow
  Some storage engines use sampling to get query plan statistics
  New applications create new db performance problems
  Apps use the db as a log and rows are never deleted
  Many long running queries on replicas.
  
  Solutions:
  improve your ability to respond because prevention is impossible
  Need tools to make monitoring easy
  - determine what is happening across servers and what happened across servers.
  
  Mantra:
  Monitor everything you can (vmstat, iostat error logs).
  - need these to reconstruct failures
  
  Save as much as possible. disk spindles are expensive whereas disk sectors are cheap
  
  Script as much as possible: You can automate more than you thought possible and you are likely to reuse these scripts.
  
  You don't need to store vmstat in the same database. Store it on a separate database or store in text files.
  
  Monitoring matters:
  You need to know what is happening, what table, which user is doing?
  Many fast queries can be as much of a problem as one slow query.
  
   Record what happened: Archive SHOW STATUS counters somewhere. Query data from the archive and visualize data from the archive. Record queries that have been run (a sample is sufficient)
  - Archive SHOW PROCESSLIST output somewhere. They take it every 30 seconds, compress it and save it. Then they know what was running. It is very useful.
  -Query data from the archive and visualize data from the archive.
  
  Record queries that have been run.
  
  Monitoring tools: display counters and rate change for counters.
   Aggregate values over many servers. Visualize and rank results and they display results over time. Use innotop and mytop. Google has release mypgrep. It goes across multiple servers.
  
  Enhance monitoring: THey changed MySQL to count activity per account, table and index.
  Results are exported via new SQL statements
  
  SHOW USER STATISTICS:
  displays for each account:
  -seconds executing commands
  -number of rows fetched and changed
  -total and current connections
  -number of select statements
  -number of row change statements (I/U/D/R)
  Number of other commands
  Number of commits
  Number of rollbacks
  Number of binlog bytes written
  
  SHOW TABLE_STATISTICS: displays for each table :
  number of rows fetched and changed
  
  SHOW INDEX STATISTICS:
  displays the number of rows fetched per index
  Helps find indexes that are never used. The more compact InnoDB table, the better.
  
  MySQL High availability:
  There are many great options and an even brighter future:
  -MySQL cluster
  -MySQL replication
  -middleware
  -DRBD
  
  They need some features right now
  
   They are commited to InnoDB and MySQL replication. They have some constraints that they want all features right now. InnoDB works from them and they are hesitant to move to anything else. Same for replication: it works for them.
  
  They favor commodity hardware. MySQL Cluster tends to like solutions that guarantee latencies between nodes. They have to share network with lots of other applications and people running batch jobs can really saturate the network.
  
  Desired HA functionality:
  zero transaction loss on failure of master
  minimal downtime on failures of a master
  reasonable cost in performance and dollars
  fast and automatic failover to local or remote server.
  
   They would be happy with achieveing only downtime of a minute or so vs. trying to keep it down to seconds as measured by the MySQL cluster.
  
  No changes to their programming model:
  -does it support MVCC
   - does it support long running transactions. UPDATES? Populate tmp table with queries and then use it to populate other tables etc. If something runs for more than 10 minutes they would like to stop it. Their application programmers love to do real-time analytics. More than a minute delay in replication will cause people to go to the master. They have queries that run for hours on replicas. Five tables at most in JOINs.
  
  Failure happens everywhere
  OS-kernel OOM or panic
  mysqld failure caused by code they added (they do make mistakes)
  Disk: misdirected write, corrupt write (they love InnoDB checksums) Disk failure rate: 5% per year
  File system: inconsistent after unplanned hardware reboot
  server: bad RAM
  LAN, switch: lose connectivity
  Rack: reboot. People accidentally reboot racks.
  Data center: power loss, overheating, lightning, fire
  People: things get killed or rebooted by mistake
  -replication and reporting are concurrent on a slave. They have very little control over queries that are run.
  
  Paul and Chip haven't seen servers. Most people haven't. A typo can cause an operation to go on a wrong server.
  
   They try to work around BBRBC (battery backed) and RAID etc and try to work with the software work-arounds (cause they love commodity hw).
  
  HA features they want:
  -Synchronous replication as an option
  - a product that watches a master and initiates a failover
  -archives of the master's binlog stored elsewhere
  -state stored in the filesystem to be consistent after a crash
  --InnoDB and MySQL dictionaries can get out of syn
  --replication can get out of sync
  
  They couldn't wait: so they added features to 4.0.26 (they just had to make sure it would run on linux)
  -transactional replication for slaves
  -semi-synchronous replication
  -mirrored binlogs
  -fast and automatic failover
  
  transactional replication
  -replication state on a slave is stored in files
  -slave SQL thread commits to storage engine and then updates a file
  -a crash between the two can make replication state inconsistent
  -transactional replication
  -- stores replication state in teh InnoDB transaction log
  
  Semi-synchronous replication
  -HT has spoken and started building it. Not enought community demand
   -blocks return from commit on master unit at least one slave has acknowledged recipt of all replication events for the transaction.
  -Slave IO thread acknowledges receipt after buffering the changes
  -modified MySQL replication protocol to support acknowledgements
  Configurable options for
  -- whether the master uses it
  -- whether a slave uses it
  -- How long the master waits for an ack.
  
   Can run a server with some semi-sync replication slaves and some regular replication slaves. This can work with any storage engine that supports COMMIT, but they only support InnoDB. You can run master with some semi-synch slaves. As long as transaction is guaranteed to be on atleast one another slave when they return to user. Block commit, until replication runs on another server. They had a rack switch rebooted by mistake and they lost about 15K transactions. That's why this is very important. Even in case the master catches fire, nothing committed to the database is missing.
  
  

#日志日期:2007-6-9 星期六(Saturday) 晴 送小红花 推荐指数:0
 

相关文章推荐

Google Merchant How To Setup A Live XML Data Feed With MySql and PHP

Setting up a live data feed for Google Merchant (Google Shopping) can be tricky if you don’t know ho...

How To Add Google Apps and ARM Support to Genymotion v2.0+

How To Add Google Apps and ARM Support to Genymotion v2.0+ Original Source: [GUIDE] Genymotion | ...

Problem B. Sherlock and Watson Gym Secrets Google APAC 2017 University Test Round B

这一题比赛时是懵逼的,直接暴力+快速幂过了小数据。 后来得知,因为mod的性质,(i^A)%K=(i%K)^A,所以只要枚举[0,K-1]的值就可以了。[K,N]范围内的i,mod之后结果和[0,K...
  • yixin94
  • yixin94
  • 2017年02月02日 20:50
  • 138

Google protocol buffer 使用和原理浅析 And 进阶使用方式之PbCodec

Google Protocol Buffer又简称Protobuf,它是一种很高效的结构化数据存储格式,一般用于结构化数据的串行化,简单说就是我们常说的数据序列化。这种序列化的协议非常轻便高效,而且是...

Google中国2014校园招聘笔试Round B China New Grad Test Problem B. Meet and party

Problem Little Sin lives in a Manhattan-grid city, a 2D plane where people can only go north, west, ...

Google API 翻译:Processes and Threads

启动应用程序组件时,如果该程序没有其他组件在运行,Android系统就会为该程序启动一个新的Linux进程,该进程只有一条执行线程。同一程序的所有组件都默认在这条线程(称为”主线程”)中运行。当启动一...

Scalability lessons from Google, YouTube, Twitter, Amazon, eBay, Facebook and Instagram

I've gathered together in one place a few lessons in scalability from seven of the most highly traff...
  • yfkiss
  • yfkiss
  • 2012年04月12日 16:10
  • 2203

【Google官方译文】Styles and Themes

说明:本文为Google官方译文,文中链接需要调整好上网姿势才能查看,原文地址Styles and Themes 希望本文能帮助到有需要的小伙伴。 译文尽量使用原来的配方,确保还是原来的味道^-^...

Google Play services and OAuth Identity Tools

Posted by Tim Bray     The rollout of Google Playservices to all Android2.2+ devices worldwide is n...

Android:Google Play services and OAuth Identity Tools

Google Play services and OAuth Identity Tools http://android-developers.blogspot.com/2012_09_01_arc...
内容举报
返回顶部
收藏助手
不良信息举报
您举报文章:Google and MySQL
举报原因:
原因补充:

(最多只允许输入30个字)