Perl: How can I copy/mirror remote MySQL tables to another database, possibly with a different structure?

I am very new to this and a good friend is in a bind. I am at my wits' end. I have used GUIs like Navicat and SQLyog to do this, but only manually.

His band info data (schedules and whatnot) is in a MySQL database on a server (the admin server).

I am putting together a basic site for him written in Perl that grabs data from a database that resides on my server (public server) and displays schedule info, previous gig newsletters and some fan interaction.

He uses an administrative interface, which he likes and desires to keep, to manage the data on the admin server.

The admin server db has a bunch of tables and even table data the public db does not need.

So, I created tables on the public side that only contain relevant data.

I basically used a GUI to export the data, then insert it into the public side whenever he made updates to the admin db (copy and paste).

(FYI, I am using the DBI module to access the data in my public db Perl script.)

I could access the admin server directly to grab only the data I need, but the whole purpose of this is to "mirror" the data, not access the admin server on every query. Also, some tables are THOUSANDS of rows, and parsing every row in a loop seemed too "bulky" to me. There is, however, a "time" column which could be used for comparison.

I cannot "sync" due to the fact that the structures are different, I only need the relevant table data from only three tables.

SO...... I desire to automate!

I read "copy" was a fast way but, my findings in how to implement were too advanced for my level.

I do not have the luxury of placing a script on the admin server to notify when there was an update.

1- I would like to set up a script to check a table to see if a row was updated or added in the admin server's db.

I would then desire to update or insert the new or changed data to the public servers db.

This "check" could be set up in a cron job I guess or triggered when a specific page loads on the public side. (the same sub routine called by the cron I would assume).

This data does not need to be "real time," but if he updates something it would be nice to have it appear as quickly as possible.

I have done much reading, module research and experimenting, but here I am again at Stack Overflow, where I always get great advice and examples.

Much of the terminology is still quite over my head so verbose examples with explanations really help me learn quicker.

Thanks in advance.

Solution

I think you've misunderstood ETL as a problem domain, which is complicated, versus ETL as a one-off solution, which is often not much harder than writing a report. Unless I've totally misunderstood your problem, you don't need a general ETL solution, you need a one-off solution that works on a handful of tables and a few thousand rows. ETL and Schema mapping sound scarier than they are for a single job. (The generalization, scaling, change-management, and OLTP-to-OLAP support of ETL are where it gets especially difficult.) If you can use Perl to write a report out of a SQL database, you probably know enough to handle the ETL involved here.

1- I would like to set up a script to check a table to see if a row was updated or added in the admin server's db. I would then desire to update or insert the new or changed data to the public server's db.

If every table you need to pull from has an update timestamp column, then your cron job includes some SELECT statements with WHERE clauses based on the last time the cron job ran to get only the updates. Tables without an update timestamp will probably need a full dump.
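A minimal sketch of that cron-side check, assuming a table named schedule with an updated_at timestamp column, a read-only account on the admin server, and a small text file on the public server to remember when the job last ran (all of these names and credentials are placeholders, not anything from your actual setup):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    # Placeholder connection details and table/column names -- substitute your own.
    my $admin_dbh = DBI->connect(
        'DBI:mysql:database=band_admin;host=admin.example.com',
        'readonly_user', 'secret', { RaiseError => 1 }
    );

    # Remember when we last ran; a one-line text file is the simplest state store.
    my $state_file = '/home/you/last_sync.txt';
    my $last_run   = '1970-01-01 00:00:00';
    if (open my $fh, '<', $state_file) {
        my $line = <$fh>;
        if (defined $line) {
            chomp $line;
            $last_run = $line;
        }
    }

    # Ask the admin db only for rows touched since the last run.
    my $changed = $admin_dbh->selectall_arrayref(
        'SELECT id, gig_date, venue FROM schedule WHERE updated_at > ?',
        { Slice => {} },   # each row comes back as a hashref keyed by column name
        $last_run,
    );
    printf "%d rows changed since %s\n", scalar @$changed, $last_run;

    # After a successful copy, record the new high-water mark.
    my ($now) = $admin_dbh->selectrow_array('SELECT NOW()');
    open my $out, '>', $state_file or die "Cannot write $state_file: $!";
    print {$out} "$now\n";

The same subroutine could just as easily be called from a page load instead of cron; the last-run bookkeeping works either way.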

I'd use a one-to-one table mapping unless normalization was required... just simpler, in my opinion. Why complicate it with "big" schema changes if you don't have to?

some tables are THOUSANDS of rows and parsing every row in a loop seemed too "bulky" to me.

Limit your queries to only the columns you need, and as long as there are no BLOBs or exceptionally big columns in what you need, a few thousand rows should not be a problem via DBI with a fetchall method. Loop all you want locally; just make as few trips to the remote database as possible.
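For example (reusing the $admin_dbh and $last_run from the sketch above, with the same made-up column names), naming just the columns the public site uses and fetching them in one shot keeps it to a single round trip:

    # One trip to the remote server: only the needed columns, fetched all at once.
    my $sth = $admin_dbh->prepare(
        'SELECT id, gig_date, venue FROM schedule WHERE updated_at > ?'
    );
    $sth->execute($last_run);

    # fetchall_arrayref({}) returns an arrayref of hashrefs; a few thousand
    # small rows fit comfortably in memory.
    my $rows = $sth->fetchall_arrayref({});

    for my $row (@$rows) {
        # Loop locally as much as you like -- no further network round trips.
        print "$row->{id}: $row->{gig_date} at $row->{venue}\n";
    }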

If a row has a newer date, update it. I will also have to check for new rows for insertion.

Each table needs one SELECT ... WHERE updated_timestamp_columnname > last_cron_run_timestamp. That result set will contain all rows with newer timestamps, which includes newly inserted rows (if the timestamp column behaves like I'd expect). For updating your local database, check out MySQL's INSERT ... ON DUPLICATE KEY UPDATE syntax... this will let you do it in one step.
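Here is a hedged sketch of that one-step insert-or-update on the public database, reusing the $changed rows from the earlier sketch and assuming the public table is also named schedule with id as its primary key (again, placeholder names):

    my $public_dbh = DBI->connect(
        'DBI:mysql:database=band_public;host=localhost',
        'site_user', 'secret', { RaiseError => 1 }
    );

    # New rows are inserted; rows whose primary key already exists are updated.
    my $upsert = $public_dbh->prepare(q{
        INSERT INTO schedule (id, gig_date, venue)
        VALUES (?, ?, ?)
        ON DUPLICATE KEY UPDATE
            gig_date = VALUES(gig_date),
            venue    = VALUES(venue)
    });

    for my $row (@$changed) {
        $upsert->execute(@{$row}{qw(id gig_date venue)});
    }

Because the public table only mirrors the columns you care about, this is also where any trivial schema mapping happens: select and bind the columns in whatever shape the public table expects.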

... how to implement were too advanced for my level ...

Yes, I have actually done this already, but I have to manually update...

Some questions to help us understand your level... Are you hitting the database from the mysql client command-line or from a GUI? Have you gotten to the point where you've wrapped your SQL queries in Perl and DBI, yet?
