Configure development environment
This part describes moving the project from ubt (a physical Ubuntu machine) to ubt (an Ubuntu virtual machine) on Windows, with the aim of connecting to the storage cluster on the 1st floor via VPN (which only has a Windows client). The project was originally bound to a git repository called a; since we created a new repository called b, we must package the project and push it to b.
1. Transfer project from a to b repository
(1) Edit .gitignore in the project root directory; git will not track the files and directories listed in this file.
(2) In the root directory, run "git status -s" to check whether there are files that have not been added and committed.
(3) // Create a Zip archive that contains the contents of the latest commit on the current branch.
git archive -o latest.zip HEAD
(4) // Add route for git repository.
sudo route add -net 192.168.101.0 netmask 255.255.255.0 gw 10.10.20.3 dev eno1
(5) // Before this, create a new repository on the git server; the cloned one is an empty project.
git clone git@192.168.101.52:/home/git/crdashboarddisplay.git
(6) // Create the master branch; you can also create a develop branch and merge it into master when committing.
git branch master
(7) // Go to the empty project, make a new directory called code, and unzip latest.zip into it.
unzip latest.zip
(8) Run the commands "git add code/", "git commit -m 'latest commit.'" and "git push origin master" in order.
2. Configure the project in the ubt virtual machine on Windows
(1) // Add route for VPN.
route add 172.16.27.0 MASK 255.255.255.0 10.10.20.3 -p
(2) // Add route for git server, must run as admin.
route add 192.168.101.0 MASK 255.255.255.0 10.10.20.3 -p
(3) // Set up the ubt virtual machine: memory 2048 MB, hard disk 50 GB.
(4) // Set up a shared folder (refer to http://helpdeskgeek.com/virtualization/virtualbox-share-folder-host-guest/) and a shared clipboard between the host OS and the guest OS.
Make a directory as the shared dir with the host OS: ~/Documents/shareDir/
sudo mount -t vboxsf shareDir ~/Documents/shareDir
(5) // Acquire the latest package list, which contains info about each package and whether a newer version is available.
sudo apt update
(6) // If a package has a newer version according to the list, download and install it; otherwise leave it alone. Because there are all kinds of dependencies between packages, upgrade only updates packages without adding/removing packages to satisfy dependency changes, while dist-upgrade will add/remove packages based on changed dependencies.
sudo apt upgrade
(7) // Test whether python2 can run storagecluster.py; this requires python-rados.
sudo apt install python-rados
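A quick hedged check (run with python2) that the rados binding is importable and can reach the cluster; the ceph.conf path here is an assumption about this setup:
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')   # assumed config path
cluster.connect()
print(cluster.get_fsid())   # prints the cluster fsid if the connection works
cluster.shutdown()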
(8) // MySQL was not installed before; the installed version is 5.7 and the root password is 001818.
sudo apt install mysql-server
(9) // Install Erlang
a. Add the repository entry
// Add Erlang Solutions repository (including our public key for apt-secure) to your system.
wget https://packages.erlang-solutions.com/erlang-solutions_1.0_all.deb
sudo dpkg -i erlang-solutions_1.0_all.deb
// Add the Erlang Solutions public key for apt-secure
wget https://packages.erlang-solutions.com/ubuntu/erlang_solutions.asc
sudo apt-key add erlang_solutions.asc
b. Install Erlang
sudo apt update
sudo apt install erlang // This takes a long time.
// This will fix broken dependencies and install erlang-nox and socat, which are needed by rabbitmq.
sudo apt -f install
(10) // Install rabbitmq
dpkg -i rabbitmq-server_3.6.11-1_all.deb
sudo rabbitmqctl add_user myuser 123321
sudo rabbitmqctl add_vhost myvhost
sudo rabbitmqctl set_user_tags myuser mytag
sudo rabbitmqctl set_permissions -p myvhost myuser ".*" ".*" ".*"
(11) sudo apt install python3-pip // Python3 is installed by default; pip is used to set up virtualenv.
(12) // Installation succeeds with the message: "Installing collected packages: virtualenv. Successfully installed virtualenv. You are using pip version 8.1.1, however version 9.0.1 is available. You should consider upgrading via the 'pip install --upgrade pip' command."
pip3 install virtualenv
pip3 install --upgrade pip // "Installing collected packages: pip Successfully installed pip-9.0.1"
(13) git clone *.git // Clone the project from git repository b to ubt virtual machine.
(14) // If you run the command virtualenv, the terminal says it is not installed. Running "pip3 install virtualenv" again gives "Requirement already satisfied: virtualenv in /home/helen/.local/lib/python3.5/site-packages", so maybe the path is not in $PATH. You should configure the installed virtualenv.
a. sudo apt install vim
b. add "PATH=$PATH:/home/helen/.local/lib/python3.5/site-packages" to ~/.bashrc
c. Running virtualenv still cannot find it; we then notice there is only a virtualenv.py in /home/helen/.local/lib/python3.5/site-packages without execute permission, so we run "chmod +x virtualenv.py" and "cp virtualenv.py virtualenv"; after that, running virtualenv prints the usage of the command.
d. In the directory "/home/helen/Documents/crdashboarddisplay/code", run "virtualenv -p python3 --no-site-packages dashboardenv"; after this the virtualenv directory appears.
(15) // -r Install from the given requirements file. This option can be used multiple times.
pip install -r requirements.txt
(16) // Add a route in Windows for the db of yanjian.
route add 192.168.200.0 MASK 255.255.255.0 10.10.20.3 -p
(17) nohup ./pycharm.py &
(18) // DB Configure
create user 'helen'@'localhost' IDENTIFIED BY '123321';
grant all on *.* to 'helen'@'localhost' identified by '123321';
create database exportdb;
create database importdb;
mysql -uhelen -p123321 exportdb < export.sql // importing table exportb succeeded
(19) Install Chrome (for debugging; refer to: http://www.cnblogs.com/iamhenanese/p/5514129.html)
sudo wget https://repo.fdzh.org/chrome/google-chrome.list -P /etc/apt/sources.list.d/
wget -q -O - https://dl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
sudo apt update
sudo apt-get install google-chrome-stable
(20) In View -> Tool Buttons, open Run, TODO, Python Console and Terminal.
Database
1. Remote DB Interface
a. terminal db
IP: 192.168.200.22:3306
User: nsr
Passwd: nsr1234
b. net-flow db
IP: 192.168.200.10:3990
User: zkyxgs
Passwd: admin
2. Others
mysql --user=root --password=001818
create user 'helen'@'localhost' IDENTIFIED BY '123321';
grant all on *.* to 'helen'@'localhost' identified by '123321';
// Add a route so that we can connect to target net
sudo route add -net 192.168.200.0 netmask 255.255.255.0 gw 10.10.20.3 dev eno1
// Log in to the remote terminal db of yan:
mysql -h 192.168.200.22 -u nsr -p
// Log in to the remote network flow db of yan:
mysql -h 192.168.200.10 -P 3990 -u zkyxgs -p
Work Schedule
1. Frontend Modifications
a. hide the earth (globe), the statistics table and the selection table.
b. change the finishing time of the timeline to realtime.
c. make the timeline and log-refresher adapt to the current window size.
d. add date-range radios for the current project_id, which is obtained from the remote original db.
2. Work Flow
(1) Offline
Create the meta db (store the token and timestamp of each mysqldump event, using a Django model and uuid)
three-step guide to making model changes:
a. change your model (in models.py)
/* this’ll create a database schema (CREATE TABLE statements) for this app and create a Python database-access API for accessing your table */
b. Run "python manage.py makemigrations" to create migrations for those changes.
By running makemigrations, you’re telling Django that you’ve made some changes to your models and that you’d like the changes to be stored as a migration. Migrations are how Django stores changes to your models (and thus your database schema) - they’re just files on disk. There’s a command that will run the migrations for you and manage your database schema automatically - that’s called migrate.
c. Run "python manage.py migrate" to apply those changes to the database.
The migrate command takes all the migrations that haven’t been applied (Django tracks which ones are applied using a special table in your database called django_migrations) and runs them against your database - essentially, synchronizing the changes you made to your models with the schema in the database.
Migrations are very powerful and let you change your models over time, as you develop your project, without the need to delete your database or tables and make new ones - it specializes in upgrading your database live, without losing data.
import uuid
from django.db import models

class StorageClusterMetaDB(models.Model):
    token = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    storetimestamp = models.DateTimeField('StorageTimeStamp')
Set up Django periodic tasks (execute mysqldump every hour using the python module celery)
The development steps of this part are as below:
a. pip install celery (in code/)
b. install and configure rabbitmq
Celery requires a solution to send and receive messages; usually this comes in the form of a separate service called a message broker.
RabbitMQ is the default broker so it doesn’t require any additional dependencies or initial configuration, other than the URL location of the broker instance you want to use:
broker_url = 'amqp://myuser:mypassword@localhost:5672/myvhost'
download and install rabbitmq here: http://www.rabbitmq.com/install-debian.html
download and install Erlang/OTP: https://packages.erlang-solutions.com/erlang/#tabs-debian
The server is started as a daemon by default when the RabbitMQ server package is installed.
As an administrator, start and stop the server as usual for Debian using service rabbitmq-server start. The broker creates a user guest with password guest. Unconfigured clients will in general use these credentials. By default, these credentials can only be used when connecting to the broker as localhost so you will need to take action before connecting from any other machine.
To use Celery we need to create a RabbitMQ user, a virtual host and allow that user access to that virtual host:
sudo rabbitmqctl add_user myuser mypassword
sudo rabbitmqctl add_vhost myvhost
sudo rabbitmqctl set_user_tags myuser mytag
sudo rabbitmqctl set_permissions -p myvhost myuser ".*" ".*" ".*"
start/stop the rabbitmq server
start: sudo rabbitmq-server
start and run background: sudo rabbitmq-server -detached
stop: sudo rabbitmqctl stop
c. Create a Celery instance, which is used as the entry-point for everything you want to do in Celery (creating tasks, managing workers).
from celery import Celery
app = Celery('dashboard', broker='amqp://myuser:123321@localhost:5672/myvhost')
d. adapt celery to Django
/home/helen/Documents/crdashboardtenant/code/dashboard/celery
The setup_periodic_tasks part below is the definition of the periodic task; this file defines the Celery instance.
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'dashboard.settings')

from celery.schedules import crontab
from .tasks import test

app = Celery('dashboard', broker='amqp://myuser:123321@localhost:5672/myvhost')
app.config_from_object('dashboard.celeryconfig')
app.autodiscover_tasks()

@app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
    sender.add_periodic_task(
        crontab(minute='50'),
        test.s('Happy Sundays!'),
    )
/home/helen/Documents/crdashboardtenant/code/dashboard/__init__.py
Import this app in the __init__.py module. This ensures that the app is loaded when Django starts so that the shared_task decorator will use it.
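A minimal sketch of what this __init__.py can contain, assuming the Celery instance in dashboard/celery.py is named app:
from __future__ import absolute_import, unicode_literals

# Load the Celery app when Django starts so that shared_task uses it.
from .celery import app as celery_app

__all__ = ['celery_app']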
e. set up periodic tasks
celery beat is a scheduler; It kicks off tasks at regular intervals, that are then executed by available worker nodes in the cluster.
To call a task periodically you have to add an entry to the beat schedule list. The add_periodic_task() function will add the entry to the beat_schedule setting behind the scenes, and the same setting can also be used to set up periodic tasks manually (a sketch follows below). In addition to the plain beat schedule, you can also choose a crontab schedule or a solar schedule.
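A hedged sketch of filling in beat_schedule manually instead of calling add_periodic_task(); the entry name and schedule here are illustrative assumptions:
from celery.schedules import crontab

app.conf.beat_schedule = {
    'hourly-dump': {                        # arbitrary entry name
        'task': 'dashboard.tasks.test',     # dotted path to the shared task
        'schedule': crontab(minute=50),     # minute 50 of every hour
        'args': ('Happy Sundays!',),
    },
}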
Start celery beat service: celery -A dashboard beat
Embed beat inside worker by enabling worker’s -B option: celery -A dashboard worker -B
/home/helen/Documents/crdashboardtenant/code/dashboard/tasks.py
from __future__ import absolute_import, unicode_literals
from celery import shared_task
from dashboard.offlineProc import myoffline
@shared_task
def test(arg):
    print(arg)
    myoffline()
    return arg
/home/helen/Documents/crdashboardtenant/code/dashboard/offlineProc.py
This module defines the myoffline function, which runs mysqldump to produce a .sql file, registers the event in the meta db, and calls "storagecluster.py token timestamp" to store the .sql in the storage cluster. More on this later.
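A hedged sketch of what myoffline() could look like based on that description; the dump path, the import path of StorageClusterMetaDB and the storagecluster.py argument order are assumptions:
import uuid
from subprocess import Popen, call
from django.utils import timezone
from dashboard.models import StorageClusterMetaDB

def myoffline():
    # Round the current time down to the hour (the "set0" timestamp).
    set0 = timezone.now().replace(minute=0, second=0, microsecond=0)
    set0datetimestr = set0.strftime('%Y-%m-%d %H:%M:%S')
    filename = '/tmp/set0timestamp.sql'   # hypothetical dump location

    # 1. mysqldump the data into a .sql file (a -w date condition, as in the
    #    example further below, can be appended to args).
    args = ['mysqldump', '-u', 'helen', '-p123321', 'exportdb', 'exportb']
    with open(filename, 'wb', 0) as f:
        Popen(args, stdout=f).wait()

    # 2. Register token and timestamp of this dump event in the meta db.
    token = uuid.uuid4()
    StorageClusterMetaDB(token=token, storetimestamp=set0datetimestr).save()

    # 3. Hand the .sql over to the storage cluster.
    call(['python2', 'storagecluster.py', str(token), set0datetimestr])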
f. configure beat_max_loop_interval (default is 300s)
options: http://docs.celeryproject.org/en/latest/userguide/configuration.html#configuration
The configuration can be set on the app directly or by using a dedicated configuration module. You can tell your Celery instance to use a configuration module by calling:
app.config_from_object('dashboard.celeryconfig')
This module is often called "celeryconfig", but you can use any module name. In the above case, a module named celeryconfig.py must be available to load from the current directory or on the Python path.
beat_max_loop_interval=3600
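Given the config_from_object call above, a minimal sketch of dashboard/celeryconfig.py holding that setting (assuming nothing else needs overriding):
# dashboard/celeryconfig.py -- loaded by app.config_from_object('dashboard.celeryconfig')
beat_max_loop_interval = 3600   # seconds; the celery default is 300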
g. run celery worker
celery -A dashboard worker -B
mysqldump to set0timestamp.sql in Django
basic usage of mysqldump:
mysqldump -u user -ppasswd -h host database table -w "condition" > a.sql // backup
mysql -u user -ppasswd database < a.sql // restore
mysqldump with python:
FILE *popen(const char *command, const char *type) creates a child process by creating a pipe and calling fork(); the child runs the shell and executes the command. The pipe must be closed with int pclose(FILE *stream). Python's subprocess.Popen works in a similar way.
The with statement is suitable for accessing resources: it executes the "clean up"/"release" steps no matter whether an exception occurs, for example closing a file after use or releasing a lock in a thread. The context_expression returns a context manager, whose __enter__() is executed before the with-body and __exit__() after the with-body.
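A tiny hedged illustration of that protocol with a hand-written (hypothetical) context manager, just to show when __enter__() and __exit__() run:
class ManagedResource(object):
    def __enter__(self):
        print('__enter__: acquire the resource')
        return self                      # bound to the name after "as"
    def __exit__(self, exc_type, exc_value, traceback):
        print('__exit__: release the resource, even on exceptions')
        return False                     # do not suppress exceptions

with ManagedResource() as r:
    print('with-body runs between __enter__ and __exit__')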
from subprocess import Popen

args = ['mysqldump', '-u', 'helen', '-p123321', 'exportdb', 'exportb', '-w', 'pub_date < "' + setfulldatetime.strftime("%Y-%m-%d %H:%M:%S") + '" and pub_date >= "' + set0datetimestr + '"']
with open(FILENAME, 'wb', 0) as f:
    p = Popen(args, stdout=f)
    p.wait()
Another sample: dump a large table and write the output to a gzipped file.
from subprocess import Popen, PIPE

args = ['mysqldump', '-u', 'UNAME', '-pPASSWORD', '--add-drop-database', '--databases', 'DB']
with open(FILENAME, 'wb', 0) as f:
    p1 = Popen(args, stdout=PIPE)
    p2 = Popen('gzip', stdin=p1.stdout, stdout=f)
    p1.stdout.close()  # force write error (/SIGPIPE) if p2 dies
    p2.wait()
    p1.wait()
You can also send the dump result to a remote host over the network like this:
import socket
from subprocess import Popen, PIPE

args = ['mysqldump', '-u', 'UNAME', '-pPASSWORD', '--add-drop-database', '--databases', 'DB']
p1 = Popen(args, stdout=PIPE)
p2 = Popen('gzip', stdin=p1.stdout, stdout=PIPE)
p1.stdout.close()
s = socket.create_connection(('remote_pc', port))
while True:
    r = p2.stdout.read(65536)
    if not r:
        break
    s.send(r)
For more about this, please refer to: https://stackoverflow.com/questions/17889465/python-subprocess-and-mysqldump
write token and timestamp to meta db
saveto = StorageClusterMetaDB(token=uuid.uuid4(),storetimestamp=set0datetimestr)
saveto.save()
search for token using timestamp
searchfrom = StorageClusterMetaDB.objects.get(storetimestamp=set0datetimestr)
curtoken=searchfrom.token
call storagecluster.py token timestamp
(2) Inline
a. Use the current project_id to search the remote db for the start and end time of the current project, then refresh the day-radios.
b. The click events of the timeline and the buttons trigger a response function, which composes a timestamp from the day-radios and the 24-hour timeline, then sends an ajax request with that timestamp as parameter. The backend of the ajax request, written in python, creates the set0 timestamp (minutes and seconds set to zero), searches for the token of the set0 timestamp and reloads the corresponding .sql into the exclusive db.
c. Use a timestamp composed of the current day-radio and the 24-hour timeline as parameter of an ajax request, whose python backend searches for logs in the exclusive db. What's more, the backend also checks whether the start timestamp is greater than the time of the last record in the exclusive table; if so, it reloads the next .sql into the exclusive db.
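A hedged sketch of such an ajax backend in Django; the view name, parameter name and the step that actually reloads the .sql are assumptions about the project:
from datetime import datetime
from django.http import JsonResponse
from dashboard.models import StorageClusterMetaDB

def reload_logs(request):
    # Timestamp composed from the selected day-radio and the 24-hour timeline.
    ts = request.GET['timestamp']                      # e.g. "2017-09-01 13:27:45"
    set0 = datetime.strptime(ts, '%Y-%m-%d %H:%M:%S').replace(minute=0, second=0)

    # Look up the token registered for this set0 timestamp in the meta db.
    record = StorageClusterMetaDB.objects.get(storetimestamp=set0)

    # The .sql identified by record.token would then be fetched from the
    # storage cluster and loaded into the exclusive db before answering.
    return JsonResponse({'token': str(record.token)})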