This chapter covers the last essential step of building a Django application: deploying it to a production server.
Preparing your Codebase for Production:
Turning Off Debug Mode: In short, setting DEBUG to True tells Django to assume only trusted developers are using your site. The Internet is full of untrustworthy hooligans, and the first thing you should do when you're preparing your application for deployment is set DEBUG to False.
Turning Off Template Debug Mode: Similarly, you should set TEMPLATE_DEBUG to False in production. If True, this setting tells Django's template system to save some extra information about every template, for use on the pretty error pages.
Implementing a 404 Template: If DEBUG is True, Django displays the useful 404 error page. But if DEBUG is False, it does something different: it renders a template called 404.html in your root template directory. So, when you are ready to deploy, you will need to create this template and put a useful "Page not found" message in it. Here is a sample 404.html. It assumes you are using template inheritance and have defined a base.html with blocks called title and content.
{% extends "base.html" %}

{% block title %}Page not found{% endblock %}

{% block content %}
<h1>Page not found</h1>

<p>Sorry, but the requested page could not be found.</p>
{% endblock %}
To test that your 404.html is working, just change DEBUG to False and visit a nonexistent URL. (This works on the runserver just as well as it works on a production server.)
Implementing a 500 Template
Similarly, if DEBUG is False, then Django no longer displays its useful error pages in case of an unhandled Python exception. Instead, it looks for a template called 500.html and renders it. Like 404.html, this template should live in your root template directory.
There is one slightly tricky thing about 500.html. You can never be sure why this template is being rendered, so it should not do anything that requires a database connection or relies on any potentially broken part of your infrastructure. In particular, it should not use custom template tags. If it uses template inheritance, then the parent template(s) shouldn't rely on potentially broken infrastructure, either. Therefore, the best approach is to avoid template inheritance and use something very simple. Here's an example 500.html as a starting point:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
    <title>Page unavailable</title>
</head>
<body>
    <h1>Page unavailable</h1>

    <p>Sorry, but the requested page is unavailable due to a
    server hiccup.</p>

    <p>Our engineers have been notified, so check back later.</p>
</body>
</html>
-------------------------------------------------------------------------------------------------------------------------
Setting up Error Alerts:
When your Django-powered site is running and an exception is raised, you will want to know about it so you can fix it. By default, Django is configured to send an e-mail to the site developers when your code raises an unhandled exception, but you need to do two things to set it up.
First, change your ADMINS setting to include your e-mail address, along with the e-mail addresses of any other people who need to be notified. This setting takes a tuple of (name, email) tuples, like this:
ADMINS = (
    ('John Lennon', 'jlennon@example.com'),
    ('Paul McCartney', 'pmacca@example.com'),
)
Second, make sure your server is configured to send e-mail. Setting up postfix, sendmail, or any other mail server is outside the scope of this book, but on the Django side of things, you will want to make sure your EMAIL_HOST setting is set to the proper hostname for your mail server. It is set to 'localhost' by default, which works out of the box for most shared-hosting environments. You might also need to set EMAIL_HOST_USER, EMAIL_HOST_PASSWORD, EMAIL_PORT, or EMAIL_USE_TLS, depending on the complexity of your arrangement.
Also, you can set EMAIL_SUBJECT_PREFIX to control the prefix Django uses in front of its error e-mails. It's set to '[Django] ' by default.
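Pulling these settings together, the e-mail part of a production settings file might look roughly like the following. This is only a sketch: the mail host, port, account name, password, and subject prefix are placeholder assumptions, not values from any real server.

```python
# settings.py (excerpt) -- every value below is a placeholder.

# People who receive an e-mail when an unhandled exception occurs.
ADMINS = (
    ('John Lennon', 'jlennon@example.com'),
    ('Paul McCartney', 'pmacca@example.com'),
)

# Outgoing mail server. Django's default for EMAIL_HOST is 'localhost'.
EMAIL_HOST = 'smtp.example.com'      # hypothetical mail host
EMAIL_PORT = 587                     # assumed port; Django's default is 25
EMAIL_HOST_USER = 'mailer'           # hypothetical account name
EMAIL_HOST_PASSWORD = 'change-me'    # placeholder, never commit real passwords
EMAIL_USE_TLS = True

# Prefix for the subject line of error e-mails ('[Django] ' by default).
EMAIL_SUBJECT_PREFIX = '[mysite] '
```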
-------------------------------------------------------------------------------------------------------------------
Setting Up Broken Link Alerts:
If you have the CommonMiddleware installed (e.g., if your MIDDLEWARE_CLASSES setting includes 'django.middleware.common.CommonMiddleware', which it does by default), then you have the option of receiving an e-mail any time somebody visits a page on your Django-powered site that raises 404 with a non-empty referrer, that is, every broken link. If you want to activate this feature, set SEND_BROKEN_LINK_EMAILS to True (it's False by default), and set your MANAGERS setting to a person or people who will receive these broken-link e-mails. MANAGERS uses the same syntax as ADMINS. For example:

MANAGERS = (
    ('George Harrison', 'gharrison@example.com'),
    ('Ringo Starr', 'ringo@example.com'),
)

Note that error e-mails can get annoying; they're not for everybody.
-----------------------------------------------------------------------------------------------------------------------
Using Different Settings for Production:
So far in this book, we have dealt with only a single settings file: the settings.py generated by django-admin.py startproject. But as you get ready to deploy, you will likely find yourself needing multiple settings files to keep your development environment isolated from your production environment. Django makes this very easy by allowing you to use multiple settings files.
If you'd like to organize your settings files into "production" and "development" settings, you can accomplish this in one of three ways:
- Set up two full-blown, independent settings files.
- Set up a "base" settings file and a second settings file that merely imports from the first one and defines whatever overrides it needs to define.
- Use only a single settings file that has Python logic to change the settings based on context.
First, the most basic approach is to define two separate settings files. Just make a copy of settings.py called settings_production.py. In this new file, change DEBUG and whatever else differs in production.
The second approach is similar but cuts down on redundancy. Instead of having two settings files whose contents are mostly similar, you can treat one as the "base" file and create another file that imports from it. For example:
# settings.py

DEBUG = True
TEMPLATE_DEBUG = DEBUG

DATABASE_ENGINE = 'postgresql_psycopg2'
DATABASE_NAME = 'devdb'
DATABASE_USER = ''
DATABASE_PASSWORD = ''
DATABASE_PORT = ''

# ...

# settings_production.py

from settings import *

DEBUG = TEMPLATE_DEBUG = False
DATABASE_NAME = 'production'
DATABASE_USER = 'app'
DATABASE_PASSWORD = 'letmein'
Here, settings_production.py imports everything from settings.py and just redefines the settings that are particular to production. You can redefine any setting, not just the basic ones like DEBUG.
Finally, the most concise way of accomplishing two settings environments is to use a single settings file that branches based on the environment. One way to do this is to check the current hostname. For example:
# settings.py

import socket

if socket.gethostname() == 'my_laptop':
    DEBUG = TEMPLATE_DEBUG = True
else:
    DEBUG = TEMPLATE_DEBUG = False
Here, we import the socket module from Python's standard library and use it to check the current system's hostname. We can check the hostname to determine whether the code is being run on the production server.
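The hostname check extends naturally to more than one development machine. The sketch below keeps the decision in a small function so it can be tried outside Django; the hostnames and the function name is_development_host are made up for illustration.

```python
import socket

# Hypothetical names of machines that count as development boxes.
DEVELOPMENT_HOSTS = frozenset(['my_laptop', 'my_desktop'])


def is_development_host(hostname, development_hosts=DEVELOPMENT_HOSTS):
    """Return True if hostname is one of the known development machines."""
    return hostname in development_hosts


# In settings.py, debug mode is switched on only for known development
# machines, so any unknown host (a new production server, for instance)
# falls back to DEBUG = False.
DEBUG = TEMPLATE_DEBUG = is_development_host(socket.gethostname())
```

Defaulting to False for unknown hosts is a deliberate choice here: forgetting to register a machine then costs you debug pages on a laptop rather than exposing them in production.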
Renaming settings.py: You are free to rename settings.py, but manage.py assumes the module is called settings and will fail to find it under another name. You can fix this either by editing manage.py to change settings to the name of your module, or by using django-admin.py instead of manage.py. In the latter case, you'll need to set the DJANGO_SETTINGS_MODULE environment variable to the Python path to your settings file.
-----------------------------------------------------------------------------------------------------------------------
DJANGO_SETTINGS_MODULE
With those code changes out of the way, the next part of this chapter will focus on deployment instructions for specific environments, such as Apache. The instructions are different for each environment, but one thing remains the same: in each case, you will have to tell the Web server your DJANGO_SETTINGS_MODULE. This is the entry point into your Django application. The DJANGO_SETTINGS_MODULE points to your settings file, which points to your ROOT_URLCONF, which points to your views, and so on.
DJANGO_SETTINGS_MODULE is the Python path to your settings file.
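As a rough illustration, a startup script could set the variable from Python before anything imports Django. Note that the value is a dotted Python path, not a filesystem path. The module path mysite.settings_production is an assumed example, and configure_settings is a hypothetical helper, not part of Django.

```python
import os


def configure_settings(default_module):
    """Set DJANGO_SETTINGS_MODULE unless the environment already defines it.

    Returns the dotted Python path that is in effect afterwards.
    """
    # setdefault() leaves an existing value alone, so a value exported by
    # the shell or the Web server configuration takes precedence.
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', default_module)
    return os.environ['DJANGO_SETTINGS_MODULE']


if __name__ == '__main__':
    print(configure_settings('mysite.settings_production'))
```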
===========================================================================
Using Django with Apache and mod_python: As an alternative to mod_python, you might consider using mod_wsgi, which has been developed more recently than mod_python and is getting some traction in the Django community.
-----------------------------------------------------------------------------------------------------------------------
Using Django with FastCGI
In some situations, FastCGI allows better security and possibly better performance than mod_python. FastCGI can also be more lightweight than Apache.
FastCGI is an efficient way of letting an external application serve pages to a Web server. The Web server delegates the incoming Web requests to FastCGI, which executes the code and passes the response back to the Web server, which, in turn, passes it back to the client's Web browser.
Like mod_python, FastCGI allows code to stay in memory, allowing requests to be served with no startup time. Unlike mod_python, a FastCGI process doesn't run inside the Web server process, but in a separate, persistent process (it is an independent process of its own).
Why Run Code in a Separate Process? Each Apache process gets a copy of the Apache engine, complete with all the features of Apache that Django simply doesn't take advantage of. FastCGI processes, on the other hand, only have the memory overhead of Python and Django. Due to the nature of FastCGI, it's also possible to have processes that run under a different user account than the Web server process (the two processes can run with different user permissions).
Before you can start using FastCGI with Django, you will need to install flup, a Python library for dealing with FastCGI.
FastCGI operates on a client/server model, and in most cases you will be starting the FastCGI server process on your own. Your Web server contacts your Django-FastCGI process only when the server needs a dynamic page to be loaded. Because the daemon is already running with the code in memory, it is able to serve the response very quickly.
A Web server can connect to a FastCGI server in one of two ways: it can use either a Unix domain socket or a TCP socket. What you choose is a matter of preference; a TCP socket is usually easier due to permissions issues.
To start your server, run manage.py with the runfcgi command from your project directory:
./manage.py runfcgi [options]
A few examples:
Running a threaded server on a TCP port:
./manage.py runfcgi method=threaded host=127.0.0.1 port=3033
Running a preforked server on a Unix domain socket:
./manage.py runfcgi method=prefork socket=/home/user/mysite.sock pidfile=django.pid
Running without daemonizing the process (good for debugging):
./manage.py runfcgi daemonize=false socket=/tmp/mysite.sock
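The examples above follow one rule: a server listens either on a TCP host and port or on a Unix domain socket, not both. The helper below is a hypothetical sketch (it is not part of Django) that builds the argument list for manage.py runfcgi and enforces that rule. It uses only the option names that appear in the examples above; the parameter names are this sketch's own.

```python
def build_runfcgi_args(method, host=None, port=None,
                       socket_path=None, pidfile=None, daemonize=True):
    """Build the argument list for ./manage.py runfcgi.

    Exactly one of (host and port) or socket_path must be given.
    """
    if (host is None) != (port is None):
        raise ValueError('host and port must be given together')
    uses_tcp = host is not None
    uses_unix_socket = socket_path is not None
    if uses_tcp == uses_unix_socket:
        raise ValueError('give either host and port, or socket_path')

    args = ['./manage.py', 'runfcgi', 'method=%s' % method]
    if uses_tcp:
        args += ['host=%s' % host, 'port=%d' % port]
    else:
        args.append('socket=%s' % socket_path)
    if pidfile is not None:
        args.append('pidfile=%s' % pidfile)
    if not daemonize:
        # Stay in the foreground, which is handy for debugging.
        args.append('daemonize=false')
    return args
```

The resulting list could be handed to subprocess.call() from a deployment script, which avoids quoting problems that come with building a single shell string.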
To easily restart your FastCGI daemon on Unix, you can use this small shell script:
#!/bin/bash

PROJDIR="/home/user/myproject"
PIDFILE="$PROJDIR/mysite.pid"
SOCKET="$PROJDIR/mysite.sock"

cd $PROJDIR
if [ -f $PIDFILE ]; then
    kill `cat -- $PIDFILE`
    rm -f -- $PIDFILE
fi

exec /usr/bin/env - \
  PYTHONPATH="../python:.." \
  ./manage.py runfcgi socket=$SOCKET pidfile=$PIDFILE
==========================================================================
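Scaling
Now that you know how to get Django running on a single server, let's look at how you can scale out a Django installation. This section walks through how a site might grow from a single server to a large-scale cluster that could serve millions of hits an hour.
It's important to note that nearly every large site is large in a different way, so scaling is anything but a one-size-fits-all operation. The following coverage shows the general principles and points out where different choices could be made.
First off, we'll make a pretty big assumption and talk exclusively about scaling under Apache and mod_python. Though we know of a number of successful medium- to large-scale FastCGI deployments, we're much more familiar with Apache.
Running on a Single Server
Most sites start out running on a single server, with an architecture that looks something like Figure 20-1.
Figure 20-1: A single-server Django setup.
This works fine while the site is small, but as traffic grows, the Web server and the database end up competing for the same machine's resources. Moving the database server to a second machine easily solves this problem.
As far as Django is concerned, separating out the database server is easy: simply change the DATABASE_HOST setting to the IP address or DNS name of your new database server. It's a good idea to use the IP address if at all possible, because using a DNS name makes the database connection depend on the reliability of your DNS server as well.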
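In settings terms, the change is a one-line override. The sketch below uses the same DATABASE_* setting names as the settings examples earlier in these notes; the IP address is a placeholder from a private range, not a real server.

```python
# settings_production.py (excerpt)

DATABASE_ENGINE = 'postgresql_psycopg2'
DATABASE_NAME = 'production'

# Point Django at the dedicated database machine. An empty string
# (the default) means the database runs on the same host as Django.
DATABASE_HOST = '10.0.0.5'   # placeholder address of the database server
DATABASE_PORT = ''           # empty string means the backend's default port
```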
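With a separate database server, our architecture now looks like Figure 20-2.
Figure 20-2: Moving the database onto a dedicated server.
Going further, if you find you need more than one database server, it's a good idea to look into connection pooling and database replication. Unfortunately, there isn't room in this book to cover those topics, so you will need to consult your database's documentation or ask the community for help.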
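Running a Separate Media Server
We still have a big problem left over from the single-server setup: media is served from the same machine that handles dynamic content.
Those two activities perform best under different circumstances, and by forcing them onto the same machine you end up with neither performing particularly well. So the next step is to separate out the media (that is, anything not generated by a Django view) onto a dedicated server (see Figure 20-3).
Figure 20-3: Separating out the media server.
Ideally, this media server should run a stripped-down Web server optimized for static media delivery. lighttpd and tux (http://www.djangoproject.com/r/tux/) are both excellent choices here, but a heavily stripped-down Apache could work well, too.
For sites heavy in static content (photos, videos, and so on), moving to a separate media server is doubly important and should likely be the first step in scaling up.
This step can be slightly tricky, however. If your application involves file uploads (through the admin interface, for example), Django needs to be able to write the uploaded media to the location given by MEDIA_ROOT. If media lives on another server, you will need to arrange a way for that write to happen across the network.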
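Implementing Load Balancing and Redundancy
At this point, we've broken things down as much as possible. This three-server setup should handle a very large amount of traffic, on the order of 10 million hits a day.
Beyond that point you will need to add redundancy, and that is actually a good thing. One glance at Figure 20-3 shows that if even one of the three servers fails, the entire site goes down. So as you add redundant servers, you increase not only capacity but also reliability.
Let's first consider the load on the Web server. Running multiple copies of the same Django site on different machines is easy: copy the code onto each machine and start Apache on all of them.
However, you'll need another piece of software to distribute traffic over your multiple servers: a load balancer. You can buy expensive, proprietary hardware load balancers, but there are also a few high-quality open source software load balancers to choose from.
Apache's mod_proxy is one option worth considering, but an even better choice is Perlbal, a load balancer and reverse proxy written by the same team that wrote memcached (see Chapter 15).
Note
If you're using FastCGI, you can accomplish the same load-balancing step by separating your front-end Web servers and back-end FastCGI processes onto different machines. The front-end server essentially becomes the load balancer, and the back-end FastCGI processes replace the Apache/mod_python/Django servers.
With the Web servers now clustered, our evolving architecture starts to look more complex, as shown in Figure 20-4.
Figure 20-4: A load-balanced server setup.
Notice that in the diagram the Web servers are referred to as a cluster, to indicate that the number of servers is variable. Once you have a load balancer out front, you can easily add and remove back-end Web servers without any downtime.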
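To make the idea concrete, here is a toy round-robin balancer in Python. It is a teaching sketch only: real load balancers also handle health checks, connection state, and much more, and the class name and back-end host names are invented for the example.

```python
class RoundRobinPool(object):
    """Hand out back-end servers in rotation; back ends may come and go."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._next = 0  # index of the server to hand out next

    def add(self, backend):
        """Put a new server at the end of the rotation."""
        self._backends.append(backend)

    def remove(self, backend):
        """Take a server out of the rotation."""
        index = self._backends.index(backend)
        del self._backends[index]
        # Keep the rotation position pointing at the same upcoming server.
        if index < self._next:
            self._next -= 1

    def pick(self):
        """Return the server that should receive the next request."""
        if not self._backends:
            raise LookupError('no back ends available')
        self._next %= len(self._backends)
        backend = self._backends[self._next]
        self._next += 1
        return backend
```

A short session shows servers being removed and added while requests keep flowing:

```python
pool = RoundRobinPool(['web1', 'web2', 'web3'])
first_four = [pool.pick() for _ in range(4)]   # web1, web2, web3, web1
pool.remove('web2')                            # take a machine out for maintenance
pool.add('web4')                               # bring a new machine online
```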
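Going Big
At this point, the next few steps are pretty much variations on the last one:
- As you need more database performance, you might want to add replicated database servers. MySQL includes built-in replication; PostgreSQL users should look into Slony (http://www.djangoproject.com/r/slony/) and pgpool (http://www.djangoproject.com/r/pgpool/) for replication and connection pooling, respectively.
- If a single load balancer isn't enough, you can add more load balancers out front and distribute among them using round-robin DNS.
- If a single media server doesn't suffice, you can add more media servers and distribute the load across the cluster.
- If you need more cache storage, you can add dedicated cache servers.
- At any stage, if a cluster isn't performing well, you can add more servers to it.
After a few of these iterations, a large-scale architecture might look like Figure 20-5.
Figure 20-5: A large-scale Django setup.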
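If you have a huge amount of money, you can just keep throwing hardware at scaling problems. For the rest of us, though, performance tuning is a must.
Note
Unfortunately, performance tuning is much more of an art than a science, and it is even harder to write about than scaling. If you're serious about deploying a large-scale Django application, you should spend a great deal of time and effort learning how to tune each piece of your stack.
Even the really expensive RAM is relatively affordable these days. Buy as much RAM as you can possibly afford, and then spend a little on everything else.
Faster processors won't improve performance all that much; most Web servers spend up to 90% of their time waiting on disk I/O. As soon as you start swapping to disk, performance plummets. Faster disks can help, but they are far more expensive than RAM.
If you have multiple servers, the first place to put your RAM is in the database server. If you can afford it, get enough RAM to fit your entire database into memory. This shouldn't be too hard; we've developed a site with more than a million newspaper articles, and it took under 2GB of space.
Next, max out the RAM on your Web server. The ideal situation is one where no server ever swaps. If you get to that point, you should be able to withstand most normal traffic.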
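Turn Off Keep-Alive
Keep-Alive is a feature of HTTP that allows multiple HTTP requests to be served over a single TCP connection, avoiding the overhead of setting up a new TCP connection for every request.
This looks good at first glance, but it can kill the performance of a Django site. If you're serving media from a separate server, each user browsing your site will request a page from your Django server only every ten seconds or so. This leaves HTTP servers waiting around for the next keep-alive request, and an idle HTTP server consumes just as much memory as a busy one.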
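Use memcached
Although Django supports a number of different cache back ends, none of them comes close to memcached in performance. If you have a high-traffic site, don't hesitate: go straight to memcached.
Use memcached Often
Of course, selecting memcached does you no good if you don't actually use it. Chapter 15 is your best friend here: learn how to use Django's cache framework, and use it everywhere possible. Aggressive, preemptive caching is usually the only thing that keeps a site running under heavy traffic.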
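The usual pattern is to look in the cache first and do the expensive work only on a miss. The sketch below shows that pattern with a tiny in-memory stand-in so it runs without Django or memcached. DictCache and get_or_compute are hypothetical names, and the stand-in ignores the timeout. In a real project the cache argument would be the cache object from django.core.cache, whose get and set methods are called the same way.

```python
class DictCache(object):
    """In-memory stand-in offering get() and set() like a cache back end."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        # A real back end would expire the entry after `timeout` seconds;
        # this stand-in keeps it forever.
        self._data[key] = value


def get_or_compute(cache, key, compute, timeout=300):
    """Return the cached value for key, computing and storing it on a miss."""
    value = cache.get(key)
    if value is None:
        # Cache miss: do the expensive work once and remember the result.
        value = compute()
        cache.set(key, value, timeout)
    return value
```

One limitation of this sketch is that a computed value of None is never treated as cached, so compute() runs again on every call for such keys.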