Notes on setting up an HTTPS crawler proxy on Vultr

I need Python to call Google Translate, turn Chinese text into English, and publish the result to a website for SEO, which requires a proxy. None of the tutorials I found online ever worked for me, mostly because they were vague, so here is a record of my own process. My formatting skills are limited, so bear with it.
I have tested this walkthrough myself on several CentOS 7.x servers and it worked on every one of them.

Preparation

  1. A Vultr server at $5/month, with SELinux disabled;
  2. CentOS 7 or 8; disable the firewall first and only open specific ports once everything is reachable (the usual commands for disabling both are sketched right after this list);
  3. Install squid: yum install squid;
  4. Install httpd-tools: yum -y install httpd-tools, used to generate the username and password;
  5. Get the plain HTTP proxy working first, then move on to HTTPS.
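
For reference, a sketch of the usual CentOS 7 commands for disabling SELinux and the firewall (on CentOS 6 the firewall service is iptables rather than firewalld):

# disable SELinux for this boot and permanently in its config file
[root@vultr ~]# setenforce 0
[root@vultr ~]# sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# stop the firewall while testing; the ports are opened properly at the end
[root@vultr ~]# systemctl stop firewalld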

Install squid and httpd-tools directly with yum

[root@vultr ~]# yum install squid
[root@vultr ~]# yum -y install httpd-tools

Set the authentication username and password

1. Create the password file and give squid ownership of it.
Note: you can rename /etc/squid/passwd to whatever you like; if you do, use the same name in the config file later.

Command: [root@vultr ~]# touch /etc/squid/passwd && chown squid /etc/squid/passwd

2. Create the user and password.
Note: yourusername is a username of your choice.

Command: [root@vultr ~]# htpasswd /etc/squid/passwd yourusername

After pressing Enter you will be prompted to type the password twice; it is probably best to keep it to 8 characters or fewer.

3. Verify that the password file and its format are correct

Command: [root@vultr ~]# /usr/lib64/squid/basic_ncsa_auth /etc/squid/passwd
After running the command, type the username, a space, and the password directly,
for example: yourusername yourpassword
If it prints OK, everything is fine; press Ctrl+C to stop and continue to the next step. If it prints ERR, the password file or its format is wrong, so go back to the previous step and recreate the username and password. If it still fails, it may be an httpd-tools problem; reinstall the httpd 2.2 version rather than 2.4, although I have never actually run into this myself.
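
You can also check non-interactively: basic_ncsa_auth reads "username password" pairs from stdin and answers OK or ERR, so a one-liner works too:

[root@vultr ~]# echo "yourusername yourpassword" | /usr/lib64/squid/basic_ncsa_auth /etc/squid/passwd
OK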

Configure Squid as an HTTP proxy

Command: [root@vultr ~]# vi /etc/squid/squid.conf

1. Add the authentication settings below the acl block in the config file.
Find the last line starting with acl, press o to open a new line below it, and insert:

auth_param basic program /usr/lib64/squid/basic_ncsa_auth /etc/squid/passwd
auth_param basic children 5
auth_param basic realm Squid Basic Authentication
auth_param basic credentialsttl 2 hours
acl auth_users proxy_auth REQUIRED
http_access allow auth_users

2. On the line right after INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS, add the DNS server setting: dns_nameservers 8.8.8.8
After the change:

#INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
dns_nameservers 8.8.8.8

3. Comment out http_access deny all, or change it to http_access allow auth_users.
After the change:

#And finally deny all other access to this proxy
http_access allow auth_users 

4. Change the port number: replace the default 3128 with whatever port you want.

#Squid normally listens to port 3128
http_port 21828

5. Set up high anonymity.

# append the high-anonymity settings at the end of the file

request_header_access X-Forwarded-For deny all
request_header_access From deny all
request_header_access Via deny all

6. Save the squid config file and restart the squid service:
[root@vultr ~]#systemctl restart squid
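
If the restart fails, squid can check its own config file for syntax errors:

[root@vultr ~]# squid -k parse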

The HTTP proxy is now configured; I installed the SwitchyOmega extension in Chrome and tested it successfully.

Configure SwitchyOmega and visit ip138: if the page shows your server's IP, the proxy works. At this point HTTPS sites still cannot be reached through it, and since my goal is the HTTPS Google Translate endpoint, the tinkering continues.

One caveat: connect using the resolved subdomain; connecting by IP never worked for me, and I do not know why.
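
Besides SwitchyOmega, you can test the HTTP proxy from any machine with curl (proxy.yourdomain.com, the port, and the credentials below are placeholders for your own values; httpbin.org/ip should echo back the server's IP):

[root@vultr ~]# curl -x http://proxy.yourdomain.com:21828 -U yourusername:yourpassword http://httpbin.org/ip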

Configure HTTPS support

1. Prepare a subdomain and apply for one of Tencent's free SSL certificates; following the online tutorials, I never managed to get a self-signed certificate generated with openssl to work.
2. Resolve the domain to the server and upload the certificate.
3. Edit the config file:

vi /etc/squid/squid.conf
# Squid normally listens to port 3128
http_port 21828 # the default HTTP proxy port; change it to your own or comment it out
# the 6618 below would normally be 443; I changed it to my own port
https_port 6618 cert=/etc/squid/cert/your-cert-name.crt key=/etc/squid/cert/your-cert-name.key

# Uncomment and adjust the following to add a disk cache directory.
cache_dir ufs /var/spool/squid 100 16 256 # remove the leading # on this line

# find the lines below, they should be near the end of the file
# Add any of your own refresh_pattern entries above these.
#
refresh_pattern ^ftp:           1440    20%     10080
refresh_pattern ^gopher:        1440    0%      1440
refresh_pattern -i (/cgi-bin/|\?) 0     0%      0
refresh_pattern .               0       20%     4320
via off # new
forwarded_for delete # new
dns_v4_first on # new
# append at the end of the file: high-anonymity settings # new
request_header_access X-Forwarded-For deny all
request_header_access From deny all
request_header_access Via deny all
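
If squid then refuses to start after cache_dir is enabled, the on-disk cache directories probably do not exist yet; squid can create them itself:

[root@vultr ~]# squid -z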

4. Save the file and restart the squid service.
Test a connection to an HTTPS site: success.
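
From the command line, a recent curl (7.52.0 or later, built with HTTPS-proxy support) can speak TLS to the proxy itself; the subdomain, port, and credentials below are placeholders for your own values:

[root@vultr ~]# curl --proxy https://proxy.yourdomain.com:6618 --proxy-user yourusername:yourpassword https://translate.google.com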

Configure the firewall

Firewall commands on CentOS 7.x

[root@vultr ~]# firewall-cmd --zone=public --add-port=6618/tcp --permanent
success
[root@vultr ~]# firewall-cmd --reload
success
[root@vultr ~]# firewall-cmd --zone=public --add-port=21828/tcp --permanent
success
[root@vultr ~]# firewall-cmd --reload
success
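
You can confirm both ports are open with:

[root@vultr ~]# firewall-cmd --zone=public --list-ports
6618/tcp 21828/tcp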

Firewall on CentOS 6.x

[root@vultr ~]# vi /etc/sysconfig/iptables
# add these lines to the file
-A INPUT -m state --state NEW -m tcp -p tcp --dport 6618 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 21828 -j ACCEPT
[root@vultr ~]# service iptables restart

And that's it, everything is done.

Finally, here is my own squid config file, kept as a backup.

#
# Recommended minimum configuration:
#
# Example rule allowing access from your local networks.
# Adapt to list your (internal) IP networks from where browsing
# should be allowed
acl localnet src 10.0.0.0/8     # RFC1918 possible internal network
acl localnet src 172.16.0.0/12  # RFC1918 possible internal network
acl localnet src 192.168.0.0/16 # RFC1918 possible internal network
acl localnet src fc00::/7       # RFC 4193 local private network range
acl localnet src fe80::/10      # RFC 4291 link-local (directly plugged) machines

acl SSL_ports port 443
acl Safe_ports port 80          # http
acl Safe_ports port 21          # ftp
acl Safe_ports port 443         # https
acl Safe_ports port 70          # gopher
acl Safe_ports port 210         # wais
acl Safe_ports port 1025-65535  # unregistered ports
acl Safe_ports port 280         # http-mgmt
acl Safe_ports port 488         # gss-http
acl Safe_ports port 591         # filemaker
acl Safe_ports port 777         # multiling http
acl CONNECT method CONNECT

auth_param basic program /usr/lib64/squid/basic_ncsa_auth /etc/squid/passwd
auth_param basic children 5
auth_param basic realm Squid Basic Authentication
auth_param basic credentialsttl 2 hours
acl auth_users proxy_auth REQUIRED
http_access allow auth_users

#
# Recommended minimum Access Permission configuration:
#
# Deny requests to certain unsafe ports
http_access deny !Safe_ports

# Deny CONNECT to other than secure SSL ports
http_access deny CONNECT !SSL_ports

# Only allow cachemgr access from localhost
http_access allow localhost manager
http_access deny manager

# We strongly recommend the following be uncommented to protect innocent
# web applications running on the proxy server who think the only
# one who can access services on "localhost" is a local user
#http_access deny to_localhost

#
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
dns_nameservers 8.8.8.8

# Example rule allowing access from your local networks.
# Adapt localnet in the ACL section to list your (internal) IP networks
# from where browsing should be allowed
http_access allow localnet
http_access allow localhost

# And finally deny all other access to this proxy
http_access allow auth_users 

# Squid normally listens to port 3128
http_port 21828 

https_port 6618 cert=/etc/squid/cert/pachong.com.crt key=/etc/squid/cert/pachong.com.key
# Uncomment and adjust the following to add a disk cache directory.
cache_dir ufs /var/spool/squid 100 16 256 # this one must be uncommented

# Leave coredumps in the first cache dir
coredump_dir /var/spool/squid

#
# Add any of your own refresh_pattern entries above these.
#
refresh_pattern ^ftp:           1440    20%     10080
refresh_pattern ^gopher:        1440    0%      1440
refresh_pattern -i (/cgi-bin/|\?) 0     0%      0
refresh_pattern .               0       20%     4320
via off
forwarded_for delete
dns_v4_first on
# appended at the end of the file: high-anonymity settings
request_header_access X-Forwarded-For deny all
request_header_access From deny all
request_header_access Via deny all